From ncoghlan at gmail.com  Sat Oct  1 04:08:58 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 01 Oct 2005 12:08:58 +1000
Subject: [Python-Dev] PEP 350: Codetags
In-Reply-To: <ca471dc20509300723l68cb8e1ap476a89794866a7c@mail.gmail.com>
References: <20050926223521.GE10940@kitchen.client.attbi.com>	
	<20050928161039.GF10940@kitchen.client.attbi.com>	
	<20050929153237.97E1.JCARLSON@uci.edu> <433CFA1F.4010804@gmail.com>
	<ca471dc20509300723l68cb8e1ap476a89794866a7c@mail.gmail.com>
Message-ID: <433DEFBA.9050401@gmail.com>

Guido van Rossum wrote:
> On 9/30/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
>>An approach to this area that would make sense to me is:
>>
>>1. Defer PEP 350
>>2. Publish a simple Python module for finding and processing code tags in a
>>configurable fashion
>>3. Include a default configuration in the module that provides the behaviour
>>described in PEP 350
>>4. After this hypothetical code tag processing module has been out in the wild
>>for a while, re-open PEP 350 with an eye to including the module in the
>>standard library
>>
>>The idea is that it should be possible to tailor the processing module in
>>order to textually scan a codebase (possibly C or C++ rather than Python) in
>>accordance with a project-specific system of code tagging, rather than
>>requiring that the project necessarily use the default style included in the
>>processing module (Although using a system other than the default one may
>>result in reduced functionality, naturally).
> 
> 
> Maybe I'm just an old fart, but this all seems way over-engineered.
> 
> Even for projects the size of Python, a simple grep+find is sufficient.

I expect many people would agree with you, but Micah was interested enough in 
the area to write a PEP about it. The above was just a suggestion for a 
different way of looking at the problem, so that writing a PEP would actually 
make sense. At the moment, if the tags used are project-specific, and the 
method used to find them is a simple grep+find, then I don't see a reason for 
the idea to be a *Python* Enhancement Proposal.

Further, I see some interesting possibilities for automation if such a library 
exists. For example, a cron job that scans the checked in sources, and 
automatically converts new TODO's to RFE's in the project tracker, and adds a 
tracker cross-link into the source code comment. The job could similarly 
create bug reports for FIXME's. If the project tracker was one that supported 
URL links, and the project had a URL view of the source tree, then the 
cross-links between the code tag and the tracker could be actual URL 
references to each other.

However, the starting point for exploring any such ideas would be a library 
that made it easier to work with code tags.
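
As a rough illustration of what the core of such a library might look like, here is a minimal sketch of a configurable tag scanner. The name `find_code_tags`, the `DEFAULT_TAGS` mapping and the tag-to-category choices are all assumptions made for this example; none of them come from PEP 350.

```python
import re

# Hypothetical default configuration: tag name -> tracker category.
# Neither the names nor the mapping are taken from PEP 350.
DEFAULT_TAGS = {"TODO": "feature request", "FIXME": "bug", "XXX": "note"}


def find_code_tags(text, tags=DEFAULT_TAGS, comment_prefix="#"):
    """Yield (line_number, tag, message) for each tagged comment in text."""
    names = "|".join(re.escape(name) for name in tags)
    pattern = re.compile(
        re.escape(comment_prefix) + r"\s*(" + names + r")\b:?\s*(.*)"
    )
    for number, line in enumerate(text.splitlines(), start=1):
        match = pattern.search(line)
        if match:
            yield number, match.group(1), match.group(2).strip()


SAMPLE = """\
def area(r):
    # TODO: accept units
    return 3.14 * r * r  # FIXME use math.pi
"""

for number, tag, message in find_code_tags(SAMPLE):
    print(number, tag, DEFAULT_TAGS[tag], message)
```

Passing `comment_prefix="//"` lets the same scanner walk C or C++ sources, which is the kind of project-specific tailoring described above.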

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From guido at python.org  Sat Oct  1 04:37:12 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 30 Sep 2005 19:37:12 -0700
Subject: [Python-Dev] PEP 350: Codetags
In-Reply-To: <433DEFBA.9050401@gmail.com>
References: <20050926223521.GE10940@kitchen.client.attbi.com>
	<20050928161039.GF10940@kitchen.client.attbi.com>
	<20050929153237.97E1.JCARLSON@uci.edu> <433CFA1F.4010804@gmail.com>
	<ca471dc20509300723l68cb8e1ap476a89794866a7c@mail.gmail.com>
	<433DEFBA.9050401@gmail.com>
Message-ID: <ca471dc20509301937w54f78f67v233c9cba691eb2f9@mail.gmail.com>

On 9/30/05, Nick Coghlan <ncoghlan at gmail.com> wrote:

> Further, I see some interesting possibilities for automation if such a library
> exists. For example, a cron job that scans the checked in sources, and
> automatically converts new TODO's to RFE's in the project tracker, and adds a
> tracker cross-link into the source code comment. The job could similarly
> create bug reports for FIXME's. If the project tracker was one that supported
> URL links, and the project had a URL view of the source tree, then the
> cross-links between the code tag and the tracker could be actual URL
> references to each other.

With all respect for the OP, that's exactly the kind of enthusiastic
over-engineering that I'm afraid the PEP will encourage. I seriously
doubt that any of that work will contribute towards a project's
success (compared to simply having a convention of putting XXX in the
code).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ms at cerenity.org  Sat Oct  1 16:21:32 2005
From: ms at cerenity.org (Michael Sparks)
Date: Sat, 1 Oct 2005 15:21:32 +0100
Subject: [Python-Dev] Active Objects in Python
In-Reply-To: <dhk9pf$orj$1@sea.gmane.org>
References: <397621172.20050927111836@MailBlocks.com>
	<dhk9pf$orj$1@sea.gmane.org>
Message-ID: <200510011521.33481.ms@cerenity.org>

On Friday 30 September 2005 22:13, Michael Sparks (home address) wrote:
> I wrote a white paper based on my Python UK talk, which is here:
>     * http://www.bbc.co.uk/rd/pubs/whp/whp11.shtml

Oops that URL isn't right. It should be:
   * http://www.bbc.co.uk/rd/pubs/whp/whp113.shtml

Sorry! (Thanks to LD 'Gus' Landis for pointing that out!)

Regards,


Michael.
--
"Though we are not now that which in days of old moved heaven and earth, 
   that which we are, we are: one equal temper of heroic hearts made 
     weak by time and fate but strong in will to strive, to seek, 
          to find and not to yield" -- "Ulysses", Tennyson

From reinhold-birkenfeld-nospam at wolke7.net  Sat Oct  1 19:28:54 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Sat, 01 Oct 2005 19:28:54 +0200
Subject: [Python-Dev] Tests and unicode
Message-ID: <dhmh0n$41n$1@sea.gmane.org>

Hi,

I looked whether I could make the test suite pass again
when compiled with --disable-unicode.

One problem is that no Unicode escapes can be used since compiling
the file raises ValueErrors for them. Such strings would have to
be produced using unichr().

Is this the right way? Or is disabling Unicode not supported any more?

Reinhold

-- 
Mail address is perfectly valid!


From blais at furius.ca  Sat Oct  1 23:50:25 2005
From: blais at furius.ca (Martin Blais)
Date: Sat, 1 Oct 2005 17:50:25 -0400
Subject: [Python-Dev] Pythonic concurrency - cooperative MT
In-Reply-To: <1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr>
References: <fb6fbf5605092915471cbb1189@mail.gmail.com>
	<fb6fbf5605092916053506b9aa@mail.gmail.com>
	<1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr>
Message-ID: <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com>

Hi.

I hear a confusion that is annoying me a bit in some of the
discussions on concurrency, and I thought I'd flush my thoughts
here to help me clarify some of that stuff, because some people
on the list appear to discuss generators as a concurrency scheme,
and as far as I know (and please correct me if I'm wrong) they
really are not addressing that at all (full explanation below).
Before I go on, I must say that I am not in any way an authority
on concurrent programming, I'm just a guy who happens to have
done a fair amount of threaded programming, so if any of the
smart people on the list notice something completely stupid and
off the mark that I might be saying here, please feel free to
bang on it with the thousand-pound hammer of your hacker-fu and
put me to shame (I love to learn).

As far as I understand, generators are just a convenient way to
program apparently "independent" control flows (which are not the
same as "concurrent" control flows) in a constrained, structured
way, a way that is more powerful than what is allowed by using a
stack.  By giving up using the stack concept as a fast way to
allocate local function variables, it becomes possible to exit
and enter chunks of code multiple times, at specific points,
within an automatically restored local context (i.e. the local
variables, stored on the heap).  Generators make it more
convenient to do just that:  enter and re-enter some code that is
expressed as if it would be running in a single execution
flow (with explicit points of exit/re-entry, "yields").

The full monty version of that, is what you get when you write
assembly code (*memories of adolescent assembly programming on
the C=64 abound here now*): you can JMP anywhere anytime, and a
chunk of code (a function) can be reentered anywhere anytime as
well, maybe even reentered somewhere else than where it left off.
The price to pay for this is one of complexity: in assembly you
have to manage restoring the local context yourself (i.e in
assembly code this just means restoring the values of some
registers which are assumed set and used by the code, like the
local variables of a function), and there is no clear grouping of
the local scope that is saved.  Generators give you that for
free: they automatically organize all that local context as
belonging to the generator object, and it expresses clear points
of exit/re-entry with the yield calls.  They are really just a
fancy goto, with some convenient assumptions about how control
should flow.  This happens to be good enough for simplifying a
whole class of problems and I suppose the Python and Ruby
communities are all learning to love them and use them more and
more.

(I think the more fundamental consequence of generators is to
raise questions about the definition of what "a function" is:  if
I have a single chunk of code in which different parts use two
disjoint sets of variables, and it can be entered via a few
entry/exit points, is it really one or two or
multiple "functions"?  What if different parts share some of the
local scope only?  Where does the function begin and end?  And
more importantly, is there a more complex yet still manageable
abstraction that would allow even more flexible control flow than
generators allow, straddling the boundaries of what a function
is?)

You could easily implement something very similar to generators
by encapsulating the local scope explicitly in the form of a
class, with instance attributes, and having a normal
method "step()" that would be careful about saving state in the
object's attributes every time it returns and restoring state from
those attributes every time it gets called.  This is what
iterators do.  Whenever you want to "schedule" your object to be
running, you call the step() method.  So just in that sense
generators really aren't all that exciting or "new".  The main
problem that generators solve is that they make this save/restore
mechanism automatic, thus allowing you to write a single flow of
execution as a normal function with explicit exit points (yield).
It's much nicer having that in the language than having to write
code that can be restored (especially when you have to write a
loop with complex conditions/flow which must run and return only
one iteration every time they become runnable).
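
To make the comparison concrete, here is a small sketch of the same countdown written both ways. The class `Countdown` and the function `countdown` are names invented for this illustration.

```python
class Countdown:
    """Hand-written equivalent: state lives in instance attributes."""

    def __init__(self, start):
        self.remaining = start

    def step(self):
        # Restore state from the attribute, do one iteration, save it back.
        if self.remaining <= 0:
            raise StopIteration
        value = self.remaining
        self.remaining -= 1
        return value


def countdown(start):
    """Generator version: the local variable is saved automatically."""
    remaining = start
    while remaining > 0:
        yield remaining
        remaining -= 1


manual = Countdown(3)
print([manual.step() for _ in range(3)])
print(list(countdown(3)))
```

Both produce the same sequence; the generator simply lets the interpreter do the bookkeeping that `step()` does by hand.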

Therefore, as far as I understand it, generators themselves DO
NOT implement any form of concurrency.

I feel that where generators and concurrency come together is
often met with confusion in the discussions I see about them, but
maybe that's just me.  I see two aspects that allow generators to
participate in the elaboration of a concurrency scheme:

1. The convenience of expression of a single execution flow (with
   explicit interruption points) makes it easy to implement
   pseudo-concurrency IF AND ONLY IF you consider a generator as
   an independent unit of control flow (i.e. a task).  Whether
   those generators can run asynchronously is yet undefined and
   depends on who calls them.

2. In a more advanced framework/language, perhaps some generators
   could be considered to always be possible to run
   asynchronously, ruled by a system of true concurrency with
   some kind of scheduling algorithm that oversees that process.
   Whether this has been implemented by many is still a mystery
   to me, but I can see how a low-level library that provides
   asynchronously running execution vehicles for each CPU could
   be used to manage and run a pool of shared generator objects
   in a way that is better (for a specific application) than the
   relatively uninformed scheduling provided by the threads
   abstraction (more at the end).

Pseudo or cooperative concurrency is not the same as true
asynchronous concurrency.  You can ONLY avoid having to deal with
issues of mutual exclusion if you DO NOT have true asynchronous
concurrency (i.e. two real CPUs running at the same time)--unless
you have some special scheduling system that implements very
specific assumptions about data access vs the code that it
schedules, in which case that scheduling algorithm itself will
have to deal with potential mutual exclusion problems: YOU DON'T
GET OUT OF IT, if you have two real, concurrent processing units
making calculations, you have to deal with the way that they
might access some same piece of data at the same time.

I suppose that the essence of what I want to say with this
diatribe is that everyone who talks about generators as a way to
avoid the hard problems of concurrent programming should really
explicitly frame the discussion in the context of a single
process cooperative scheme that runs on a single processor (at
any one time).  It does not hold outside of that context, outside
of that context you HAVE to deal with mutex issues, and that's
always where it gets messy (even with generators).

Now, IMO where it gets interesting is when you consider that what
you're doing when you are executing multiple asynchronous control
flows with explicit code in your process, is that you're
essentially bringing "up" the scheduler from the kernel layer
into your own code.  This is very cool.  This may allow you to
specialize that scheduler with assumptions which may ultimately
simplify the implementation of your independent control flows.
For example, if you have two sets of generators that access
disjoint sets of data, and two processing units, your scheduler
could make sure that no two generators from the same set get
scheduled at the same time.  If you do that then you might not
have to lock access to your data structures at all.  You can
imagine more complex variants on this theme.

One of the problems that you have with using generators like
this is that a "yield" on resource access does not occur
automatically, like it does in threading.  With threads, the
kernel is invoked when access to a low-level resource is
requested, and may decide to put your process in the wait queue
when it judges necessary.  I don't know how you would do that
with generators.  To implement that explicitly, you would need an
asynchronous version of all the functions that may block on
resources (e.g. file open, socket write, etc.), in order to be
able to insert a yield statement at that point, after the async
call, and there should be a way for the scheduler to check if the
resource is "ready" to be able to put your generator back in the
runnable queue.

(A question comes to mind here: Twisted must be doing something
like this with their "deferred objects", no?  I figure they would
need to do something like this too.  I will have to check.)

Any comment welcome.
cheers,

From solipsis at pitrou.net  Sun Oct  2 00:46:21 2005
From: solipsis at pitrou.net (Antoine)
Date: Sun, 2 Oct 2005 00:46:21 +0200 (CEST)
Subject: [Python-Dev] Pythonic concurrency - cooperative MT
In-Reply-To: <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com>
References: <fb6fbf5605092915471cbb1189@mail.gmail.com>
	<fb6fbf5605092916053506b9aa@mail.gmail.com>
	<1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr>
	<8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com>
Message-ID: <1860.::ffff:213.41.177.172.1128206781.squirrel@webmail.nerim.net>


Hi Martin,

[snip]

The "confusion" stems from the fact that two issues are mixed up in this
discussion thread:
- improving concurrency schemes to make it easier to write well-behaving
applications with independent parallel flows
- improving concurrency schemes to improve performance when there are
several hardware threads available

The respective solutions to these problems do not necessarily go hand in
hand.

> To implement that explicitly, you would need an
> asynchronous version of all the functions that may block on
> resources (e.g. file open, socket write, etc.), in order to be
> able to insert a yield statement at that point, after the async
> call, and there should be a way for the scheduler to check if the
> resource is "ready" to be able to put your generator back in the
> runnable queue.

You can also use a helper thread and signal the scheduling loop when some
action in the helper thread has finished. It is an elegant solution
because it helps you keep a small generic scheduling loop instead of
putting select()-like calls in it.
(this is how I've implemented timers in my little cooperative
multi-threading system, for example)

> (A question comes to mind here: Twisted must be doing something
> like this with their "deferred objects", no?  I figure they would
> need to do something like this too.  I will have to check.)

A Deferred object is just the abstraction of a callback - or, rather, two
callbacks: one for success and one for failure. Twisted is architected
around an event loop, which calls your code back when a registered event
happens (for example when an operation is finished, or when some data
arrives on the wire). Compared to generators, it is a different way of
expressing cooperative multi-threading.
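
The "two callbacks" idea can be sketched in a few lines. The class below is a toy written for this illustration only; it is NOT Twisted's Deferred API, and the names `ToyDeferred`, `add_callbacks`, `succeed` and `fail` are assumptions.

```python
class ToyDeferred:
    """A toy stand-in for the idea, NOT Twisted's Deferred API."""

    def __init__(self):
        self._on_success = None
        self._on_failure = None

    def add_callbacks(self, on_success, on_failure):
        # Register one handler for each possible outcome.
        self._on_success = on_success
        self._on_failure = on_failure

    def succeed(self, result):
        return self._on_success(result)

    def fail(self, error):
        return self._on_failure(error)


log = []
d = ToyDeferred()
d.add_callbacks(lambda data: log.append(("ok", data)),
                lambda err: log.append(("failed", str(err))))

# Later, an event loop would fire exactly one of the two paths.
d.succeed("page contents")
print(log)
```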

Regards

Antoine.



From ms at cerenity.org  Sun Oct  2 01:13:15 2005
From: ms at cerenity.org (Michael Sparks)
Date: Sun, 2 Oct 2005 00:13:15 +0100
Subject: [Python-Dev] Pythonic concurrency - cooperative MT
In-Reply-To: <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com>
References: <fb6fbf5605092915471cbb1189@mail.gmail.com>
	<1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr>
	<8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com>
Message-ID: <200510020013.16626.ms@cerenity.org>

On Saturday 01 October 2005 22:50, Martin Blais wrote:
...
> because some people on the list appear to discuss generators as
> a concurrency scheme, and as far as I know they really are not
> addressing that at all.

Our project started in the context of dealing with the task of a
naturally concurrent environment. Specifically the task is that of
dealing with large numbers of concurrent connections to a server.
As a result, when I've mentioned concurrency it's been due to coming
from that viewpoint.

In the past I have worked with systems essentially structured in a
similar way to Twisted for this kind of problem, but decided against that
style for our current project. (Note some people misunderstand my opinions
here due to a badly phrased lightning talk ~16 months ago at Europython
2004 - I think twisted is very much best of breed in python for what
it does. I just think there //might// be a nicer way. (might :) )

Since I now work in an R&D dept I wondered what would happen if, instead
of the basic approach that underlies systems like twisted, you took a
much more CSP-like approach to building such systems, but using
generators rather than threads or explicit state machines.

A specific goal was to try and make code simpler for people to work with -
with the aim actually of simplifying maintenance as the main by-product.
I hadn't heard of anyone trying this approach then and hypothesised it
*might* achieve that goal.

As a result, from day 1 it became clear that where an event based system
would normally use a reactor/proactor based approach, you replace
that with a scheduler that repeatedly calls a next method on objects
given to it to schedule.
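
A scheduler of that kind can be very small. The following is a minimal round-robin sketch, not the Kamaelia/Axon implementation; `ticker` and `run` are names invented for this example.

```python
from collections import deque


def ticker(name, count, output):
    """A task: record one step of work, then yield control."""
    for i in range(count):
        output.append((name, i))
        yield


def run(tasks):
    """Round-robin scheduler: call next() on each task in turn."""
    runnable = deque(tasks)
    while runnable:
        task = runnable.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished; drop it from the queue
        runnable.append(task)


trace = []
run([ticker("a", 2, trace), ticker("b", 3, trace)])
print(trace)
```

The trace shows the two tasks interleaving one step at a time, with no threads involved.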

In terms of concurrency that is clearly a co-operative multitasking system
in the same way as a simplistic event based system is. (Both get more
complex in reality when you can't avoid blocking forcing the use of
threads for some tasks)

So when you say this:
> explicitly frame the discussion in the context of a single
> process cooperative scheme that runs on a single processor (at
> any one time).  

This is spot on. However, any generator can be farmed off and run in
a thread. Any communications you did with the generator can be wrapped
via Queues then - forming a controlled bridge between the threads.
Similarly we're currently looking at using non-blocking pipes and
pickling to communicate with generators running in a forked environment.
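
As a sketch of the thread case, assuming a simple sentinel protocol to mark end of input, an unmodified generator can be run in a worker thread with a Queue on each side. The helper names `drain` and `run_in_thread` are hypothetical.

```python
import queue
import threading


def doubler(inbox):
    """An ordinary generator that knows nothing about threads."""
    for item in inbox:
        yield item * 2


def drain(q, sentinel):
    """Turn a Queue into an iterable that ends at the sentinel."""
    while True:
        item = q.get()
        if item is sentinel:
            return
        yield item


def run_in_thread(gen_func, in_q, out_q, sentinel):
    """Run gen_func in a worker thread, bridged by the two queues."""
    def worker():
        for result in gen_func(drain(in_q, sentinel)):
            out_q.put(result)
        out_q.put(sentinel)  # tell the reader there is no more output
    thread = threading.Thread(target=worker)
    thread.start()
    return thread


DONE = object()
in_q, out_q = queue.Queue(), queue.Queue()
thread = run_in_thread(doubler, in_q, out_q, DONE)
for n in (1, 2, 3):
    in_q.put(n)
in_q.put(DONE)
thread.join()
results = list(drain(out_q, DONE))
print(results)
```

The generator body is identical whether it is driven in-process or from the worker thread; only the bridging code changes.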

As a result if you write your code as generators it can migrate to a
threaded or process based environment, and scale across multiple
processes (and hence processors) if tools to perform this migration
are put in place. We're a little way off doing that, but this looks
to be highly reasonable.

> As far as I understand, generators are just a convenient way to

They give you code objects that can do a return and continue later.
This isn't really the same as the ability to just do a goto into
random points in a function. You can only go back to the point the
generator yielded at (unless someone has a perverse trick :-).

> You could easily implement something very similar to generators
> by encapsulating the local scope explicitly in the form of a
> class, with instance attributes, and having an normal
> method "step()" that would be careful about saving state in the
> object's attributes everytime it returns and restoring state from
> those attributes everytime it gets called. 

For a more explicit version of this we have a (deliberately naive) C++
version of generators & our core concurrency system. Mechanism is here:
http://tinyurl.com/7gaol , example use here: http://tinyurl.com/bgwro
That does precisely that. (except we use a next() method there :-)

> Therefore, as far as I understand it, generators themselves DO
> NOT implement any form of concurrency.

By themselves, they don't. They can be used to deal with concurrency though.

> 2. In a more advanced framework/language, perhaps some generators
>    could be considered to always be possible to run
>    asynchronously, ruled by a system of true concurrency with
>    some kind of scheduling algorithm that oversees that process.
>    Whether this has been implemented by many is still a mystery
>    to me, 

This is what we do. Trainees (one of whom had very little experience of
even programming) were able to pick up our approach quickly using the
tutorial we've given them. This requires them to
implement a mini-version of the framework, which might actually aid
the discussion here since it very clearly shows the core of our system.
(nb it is however a simplified version) I previously posted a link to
it, which is here: http://kamaelia.sourceforge.net/MiniAxon/

>    but I can see how a low-level library that provides
>    asynchronously running execution vehicles for each CPU could
>    be used to manage and run a pool of shared generator objects
>    in a way that is better (for a specific application) than the
>    relatively uninformed scheduling provided by the threads
>    abstraction (more at the end).
>
> Pseudo or cooperative concurrency is not the same as true
> asynchronous concurrency.  

Correct. I've had discussions with a colleague at work who wants to work
on the underlying formal semantics of our system for verification purposes,
and he pointed out that the core assumption with a pure generator approach
effectively serialises the application, which may hide problems in the true
parallel approach (eg only using processes for a CSP-like system).

However that statement had an underlying assumption: that the system would
be a pure generator system. As soon as you involve multiple systems using
network connections, and threads (since we have threaded components as well),
and processes (which has always been on the cards, all our desktop machines
are dual processor and it just seems a waste to use just one) then the system
goes truly asynchronous.

As a result we (at least :-) have thought about these problems along the way.

> you have to deal with the way that they
> might access some same piece of data at the same time.

We do. We have both an underlying approach to deal with this and a
metaphor that encourages correct usage. The underlying mechanism is based
on explicit hand off of data between asynchronous activities. Once you have
handed off a piece of data, you no longer own it and can no longer change it.
If you are handed a piece of data you can do anything you like with it, for
as long as you like until you hand it off or throw it away.

The metaphor of old-fashioned paper based inboxes (or in-trays) and outboxes
(or out-trays) conceptually reinforces this idea - naturally encouraging
safer programming styles.
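
The hand-off discipline can be sketched as follows. This is an illustration of the convention only, not the actual Kamaelia API; the `Box` class and the two generator functions are names invented here.

```python
from collections import deque


class Box:
    """A one-way tray: the sender gives an item up when it puts it in."""

    def __init__(self):
        self._items = deque()

    def put(self, item):
        self._items.append(item)

    def take(self):
        return self._items.popleft()

    def __len__(self):
        return len(self._items)


def producer(outbox):
    for n in range(3):
        message = {"n": n}
        outbox.put(message)
        del message  # by convention, the message is no longer ours
        yield


def consumer(inbox, seen):
    while True:
        while len(inbox):
            message = inbox.take()
            message["n"] += 100  # sole owner now, so free to mutate
            seen.append(message["n"])
        yield


link = Box()
seen = []
p, c = producer(link), consumer(link, seen)
for _ in range(3):
    next(p)
    next(c)
print(seen)
```

Nothing here enforces ownership; as with the approach described above, it holds by convention.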

This means that we only ever have a single reader and single writer for any
item of data, which eliminates whole swathes of concurrency issues - whether
you're pseudo-concurrent (ie threads[*], generators) or truly concurrent
(processes on multiple processors).
   [*] Still only 1 CPU really.

Effectively there is no global data. If there is any global data (since we
do have a global address space we tend to think of as similar to a linda
tuple space), then it has a single owner. Others may read it, but only one
may write to it. Because this is python, this is enforced by convention.
(But the use is discouraged and rarely needed).

The use of generators effectively also hides the local variables from
accidental external modification. Which is a secondary layer of protection.

> If you do that then you might not have to lock access to your data
> structures at all.

We don't have to lock data structures at all - this is because we have
explicit hand off of data. If we hand off between processes, we do this
via Queues that handle the locking issues for us.

> To implement that explicitly, you would need an
> asynchronous version of all the functions that may block on
> resources (e.g. file open, socket write, etc.)

Or you can create a generator that handles reading from a file and hands
off the data on to the next component explicitly. The file reader is given
CPU time by the scheduler. This can seem odd unless you've done any shell
programming in which case the idea should be obvious:

echo `ls *py |while read i; do wc -l $i |cut -d \  -f1; done` | sed -e 's/ /+/g' | bc
(yes I know there's better ways of doing this :)
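
The same pipeline shape can be sketched with chained generators. The stage names below are invented for this example, and in-memory `io.StringIO` objects stand in for real files so the sketch runs anywhere.

```python
import io


def read_lines(stream):
    """Source component: hand each line on to the next stage."""
    for line in stream:
        yield line


def count_lines(lines):
    """Like `wc -l`: consume lines, emit a single count."""
    yield sum(1 for _ in lines)


def total(counts):
    """Like the `sed ... | bc` step: add the counts together."""
    return sum(counts)


# In-memory stand-ins for the *.py files in the shell version.
files = [io.StringIO("a\nb\n"), io.StringIO("c\nd\ne\n")]
counts = (n for f in files for n in count_lines(read_lines(f)))
print(total(counts))
```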

So all in all, I'd say "yes" generators aren't really concurrent, but they
*are* a very good way (IMHO) of dealing with concurrency in a single thread
and, if you're careful in designing your approach early on, they map
cleanly to a thread/process based approach.

If you think I'm talking a load of sphericals (for all I know it's possible
I am, though I hope I'm not :-) , please look at our tutorial first, then
at our howto for building components [*] and tell me what we're
doing wrong. I'd really like to know so we can make the system better,
easier for newbies (and hence everyone else), and more trustable.
   [*] http://tinyurl.com/dp8n7

(This really feels more like a comp.lang.python discussion
though, because AFAICT, python already has everything we need for this.
I might revisit that thought when we've looked at shared memory issues
though. IMHO though that would be largely stuff for the standard library.)

Best Regards,


Michael.
-- 
Michael Sparks, Senior R&D Engineer, Digital Media Group
Michael.Sparks at rd.bbc.co.uk, http://kamaelia.sourceforge.net/
British Broadcasting Corporation, Research and Development
Kingswood Warren, Surrey KT20 6NP

This e-mail may contain personal views which are not the views of the BBC.

From oliphant at ee.byu.edu  Sun Oct  2 01:39:24 2005
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Sat, 01 Oct 2005 17:39:24 -0600
Subject: [Python-Dev] Why does __getitem__ slot of builtin call sequence
	methods first?
Message-ID: <433F1E2C.4050501@ee.byu.edu>


The new ndarray object of scipy core (successor to Numeric Python) is a 
C extension type that has a getitem defined in both the as_mapping and 
the as_sequence structure. 

The as_sequence mapping is just so PySequence_GetItem will work correctly.

As exposed to Python the ndarray object has a .__getitem__  wrapper method.

Why does this wrapper call the sequence getitem instead of the mapping 
getitem method?

Is there any way to get at a mapping-style __getitem__ method from Python?

This looks like a bug to me (which is why I'm posting here...)

Thanks for any help or insight.

-Travis Oliphant




From guido at python.org  Sun Oct  2 02:41:32 2005
From: guido at python.org (Guido van Rossum)
Date: Sat, 1 Oct 2005 17:41:32 -0700
Subject: [Python-Dev] Why does __getitem__ slot of builtin call sequence
	methods first?
In-Reply-To: <433F1E2C.4050501@ee.byu.edu>
References: <433F1E2C.4050501@ee.byu.edu>
Message-ID: <ca471dc20510011741h279946dfod4d5e2eb166e353d@mail.gmail.com>

On 10/1/05, Travis Oliphant <oliphant at ee.byu.edu> wrote:
>
> The new ndarray object of scipy core (successor to Numeric Python) is a
> C extension type that has a getitem defined in both the as_mapping and
> the as_sequence structure.
>
> The as_sequence mapping is just so PySequence_GetItem will work correctly.
>
> As exposed to Python the ndarray object has a .__getitem__  wrapper method.
>
> Why does this wrapper call the sequence getitem instead of the mapping
> getitem method?
>
> Is there anyway to get at a mapping-style __getitem__ method from Python?

Hmm... I'm sure the answer is in typeobject.c, but that is one of the
more obfuscated parts of Python's guts. I wrote it four years ago and
since then I've apparently lost enough brain cells (or migrated them
from language implementation to language design service :) that I
don't understand it inside out any more like I did while I was in the
midst of it.

However, I wonder if the logic isn't such that if you define both
sq_item and mp_subscript, __getitem__ calls sq_item; I wonder if by
removing sq_item it might call mp_subscript? Worth a try, anyway.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From radeex at gmail.com  Sun Oct  2 04:00:03 2005
From: radeex at gmail.com (Christopher Armstrong)
Date: Sun, 2 Oct 2005 12:00:03 +1000
Subject: [Python-Dev] Pythonic concurrency - cooperative MT
In-Reply-To: <8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com>
References: <fb6fbf5605092915471cbb1189@mail.gmail.com>
	<fb6fbf5605092916053506b9aa@mail.gmail.com>
	<1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr>
	<8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com>
Message-ID: <60ed19d40510011900g7ef5c86jc047affbdd06fd59@mail.gmail.com>

On 10/2/05, Martin Blais <blais at furius.ca> wrote:
> One of the problems that you have with using generators like
> this, is that automatic "yield" on resource access does not occur
> automatically, like it does in threading.  With threads, the
> kernel is invoked when access to a low-level resource is
> requested, and may decide to put your process in the wait queue
> when it judges necessary.  I don't know how you would do that
> with generators.  To implement that explicitly, you would need an
> asynchronous version of all the functions that may block on
> resources (e.g. file open, socket write, etc.), in order to be
> able to insert a yield statement at that point, after the async
> call, and there should be a way for the scheduler to check if the
> resource is "ready" to be able to put your generator back in the
> runnable queue.
>
> (A question comes to mind here: Twisted must be doing something
> like this with their "deferred objects", no?  I figure they would
> need to do something like this too.  I will have to check.)

As I mentioned in the predecessor of this thread (I think), I've
written a thing called "Defgen" or "Deferred Generators" which allows
you to write a generator to yield control when waiting for a Deferred
to fire. So this is basically "yield on resource access". In the
Twisted universe, every asynchronous resource-retrieval is done by
returning a Deferred and later firing that Deferred. Generally, you
add callbacks to get the value, but if you use defgen you can say
stuff like (in Python 2.5 syntax)
try:
    x = yield getPage('http://python.org/')
except PageNotFound:
    print "Where did Python go!"
else:
    assert "object-oriented" in x

Many in the Twisted community get itchy about over-use of defgen,
since it makes it easier to assume too much consistency in state, but
it's still light-years beyond pre-emptive shared-memory threading when
it comes to that.
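
A minimal sketch of the idea, for readers who want to see the mechanics: a
driver resumes a generator each time the Deferred it yielded fires, sending
the value in or throwing the failure. This is not Twisted's defgen code. The
Deferred class, drive(), get_page() and the pending table are hypothetical
stand-ins, and the code is written for a current Python 3 interpreter.

```python
class Deferred:
    """Tiny stand-in for a Twisted-style Deferred (hypothetical, not Twisted's class)."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._result = None

    def add_both(self, fn):
        # Call fn(result) immediately if already fired, otherwise on fire().
        if self._fired:
            fn(self._result)
        else:
            self._callbacks.append(fn)

    def fire(self, result):
        # result is either a plain value or an Exception instance (a failure).
        self._fired = True
        self._result = result
        for fn in self._callbacks:
            fn(result)


def drive(gen):
    """Run generator gen, resuming it whenever the Deferred it yielded fires."""
    def step(result=None):
        try:
            if isinstance(result, Exception):
                deferred = gen.throw(result)   # failure: raise inside the generator
            else:
                deferred = gen.send(result)    # success: value of the yield expression
        except StopIteration:
            return
        deferred.add_both(step)
    step()


class PageNotFound(Exception):
    pass


log = []
pending = {}   # url -> Deferred, so the example can fire them by hand


def get_page(url):
    # Hypothetical asynchronous fetch: returns a Deferred that fires later.
    pending[url] = Deferred()
    return pending[url]


def task(url):
    try:
        page = yield get_page(url)
    except PageNotFound:
        log.append("missing " + url)
    else:
        log.append("got " + page)


drive(task("http://python.org/"))
drive(task("http://example.invalid/"))
pending["http://python.org/"].fire("object-oriented")
pending["http://example.invalid/"].fire(PageNotFound())
```

Both tasks are suspended at their yield until the corresponding Deferred is
fired, which is where an event loop would normally take over.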

--
  Twisted   |  Christopher Armstrong: International Man of Twistery
   Radix    |    -- http://radix.twistedmatrix.com
            |  Release Manager, Twisted Project
  \\\V///   |    -- http://twistedmatrix.com
   |o O|    |
w----v----w-+

From oliphant at ee.byu.edu  Sun Oct  2 05:17:20 2005
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Sat, 01 Oct 2005 21:17:20 -0600
Subject: [Python-Dev] Why does __getitem__ slot of builtin call sequence
 methods first?
In-Reply-To: <ca471dc20510011741h279946dfod4d5e2eb166e353d@mail.gmail.com>
References: <433F1E2C.4050501@ee.byu.edu>
	<ca471dc20510011741h279946dfod4d5e2eb166e353d@mail.gmail.com>
Message-ID: <433F5140.8050705@ee.byu.edu>

Guido van Rossum wrote:

>On 10/1/05, Travis Oliphant <oliphant at ee.byu.edu> wrote:
>  
>
>>The new ndarray object of scipy core (successor to Numeric Python) is a
>>C extension type that has a getitem defined in both the as_mapping and
>>the as_sequence structure.
>>
>>The as_sequence mapping is just so PySequence_GetItem will work correctly.
>>
>>As exposed to Python the ndarray object has a .__getitem__  wrapper method.
>>
>>Why does this wrapper call the sequence getitem instead of the mapping
>>getitem method?
>>
>>Is there anyway to get at a mapping-style __getitem__ method from Python?
>>    
>>
>
>Hmm... I'm sure the answer is in typeobject.c, but that is one of the
>more obfuscated parts of Python's guts. I wrote it four years ago and
>since then I've apparently lost enough brain cells (or migrated them
>from language implementation to language design service :) that I
>don't understand it inside out any more like I did while I was in the
>midst of it.
>
>However, I wonder if the logic isn't such that if you define both
>sq_item and mp_subscript, __getitem__ calls sq_item; I wonder if by
>removing sq_item it might call mp_subscript? Worth a try, anyway.
>
>  
>

Thanks for the tip.  I think I figured out the problem, and it was my 
misunderstanding of how types inherit in C that was the source of my 
problem.  

Basically, Python is doing what you would expect: mp_subscript is used
for __getitem__ if both mp_subscript and sq_item are present.  However, the
addition of these descriptors (and therefore the resolution of any
competition for __getitem__ calls) is done *before* the inheritance of
any slots takes place.

The new ndarray object inherits from a "big" array object that doesn't 
define the sequence and buffer protocols (which have the size-limiting
int dependency in their interfaces).   The ndarray object has standard
tp_as_sequence and tp_as_buffer slots filled.  

Figuring the array object would inherit its tp_as_mapping protocol from 
"big" array (which it does just fine), I did not explicitly set that 
slot in its Type object.    Thus, when PyType_Ready was called on the 
ndarray object, the tp_as_mapping was NULL and so __getitem__ mapped to 
the sequence-defined version.  Later the tp_as_mapping slots were 
inherited but too late for __getitem__ to be what I expected.

The easy fix was to initialize the tp_as_mapping slot before calling 
PyType_Ready.    Hopefully, somebody else searching in the future for an 
answer to their problem will find this discussion useful.  
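
The ordering problem described above can be illustrated with a toy model in
Python. This is only a sketch of the sequence of events, not CPython's
typeobject.c: the function ready() and the slot tables are hypothetical, and
the dictionary simply records which slot the __getitem__ wrapper ended up
wrapping.

```python
def ready(own_slots, base_slots, wrappers_before_inherit=True):
    """Toy model of type initialisation; returns (slots, type_dict).

    slots maps C-level slot names to implementations; type_dict maps
    Python-level names to the slot their wrapper was built from.
    """
    slots = dict(own_slots)
    type_dict = {}

    def add_operators():
        # mp_subscript is preferred over sq_item when both are present.
        for slot in ("mp_subscript", "sq_item"):
            if slot in slots:
                type_dict.setdefault("__getitem__", slot)

    def inherit_slots():
        # Fill in any slot the type itself left empty.
        for name, impl in base_slots.items():
            slots.setdefault(name, impl)

    if wrappers_before_inherit:
        add_operators()
        inherit_slots()
    else:
        inherit_slots()
        add_operators()
    return slots, type_dict


big_array = {"mp_subscript": "bigarray_subscript"}

# Mapping slot left to inheritance: the wrapper is built too early.
slots, type_dict = ready({"sq_item": "ndarray_item"}, big_array)

# Mapping slot set explicitly before "PyType_Ready": the easy fix.
fixed_slots, fixed_dict = ready(
    {"sq_item": "ndarray_item", "mp_subscript": "bigarray_subscript"},
    big_array,
)
```

In the first call the type ends up with both slots filled, yet __getitem__
still wraps sq_item, because the wrapper was chosen before mp_subscript
arrived by inheritance.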

Thanks for your help,

-Travis





From ncoghlan at gmail.com  Sun Oct  2 05:23:19 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 02 Oct 2005 13:23:19 +1000
Subject: [Python-Dev] Why does __getitem__ slot of builtin call sequence
 methods first?
In-Reply-To: <ca471dc20510011741h279946dfod4d5e2eb166e353d@mail.gmail.com>
References: <433F1E2C.4050501@ee.byu.edu>
	<ca471dc20510011741h279946dfod4d5e2eb166e353d@mail.gmail.com>
Message-ID: <433F52A7.3010700@gmail.com>

Guido van Rossum wrote:
> Hmm... I'm sure the answer is in typeobject.c, but that is one of the
> more obfuscated parts of Python's guts. I wrote it four years ago and
> since then I've apparently lost enough brain cells (or migrated them
> from language implementation to language design service :) that I
> don't understand it inside out any more like I did while I was in the
> midst of it.
> 
> However, I wonder if the logic isn't such that if you define both
> sq_item and mp_subscript, __getitem__ calls sq_item; I wonder if by
> removing sq_item it might call mp_subscript? Worth a try, anyway.

As near as I can tell, the C/API documentation is silent on how slots are 
populated when multiple methods mapping to the same slot are defined by a C 
object, but this is a quote from the comment describing add_operators() in 
typeobject.c:
>   In the latter case, the first slotdef entry encountered wins.  Since
>    slotdef entries are sorted by the offset of the slot in the
>    PyHeapTypeObject, this gives us some control over disambiguating
>    between competing slots: the members of PyHeapTypeObject are listed
>    from most general to least general, so the most general slot is
>    preferred.  In particular, because as_mapping comes before as_sequence,
>    for a type that defines both mp_subscript and sq_item, mp_subscript
>    wins.

Further, in PyObject_GetItem (in abstract.c), tp_as_mapping->mp_subscript is
checked first, with tp_as_sequence->sq_item only being checked if mp_subscript
isn't found. Importantly, this is the function invoked by the BINARY_SUBSCR
opcode.

So, the *intent* certainly appears to be that mp_subscript should be preferred 
both by the C abstract object API and from normal Python code.

*However*, the precedence applied by add_operators() is governed by the 
slotdefs structure in typeobject.c, which, according to the above comment, is 
meant to match the order the slots appear in memory in the _typeobject 
structure in object.h, and favour the mapping methods over the sequence methods.

There are actually two serious problems with the description in this comment:

Firstly, the two orders don't actually match. In the object layout, the 
ordering of the abstract object methods is as follows:
	PyNumberMethods *tp_as_number;
	PySequenceMethods *tp_as_sequence;
	PyMappingMethods *tp_as_mapping;

But in the slotdefs table, the PySequence and PyMapping slots are listed 
first, followed by the PyNumber methods.

Secondly, in both the object layout and the slotdefs table, the PySequence 
methods appear *before* the PyMapping methods, which means that 
tp_as_sequence->sq_item appears as "__getitem__" even though a subscript 
operation will actually invoke "tp_as_mapping->mp_subscript".

In short, I think Travis is right in calling this behaviour a bug. There's a 
similar problem with the methods that exist in both tp_as_number and 
tp_as_sequence - the abstract C API and the Python interpreter will favour the
tp_as_number methods, but the slot definitions will favour tp_as_sequence.

The fix is actually fairly simple: reorder the slotdefs table so that the 
sequence of slots is "Number, Mapping, Sequence" rather than adhering strictly 
to the sequence of methods given in the definition of _typeobject.

The only objects affected by this change would be C extension objects which 
define two C-level methods which map to the same Python-level slot name. The 
observed behavioural change is that the methods accessible via the 
Python-level slot names would change (either from the Sequence method to the 
Mapping method, or from the Sequence method to the Number method).

Given that the only documentation I can find of the behaviour in that scenario 
is a comment in typeobject.c, that the implementation doesn't currently match 
the comment, and that the current implementation means that the methods 
accessed via the slot names don't match the methods normal Python syntax 
actually invokes, I find it hard to see how fixing it could cause any 
significant problems.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Sun Oct  2 05:27:54 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 02 Oct 2005 13:27:54 +1000
Subject: [Python-Dev] Why does __getitem__ slot of builtin call sequence
 methods first?
In-Reply-To: <433F52A7.3010700@gmail.com>
References: <433F1E2C.4050501@ee.byu.edu>	<ca471dc20510011741h279946dfod4d5e2eb166e353d@mail.gmail.com>
	<433F52A7.3010700@gmail.com>
Message-ID: <433F53BA.7030203@gmail.com>

Nick Coghlan wrote:
[A load of baloney]

Scratch everything I said in my last message - init_slotdefs() sorts the 
slotdefs table correctly, so that the order it is written in the source is 
irrelevant.

Travis found the real answer to his problem.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From kbk at shore.net  Sun Oct  2 05:44:44 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Sat, 01 Oct 2005 23:44:44 -0400
Subject: [Python-Dev] IDLE development
In-Reply-To: <b348a085050910165468abeb89@mail.gmail.com> (Noam Raphael's
	message of "Sun, 11 Sep 2005 02:54:08 +0300")
References: <b348a085050910165468abeb89@mail.gmail.com>
Message-ID: <87u0g0h9ar.fsf@hydra.bayview.thirdcreek.com>

Noam Raphael <noamraph at gmail.com> writes:

> More than a year and a half ago, I posted a big patch to IDLE which
> adds support for completion and much better calltips, along with some
> other improvements. 

I have responded on idle-dev.

-- 
KBK

From martin at v.loewis.de  Sun Oct  2 09:57:31 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 02 Oct 2005 09:57:31 +0200
Subject: [Python-Dev] Help needed with MSI permissions
Message-ID: <433F92EB.2040509@v.loewis.de>

I have various reports that the Python 2.4 installer does
not work if you are trying to install in a non-standard location
as a non-privileged user, e.g. #1298962, #1234328,
#1232947, #1199808.

Despite many attempts, I haven't been able to reproduce any
such problem, and the submitters weren't really able to experiment
much, either.

So, if anybody is able to reproduce any of these reports, and give
me instructions on how to reproduce it myself, that would be
very much appreciated.

Regards,
Martin

From martin at v.loewis.de  Sun Oct  2 21:07:02 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 02 Oct 2005 21:07:02 +0200
Subject: [Python-Dev] Tests and unicode
In-Reply-To: <dhmh0n$41n$1@sea.gmane.org>
References: <dhmh0n$41n$1@sea.gmane.org>
Message-ID: <43402FD6.7030905@v.loewis.de>

Reinhold Birkenfeld wrote:
> One problem is that no Unicode escapes can be used since compiling
> the file raises ValueErrors for them. Such strings would have to
> be produced using unichr().

You mean, in Unicode literals? There are various approaches, depending
on context:
- you could encode the literals as UTF-8, and decode it when the
  module/test case is imported. See test_support.TESTFN_UNICODE
  for an example.
- you could use unichr
- you could use eval, see test_re for an example
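
A short sketch of the three approaches listed above, assuming the goal is to
keep the source file free of Unicode escapes that the compiler sees directly.
The variable names are illustrative only, and the unichr fallback is there so
that the snippet also runs on a current Python 3 interpreter, where chr()
plays the same role.

```python
try:
    unichr            # Python 2 spelling
except NameError:
    unichr = chr      # Python 3: chr() already returns text

# Approach 1: keep UTF-8 encoded bytes in the source, decode at import time.
A_UMLAUT_UTF8 = b"\xc3\xa4"
from_utf8 = A_UMLAUT_UTF8.decode("utf-8")

# Approach 2: build the character from its code point.
from_ordinal = unichr(0xE4)

# Approach 3: hide the escape inside a string and compile it at run time.
from_eval = eval('u"\\xe4"')
```

All three produce the same one-character string; in a test suite each would
still need to be guarded so that it only runs when Unicode is available.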

> Is this the right way? Or is disabling Unicode not supported any more?

There are certainly tests that cannot be executed when Unicode is not
available. It would be good if such tests were skipped instead of
failing, and it would be good if all tests that do not require Unicode
support run even when Unicode support is missing.

Whether "it is supported" is a tricky question: your message indicates
that, right now, it is *not* supported (or else you wouldn't have
noticed a problem). Whether we think it should be supported depends
on who "we" is, as with all these minor features: some think it is
a waste of time, some think it should be supported if reasonably
possible, and some think this a conditio sine qua non. It certainly
isn't a release-critical feature.

Regards,
Martin

From martin at v.loewis.de  Sun Oct  2 21:52:14 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 02 Oct 2005 21:52:14 +0200
Subject: [Python-Dev] C API doc fix
In-Reply-To: <fb6fbf560509300900kc5bacfbh5d760f4fff90ee12@mail.gmail.com>
References: <fb6fbf560509300900kc5bacfbh5d760f4fff90ee12@mail.gmail.com>
Message-ID: <43403A6E.5080101@v.loewis.de>

Jim Jewett wrote:
> 
> ========
> Py_UNICODE
>      Python uses this type to store Unicode ordinals.  It is
> typically a typedef alias, but the underlying type -- and
> the size of that type -- varies across different systems.
> ========

I think I objected to such a formulation, requesting that the
precise procedure for choosing the alias be documented. I then
went on to say what the precise procedure is, and then an
argument about that procedure arose.

I still believe that the precise procedure should be documented
(in addition to saying that its outcome may vary across
installations).

Regards,
Martin

From mwh at python.net  Sun Oct  2 22:32:38 2005
From: mwh at python.net (Michael Hudson)
Date: Sun, 02 Oct 2005 21:32:38 +0100
Subject: [Python-Dev] Tests and unicode
In-Reply-To: <dhmh0n$41n$1@sea.gmane.org> (Reinhold Birkenfeld's message of
	"Sat, 01 Oct 2005 19:28:54 +0200")
References: <dhmh0n$41n$1@sea.gmane.org>
Message-ID: <2mk6gvfymx.fsf@starship.python.net>

Reinhold Birkenfeld <reinhold-birkenfeld-nospam at wolke7.net> writes:

> Hi,
>
> I looked whether I could make the test suite pass again
> when compiled with --disable-unicode.
>
> One problem is that no Unicode escapes can be used since compiling
> the file raises ValueErrors for them. Such strings would have to
> be produced using unichr().

Yeah, I've bumped into this.

> Is this the right way? Or is disabling Unicode not supported any more?

I don't know.  More particularly, I don't know if anyone actually uses
a unicode-disabled build.  If no one does, we might as well rip the
code out.

Cheers,
mwh

-- 
  Sufficiently advanced political correctness is indistinguishable
  from irony.                                           -- Erik Naggum

From mwh at python.net  Sun Oct  2 22:36:01 2005
From: mwh at python.net (Michael Hudson)
Date: Sun, 02 Oct 2005 21:36:01 +0100
Subject: [Python-Dev] Why does __getitem__ slot of builtin call sequence
 methods first?
In-Reply-To: <433F5140.8050705@ee.byu.edu> (Travis Oliphant's message of
	"Sat, 01 Oct 2005 21:17:20 -0600")
References: <433F1E2C.4050501@ee.byu.edu>
	<ca471dc20510011741h279946dfod4d5e2eb166e353d@mail.gmail.com>
	<433F5140.8050705@ee.byu.edu>
Message-ID: <2mfyrjfyha.fsf@starship.python.net>

Travis Oliphant <oliphant at ee.byu.edu> writes:

> Thanks for the tip.  I think I figured out the problem, and it was my 
> misunderstanding of how types inherit in C that was the source of my 
> problem.  
>
> Basically, Python is doing what you would expect: mp_subscript is used
> for __getitem__ if both mp_subscript and sq_item are present.  However, the
> addition of these descriptors (and therefore the resolution of any
> competition for __getitem__ calls) is done *before* the inheritance of
> any slots takes place.

Oof.  That'd do it.

> The new ndarray object inherits from a "big" array object that doesn't 
> define the sequence and buffer protocols (which have the size-limiting
> int dependency in their interfaces).   The ndarray object has standard
> tp_as_sequence and tp_as_buffer slots filled.  

I guess the reason this hasn't come up before is that non-trivial C
inheritance is still pretty rare.

> The easy fix was to initialize the tp_as_mapping slot before calling 
> PyType_Ready.    Hopefully, somebody else searching in the future for an 
> answer to their problem will find this discussion useful.  

Well, it sounds like a bug that should be easy to fix.  I can't think
of a reason to do slot wrapper generation before slot inheritance,
though I wouldn't like to bet more than a beer on not having missed
something...

Cheers,
mwh

-- 
  There are two kinds of large software systems: those that evolved
  from small systems and those that don't work.
                           -- Seen on slashdot.org, then quoted by amk

From blais at furius.ca  Sun Oct  2 23:49:51 2005
From: blais at furius.ca (Martin Blais)
Date: Sun, 2 Oct 2005 17:49:51 -0400
Subject: [Python-Dev] Pythonic concurrency - cooperative MT
In-Reply-To: <1766050860214964952@unknownmsgid>
References: <fb6fbf5605092915471cbb1189@mail.gmail.com>
	<fb6fbf5605092916053506b9aa@mail.gmail.com>
	<1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr>
	<8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com>
	<1766050860214964952@unknownmsgid>
Message-ID: <8393fff0510021449p3607898cp6dcecc615450b46b@mail.gmail.com>

On 10/1/05, Antoine <solipsis at pitrou.net> wrote:
>
> > like this with their "deferred objects", no?  I figure they would
> > need to do something like this too.  I will have to check.)
>
> A Deferred object is just the abstraction of a callback - or, rather, two
> callbacks: one for success and one for failure. Twisted is architected
> around an event loop, which calls your code back when a registered event
> happens (for example when an operation is finished, or when some data
> arrives on the wire). Compared to generators, it is a different way of
> expressing cooperative multi-threading.

So, the question is, in Twisted, if I want to defer on an operation
that is going to block, say I'm making a call to run a database query
that I'm expecting will take much time, and want to yield ("defer")
for other events to be processed while the query is executed, how do I
do that?  As far as I remember the Twisted docs I read a long time ago
did not provide a solution for that.

From radeex at gmail.com  Mon Oct  3 01:19:32 2005
From: radeex at gmail.com (Christopher Armstrong)
Date: Mon, 3 Oct 2005 09:19:32 +1000
Subject: [Python-Dev] Pythonic concurrency - cooperative MT
In-Reply-To: <8393fff0510021449p3607898cp6dcecc615450b46b@mail.gmail.com>
References: <fb6fbf5605092915471cbb1189@mail.gmail.com>
	<fb6fbf5605092916053506b9aa@mail.gmail.com>
	<1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr>
	<8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com>
	<1766050860214964952@unknownmsgid>
	<8393fff0510021449p3607898cp6dcecc615450b46b@mail.gmail.com>
Message-ID: <60ed19d40510021619p7f6e2641udf172a0d0d19283e@mail.gmail.com>

On 10/3/05, Martin Blais <blais at furius.ca> wrote:
> On 10/1/05, Antoine <solipsis at pitrou.net> wrote:
> >
> > > like this with their "deferred objects", no?  I figure they would
> > > need to do something like this too.  I will have to check.)
> >
> > A Deferred object is just the abstraction of a callback - or, rather, two
> > callbacks: one for success and one for failure. Twisted is architected
> > around an event loop, which calls your code back when a registered event
> > happens (for example when an operation is finished, or when some data
> > arrives on the wire). Compared to generators, it is a different way of
> > expressing cooperative multi-threading.
>
> So, the question is, in Twisted, if I want to defer on an operation
> that is going to block, say I'm making a call to run a database query
> that I'm expecting will take much time, and want to yield ("defer")
> for other events to be processed while the query is executed, how do I
> do that?  As far as I remember the Twisted docs I read a long time ago
> did not provide a solution for that.

Deferreds don't make blocking code non-blocking; they're just a way to
make it nicer to write non-blocking code. There are utilities in
Twisted for wrapping a blocking function call in a thread and having
the result returned in a Deferred, though (see deferToThread). There
is also a lightweight and complete wrapper for DB-API2 database
modules in twisted.enterprise.adbapi, which does the threading
interaction for you.

So, since this then exposes a non-blocking API, you can do stuff like

d = pool.runQuery('SELECT User_ID FROM Users')
d.addCallback(gotDBData)
d2 = ldapfoo.getUser('bob')
d2.addCallback(gotLDAPData)

And both the database call and the ldap request will be worked on concurrently.
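
The same "run the blocking call in a thread, get notified through a callback"
pattern can be sketched with the standard library alone. This is an analogy,
not Twisted's API: it uses concurrent.futures from later Python versions, the
two blocking functions are hypothetical stand-ins, and unlike Twisted the
callbacks here may run in the worker thread rather than in the event loop.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_query(sql):
    # Stand-in for a blocking DB-API call (hypothetical).
    time.sleep(0.05)
    return [("row for", sql)]


def blocking_ldap_lookup(name):
    # Stand-in for a blocking LDAP request (hypothetical).
    time.sleep(0.05)
    return {"uid": name}


results = {}


def got_db_data(future):
    results["db"] = future.result()


def got_ldap_data(future):
    results["ldap"] = future.result()


with ThreadPoolExecutor(max_workers=2) as pool:
    f1 = pool.submit(blocking_query, "SELECT User_ID FROM Users")
    f1.add_done_callback(got_db_data)
    f2 = pool.submit(blocking_ldap_lookup, "bob")
    f2.add_done_callback(got_ldap_data)
# Leaving the with-block waits for both calls, which ran concurrently.
```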

--
  Twisted   |  Christopher Armstrong: International Man of Twistery
   Radix    |    -- http://radix.twistedmatrix.com
            |  Release Manager, Twisted Project
  \\\V///   |    -- http://twistedmatrix.com
   |o O|    |
w----v----w-+

From blais at furius.ca  Mon Oct  3 07:53:52 2005
From: blais at furius.ca (Martin Blais)
Date: Mon, 3 Oct 2005 01:53:52 -0400
Subject: [Python-Dev] Pythonic concurrency - cooperative MT
In-Reply-To: <60ed19d40510021619p7f6e2641udf172a0d0d19283e@mail.gmail.com>
References: <fb6fbf5605092915471cbb1189@mail.gmail.com>
	<fb6fbf5605092916053506b9aa@mail.gmail.com>
	<1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr>
	<8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com>
	<1766050860214964952@unknownmsgid>
	<8393fff0510021449p3607898cp6dcecc615450b46b@mail.gmail.com>
	<60ed19d40510021619p7f6e2641udf172a0d0d19283e@mail.gmail.com>
Message-ID: <8393fff0510022253l6c37f3f8x6f5254018d6f348e@mail.gmail.com>

On 10/2/05, Christopher Armstrong <radeex at gmail.com> wrote:
> On 10/3/05, Martin Blais <blais at furius.ca> wrote:
> > On 10/1/05, Antoine <solipsis at pitrou.net> wrote:
> > >
> > > > like this with their "deferred objects", no?  I figure they would
> > > > need to do something like this too.  I will have to check.)
> > >
> > > A Deferred object is just the abstraction of a callback - or, rather, two
> > > callbacks: one for success and one for failure. Twisted is architected
> > > around an event loop, which calls your code back when a registered event
> > > happens (for example when an operation is finished, or when some data
> > > arrives on the wire). Compared to generators, it is a different way of
> > > expressing cooperative multi-threading.
> >
> > So, the question is, in Twisted, if I want to defer on an operation
> > that is going to block, say I'm making a call to run a database query
> > that I'm expecting will take much time, and want to yield ("defer")
> > for other events to be processed while the query is executed, how do I
> > do that?  As far as I remember the Twisted docs I read a long time ago
> > did not provide a solution for that.
>
> Deferreds don't make blocking code non-blocking; they're just a way to
> make it nicer to write non-blocking code. There are utilities in
> Twisted for wrapping a blocking function call in a thread and having
> the result returned in a Deferred, though (see deferToThread). There
> is also a lightweight and complete wrapper for DB-API2 database
> modules in twisted.enterprise.adbapi, which does the threading
> interaction for you.
>
> So, since this then exposes a non-blocking API, you can do stuff like
>
> d = pool.runQuery('SELECT User_ID FROM Users')
> d.addCallback(gotDBData)
> d2 = ldapfoo.getUser('bob')
> d2.addCallback(gotLDAPData)
>
> And both the database call and the ldap request will be worked on concurrently.

Very nice!

However, if you're using a thread to do just that, it's just using a
part of what threads were designed for: it's really just using the
low-level kernel knowledge about resource access and when they become
ready to wait on the resource, since you're not going to run much
actual code in the thread itself (apart from setting up to do the
blocking call and returning its value).

Now, if we had something in the language that allows us to do
something like that--make the most important potentially blocking
calls asynchronously-- we could implement a more complete scheduler
that could really leverage generators to create a more interesting
concurrency solution with less overhead.  For example, imagine that
some class of generators are used as tasks, like we were discussing
before.  When you would call the special yield_read() call (a
variation on e.g. os.read() call), there is an implicit yield that
allows other generators which are ready to run until the data is
available, without the overhead of

1. context switching to the helper threads and back;
2. synchronization for communication with the helper threads (I assume
threads would not be created dynamically, for efficiency.  I imagine
there is a pool of helpers waiting to do the async call jobs, and
communication with them to dispatch the call jobs does not come for
free (i.e. locking)).

We really don't need threads at all to do that (at least for the
common blocking calls), just some low-level support for building a
scheduler.  Using threads to do that has a cost, it is more or less a
kludge, in that context (but we have nothing better for now).
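
A rough sketch of what such a thread-free scheduler could look like, assuming
sockets as the only blocking resource. It uses the selectors module from later
Python versions; run() and reader() are hypothetical names, and the "implicit
yield" of the proposed yield_read() is written out here as an explicit yield
of the socket the task wants to read from.

```python
import selectors
import socket
from collections import deque


def run(tasks):
    """Run generator tasks; a task yields a socket to wait until it is readable."""
    sel = selectors.DefaultSelector()
    ready = deque(tasks)
    waiting = 0
    while ready or waiting:
        while not ready:
            # Nothing runnable: block until an awaited socket has data.
            for key, _ in sel.select():
                sel.unregister(key.fileobj)
                waiting -= 1
                ready.append(key.data)
        task = ready.popleft()
        try:
            sock = next(task)          # run the task up to its next wait point
        except StopIteration:
            continue
        sel.register(sock, selectors.EVENT_READ, task)
        waiting += 1
    sel.close()


log = []


def reader(name, sock, then_send=None):
    yield sock                         # wait here without blocking other tasks
    log.append((name, sock.recv(100)))
    if then_send is not None:
        then_send.sendall(b"pong")


a_r, a_w = socket.socketpair()
b_r, b_w = socket.socketpair()
b_w.sendall(b"ping")                   # only B has data at the start
run([reader("A", a_r), reader("B", b_r, then_send=a_w)])
for s in (a_r, a_w, b_r, b_w):
    s.close()
```

Task A is started first but finishes last: it stays parked in the selector
until B has run and written to A's socket, all in a single thread.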

cheers,

From blais at furius.ca  Mon Oct  3 08:09:13 2005
From: blais at furius.ca (Martin Blais)
Date: Mon, 3 Oct 2005 02:09:13 -0400
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
Message-ID: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>

Hi.

Like a lot of people (or so I hear in the blogosphere...), I've been
experiencing some friction in my code with unicode conversion
problems.  Even when being super extra careful with the types of str's
or unicode objects that my variables can contain, there is always some
case or oversight where something unexpected happens which results in
a conversion which triggers a decode error.  str.join() of a list of
strs, where one unicode object appears unexpectedly, and voila!
exception galore.  Sometimes the problem shows up late because your
test code doesn't always contain accented characters.  I'm sure many
of you experienced that or some variant at some point.

I came to realize recently that this problem shares strong similarity
with the problem of implicit type conversions in C++, or at least it
feels the same:  Stuff just happens implicitly, and it's hard to track
down where and when it happens by just looking at the code.  Part of
the problem is that the unicode object acts a lot like a str, which is
convenient, but...

What if we could completely disable the implicit conversions between
unicode and str?  In other words, if you would ALWAYS be forced to
call either .encode() or .decode() to convert between one and the
other... wouldn't that help a lot in dealing with that issue?

How hard would that be to implement?  Would it break a lot of code? 
Would some people want that?  (I know I would, at least for some of my
code.)  It seems to me that this would make the code more explicit and
force the programmer to become more aware of those conversions.  Any
opinions welcome.
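
A small sketch of what code looks like under such a strict separation. It
relies on the behaviour of Python 3, which appeared well after this message
and keeps bytes and text apart; join_names() is a hypothetical helper that
does the conversion explicitly at the boundary.

```python
def join_names(names, encoding="utf-8"):
    """Join a mix of bytes and text, decoding explicitly at the boundary."""
    decoded = [n.decode(encoding) if isinstance(n, bytes) else n for n in names]
    return ", ".join(decoded)


mixed = [b"Martin", u"L\u00f6wis"]

try:
    ", ".join(mixed)          # no implicit conversion: this fails loudly
except TypeError as exc:
    implicit_error = exc
else:
    implicit_error = None

joined = join_names(mixed)
```

The failure shows up on the first mixed call, whatever the data contains,
rather than only when an accented character happens to pass through.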

cheers,

From reinhold-birkenfeld-nospam at wolke7.net  Mon Oct  3 10:15:49 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Mon, 03 Oct 2005 10:15:49 +0200
Subject: [Python-Dev] Tests and unicode
In-Reply-To: <43402FD6.7030905@v.loewis.de>
References: <dhmh0n$41n$1@sea.gmane.org> <43402FD6.7030905@v.loewis.de>
Message-ID: <dhqpbm$efg$1@sea.gmane.org>

Martin v. Löwis wrote:
> Reinhold Birkenfeld wrote:
>> One problem is that no Unicode escapes can be used since compiling
>> the file raises ValueErrors for them. Such strings would have to
>> be produced using unichr().
> 
> You mean, in Unicode literals? There are various approaches, depending
> on context:
> - you could encode the literals as UTF-8, and decode it when the
>   module/test case is imported. See test_support.TESTFN_UNICODE
>   for an example.
> - you could use unichr
> - you could use eval, see test_re for an example

Okay. I can fix this, but several library modules must be fixed too (mostly
simple fixes), e.g. pickletools, gettext, doctest or encodings.

>> Is this the right way? Or is disabling Unicode not supported any more?
> 
> There are certainly tests that cannot be executed when Unicode is not
> available. It would be good if such tests were skipped instead of
> failing, and it would be good if all tests that do not require Unicode
> support run even when Unicode support is missing.

That's my approach too.

> Whether "it is supported" is a tricky question: your message indicates
> that, right now, it is *not* supported (or else you wouldn't have
> noticed a problem).

Well, the core builds without Unicode, and any code that doesn't use unicode
should run fine too. But the tests fail at the moment.

> Whether we think it should be supported depends
> on who "we" is, as with all these minor features: some think it is
> a waste of time, some think it should be supported if reasonably
> possible, and some think this a conditio sine qua non. It certainly
> isn't a release-critical feature.

Correct. I'll see if I have the time.

Reinhold

-- 
Mail address is perfectly valid!


From mwh at python.net  Mon Oct  3 10:43:06 2005
From: mwh at python.net (Michael Hudson)
Date: Mon, 03 Oct 2005 09:43:06 +0100
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
 conversions).
In-Reply-To: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>
	(Martin Blais's message of "Mon, 3 Oct 2005 02:09:13 -0400")
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>
Message-ID: <2mbr27f0th.fsf@starship.python.net>

Martin Blais <blais at furius.ca> writes:

> What if we could completely disable the implicit conversions between
> unicode and str?  In other words, if you would ALWAYS be forced to
> call either .encode() or .decode() to convert between one and the
> other... wouldn't that help a lot in dealing with that issue?

I don't know.  I've made one or two apps safe against this and it's
mostly just annoying.

> How hard would that be to implement?

import sys
reload(sys)
sys.setdefaultencoding('undefined')

> Would it break a lot of code?  Would some people want that?  (I know
> I would, at least for some of my code.)  It seems to me that this
> would make the code more explicit and force the programmer to become
> more aware of those conversions.  Any opinions welcome.

I'm not sure it's a sensible default.

Cheers,
mwh

-- 
  It is never worth a first class man's time to express a majority
  opinion.  By definition, there are plenty of others to do that.
                                                        -- G. H. Hardy

From mal at egenix.com  Mon Oct  3 11:43:13 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 03 Oct 2005 11:43:13 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <2mbr27f0th.fsf@starship.python.net>
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>
	<2mbr27f0th.fsf@starship.python.net>
Message-ID: <4340FD31.20202@egenix.com>

Michael Hudson wrote:
> Martin Blais <blais at furius.ca> writes:
> 
> 
>>What if we could completely disable the implicit conversions between
>>unicode and str?  In other words, if you would ALWAYS be forced to
>>call either .encode() or .decode() to convert between one and the
>>other... wouldn't that help a lot in dealing with that issue?
> 
> 
> I don't know.  I've made one or two apps safe against this and it's
> mostly just annoying.
>
>>How hard would that be to implement?
> 
> import sys
> reload(sys)
> sys.setdefaultencoding('undefined')

You shouldn't post tricks like these :-)

The correct way to change the default encoding is to
provide a sitecustomize.py module which then calls
sys.setdefaultencoding("undefined").

Note that the codec "undefined" was added for just this
reason.
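
To illustrate what the "undefined" codec does, here is a minimal sketch. It assumes a Python 3 interpreter, where sys.setdefaultencoding() no longer exists but the codec itself still ships with the standard encodings package. The helper name conversion_is_blocked is made up for this example.

```python
import codecs


def conversion_is_blocked(value):
    """Return True if pushing value through the 'undefined' codec raises."""
    try:
        if isinstance(value, bytes):
            value.decode("undefined")   # bytes -> text
        else:
            value.encode("undefined")   # text -> bytes
    except UnicodeError:
        return True
    return False


# The codec is registered under the name used in the thread.
print(codecs.lookup("undefined").name)

# Even pure-ASCII data is refused in both directions.
print(conversion_is_blocked(b"abc"))
print(conversion_is_blocked("abc"))
```

The point of the sketch is only that the codec rejects every conversion, which is what makes it usable as a "no implicit conversions" switch.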

>>Would it break a lot of code?  Would some people want that?  (I know
>>I would, at least for some of my code.)  It seems to me that this
>>would make the code more explicit and force the programmer to become
>>more aware of those conversions.  Any opinions welcome.
> 
> I'm not sure it's a sensible default.

Me neither, especially since this would make it impossible
to write polymorphic code - e.g. ', '.join(list) wouldn't
work anymore if list contains Unicode; ditto for u', '.join(list)
with list containing a string.
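
A rough sketch of what the join case looks like once implicit conversions are gone. It assumes Python 3, where text (str) and byte strings (bytes) never mix implicitly; join_text and its charset parameter are hypothetical names used only for illustration.

```python
def join_text(items, sep=", ", charset="utf-8"):
    """Join a mixed list after explicitly decoding any bytes items."""
    decoded = [
        item.decode(charset) if isinstance(item, bytes) else item
        for item in items
    ]
    return sep.join(decoded)


mixed = ["caf\u00e9", b"tea"]

# Without implicit conversion, the plain join refuses the mixed list.
try:
    ", ".join(mixed)
except TypeError as exc:
    print("implicit mixing refused:", type(exc).__name__)

# The caller has to state the charset to get a result.
print(join_text(mixed))
```

This is the trade-off under discussion: the error shows up at the join rather than later, but every such call site needs an explicit decode.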

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 30 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From mal at egenix.com  Mon Oct  3 13:49:20 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 03 Oct 2005 13:49:20 +0200
Subject: [Python-Dev] --disable-unicode (Tests and unicode)
In-Reply-To: <dhqpbm$efg$1@sea.gmane.org>
References: <dhmh0n$41n$1@sea.gmane.org> <43402FD6.7030905@v.loewis.de>
	<dhqpbm$efg$1@sea.gmane.org>
Message-ID: <43411AC0.6040000@egenix.com>

Reinhold Birkenfeld wrote:
> Martin v. Löwis wrote:
>>Whether we think it should be supported depends
>>on who "we" is, as with all these minor features: some think it is
>>a waste of time, some think it should be supported if reasonably
>>possible, and some think this a conditio sine qua non. It certainly
>>isn't a release-critical feature.
> 
> Correct. I'll see if I have the time.

Is the added complexity needed to support not having Unicode support
compiled into Python really worth it ?

I know that Martin introduced this feature a long time ago,
so he will have had a reason for it.

Today, I think the situation has changed: computers have more
memory, are faster and the need to integrate (e.g. via XML)
is stronger than ever - and maybe we should consider removing
the option to get a cleaner code base with fewer #ifdefs
and SyntaxErrors from the standard lib.

What do you think ?

-- 
Marc-Andre Lemburg
eGenix.com


From solipsis at pitrou.net  Mon Oct  3 14:32:48 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 03 Oct 2005 14:32:48 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicit	conversions).
In-Reply-To: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>
Message-ID: <1128342768.6138.114.camel@fsol>

On Monday 03 October 2005 at 02:09 -0400, Martin Blais wrote:
> 
> What if we could completely disable the implicit conversions between
> unicode and str?

This would be very annoying when dealing with some modules or libraries
where the type (str / unicode) returned by a function depends on the
context, build, or platform.

A good rule of thumb is to convert to unicode everything that is
semantically textual, and to only use str for what is to be semantically
treated as a string of bytes (network packets, identifiers...). This is
also, AFAIU, the semantic model which is favoured for a hypothetical
future version of Python.

This is what I'm using to do safe conversion to a given type without
worrying about the type of the argument:


DEFAULT_CHARSET = 'utf-8'

def safe_unicode(s, charset=None):
    """
    Forced conversion of a string to unicode, does nothing
    if the argument is already a unicode object.
    This function is useful because the .decode method
    on a unicode object, instead of being a no-op, tries to
    do a double conversion back and forth (which often fails
    because 'ascii' is the default codec).
    """
    if isinstance(s, str):
        return s.decode(charset or DEFAULT_CHARSET)
    else:
        return s

def safe_str(s, charset=None):
    """
    Forced conversion of a unicode object to a string, does nothing
    if the argument is already a plain str object.
    This function is useful because the .encode method
    on a str object, instead of being a no-op, tries to
    do a double conversion back and forth (which often fails
    because 'ascii' is the default codec).
    """
    if isinstance(s, unicode):
        return s.encode(charset or DEFAULT_CHARSET)
    else:
        return s
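
For comparison, a minimal sketch of the same two helpers under Python 3 naming, where the text type is str and the byte-string type is bytes. The names safe_text and safe_bytes are assumptions made for this example, not part of the code above.

```python
DEFAULT_CHARSET = "utf-8"


def safe_text(s, charset=None):
    """Decode bytes to text; return text objects unchanged."""
    if isinstance(s, bytes):
        return s.decode(charset or DEFAULT_CHARSET)
    return s


def safe_bytes(s, charset=None):
    """Encode text to bytes; return bytes objects unchanged."""
    if isinstance(s, str):
        return s.encode(charset or DEFAULT_CHARSET)
    return s


print(safe_text(b"\xc3\xa9t\xc3\xa9") == safe_text("\u00e9t\u00e9"))
print(safe_bytes("abc") == safe_bytes(b"abc"))
```

The design is the same as in the original: check the type first, so that an object already of the target type is never pushed through a codec.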


Good luck

Antoine.




From fredrik at pythonware.com  Mon Oct  3 14:59:44 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 Oct 2005 14:59:44 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>
	<1128342768.6138.114.camel@fsol>
Message-ID: <dhra00$tnv$1@sea.gmane.org>

Antoine Pitrou wrote:

> A good rule of thumb is to convert to unicode everything that is
> semantically textual

and isn't pure ASCII.

(anyone who is tempted to argue otherwise should benchmark their
applications, both speed- and memorywise, and be prepared to come
up with very strong arguments for why python programs shouldn't be
allowed to be fast and memory-efficient whenever they can...)

</F> 




From solipsis at pitrou.net  Mon Oct  3 15:26:55 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 03 Oct 2005 15:26:55 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicit	conversions).
In-Reply-To: <dhra00$tnv$1@sea.gmane.org>
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>
	<1128342768.6138.114.camel@fsol>  <dhra00$tnv$1@sea.gmane.org>
Message-ID: <1128346015.6138.149.camel@fsol>


On Monday 03 October 2005 at 14:59 +0200, Fredrik Lundh wrote:
> Antoine Pitrou wrote:
> 
> > A good rule of thumb is to convert to unicode everything that is
> > semantically textual
> 
> and isn't pure ASCII.

How can you be sure that something that is /semantically textual/ will
always remain "pure ASCII" ? That's contradictory, unless your software
never goes out of the anglo-saxon world (and even...).

> (anyone who are tempted to argue otherwise should benchmark their
> applications, both speed- and memorywise, and be prepared to come
> up with very strong arguments for why python programs shouldn't be
> allowed to be fast and memory-efficient whenever they can...)

I think most applications don't critically depend on text processing
performance. OTOH, international adaptability is the kind of thing
that /will/ bite you one day if you don't prepare for it at the
beginning.

Also, if necessary, the distinction could be an implementation detail
and the conversion be transparent (like int vs. long): the text would be
coded in an 8-bit charset as long as possible and converted to a wide
encoding only when necessary. The important thing is that these
optimisations, if they are necessary, should be transparently handled by
the Python runtime.
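
A toy sketch of that idea, in the spirit of the int vs. long analogy. CompactText is a hypothetical class written for this example; it says nothing about how any real Python runtime stores strings, and Latin-1 is picked as the 8-bit charset purely for illustration.

```python
class CompactText:
    """Text kept as Latin-1 bytes while possible, as a wide str otherwise."""

    def __init__(self, text):
        try:
            # One byte per character as long as every character fits.
            self._data = text.encode("latin-1")
        except UnicodeEncodeError:
            # Fall back to wide storage only when it is actually needed.
            self._data = text

    @property
    def is_compact(self):
        return isinstance(self._data, bytes)

    def __str__(self):
        if self.is_compact:
            return self._data.decode("latin-1")
        return self._data

    def __add__(self, other):
        # The widening happens here, invisibly to the caller.
        return CompactText(str(self) + str(other))


narrow = CompactText("abc")
wide = narrow + CompactText("\u20ac")   # EURO SIGN is not in Latin-1
print(narrow.is_compact, wide.is_compact)
```

Callers only ever see text; whether the storage is narrow or wide stays an implementation detail of the class.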

(it seems to me - I may be mistaken - that modern Windows versions treat
every string as 16-bit unicode internally. Why are they doing it if it
is that inefficient?)

Regards

Antoine.



From blais at furius.ca  Mon Oct  3 16:34:09 2005
From: blais at furius.ca (Martin Blais)
Date: Mon, 3 Oct 2005 10:34:09 -0400
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <4340FD31.20202@egenix.com>
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>
	<2mbr27f0th.fsf@starship.python.net> <4340FD31.20202@egenix.com>
Message-ID: <8393fff0510030734k12a9e032pf935979fe3579389@mail.gmail.com>

On 10/3/05, M.-A. Lemburg <mal at egenix.com> wrote:
> >
> > I'm not sure it's a sensible default.
>
> Me neither, especially since this would make it impossible
> to write polymorphic code - e.g. ', '.join(list) wouldn't
> work anymore if list contains Unicode; dito for u', '.join(list)
> with list containing a string.

Sounds like what you want is exactly what I want to avoid (for those
two types anyway).

cheers,

From fredrik at pythonware.com  Mon Oct  3 16:39:54 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 Oct 2005 16:39:54 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com><1128342768.6138.114.camel@fsol>
	<dhra00$tnv$1@sea.gmane.org> <1128346015.6138.149.camel@fsol>
Message-ID: <dhrfrr$i8c$1@sea.gmane.org>

Antoine Pitrou wrote:

> > > A good rule of thumb is to convert to unicode everything that is
> > > semantically textual
> >
> > and isn't pure ASCII.
>
> How can you be sure that something that is /semantically textual/ will
> always remain "pure ASCII" ?

"is" != "will always remain"

</F> 




From jim at zope.com  Mon Oct  3 16:49:44 2005
From: jim at zope.com (Jim Fulton)
Date: Mon, 03 Oct 2005 10:49:44 -0400
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicit	conversions).
In-Reply-To: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>
Message-ID: <43414508.9060807@zope.com>

Martin Blais wrote:
> Hi.
> 
> Like a lot of people (or so I hear in the blogosphere...), I've been
> experiencing some friction in my code with unicode conversion
> problems.  Even when being super extra careful with the types of str's
> or unicode objects that my variables can contain, there is always some
> case or oversight where something unexpected happens which results in
> a conversion which triggers a decode error.  str.join() of a list of
> strs, where one unicode object appears unexpectedly, and voila!
> exception galore.  Sometimes the problem shows up late because your
> test code doesn't always contain accented characters.  I'm sure many
> of you experienced that or some variant at some point.
> 
> I came to realize recently that this problem shares strong similarity
> with the problem of implicit type conversions in C++, or at least it
> feels the same:  Stuff just happens implicitly, and it's hard to track
> down where and when it happens by just looking at the code.  Part of
> the problem is that the unicode object acts a lot like a str, which is
> convenient, but...

I agree.  I think it was a mistake to implicitly convert mixed string
expressions to unicode.


> What if we could completely disable the implicit conversions between
> unicode and str?  In other words, if you would ALWAYS be forced to
> call either .encode() or .decode() to convert between one and the
> other... wouldn't that help a lot deal with that issue?

Perhaps.

> How hard would that be to implement? 

Not hard. We considered doing it for Zope 3, but ...

> Would it break a lot of code?

Yes.

> Would some people want that? 

No, I wouldn't want lots of code to break. ;)

> (I know I would, at least for some of my
> code.)  It seems to me that this would make the code more explicit and
> force the programmer to become more aware of those conversions.  Any
> opinions welcome.

I think it's too late to change this.  I wish it had been done
differently.  (OTOH, I'm very happy we have Unicode support, so
I'm not really complaining. :)

I'll note that this hasn't been that much of a problem for us in Zope.
We follow the strategy:

Antoine Pitrou wrote:
...
 > A good rule of thumb is to convert to unicode everything that is
 > semantically textual, and to only use str for what is to be semantically
 > treated as a string of bytes (network packets, identifiers...). This is
 > also, AFAIU, the semantic model which is favoured for a hypothetical
 > future version of Python.

This approach has worked pretty well for us.  Still, when there is a problem,
it's a real pain to debug because the error occurs too late, as you point
out.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From jim at zope.com  Mon Oct  3 16:51:38 2005
From: jim at zope.com (Jim Fulton)
Date: Mon, 03 Oct 2005 10:51:38 -0400
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicit	conversions).
In-Reply-To: <4340FD31.20202@egenix.com>
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>	<2mbr27f0th.fsf@starship.python.net>
	<4340FD31.20202@egenix.com>
Message-ID: <4341457A.8090805@zope.com>

M.-A. Lemburg wrote:
> Michael Hudson wrote:
> 
>>Martin Blais <blais at furius.ca> writes:
>>
>>
>>
>>>What if we could completely disable the implicit conversions between
>>>unicode and str?  In other words, if you would ALWAYS be forced to
>>>call either .encode() or .decode() to convert between one and the
>>>other... wouldn't that help a lot deal with that issue?
>>
>>
>>I don't know.  I've made one or two apps safe against this and it's
>>mostly just annoying.
>>
>>
>>>How hard would that be to implement?
>>
>>import sys
>>reload(sys)
>>sys.setdefaultencoding('undefined')
> 
> 
> You shouldn't post tricks like these :-)
> 
> The correct way to change the default encoding is by
> providing a sitecustomize.py module which then call the
> sys.setdefaultencoding("undefined").

This is a much more evil trick IMO, as it affects all Python code,
rather than a single program.

I would argue that it's evil to change the default encoding
in the first place, except in this case to disable implicit
encoding or decoding.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From fredrik at pythonware.com  Mon Oct  3 17:12:04 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 Oct 2005 17:12:04 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>	<2mbr27f0th.fsf@starship.python.net><4340FD31.20202@egenix.com>
	<4341457A.8090805@zope.com>
Message-ID: <dhrho4$q1o$1@sea.gmane.org>

Jim Fulton wrote:

> I would argue that it's evil to change the default encoding
> in the first place, except in this case to disable implicit
> encoding or decoding.

absolutely.  unfortunately, all attempts to add such information to the
sys module documentation seem to have failed...

(last time I tried, I seem to remember that someone argued that "it's
there, so it should be documented in a neutral fashion")

</F> 




From pjd at satori.za.net  Mon Oct  3 18:13:05 2005
From: pjd at satori.za.net (Piet Delport)
Date: Mon, 03 Oct 2005 18:13:05 +0200
Subject: [Python-Dev] Proposal for 2.5: Returning values from PEP 342
	enhanced generators
Message-ID: <43415891.1040804@satori.za.net>

PEP 255 ("Simple Generators") closes with:

> Q. Then why not allow an expression on "return" too?
>
> A. Perhaps we will someday.  In Icon, "return expr" means both "I'm
>    done", and "but I have one final useful value to return too, and
>    this is it".  At the start, and in the absence of compelling uses
>    for "return expr", it's simply cleaner to use "yield" exclusively
>    for delivering values.

Now that Python 2.5 has gained enhanced generators (multitudes rejoice!), I think
there is a compelling use for valued return statements in cooperative
multitasking code, of the kind:

def foo():
    Data = yield Client.read()
    [...]
    MoreData = yield Client.read()
    [...]
    return FinalResult

def bar():
    Result = yield foo()

For generators written in this style, "yield" means "suspend execution of the
current call until the requested result/resource can be provided", and
"return" regains its full conventional meaning of "terminate the current call
with a given result".

The simplest / most straightforward implementation would be for "return Foo"
to translate to "raise StopIteration, Foo". This is consistent with "return"
translating to "raise StopIteration", and does not break any existing
generator code.
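
A minimal sketch of how a caller could pick up such a value. It assumes Python 3.3 or later, where a valued "return" inside a generator is accepted and the value is carried on the StopIteration exception as its "value" attribute; read_twice and run are hypothetical names, and run is only a toy stand-in for a real scheduler.

```python
def read_twice():
    """Ask the driver for data twice, then return a combined result."""
    first = yield "read"
    second = yield "read"
    return first + second


def run(gen, replies):
    """Drive gen, feeding it replies; return the generator's final value."""
    try:
        next(gen)                 # advance to the first yield
        for reply in replies:
            gen.send(reply)       # resume with the requested data
    except StopIteration as stop:
        return stop.value         # the value given to "return"
    raise RuntimeError("generator did not finish")


print(run(read_twice(), [1, 2]))
```

Here "yield" hands a request to the driver and "return" ends the call with a result, which is the split of roles the proposal describes.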

(Another way to think about this change is that if a plain StopIteration means
"the iterator terminated", then a valued StopIteration, by extension, means
"the iterator terminated with the given value".)

Motivation by real-world example:

One system that could benefit from this change is Christopher Armstrong's
defgen.py[1] for Twisted, which he recently reincarnated (as newdefgen.py) to
use enhanced generators. The resulting code is much cleaner than before, and
closer to the conventional synchronous style of writing.

[1] the saga of which is summarized here:
    http://radix.twistedmatrix.com/archives/000114.html

However, because enhanced generators have no way to differentiate their
intermediate results from their "real" result, the current solution is a
somewhat confusing compromise: the last value yielded by the generator
implicitly becomes the result returned by the call. Thus, to return
something, in general, requires the idiom "yield Foo; return". If valued
returns are allowed, this would become "return Foo" (and the code implementing
defgen itself would probably end up simpler, as well).

From jcarlson at uci.edu  Mon Oct  3 18:35:34 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 03 Oct 2005 09:35:34 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicit	conversions).
In-Reply-To: <1128346015.6138.149.camel@fsol>
References: <dhra00$tnv$1@sea.gmane.org> <1128346015.6138.149.camel@fsol>
Message-ID: <20051003091416.9817.JCARLSON@uci.edu>


Antoine Pitrou <solipsis at pitrou.net> wrote:
> 
> On Monday 03 October 2005 at 14:59 +0200, Fredrik Lundh wrote:
> > Antoine Pitrou wrote:
> > 
> > > A good rule of thumb is to convert to unicode everything that is
> > > semantically textual
> > 
> > and isn't pure ASCII.
> 
> How can you be sure that something that is /semantically textual/ will
> always remain "pure ASCII" ? That's contradictory, unless your software
> never goes out of the anglo-saxon world (and even...).

Non-unicode text input widgets.  Works great.  Can be had with the ANSI
wxPython installation.

> (it seems to me - I may be mistaken - that modern Windows versions treat
> every string as 16-bit unicode internally. Why are they doing it if it
> is that inefficient?)

Because modern Windows supports all sorts of symbols which are necessary
for certain special English uses (Greek symbols for math, etc.), and
trying to provide all of them without simply using the unicode backend
that already serves all of the international "builds" (isn't it just a
language definition?) would be a waste of time/effort.

 - Josiah


From jason.orendorff at gmail.com  Mon Oct  3 18:37:22 2005
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Mon, 3 Oct 2005 12:37:22 -0400
Subject: [Python-Dev] PEP 343 and __with__
Message-ID: <bb8868b90510030937w645ba6f1qfa13bf5609cfcbc@mail.gmail.com>

I'm -1 on PEP 343.  It seems ...complex.  And even with all the
complexity, I *still* won't be able to type

    with self.lock: ...

which I submit is perfectly reasonable, clean, and clear.  Instead I
have to type

    with locking(self.lock): ...

where locking() is apparently either a new builtin, a standard library
function, or some 6-line contextmanager I have to write myself.
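
For reference, the "6-line contextmanager" could look roughly like the sketch below. It assumes the contextmanager decorator as it eventually shipped in the contextlib module; the name locking comes from the message above, everything else is illustrative.

```python
import threading
from contextlib import contextmanager


@contextmanager
def locking(lock):
    """Hold lock for the duration of the with block."""
    lock.acquire()
    try:
        yield
    finally:
        lock.release()   # released even if the block raises


lock = threading.Lock()
with locking(lock):
    held_inside = lock.locked()
held_after = lock.locked()
print(held_inside, held_after)
```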

So I have two suggestions.

1.  I didn't find any suggestion of a __with__() method in the
archives.  So I feel I should suggest it.  It would work just like
__iter__().

    class RLock:
        @contextmanager
        def __with__(self):
            self.acquire()
            try:
                yield
            finally:
                self.release()

__with__() always returns a new context manager object.  Just as with
iterators, a context manager object has "cm.__with__() is cm".

The 'with' statement would call __with__(), of course.

Optionally, the type constructor could magically apply @contextmanager
to __with__() if it's a generator, which is the usual case.  It looks
like it already does similar magic with __new__().  Perhaps this is
too cute though.

2.  More radical:  Let's get rid of __enter__() and __exit__().  The
only example in PEP 343 that uses them is Example 4, which exists only
to show that "there's more than one way to do it". It all seems fishy
to me.  Why not get rid of them and use only __with__()?  In this
scenario, Python would expect __with__() to return a coroutine (not to
say "iterator") that yields exactly once.

Then the "@contextmanager" decorator wouldn't be needed on __with__(),
and neither would any type constructor magic.

The only drawback I see is that context manager methods implemented in
C will work differently from those implemented in Python.  Since C
doesn't have coroutines, I imagine there would have to be enter() and
exit() slots.  Maybe this is a major design concern; I don't know.

My apologies if this is redundant or unwelcome at this date.

-j

From fredrik at pythonware.com  Mon Oct  3 18:42:07 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 Oct 2005 18:42:07 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
References: <dhra00$tnv$1@sea.gmane.org> <1128346015.6138.149.camel@fsol>
	<20051003091416.9817.JCARLSON@uci.edu>
Message-ID: <dhrn0v$d33$1@sea.gmane.org>

Josiah Carlson wrote:

> > > and isn't pure ASCII.
> >
> > How can you be sure that something that is /semantically textual/ will
> > always remain "pure ASCII" ? That's contradictory, unless your software
> > never goes out of the anglo-saxon world (and even...).
>
> Non-unicode text input widgets.  Works great.  Can be had with the ANSI
> wxPython installation.

You're both missing that Python is dynamically typed.  A single string source
doesn't have to return the same type of strings, as long as the objects it returns
are compatible with Python's string model and with each other.

Under the default encoding (and quite a few other encodings), that's true for
plain ascii strings and Unicode strings.  This is a good thing.

</F> 




From pje at telecommunity.com  Mon Oct  3 19:02:40 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 03 Oct 2005 13:02:40 -0400
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <bb8868b90510030937w645ba6f1qfa13bf5609cfcbc@mail.gmail.com>
Message-ID: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>

At 12:37 PM 10/3/2005 -0400, Jason Orendorff wrote:
>I'm -1 on PEP 343.  It seems ...complex.  And even with all the
>complexity, I *still* won't be able to type
>
>     with self.lock: ...
>
>which I submit is perfectly reasonable, clean, and clear.

Which is why it's proposed to add __enter__/__exit__ to locks, and somewhat 
more controversially, file objects.  (Guido objected on the basis that 
people might reuse the file object, but reusing a closed file object 
results in a sensible error message and so doesn't seem like a problem to me.)


>[snip]
>__with__() always returns a new context manager object.  Just as with
>iterators, a context manager object has "cm.__with__() is cm".
>
>The 'with' statement would call __with__(), of course.

You didn't offer any reasons why this would be useful and/or good.


>2.  More radical:  Let's get rid of __enter__() and __exit__().  The
>only example in PEP 343 that uses them is Example 4, which exists only
>to show that "there's more than one way to do it". It all seems fishy
>to me.  Why not get rid of them and use only __with__()?  In this
>scenario, Python would expect __with__() to return a coroutine (not to
>say "iterator") that yields exactly once.

Because this multiplies the difficulty of implementing context managers in 
C.  It's easy to define a pair of C methods for __enter__ and __exit__, but 
an iterator requires creating another class in C.  The yield-based syntax 
is just syntax sugar, not the essence of the proposal.


>The only drawback I see is that context manager methods implemented in
>C will work differently from those implemented in Python.  Since C
>doesn't have coroutines, I imagine there would have to be enter() and
>exit() slots.  Maybe this is a major design concern; I don't know.

Considering your argument that locks should be contextmanagers, it would 
seem like a good idea for C implementations to be easy.  :)


>My apologies if this is redundant or unwelcome at this date.

Since the PEP is accepted and has patches for both its implementation and a 
good part of its documentation, a major change like this would certainly 
need a better rationale.  If your idea was that __with__ would somehow make 
it easier for locks to be context managers, it's based on a flawed 
premise.  All that's required now is to have __enter__ and __exit__ call 
acquire() and release().  At this point, it's simply an open issue as to 
which stdlib objects will be context managers, and which will have helper 
functions or classes to serve as context managers.  The actual API used to 
implement them has little or no bearing on that issue.
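
A minimal sketch of that point, written in Python rather than C. ContextLock is a hypothetical wrapper class invented for this example; it only shows that __enter__ and __exit__ need do nothing more than call acquire() and release().

```python
import threading


class ContextLock:
    """Hypothetical lock wrapper usable directly in a with statement."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self):
        return self._lock.acquire()

    def release(self):
        self._lock.release()

    def locked(self):
        return self._lock.locked()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.release()
        return False   # do not swallow exceptions from the block


lock = ContextLock()
with lock:
    print(lock.locked())
print(lock.locked())
```

With this in place the spelling "with self.lock: ..." works without any helper function.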


From solipsis at pitrou.net  Mon Oct  3 19:39:57 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 03 Oct 2005 19:39:57 +0200
Subject: [Python-Dev] unifying str and unicode
In-Reply-To: <20051003091416.9817.JCARLSON@uci.edu>
References: <dhra00$tnv$1@sea.gmane.org> <1128346015.6138.149.camel@fsol>
	<20051003091416.9817.JCARLSON@uci.edu>
Message-ID: <1128361197.6138.212.camel@fsol>


Hi,

Josiah:
> > How can you be sure that something that is /semantically textual/ will
> > always remain "pure ASCII" ? That's contradictory, unless your software
> > never goes out of the anglo-saxon world (and even...).
> 
> Non-unicode text input widgets.

You didn't understand my statement.
I didn't mean:
  - how can you /technically enforce/ no unicode text at all
but:
  - how can you be sure that your users will never /want/ to enter some
    text that can't be represented with the current 8-bit charset?

Of course the answer to the latter is: you can't.


Fredrik:
> Under the default encoding (and quite a few other encodings), that's true for
> plain ascii strings and Unicode strings.

If I have a unicode string containing legal characters greater than
0x7F, and I pass it to a function which converts it to str, the
conversion fails.

If I have an 8-bit string containing legal non-ascii characters in it
(for example the name of a file as returned by the filesystem, which I
of course have no prior control on), and I give it to a function which
does an implicit conversion to unicode, the conversion fails.

Here is an example so that you really understand. I am under a French
locale (iso-8859-15), let's just try to enter a French word and see what
happens when converting to unicode:

-> As a string constant:

>>> s = "été"
>>> s
'\xe9t\xe9'
>>> u = unicode(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 0: ordinal not in range(128)

-> By asking for input:

>>> s = raw_input()
été
>>> s
'\xe9t\xe9'
>>> unicode(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 0: ordinal not in range(128)


It should work, but it fails miserably.
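
The same failure, and the explicit fix, can be sketched as follows. This assumes Python 3 syntax, where the byte string is written with a b prefix and decoded by hand; the variable names are illustrative.

```python
raw = b"\xe9t\xe9"   # the French word above, encoded as ISO-8859-15

# Decoding with the 'ascii' codec reproduces the error in the transcript.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error = str(exc)
    print(error)

# Naming the real charset makes the conversion succeed.
text = raw.decode("iso-8859-15")
print(text == "\u00e9t\u00e9")
```

The conversion only works once the program knows the charset of the input, which is exactly the information the implicit 'ascii' default does not have.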

In the current situation, if the programmer doesn't carefully plan for
these cases by manually managing conversions (which of course he can do
- but it's boring and bothersome - not to mention that many programmers
do not even understand the issue!), some users will see the program die
with a nasty exception, just because they happen to need a bit more than
the plain latin alphabet without diacritics.

(even the standard Python library is bitten: witness the weird
getcwd() / getcwdu() pair...)


I find it surprising that you claim there is no difficulty when
everything points to the contrary. See for example how often confused
developers ask for help on mailing-lists...

Regards

Antoine.



From mwh at python.net  Mon Oct  3 20:02:13 2005
From: mwh at python.net (Michael Hudson)
Date: Mon, 03 Oct 2005 19:02:13 +0100
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	(Phillip J. Eby's message of "Mon, 03 Oct 2005 13:02:40 -0400")
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
Message-ID: <2my85aeaxm.fsf@starship.python.net>

"Phillip J. Eby" <pje at telecommunity.com> writes:

> Since the PEP is accepted and has patches for both its implementation and a 
> good part of its documentation, a major change like this would certainly 
> need a better rationale.

Though given the amount of interest said patch has attracted (none at
all) perhaps no one cares very much and the proposal should be dropped.
Which would be a shame given the time I spent on it and all the hot
air here on python-dev...

Cheers,
mwh
(who still likes PEP 343 and doesn't particularly like Jason's
suggested changes).

-- 
  Gevalia is undrinkable low-octane see-through only slightly
  roasted bilge water. Compared to .us coffee it is quite
  drinkable.                                      -- Måns Nilsson, asr

From guido at python.org  Mon Oct  3 20:07:07 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 3 Oct 2005 11:07:07 -0700
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <2my85aeaxm.fsf@starship.python.net>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<2my85aeaxm.fsf@starship.python.net>
Message-ID: <ca471dc20510031107g16c78c9sf832e19244882961@mail.gmail.com>

For the record, I very much want PEPs 342 and 343 implemented. I
haven't had the time to look at the patch and don't expect to find the
time any time soon, but it's not for lack of desire to see this
feature implemented.

I don't like Jason's __with__ proposal and even less like his idea to
drop __enter__ and __exit__ (I think this would just make it harder to
provide efficient implementations in C).

I'm all for adding __enter__ and __exit__ to locks.

I'm even considering that it might be a good idea to add them to files.

For the record, here at Elemental we write a lot of Java code that
uses database connections in a pattern that would have greatly
benefited from a similar construct in Java. :)

--Guido

On 10/3/05, Michael Hudson <mwh at python.net> wrote:
> "Phillip J. Eby" <pje at telecommunity.com> writes:
>
> > Since the PEP is accepted and has patches for both its implementation and a
> > good part of its documentation, a major change like this would certainly
> > need a better rationale.
>
> Though given the amount of interest said patch has attracted (none at
> all) perhaps noone cares very much and the proposal should be dropped.
> Which would be a shame given the time I spent on it and all the hot
> air here on python-dev...
>
> Cheers,
> mwh
> (who still likes PEP 343 and doesn't particularly like Jason's
> suggested changes).
>
> --
>   Gevalia is undrinkable low-octane see-through only slightly
>   roasted bilge water. Compared to .us coffee it is quite
>   drinkable.                                      -- Måns Nilsson, asr


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fredrik at pythonware.com  Mon Oct  3 20:37:55 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 Oct 2005 20:37:55 +0200
Subject: [Python-Dev] unifying str and unicode
References: <dhra00$tnv$1@sea.gmane.org>
	<1128346015.6138.149.camel@fsol><20051003091416.9817.JCARLSON@uci.edu>
	<1128361197.6138.212.camel@fsol>
Message-ID: <dhrtq4$5j1$1@sea.gmane.org>

Antoine Pitrou wrote:

> > Under the default encoding (and quite a few other encodings), that's true for
> > plain ascii strings and Unicode strings.
>
> If I have an unicode string containing legal characters greater than
> 0x7F, and I pass it to a function which converts it to str, the
> conversion fails.

so?  if it does that, it's not unicode safe.  what does that have to do with
my argument (which is that you can safely mix ascii strings and unicode
strings, because that's how things were designed)?

> Here is an example so that you really understand.

I wrote the unicode type.  I do understand how it works.

</F>




From pje at telecommunity.com  Mon Oct  3 21:20:34 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 03 Oct 2005 15:20:34 -0400
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <2my85aeaxm.fsf@starship.python.net>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>

At 07:02 PM 10/3/2005 +0100, Michael Hudson wrote:
>"Phillip J. Eby" <pje at telecommunity.com> writes:
>
> > Since the PEP is accepted and has patches for both its implementation 
> and a
> > good part of its documentation, a major change like this would certainly
> > need a better rationale.
>
>Though given the amount of interest said patch has attracted (none at
>all)

Actually, I have been reading the patch and meant to comment on it.  I was 
perplexed by the odd stack behavior of the new opcode until I realized that 
it's try/finally that's weird.  :)  I was planning to look into whether 
that could be cleaned up as well, when I got distracted and didn't go back 
to it.


>  perhaps no one cares very much and the proposal should be dropped.

I care an awful lot, as 'with' is another framework-dissolving tool that 
makes it possible to do more things in library form, without needing to 
resort to template methods.  It also enables more context-sensitive 
programming, in that "global" states can be set and restored in a 
structured fashion.  It may take a while to feel the effects, but it's 
going to be a big improvement to Python, maybe as big as new-style classes, 
and certainly bigger than decorators.


From solipsis at pitrou.net  Mon Oct  3 21:37:22 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 03 Oct 2005 21:37:22 +0200
Subject: [Python-Dev] unifying str and unicode
In-Reply-To: <dhrtq4$5j1$1@sea.gmane.org>
References: <dhra00$tnv$1@sea.gmane.org> <1128346015.6138.149.camel@fsol>
	<20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol>
	<dhrtq4$5j1$1@sea.gmane.org>
Message-ID: <1128368242.6138.258.camel@fsol>


Hi,

On Monday 03 October 2005 at 20:37 +0200, Fredrik Lundh wrote:
> > If I have an unicode string containing legal characters greater than
> > 0x7F, and I pass it to a function which converts it to str, the
> > conversion fails.
> 
> so?  if it does that, it's not unicode safe.  
[...]
> what's that has to do with
> my argument (which is that you can safely mix ascii strings and unicode
> strings, because that's how things were designed).

If that's how things were designed, then Python's entire standard
library (not to mention third-party libraries) is not "unicode safe" -
to quote your own words - since many functions may return 8-bit strings
containing non-ascii characters.

There lies the problem for many people, until the stdlib is fixed - or
until the string types are changed. That's why you very regularly see
people complaining about how conversions sometimes break their code in
various ways.

Anyway, I don't think we will reach an agreement here. We have different
expectations w.r.t. how the programming language may/should handle
general text. I propose we end the discussion.

Regards

Antoine.



From fredrik at pythonware.com  Mon Oct  3 21:47:19 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 3 Oct 2005 21:47:19 +0200
Subject: [Python-Dev] unifying str and unicode
References: <dhra00$tnv$1@sea.gmane.org>
	<1128346015.6138.149.camel@fsol><20051003091416.9817.JCARLSON@uci.edu>
	<1128361197.6138.212.camel@fsol><dhrtq4$5j1$1@sea.gmane.org>
	<1128368242.6138.258.camel@fsol>
Message-ID: <dhs1sa$i8l$1@sea.gmane.org>

Antoine Pitrou wrote:

> > > If I have an unicode string containing legal characters greater than
> > > 0x7F, and I pass it to a function which converts it to str, the
> > > conversion fails.
> >
> > so?  if it does that, it's not unicode safe.
> [...]
> > what's that has to do with
> > my argument (which is that you can safely mix ascii strings and unicode
> > strings, because that's how things were designed).
>
> If that's how things were designed, then Python's entire standard
> library (not to mention third-party libraries) is not "unicode safe" -
> to quote your own words - since many functions may return 8-bit strings
> containing non-ascii characters.

huh?  first you talk about functions that convert unicode strings to 8-bit
strings, now you talk about functions that return raw 8-bit strings?  and
all this in response to a post that argues that it's in fact a good idea to
use plain strings to hold textual data that happens to contain ASCII only,
because 1) it works, by design, and 2) it's almost always more efficient.

if you don't know what your own argument is, you cannot expect anyone
to understand it.

</F>




From martin at v.loewis.de  Mon Oct  3 22:32:11 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 03 Oct 2005 22:32:11 +0200
Subject: [Python-Dev] --disable-unicode (Tests and unicode)
In-Reply-To: <43411AC0.6040000@egenix.com>
References: <dhmh0n$41n$1@sea.gmane.org>
	<43402FD6.7030905@v.loewis.de>	<dhqpbm$efg$1@sea.gmane.org>
	<43411AC0.6040000@egenix.com>
Message-ID: <4341954B.6050300@v.loewis.de>

M.-A. Lemburg wrote:
> Is the added complexity needed to support not having Unicode support
> compiled into Python really worth it ?

If there are volunteers willing to maintain it, and the other volunteers
are not affected: certainly.

> I know that Martin introduced this feature a long time ago,
> so he will have had a reason for it.

I added it because users requested it. I personally never use it.

> Today, I think the situation has changed: computers have more
> memory, are faster and the need to integrate (e.g. via XML)
> is stronger than ever - and maybe we should consider removing
> the option to get a cleaner code base with fewer #ifdefs
> and SyntaxErrors from the standard lib.
> 
> What do you think ?

-0 for just ripping it out. +0 if PEP 5 is followed, at least
in spirit (i.e. give users advance warning to let them protest).

I guess users in embedded builds (either in embedded systems,
or embedding Python into some other application) might still
be interested in the feature. Of course, these users could either
recreate the feature if we remove it, or just stay with
Python 2.4.

Regards,
Martin

From solipsis at pitrou.net  Mon Oct  3 22:38:19 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 03 Oct 2005 22:38:19 +0200
Subject: [Python-Dev] unifying str and unicode
In-Reply-To: <dhs1sa$i8l$1@sea.gmane.org>
References: <dhra00$tnv$1@sea.gmane.org> <1128346015.6138.149.camel@fsol>
	<20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol>
	<dhrtq4$5j1$1@sea.gmane.org> <1128368242.6138.258.camel@fsol>
	<dhs1sa$i8l$1@sea.gmane.org>
Message-ID: <1128371900.6138.299.camel@fsol>


> > If that's how things were designed, then Python's entire standard
> > library (not to mention third-party libraries) is not "unicode safe" -
> > to quote your own words - since many functions may return 8-bit strings
> > containing non-ascii characters.
> 
> huh?  first you talk about functions that convert unicode strings to 8-bit
> strings, now you talk about functions that return raw 8-bit strings?

Are you deliberately missing the argument?
And can't you understand that conversions are problematic in both
directions (str -> unicode /and/ unicode -> str)?

If an stdlib function returns an 8-bit string containing non-ascii data,
then this string used in unicode context incurs an implicit conversion,
which fails. How's that for "unicode safety" of stdlib functions? Will
you argue that this gives no difficulties to anyone ?
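A minimal sketch of that failure mode, written for present-day Python 3
(an assumption; in the Python 2 under discussion the ASCII decode happens
implicitly when a str meets a unicode object). The helper name
mix_like_python2 is made up for illustration:

```python
def mix_like_python2(text, raw):
    """Mimic Python 2's `unicode + str` by decoding the 8-bit operand as ASCII."""
    # Raises UnicodeDecodeError as soon as `raw` holds a byte above 0x7F.
    return text + raw.decode("ascii")


# Pure-ASCII bytes mix without trouble.
ok = mix_like_python2("abc", b"def")

# An 8-bit string carrying Latin-1 "caf\xe9" breaks the conversion.
try:
    mix_like_python2("abc", b"caf\xe9")
    failed = False
except UnicodeDecodeError:
    failed = True
```

The point is only that the failure depends on the data, not on the code
path, which is why it tends to show up late.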


> all this in response to a post that argues that it's in fact a good idea to
> use plain strings to hold textual data that happens to contain ASCII only,

To which you apparently didn't read my answer, that is:
you can never be sure that a variable containing something which
is /semantically/ textual (*) will never contain anything other than
ASCII text. For example raw_input() won't tell you that its 8-bit string
result contains some chars > 0x7F. Same for many other library
functions. How do you cope with (more or less occasional) non-ascii data
coming in as 8-bit strings?

(*) that is, contains some natural language

Either you carefully plan for non-ascii text coming in your application
(including workarounds against Python's ascii-by-default conversion
policy), or you deliberately cripple your application by deciding that
non-ASCII text is forbidden in (some or all) places. Choose the latter
and you'll be hostile to users.

And this thread began with a poster who found the way implicit
conversions happen in Python difficult. So it's very funny that you deny the
existence of a problem for certain developers.


Antoine.



From mal at egenix.com  Mon Oct  3 22:52:04 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 03 Oct 2005 22:52:04 +0200
Subject: [Python-Dev] --disable-unicode (Tests and unicode)
In-Reply-To: <4341954B.6050300@v.loewis.de>
References: <dhmh0n$41n$1@sea.gmane.org>	<43402FD6.7030905@v.loewis.de>	<dhqpbm$efg$1@sea.gmane.org>	<43411AC0.6040000@egenix.com>
	<4341954B.6050300@v.loewis.de>
Message-ID: <434199F4.3010905@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>Is the added complexity needed to support not having Unicode support
>>compiled into Python really worth it ?
> 
> If there are volunteers willing to maintain it, and the other volunteers
> are not affected: certainly.

No objections there. I only see that --disable-unicode
has already been broken a couple of times in the past
and no-one (except those running test suites regularly)
really noticed - at least not AFAIK.

>>I know that Martin introduced this feature a long time ago,
>>so he will have had a reason for it.
> 
> I added it because users requested it. I personally never use it.
> 
>>Today, I think the situation has changed: computers have more
>>memory, are faster and the need to integrate (e.g. via XML)
>>is stronger than ever - and maybe we should consider removing
>>the option to get a cleaner code base with fewer #ifdefs
>>and SyntaxErrors from the standard lib.
>>
>>What do you think ?
> 
> -0 for just ripping it out. +0 if PEP 5 is followed, at least
> in spirit (i.e. give users advance warning to let them protest).
> 
> I guess users in embedded builds (either in embedded systems,
> or embedding Python into some other application) might still
> be interested in the feature. Of course, these users could either
> recreate the feature if we remove it, or just stay with
> Python 2.4.

If embedded build users rely on it, I'd suggest that these
users take over maintenance of the patch set.

Let's add a note to the configure switch that the feature will
be removed in 2.6 and see what happens.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 30 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From pje at telecommunity.com  Mon Oct  3 22:56:34 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 03 Oct 2005 16:56:34 -0400
Subject: [Python-Dev] unifying str and unicode
In-Reply-To: <1128371900.6138.299.camel@fsol>
References: <dhs1sa$i8l$1@sea.gmane.org> <dhra00$tnv$1@sea.gmane.org>
	<1128346015.6138.149.camel@fsol>
	<20051003091416.9817.JCARLSON@uci.edu>
	<1128361197.6138.212.camel@fsol> <dhrtq4$5j1$1@sea.gmane.org>
	<1128368242.6138.258.camel@fsol> <dhs1sa$i8l$1@sea.gmane.org>
Message-ID: <5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com>

At 10:38 PM 10/3/2005 +0200, Antoine Pitrou wrote:
>To which you apparently didn't read my answer, that is:
>you can never be sure that a variable containing something which
>is /semantically/ textual (*) will never contain anything other than
>ASCII text. For example raw_input() won't tell you that its 8-bit string
>result contains some chars > 0x7F. Same for many other library
>functions. How do you cope with (more or less occasional) non-ascii data
>coming in as 8-bit strings?

Presumably in Python 3.0, opening a file in "text" mode will require an 
encoding to be specified, and opening it in "binary" mode will cause it to 
produce or consume byte arrays, not strings.  This should apply to sockets 
too, and really any I/O facility, including GUI frameworks, DBAPI objects, 
os.listdir(), etc.

Of course, to get there we really need to add a convenient bytes type, 
perhaps by enhancing the current 'array' module.  It'd be nice to have a 
way to get this in 2.x versions so people can start fixing stuff to work 
the right way.  With no 8-bit strings coming in, there should be no 
unicode/str problems except those you create yourself.
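A small sketch of the split described above, using the open() of
present-day Python 3 (an assumption; there the encoding argument is
accepted for text mode but not strictly required). The file name and
payload are made up for illustration:

```python
import os
import tempfile

payload = "na\u00efve caf\u00e9"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")

    # Text mode: the caller names the encoding and gets text back.
    with open(path, "w", encoding="utf-8") as f:
        f.write(payload)
    with open(path, "r", encoding="utf-8") as f:
        as_text = f.read()

    # Binary mode: the same file yields raw bytes, with no decoding at all.
    with open(path, "rb") as f:
        as_bytes = f.read()
```

Here as_text is a text string and as_bytes is a byte sequence; turning
one into the other always requires naming an encoding.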


From solipsis at pitrou.net  Mon Oct  3 22:59:02 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 03 Oct 2005 22:59:02 +0200
Subject: [Python-Dev] bytes type
In-Reply-To: <5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com>
References: <dhs1sa$i8l$1@sea.gmane.org> <dhra00$tnv$1@sea.gmane.org>
	<1128346015.6138.149.camel@fsol> <20051003091416.9817.JCARLSON@uci.edu>
	<1128361197.6138.212.camel@fsol> <dhrtq4$5j1$1@sea.gmane.org>
	<1128368242.6138.258.camel@fsol> <dhs1sa$i8l$1@sea.gmane.org>
	<5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com>
Message-ID: <1128373142.6138.301.camel@fsol>


> Presumably in Python 3.0, opening a file in "text" mode will require an 
> encoding to be specified, and opening it in "binary" mode will cause it to 
> produce or consume byte arrays, not strings.  This should apply to sockets 
> too, and really any I/O facility, including GUI frameworks, DBAPI objects, 
> os.listdir(), etc.

Great :)

> Of course, to get there we really need to add a convenient bytes type, 
> perhaps by enhancing the current 'array' module.  It'd be nice to have a 
> way to get this in 2.x versions so people can start fixing stuff to work 
> the right way.

Could the "bytes" type be just the same as the current "str" type but
without the implicit unicode conversion ? Or am I missing some desired
functionality ?




From guido at python.org  Mon Oct  3 23:02:37 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 3 Oct 2005 14:02:37 -0700
Subject: [Python-Dev] bytes type
In-Reply-To: <1128373142.6138.301.camel@fsol>
References: <dhra00$tnv$1@sea.gmane.org> <1128346015.6138.149.camel@fsol>
	<20051003091416.9817.JCARLSON@uci.edu>
	<1128361197.6138.212.camel@fsol> <dhrtq4$5j1$1@sea.gmane.org>
	<1128368242.6138.258.camel@fsol> <dhs1sa$i8l$1@sea.gmane.org>
	<5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com>
	<1128373142.6138.301.camel@fsol>
Message-ID: <ca471dc20510031402k301dcd77y7672d83a79fa0281@mail.gmail.com>

On 10/3/05, Antoine Pitrou <solipsis at pitrou.net> wrote:
> Could the "bytes" type be just the same as the current "str" type but
> without the implicit unicode conversion ? Or am I missing some desired
> functionality ?

No. It will be a mutable array of bytes. It will intentionally
resemble strings as little as possible. There won't be a literal for
it.

But you will be able to convert between bytes and strings quite easily
by specifying an encoding.
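As a rough illustration only: the type described here had not been
designed yet, so the sketch below uses the bytearray of present-day
Python 3 as a stand-in (an assumption, not the proposed type itself):

```python
# Build a mutable byte buffer from text by naming an encoding.
buf = bytearray("caf\u00e9", "utf-8")

# Elements are small integers, not one-character strings.
first = buf[0]

# In-place mutation, which a string would not allow.
buf[0] = ord("C")

# Back to text, again by naming the encoding.
text = buf.decode("utf-8")
```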

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jason.orendorff at gmail.com  Mon Oct  3 23:15:26 2005
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Mon, 3 Oct 2005 17:15:26 -0400
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
Message-ID: <bb8868b90510031415x711dc703u72f2612bcda3e457@mail.gmail.com>

Phillip J. Eby writes:
> You didn't offer any reasons why this would be useful and/or good.

It makes it dramatically easier to write Python classes that correctly
support 'with'.  I don't see any simple way to do this under PEP 343;
the only sane thing to do is write a separate @contextmanager
generator, as all of the examples do.

Consider:

    # decimal.py
    class Context:
        ...
        def __enter__(self):
            ???
        def __exit__(self, t, v, tb):
            ???

    DefaultContext = Context(...)

Kindly implement __enter__() and __exit__().  Make sure your
implementation is thread-safe (not easy, even though
decimal.getcontext/.setcontext are thread-safe!).  Also make sure it
supports nested 'with DefaultContext:' blocks (I don't mean lexically
nested, of course; I mean nested at runtime.)

The answer requires thread-local storage and a separate stack of saved
context objects per thread.  It seems a little ridiculous to me.
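A rough sketch of what such an answer might look like, assuming
present-day Python 3 with the standard decimal and threading modules.
The class name SwapContext is hypothetical; it wraps a decimal.Context
instead of modifying decimal.py itself:

```python
import decimal
import threading


class SwapContext:
    """Hypothetical wrapper: installs a decimal context for a with-block."""

    def __init__(self, ctx):
        self._ctx = ctx
        # Each thread gets its own stack of saved contexts.
        self._saved = threading.local()

    def __enter__(self):
        stack = getattr(self._saved, "stack", None)
        if stack is None:
            stack = self._saved.stack = []
        # Remember whatever was active so nested blocks unwind correctly.
        stack.append(decimal.getcontext())
        decimal.setcontext(self._ctx)
        return self._ctx

    def __exit__(self, exc_type, exc, tb):
        decimal.setcontext(self._saved.stack.pop())
        return False  # never swallow exceptions


a = SwapContext(decimal.Context(prec=5))
b = SwapContext(decimal.Context(prec=9))

before = decimal.getcontext().prec
seen = []
with a:
    seen.append(decimal.getcontext().prec)
    with b:
        seen.append(decimal.getcontext().prec)
        with a:  # re-entered at runtime while already active
            seen.append(decimal.getcontext().prec)
        seen.append(decimal.getcontext().prec)
    seen.append(decimal.getcontext().prec)
after = decimal.getcontext().prec
```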

Whereas:

    class Context:
        ...
        def __with__(self):
            old = decimal.getcontext()
            decimal.setcontext(self)
            try:
                yield
            finally:
                decimal.setcontext(old)

As for the second proposal, I was thinking we'd have one mental model
for context managers (block template generators), rather than two
(generators vs. enter/exit methods).  Enter/exit seemed superfluous,
given the examples in the PEP.

> [T]his multiplies the difficulty of implementing context managers in C.

Nonsense.

    static PyObject *
    lock_with(PyObject *self)
    {
        return PyContextManager_FromCFunctions(self, lock_acquire,
                                               lock_release);
    }

There probably ought to be such an API even if my suggestion is in
fact garbage (as, admittedly, still seems the most likely thing).

Cheers,
-j

From martin at v.loewis.de  Mon Oct  3 23:28:46 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 03 Oct 2005 23:28:46 +0200
Subject: [Python-Dev] unifying str and unicode
In-Reply-To: <1128371900.6138.299.camel@fsol>
References: <dhra00$tnv$1@sea.gmane.org>
	<1128346015.6138.149.camel@fsol>	<20051003091416.9817.JCARLSON@uci.edu>
	<1128361197.6138.212.camel@fsol>	<dhrtq4$5j1$1@sea.gmane.org>
	<1128368242.6138.258.camel@fsol>	<dhs1sa$i8l$1@sea.gmane.org>
	<1128371900.6138.299.camel@fsol>
Message-ID: <4341A28E.7060506@v.loewis.de>

Antoine Pitrou wrote:
> To which you apparently didn't read my answer, that is:
> you can never be sure that a variable containing something which
> is /semantically/ textual (*) will never contain anything other than
> ASCII text.

That is simply not true. There are variables that are semantically
textual, where I can be sure that the value is a byte string only if it
consists purely of ASCII.

For example, if you invoke a Tkinter function, it will return a byte
string if the result is purely ASCII, else return a Unicode string.
This is an interface guarantee, hence I can be sure.

Regards,
Martin

From blais at furius.ca  Mon Oct  3 23:35:40 2005
From: blais at furius.ca (Martin Blais)
Date: Mon, 3 Oct 2005 17:35:40 -0400
Subject: [Python-Dev] unifying str and unicode
In-Reply-To: <1128371900.6138.299.camel@fsol>
References: <dhra00$tnv$1@sea.gmane.org> <1128346015.6138.149.camel@fsol>
	<20051003091416.9817.JCARLSON@uci.edu>
	<1128361197.6138.212.camel@fsol> <dhrtq4$5j1$1@sea.gmane.org>
	<1128368242.6138.258.camel@fsol> <dhs1sa$i8l$1@sea.gmane.org>
	<1128371900.6138.299.camel@fsol>
Message-ID: <8393fff0510031435n7ef19cbcg297b8881d75d0a08@mail.gmail.com>

On 10/3/05, Antoine Pitrou <solipsis at pitrou.net> wrote:
>
> > > If that's how things were designed, then Python's entire standard
> > > library (not to mention third-party libraries) is not "unicode safe" -
> > > to quote your own words - since many functions may return 8-bit strings
> > > containing non-ascii characters.
> >
> > huh?  first you talk about functions that convert unicode strings to 8-bit
> > strings, now you talk about functions that return raw 8-bit strings?
>
> Are you deliberately missing the argument?
> And can't you understand that conversions are problematic in both
> directions (str -> unicode /and/ unicode -> str)?

Both directions are a problem.

Just a note: it's not so much the conversions that I find problematic,
but rather the implicit nature of the conversions (combined with the
fact that they may fail).  In addition to being difficult to track
down, these implicit conversions may be costing processing time as
well.

cheers,

From pje at telecommunity.com  Tue Oct  4 01:26:39 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 03 Oct 2005 19:26:39 -0400
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <bb8868b90510031415x711dc703u72f2612bcda3e457@mail.gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051003191825.01f8be98@mail.telecommunity.com>

At 05:15 PM 10/3/2005 -0400, Jason Orendorff wrote:
>Phillip J. Eby writes:
> > You didn't offer any reasons why this would be useful and/or good.
>
>It makes it dramatically easier to write Python classes that correctly
>support 'with'.  I don't see any simple way to do this under PEP 343;
>the only sane thing to do is write a separate @contextmanager
>generator, as all of the examples do.

Wha?  For locks (the example you originally gave), this is trivial.


>Consider:
>
>     # decimal.py
>     class Context:
>         ...
>         def __enter__(self):
>             ???
>         def __exit__(self, t, v, tb):
>             ???
>
>     DefaultContext = Context(...)
>
>Kindly implement __enter__() and __exit__().  Make sure your
>implementation is thread-safe (not easy, even though
>decimal.getcontext/.setcontext are thread-safe!).  Also make sure it
>supports nested 'with DefaultContext:' blocks (I don't mean lexically
>nested, of course; I mean nested at runtime.)
>
>The answer requires thread-local storage and a separate stack of saved
>context objects per thread.  It seems a little ridiculous to me.

Okay, it was completely non-obvious from your post that this was the 
problem you're trying to solve.


>Whereas:
>
>     class Context:
>         ...
>         def __with__(self):
>             old = decimal.getcontext()
>             decimal.setcontext(self)
>             try:
>                 yield
>             finally:
>                 decimal.setcontext(old)

This could also be done with a Context.replace() @contextmanager method.
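A minimal sketch of that alternative, assuming the decorator is
importable as contextlib.contextmanager (as it is in present-day
Python). The free function replaced() is a hypothetical stand-in for a
Context.replace() method:

```python
import decimal
from contextlib import contextmanager


@contextmanager
def replaced(ctx):
    """Hypothetical helper: make `ctx` the current decimal context for a block."""
    old = decimal.getcontext()
    decimal.setcontext(ctx)
    try:
        yield ctx
    finally:
        # Runs on normal exit and on exceptions alike.
        decimal.setcontext(old)


before = decimal.getcontext().prec
with replaced(decimal.Context(prec=7)):
    inside = decimal.getcontext().prec
after = decimal.getcontext().prec
```

Because the saved context lives in the generator's own frame, each
runtime entry keeps its own "old" value without any explicit stack.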

On the whole, I'm torn.  I definitely like the additional flexibility this 
gives.  On the other hand, it seems to me that __with__ and the additional 
C baggage violates the "if the implementation is hard to explain" 
rule.  Also, people have already put a lot of effort into implementation 
and documentation patches based on an accepted PEP.  That's not enough to 
override "the right thing to do", especially if it comes with a volunteer 
willing to update the work, but in this case the amount of additional 
goodness seems small, and it's not immediately apparent that you're 
volunteering to help change this even if Guido blessed it.


From mal at egenix.com  Tue Oct  4 01:35:57 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 04 Oct 2005 01:35:57 +0200
Subject: [Python-Dev] unifying str and unicode
In-Reply-To: <8393fff0510031435n7ef19cbcg297b8881d75d0a08@mail.gmail.com>
References: <dhra00$tnv$1@sea.gmane.org>
	<1128346015.6138.149.camel@fsol>	<20051003091416.9817.JCARLSON@uci.edu>	<1128361197.6138.212.camel@fsol>
	<dhrtq4$5j1$1@sea.gmane.org>	<1128368242.6138.258.camel@fsol>
	<dhs1sa$i8l$1@sea.gmane.org>	<1128371900.6138.299.camel@fsol>
	<8393fff0510031435n7ef19cbcg297b8881d75d0a08@mail.gmail.com>
Message-ID: <4341C05D.5000706@egenix.com>

Martin Blais wrote:
> On 10/3/05, Antoine Pitrou <solipsis at pitrou.net> wrote:
> 
>>>>If that's how things were designed, then Python's entire standard
>>>>library (not to mention third-party libraries) is not "unicode safe" -
>>>>to quote your own words - since many functions may return 8-bit strings
>>>>containing non-ascii characters.
>>>
>>>huh?  first you talk about functions that convert unicode strings to 8-bit
>>>strings, now you talk about functions that return raw 8-bit strings?
>>
>>Are you deliberately missing the argument?
>>And can't you understand that conversions are problematic in both
>>directions (str -> unicode /and/ unicode -> str)?
> 
> 
> Both directions are a problem.
> 
> Just a note: it's not so much the conversions that I find problematic,
> but rather the implicit nature of the conversions (combined with the
> fact that they may fail).  In addition to being difficult to track
> down, these implicit conversions may be costing processing time as
> well.

We've already pointed you to a solution which you might want
to use. Why don't you just try it ?

BTW, if you want to read up on all the reasons why Unicode
was done the way it was, have a look at:

http://www.python.org/peps/pep-0100.html

and read up in the python-dev archives:

http://mail.python.org/pipermail/python-dev/2000-March/thread.html

and the next months after the initial checkin.

From what I've read on the web about the Python Unicode
implementation, we have one of the better ones compared
to other languages' implementations and their choices and
design decisions.

None of them is perfect, but that seems to be an inherent
problem with Unicode no matter how you try to approach it -
even more so if you are trying to add it to a language that
has used ordinary C strings for text from day 1.

-- 
Marc-Andre Lemburg
eGenix.com


From solipsis at pitrou.net  Tue Oct  4 02:37:42 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 04 Oct 2005 02:37:42 +0200
Subject: [Python-Dev] bytes type
In-Reply-To: <ca471dc20510031402k301dcd77y7672d83a79fa0281@mail.gmail.com>
References: <dhra00$tnv$1@sea.gmane.org> <1128346015.6138.149.camel@fsol>
	<20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol>
	<dhrtq4$5j1$1@sea.gmane.org> <1128368242.6138.258.camel@fsol>
	<dhs1sa$i8l$1@sea.gmane.org>
	<5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com>
	<1128373142.6138.301.camel@fsol>
	<ca471dc20510031402k301dcd77y7672d83a79fa0281@mail.gmail.com>
Message-ID: <1128386262.6138.342.camel@fsol>

On Monday 03 October 2005 at 14:02 -0700, Guido van Rossum wrote:
> On 10/3/05, Antoine Pitrou <solipsis at pitrou.net> wrote:
> > Could the "bytes" type be just the same as the current "str" type but
> > without the implicit unicode conversion ? Or am I missing some desired
> > functionality ?
> 
> No. It will be a mutable array of bytes. It will intentionally
> resemble strings as little as possible. There won't be a literal for
> it.

Thinking about it, it may have to offer the search and replace
facilities offered by strings (including regular expressions).

Here is a use case: say I'm reading an HTML file (or receiving it over
the network). Since the character encoding can be specified in the HTML
file itself (in the <head>...</head>), I must first receive it as a
bytes object. But then I must fetch the encoding information from the
HTML header: therefore I must use some string ops on the bytes object to
parse this information. Only after I have discovered the encoding, can I
finally convert the bytes object to a text string.

Or would there be another way to do it?
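To make the use case concrete, here is a sketch in present-day Python 3
(an assumption), where the re module accepts byte patterns. The function
name sniff_and_decode, the regular expression and the 1024-byte window
are all illustrative choices, and the approach only works when the
declaration itself is ASCII-compatible:

```python
import re

# Byte pattern: looks for charset=... near the top of the document.
CHARSET_RE = re.compile(rb"charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE)


def sniff_and_decode(raw, default="latin-1"):
    """Find the declared encoding in the raw bytes, then decode with it."""
    match = CHARSET_RE.search(raw[:1024])
    encoding = match.group(1).decode("ascii") if match else default
    return encoding, raw.decode(encoding)


page = (
    '<html><head><meta http-equiv="Content-Type" '
    'content="text/html; charset=utf-8"></head>'
    "<body>caf\u00e9</body></html>"
).encode("utf-8")

encoding, text = sniff_and_decode(page)
```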




From guido at python.org  Tue Oct  4 02:42:49 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 3 Oct 2005 17:42:49 -0700
Subject: [Python-Dev] bytes type
In-Reply-To: <1128386262.6138.342.camel@fsol>
References: <dhra00$tnv$1@sea.gmane.org> <20051003091416.9817.JCARLSON@uci.edu>
	<1128361197.6138.212.camel@fsol> <dhrtq4$5j1$1@sea.gmane.org>
	<1128368242.6138.258.camel@fsol> <dhs1sa$i8l$1@sea.gmane.org>
	<5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com>
	<1128373142.6138.301.camel@fsol>
	<ca471dc20510031402k301dcd77y7672d83a79fa0281@mail.gmail.com>
	<1128386262.6138.342.camel@fsol>
Message-ID: <ca471dc20510031742q2b788e0dt3241f10e6f7bf03c@mail.gmail.com>

This would presumably support the (read-only part of the) buffer API so
search would be covered.

I don't see a use case for replace.

Alternatively, you could always specify Latin-1 as the encoding and
convert it that way -- I don't think there's any input that can cause
Latin-1 decoding to fail.
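A quick check of that claim, sketched in present-day Python 3 (an
assumption about the interpreter, not about the proposed bytes type):
every one of the 256 byte values decodes under Latin-1 and encodes back
to the same byte.

```python
# Every possible byte value, 0x00 through 0xFF.
all_bytes = bytes(range(256))

# Decoding never raises: each byte maps to the code point of the same number.
as_text = all_bytes.decode("latin-1")
code_points = [ord(ch) for ch in as_text]

# Encoding back gives the original bytes, so the detour is lossless.
round_trip = as_text.encode("latin-1")
```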

On 10/3/05, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Monday 03 October 2005 at 14:02 -0700, Guido van Rossum wrote:
> > On 10/3/05, Antoine Pitrou <solipsis at pitrou.net> wrote:
> > > Could the "bytes" type be just the same as the current "str" type but
> > > without the implicit unicode conversion ? Or am I missing some desired
> > > functionality ?
> >
> > No. It will be a mutable array of bytes. It will intentionally
> > resemble strings as little as possible. There won't be a literal for
> > it.
>
> Thinking about it, it may have to offer the search and replace
> facilities offered by strings (including regular expressions).
>
> Here is a use case: say I'm reading an HTML file (or receiving it over
> the network). Since the character encoding can be specified in the HTML
> file itself (in the <head>...</head>), I must first receive it as a
> bytes object. But then I must fetch the encoding information from the
> HTML header: therefore I must use some string ops on the bytes object to
> parse this information. Only after I have discovered the encoding, can I
> finally convert the bytes object to a text string.
>
> Or would there be another way to do it?


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From solipsis at pitrou.net  Tue Oct  4 02:50:34 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 04 Oct 2005 02:50:34 +0200
Subject: [Python-Dev] bytes type
In-Reply-To: <ca471dc20510031742q2b788e0dt3241f10e6f7bf03c@mail.gmail.com>
References: <dhra00$tnv$1@sea.gmane.org>
	<20051003091416.9817.JCARLSON@uci.edu> <1128361197.6138.212.camel@fsol>
	<dhrtq4$5j1$1@sea.gmane.org> <1128368242.6138.258.camel@fsol>
	<dhs1sa$i8l$1@sea.gmane.org>
	<5.1.1.6.0.20051003165105.03d84008@mail.telecommunity.com>
	<1128373142.6138.301.camel@fsol>
	<ca471dc20510031402k301dcd77y7672d83a79fa0281@mail.gmail.com>
	<1128386262.6138.342.camel@fsol>
	<ca471dc20510031742q2b788e0dt3241f10e6f7bf03c@mail.gmail.com>
Message-ID: <1128387034.6138.355.camel@fsol>


On Monday 03 October 2005 at 17:42 -0700, Guido van Rossum wrote:
> I don't see a use case for replace.

Agreed.

> Alternatively, you could always specify Latin-1 as the encoding and
> convert it that way -- I don't think there's any input that can cause
> Latin-1 decoding to fail.

You seem to be right.
« In 1992, the IANA registered the character map ISO-8859-1 (note the
extra hyphen), a superset of ISO/IEC 8859-1, for use on the Internet.
This map assigns control characters to the code values 00-1F, 7F, and
80-9F. It thus provides for 256 characters via every possible 8-bit
value. »
http://en.wikipedia.org/wiki/ISO_8859-1#ISO-8859-1
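
As a small illustration of why Latin-1 decoding cannot fail, and of how the
HTML case from earlier in the thread could be handled, here is a sketch. It
assumes the bytes type as it exists in current Python 3, not anything
specified in this thread, and sniff_charset with its regular expression is a
made-up helper for illustration only, not a real HTML parser.

```python
import re

# Every one of the 256 possible byte values decodes under Latin-1,
# and each maps to the code point with the same number.
all_bytes = bytes(range(256))
text = all_bytes.decode("latin-1")
assert [ord(c) for c in text] == list(range(256))
assert text.encode("latin-1") == all_bytes  # lossless round trip

# Hypothetical helper: look for a charset declaration in the raw bytes.
_CHARSET_RE = re.compile(rb'charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE)

def sniff_charset(raw, default="latin-1"):
    """Return the declared charset name, or `default` if none is found."""
    match = _CHARSET_RE.search(raw[:1024])  # only inspect the start of the document
    return match.group(1).decode("ascii") if match else default

raw = b'<html><head><meta charset="utf-8"></head><body>caf\xc3\xa9</body></html>'
page = raw.decode(sniff_charset(raw))  # convert to text only once the encoding is known
```

The search runs directly on the bytes object, so no text conversion is needed
before the encoding is known.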

Regards

Antoine.



From pjd at satori.za.net  Mon Oct  3 07:53:50 2005
From: pjd at satori.za.net (Piet Delport)
Date: Mon, 03 Oct 2005 07:53:50 +0200
Subject: [Python-Dev] Proposal for 2.5: Returning values from PEP 342
	enhanced generators
Message-ID: <4340C76E.8020502@satori.za.net>

PEP 255 ("Simple Generators") closes with:

> Q. Then why not allow an expression on "return" too?
>
> A. Perhaps we will someday.  In Icon, "return expr" means both "I'm
>    done", and "but I have one final useful value to return too, and
>    this is it".  At the start, and in the absence of compelling uses
>    for "return expr", it's simply cleaner to use "yield" exclusively
>    for delivering values.

Now that Python 2.5 has gained enhanced generators (multitudes rejoice!), I
think there is a compelling use for valued return statements in cooperative
multitasking code, of the kind:

def foo():
    Data = yield Client.read()
    [...]
    MoreData = yield Client.read()
    [...]
    return FinalResult

def bar():
    Result = yield foo()

For generators written in this style, "yield" means "suspend execution of the
current call until the requested result/resource can be provided", and
"return" regains its full conventional meaning of "terminate the current call
with a given result".

The simplest / most straightforward implementation would be for "return Foo"
to translate to "raise StopIteration, Foo". This is consistent with "return"
translating to "raise StopIteration", and does not break any existing
generator code.
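
For illustration, this is roughly how a caller could pick up such a value.
The sketch runs on Python 3.3 and later, where PEP 380 eventually gave
"return value" in a generator this meaning (the value is attached to
StopIteration). The run() driver is a made-up stand-in for a scheduler and
simply echoes each request back upper-cased; it is not part of defgen or
Twisted.

```python
def foo():
    data = yield "read-1"        # suspend until the driver supplies a result
    more = yield "read-2"
    return data + more           # the generator's "real" result

def run(gen):
    """Drive a generator to completion and return its final value."""
    try:
        request = next(gen)
        while True:
            # A real scheduler would perform the request; here we just echo it back.
            request = gen.send(request.upper())
    except StopIteration as stop:
        return stop.value        # the value given to "return"

result = run(foo())
```

The intermediate yields never leak into the result, which is the distinction
the "yield Foo; return" idiom cannot express.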

(Another way to think about this change is that if a plain StopIteration means
"the iterator terminated", then a valued StopIteration, by extension, means
"the iterator terminated with the given value".)

Motivation by real-world example:

One system that could benefit from this change is Christopher Armstrong's
defgen.py[1] for Twisted, which he recently reincarnated (as newdefgen.py) to
use enhanced generators. The resulting code is much cleaner than before, and
closer to the conventional synchronous style of writing.

[1] the saga of which is summarized here:
    http://radix.twistedmatrix.com/archives/000114.html

However, because enhanced generators have no way to differentiate their
intermediate results from their "real" result, the current solution is a
somewhat confusing compromise: the last value yielded by the generator
implicitly becomes the result returned by the call. Thus, to return
something, in general, requires the idiom "yield Foo; return". If valued
returns are allowed, this would become "return Foo" (and the code implementing
defgen itself would probably end up simpler, as well).

From tonynelson at georgeanelson.com  Tue Oct  4 03:11:29 2005
From: tonynelson at georgeanelson.com (Tony Nelson)
Date: Mon, 3 Oct 2005 21:11:29 -0400
Subject: [Python-Dev] Unicode charmap decoders slow
Message-ID: <v04020a00bf67666fe9d4@[192.168.123.162]>

Is there a faster way to transcode from 8-bit chars (charmaps) to utf-8
than going through unicode()?

I'm writing a small card-file program. As a test, I use a 53 MB MBox file,
in mac-roman encoding.  My program reads and parses the file into messages
in about 3 to 5 seconds (Wow! Go Python!), but takes about 14 seconds to
iterate over the cards and convert them to utf-8:

    for i in xrange(len(cards)):
        u = unicode(cards[i], encoding)
        cards[i] = u.encode('utf-8')

The time is nearly all in the unicode() call.  It's not so much how much
time it takes, but that it takes 4 times as long as the real work, just to
do table lookups.

Looking at the source (which, if I have it right, is
PyUnicode_DecodeCharmap() in unicodeobject.c), I think it is doing a
dictionary lookup for each character.  I would have thought that it would
make and cache a LUT the size of the charmap (and hook the relevant
dictionary stuff to delete the cached LUT if the dictionary is changed).
(You may consider this a request for enhancement. ;)
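
A rough Python-level sketch of the lookup-table idea follows. It only
illustrates the data structure, not the C code in unicodeobject.c, and it is
written for current Python 3. build_table and decode_with_table are
hypothetical names; the table is built by asking the existing codec about
each byte value once.

```python
def build_table(encoding):
    """Build a 256-entry table: table[byte] is the decoded character."""
    # Bytes the codec does not define become U+FFFD instead of raising.
    return tuple(bytes([b]).decode(encoding, "replace") for b in range(256))

def decode_with_table(data, table):
    """Decode 8-bit data by indexing the table, one lookup per byte."""
    return "".join([table[b] for b in data])

MAC_ROMAN = build_table("mac_roman")   # built once, then reused for every card
card = b"caf\x8e"                      # 0x8E is e-acute in Mac OS Roman
utf8 = decode_with_table(card, MAC_ROMAN).encode("utf-8")
```

Whether this beats the built-in decoder depends on the interpreter; the point
is only that a fixed-size table can replace the per-character dictionary
lookup.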

I thought of using U"".translate(), but the unicode version is defined to
be slow, and anyway I can't find any way to just shove my 8-bit data into a
unicode string without translation.  Is there some similar approach?  I'm
almost (but not quite) ready to try it in Pyrex.

I'm new to Python.  I didn't google anything relevant on python.org or in
groups.  I posted this in comp.lang.python yesterday, got a couple of
responses, but I think this may be too sophisticated a question for that
group.

I'm not a member of this list, so please copy me on replies so I don't have
to hunt them down in the archive.
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson at georgeanelson.com>
      '                              <http://www.georgeanelson.com/>

From radeex at gmail.com  Tue Oct  4 04:03:41 2005
From: radeex at gmail.com (Christopher Armstrong)
Date: Tue, 4 Oct 2005 13:03:41 +1100
Subject: [Python-Dev] Proposal for 2.5: Returning values from PEP 342
	enhanced generators
In-Reply-To: <43415891.1040804@satori.za.net>
References: <43415891.1040804@satori.za.net>
Message-ID: <60ed19d40510031903h3b33c62cifbadcfd7aa83cdd2@mail.gmail.com>

On 10/4/05, Piet Delport <pjd at satori.za.net> wrote:
> One system that could benefit from this change is Christopher Armstrong's
> defgen.py[1] for Twisted, which he recently reincarnated (as newdefgen.py) to
> use enhanced generators. The resulting code is much cleaner than before, and
> closer to the conventional synchronous style of writing.
>
> [1] the saga of which is summarized here:
>     http://radix.twistedmatrix.com/archives/000114.html
>
> However, because enhanced generators have no way to differentiate their
> intermediate results from their "real" result, the current solution is a
> somewhat confusing compromise: the last value yielded by the generator
> implicitly becomes the result returned by the call. Thus, to return
> something, in general, requires the idiom "yield Foo; return". If valued
> returns are allowed, this would become "return Foo" (and the code implementing
> defgen itself would probably end up simpler, as well).

Hey, that would be nice. I've found people confused by the way defgen
handles return values before, getting seemingly meaningless values out
of their defgens (if the defgen didn't specifically yield some
meaningful value at the end).

At first I thought "return foo" in a generator ought to be equivalent
to "yield foo; return", but at least for defgen, it turns out raising
StopIteration(foo) would be better, as I would have a very explicit
way to specify and find the return value of the generator.


--
  Twisted   |  Christopher Armstrong: International Man of Twistery
   Radix    |    -- http://radix.twistedmatrix.com
            |  Release Manager, Twisted Project
  \\\V///   |    -- http://twistedmatrix.com
   |o O|    |
w----v----w-+

From jepler at unpythonic.net  Tue Oct  4 04:25:48 2005
From: jepler at unpythonic.net (jepler@unpythonic.net)
Date: Mon, 3 Oct 2005 21:25:48 -0500
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <v04020a00bf67666fe9d4@[192.168.123.162]>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>
Message-ID: <20051004022548.GC7081@unpythonic.net>

As the OP suggests, decoding with a codec like mac-roman or iso8859-1 is very
slow compared to encoding or decoding with utf-8.  Here I'm working with 53k of
data instead of 53 megs.  (Note: this is a laptop, so it's possible that
thermal or battery management features affected these numbers a bit, but by a
factor of 3 at most)

$ timeit.py -s "s='a'*53*1024; u=unicode(s)" "u.encode('utf-8')"
1000 loops, best of 3: 591 usec per loop
$ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('utf-8')"
1000 loops, best of 3: 1.25 msec per loop
$ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('mac-roman')"
100 loops, best of 3: 13.5 msec per loop
$ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('iso8859-1')"
100 loops, best of 3: 13.6 msec per loop

With utf-8 encoding as the baseline, we have
	decode('utf-8')      2.1x as long
	decode('mac-roman') 22.8x as long
	decode('iso8859-1') 23.0x as long

Perhaps this is an area that is ripe for optimization.
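
For anyone wanting to repeat the comparison from a script, not the command
line, a sketch using the timeit module is below. It assumes current
Python 3 (so the payload is a bytes object), best_of is a made-up helper, and
no particular numbers are claimed, since the ratios above are specific to the
interpreter and machine used here.

```python
import timeit

data = b"a" * 53 * 1024          # same 53k payload as above
text = data.decode("ascii")

def best_of(stmt, repeat=3, number=100):
    """Return the best per-loop time in seconds, like the timeit command line."""
    timer = timeit.Timer(stmt)
    return min(timer.repeat(repeat=repeat, number=number)) / number

timings = {
    "encode utf-8": best_of(lambda: text.encode("utf-8")),
    "decode utf-8": best_of(lambda: data.decode("utf-8")),
    "decode mac-roman": best_of(lambda: data.decode("mac_roman")),
    "decode iso8859-1": best_of(lambda: data.decode("iso8859-1")),
}
baseline = timings["encode utf-8"]
for name, seconds in timings.items():
    # Report microseconds per loop and the ratio to the utf-8 encoding baseline.
    print("%-18s %8.1f usec  %5.1fx" % (name, seconds * 1e6, seconds / baseline))
```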

Jeff

From skip at pobox.com  Mon Oct  3 23:45:44 2005
From: skip at pobox.com (skip@pobox.com)
Date: Mon, 3 Oct 2005 16:45:44 -0500
Subject: [Python-Dev] unifying str and unicode
In-Reply-To: <1128371900.6138.299.camel@fsol>
References: <dhra00$tnv$1@sea.gmane.org> <1128346015.6138.149.camel@fsol>
	<20051003091416.9817.JCARLSON@uci.edu>
	<1128361197.6138.212.camel@fsol> <dhrtq4$5j1$1@sea.gmane.org>
	<1128368242.6138.258.camel@fsol> <dhs1sa$i8l$1@sea.gmane.org>
	<1128371900.6138.299.camel@fsol>
Message-ID: <17217.42632.726476.530117@montanaro.dyndns.org>


    Antoine> If an stdlib function returns an 8-bit string containing
    Antoine> non-ascii data, then this string used in unicode context incurs
    Antoine> an implicit conversion, which fails. 

Such strings should be converted to Unicode at the point where they enter
the application.  That's likely the only place where you have a good chance
of knowing the data encoding.  Files generally have no encoding information
associated with them.  Some databases don't handle Unicode transparently.
If you hang onto the input from such devices as plain strings until you need
them as Unicode, you will almost certainly not know how the string was
encoded.  The state of the outside Unicode world being as miserable as it is
(think web input forms), you often don't know the encoding at the interface
and have to guess anyway.  Even so, isolating that guesswork to the
interface is better than recovering somewhere further downstream.
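
A minimal sketch of that advice, assuming current Python 3 and a made-up
helper name (read_text_lines): the one place that knows, or guesses, the
encoding does the decoding, and everything downstream sees only text.

```python
import io

def read_text_lines(stream, encoding):
    """Decode at the boundary: callers receive text, never raw 8-bit strings."""
    for raw in stream:
        yield raw.decode(encoding)

# Stand-in for a file or socket that delivers ISO-8859-1 data.
source = io.BytesIO("na\u00efve\ncaf\u00e9\n".encode("iso-8859-1"))
lines = list(read_text_lines(source, "iso-8859-1"))
```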

Skip

From foom at fuhm.net  Tue Oct  4 05:44:13 2005
From: foom at fuhm.net (James Y Knight)
Date: Mon, 3 Oct 2005 23:44:13 -0400
Subject: [Python-Dev] unifying str and unicode
In-Reply-To: <dhs1sa$i8l$1@sea.gmane.org>
References: <dhra00$tnv$1@sea.gmane.org>
	<1128346015.6138.149.camel@fsol><20051003091416.9817.JCARLSON@uci.edu>
	<1128361197.6138.212.camel@fsol><dhrtq4$5j1$1@sea.gmane.org>
	<1128368242.6138.258.camel@fsol> <dhs1sa$i8l$1@sea.gmane.org>
Message-ID: <0F192CD5-00EE-466B-A1D6-528F394754D1@fuhm.net>

On Oct 3, 2005, at 3:47 PM, Fredrik Lundh wrote:
> Antoine Pitrou wrote:
>
>
>>>> If I have an unicode string containing legal characters greater than
>>>> 0x7F, and I pass it to a function which converts it to str, the
>>>> conversion fails.
>>>>
>>>
>>> so?  if it does that, it's not unicode safe.
>>>
>> [...]
>>
>>> what's that has to do with
>>> my argument (which is that you can safely mix ascii strings and unicode
>>> strings, because that's how things were designed).
>>>
>>
>> If that's how things were designed, then Python's entire standard
>> library (not to mention third-party libraries) is not "unicode safe" -
>> to quote your own words - since many functions may return 8-bit strings
>> containing non-ascii characters.
>>
>
> huh?  first you talk about functions that convert unicode strings to 8-bit
> strings, now you talk about functions that return raw 8-bit strings?  and
> all this in response to a post that argues that it's in fact a good idea to
> use plain strings to hold textual data that happens to contain ASCII only,
> because 1) it works, by design, and 2) it's almost always more efficient.
>
> if you don't know what your own argument is, you cannot expect anyone
> to understand it.

Your point would be much easier to stomach if the "str" type could  
*only* hold 7-bit ASCII. Perhaps that can be done when Python gets an  
actual bytes type in 3.0. There indeed are a multitude of uses for  
the efficient storage/processing of ASCII-only data. However,  
currently, there are problems because it's so easy to screw yourself  
without noticing when mixing unicode and str objects. If, on the  
other hand, you have a 7bit ascii string type, and a 16/32-bit  
unicode string type, both can be used interchangeably and there is no  
possibility for any en/de-coding issues. And  
asciiOnlyStringType.encode('utf-8') can become _ultra_ efficient, as  
a bonus. :)

Seems win-win to me.

James


From walter at livinglogic.de  Tue Oct  4 09:37:29 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Tue, 4 Oct 2005 09:37:29 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <20051004022548.GC7081@unpythonic.net>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>
	<20051004022548.GC7081@unpythonic.net>
Message-ID: <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>

Am 04.10.2005 um 04:25 schrieb jepler at unpythonic.net:

> As the OP suggests, decoding with a codec like mac-roman or iso8859-1 is very
> slow compared to encoding or decoding with utf-8.  Here I'm working with 53k of
> data instead of 53 megs.  (Note: this is a laptop, so it's possible that
> thermal or battery management features affected these numbers a bit, but by a
> factor of 3 at most)
>
> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "u.encode('utf-8')"
> 1000 loops, best of 3: 591 usec per loop
> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('utf-8')"
> 1000 loops, best of 3: 1.25 msec per loop
> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('mac-roman')"
> 100 loops, best of 3: 13.5 msec per loop
> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('iso8859-1')"
> 100 loops, best of 3: 13.6 msec per loop
>
> With utf-8 encoding as the baseline, we have
>     decode('utf-8')      2.1x as long
>     decode('mac-roman') 22.8x as long
>     decode('iso8859-1') 23.0x as long
>
> Perhaps this is an area that is ripe for optimization.

For charmap decoding we might be able to use an array (e.g. a tuple,
or an array.array?) of codepoints instead of a dictionary.

Or we could implement this array as a C array (i.e. gencodec.py would  
generate C code).

Bye,
    Walter Dörwald


From fredrik at pythonware.com  Tue Oct  4 10:33:15 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 4 Oct 2005 10:33:15 +0200
Subject: [Python-Dev] unifying str and unicode
References: <dhra00$tnv$1@sea.gmane.org><1128346015.6138.149.camel@fsol><20051003091416.9817.JCARLSON@uci.edu><1128361197.6138.212.camel@fsol><dhrtq4$5j1$1@sea.gmane.org><1128368242.6138.258.camel@fsol>
	<dhs1sa$i8l$1@sea.gmane.org>
	<0F192CD5-00EE-466B-A1D6-528F394754D1@fuhm.net>
Message-ID: <dhteob$qui$1@sea.gmane.org>

James Y Knight wrote:

> Your point would be much easier to stomach if the "str" type could
> *only* hold 7-bit ASCII.

why?  strings are not mutable, so it's not like an ASCII string will suddenly sprout
non-ASCII characters.  what ends up in a string is defined by the string source.  if
you cannot trust the source, your programs will never work.  after all, there's
nothing in Python that keeps things like:

    s = file.readline().decode("iso-8859-1")
    s = elem.findtext("node")
    s = device.read_encoded_data()

from returning integers instead of strings, or returning socket objects on odd fridays.
but if the interface spec says that they always return strings that adhere to python's
text model (=unicode or things that can be mixed with unicode), you can trust them
as much as you can trust anything else in Python.

(this is of course also why we talk about file-like objects in Python, and sequences,
and iterators and iterables, and stuff like that.  it's not type(obj) that's important, it's
what you can do with obj and how it behaves when you do it)

</F> 




From mwh at python.net  Tue Oct  4 10:50:16 2005
From: mwh at python.net (Michael Hudson)
Date: Tue, 04 Oct 2005 09:50:16 +0100
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	(Phillip J. Eby's message of "Mon, 03 Oct 2005 15:20:34 -0400")
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
Message-ID: <2mu0fxekdz.fsf@starship.python.net>

"Phillip J. Eby" <pje at telecommunity.com> writes:

> At 07:02 PM 10/3/2005 +0100, Michael Hudson wrote:
>>"Phillip J. Eby" <pje at telecommunity.com> writes:
>>
>> > Since the PEP is accepted and has patches for both its implementation
>> > and a good part of its documentation, a major change like this would
>> > certainly need a better rationale.
>>
>>Though given the amount of interest said patch has attracted (none at
>>all)
>
> Actually, I have been reading the patch and meant to comment on it.  

Oh, good.

> I was perplexed by the odd stack behavior of the new opcode until I
> realized that it's try/finally that's weird.  :)

:)

> I was planning to look into whether that could be cleaned up as
> well, when I got distracted and didn't go back to it.

I see.

I don't know whether trying to clean up the stack protocol around
exceptions is worth the amount of pain it causes in the head (anyone
still thinking about removing the block stack?).

>>  perhaps noone cares very much and the proposal should be dropped.
>
> I care an awful lot, as 'with' is another framework-dissolving tool that 
> makes it possible to do more things in library form, without needing to 
> resort to template methods.  It also enables more context-sensitive 
> programming, in that "global" states can be set and restored in a 
> structured fashion.  It may take a while to feel the effects, but it's 
> going to be a big improvement to Python, maybe as big as new-style classes, 
> and certainly bigger than decorators.

I think 'as big as new-style classes' is probably an exaggeration, but
I'm glad my troll caught a few people :)

Cheers,
mwh

-- 
  Those who have deviant punctuation desires should take care of their
  own perverted needs.                  -- Erik Naggum, comp.lang.lisp

From ncoghlan at gmail.com  Tue Oct  4 10:59:44 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 04 Oct 2005 18:59:44 +1000
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <2mu0fxekdz.fsf@starship.python.net>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>	<5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<2mu0fxekdz.fsf@starship.python.net>
Message-ID: <43424480.9080900@gmail.com>

Michael Hudson wrote:
> I think 'as big as new-style classes' is probably an exaggeration, but
> I'm glad my troll caught a few people :)

I was planning on looking at your patch too, but I was waiting for an answer 
from Guido about the fate of the ast-branch for Python 2.5. Given that we have 
patches for PEP 342 and PEP 343 against the trunk, but ast-branch still isn't 
even passing the Python 2.4 test suite, I'm wondering if it should be bumped 
from the feature list again.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Tue Oct  4 12:21:43 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 04 Oct 2005 20:21:43 +1000
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <bb8868b90510031415x711dc703u72f2612bcda3e457@mail.gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<bb8868b90510031415x711dc703u72f2612bcda3e457@mail.gmail.com>
Message-ID: <434257B7.9000909@gmail.com>

Jason Orendorff wrote:
> Phillip J. Eby writes:
> 
>>You didn't offer any reasons why this would be useful and/or good.
> 
> 
> It makes it dramatically easier to write Python classes that correctly
> support 'with'.  I don't see any simple way to do this under PEP 343;
> the only sane thing to do is write a separate @contextmanager
> generator, as all of the examples do.

Hmm, it's kind of like the iterable/iterator distinction. Being able to do:

   class Whatever(object):
       def __iter__(self):
           for item in self.stuff:
               yield item

is a very handy way of defining "this is how you iterate over this class". The 
only cost is that actual iterators then need to define an __iter__ method that 
returns 'self' (which isn't much of a cost, and is trivial to do even for 
iterators written in C).

If there was a __with__ slot, then we could consider that as identifying a 
"manageable context", with three methods to identify an actual context manager:
   __with__ that returns self
   __enter__
   __exit__


Then the explanation of what a with statement does would simply look like:

         abc = EXPR.__with__() # This is the only change
         exc = (None, None, None)
         VAR = abc.__enter__()
         try:
             try:
                 BLOCK
             except:
                 exc = sys.exc_info()
                 raise
         finally:
             abc.__exit__(*exc)


And the context management for decimal.Context would look like:
      class Context:
          ...
          @contextmanager
          def __with__(self):
              old = decimal.getcontext()
              new = self.copy() # Make this nesting and thread safe
              decimal.setcontext(new)
              try:
                  yield new
              finally:
                  decimal.setcontext(old)

And for threading.Lock would look like:
      class Lock:
          ...
          def __with__(self):
              return self
          def __enter__(self):
              self.acquire()
              return self
          def __exit__(self, *exc_info):
              self.release()

Also, any class could make an existing independent context manager (such as 
'closing') its native context manager as follows:

      class SomethingCloseable:
          ...
          def __with__(self):
              return closing(self)
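
To make the proposed expansion concrete, here is a sketch that emulates it
with an ordinary function, since __with__ is only the slot proposed in this
message and the real with statement does not call it. run_with and Resource
are illustrative names; closing is the contextlib helper as shipped in
current Python 3.

```python
import sys
from contextlib import closing

def run_with(expr, block):
    """Emulate 'with EXPR as VAR: BLOCK' under the proposed __with__ protocol."""
    abc = expr.__with__()            # the only step that differs from PEP 343
    exc = (None, None, None)
    var = abc.__enter__()
    try:
        try:
            return block(var)
        except:
            exc = sys.exc_info()
            raise
    finally:
        abc.__exit__(*exc)           # always runs, with or without an exception

class Resource:
    """A closeable object that names closing() as its native context manager."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True
    def __with__(self):
        return closing(self)

res = Resource()
seen = run_with(res, lambda r: r.closed)   # BLOCK runs while the resource is open
```

Here Resource plays the role of the "manageable context" and the object
returned by closing() is the actual context manager.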

> As for the second proposal, I was thinking we'd have one mental model
> for context managers (block template generators), rather than two
> (generators vs. enter/exit methods).  Enter/exit seemed superfluous,
> given the examples in the PEP.

Try to explain the semantics of the with statement without referring to the 
__enter__ and __exit__ methods, and then see if you still think they're 
superfluous ;)

The @contextmanager generator decorator is just syntactic sugar for writing 
duck-typed context managers - the semantics of the with statement itself can 
only be explained in terms of the __enter__ and __exit__ methods. Indeed, 
explaining how the @contextmanager decorator itself works requires recourse to 
the __enter__ and __exit__ methods of the actual context manager object the 
decorator produces.

However, I think the idea of having a distinction between manageable contexts 
and context managers similar to the distinction between iterables and 
iterators is one well worth considering.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From guido at python.org  Tue Oct  4 16:31:55 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Oct 2005 07:31:55 -0700
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <43424480.9080900@gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com>
Message-ID: <ca471dc20510040731v5fde7f79k3f9b649e65436054@mail.gmail.com>

On 10/4/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> I was planning on looking at your patch too, but I was waiting for an answer
> from Guido about the fate of the ast-branch for Python 2.5. Given that we have
> patches for PEP 342 and PEP 343 against the trunk, but ast-branch still isn't
> even passing the Python 2.4 test suite, I'm wondering if it should be bumped
> from the feature list again.

What do you want me to say about the AST branch? It's not my branch, I
haven't even checked it out, I'm just patiently waiting for the folks
who started it to finally finish it.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jason.orendorff at gmail.com  Tue Oct  4 16:38:49 2005
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Tue, 4 Oct 2005 10:38:49 -0400
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <434257B7.9000909@gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<bb8868b90510031415x711dc703u72f2612bcda3e457@mail.gmail.com>
	<434257B7.9000909@gmail.com>
Message-ID: <bb8868b90510040738l34a397c5s27ccb74eb1ebbbec@mail.gmail.com>

The argument I am going to try to make is that Python coroutines need
a more usable API.

> Try to explain the semantics of the with statement without referring to the
> __enter__ and __exit__ methods, and then see if you still think they're
> superfluous ;)
>
> The @contextmanager generator decorator is just syntactic sugar [...]
> [T]he semantics of the with statement itself can
> only be explained in terms of the __enter__ and __exit__ methods.

That's not true.  It can certainly use the coroutine API instead.

Now... as specified in PEP 342, the coroutine API can be used to
implement 'with', but it's ugly.  I think this is a problem with the
coroutine API, not the idea of using coroutines per se.  Actually I
think 'with' is a pretty tame use case for coroutines.  Other Python
objects (dicts, lists, strings) have convenience methods that are
strictly redundant but make them much easier to use.  Coroutines
should, too.

This:

    with EXPR as VAR:
        BLOCK

expands to this under PEP 342:

    _cm = contextmanager(EXPR)
    VAR = _cm.next()
    try:
        BLOCK
    except:
        try:
            _cm.throw(*sys.exc_info())
        except:
            pass
        raise
    finally:
        try:
            _cm.next()
        except StopIteration:
            pass
        except:
            raise
        else:
            raise RuntimeError

Blah.  But it could look like this:

    _cm = (EXPR).__with__()
    VAR = _cm.start()
    try:
        BLOCK
    except:
        _cm.throw(*sys.exc_info())
    else:
        _cm.finish()

I think that looks quite nice.

Here is the proposed specification for start() and finish():

    class coroutine:  # pseudocode
        ...
        def start(self):
            """ Convenience method -- exactly like next(), but
            assert that this coroutine hasn't already been started.
            """
            if self.__started:
                raise ValueError  # or whatever
            return self.next()

        def finish(self):
            """ Convenience method -- like next(), but expect the
            coroutine to complete without yielding again.
            """
            try:
                self.next()
            except (StopIteration, GeneratorExit):
                pass
            else:
                raise RuntimeError("coroutine didn't finish")
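
As a runnable approximation of the pseudocode above, the two convenience
methods can be layered on a wrapper around an ordinary generator. The
Coroutine wrapper class and the collector example are assumptions made for
illustration (written for current Python 3, hence next(gen) where the
pseudocode uses gen.next()); the proposal itself would put the methods on
generator objects directly.

```python
class Coroutine:
    """Wrap a generator and add the proposed start() and finish() methods."""
    def __init__(self, gen):
        self._gen = gen
        self._started = False

    def start(self):
        # Like next(), but refuse to start twice.
        if self._started:
            raise ValueError("coroutine already started")
        self._started = True
        return next(self._gen)

    def send(self, value):
        return self._gen.send(value)

    def finish(self):
        # Like next(), but the generator must end without yielding again.
        try:
            next(self._gen)
        except StopIteration:
            pass
        else:
            raise RuntimeError("coroutine didn't finish")

def collector(out):
    """Sum the values sent in; a plain next() ends the loop and records the total."""
    total = 0
    while True:
        value = yield total
        if value is None:        # a plain next() resumes the generator with None
            break
        total += value
    out.append(total)

results = []
co = Coroutine(collector(results))
co.start()
co.send(2)
co.send(3)
co.finish()
```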

Why is this good?

  - Makes coroutines more usable for everyone, not just for
    implementing 'with'.
  - For example, if you want to feed values to a coroutine, call
    start() first and then send() repeatedly.  Quite sensible.
  - Single mental model for 'with' (always uses a coroutine or
    lookalike object).
  - No need for "contextmanager" wrapper.
  - Harder to implement a context manager object incorrectly
    (it's quite easy to screw up with __enter__ and __exit__).

-j

From jason.orendorff at gmail.com  Tue Oct  4 16:51:11 2005
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Tue, 4 Oct 2005 10:51:11 -0400
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <bb8868b90510040738l34a397c5s27ccb74eb1ebbbec@mail.gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<bb8868b90510031415x711dc703u72f2612bcda3e457@mail.gmail.com>
	<434257B7.9000909@gmail.com>
	<bb8868b90510040738l34a397c5s27ccb74eb1ebbbec@mail.gmail.com>
Message-ID: <bb8868b90510040751u454a8f85ma6e6ecf90600ef71@mail.gmail.com>

Right after I sent the preceding message I got a funny feeling I'm
wasting everybody's time here.  I apologize.  Guido's original concern
about speedy C implementation for locks stands.  I don't see a good
way around it.

By the way, my expansion of 'with' using coroutines (in previous
message) was incorrect.  The corrected version is shorter; see below.

-j


This:

    with EXPR as VAR:
        BLOCK

would expand to this under PEP 342 and my proposal:

    _cm = (EXPR).__with__()
    VAR = _cm.next()
    try:
        BLOCK
    except:
        _cm.throw(*sys.exc_info())
    finally:
        try:
            _cm.next()
        except (StopIteration, GeneratorExit):
            pass
        else:
            raise RuntimeError("coroutine didn't finish")

From guido at python.org  Tue Oct  4 16:54:15 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Oct 2005 07:54:15 -0700
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <bb8868b90510040738l34a397c5s27ccb74eb1ebbbec@mail.gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<bb8868b90510031415x711dc703u72f2612bcda3e457@mail.gmail.com>
	<434257B7.9000909@gmail.com>
	<bb8868b90510040738l34a397c5s27ccb74eb1ebbbec@mail.gmail.com>
Message-ID: <ca471dc20510040754q612c1f69sdce298a38cecb4a6@mail.gmail.com>

On 10/4/05, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> This:
>
>     with EXPR as VAR:
>         BLOCK
>
> expands to this under PEP 342:
>
>     _cm = contextmanager(EXPR)
>     VAR = _cm.next()
>     try:
>         BLOCK
>     except:
>         try:
>             _cm.throw(*sys.exc_info())
>         except:
>             pass
>         raise
>     finally:
>         try:
>             _cm.next()
>         except StopIteration:
>             pass
>         except:
>             raise
>         else:
>             raise RuntimeError

Where in the world do you get this idea? The translation is as
follows, according to PEP 343:

        abc = EXPR
        exc = (None, None, None)
        VAR = abc.__enter__()
        try:
            try:
                BLOCK
            except:
                exc = sys.exc_info()
                raise
        finally:
            abc.__exit__(*exc)

PEP 342 doesn't touch on the expansion of with-statements at all.
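
The translation can be checked against the with statement as it exists in
current Python; a small recording class (Recorder, a made-up name) shows what
__exit__ receives in each case. One caveat that postdates this message: in
released Python a true return value from __exit__ suppresses the exception,
so the sketch returns None to keep the behaviour described here.

```python
class Recorder:
    """Context manager that records how the with statement calls it."""
    def __init__(self):
        self.calls = []
    def __enter__(self):
        self.calls.append("enter")
        return self              # this is what gets bound to VAR
    def __exit__(self, exc_type, exc_value, traceback):
        # exc_type is None on a normal exit, the exception class otherwise.
        self.calls.append(("exit", exc_type))
        return None              # do not suppress the exception

ok = Recorder()
with ok as var:
    pass

failed = Recorder()
try:
    with failed:
        raise KeyError("boom")
except KeyError:
    pass
```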

I think I know where you're coming from, but please do us a favor and
don't misrepresent the PEPs.  If anything, your proposal is more
complicated; it requires four new APIs instead of two, and requires an
extra call to set up (__with__() followed by start()).

Proposals like yours (and every other permutation) were brought up
during the initial discussion. We picked one. Don't create more churn
by arguing for a different variant. Spend your efforts on implementing
it so you can actually use it and see how bad it is (I predict it
won't be bad at all).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Oct  4 16:56:32 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Oct 2005 07:56:32 -0700
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <bb8868b90510040751u454a8f85ma6e6ecf90600ef71@mail.gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<bb8868b90510031415x711dc703u72f2612bcda3e457@mail.gmail.com>
	<434257B7.9000909@gmail.com>
	<bb8868b90510040738l34a397c5s27ccb74eb1ebbbec@mail.gmail.com>
	<bb8868b90510040751u454a8f85ma6e6ecf90600ef71@mail.gmail.com>
Message-ID: <ca471dc20510040756w1c4fb28eo94fd8c3b498cd48b@mail.gmail.com>

On 10/4/05, Jason Orendorff <jason.orendorff at gmail.com> wrote:
> Right after I sent the preceding message I got a funny feeling I'm
> wasting everybody's time here.  I apologize.  Guido's original concern
> about speedy C implementation for locks stands.  I don't see a good
> way around it.

OK. Our messages crossed, so you can ignore my response. Let's spend
our time implementing the PEPs as they stand, then see what else we
can do with the new APIs.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Tue Oct  4 21:50:04 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 04 Oct 2005 21:50:04 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
Message-ID: <4342DCEC.5020204@v.loewis.de>

Walter Dörwald wrote:
> For charmap decoding we might be able to use an array (e.g. a tuple  
> (or an array.array?) of codepoints instead of dictionary.

This array would have to be sparse, of course. Using an array.array
would be more efficient, I guess - but we would need a C API for arrays
(to validate the type code, and to get ob_item).

> Or we could implement this array as a C array (i.e. gencodec.py would  
> generate C code).

For decoding, we would not get any better than array.array, except for
startup cost.

For encoding, having a C trie might give considerable speedup. _codecs
could offer an API to convert the current dictionaries into
lookup-efficient structures, and the conversion would be done when
importing the codec.

For the trie, two levels (higher and lower byte) would probably be
sufficient: I believe most encodings only use 2 "rows" (256 code
point blocks), very few more than three.
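
A rough Python sketch of the two-level idea, assuming the top level is keyed by the high byte of the code point and each row holds 256 slots indexed by the low byte. The function names are hypothetical, and a C version would presumably use a small array of row pointers instead of a dict.

```python
def build_encoding_map(codec_name):
    """Map code point -> byte value by decoding every byte on its own."""
    mapping = {}
    for byte in range(256):
        try:
            char = bytes([byte]).decode(codec_name)
        except UnicodeDecodeError:
            continue  # this byte is unassigned in the codec
        mapping[ord(char)] = byte
    return mapping


def build_trie(encoding_map):
    """Group the map into rows of 256 slots, one row per high byte."""
    rows = {}
    for code_point, byte in encoding_map.items():
        row = rows.setdefault(code_point >> 8, [None] * 256)
        row[code_point & 0xFF] = byte
    return rows


def trie_encode(trie, text):
    """Encode text with two index operations per character."""
    out = bytearray()
    for char in text:
        code_point = ord(char)
        row = trie.get(code_point >> 8)
        byte = None if row is None else row[code_point & 0xFF]
        if byte is None:
            raise ValueError("cannot encode U+%04X" % code_point)
        out.append(byte)
    return bytes(out)
```

For example, `len(build_trie(build_encoding_map("latin-1")))` shows how many rows a codec actually needs.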

Regards,
Martin

From mal at egenix.com  Tue Oct  4 22:29:36 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 04 Oct 2005 22:29:36 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
Message-ID: <4342E630.5060801@egenix.com>

Walter Dörwald wrote:
> On 04.10.2005 at 04:25, jepler at unpythonic.net wrote:
> 
> 
>>As the OP suggests, decoding with a codec like mac-roman or  
>>iso8859-1 is very
>>slow compared to encoding or decoding with utf-8.  Here I'm working  
>>with 53k of
>>data instead of 53 megs.  (Note: this is a laptop, so it's possible  
>>that
>>thermal or battery management features affected these numbers a  
>>bit, but by a
>>factor of 3 at most)
>>
>>$ timeit.py -s "s='a'*53*1024; u=unicode(s)" "u.encode('utf-8')"
>>1000 loops, best of 3: 591 usec per loop
>>$ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('utf-8')"
>>1000 loops, best of 3: 1.25 msec per loop
>>$ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('mac-roman')"
>>100 loops, best of 3: 13.5 msec per loop
>>$ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('iso8859-1')"
>>100 loops, best of 3: 13.6 msec per loop
>>
>>With utf-8 encoding as the baseline, we have
>>    decode('utf-8')      2.1x as long
>>    decode('mac-roman') 22.8x as long
>>    decode('iso8859-1') 23.0x as long
>>
>>Perhaps this is an area that is ripe for optimization.
> 
> 
> For charmap decoding we might be able to use an array (e.g. a tuple  
> (or an array.array?) of codepoints instead of dictionary.
> 
> Or we could implement this array as a C array (i.e. gencodec.py would  
> generate C code).

That would be a possibility, yes.

Note that the charmap codec was meant as a faster replacement
for the old string transpose function. Dictionaries are used
for the mapping to avoid having to store huge (largely empty)
mapping tables - it's a memory-speed tradeoff.

Of course, a C version could use the same approach as
the unicodedatabase module: that of compressed lookup
tables...

	http://aggregate.org/TechPub/lcpc2002.pdf

genccodec.py anyone ?

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 04 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From walter at livinglogic.de  Tue Oct  4 23:48:08 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Tue, 4 Oct 2005 23:48:08 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4342DCEC.5020204@v.loewis.de>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<4342DCEC.5020204@v.loewis.de>
Message-ID: <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>

On 04.10.2005 at 21:50, Martin v. Löwis wrote:

> Walter Dörwald wrote:
>
>> For charmap decoding we might be able to use an array (e.g. a  
>> tuple  (or an array.array?) of codepoints instead of dictionary.
>>
>
> This array would have to be sparse, of course.

For encoding yes, for decoding no.

> Using an array.array
> would be more efficient, I guess - but we would need a C API for  
> arrays
> (to validate the type code, and to get ob_item).

For decoding it should be sufficient to use a unicode string of  
length 256. u"\ufffd" could be used for "maps to undefined". Or the  
string might be shorter and byte values greater than the length of  
the string are treated as "maps to undefined" too.
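
A minimal sketch of that decoding scheme in modern Python, assuming the table is a plain string indexed by byte value. `decode_with_table` is a hypothetical helper, not an existing codec API.

```python
REPLACEMENT = "\ufffd"  # stands for "maps to undefined"


def decode_with_table(data, table):
    """Decode bytes by indexing into a string of up to 256 characters.

    Byte values at or beyond len(table) are treated as undefined.
    """
    return "".join(
        table[byte] if byte < len(table) else REPLACEMENT for byte in data
    )


# A short table: only the 128 ASCII values are defined.
ascii_table = "".join(chr(i) for i in range(128))

# A full table, built here from the latin-1 codec for illustration.
latin1_table = bytes(range(256)).decode("latin-1")
```

With the short table, `decode_with_table(b"hi\xe9", ascii_table)` yields the two letters followed by U+FFFD.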

>> Or we could implement this array as a C array (i.e. gencodec.py  
>> would  generate C code).
>>
>
> For decoding, we would not get any better than array.array, except for
> startup cost.

Yes.

> For encoding, having a C trie might give considerable speedup. _codecs
> could offer an API to convert the current dictionaries into
> lookup-efficient structures, and the conversion would be done when
> importing the codec.
>
> For the trie, two levels (higher and lower byte) would probably be
> sufficient: I believe most encodings only use 2 "rows" (256 code
> point blocks), very few more than three.

This might work, although nobody has complained about charmap  
encoding yet. Another option would be to generate a big switch  
statement in C and let the compiler decide about the best data  
structure.

Bye,
    Walter Dörwald


From marvinpublic at comcast.net  Wed Oct  5 00:05:20 2005
From: marvinpublic at comcast.net (Marvin)
Date: Tue, 04 Oct 2005 18:05:20 -0400
Subject: [Python-Dev] Static builds on Windows (continued)
Message-ID: <4342FCA0.8020409@comcast.net>

Earlier references:
http://mail.python.org/pipermail/python-dev/2004-July/046499.html

I want to be able to create a version of python24.lib that is a static library,
suitable for creating a python.exe or other .exe using python's api.

So I did as the earlier poster suggested, using 2.4.1 sources.  I modified the
PCBuild/pythoncore and python .vcproj files as follows:

  General/ ConfigurationType/ Static library (was dynamic in pythoncore)
  c/C++ Code Generation RT Library /MT (was /MTD for mt DLL)
  c/c++/Precompiled/ Not Using Precompiled headers (based on some MSDN hints)
  librarian OutputFile .//python24.lib
  Preprocessor: added Py_NO_ENABLE_SHARED. Removed USE_DL_IMPORT

I built pythoncore and python. The resulting python.exe worked fine, but did
indeed fail when I tried to dynamically load anything (Dialog said: the
application terminated abnormally)

Now I am not very clueful about the dllimport/dllexport business.  But it seems
that I should be able to link MY program against a .lib somehow (a real lib),
and let the .EXE export the symbols somehow.

My first guess is to try to use /MTD, use Py_NO_ENABLE_SHARED when building
python24.lib, but then use Py_ENABLE_SHARED when compiling python.c.  I'll
try that later, but does anyone have more insight into the right way to do this?

marvin

From martin at v.loewis.de  Wed Oct  5 00:08:45 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Oct 2005 00:08:45 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<4342DCEC.5020204@v.loewis.de>
	<6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>
Message-ID: <4342FD6D.7000308@v.loewis.de>

Walter Dörwald wrote:
>> This array would have to be sparse, of course.
> 
> 
> For encoding yes, for decoding no.
[...]
> For decoding it should be sufficient to use a unicode string of  length 
> 256. u"\ufffd" could be used for "maps to undefined". Or the  string 
> might be shorter and byte values greater than the length of  the string 
> are treated as "maps to undefined" too.

Right. That's what I meant with "sparse": you somehow need to represent
"no value".

> This might work, although nobody has complained about charmap  encoding 
> yet. Another option would be to generate a big switch  statement in C 
> and let the compiler decide about the best data  structure.

I would try to avoid generating C code at all costs. Maintaining the 
build processes will just be a nightmare.

Regards,
Martin

From martin at v.loewis.de  Wed Oct  5 00:21:20 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Oct 2005 00:21:20 +0200
Subject: [Python-Dev] Static builds on Windows (continued)
In-Reply-To: <4342FCA0.8020409@comcast.net>
References: <4342FCA0.8020409@comcast.net>
Message-ID: <43430060.6070909@v.loewis.de>

Marvin wrote:
> I built pythoncore and python. The resulting python.exe worked fine, but did
> indeed fail when I tried to dynamically load anything (Dialog said: the
> application terminated abnormally)

Not sure what you are trying to do here. In your case, dynamic loading 
simply cannot work. The extension modules all link with python24.dll, 
which you don't have. It may find some python24.dll, which then gives 
conflicts with the Python interpreter that is already running.

So what you really should do is disable dynamic loading entirely. To do
so, remove dynload_win from your project, and #undef 
HAVE_DYNAMIC_LOADING in PC/pyconfig.h.

Not sure if anybody has recently tested whether this configuration
actually works - if you find that it doesn't, please post your patches
to sf.net/projects/python.

If you really want to provide dynamic loading of some kind, you should
arrange the extension modules to import the symbols from your .exe.
Linking the exe should generate an import library, and you should link
the extensions against that.

HTH,
Martin

From tonynelson at georgeanelson.com  Tue Oct  4 18:44:16 2005
From: tonynelson at georgeanelson.com (Tony Nelson)
Date: Tue, 4 Oct 2005 12:44:16 -0400
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
References: <20051004022548.GC7081@unpythonic.net>
	<v04020a00bf67666fe9d4@[192.168.123.162]>
	<20051004022548.GC7081@unpythonic.net>
Message-ID: <v04020a00bf6843a8eac0@[192.168.123.162]>

At 9:37 AM +0200 10/4/05, Walter Dörwald wrote:
>On 04.10.2005 at 04:25, jepler at unpythonic.net wrote:
>
>>As the OP suggests, decoding with a codec like mac-roman or iso8859-1 is
>>very slow compared to encoding or decoding with utf-8. Here I'm working
>>with 53k of data instead of 53 megs. (Note: this is a laptop, so it's
>>possible that thermal or battery management features affected these
>>numbers a bit, but by a factor of 3 at most)
>>
>> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "u.encode('utf-8')"
>> 1000 loops, best of 3: 591 usec per loop
>> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('utf-8')"
>> 1000 loops, best of 3: 1.25 msec per loop
>> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('mac-roman')"
>> 100 loops, best of 3: 13.5 msec per loop
>> $ timeit.py -s "s='a'*53*1024; u=unicode(s)" "s.decode('iso8859-1')"
>> 100 loops, best of 3: 13.6 msec per loop
>>
>> With utf-8 encoding as the baseline, we have
>>     decode('utf-8')      2.1x as long
>>     decode('mac-roman') 22.8x as long
>>     decode('iso8859-1') 23.0x as long
>>
>> Perhaps this is an area that is ripe for optimization.
>
>For charmap decoding we might be able to use an array (e.g. a tuple
>(or an array.array?) of codepoints instead of dictionary.
>
>Or we could implement this array as a C array (i.e. gencodec.py would
>generate C code).

Fine -- as long as it still allows changing code points.  I add the missing
"Apple logo" code point to mac-roman in order to permit round-tripping
(0xF0 <=> 0xF8FF, per Apple docs).  (New bug #1313051.)

If an all-C implementation wouldn't permit changing codepoints, I suggest
instead just /caching/ the translation in C arrays stored with the codec
object.  The cache would be invalidated on any write to the codec's mapping
dictionary, and rebuilt the next time anything was translated.  This would
maintain the present semantics, work with current codecs, and still provide
the desired speed improvement.
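
A sketch of the caching idea in Python rather than C. A plain dict offers no hook for writes, so this assumes the codec's mapping is a dict subclass; `CachedMapping` is a hypothetical name, and only item assignment and deletion are intercepted (`update()`, `pop()` and friends would need the same treatment).

```python
REPLACEMENT = "\ufffd"


class CachedMapping(dict):
    """Byte -> code point map that caches a 256-character lookup table."""

    def __init__(self, *args, **kwargs):
        self._table = None
        self.rebuilds = 0  # counts how often the cache was rebuilt
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._table = None  # any write invalidates the cache

    def __delitem__(self, key):
        super().__delitem__(key)
        self._table = None

    def table(self):
        """Return the cached table, rebuilding it after a write."""
        if self._table is None:
            chars = []
            for byte in range(256):
                code_point = self.get(byte)
                chars.append(
                    REPLACEMENT if code_point is None else chr(code_point)
                )
            self._table = "".join(chars)
            self.rebuilds += 1
        return self._table
```

Adding the Apple logo mapping (`mapping[0xF0] = 0xF8FF`) drops the cached table, and the next call to `table()` rebuilds it with the new code point.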

But is there really no way to say this fast in pure Python?  The way a
one-to-one byte mapping can be done with "".translate()?
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson at georgeanelson.com>
      '                              <http://www.georgeanelson.com/>

From tonynelson at georgeanelson.com  Wed Oct  5 04:52:22 2005
From: tonynelson at georgeanelson.com (Tony Nelson)
Date: Tue, 4 Oct 2005 22:52:22 -0400
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>
References: <4342DCEC.5020204@v.loewis.de>
	<v04020a00bf67666fe9d4@[192.168.123.162]>
	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<4342DCEC.5020204@v.loewis.de>
Message-ID: <v04020a01bf68e10ce9f2@[192.168.123.162]>

[Recipient list not trimmed, as my replies must be vetted by a moderator,
which seems to delay them. :]

At 11:48 PM +0200 10/4/05, Walter Dörwald wrote:
>On 04.10.2005 at 21:50, Martin v. Löwis wrote:
>
>> Walter Dörwald wrote:
>>
>>> For charmap decoding we might be able to use an array (e.g. a
>>> tuple  (or an array.array?) of codepoints instead of dictionary.
>>>
>>
>> This array would have to be sparse, of course.
>
>For encoding yes, for decoding no.
>
>> Using an array.array would be more efficient, I guess - but we would
>> need a C API for arrays (to validate the type code, and to get ob_item).
>
>For decoding it should be sufficient to use a unicode string of
>length 256. u"\ufffd" could be used for "maps to undefined". Or the
>string might be shorter and byte values greater than the length of
>the string are treated as "maps to undefined" too.

With Unicode using more than 64K codepoints now, it might be more forward
looking to use a table of 256 32-bit values, with no need for tricky
values.  There is no need to add any C code to the codecs; just add some
more code to the existing C function (which, if I have it right, is
PyUnicode_DecodeCharmap() in unicodeobject.c).

 ...
>> For encoding, having a C trie might give considerable speedup. _codecs
>> could offer an API to convert the current dictionaries into
>> lookup-efficient structures, and the conversion would be done when
>> importing the codec.
>>
>> For the trie, two levels (higher and lower byte) would probably be
>> sufficient: I believe most encodings only use 2 "rows" (256 code
>> point blocks), very few more than three.
>
>This might work, although nobody has complained about charmap
>encoding yet. Another option would be to generate a big switch
>statement in C and let the compiler decide about the best data
>structure.

I'm willing to complain. :)  I might allow saving of my (53 MB) MBox file.
(Not that editing received mail makes as much sense as searching it.)

Encoding can be made fast using a simple hash table with external chaining.
There are max 256 codepoints to encode, and they will normally be well
distributed in their lower 8 bits.  Hash on the low 8 bits (just mask), and
chain to an area with 256 entries.  Modest storage, normally short chains,
therefore fast encoding.
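
The scheme can be sketched in Python as follows, with lists standing in for the chain area. The function names and the four-entry sample map are made up for illustration; the sample entries are taken from ISO 8859-15, where U+20AC and U+00AC share the same low 8 bits.

```python
def build_hash(encoding_map):
    """256 buckets keyed on the low 8 bits; each bucket is a chain."""
    buckets = [[] for _ in range(256)]
    for code_point, byte in encoding_map.items():
        buckets[code_point & 0xFF].append((code_point, byte))
    return buckets


def hash_lookup(buckets, code_point):
    """Mask to pick the bucket, then walk its (normally short) chain."""
    for candidate, byte in buckets[code_point & 0xFF]:
        if candidate == code_point:
            return byte
    return None  # code point cannot be encoded


# Code point -> byte entries as found in ISO 8859-15.
sample_map = {0x41: 0x41, 0xAC: 0xAC, 0x160: 0xA6, 0x20AC: 0xA4}
sample_buckets = build_hash(sample_map)
```

Here the bucket for 0xAC holds a chain of two entries, and every other occupied bucket holds one.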


At 12:08 AM +0200 10/5/05, Martin v. Löwis wrote:

>I would try to avoid generating C code at all costs. Maintaining the
>build processes will just be a nightmare.

I agree; also I don't think the generated codecs need to be changed at all.
All the changes can be made to the existing C functions, by adding caching
per a reply of mine that hasn't made it to the list yet.  Well, OK,
something needs to hook writes to the codec's dictionary, but I /think/
that only needs Python code.  I say:

>...I suggest instead just /caching/ the translation in C arrays stored
>with the codec object.  The cache would be invalidated on any write to the
>codec's mapping dictionary, and rebuilt the next time anything was
>translated.  This would maintain the present semantics, work with current
>codecs, and still provide the desired speed improvement.

Note that this caching is done by new code added to the existing C
functions (which, if I have it right, are in unicodeobject.c).  No
architectural changes are made; no existing codecs need to be changed;
everything will just work, and usually work faster, with very modest memory
requirements of one 256 entry array of 32-bit Unicode values and a hash
table with 256 1-byte slots and 256 chain entries, each having a 4 byte
Unicode value, a byte output value, a byte chain index, and probably 2
bytes of filler, for a hash table size of 2304 bytes per codec.
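
The size estimate can be checked with a few lines of Python. The `struct` format string is only an assumed layout matching the description (4-byte value, two single bytes, 2 bytes of filler); the variable names are hypothetical.

```python
import struct

# One chain entry: 4-byte Unicode value, 1-byte output value,
# 1-byte chain index, 2 bytes of filler.
CHAIN_ENTRY_SIZE = struct.calcsize("<IBB2x")

SLOT_BYTES = 256 * 1                    # 256 one-byte hash slots
CHAIN_BYTES = 256 * CHAIN_ENTRY_SIZE    # 256 chain entries
HASH_TABLE_BYTES = SLOT_BYTES + CHAIN_BYTES

DECODE_TABLE_BYTES = 256 * 4            # 256 32-bit Unicode values

print(CHAIN_ENTRY_SIZE, HASH_TABLE_BYTES, DECODE_TABLE_BYTES)
```

The hash table comes to 2304 bytes as stated, and the decoding array adds another 1024 bytes per codec.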
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson at georgeanelson.com>
      '                              <http://www.georgeanelson.com/>

From martin at v.loewis.de  Wed Oct  5 08:36:58 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Oct 2005 08:36:58 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <v04020a01bf68e10ce9f2@[192.168.123.162]>
References: <4342DCEC.5020204@v.loewis.de>
	<v04020a00bf67666fe9d4@[192.168.123.162]>
	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<4342DCEC.5020204@v.loewis.de>
	<v04020a01bf68e10ce9f2@[192.168.123.162]>
Message-ID: <4343748A.9050105@v.loewis.de>

Tony Nelson wrote:
>>For decoding it should be sufficient to use a unicode string of
>>length 256. u"\ufffd" could be used for "maps to undefined". Or the
>>string might be shorter and byte values greater than the length of
>>the string are treated as "maps to undefined" too.
> 
> 
> With Unicode using more than 64K codepoints now, it might be more forward
> looking to use a table of 256 32-bit values, with no need for tricky
> values.

You might be missing the point. \ufffd is REPLACEMENT CHARACTER,
which would indicate that the byte with that index is really unused
in that encoding.

> Encoding can be made fast using a simple hash table with external chaining.
> There are max 256 codepoints to encode, and they will normally be well
> distributed in their lower 8 bits.  Hash on the low 8 bits (just mask), and
> chain to an area with 256 entries.  Modest storage, normally short chains,
> therefore fast encoding.

This is what is currently done: a hash map with 256 keys. You are 
complaining about the performance of that algorithm. The issue of
external chaining is likely irrelevant: there likely are no collisions,
even though Python uses open addressing.

>>...I suggest instead just /caching/ the translation in C arrays stored
>>with the codec object.  The cache would be invalidated on any write to the
>>codec's mapping dictionary, and rebuilt the next time anything was
>>translated.  This would maintain the present semantics, work with current
>>codecs, and still provide the desired speed improvement.

That is not implementable. You cannot catch writes to the dictionary.

> Note that this caching is done by new code added to the existing C
> functions (which, if I have it right, are in unicodeobject.c).  No
> architectural changes are made; no existing codecs need to be changed;
> everything will just work

Please try to implement it. You will find that you cannot. I don't
see how regenerating/editing the codecs could be avoided.

Regards,
Martin

From martin at v.loewis.de  Wed Oct  5 08:47:54 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Oct 2005 08:47:54 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <v04020a00bf6843a8eac0@[192.168.123.162]>
References: <20051004022548.GC7081@unpythonic.net>	<v04020a00bf67666fe9d4@[192.168.123.162]>	<20051004022548.GC7081@unpythonic.net>
	<v04020a00bf6843a8eac0@[192.168.123.162]>
Message-ID: <4343771A.5040203@v.loewis.de>

Tony Nelson wrote:
 > But is there really no way to say this fast in pure Python?  The way a
 > one-to-one byte mapping can be done with "".translate()?

Well, .translate isn't exactly pure Python. One-to-one between bytes
and Unicode code points simply can't work. Just try all alternatives
yourself and see if you can get any better than charmap_decode.
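
To illustrate the limitation in modern Python terms: a byte-level translate table can only produce other bytes, so a code point such as U+F8FF is out of reach. One pure-Python alternative, sketched here with the hypothetical helper `decode_via_translate`, is to decode as latin-1 first (a one-to-one mapping onto U+0000..U+00FF) and then remap with `str.translate()`. No speed claim is made for this approach.

```python
REPLACEMENT = 0xFFFD


def decode_via_translate(data, decoding_map):
    """Decode bytes using a byte -> code point dict in two passes.

    Bytes missing from decoding_map pass through with their latin-1
    meaning; entries mapped to None become U+FFFD.
    """
    remap = {}
    for byte, code_point in decoding_map.items():
        if code_point is None:
            remap[byte] = REPLACEMENT
        elif code_point != byte:
            remap[byte] = code_point
    return data.decode("latin-1").translate(remap)


# A byte-level table maps bytes to bytes only.
byte_table = bytes.maketrans(b"\xf0", b"\xf8")
```

For instance, `decode_via_translate(b"A\xf0", {0x41: 0x41, 0xF0: 0xF8FF})` produces the letter A followed by U+F8FF, while `b"\xf0".translate(byte_table)` can only ever give another single byte.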

Some would argue that charmap_decode *is* fast.

Regards,
Martin

From walter at livinglogic.de  Wed Oct  5 10:21:06 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 5 Oct 2005 10:21:06 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4342FD6D.7000308@v.loewis.de>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<4342DCEC.5020204@v.loewis.de>
	<6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>
	<4342FD6D.7000308@v.loewis.de>
Message-ID: <BFA38410-807D-4F1B-819D-6FE60C3E42E1@livinglogic.de>

On 05.10.2005 at 00:08, Martin v. Löwis wrote:

> Walter Dörwald wrote:
>
>>> This array would have to be sparse, of course.
>>>
>> For encoding yes, for decoding no.
>>
> [...]
>
>> For decoding it should be sufficient to use a unicode string of   
>> length 256. u"\ufffd" could be used for "maps to undefined". Or  
>> the  string might be shorter and byte values greater than the  
>> length of  the string are treated as "maps to undefined" too.
>
> Right. That's what I meant with "sparse": you somehow need to  
> represent
> "no value".

OK, but I don't think that we really need a sparse data structure for  
that. I used the following script to check that:
-----
import sys, os.path, glob, encodings

has = 0
hasnt = 0

for enc in glob.glob("%s/*.py" % os.path.dirname(encodings.__file__)):
   enc = enc.rsplit(".")[-2].rsplit("/")[-1]
   try:
     __import__("encodings.%s" % enc)
     codec = sys.modules["encodings.%s" % enc]
   except:
     pass
   else:
     if hasattr(codec, "decoding_map"):
       print codec
       for i in xrange(0, 256):
         if codec.decoding_map.get(i, None) is not None:
           has += 1
         else:
           hasnt += 1
print "assigned values:", has, "unassigned values:", hasnt
----
It reports that in all the charmap codecs there are 15292 assigned  
byte values and only 324 unassigned ones. I.e. only about 2% of the  
byte values map to "undefined". Storing those codepoints in the array  
as U+FFFD would only need 648 (or 1296 for wide builds) additional  
bytes. I don't think a sparse data structure could beat that.
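
The arithmetic behind those figures, as a short Python check. The codec count is an inference from the totals, not something the script reports.

```python
ASSIGNED = 15292
UNASSIGNED = 324

total = ASSIGNED + UNASSIGNED           # every byte value of every codec
codec_count, leftover = divmod(total, 256)

undefined_fraction = UNASSIGNED / total

# Each unassigned slot stores one U+FFFD code unit.
narrow_build_bytes = UNASSIGNED * 2     # 2-byte Py_UNICODE
wide_build_bytes = UNASSIGNED * 4       # 4-byte Py_UNICODE

print(total, codec_count, leftover)
print(round(undefined_fraction * 100, 2), narrow_build_bytes, wide_build_bytes)
```

The total divides evenly into 61 blocks of 256 byte values, the undefined share is a little over 2%, and the extra storage is 648 or 1296 bytes across all codecs combined.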

>> This might work, although nobody has complained about charmap   
>> encoding yet. Another option would be to generate a big switch   
>> statement in C and let the compiler decide about the best data   
>> structure.
> I would try to avoid generating C code at all costs. Maintaining  
> the build processes will just be a nightmare.

Sounds reasonable.

Bye,
    Walter Dörwald


From mal at egenix.com  Wed Oct  5 10:39:25 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 05 Oct 2005 10:39:25 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4342FD6D.7000308@v.loewis.de>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>	<20051004022548.GC7081@unpythonic.net>	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>	<4342DCEC.5020204@v.loewis.de>	<6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>
	<4342FD6D.7000308@v.loewis.de>
Message-ID: <4343913D.3080609@egenix.com>

Martin v. Löwis wrote:
>>Another option would be to generate a big switch  statement in C 
>>and let the compiler decide about the best data  structure.
> 
> I would try to avoid generating C code at all costs. Maintaining the 
> build processes will just be a nightmare.

We could automate this using distutils; however I'm not sure
whether this would then also work on Windows.

-- 
Marc-Andre Lemburg
eGenix.com

From jepler at unpythonic.net  Wed Oct  5 14:54:05 2005
From: jepler at unpythonic.net (jepler@unpythonic.net)
Date: Wed, 5 Oct 2005 07:54:05 -0500
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <20051004022548.GC7081@unpythonic.net>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>
	<20051004022548.GC7081@unpythonic.net>
Message-ID: <20051005125405.GA13147@unpythonic.net>

The function in the module below, xlate.xlate, doesn't quite do what "".decode
does.  (Mostly, characters that don't exist are always mapped to U+FFFD,
instead of having the various behaviors available to "".decode.)

It builds the fast decoding structure once per call, but when decoding 53kb of
data that overhead is small enough to make it much faster than
s.decode('mac-roman').  For smaller buffers (I tried 53 characters), s.decode is
two times faster. (43us vs 21us)

$ timeit.py -s "s='a'*53*1024; import xlate" "s.decode('mac-roman')"
100 loops, best of 3: 12.8 msec per loop
$ timeit.py -s "s='a'*53*1024; import xlate, encodings.mac_roman" \
	"xlate.xlate(s, encodings.mac_roman.decoding_map)"
1000 loops, best of 3: 573 usec per loop
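
A comparable measurement can be sketched with the `timeit` module in modern Python, where `bytes.decode` replaces the Python 2 `str.decode`. `time_decode` is a hypothetical helper, and current interpreters will not reproduce the 2005 figures. The last lines only restate the two timings posted above as a ratio.

```python
import timeit

DATA = b"a" * 53 * 1024  # same 53k payload as in the runs above


def time_decode(codec_name, number=20):
    """Best-of-3 time in seconds for one decode of DATA."""
    runs = timeit.repeat(
        lambda: DATA.decode(codec_name), number=number, repeat=3
    )
    return min(runs) / number


# Ratio of the two timings posted above (12.8 msec vs 573 usec).
posted_charmap = 12.8e-3
posted_xlate = 573e-6
posted_speedup = posted_charmap / posted_xlate

if __name__ == "__main__":
    for name in ("utf-8", "mac-roman", "iso8859-1"):
        print(name, time_decode(name))
    print("posted speedup: %.1fx" % posted_speedup)
```

On the posted numbers, xlate.xlate is roughly 22 times faster than s.decode('mac-roman') for this buffer size.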

Jeff
-------------- next part --------------
#include <Python.h>
#include <stringobject.h>
#include <dictobject.h>

PyObject *xlate(PyObject *s, PyObject *o) {
    unsigned char *inbuf;
    int i, length, pos=0;
    PyObject *map, *key, *value, *ret;
    Py_UNICODE *u, *ru;

    if(!PyArg_ParseTuple(o, "s#O", (char*)&inbuf, &length, &map)) return NULL;
    if(!PyDict_Check(map)) {
        PyErr_SetString(PyExc_TypeError, "Argument 2 must be a dictionary");
        return NULL;
    }

    u = PyMem_Malloc(sizeof(Py_UNICODE) * 256);
    if(!u) { return PyErr_NoMemory(); }
    for(i=0; i<256; i++) {
        u[i] = 0xfffd;
    }

    while(PyDict_Next(map, &pos, &key, &value)) {
        int ki, vi;
        if(!PyInt_Check(key)) {
            PyErr_SetString(PyExc_TypeError, "Dictionary keys must be ints");
            goto error;
        }
        ki = PyInt_AsLong(key);
        if(ki < 0 || ki > 255) {
            PyErr_Format(PyExc_TypeError,
                "Dictionary keys must be in the range 0..255 (saw %d)", ki);
            goto error;
        }
        if(value == Py_None) continue;
        if(!PyInt_Check(value)) {
            PyErr_SetString(PyExc_TypeError, "Dictionary values must be ints or None");
            goto error;
        }
        vi = PyInt_AsLong(value);
        u[ki] = vi;
    }

    ret = PyUnicode_FromUnicode(NULL, length);
    if(!ret) goto error;
    ru = PyUnicode_AsUnicode(ret);
    for(i=0; i<length; i++) {
        ru[i] = u[inbuf[i]];
    }
    PyMem_Free(u);  /* pair PyMem_Malloc with PyMem_Free, not free() */
    return ret;

error:
    /* Every failure path releases the lookup table before returning. */
    PyMem_Free(u);
    return NULL;
}

PyMethodDef md[] = {
    {"xlate", (PyCFunction)xlate, METH_VARARGS, NULL},
    {NULL, NULL, 0, NULL}
};

void initxlate(void) {
    Py_InitModule("xlate", md);
}
-------------- next part --------------
import encodings.mac_roman
import xlate

def test(encname, decoding_map):

    s = ""
    for k, v in decoding_map.items():
        if v is not None: 
            s += chr(k)

    u1 = s.decode(encname)
    print decoding_map
    u2 = xlate.xlate(s, decoding_map)
    assert u1 == u2

test("mac-roman", encodings.mac_roman.decoding_map)

From ncoghlan at gmail.com  Wed Oct  5 15:29:22 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 05 Oct 2005 23:29:22 +1000
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <ca471dc20510040731v5fde7f79k3f9b649e65436054@mail.gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>	
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>	
	<2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com>
	<ca471dc20510040731v5fde7f79k3f9b649e65436054@mail.gmail.com>
Message-ID: <4343D532.2030202@gmail.com>

Guido van Rossum wrote:
> On 10/4/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
>>I was planning on looking at your patch too, but I was waiting for an answer
>>from Guido about the fate of the ast-branch for Python 2.5. Given that we have
>>patches for PEP 342 and PEP 343 against the trunk, but ast-branch still isn't
>>even passing the Python 2.4 test suite, I'm wondering if it should be bumped
>>from the feature list again.
> 
> 
> What do you want me to say about the AST branch? It's not my branch, I
> haven't even checked it out, I'm just patiently waiting for the folks
> who started it to finally finish it.

It was a question I asked a few weeks back [1] that didn't get any response 
(even from Brett!), to do with the fact that for Python 2.4 there was a 
deadline for landing the ast-branch that was a month or two in advance of the 
deadline for 2.4a1. I thought you'd set that deadline, but now that I look for 
it, I can't actually find any evidence of that. The only thing I can find is 
Jeremy's email saying it wasn't ready in time [2] (Jeremy's concern about 
reference leaks in ast-branch when it encounters compile errors is one I 
share, btw).

Anyway, the question is: What do we want to do with ast-branch? Finish 
bringing it up to Python 2.4 equivalence, make it the HEAD, and only then 
implement the approved PEP's (308, 342, 343) that affect the compiler? Or 
implement the approved PEP's on the HEAD, and move the goalposts for 
ast-branch to include those features as well?

I believe the latter is the safe option in terms of making sure 2.5 is a solid 
release, but doing it that way suggests to me that the ast compiler would need 
to be held over until 2.6, which would be somewhat unfortunate.

Given that I don't particularly like that answer, I'd love for someone to 
convince me I'm wrong ;)

Cheers,
Nick.

[1] http://mail.python.org/pipermail/python-dev/2005-September/056449.html
[2] http://mail.python.org/pipermail/python-dev/2004-June/045121.html

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From guido at python.org  Wed Oct  5 16:52:44 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 5 Oct 2005 07:52:44 -0700
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <4343D532.2030202@gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com>
	<ca471dc20510040731v5fde7f79k3f9b649e65436054@mail.gmail.com>
	<4343D532.2030202@gmail.com>
Message-ID: <ca471dc20510050752x8296aaax7d0bfb88c25516a5@mail.gmail.com>

On 10/5/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Anyway, the question is: What do we want to do with ast-branch? Finish
> bringing it up to Python 2.4 equivalence, make it the HEAD, and only then
> implement the approved PEP's (308, 342, 343) that affect the compiler? Or
> implement the approved PEP's on the HEAD, and move the goalposts for
> ast-branch to include those features as well?
>
> I believe the latter is the safe option in terms of making sure 2.5 is a solid
> release, but doing it that way suggests to me that the ast compiler would need
> to be held over until 2.6, which would be somewhat unfortunate.
>
> Given that I don't particularly like that answer, I'd love for someone to
> convince me I'm wrong ;)

Given the total lack of response, I have a different suggestion. Let's
*abandon* the AST-branch. We're fooling ourselves believing that we
can ever switch to that branch, no matter how theoretically better it
is.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From hyeshik at gmail.com  Wed Oct  5 17:06:06 2005
From: hyeshik at gmail.com (Hye-Shik Chang)
Date: Thu, 6 Oct 2005 00:06:06 +0900
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4342E630.5060801@egenix.com>
References: <v04020a00bf67666fe9d4@192.168.123.162>
	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<4342E630.5060801@egenix.com>
Message-ID: <4f0b69dc0510050806h478a1b9fr6953999e8d9d312b@mail.gmail.com>

On 10/5/05, M.-A. Lemburg <mal at egenix.com> wrote:
> Of course, a C version could use the same approach as
> the unicodedatabase module: that of compressed lookup
> tables...
>
>         http://aggregate.org/TechPub/lcpc2002.pdf
>
> genccodec.py anyone ?
>

I once wrote a test codec for single-byte character sets to evaluate
algorithms for use in CJKCodecs (it's not a direct implementation of
what you've mentioned, though).  I just ported it to unicodeobject
(as attached).  It showed noticeably better results than the charmap
codecs:

% python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)"
"s.decode('iso8859-1')"
10 loops, best of 3: 96.7 msec per loop
% ./python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)"
"s.decode('iso8859_10_fc')"
10 loops, best of 3: 22.7 msec per loop
% ./python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)"
"s.decode('utf-8')"
100 loops, best of 3: 18.9 msec per loop

(Note that it doesn't contain any documentation nor good error
handling yet. :-)


Hye-Shik
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fastmapcodec.diff
Type: application/octet-stream
Size: 18814 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20051006/2106c236/fastmapcodec-0001.obj

From walter at livinglogic.de  Wed Oct  5 17:08:04 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 05 Oct 2005 17:08:04 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4343748A.9050105@v.loewis.de>
References: <4342DCEC.5020204@v.loewis.de>
	<v04020a00bf67666fe9d4@[192.168.123.162]>
	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<4342DCEC.5020204@v.loewis.de>
	<v04020a01bf68e10ce9f2@[192.168.123.162]>
	<4343748A.9050105@v.loewis.de>
Message-ID: <4343EC54.7090201@livinglogic.de>

Martin v. Löwis wrote:

> Tony Nelson wrote:
> 
>>> For decoding it should be sufficient to use a unicode string of
>>> length 256. u"\ufffd" could be used for "maps to undefined". Or the
>>> string might be shorter and byte values greater than the length of
>>> the string are treated as "maps to undefined" too.
>>
>> With Unicode using more than 64K codepoints now, it might be more forward
>> looking to use a table of 256 32-bit values, with no need for tricky
>> values.
> 
> You might be missing the point. \ufffd is REPLACEMENT CHARACTER,
> which would indicate that the byte with that index is really unused
> in that encoding.

OK, here's a patch that implements this enhancement to 
PyUnicode_DecodeCharmap(): http://www.python.org/sf/1313939

The mapping argument to PyUnicode_DecodeCharmap() can be a unicode 
string and is used as a decoding table.

Speed looks like this:

python2.4 -mtimeit "s='a'*53*1024; u=unicode(s)" "s.decode('utf-8')"
1000 loops, best of 3: 538 usec per loop
python2.4 -mtimeit "s='a'*53*1024; u=unicode(s)" "s.decode('mac-roman')"
100 loops, best of 3: 3.85 msec per loop
./python-cvs -mtimeit "s='a'*53*1024; u=unicode(s)" "s.decode('utf-8')"
1000 loops, best of 3: 539 usec per loop
./python-cvs -mtimeit "s='a'*53*1024; u=unicode(s)" "s.decode('mac-roman')"
1000 loops, best of 3: 623 usec per loop

Creating the decoding_map as a string should probably be done by 
gencodec.py directly. This way the first import of the codec would be 
faster too.
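
For illustration, here is a minimal pure-Python sketch of the table
lookup described above. It is not the C code from the patch: the names
decode_with_table and TABLE are made up for this example, it is written
for a current Python 3 (bytes and str), and it follows the convention
suggested earlier in the thread that U+FFFD, or an index past the end
of the table, means "maps to undefined".

```python
# Hypothetical sketch only; not the code from patch 1313939.
UNDEFINED = "\ufffd"


def decode_with_table(data, table):
    """Decode bytes by indexing a str table with each byte value."""
    out = []
    for pos, byte in enumerate(data):
        # A byte beyond the table, or one mapped to U+FFFD, is undefined.
        if byte >= len(table) or table[byte] == UNDEFINED:
            raise UnicodeDecodeError("table-sketch", data, pos, pos + 1,
                                     "character maps to <undefined>")
        out.append(table[byte])
    return "".join(out)


# Toy table: ASCII for 0-127, one defined slot (0x80), one undefined (0x81).
TABLE = "".join(chr(i) for i in range(128)) + "\u20ac" + UNDEFINED
```

The point of the string table is that each byte costs one index
operation instead of a dictionary lookup on a boxed integer.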

Bye,
    Walter Dörwald

From mal at egenix.com  Wed Oct  5 17:52:54 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 05 Oct 2005 17:52:54 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4f0b69dc0510050806h478a1b9fr6953999e8d9d312b@mail.gmail.com>
References: <v04020a00bf67666fe9d4@192.168.123.162>	
	<20051004022548.GC7081@unpythonic.net>	
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>	
	<4342E630.5060801@egenix.com>
	<4f0b69dc0510050806h478a1b9fr6953999e8d9d312b@mail.gmail.com>
Message-ID: <4343F6D6.3030305@egenix.com>

Hye-Shik Chang wrote:
> On 10/5/05, M.-A. Lemburg <mal at egenix.com> wrote:
> 
>>Of course, a C version could use the same approach as
>>the unicodedatabase module: that of compressed lookup
>>tables...
>>
>>        http://aggregate.org/TechPub/lcpc2002.pdf
>>
>>genccodec.py anyone ?
>>
> 
> 
> I once wrote a test codec for single-byte character sets to evaluate
> algorithms for use in CJKCodecs (it's not a direct implementation of
> what you've mentioned, though).  I just ported it to unicodeobject
> (as attached).

Thanks. Please upload the patch to SF.

Looks like we now have two competing patches: yours and the
one written by Walter.

So far you've only compared decoding strings into Unicode
and they seem to be similar in performance. Do they differ
in encoding performance ?

> It showed noticeably better results than the charmap
> codecs:
> 
> % python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)"
> "s.decode('iso8859-1')"
> 10 loops, best of 3: 96.7 msec per loop
> % ./python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)"
> "s.decode('iso8859_10_fc')"
> 10 loops, best of 3: 22.7 msec per loop
> % ./python ./Lib/timeit.py -s "s='a'*1024*1024; u=unicode(s)"
> "s.decode('utf-8')"
> 100 loops, best of 3: 18.9 msec per loop
> 
> (Note that it doesn't contain any documentation nor good error
> handling yet. :-)
> 
> 
> Hye-Shik

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 05 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From martin at v.loewis.de  Wed Oct  5 20:21:41 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Oct 2005 20:21:41 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4343913D.3080609@egenix.com>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>	<20051004022548.GC7081@unpythonic.net>	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>	<4342DCEC.5020204@v.loewis.de>	<6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>
	<4342FD6D.7000308@v.loewis.de> <4343913D.3080609@egenix.com>
Message-ID: <434419B5.7030803@v.loewis.de>

M.-A. Lemburg wrote:
>>I would try to avoid generating C code at all costs. Maintaining the 
>>build processes will just be a nightmare.
> 
> 
> We could automate this using distutils; however I'm not sure
> whether this would then also work on Windows.

It wouldn't.

Regards,
Martin


From martin at v.loewis.de  Wed Oct  5 20:40:04 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Oct 2005 20:40:04 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4343EC54.7090201@livinglogic.de>
References: <4342DCEC.5020204@v.loewis.de>
	<v04020a00bf67666fe9d4@[192.168.123.162]>
	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<4342DCEC.5020204@v.loewis.de>
	<v04020a01bf68e10ce9f2@[192.168.123.162]>
	<4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de>
Message-ID: <43441E04.3060307@v.loewis.de>

Walter Dörwald wrote:
> OK, here's a patch that implements this enhancement to 
> PyUnicode_DecodeCharmap(): http://www.python.org/sf/1313939

Looks nice!

> Creating the decoding_map as a string should probably be done by 
> gencodec.py directly. This way the first import of the codec would be 
> faster too.

Hmm. How would you represent the string in source code? As a Unicode
literal? With \u escapes, or in a UTF-8 source file? Or as a UTF-8
string, with an explicit decode call?

I like the current dictionary style for being readable, as it also
adds the Unicode character names into comments.

Regards,
Martin

From mal at egenix.com  Wed Oct  5 22:34:16 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 05 Oct 2005 22:34:16 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <434419B5.7030803@v.loewis.de>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>	<20051004022548.GC7081@unpythonic.net>	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>	<4342DCEC.5020204@v.loewis.de>	<6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>	<4342FD6D.7000308@v.loewis.de>
	<4343913D.3080609@egenix.com> <434419B5.7030803@v.loewis.de>
Message-ID: <434438C8.1030001@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>> I would try to avoid generating C code at all costs. Maintaining the
>>> build processes will just be a nightmare.
>>
>>
>>
>> We could automate this using distutils; however I'm not sure
>> whether this would then also work on Windows.
> 
> 
> It wouldn't.

Could you elaborate why not ? Using distutils on Windows is really
easy...

-- 
Marc-Andre Lemburg
eGenix.com

From mal at egenix.com  Wed Oct  5 22:45:18 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 05 Oct 2005 22:45:18 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <43441E04.3060307@v.loewis.de>
References: <4342DCEC.5020204@v.loewis.de>	<v04020a00bf67666fe9d4@[192.168.123.162]>	<20051004022548.GC7081@unpythonic.net>	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>	<4342DCEC.5020204@v.loewis.de>	<v04020a01bf68e10ce9f2@[192.168.123.162]>	<4343748A.9050105@v.loewis.de>
	<4343EC54.7090201@livinglogic.de> <43441E04.3060307@v.loewis.de>
Message-ID: <43443B5E.5010606@egenix.com>

Martin v. Löwis wrote:
> Walter Dörwald wrote:
> 
>>OK, here's a patch that implements this enhancement to 
>>PyUnicode_DecodeCharmap(): http://www.python.org/sf/1313939
> 
> Looks nice!

Indeed (except for the choice of the "map this character
to undefined" code point).

Hye-Shik, could you please provide some timeit figures for
the fastmap encoding ?

>>Creating the decoding_map as a string should probably be done by 
>>gencodec.py directly. This way the first import of the codec would be 
>>faster too.
> 
> 
> Hmm. How would you represent the string in source code? As a Unicode
> literal? With \u escapes, or in a UTF-8 source file? Or as a UTF-8
> string, with an explicit decode call?
> 
> I like the current dictionary style for being readable, as it also
> adds the Unicode character names into comments.

Not only that: it also allows 1-n and 1-0 mappings which was part
of the idea to use a mapping object (such as a dictionary) as basis
for the codec.
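
As a small illustration of that flexibility, here is a hedged sketch of
a dictionary-driven decoder. The names decode_with_dict and MAPPING are
invented for the example, it targets a current Python 3, and reading
"1-0" as "maps to the empty string" is an assumption of this sketch
rather than something stated in the thread.

```python
# Hypothetical sketch; the real charmap codec is implemented in C.
def decode_with_dict(data, mapping):
    """Decode bytes through a dict of byte value -> replacement string."""
    out = []
    for pos, byte in enumerate(data):
        target = mapping.get(byte)
        if target is None:
            # Missing keys and explicit None both mean "undefined".
            raise UnicodeDecodeError("dict-sketch", data, pos, pos + 1,
                                     "character maps to <undefined>")
        out.append(target)  # may be "", one character, or several
    return "".join(out)


MAPPING = {
    0x41: "A",    # ordinary 1-1 mapping
    0x01: "",     # 1-0 mapping: the byte is dropped
    0x02: "ss",   # 1-n mapping: one byte becomes two characters
}
```

A fixed-length string table can only express the 1-1 case, which is the
trade-off being pointed out here.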

-- 
Marc-Andre Lemburg
eGenix.com

From martin at v.loewis.de  Wed Oct  5 22:57:21 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 05 Oct 2005 22:57:21 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <434438C8.1030001@egenix.com>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>	<20051004022548.GC7081@unpythonic.net>	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>	<4342DCEC.5020204@v.loewis.de>	<6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>	<4342FD6D.7000308@v.loewis.de>
	<4343913D.3080609@egenix.com> <434419B5.7030803@v.loewis.de>
	<434438C8.1030001@egenix.com>
Message-ID: <43443E31.1070606@v.loewis.de>

M.-A. Lemburg wrote:
>>It wouldn't.
> 
> 
> Could you elaborate why not ? Using distutils on Windows is really
> easy...

The current build process for Windows simply doesn't provide it.
You expect to select "Build/All" from the menu (or some such),
and expect all code to be compiled. The VC build process only
considers VC project files.

Maybe it is possible to hack up a project file to invoke distutils
as the build process, but no such project file is currently available,
nor is it known whether it is possible to create one. Whatever the
build process, it should work properly with debug and release builds
and with alternative compilers (such as the Itanium compiler), and place
the files so that debugging from the VStudio environment is possible.
None of this is the case today, and nobody has worked on
making it possible. I very much doubt distutils in its current form
could handle it.

Regards,
Martin


From bcannon at gmail.com  Wed Oct  5 23:00:35 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Wed, 5 Oct 2005 14:00:35 -0700
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <ca471dc20510050752x8296aaax7d0bfb88c25516a5@mail.gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com>
	<ca471dc20510040731v5fde7f79k3f9b649e65436054@mail.gmail.com>
	<4343D532.2030202@gmail.com>
	<ca471dc20510050752x8296aaax7d0bfb88c25516a5@mail.gmail.com>
Message-ID: <bbaeab100510051400i5430320co67c2000ec53776f@mail.gmail.com>

To answer Nick's email here, I didn't respond to that initial email
because it seemed specifically directed at Guido and not me.

On 10/5/05, Guido van Rossum <guido at python.org> wrote:
> On 10/5/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > Anyway, the question is: What do we want to do with ast-branch? Finish
> > bringing it up to Python 2.4 equivalence, make it the HEAD, and only then
> > implement the approved PEP's (308, 342, 343) that affect the compiler? Or
> > implement the approved PEP's on the HEAD, and move the goalposts for
> > ast-branch to include those features as well?
> >
> > I believe the latter is the safe option in terms of making sure 2.5 is a solid
> > release, but doing it that way suggests to me that the ast compiler would need
> > to be held over until 2.6, which would be somewhat unfortunate.
> >
> > Given that I don't particularly like that answer, I'd love for someone to
> > convince me I'm wrong ;)
>
> Given the total lack of response, I have a different suggestion. Let's
> *abandon* the AST-branch. We're fooling ourselves believing that we
> can ever switch to that branch, no matter how theoretically better it
> is.
>

Since the original people who have done the majority of the work
(Jeremy, Tim, Neal, Nick, logistix, and myself) have fallen so far
behind this probably is not a bad decision.  Obviously I would like to
see the work pan out, but since I personally just have not found the
time to shuttle the branch the rest of the way I really am in no
position to say much in terms of objecting to its demise.

Maybe I can come up with a new design and get my dissertation out of it.  =)

-Brett

From mal at egenix.com  Wed Oct  5 23:15:56 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 05 Oct 2005 23:15:56 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <43443E31.1070606@v.loewis.de>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>	<20051004022548.GC7081@unpythonic.net>	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>	<4342DCEC.5020204@v.loewis.de>	<6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>	<4342FD6D.7000308@v.loewis.de>
	<4343913D.3080609@egenix.com>	<434419B5.7030803@v.loewis.de>
	<434438C8.1030001@egenix.com> <43443E31.1070606@v.loewis.de>
Message-ID: <4344428C.4020309@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>> It wouldn't.
>>
>>
>>
>> Could you elaborate why not ? Using distutils on Windows is really
>> easy...
> 
> 
> The current build process for Windows simply doesn't provide it.
> You expect to select "Build/All" from the menu (or some such),
> and expect all code to be compiled. The VC build process only
> considers VC project files.
> 
> Maybe it is possible to hack up a project file to invoke distutils
> as the build process, but no such project file is currently available,
> nor is it known whether it is possible to create one. Whatever the
> build process, it should work properly with debug and release builds
> and with alternative compilers (such as the Itanium compiler), and place
> the files so that debugging from the VStudio environment is possible.
> None of this is the case today, and nobody has worked on
> making it possible. I very much doubt distutils in its current form
> could handle it.

I see, so you have to create a VC project file for each codec -
that would be hard to maintain indeed.

For Unix platforms this would be no problem at all since there all
extension modules are built using distutils anyway.

Thanks for the explanation.

-- 
Marc-Andre Lemburg
eGenix.com

From trentm at ActiveState.com  Wed Oct  5 23:18:31 2005
From: trentm at ActiveState.com (Trent Mick)
Date: Wed, 5 Oct 2005 14:18:31 -0700
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <43443E31.1070606@v.loewis.de>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>
	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<4342DCEC.5020204@v.loewis.de>
	<6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>
	<4342FD6D.7000308@v.loewis.de> <4343913D.3080609@egenix.com>
	<434419B5.7030803@v.loewis.de> <434438C8.1030001@egenix.com>
	<43443E31.1070606@v.loewis.de>
Message-ID: <20051005211831.GB5220@ActiveState.com>

[Martin v. Loewis wrote]
> Maybe it is possible to hack up a project file to invoke distutils
> as the build process, but no such project file is currently available,
> nor is it known whether it is possible to create one. 

This is essentially what the "_ssl" project does, no? It defers to
"build_ssl.py" to do the build work. I didn't see what the full build
requirements were earlier in this thread though, so I may be missing
something.

Trent

-- 
Trent Mick
TrentM at ActiveState.com

From martin at v.loewis.de  Thu Oct  6 00:00:52 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 06 Oct 2005 00:00:52 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <20051005211831.GB5220@ActiveState.com>
References: <v04020a00bf67666fe9d4@[192.168.123.162]>
	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<4342DCEC.5020204@v.loewis.de>
	<6773C3BF-F909-404A-AEB8-EF5023D86513@livinglogic.de>
	<4342FD6D.7000308@v.loewis.de> <4343913D.3080609@egenix.com>
	<434419B5.7030803@v.loewis.de> <434438C8.1030001@egenix.com>
	<43443E31.1070606@v.loewis.de>
	<20051005211831.GB5220@ActiveState.com>
Message-ID: <43444D14.2070802@v.loewis.de>

Trent Mick wrote:
> [Martin v. Loewis wrote]
> 
>>Maybe it is possible to hack up a project file to invoke distutils
>>as the build process, but no such project file is currently available,
>>nor is it known whether it is possible to create one. 
> 
> 
> This is essentially what the "_ssl" project does, no? 

More or less, yes. It does support both debug and release build. It
does not support Itanium builds (at least not the way the other projects
do); as a result, the Itanium build currently just doesn't offer SSL.

More importantly, build_ssl.py is not based on distutils. Instead, it
is manually hacked up - a VBScript file would have worked as well. So
if you were to create many custom build scripts (one per codec), you
might just as well generate the VS project files directly.

Regards,
Martin

From hyeshik at gmail.com  Thu Oct  6 05:11:06 2005
From: hyeshik at gmail.com (Hye-Shik Chang)
Date: Thu, 6 Oct 2005 12:11:06 +0900
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <43443B5E.5010606@egenix.com>
References: <v04020a00bf67666fe9d4@192.168.123.162>
	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<4342DCEC.5020204@v.loewis.de> <v04020a01bf68e10ce9f2@192.168.123.162>
	<4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de>
	<43441E04.3060307@v.loewis.de> <43443B5E.5010606@egenix.com>
Message-ID: <4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com>

On 10/6/05, M.-A. Lemburg <mal at egenix.com> wrote:
> Hye-Shik, could you please provide some timeit figures for
> the fastmap encoding ?
>

(before applying Walter's patch, charmap decoder)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10';
u=unicode(s, e)" "s.decode(e)"
100 loops, best of 3: 3.35 msec per loop

(applied the patch, improved charmap decoder)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10';
u=unicode(s, e)" "s.decode(e)"
1000 loops, best of 3: 1.11 msec per loop

(the fastmap decoder)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc';
u=unicode(s, e)" "s.decode(e)"
1000 loops, best of 3: 1.04 msec per loop

(utf-8 decoder)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s,
e)" "s.decode(e)"
1000 loops, best of 3: 851 usec per loop

Walter's decoder and the fastmap decoder work in mostly the same way,
so the performance difference is quite minor.  Perhaps the small
difference comes from the wrapper function present in each charmap codec;
the fastmap codec provides functions usable as Codecs.{en,de}code
directly.

(encoding, charmap codec)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10';
u=unicode(s, e)" "u.encode(e)"
100 loops, best of 3: 3.51 msec per loop

(encoding, fastmap codec)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc';
u=unicode(s, e)" "u.encode(e)"
1000 loops, best of 3: 536 usec per loop

(encoding, utf-8 codec)

% ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s,
e)" "u.encode(e)"
1000 loops, best of 3: 1.5 msec per loop

If the encoding optimization can easily be done with Walter's approach,
the fastmap codec would be too expensive a way to reach the objective,
because we would have to maintain not only fastmap but also charmap for
backward compatibility.

Hye-Shik

From pje at telecommunity.com  Thu Oct  6 06:47:40 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 06 Oct 2005 00:47:40 -0400
Subject: [Python-Dev] Removing the block stack (was Re: PEP 343 and __with__)
In-Reply-To: <2mu0fxekdz.fsf@starship.python.net>
References: <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com>

At 09:50 AM 10/4/2005 +0100, Michael Hudson wrote:
>(anyone still thinking about removing the block stack?).

I'm not any more.  My thought was that it would be good for performance, by 
reducing the memory allocation overhead for frames enough to allow pymalloc 
to be used instead of the platform malloc.  After more investigation, 
however, I realized that was a dumb idea, because for a typical application 
the amortized allocation cost of frames approaches zero as the program runs 
and allocates as many frames as it will ever use, as large as it will ever 
use them, and just recycles them on the free list.  And all of the ways I 
came up with for removing the block stack were a lot more complex than 
leaving it as-is.
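
The amortization argument can be illustrated with a toy free list. This
is only a sketch under stated assumptions: Frame and FramePool are
made-up names, and CPython's real frame recycling is more involved than
this.

```python
# Toy model of a frame free list; not CPython's implementation.
class Frame:
    def __init__(self, size):
        self.slots = [None] * size


class FramePool:
    def __init__(self):
        self.free = []          # released frames waiting to be reused
        self.allocations = 0    # how many times we really allocated

    def acquire(self, size):
        if self.free:
            frame = self.free.pop()
            if len(frame.slots) < size:
                # Grow a recycled frame that is too small for this call.
                frame.slots.extend([None] * (size - len(frame.slots)))
            return frame
        self.allocations += 1
        return Frame(size)

    def release(self, frame):
        self.free.append(frame)


pool = FramePool()
for _ in range(1000):
    frame = pool.acquire(8)   # simulate a function call
    pool.release(frame)       # simulate the return
```

After the first call every later call reuses the same frame, so the
allocation cost per call shrinks toward zero as the loop runs.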

Clearly, the cost of function calls in Python lies somewhere else, and I'd 
probably look next at parameter tuple allocation, and other frame 
initialization activities.  I seem to recall that Armin Rigo once supplied 
a patch that sped up calls at the cost of slowing down recursive or 
re-entrant ones, and I seem to recall that it was based on preinitializing 
frames, not just preallocating them:

     http://mail.python.org/pipermail/python-dev/2004-March/042871.html

However, the patch was never applied because of its increased memory usage 
as well as the slowdown for recursion.

Every so often, in blue-sky thinking about alternative Python VM designs, I 
think about making frames virtual, in the sense of not even having "real" 
frame objects except for generators, sys._getframe(), and tracebacks.  I 
suspect, however, that doing this in a way that doesn't mess with the 
current C API is non-trivial.  And for many "obvious" ways to simplify the 
various stacks, locals, etc., the downside could be more complexity for 
generators, and probably less speed as well.

For example, we could use a single "stack" arena in the heap for 
parameters, locals, cells, and blocks, rather than doing all the various 
sub-allocations within the frame.  But then creating a frame would involve 
copying data off the top of this pseudo-stack, and doing all the offset 
computations and perhaps some other trickery as well.  And resuming a 
generator would have to either copy it back, or have some sane way to make 
calls out to a new stack arena when calling other functions - thus making 
those operations slower.

The real problem, of course, with any of these ideas is that we are at best 
shaving a few percentage points here, a few points there, so it's 
comparatively speaking rather expensive to do the experiments to see if 
they help anything.


From nnorwitz at gmail.com  Thu Oct  6 07:09:21 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Wed, 5 Oct 2005 22:09:21 -0700
Subject: [Python-Dev] Removing the block stack (was Re: PEP 343 and
	__with__)
In-Reply-To: <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<2mu0fxekdz.fsf@starship.python.net>
	<5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com>
Message-ID: <ee2a432c0510052209w38ffc982w46b53bc18796e72@mail.gmail.com>

On 10/5/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 09:50 AM 10/4/2005 +0100, Michael Hudson wrote:
> >(anyone still thinking about removing the block stack?).
>
> I'm not any more.  My thought was that it would be good for performance, by
> reducing the memory allocation overhead for frames enough to allow pymalloc
> to be used instead of the platform malloc.

I did something similar to reduce the frame size to under 256 bytes
(don't recall if I made a patch or not) and it had no overall effect
on perf.

> Clearly, the cost of function calls in Python lies somewhere else, and I'd
> probably look next at parameter tuple allocation, and other frame
> initialization activities.

I think that's a big part of it.  This patch shows C calls getting
sped up primarily by avoiding tuple creation:

    http://python.org/sf/1107887

I hope to work on that and get it into 2.5.

I've also been thinking about avoiding tuple creation when calling
python functions.  The change I have in mind would probably have to
wait until p3k, but could yield some speed ups.

Warning:  half baked idea follows.

My thoughts are to dynamically allocate the Python stack memory (e.g.,
void *stack = malloc(128MB)).  Then all calls within each thread would
use that thread's own stack.  So things would be pushed onto the stack
like they are currently, but we wouldn't need to create a tuple to pass
to a method; the values could just be used directly.  Basically, this
would more closely simulate the way it currently works in hardware.

This would mean all the PyArg_ParseTuple()s would have to change.  It
may be possible to fake it out, but I'm not sure it's worth it which
is why it would be easier to do this for p3k.

The general idea is to allocate the stack in one big hunk and just
walk up/down it as functions are called/returned.  This only means
incrementing or decrementing pointers.  This should allow us to avoid
a bunch of copying and tuple creation/destruction.  Frames would
hopefully be the same size which would help.  Note that even though
there is a free list for frames, there could still be
PyObject_GC_Resize()s often (or unused memory).  With my idea,
hopefully there would be better memory locality, which could speed
things up.
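
A rough sketch of the idea, modelled in Python rather than C so it can
be run directly. ValueStack, push, call and add are hypothetical names
chosen for this example; nothing here reflects an existing CPython API.
The callee reads its arguments in place from a preallocated stack
instead of receiving a freshly built tuple.

```python
# Hypothetical model of a preallocated per-thread value stack.
class ValueStack:
    def __init__(self, size):
        self.slots = [None] * size   # allocated once, up front
        self.top = 0                 # index of the next free slot

    def push(self, value):
        if self.top == len(self.slots):
            raise OverflowError("value stack exhausted")
        self.slots[self.top] = value
        self.top += 1

    def call(self, func, nargs):
        base = self.top - nargs      # the callee's arguments start here
        result = func(self.slots, base)
        # Returning drops the references and moves the stack pointer back.
        for index in range(base, self.top):
            self.slots[index] = None
        self.top = base
        return result


def add(slots, base):
    # Arguments are read directly from the stack; no tuple is built.
    return slots[base] + slots[base + 1]


stack = ValueStack(16)
stack.push(2)
stack.push(3)
total = stack.call(add, 2)
```

In this model a call and return only adjust an index, which is the
pointer increment and decrement described above.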

n

From martin at v.loewis.de  Thu Oct  6 09:04:14 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 06 Oct 2005 09:04:14 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com>
References: <v04020a00bf67666fe9d4@192.168.123.162>	
	<20051004022548.GC7081@unpythonic.net>	
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>	
	<4342DCEC.5020204@v.loewis.de>
	<v04020a01bf68e10ce9f2@192.168.123.162>	
	<4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de>	
	<43441E04.3060307@v.loewis.de> <43443B5E.5010606@egenix.com>
	<4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com>
Message-ID: <4344CC6E.40206@v.loewis.de>

Hye-Shik Chang wrote:
> If the encoding optimization can be easily done in Walter's approach,
> the fastmap codec would be too expensive way for the objective because
> we must maintain not only fastmap but also charmap for backward
> compatibility.

IMO, whether a new function is added or whether the existing function
becomes polymorphic (depending on the type of table being passed) is
a minor issue. Clearly, the charmap API needs to stay for backwards
compatibility; in terms of code size or maintenance, I would actually
prefer separate functions.

One issue apparently is people tweaking the existing dictionaries,
with additional entries they think belong there. I don't think we
need to preserve compatibility with that approach in 2.5, but I
also think that breakage should be obvious: the dictionary should
either go away completely at run-time, or be stored under a
different name, so that any attempt of modifying the dictionary
gives an exception instead of having no interesting effect.

I envision a layout of the codec files like this:

decoding_dict = ...
decoding_map, encoding_map = codecs.make_lookup_tables(decoding_dict)

I think it should be possible to build efficient tables in a single
pass over the dictionary, so startup time should be fairly small
(given that the dictionaries are currently built incrementally, anyway,
due to the way dictionary literals work).
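
A rough sketch of what such a single-pass builder could look like. The
function here is a standalone illustration of the proposed
make_lookup_tables, not an existing codecs API, and using U+FFFE to mark
undefined bytes is an assumption:

```python
def make_lookup_tables(decoding_dict):
    """Build a 256-character decoding string and an encoding dict in one pass."""
    decoding = ["\ufffe"] * 256          # assumed marker for undefined bytes
    encoding = {}
    for byte, codepoint in decoding_dict.items():
        if codepoint is None:            # byte explicitly left undefined
            continue
        decoding[byte] = chr(codepoint)
        encoding[codepoint] = byte
    return "".join(decoding), encoding


decoding_dict = {0x41: 0x0041, 0xA4: 0x20AC}   # toy table, not a real codec
decoding_map, encoding_map = make_lookup_tables(decoding_dict)
```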

Regards,
Martin

From pje at telecommunity.com  Thu Oct  6 09:06:39 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 06 Oct 2005 03:06:39 -0400
Subject: [Python-Dev] Removing the block stack (was Re: PEP 343 and
 __with__)
In-Reply-To: <ee2a432c0510052209w38ffc982w46b53bc18796e72@mail.gmail.com
 >
References: <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com>
	<5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<2mu0fxekdz.fsf@starship.python.net>
	<5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051006024517.01f6f0d0@mail.telecommunity.com>

At 10:09 PM 10/5/2005 -0700, Neal Norwitz wrote:
>I've also been thinking about avoiding tuple creation when calling
>python functions.  The change I have in mind would probably have to
>wait until p3k, but could yield some speed ups.
>
>Warning:  half baked idea follows.

Yeah, I've been baking that idea for a long time, and it's a bit more 
complex than you've suggested, due to generators, sys._getframe(), and 
tracebacks.


>My thoughts are to dynamically allocate the Python stack memory (e.g.,
>void *stack = malloc(128MB)).  Then all calls within each thread use
>its own stack.  So things would be pushed onto the stack like they are
>currently, but we wouldn't need to create a tuple to pass to a
>method, they could just be used directly.  Basically more closely
>simulate the way it currently works in hardware.

Actually, Python/ceval.c already skips creating a tuple when calling Python 
functions with a fixed number of arguments (caller and callee) and no cell 
vars (i.e., not a closure).  It copies them straight from the calling frame 
stack to the callee frame's stack.


>This would mean all the PyArg_ParseTuple()s would have to change.  It
>may be possible to fake it out, but I'm not sure it's worth it which
>is why it would be easier to do this for p3k.

Actually, I've been thinking that replacing the arg tuple with a PyObject* 
array would allow us to skip tuple creation when calling C functions, since 
you could just give the C functions a pointer to the arguments on the 
caller's stack.  That would let us get rid of most remaining tuple 
allocations.  I suppose we'd also need an argcount parameter.  The 
old APIs taking tuples for calls could trivially convert the tuples to an 
array pointer and size, then call the new APIs.

Actually, we'd probably have to have a tp_arraycall slot or something, with 
the existing tp_call forwarding to tp_arraycall in most cases, but 
occasionally the reverse.  The tricky part is making sure you don't end up 
with cases where you call a tuple API that converts to an array that then 
turns it back into a tuple!


>The general idea is to allocate the stack in one big hunk and just
>walk up/down it as functions are called/returned.  This only means
>incrementing or decrementing pointers.  This should allow us to avoid
>a bunch of copying and tuple creation/destruction.  Frames would
>hopefully be the same size which would help.  Note that even though
>there is a free list for frames, there could still be
>PyObject_GC_Resize()s often (or unused memory).  With my idea,
>hopefully there would be better memory locality, which could speed
>things up.

Yeah, unfortunately for your idea, generators would have to copy off bits 
of the stack and then copy them back in, making generators slower.  If it 
weren't for that part, the idea would probably be a good one, as arguments, 
locals, cells, and the block and value stacks could all be handled that 
way, with the compiler treating all operations as base-pointer offsets, 
thereby eliminating lots of more-complex pointer management in ceval.c and 
frameobject.c.

Another possible fix for generators would be of course to give them their 
own stack arena, but then you have the problem of needing to copy overflows 
from one such stack to another - at which point you're basically back to 
having frames.

On the other hand, maybe the good part of this idea is just eliminating all 
the pointer fudging and having the compiler determine stack offsets.  Then, 
the frame object layout would just consist of a big hunk of stack space, 
laid out as a PyObject* array.

The main problem with this concept is that it would change the meaning of 
certain opcodes, since right now the offsets of free variables in opcodes 
start over the numbering, but this approach would add the number of locals 
to those offsets.
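
To make the renumbering concrete, here is a small sketch of how a
compiler might assign slots in such a flat layout. The slot_offsets
helper is hypothetical and only models the numbering, not real opcodes:

```python
def slot_offsets(local_names, free_names):
    """Assign one flat index space: locals first, then free variables."""
    offsets = {name: i for i, name in enumerate(local_names)}
    nlocals = len(local_names)
    for i, name in enumerate(free_names):
        # Free variable i no longer starts over at 0; it is shifted
        # by the number of locals.
        offsets[name] = nlocals + i
    return offsets


offsets = slot_offsets(["a", "b"], ["x", "y"])
```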


From martin at v.loewis.de  Thu Oct  6 09:15:06 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 06 Oct 2005 09:15:06 +0200
Subject: [Python-Dev] Removing the block stack (was Re: PEP 343
	and	__with__)
In-Reply-To: <ee2a432c0510052209w38ffc982w46b53bc18796e72@mail.gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>	<2mu0fxekdz.fsf@starship.python.net>	<5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com>
	<ee2a432c0510052209w38ffc982w46b53bc18796e72@mail.gmail.com>
Message-ID: <4344CEFA.1040708@v.loewis.de>

Neal Norwitz wrote:
> My thoughts are to dynamically allocate the Python stack memory (e.g.,
> void *stack = malloc(128MB)).  Then all calls within each thread use
> its own stack.  So things would be pushed onto the stack like they are
> currently, but we wouldn't need to create a tuple to pass to a
> method, they could just be used directly.  Basically more closely
> simulate the way it currently works in hardware.

One issue with argument tuples on the stack (or some sort of stack) is
that functions may hold onto argument tuples longer:

def foo(*args):
     global last_args
     last_args = args

I considered making true tuple objects (i.e. with ob_type etc.) on
the stack, but this possibility breaks it.
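
A runnable version of the case above, showing that the argument tuple is
still reachable after the call has returned (the caller function is only
there for illustration):

```python
last_args = None


def foo(*args):
    global last_args
    last_args = args      # the tuple escapes the call


def caller():
    foo(1, 2, 3)          # caller's frame is gone once this returns


caller()
```

After caller() returns, last_args still refers to the tuple, so it
cannot live in storage that is reclaimed when the call ends.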

Regards,
Martin

From walter at livinglogic.de  Thu Oct  6 09:28:05 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Thu, 06 Oct 2005 09:28:05 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <43441E04.3060307@v.loewis.de>
References: <4342DCEC.5020204@v.loewis.de>
	<v04020a00bf67666fe9d4@[192.168.123.162]>
	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<4342DCEC.5020204@v.loewis.de>
	<v04020a01bf68e10ce9f2@[192.168.123.162]>
	<4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de>
	<43441E04.3060307@v.loewis.de>
Message-ID: <4344D205.6090800@livinglogic.de>

Martin v. Löwis wrote:

> Walter Dörwald wrote:
> 
>> OK, here's a patch that implements this enhancement to 
>> PyUnicode_DecodeCharmap(): http://www.python.org/sf/1313939
> 
> Looks nice!
> 
>> Creating the decoding_map as a string should probably be done by 
>> gencodec.py directly. This way the first import of the codec would be 
>> faster too.
> 
> Hmm. How would you represent the string in source code? As a Unicode
> literal? With \u escapes,

Yes, simply by outputting repr(decoding_string).

> or in a UTF-8 source file?

This might get unreadable, if your editor can't detect the coding header.

> Or as a UTF-8
> string, with an explicit decode call?

This is another possibility, but is unreadable too. But we might add the 
real codepoints as comments.

> I like the current dictionary style for being readable, as it also
> adds the Unicode character names into comments.

We could use

decoding_string = (
    u"\u009c" # 0x0004 -> U+009C: CONTROL
    u"\u0009" # 0x0005 -> U+0009: HORIZONTAL TABULATION
    ...
)

However, the current approach has the advantage that only those byte 
values that differ from the identity mapping have to be specified.
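
As an illustration of that trade-off, the sketch below expands an
"identity plus overrides" dictionary into a full 256-character decoding
string and decodes by indexing. Both helper names are made up for this
example, and the two overrides are the entries from the snippet above:

```python
def build_decoding_string(overrides):
    """Start from the identity mapping and apply only the differing bytes."""
    table = [chr(byte) for byte in range(256)]
    for byte, codepoint in overrides.items():
        table[byte] = chr(codepoint)
    return "".join(table)


def decode(data, decoding_string):
    """Decode bytes by indexing the string with each byte value."""
    return "".join(decoding_string[byte] for byte in data)


overrides = {0x04: 0x009C, 0x05: 0x0009}
decoding_string = build_decoding_string(overrides)
text = decode(b"\x04\x05A", decoding_string)
```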

Bye,
    Walter Dörwald

From stephen at xemacs.org  Thu Oct  6 10:41:33 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 06 Oct 2005 17:41:33 +0900
Subject: [Python-Dev] unifying str and unicode
In-Reply-To: <4341C05D.5000706@egenix.com> (M.'s message of "Tue, 04 Oct
	2005 01:35:57 +0200")
References: <dhra00$tnv$1@sea.gmane.org> <1128346015.6138.149.camel@fsol>
	<20051003091416.9817.JCARLSON@uci.edu>
	<1128361197.6138.212.camel@fsol> <dhrtq4$5j1$1@sea.gmane.org>
	<1128368242.6138.258.camel@fsol> <dhs1sa$i8l$1@sea.gmane.org>
	<1128371900.6138.299.camel@fsol>
	<8393fff0510031435n7ef19cbcg297b8881d75d0a08@mail.gmail.com>
	<4341C05D.5000706@egenix.com>
Message-ID: <87fyrfox4y.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "M" == "M.-A. Lemburg" <mal at egenix.com> writes:

    M> From what I've read on the web about the Python Unicode
    M> implementation we have one of the better ones compared to other
    M> languages implementations and their choices and design
    M> decisions.

Yes, indeed!

Speaking-as-a-card-carrying-member-of-the-loyal-opposition-ly y'rs,

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From mwh at python.net  Thu Oct  6 10:44:49 2005
From: mwh at python.net (Michael Hudson)
Date: Thu, 06 Oct 2005 09:44:49 +0100
Subject: [Python-Dev] Removing the block stack
In-Reply-To: <ee2a432c0510052209w38ffc982w46b53bc18796e72@mail.gmail.com> (Neal
	Norwitz's message of "Wed, 5 Oct 2005 22:09:21 -0700")
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<2mu0fxekdz.fsf@starship.python.net>
	<5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com>
	<ee2a432c0510052209w38ffc982w46b53bc18796e72@mail.gmail.com>
Message-ID: <2mpsqjdofy.fsf@starship.python.net>

Neal Norwitz <nnorwitz at gmail.com> writes:

> On 10/5/05, Phillip J. Eby <pje at telecommunity.com> wrote:
>> At 09:50 AM 10/4/2005 +0100, Michael Hudson wrote:
>> >(anyone still thinking about removing the block stack?).
>>
>> I'm not any more.  My thought was that it would be good for performance, by
>> reducing the memory allocation overhead for frames enough to allow pymalloc
>> to be used instead of the platform malloc.
>
> I did something similar to reduce the frame size to under 256 bytes
> (don't recall if I made a patch or not) and it had no overall effect
> on perf.

Hey, me too!  I also came to the same conclusion.

Cheers,
mwh

-- 
  The ultimate laziness is not using Perl.  That saves you so much
  work you wouldn't believe it if you had never tried it.
                                        -- Erik Naggum, comp.lang.lisp

From walter at livinglogic.de  Thu Oct  6 10:51:47 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Thu, 06 Oct 2005 10:51:47 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4344CC6E.40206@v.loewis.de>
References: <v04020a00bf67666fe9d4@192.168.123.162>	
	<20051004022548.GC7081@unpythonic.net>	
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>	
	<4342DCEC.5020204@v.loewis.de>
	<v04020a01bf68e10ce9f2@192.168.123.162>	
	<4343748A.9050105@v.loewis.de> <4343EC54.7090201@livinglogic.de>	
	<43441E04.3060307@v.loewis.de> <43443B5E.5010606@egenix.com>
	<4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com>
	<4344CC6E.40206@v.loewis.de>
Message-ID: <4344E5A3.7060401@livinglogic.de>

Martin v. Löwis wrote:

> Hye-Shik Chang wrote:
> 
>> If the encoding optimization can be easily done in Walter's approach,
>> the fastmap codec would be too expensive way for the objective because
>> we must maintain not only fastmap but also charmap for backward
>> compatibility.
> 
> IMO, whether a new function is added or whether the existing function
> becomes polymorphic (depending on the type of table being passed) is
> a minor issue. Clearly, the charmap API needs to stay for backwards
> compatibility; in terms of code size or maintenance, I would actually
> prefer separate functions.

OK, I can update the patch accordingly. Any suggestions for the name? 
PyUnicode_DecodeCharmapString?

> One issue apparently is people tweaking the existing dictionaries,
> with additional entries they think belong there. I don't think we
> need to preserve compatibility with that approach in 2.5, but I
> also think that breakage should be obvious: the dictionary should
> either go away completely at run-time, or be stored under a
> different name, so that any attempt of modifying the dictionary
> gives an exception instead of having no interesting effect.

IMHO it should be stored under a different name, because there are 
codecs (c037, koi8_r, iso8859_11), that reuse existing dictionaries.

Or we could have a function that recreates the dictionary from the string.

> I envision a layout of the codec files like this:
> 
> decoding_dict = ...
> decoding_map, encoding_map = codecs.make_lookup_tables(decoding_dict)

Apart from the names (and the fact that encoding_map is still a 
dictionary), that's what my patch does.

> I think it should be possible to build efficient tables in a single
> pass over the dictionary, so startup time should be fairly small
> (given that the dictionaries are currently built incrementally, anyway,
> due to the way dictionary literals work).

Bye,
    Walter Dörwald

From mal at egenix.com  Thu Oct  6 11:09:51 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 06 Oct 2005 11:09:51 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4344E5A3.7060401@livinglogic.de>
References: <v04020a00bf67666fe9d4@192.168.123.162>	<20051004022548.GC7081@unpythonic.net>	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>	<4342DCEC.5020204@v.loewis.de>
	<v04020a01bf68e10ce9f2@192.168.123.162>	<4343748A.9050105@v.loewis.de>
	<4343EC54.7090201@livinglogic.de>	<43441E04.3060307@v.loewis.de>
	<43443B5E.5010606@egenix.com>	<4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com>	<4344CC6E.40206@v.loewis.de>
	<4344E5A3.7060401@livinglogic.de>
Message-ID: <4344E9DF.4060706@egenix.com>

Walter Dörwald wrote:
> Martin v. Löwis wrote:
> 
>> Hye-Shik Chang wrote:
>>
>>> If the encoding optimization can be easily done in Walter's approach,
>>> the fastmap codec would be too expensive way for the objective because
>>> we must maintain not only fastmap but also charmap for backward
>>> compatibility.
>>
>>
>> IMO, whether a new function is added or whether the existing function
>> becomes polymorphic (depending on the type of table being passed) is
>> a minor issue. Clearly, the charmap API needs to stay for backwards
>> compatibility; in terms of code size or maintenance, I would actually
>> prefer separate functions.
> 
> 
> OK, I can update the patch accordingly. Any suggestions for the name?
> PyUnicode_DecodeCharmapString?

No, you can factor this part out into a separate C function
- there's no need to add a completely new entry point just
for this optimization. Later on we can then also add support
for compressed tables to the codec in the same way.

>> One issue apparently is people tweaking the existing dictionaries,
>> with additional entries they think belong there. I don't think we
>> need to preserve compatibility with that approach in 2.5, but I
>> also think that breakage should be obvious: the dictionary should
>> either go away completely at run-time, or be stored under a
>> different name, so that any attempt of modifying the dictionary
>> gives an exception instead of having no interesting effect.
> 
> 
> IMHO it should be stored under a different name, because there are
> codecs (c037, koi8_r, iso8859_11), that reuse existing dictionaries.

Only koi8_u reuses the dictionary from koi8_r - and it's
easy to recreate the codec from a standard mapping file.

> Or we could have a function that recreates the dictionary from the string.

Actually, I'd prefer that these operations be done by the
codec generator script, so that we don't have additional
startup time. The dictionaries should then no longer be
generated; the decoding string would be emitted instead. I'd like the
comments to stay, though.
This can be done like this (using string concatenation
applied by the compiler):

decoding_charmap = (
    u'x' # 0x0000 -> 0x0078 LATIN SMALL LETTER X
    u'y' # 0x0001 -> 0x0079 LATIN SMALL LETTER Y
    ...
)

Either way, monkey patching the codec won't work anymore.
Doesn't really matter, though, as this was never officially
supported.

We've always told people to write their own codecs
if they need to modify an existing one and then hook it into
the system using either a new codec search function or by
adding an appropriate alias.
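
A hedged sketch of the "write your own codec and register a search
function" route, written against the current codecs module. The codec
name latin1_euro_demo and its table (Latin-1 with byte 0xA4 remapped to
the euro sign) are invented for this example:

```python
import codecs

# Hypothetical table: Latin-1 identity, except byte 0xA4 -> U+20AC.
DECODING_TABLE = "".join(
    "\u20ac" if byte == 0xA4 else chr(byte) for byte in range(256)
)
ENCODING_MAP = {ord(ch): byte for byte, ch in enumerate(DECODING_TABLE)}


def search(name):
    """Codec search function: return CodecInfo for our name, else None."""
    if name != "latin1_euro_demo":
        return None

    def encode(text, errors="strict"):
        return codecs.charmap_encode(text, errors, ENCODING_MAP)

    def decode(data, errors="strict"):
        return codecs.charmap_decode(data, errors, DECODING_TABLE)

    return codecs.CodecInfo(encode, decode, name=name)


codecs.register(search)

decoded = b"caf\xe9 \xa4".decode("latin1_euro_demo")
encoded = decoded.encode("latin1_euro_demo")
```

The stock codec is left untouched; the modified mapping lives entirely in
the new codec.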

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 06 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From mal at egenix.com  Thu Oct  6 11:13:50 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 06 Oct 2005 11:13:50 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com>
References: <v04020a00bf67666fe9d4@192.168.123.162>	<20051004022548.GC7081@unpythonic.net>	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>	<v04020a01bf68e10ce9f2@192.168.123.162>	<4343748A.9050105@v.loewis.de>
	<4343EC54.7090201@livinglogic.de>	<43441E04.3060307@v.loewis.de>
	<43443B5E.5010606@egenix.com>
	<4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com>
Message-ID: <4344EACE.5070102@egenix.com>

Hye-Shik Chang wrote:
> On 10/6/05, M.-A. Lemburg <mal at egenix.com> wrote:
> 
>>Hye-Shik, could you please provide some timeit figures for
>>the fastmap encoding ?
>>

Thanks for the timings.

> (before applying Walter's patch, charmap decoder)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10';
> u=unicode(s, e)" "s.decode(e)"
> 100 loops, best of 3: 3.35 msec per loop
> 
> (applied the patch, improved charmap decoder)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10';
> u=unicode(s, e)" "s.decode(e)"
> 1000 loops, best of 3: 1.11 msec per loop
> 
> (the fastmap decoder)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc';
> u=unicode(s, e)" "s.decode(e)"
> 1000 loops, best of 3: 1.04 msec per loop
> 
> (utf-8 decoder)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s,
> e)" "s.decode(e)"
> 1000 loops, best of 3: 851 usec per loop
> 
> Walter's decoder and the fastmap decoder work in mostly the same way,
> so the performance difference is quite minor.  Perhaps the minor
> difference comes from the wrapper function present in each codec;
> the fastmap codec provides functions usable as Codecs.{en,de}code
> directly.
> 
> (encoding, charmap codec)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10';
> u=unicode(s, e)" "u.encode(e)"
> 100 loops, best of 3: 3.51 msec per loop
> 
> (encoding, fastmap codec)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc';
> u=unicode(s, e)" "u.encode(e)"
> 1000 loops, best of 3: 536 usec per loop
> 
> (encoding, utf-8 codec)
> 
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s,
> e)" "u.encode(e)"
> 1000 loops, best of 3: 1.5 msec per loop

I wonder why the UTF-8 codec is slower than the fastmap
codec in this case.

> If the encoding optimization can be easily done in Walter's approach,
> the fastmap codec would be too expensive way for the objective because
> we must maintain not only fastmap but also charmap for backward
> compatibility.

Indeed. Let's go with a patched charmap codec then.

-- 
Marc-Andre Lemburg
eGenix.com


From ncoghlan at gmail.com  Thu Oct  6 12:06:22 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 06 Oct 2005 20:06:22 +1000
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <bbaeab100510051400i5430320co67c2000ec53776f@mail.gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>	
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>	
	<2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com>	
	<ca471dc20510040731v5fde7f79k3f9b649e65436054@mail.gmail.com>	
	<4343D532.2030202@gmail.com>	
	<ca471dc20510050752x8296aaax7d0bfb88c25516a5@mail.gmail.com>
	<bbaeab100510051400i5430320co67c2000ec53776f@mail.gmail.com>
Message-ID: <4344F71E.9000708@gmail.com>

[Brett]
> To answer Nick's email here, I didn't respond to that initial email
> because it seemed specifically directed at Guido and not me.

Fair enough. I think I was actually misremembering the sequence of events 
leading up to 2.4a1, so the question was less appropriate for Guido than I 
thought :)

[Guido]
> On 10/5/05, Guido van Rossum <guido at python.org> wrote:
>>Given the total lack of response, I have a different suggestion. Let's
>>*abandon* the AST-branch. We're fooling ourselves believing that we
>>can ever switch to that branch, no matter how theoretically better it
>>is.

[Brett]
> Since the original people who have done the majority of the work
> (Jeremy, Tim, Neal, Nick, logistix, and myself) have fallen so far
> behind this probably is not a bad decision.  Obviously I would like to
> see the work pan out, but since I personally just have not found the
> time to shuttle the branch the rest of the way I really am in no
> position to say much in terms of objecting to its demise.

If we kill the branch for now, then anyone that wants to bring up the idea 
again can write a PEP first, not only to articulate the benefits of switching 
to an AST compiler (Jeremy has a few notes scattered around the web on that 
front), but also to propose a solid migration strategy. We tried the "develop 
in parallel, switch when done" approach; it doesn't seem to have worked due to the way 
it split developer effort between the branches, and both the HEAD and 
ast-branch ended up losing out.

> Maybe I can come up with a new design and get my dissertation out of it.  =)

A strategy that may work out better is to develop something independent of the 
Python core that can:

   1. Produce an ASDL based AST structure from:
         - Python source code
         - CPython 'AST'
         - CPython bytecode
   2. Parse an ASDL based AST structure and produce:
         - Python source code
         - CPython 'AST'
         - CPython bytecode

That is, initially develop an enhanced replacement for the compiler package, 
rather than aiming directly to replace the actual CPython compiler.

Then the folks who want to do serious bytecode hacking can reverse compile the 
bytecode on the fly ;)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From hyeshik at gmail.com  Thu Oct  6 13:33:11 2005
From: hyeshik at gmail.com (Hye-Shik Chang)
Date: Thu, 6 Oct 2005 20:33:11 +0900
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4344EACE.5070102@egenix.com>
References: <v04020a00bf67666fe9d4@192.168.123.162>
	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<v04020a01bf68e10ce9f2@192.168.123.162> <4343748A.9050105@v.loewis.de>
	<4343EC54.7090201@livinglogic.de> <43441E04.3060307@v.loewis.de>
	<43443B5E.5010606@egenix.com>
	<4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com>
	<4344EACE.5070102@egenix.com>
Message-ID: <4f0b69dc0510060433l1ce316coe769eaf65ae34b1a@mail.gmail.com>

On 10/6/05, M.-A. Lemburg <mal at egenix.com> wrote:
> Hye-Shik Chang wrote:
> > (encoding, fastmap codec)
> >
> > % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc';
> > u=unicode(s, e)" "u.encode(e)"
> > 1000 loops, best of 3: 536 usec per loop
> >
> > (encoding, utf-8 codec)
> >
> > % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s,
> > e)" "u.encode(e)"
> > 1000 loops, best of 3: 1.5 msec per loop
>
> I wonder why the UTF-8 codec is slower than the fastmap
> codec in this case.

I guess that resizing made the difference.  The fastmap encoder doesn't
resize the output buffer at all in the test case, while the UTF-8 encoder
allocates 4*53*1024 bytes and resizes it to 53*1024 bytes in the end.
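
For scale, the two buffer sizes mentioned above work out as follows.
This only reproduces the arithmetic for the benchmark string; it does
not model the encoder's C code:

```python
text = "a" * 53 * 1024

# Worst-case allocation described above: four bytes per character.
worst_case = 4 * len(text)

# Actual UTF-8 size for pure ASCII input: one byte per character.
encoded = text.encode("utf-8")
final_size = len(encoded)
```

So in this test the buffer shrinks to a quarter of its initial size.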

Hye-Shik

From mfb at lotusland.dyndns.org  Thu Oct  6 14:36:51 2005
From: mfb at lotusland.dyndns.org (Matthew F. Barnes)
Date: Thu, 6 Oct 2005 07:36:51 -0500 (CDT)
Subject: [Python-Dev] Lexical analysis and NEWLINE tokens
Message-ID: <23766.64.141.129.62.1128602211.squirrel@localhost>

I posted this question to python-help, but I think I have a better chance
of getting the answer here.

I'm looking for clarification on when NEWLINE tokens are generated during
lexical analysis of Python source code.  In particular, I'm confused about
some of the top-level components in Python's grammar (file_input,
interactive_input, and eval_input).

Section 2.1.7 of the reference manual states that blank lines (lines
consisting only of whitespace and possibly a comment) do not generate
NEWLINE tokens.  This is supported by the definition of a suite, which
does not allow for standalone or consecutive NEWLINE tokens.

    suite ::= stmt_list NEWLINE | NEWLINE INDENT statement+ DEDENT

Yet the grammar for top-level components seems to suggest that a parsable
input may consist entirely of a single NEWLINE token, or include
consecutive NEWLINE tokens.

    file_input ::= (NEWLINE | statement)*
    interactive_input ::= [stmt_list] NEWLINE | compound_stmt NEWLINE
    eval_input ::= expression_list NEWLINE*

To me this seems to contradict section 2.1.7 in so far as I don't see how
it's possible to generate such a sequence of tokens.

What kind of input would generate NEWLINE tokens in the top-level
components of the grammar?
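
One way to look at the token stream is the standard library's tokenize
module, sketched below. Note that tokenize reports blank and
comment-only lines with its own NL token type, distinct from NEWLINE, so
this shows that module's view rather than the exact behaviour of the C
tokenizer:

```python
import io
import tokenize

# Two logical lines separated by a blank line and a comment-only line.
source = "x = 1\n\n# comment\ny = 2\n"

tokens = tokenize.generate_tokens(io.StringIO(source).readline)
names = [tokenize.tok_name[tok.type] for tok in tokens]

newline_count = names.count("NEWLINE")   # ends of logical lines
nl_count = names.count("NL")             # blank / comment-only lines
```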

Matthew Barnes
matthew at barnes.net

From walter at livinglogic.de  Thu Oct  6 14:40:24 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Thu, 06 Oct 2005 14:40:24 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4344E9DF.4060706@egenix.com>
References: <v04020a00bf67666fe9d4@192.168.123.162>	<20051004022548.GC7081@unpythonic.net>	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>	<4342DCEC.5020204@v.loewis.de>
	<v04020a01bf68e10ce9f2@192.168.123.162>	<4343748A.9050105@v.loewis.de>
	<4343EC54.7090201@livinglogic.de>	<43441E04.3060307@v.loewis.de>
	<43443B5E.5010606@egenix.com>	<4f0b69dc0510052011u31847835q1abacbb35afd0a15@mail.gmail.com>	<4344CC6E.40206@v.loewis.de>
	<4344E5A3.7060401@livinglogic.de> <4344E9DF.4060706@egenix.com>
Message-ID: <43451B38.5030203@livinglogic.de>

M.-A. Lemburg wrote:

 > [...]
>>Or we could have a function that recreates the dictionary from the string.
> 
> Actually, I'd prefer that these operations be done by the
> codec generator script, so that we don't have additional
> startup time. The dictionaries should then no longer be
> generated; the decoding string would be emitted instead. I'd like the
> comments to stay, though.
> This can be done like this (using string concatenation
> applied by the compiler):
> 
> decoding_charmap = (
>     u'x' # 0x0000 -> 0x0078 LATIN SMALL LETTER X
>     u'y' # 0x0001 -> 0x0079 LATIN SMALL LETTER Y
>     ...
> )

I'd prefer that too.

> Either way, monkey patching the codec won't work anymore.
> Doesn't really matter, though, as this was never officially
> supported.
> 
> We've always told people to write their own codecs
> if they need to modify an existing one and then hook it into
> the system using either a new codec search function or by
> adding an appropriate alias.

OK, so can someone update gencodec.py and recreate the charmap codecs?

BTW, is codecs.make_encoding_map part of the official API, or can we 
change it to expect a string instead of a dictionary?

Bye,
    Walter Dörwald

From tonynelson at georgeanelson.com  Wed Oct  5 20:03:03 2005
From: tonynelson at georgeanelson.com (Tony Nelson)
Date: Wed, 5 Oct 2005 14:03:03 -0400
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <4343748A.9050105@v.loewis.de>
References: <v04020a01bf68e10ce9f2@[192.168.123.162]>
	<4342DCEC.5020204@v.loewis.de>
	<v04020a00bf67666fe9d4@[192.168.123.162]>
	<20051004022548.GC7081@unpythonic.net>
	<167C9F31-31A9-4679-94A3-097191786241@livinglogic.de>
	<4342DCEC.5020204@v.loewis.de>
	<v04020a01bf68e10ce9f2@[192.168.123.162]>
Message-ID: <v04020a01bf69877f0af1@[192.168.123.162]>

At 8:36 AM +0200 10/5/05, Martin v. Löwis wrote:
>Tony Nelson wrote:
 ...
>> Encoding can be made fast using a simple hash table with external chaining.
>> There are max 256 codepoints to encode, and they will normally be well
>> distributed in their lower 8 bits.  Hash on the low 8 bits (just mask), and
>> chain to an area with 256 entries.  Modest storage, normally short chains,
>> therefore fast encoding.
>
>This is what is currently done: a hash map with 256 keys. You are
>complaining about the performance of that algorithm. The issue of
>external chaining is likely irrelevant: there likely are no collisions,
>even though Python uses open addressing.

I think I'm complaining about the implementation, though on decode, not encode.

In any case, there are likely to be collisions in my scheme.  Over the next
few days I will try to do it myself, but I will need to learn Pyrex, some
of the Python C API, and more about Python to do it.
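
A Python model of the scheme, as a sketch only (the real thing would be
C arrays, and both function names are invented here): 256 buckets
indexed by the low 8 bits of the code point, each holding a short chain
of (codepoint, byte) pairs.

```python
def build_buckets(decoding_table):
    """Hash each code point on its low 8 bits; chain entries per bucket."""
    buckets = [[] for _ in range(256)]
    for byte, ch in enumerate(decoding_table):
        codepoint = ord(ch)
        buckets[codepoint & 0xFF].append((codepoint, byte))
    return buckets


def encode_char(buckets, ch):
    """Walk the chain for this character's bucket and return its byte."""
    codepoint = ord(ch)
    for candidate, byte in buckets[codepoint & 0xFF]:
        if candidate == codepoint:
            return byte
    raise KeyError(ch)


# Toy table: Latin-1 identity, except byte 0xA4 decodes to U+20AC.
table = "".join("\u20ac" if b == 0xA4 else chr(b) for b in range(256))
buckets = build_buckets(table)
euro_byte = encode_char(buckets, "\u20ac")
```

Here U+20AC and U+00AC share bucket 0xAC, which is the kind of collision
mentioned above; the chain for that bucket has two entries.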


>>>...I suggest instead just /caching/ the translation in C arrays stored
>>>with the codec object.  The cache would be invalidated on any write to the
>>>codec's mapping dictionary, and rebuilt the next time anything was
>>>translated.  This would maintain the present semantics, work with current
>>>codecs, and still provide the desired speed improvement.
>
>That is not implementable. You cannot catch writes to the dictionary.

I should have been more clear.  I am thinking about using a proxy object in
the codec's 'encoding_map' and 'decoding_map' slots, that will forward all
the dictionary stuff.  The proxy will delete the cache on any call which
changes the dictionary contents.  There are proxy classes and dictproxy
(don't know how it's implemented yet) so it seems doable, at least as far as
I've gotten so far.
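
A minimal sketch of the proxy idea, assuming a pure-Python stand-in for what
would really live in C: the class name CachingMapProxy and its table() method
are hypothetical, and the "cache" here is just a 256-entry list.

```python
from collections.abc import MutableMapping


class CachingMapProxy(MutableMapping):
    """Hypothetical proxy: forwards to a real dict, drops a cache on writes."""

    def __init__(self, mapping):
        self._mapping = dict(mapping)
        self._table = None  # lazily built lookup table

    def __getitem__(self, key):
        return self._mapping[key]

    def __setitem__(self, key, value):
        self._mapping[key] = value
        self._table = None  # any write invalidates the cache

    def __delitem__(self, key):
        del self._mapping[key]
        self._table = None

    def __iter__(self):
        return iter(self._mapping)

    def __len__(self):
        return len(self._mapping)

    def table(self):
        """Return a 256-entry decode table, rebuilding it if invalidated."""
        if self._table is None:
            self._table = [self._mapping.get(byte) for byte in range(256)]
        return self._table


decoding_map = CachingMapProxy({0x41: 0x0041, 0x80: 0x20AC})
first = decoding_map.table()
decoding_map[0x80] = 0x00C4      # a write through the proxy drops the cache
second = decoding_map.table()
```

Because MutableMapping builds update(), pop(), clear() and friends on top of
__setitem__ and __delitem__, those calls invalidate the cache too. The sketch
only catches writes that go through the proxy; code holding a reference to the
underlying dict would bypass it.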


>> Note that this caching is done by new code added to the existing C
>> functions (which, if I have it right, are in unicodeobject.c).  No
>> architectural changes are made; no existing codecs need to be changed;
>> everything will just work
>
>Please try to implement it. You will find that you cannot. I don't
>see how regenerating/editing the codecs could be avoided.

Will do!
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson at georgeanelson.com>
      '                              <http://www.georgeanelson.com/>

From mwh at python.net  Thu Oct  6 17:07:47 2005
From: mwh at python.net (Michael Hudson)
Date: Thu, 06 Oct 2005 16:07:47 +0100
Subject: [Python-Dev] Lexical analysis and NEWLINE tokens
In-Reply-To: <23766.64.141.129.62.1128602211.squirrel@localhost> (Matthew F.
	Barnes's message of "Thu, 6 Oct 2005 07:36:51 -0500 (CDT)")
References: <23766.64.141.129.62.1128602211.squirrel@localhost>
Message-ID: <2mhdbuela4.fsf@starship.python.net>

"Matthew F. Barnes" <mfb at lotusland.dyndns.org> writes:

> I posted this question to python-help, but I think I have a better chance
> of getting the answer here.
>
> I'm looking for clarification on when NEWLINE tokens are generated during
> lexical analysis of Python source code.  In particular, I'm confused about
> some of the top-level components in Python's grammar (file_input,
> interactive_input, and eval_input).
>
> Section 2.1.7 of the reference manual states that blank lines (lines
> consisting only of whitespace and possibly a comment) do not generate
> NEWLINE tokens.  This is supported by the definition of a suite, which
> does not allow for standalone or consecutive NEWLINE tokens.
>
>     suite ::= stmt_list NEWLINE | NEWLINE INDENT statement+ DEDENT

I don't have the spare brain cells to think about your real problem
(sorry) but something to be aware of is that the pseudo EBNF of the
reference manual is purely descriptive -- it is not actually used in
the parsing of Python code at all.  Among other things this means it
could well just be wrong :/

The real grammar is Grammar/Grammar in the source distribution.

Cheers,
mwh

-- 
  The Internet is full.  Go away.
                      -- http://www.disobey.com/devilshat/ds011101.htm

From guido at python.org  Thu Oct  6 17:27:13 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 6 Oct 2005 08:27:13 -0700
Subject: [Python-Dev] PEP 343 and __with__
In-Reply-To: <bb8868b90510040751u454a8f85ma6e6ecf90600ef71@mail.gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<bb8868b90510031415x711dc703u72f2612bcda3e457@mail.gmail.com>
	<434257B7.9000909@gmail.com>
	<bb8868b90510040738l34a397c5s27ccb74eb1ebbbec@mail.gmail.com>
	<bb8868b90510040751u454a8f85ma6e6ecf90600ef71@mail.gmail.com>
Message-ID: <ca471dc20510060827s2cb0fac0r7c8bc9c25e5ed604@mail.gmail.com>

Just a quick note. Nick convinced me that adding __with__ (without
losing __enter__ and __exit__!) is a good thing, especially for the
decimal context manager. He's got a complete proposal for PEP changes
which he'll post here. After a brief feedback period I'll approve his
changes and he'll check them into the PEP.

My apologies to Jason for missing the point he was making; thanks to
Nick for getting it and turning it into a productive change proposal.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Thu Oct  6 17:30:45 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 6 Oct 2005 08:30:45 -0700
Subject: [Python-Dev] Lexical analysis and NEWLINE tokens
In-Reply-To: <2mhdbuela4.fsf@starship.python.net>
References: <23766.64.141.129.62.1128602211.squirrel@localhost>
	<2mhdbuela4.fsf@starship.python.net>
Message-ID: <ca471dc20510060830nb708df9y5d10c43d6242d228@mail.gmail.com>

I think it is a relic from the distant past, when the lexer did
generate NEWLINE for every blank line. I think the only case where you
can still get a NEWLINE by itself is in interactive mode. This code is
extremely convoluted and may be buggy in end cases; this could explain
why you get a continuation prompt after entering a comment in
interactive mode...

--Guido

On 10/6/05, Michael Hudson <mwh at python.net> wrote:
> "Matthew F. Barnes" <mfb at lotusland.dyndns.org> writes:
>
> > I posted this question to python-help, but I think I have a better chance
> > of getting the answer here.
> >
> > I'm looking for clarification on when NEWLINE tokens are generated during
> > lexical analysis of Python source code.  In particular, I'm confused about
> > some of the top-level components in Python's grammar (file_input,
> > interactive_input, and eval_input).
> >
> > Section 2.1.7 of the reference manual states that blank lines (lines
> > consisting only of whitespace and possibly a comment) do not generate
> > NEWLINE tokens.  This is supported by the definition of a suite, which
> > does not allow for standalone or consecutive NEWLINE tokens.
> >
> >     suite ::= stmt_list NEWLINE | NEWLINE INDENT statement+ DEDENT
>
> I don't have the spare brain cells to think about your real problem
> (sorry) but something to be aware of is that the pseudo EBNF of the
> reference manual is purely descriptive -- it is not actually used in
> the parsing of Python code at all.  Among other things this means it
> could well just be wrong :/
>
> The real grammar is Grammar/Grammar in the source distribution.
>
> Cheers,
> mwh
>
> --
>   The Internet is full.  Go away.
>                       -- http://www.disobey.com/devilshat/ds011101.htm
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Thu Oct  6 18:11:17 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 06 Oct 2005 12:11:17 -0400
Subject: [Python-Dev] Lexical analysis and NEWLINE tokens
In-Reply-To: <23766.64.141.129.62.1128602211.squirrel@localhost>
Message-ID: <5.1.1.6.0.20051006120937.01f6f2a0@mail.telecommunity.com>

At 07:36 AM 10/6/2005 -0500, Matthew F. Barnes wrote:
>I posted this question to python-help, but I think I have a better chance
>of getting the answer here.
>
>I'm looking for clarification on when NEWLINE tokens are generated during
>lexical analysis of Python source code.

If you're talking about the "tokenize" module, NEWLINE is only generated 
following a logical line, which is one that contains code.
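
As a quick illustration (a sketch assuming a current Python 3 tokenize module;
the sample source string is made up), blank and comment-only lines show up as
NL tokens, while NEWLINE appears only at the end of lines that contain code:

```python
import io
import tokenize

# Two logical lines separated by a blank line and a comment-only line.
source = "x = 1\n\n# just a comment\ny = 2\n"

names = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]
newline_count = names.count("NEWLINE")  # ends of logical lines
nl_count = names.count("NL")            # blank or comment-only lines
print(names)
```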


From nas at arctrix.com  Thu Oct  6 18:22:53 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 6 Oct 2005 16:22:53 +0000 (UTC)
Subject: [Python-Dev] Python 2.5 and ast-branch
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com>
	<ca471dc20510040731v5fde7f79k3f9b649e65436054@mail.gmail.com>
	<4343D532.2030202@gmail.com>
	<ca471dc20510050752x8296aaax7d0bfb88c25516a5@mail.gmail.com>
	<bbaeab100510051400i5430320co67c2000ec53776f@mail.gmail.com>
	<4344F71E.9000708@gmail.com>
Message-ID: <di3j0t$kv2$1@sea.gmane.org>

Nick Coghlan <ncoghlan at gmail.com> wrote:
> If we kill the branch for now, then anyone that wants to bring up the idea 
> again can write a PEP first

I still have some (very) small hope that it can be finished.  If we
don't get it done soon then I fear that it will never happen.  I had
hoped that a SoC student would pick up the task or someone would ask
for a grant from the PSF.  Oh well.

> A strategy that may work out better is [...]

Another thought I've had recently is that most of the complexity
seems to be in the CST to AST translator.  Perhaps having a parser
that provided a nicer CST might help.

  Neil


From guido at python.org  Thu Oct  6 19:03:00 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 6 Oct 2005 10:03:00 -0700
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <di3j0t$kv2$1@sea.gmane.org>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com>
	<ca471dc20510040731v5fde7f79k3f9b649e65436054@mail.gmail.com>
	<4343D532.2030202@gmail.com>
	<ca471dc20510050752x8296aaax7d0bfb88c25516a5@mail.gmail.com>
	<bbaeab100510051400i5430320co67c2000ec53776f@mail.gmail.com>
	<4344F71E.9000708@gmail.com> <di3j0t$kv2$1@sea.gmane.org>
Message-ID: <ca471dc20510061003m34d8007cofe610e785256194@mail.gmail.com>

On 10/6/05, Neil Schemenauer <nas at arctrix.com> wrote:
> Nick Coghlan <ncoghlan at gmail.com> wrote:
> > If we kill the branch for now, then anyone that wants to bring up the idea
> > again can write a PEP first
>
> I still have some (very) small hope that it can be finished.  If we
> don't get it done soon then I fear that it will never happen.  I had
> hoped that a SoC student would pick up the task or someone would ask
> for a grant from the PSF.  Oh well.
>
> > A strategy that may work out better is [...]
>
> Another thought I've had recently is that most of the complexity
> seems to be in the CST to AST translator.  Perhaps having a parser
> that provided a nicer CST might help.

Dream on, Neil... Adding more work won't make it more likely to happen.

The only alternative to abandoning it that I see is to merge it back
into main NOW, using the time that remains until the 2.5 release to
make it robust. That way, everybody can help out (and it may motivate
more people).

Even if this is a temporary regression (e.g. PEP 342), it might be
worth it -- but only if there are at least two people committed to
help out quickly when there are problems.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From BruceEckel-Python3234 at mailblocks.com  Thu Oct  6 19:12:05 2005
From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel)
Date: Thu, 6 Oct 2005 11:12:05 -0600
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <8393fff0510022253l6c37f3f8x6f5254018d6f348e@mail.gmail.com>
References: <fb6fbf5605092915471cbb1189@mail.gmail.com>
	<fb6fbf5605092916053506b9aa@mail.gmail.com>
	<1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr>
	<8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com>
	<1766050860214964952@unknownmsgid>
	<8393fff0510021449p3607898cp6dcecc615450b46b@mail.gmail.com>
	<60ed19d40510021619p7f6e2641udf172a0d0d19283e@mail.gmail.com>
	<8393fff0510022253l6c37f3f8x6f5254018d6f348e@mail.gmail.com>
Message-ID: <9410082351.20051006111205@MailBlocks.com>

Jeremy Jones published a blog discussing some of the ideas we've
talked about here:
http://www.oreillynet.com/pub/wlg/8002
Although I hope our conversation isn't done, as he suggests!

At some point when more ideas have been thrown about (and TIJ4 is
done) I hope to summarize what we've talked about in an article.

Bruce Eckel    http://www.BruceEckel.com   mailto:BruceEckel-Python3234 at mailblocks.com
Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e"
Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel
Subscribe to my newsletter:
http://www.mindview.net/Newsletter
My schedule can be found at:
http://www.mindview.net/Calendar




From arathorn at fastwebnet.it  Thu Oct  6 19:26:26 2005
From: arathorn at fastwebnet.it (Paolo Invernizzi)
Date: Thu, 06 Oct 2005 19:26:26 +0200
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <9410082351.20051006111205@MailBlocks.com>
References: <fb6fbf5605092915471cbb1189@mail.gmail.com>	<fb6fbf5605092916053506b9aa@mail.gmail.com>	<1128073517.25688.10.camel@p-dvsi-418-1.rd.francetelecom.fr>	<8393fff0510011450l2e9dd608hce91ed00516d0935@mail.gmail.com>	<1766050860214964952@unknownmsgid>	<8393fff0510021449p3607898cp6dcecc615450b46b@mail.gmail.com>	<60ed19d40510021619p7f6e2641udf172a0d0d19283e@mail.gmail.com>	<8393fff0510022253l6c37f3f8x6f5254018d6f348e@mail.gmail.com>
	<9410082351.20051006111205@MailBlocks.com>
Message-ID: <di3mns$4kn$1@sea.gmane.org>

Just to add another 2 cents....

http://www.erights.org/talks/promises/paper/tgc05.pdf

---
Paolo Invernizzi


Bruce Eckel wrote:
> Jeremy Jones published a blog discussing some of the ideas we've
> talked about here:
> http://www.oreillynet.com/pub/wlg/8002
> Although I hope our conversation isn't done, as he suggests!
> 
> At some point when more ideas have been thrown about (and TIJ4 is
> done) I hope to summarize what we've talked about in an article.


From jeremy at alum.mit.edu  Thu Oct  6 21:42:47 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 6 Oct 2005 15:42:47 -0400
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <ca471dc20510061003m34d8007cofe610e785256194@mail.gmail.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com>
	<ca471dc20510040731v5fde7f79k3f9b649e65436054@mail.gmail.com>
	<4343D532.2030202@gmail.com>
	<ca471dc20510050752x8296aaax7d0bfb88c25516a5@mail.gmail.com>
	<bbaeab100510051400i5430320co67c2000ec53776f@mail.gmail.com>
	<4344F71E.9000708@gmail.com> <di3j0t$kv2$1@sea.gmane.org>
	<ca471dc20510061003m34d8007cofe610e785256194@mail.gmail.com>
Message-ID: <e8bf7a530510061242o1404a013pc84ad40a0927f0c3@mail.gmail.com>

On 10/6/05, Guido van Rossum <guido at python.org> wrote:
> On 10/6/05, Neil Schemenauer <nas at arctrix.com> wrote:
> > Nick Coghlan <ncoghlan at gmail.com> wrote:
> > > If we kill the branch for now, then anyone that wants to bring up the idea
> > > again can write a PEP first
> >
> > I still have some (very) small hope that it can be finished.  If we
> > don't get it done soon then I fear that it will never happen.  I had
> > hoped that a SoC student would pick up the task or someone would ask
> > for a grant from the PSF.  Oh well.
> >
> > > A strategy that may work out better is [...]
> >
> > Another thought I've had recently is that most of the complexity
> > seems to be in the CST to AST translator.  Perhaps having a parser
> > that provided a nicer CST might help.
>
> Dream on, Neil... Adding more work won't make it more likely to happen.

You're both right.  The CST-to-AST translator is fairly complex; it
would be better to parse directly to an AST.  On the other hand, the
AST translator seems fairly complete and not particularly hard to
write.  I'd love to see a new parser in 2.6.

> The only alternative to abandoning it that I see is to merge it back
> into main NOW, using the time that remains us until the 2.5 release to
> make it robust. That way, everybody can help out (and it may motivate
> more people).
>
> Even if this is a temporary regression (e.g. PEP 342), it might be
> worth it -- but only if there are at least two people committed to
> help out quickly when there are problems.

I'm sorry I didn't respond earlier.  I've been home with a new baby
for the last six weeks and haven't been keeping a close eye on my
email.  (I didn't see Nick's earlier email until his most recent
post.)

It would take a few days of work to get the branch ready to merge to
the head.  There are basic issues like renaming newcompile.c to
compile.c and the like.  I could work on that tomorrow and Monday.

I did do a little work on the ast branch earlier this week.  The
remaining issues feel pretty manageable, so you can certainly count me
as one of the two people committed to help out.  I'll make a point of
keeping a closer eye on python-dev email, in addition to writing some
code.

Jeremy

From ms at cerenity.org  Thu Oct  6 21:54:56 2005
From: ms at cerenity.org (Michael Sparks)
Date: Thu, 6 Oct 2005 20:54:56 +0100
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <9410082351.20051006111205@MailBlocks.com>
References: <fb6fbf5605092915471cbb1189@mail.gmail.com>
	<8393fff0510022253l6c37f3f8x6f5254018d6f348e@mail.gmail.com>
	<9410082351.20051006111205@MailBlocks.com>
Message-ID: <200510062054.56985.ms@cerenity.org>

Hi Bruce,


On Thursday 06 October 2005 18:12, Bruce Eckel wrote:
> Although I hope our conversation isn't done, as he suggests!
...
> At some point when more ideas have been thrown about (and TIJ4 is
> done) I hope to summarize what we've talked about in an article.

I don't know if you saw my previous post[1] to python-dev on this topic, but 
Kamaelia is specifically aimed at making concurrency simple and easy to use. 
Initially we were focussed on using scheduled generators for co-operative 
CSP-style (but with buffers) concurrency.
   [1] http://tinyurl.com/dfnah, http://tinyurl.com/e4jfq

We've tested the system so far on 2 relatively inexperienced programmers
(as well as experienced, but the more interesting group is novices). The one
who hadn't done much programming at all (a little bit of VB, pre-university)
actually fared better IMO. This is probably because concurrency became
part of his standard toolbox of approaches.

I've placed the slides I've produced for Euro OSCON on Kamaelia here:
   * http://cerenity.org/KamaeliaEuroOSCON2005.pdf

The corrected URL for the whitepaper based on work now 6 months old (we've 
come quite a way since then!) is here:
   * http://www.bbc.co.uk/rd/pubs/whp/whp113.shtml

Consider a simple server for sending text (generated by a user typing into the 
server) to multiple clients connecting to a server. This is a naturally 
concurrent problem in various ways (user interaction, splitting, listening 
for connections, serving connections, etc). Why is that interesting to us? 
It's effectively a microcosm of how subtitling works. (I work at the BBC)

In Kamaelia this looks like this:

=== start ===
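class ConsoleReader(threadedcomponent):
    # Runs in its own thread, so the blocking raw_input() call is harmless.
    def run(self):
        while 1:
            line = raw_input(">>> ")
            line = line + "\n"
            self.outqueues["outbox"].put(line)

# Named splitter: whatever is published to it goes to every subscriber.
Backplane("subtitles").activate()
pipeline(
    ConsoleReader(),
    publishTo("subtitles"),
).activate()

# Protocol handler for the server: subscribe to the backplane.
def subtitles_protocol():
    return subscribeTo("subtitles")

SimpleServer(subtitles_protocol, 5000).run()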
=== end ===

The ConsoleReader is threaded to allow the use of the naive way of
reading from the input, whereas the server, backplane (a named splitter
component in practice), pipelines, publishing, subscribing, splitting,
etc are all single threaded co-operative concurrency.

A possible client for this text service might be:

pipeline(
    TCPClient("subtitles.rd.bbc.co.uk", 5000),
    Ticker(),
).run()

(Though that would be a bit bare, even if it does use pygame :)

The entire system is based around communicating generators, but we also
have threads for blocking operations. (Though the entire network subsystem
is non-blocking)
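
To give a flavour of what "communicating generators" means, here is a very
rough sketch in plain Python. It does not use Kamaelia or Axon, and the names
producer, consumer and run are made up for illustration; a deque stands in for
an outbox wired to an inbox.

```python
from collections import deque


def producer(outbox, lines):
    """Put each line in the outbox, yielding control after every send."""
    for line in lines:
        outbox.append(line)
        yield


def consumer(inbox, received):
    """Drain the inbox whenever scheduled; stop at the None sentinel."""
    while True:
        while inbox:
            item = inbox.popleft()
            if item is None:
                return
            received.append(item)
        yield


def run(tasks):
    """Round-robin scheduler: step each generator until all have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # finished, so it is not rescheduled
        ready.append(task)


link = deque()   # stands in for an outbox connected to an inbox
received = []
run([producer(link, ["hello\n", "world\n", None]), consumer(link, received)])
```

Each generator gives up control at its yield, so no locks are needed; the cost
is that a component which blocks (such as one calling raw_input) stalls
everything, which is why those go in threads.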

What I'd be interested in, is hearing how our system doesn't match with
the goals of the hypothetical concurrency system you'd like to see (if it
doesn't). The main reason I'm interested in hearing this, is because the
goals you listed are ones we want to achieve. If you don't think our system
matches it (we don't have process migration as yet, so that's one area)
I'd be interested in hearing what areas you think are deficient.

However, the way we're beginning to refer to the project is to refer to
just the component aspect rather than concurrency - for one simple
reason - we're getting to the stage where we can ignore /most/ concurrency
issues (not all).

If you have any time for feedback, it'd be appreciated. If you don't I hope 
it's useful food for thought! 

Best Regards,


Michael
-- 
Michael Sparks, Senior R&D Engineer, Digital Media Group
Michael.Sparks at rd.bbc.co.uk, http://kamaelia.sourceforge.net/
British Broadcasting Corporation, Research and Development
Kingswood Warren, Surrey KT20 6NP

This e-mail may contain personal views which are not the views of the BBC.

From BruceEckel-Python3234 at mailblocks.com  Thu Oct  6 22:06:37 2005
From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel)
Date: Thu, 6 Oct 2005 14:06:37 -0600
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <200510062054.56985.ms@cerenity.org>
References: <fb6fbf5605092915471cbb1189@mail.gmail.com>
	<8393fff0510022253l6c37f3f8x6f5254018d6f348e@mail.gmail.com>
	<9410082351.20051006111205@MailBlocks.com>
	<200510062054.56985.ms@cerenity.org>
Message-ID: <1093762964.20051006140637@MailBlocks.com>

This does look quite fascinating, and I know there's a lot of really
interesting work going on at the BBC now -- looks like some really
pioneering stuff going on with respect to TV show distribution over
the internet, new compression formats, etc.

So yes indeed, this is quite high on my list to research. Looks like
people there have been doing some interesting work.

Right now I'm just trying to cast a net, so that people can put in
ideas, for when the Java book is done and I can spend more time on it.

Bruce Eckel    http://www.BruceEckel.com   mailto:BruceEckel-Python3234 at mailblocks.com
Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e"
Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel
Subscribe to my newsletter:
http://www.mindview.net/Newsletter
My schedule can be found at:
http://www.mindview.net/Calendar




From marvinpublic at comcast.net  Thu Oct  6 23:22:09 2005
From: marvinpublic at comcast.net (Marvin)
Date: Thu, 06 Oct 2005 17:22:09 -0400
Subject: [Python-Dev] Static builds on Windows (continued)
In-Reply-To: <mailman.3948.1128494877.508.python-dev@python.org>
References: <mailman.3948.1128494877.508.python-dev@python.org>
Message-ID: <43459581.4060509@comcast.net>

> Date: Wed, 05 Oct 2005 00:21:20 +0200
> From: "Martin v. Löwis" <martin at v.loewis.de>
> Subject: Re: [Python-Dev] Static builds on Windows (continued)
> Cc: python-dev at python.org
> Message-ID: <43430060.6070909 at v.loewis.de>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> Marvin wrote:
> 
>>I built pythoncore and python. The resulting python.exe worked fine, but did
>>indeed fail when I tried to dynamically load anything (Dialog said: the
>>application terminated abnormally)
> 
> 
> Not sure what you are trying to do here. In your case, dynamic loading 
> simply cannot work. The extension modules all link with python24.dll, 
> which you don't have. It may find some python24.dll, which then gives 
> conflicts with the Python interpreter that is already running.
> 
> So what you really should do is disable dynamic loading entirely. To do
> so, remove dynload_win from your project, and #undef 
> HAVE_DYNAMIC_LOADING in PC/pyconfig.h.
> 
> Not sure if anybody has recently tested whether this configuration
> actually works - if you find that it doesn't, please post your patches
> to sf.net/projects/python.
> 
> If you really want to provide dynamic loading of some kind, you should
> arrange the extension modules to import the symbols from your .exe.
> Linking the exe should generate an import library, and you should link
> the extensions against that.
> 
> HTH,
> Martin
> 

I'll try that when I get back to this and report my results.  I figured out 
that I can avoid the need for dynamic loading.  I wanted to use some existing 
extension modules, and the whole point was to use them as they are, but as you 
point out they are linked against a DLL.  So even if I created an .EXE that 
exported the symbols, I'd still have to rebuild the extensions.

From jcarlson at uci.edu  Fri Oct  7 00:15:07 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 06 Oct 2005 15:15:07 -0700
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <200510062054.56985.ms@cerenity.org>
References: <9410082351.20051006111205@MailBlocks.com>
	<200510062054.56985.ms@cerenity.org>
Message-ID: <20051006143740.287E.JCARLSON@uci.edu>


Michael Sparks <ms at cerenity.org> wrote:
> What I'd be interested in, is hearing how our system doesn't match with
> the goals of the hypothetical concurrency system you'd like to see (if it
> doesn't). The main reason I'm interested in hearing this, is because the
> goals you listed are ones we want to achieve. If you don't think our system
> matches it (we don't have process migration as yet, so that's one area)
> I'd be interested in hearing what areas you think are deficient.

I've not used the system you have worked on, so perhaps this is easy,
but the vast majority of concurrency issues can be described as fitting
into one or more of the following task distribution categories.

1. one to many (one producer, many consumers) without duplication (no
consumer has the same data, essentially a distributed queue)
2. one to many (one producer, many consumers) with duplication (the
producer broadcasts to all consumers)
3. many to one (many producers, one consumer)
4. many to many (many producers, many consumers) without duplication (no
consumer has the same data, essentially a distributed queue)
5. many to many (many producers, many consumers) with duplication (all
producers broadcast to all consumers)
6. one to one without duplication
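
The "with duplication" versus "without duplication" distinction in the list
above can be sketched in a few lines of Python. This is a deliberately
simplified, single-threaded illustration: the function names are made up, and
round-robin assignment stands in for consumers pulling from a shared queue.

```python
from collections import deque
from itertools import cycle


def distribute(items, consumers):
    """Without duplication: each item goes to exactly one consumer."""
    targets = cycle(consumers)
    for item in items:
        next(targets).append(item)


def broadcast(items, consumers):
    """With duplication: every consumer receives every item."""
    for item in items:
        for inbox in consumers:
            inbox.append(item)


work = ["a", "b", "c", "d"]

shared = [deque(), deque()]   # cases 1 and 4: a distributed queue
distribute(work, shared)

copies = [deque(), deque()]   # cases 2 and 5: a broadcast
broadcast(work, copies)
```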

MPI, for example, handles all the above cases with minor work, and
tuple space systems such as Linda can support all of the above with a
bit of work in cases 2 and 5.

If Kamaelia is able to handle all of the above mechanisms in both a
blocking and non-blocking fashion, then I would guess it has the basic
requirements for most concurrent applications.  If, however, it is not
able to easily handle all of the above mechanisms, or has issues with
blocking and/or non-blocking semantics on the producer and/or consumer
end, then it is likely that it will have difficulty gaining traction in
certain applications where the unsupported mechanism is common and/or
necessary.

One nice thing about the message queue style (which it seems as though
Kamaelia implements) is that it guarantees that a listener won't receive
the same message twice when broadcasting a message to multiple listeners
(cases 2 and 5 above) - something that is a bit more difficult to
guarantee in a tuple space scenario, but which is still possible (which
spurs me to add it into my tuple space implementation before it is
released). Another nice thing is that subscriptions to a queue seem to
be persistent in Kamaelia, which I should also implement.


 - Josiah


From kbk at shore.net  Fri Oct  7 02:00:29 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Thu, 06 Oct 2005 20:00:29 -0400
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <e8bf7a530510061242o1404a013pc84ad40a0927f0c3@mail.gmail.com>
	(Jeremy Hylton's message of "Thu, 6 Oct 2005 15:42:47 -0400")
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<2mu0fxekdz.fsf@starship.python.net> <43424480.9080900@gmail.com>
	<ca471dc20510040731v5fde7f79k3f9b649e65436054@mail.gmail.com>
	<4343D532.2030202@gmail.com>
	<ca471dc20510050752x8296aaax7d0bfb88c25516a5@mail.gmail.com>
	<bbaeab100510051400i5430320co67c2000ec53776f@mail.gmail.com>
	<4344F71E.9000708@gmail.com> <di3j0t$kv2$1@sea.gmane.org>
	<ca471dc20510061003m34d8007cofe610e785256194@mail.gmail.com>
	<e8bf7a530510061242o1404a013pc84ad40a0927f0c3@mail.gmail.com>
Message-ID: <87irwadwma.fsf@hydra.bayview.thirdcreek.com>

Jeremy Hylton <jeremy at alum.mit.edu> writes:

> On 10/6/05, Guido van Rossum <guido at python.org> wrote:
>> The only alternative to abandoning it that I see is to merge it back
>> into main NOW, using the time that remains us until the 2.5 release to
>> make it robust. That way, everybody can help out (and it may motivate
>> more people).
>>
>> Even if this is a temporary regression (e.g. PEP 342), it might be
>> worth it -- but only if there are at least two people committed to
>> help out quickly when there are problems.
>
> I'm sorry I didn't respond earlier.  I've been home with a new baby
> for the last six weeks and haven't been keeping a close eye on my
> email.  (I didn't see Nick's earlier email until his most recent
> post.)
>
> It would take a few days of work to get the branch ready to merge to
> the head.  There are basic issues like renaming newcompile.c to
> compile.c and the like.  I could work on that tomorrow and Monday.

Unless I'm missing something, we would need to merge HEAD to the AST
branch once more to pick up the changes in MAIN since the last merge,
and then make sure everything in the AST branch is passing the test
suite.  Otherwise we risk having MAIN broken for awhile following a
merge.

Finally, we can then merge the diff of HEAD to AST back into MAIN.

If we try to merge the entire AST branch since its inception, we will
re-apply to MAIN those changes made in MAIN which have already been
merged to the AST branch and it will be difficult to sort out all the
conflicts.

If we try to merge the AST branch from its last merge tag to its
head we will miss the work done on AST prior to that merge.
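
The three options above can be sketched with a toy model in which each
change is a label and a branch is the set of changes applied to it. This
is only an illustration of the reasoning: real CVS merges are textual, and
every name below (m1, a1, and so on) is made up for the example.

```python
# Toy model: a branch is the set of change labels applied to it.
branch_point = frozenset({"base"})

# MAIN has picked up three changes since the AST branch was cut.
main_head = branch_point | {"m1", "m2", "m3"}

# On the AST branch, a1 was done before the last MAIN->AST merge
# (which brought in m1 and m2), and a2 was done after it.
ast_at_last_merge = branch_point | {"a1", "m1", "m2"}
ast_head = ast_at_last_merge | {"a2"}


def diff(old, new):
    """Changes needed to turn `old` into `new`."""
    return new - old


# Option A: merge the whole AST branch since its inception.
whole_branch = diff(branch_point, ast_head)
# These changes are already in MAIN and would be applied a second time.
reapplied_to_main = whole_branch & main_head

# Option B: merge only from the last merge tag to the AST head.
# This misses a1, the AST work done before that merge.
since_last_merge_tag = diff(ast_at_last_merge, ast_head)

# Option C: merge HEAD into AST once more, then merge the diff of
# HEAD to AST back into MAIN. Only the AST work remains in the diff.
ast_after_sync = ast_head | main_head
head_to_ast = diff(main_head, ast_after_sync)

print(sorted(reapplied_to_main), sorted(since_last_merge_tag), sorted(head_to_ast))
```

In this model only the third option yields exactly the AST-only changes.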

Let me know at kbk at shore.net if you want to do this.

-- 
KBK

From raymond.hettinger at verizon.net  Fri Oct  7 02:26:07 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Thu, 06 Oct 2005 20:26:07 -0400
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <87irwadwma.fsf@hydra.bayview.thirdcreek.com>
Message-ID: <000001c5cad5$b6b83ee0$a105a044@oemcomputer>

> Unless I'm missing something, we would need to merge HEAD to the AST
> branch once more to pick up the changes in MAIN since the last merge,
> and then make sure everything in the AST branch is passing the test
> suite.  Otherwise we risk having MAIN broken for awhile following a
> merge.

IMO, merging to the head is a somewhat dangerous strategy that doesn't
have any benefits.  Whether done on the head or in the branch, the same
amount of work needs to be done.

If the stability of the head is disrupted, it may impede other
maintenance efforts because it is harder to test bug fixes when the test
suites are not passing. 


From ms at cerenity.org  Fri Oct  7 02:45:16 2005
From: ms at cerenity.org (Michael Sparks)
Date: Fri, 7 Oct 2005 01:45:16 +0100
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <20051006143740.287E.JCARLSON@uci.edu>
References: <9410082351.20051006111205@MailBlocks.com>
	<200510062054.56985.ms@cerenity.org>
	<20051006143740.287E.JCARLSON@uci.edu>
Message-ID: <200510070145.17284.ms@cerenity.org>

On Thursday 06 October 2005 23:15, Josiah Carlson wrote:
[... 6 specific use cases ...]
> If Kamaelia is able to handle all of the above mechanisms in both a
> blocking and non-blocking fashion, then I would guess it has the basic
> requirements for most concurrent applications.

It can. I can easily knock up examples for each if required :-)

That said, a more interesting example implemented this week (as part of
a rapid prototyping project to look at collaborative community radio)
implements a networked audio mixer matrix. That allows multiple sources of
audio to be mixed and sent on to multiple destinations, which may be duplicate
mixes of each other, but may also select different mixes. The same system also
includes point to point communications for network control of the mix.

That application covers ( I /think/ ) 1, 2, 3, 4,  and 6 on your list of
things as I understand what you mean. 5 is fairly trivial though. (The
largest bottleneck in writing it was my personal misunderstanding of
how to actually mix 16bit signed audio :-)
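
For readers wondering what mixing 16-bit signed audio involves, one common
approach is to sum the corresponding samples and clip the result to the
16-bit range. The sketch below assumes that approach; it is not Kamaelia's
mixer code, and the name mix_int16 is hypothetical.

```python
import array

INT16_MIN, INT16_MAX = -32768, 32767


def mix_int16(*sources):
    """Mix equal-length sequences of signed 16-bit samples.

    Corresponding samples are summed and the sum is clipped to the
    signed 16-bit range, so loud passages saturate rather than wrap.
    """
    if not sources:
        return array.array("h")
    length = len(sources[0])
    if any(len(source) != length for source in sources):
        raise ValueError("all sources must have the same number of samples")
    mixed = array.array("h")  # "h" is a signed 16-bit integer
    for samples in zip(*sources):
        total = sum(samples)  # Python ints do not overflow, so this is exact
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


print(mix_int16([1000, -2000, 30000], [500, -500, 10000]))
```

Adding the raw values in a fixed-width 16-bit type instead would wrap
around on overflow, which is heard as harsh distortion.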

Regarding blocking & non-blocking, links can be marked as synchronous, which
forces blocking style behaviour. Since generally we're using generators, we
can't block for real which is why we throw an exception there. However,
threaded components can & do block. The reason for this was due to the
architecture being inspired by noting the similarities between asynchronous
hardware systems/languages and network systems.

> into my tuple space implementation before it is released. 

I'd be interested in hearing more about that BTW. One thing we've found is
that, much as organic systems have a neural system for communications between
things (hence Axon :), you also need the equivalent of a hormonal system.
In the unix shell world, IMO the environment acts as that for pipelines, and
similarly that's why we have an assistant system. (Which has key/value lookup
facilities)

It's a less obvious requirement, but is a useful one nonetheless, so I don't
really see a message passing style as excluding a linda approach - since
they're orthogonal approaches.

Best Regards,


Michael.
-- 
Michael Sparks, Senior R&D Engineer, Digital Media Group
Michael.Sparks at rd.bbc.co.uk, http://kamaelia.sourceforge.net/
British Broadcasting Corporation, Research and Development
Kingswood Warren, Surrey KT20 6NP

This e-mail may contain personal views which are not the views of the BBC.

From guido at python.org  Fri Oct  7 04:34:11 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 6 Oct 2005 19:34:11 -0700
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <000001c5cad5$b6b83ee0$a105a044@oemcomputer>
References: <87irwadwma.fsf@hydra.bayview.thirdcreek.com>
	<000001c5cad5$b6b83ee0$a105a044@oemcomputer>
Message-ID: <ca471dc20510061934jfb4f01frecaf562291a57038@mail.gmail.com>

[Kurt]
> > Unless I'm missing something, we would need to merge HEAD to the AST
> > branch once more to pick up the changes in MAIN since the last merge,
> > and then make sure everything in the AST branch is passing the test
> > suite.  Otherwise we risk having MAIN broken for awhile following a
> > merge.

[Raymond]
> IMO, merging to the head is a somewhat dangerous strategy that doesn't
> have any benefits.  Whether done on the head or in the branch, the same
> amount of work needs to be done.
>
> If the stability of the head is disrupted, it may impede other
> maintenance efforts because it is harder to test bug fixes when the test
> suites are not passing.

Well, at some point it will HAVE to be merged into the head. The
longer we wait the more painful it will be. If we suffer a week of
instability now, I think that's acceptable, as long as all developers
are suitably alerted, and as long as the AST team works towards
resolving the issues ASAP.

I happen to agree with Kurt that we should first merge the head into
the branch; then the AST team can work on making sure the entire test
suite passes; then they can merge back into the head.

BUT this should only be done with a serious commitment from the AST
team (I think Neil and Jeremy are offering this -- I just don't know
how much time they will have available, realistically).

My main point is, we should EITHER abandon the AST branch, OR force a
quick resolution. I'm willing to suffer a week of instability in head
now, or in a week or two -- but I'm not willing to wait again.

Let's draw a line in the sand. The AST team (which includes whoever
will help) has up to three weeks to get the AST branch into a position
where it passes all the current unit tests merged in from the head.
Then they merge it into the head after which we can accept at most a
week of instability in the head. After that the AST team must remain
available to resolve remaining issues quickly.

How does this sound to the non-AST-branch developers who have to
suffer the inevitable post-merge instability? I think it's now or
never -- waiting longer isn't going to make this thing easier (not
with several more language changes approved: with-statement, extended
import, what else...)

What does the AST team think?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Fri Oct  7 04:42:16 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 06 Oct 2005 22:42:16 -0400
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <ca471dc20510061934jfb4f01frecaf562291a57038@mail.gmail.com>
References: <000001c5cad5$b6b83ee0$a105a044@oemcomputer>
	<87irwadwma.fsf@hydra.bayview.thirdcreek.com>
	<000001c5cad5$b6b83ee0$a105a044@oemcomputer>
Message-ID: <5.1.1.6.0.20051006224020.01f76148@mail.telecommunity.com>

At 07:34 PM 10/6/2005 -0700, Guido van Rossum wrote:
>How does this sound to the non-AST-branch developers who have to
>suffer the inevitable post-merge instability? I think it's now or
>never -- waiting longer isn't going to make this thing easier (not
>with several more language changes approved: with-statement, extended
>import, what else...)

Do the AST branch changes affect the interface of the "parser" module?  Or 
do they just add new functionality?


From bcannon at gmail.com  Fri Oct  7 05:48:40 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 6 Oct 2005 20:48:40 -0700
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <ca471dc20510061934jfb4f01frecaf562291a57038@mail.gmail.com>
References: <87irwadwma.fsf@hydra.bayview.thirdcreek.com>
	<000001c5cad5$b6b83ee0$a105a044@oemcomputer>
	<ca471dc20510061934jfb4f01frecaf562291a57038@mail.gmail.com>
Message-ID: <bbaeab100510062048t412e8bafo56dd13daa7b608e8@mail.gmail.com>

On 10/6/05, Guido van Rossum <guido at python.org> wrote:
> [Kurt]
> > > Unless I'm missing something, we would need to merge HEAD to the AST
> > > branch once more to pick up the changes in MAIN since the last merge,
> > > and then make sure everything in the AST branch is passing the test
> > > suite.  Otherwise we risk having MAIN broken for awhile following a
> > > merge.
>
> [Raymond]
> > IMO, merging to the head is a somewhat dangerous strategy that doesn't
> > have any benefits.  Whether done on the head or in the branch, the same
> > amount of work needs to be done.
> >
> > If the stability of the head is disrupted, it may impede other
> > maintenance efforts because it is harder to test bug fixes when the test
> > suites are not passing.
>
> Well, at some point it will HAVE to be merged into the head. The
> longer we wait the more painful it will be. If we suffer a week of
> instability now, I think that's acceptable, as long as all developers
> are suitably alerted, and as long as the AST team works towards
> resolving the issues ASAP.
>
> I happen to agree with Kurt that we should first merge the head into
> the branch; then the AST team can work on making sure the entire test
> suite passes; then they can merge back into the head.
>
> BUT this should only be done with a serious commitment from the AST
> team (I think Neil and Jeremy are offering this -- I just don't know
> how much time they will have available, realistically).
>
> My main point is, we should EITHER abandon the AST branch, OR force a
> quick resolution. I'm willing to suffer a week of instability in head
> now, or in a week or two -- but I'm not willing to wait again.
>
> Let's draw a line in the sand. The AST team (which includes whoever
> will help) has up to three weeks to get the AST branch into a position
> where it passes all the current unit tests merged in from the head.
> Then they merge it into the head after which we can accept at most a
> week of instability in the head. After that the AST team must remain
> available to resolve remaining issues quickly.
>

So basically we have until November 1 to get all tests passing?

For anyone who wants a snapshot of where things stand,
http://www.python.org/sf/1191458 lists the tests that are currently
failing (read the comments to get the current list; count is at 14). 
All AST-related tracker items are under the AST group so filtering to
just AST stuff is easy.

I am willing to guess a couple of those tests will start passing as
soon as http://www.python.org/sf/1246473 is dealt with (this is just
based on looking at some of the failure output seeming to be off by
one).  As of right now the lnotab only has statement granularity
when it really needs expression granularity.  That requires tweaking
all instances where an expression node is created to also take in the
line number of where the expression exists.  This fix is one of the
main reasons I have not touched the AST branch; it is not difficult,
but it is not exactly fun or small either.  =)

> How does this sound to the non-AST-branch developers who have to
> suffer the inevitable post-merge instability? I think it's now or
> never -- waiting longer isn't going to make this thing easier (not
> with several more language changes approved: with-statement, extended
> import, what else...)
>
> What does the AST team think?
>

Well, I have homework this weekend, a midterm two weeks from tomorrow
(so the preceding weekend will be studying), and October 23 is my
birthday so I will be busy that entire weekend visiting family.  In
other words Python time is at a premium this month.  But I will try to
squeeze in what time I can.

But I think the three week time frame is reasonable to light the fire
under our asses to get this thing done (especially if it inspires
people to jump in and help out; as always, people interested in
joining in, check out the branch and read Python/compile.txt ).

-Brett

From kbk at shore.net  Fri Oct  7 07:01:15 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Fri, 07 Oct 2005 01:01:15 -0400
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <ca471dc20510061934jfb4f01frecaf562291a57038@mail.gmail.com>
	(Guido van Rossum's message of "Thu, 6 Oct 2005 19:34:11 -0700")
References: <87irwadwma.fsf@hydra.bayview.thirdcreek.com>
	<000001c5cad5$b6b83ee0$a105a044@oemcomputer>
	<ca471dc20510061934jfb4f01frecaf562291a57038@mail.gmail.com>
Message-ID: <873bndex9g.fsf@hydra.bayview.thirdcreek.com>

Guido van Rossum <guido at python.org> writes:

> I happen to agree with Kurt that we should first merge the head into
> the branch; then the AST team can work on making sure the entire
> test suite passes; then they can merge back into the head.

I can be available to do this again.  It would involve freezing the
AST branch for a day.

Once the AST branch is stable, we would need to freeze everything,
merge MAIN to AST one more time to pick up the last few changes in
MAIN, and then merge the AST head back to MAIN.

By doing these merges from MAIN to AST we would have effectively moved
the AST branch point along MAIN to HEAD.  So the final join is HEAD to
AST, conducted from MAIN.  I'll run a local experiment to verify this
concept is workable.

-- 
KBK

From jcarlson at uci.edu  Fri Oct  7 08:25:37 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 06 Oct 2005 23:25:37 -0700
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <200510070145.17284.ms@cerenity.org>
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>
Message-ID: <20051006221436.2892.JCARLSON@uci.edu>


Michael Sparks <ms at cerenity.org> wrote:
> 
> On Thursday 06 October 2005 23:15, Josiah Carlson wrote:
> [... 6 specific use cases ...]
> > If Kamaelia is able to handle all of the above mechanisms in both a
> > blocking and non-blocking fashion, then I would guess it has the basic
> > requirements for most concurrent applications.
> 
> It can. I can easily knock up examples for each if required :-)

That's cool, I trust you.  One thing I notice is absent from the
Kamaelia page is benchmarks.

On the one hand, benchmarks are technically useless, as one can tend to
benchmark those things that a system does well, and ignore those things
that it does poorly (take, for example how PyLinda's speed test only
ever inserts and removes one tuple at a time...try inserting 100k and
use wildcards to extract those 100k, and you'll note how poorly it
performs, or database benchmarks, etc.).  However, if one's benchmarks
provide examples from real use, then it shows that at least someone has
gotten some X performance from the system.

I'm personally interested in latency and throughput for varying sizes of
data being passed through the system.


> That said, a more interesting example implemented this week (as part of
> a rapid prototyping project to look at collaborative community radio)
> implements a networked audio mixer matrix. That allows multiple sources of
> audio to be mixed and sent on to multiple destinations, which may be duplicate
> mixes of each other, but may also select different mixes. The same system also
> includes point to point communications for network control of the mix.

Very neat.  How much data?  What kind of throughput?  What kinds of
latencies?

> That application covers ( I /think/ ) 1, 2, 3, 4,  and 6 on your list of
> things as I understand what you mean. 5 is fairly trivial though.

Cool.

> Regarding blocking & non-blocking, links can be marked as synchronous, which
> forces blocking style behaviour. Since generally we're using generators, we
> can't block for real which is why we throw an exception there. However,
> threaded components can & do block. The reason for this was due to the
> architecture being inspired by noting the similarities between asynchronous
> hardware systems/languages and network systems.

On the client side, I was lazy and used synchronous/blocking sockets to
block on read/write (every client thread gets its own connection,
meaning that tuple puts are never sitting in a queue).  I've also got
server-side timeouts for when you don't want to wait too long for data.
    rslt = tplspace.get(PATTERN, timeout=None)


> > into my tuple space implementation before it is released. 
> 
> I'd be interested in hearing more about that BTW. One thing we've found is
> that, much as organic systems have a neural system for communications between
> things (hence Axon :), you also need the equivalent of a hormonal system.
> In the unix shell world, IMO the environment acts as that for pipelines, and
> similarly that's why we have an assistant system. (Which has key/value lookup
> facilities)

I have two recent posts about the performance and features of a (hacked
together) tuple space system I worked on (for two afternoons) in my blog.
"Feel Lucky" for "Josiah Carlson" in google and you will find it.


> It's a less obvious requirement, but is a useful one nonetheless, so I don't
> really see a message passing style as excluding a linda approach - since
> they're orthogonal approaches.

Indeed.  For me, the idea of being able to toss a tuple into memory
somewhere and being able to find it later maps into my mind as:
('name', arg1, ...) -> name(arg1, ...), which is, quite literally, an
RPC semantic (which seems a bit more natural to me than subscribing to
the 'name' queue).  With the ability to send to either single or
multiple listeners, you get message passing, broadcast messages, and a
standard job/result queueing semantic. The only thing that it is missing
is a prioritization mechanism (fifo, numeric priority, etc.), which
would get us a job scheduling kernel. Not bad for a "message
passing"/"tuple space"/"IPC" library.  (all of the above described have
direct algorithms for implementation).
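
The ('name', arg1, ...) -> name(arg1, ...) mapping and the get-with-timeout
call can be sketched as below. This is a minimal in-process illustration
under stated assumptions, not the implementation discussed in this thread:
the TupleSpace class, the ANY wildcard and the serve_one helper are all
hypothetical names, and a real system would sit behind sockets.

```python
import threading
import time

ANY = object()  # hypothetical wildcard: matches any value in that position


class TupleSpace:
    """A tiny thread-safe tuple space with blocking, pattern-based get()."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def put(self, tup):
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()  # wake any blocked get() calls

    @staticmethod
    def _matches(pattern, tup):
        return len(pattern) == len(tup) and all(
            p is ANY or p == t for p, t in zip(pattern, tup)
        )

    def get(self, pattern, timeout=None):
        """Remove and return the first matching tuple.

        Blocks until a match arrives; with a timeout (in seconds),
        returns None if nothing matched in time.
        """
        deadline = None if timeout is None else time.monotonic() + timeout
        with self._cond:
            while True:
                for index, tup in enumerate(self._tuples):
                    if self._matches(pattern, tup):
                        return self._tuples.pop(index)
                remaining = None
                if deadline is not None:
                    remaining = deadline - time.monotonic()
                    if remaining <= 0:
                        return None
                self._cond.wait(remaining)


def serve_one(space, name, func, nargs, timeout=None):
    """Take one (name, arg1, ..., argN) tuple and call func(arg1, ..., argN)."""
    request = space.get((name,) + (ANY,) * nargs, timeout=timeout)
    if request is None:
        return None
    return func(*request[1:])


space = TupleSpace()
space.put(("add", 2, 3))
print(serve_one(space, "add", lambda a, b: a + b, 2, timeout=0.1))
```

Because get() removes the tuple it returns, each request reaches a single
listener here; a broadcast variant would need a non-destructive read.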


 - Josiah


From mwh at python.net  Fri Oct  7 08:51:24 2005
From: mwh at python.net (Michael Hudson)
Date: Fri, 07 Oct 2005 07:51:24 +0100
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <ca471dc20510061934jfb4f01frecaf562291a57038@mail.gmail.com>
	(Guido van Rossum's message of "Thu, 6 Oct 2005 19:34:11 -0700")
References: <87irwadwma.fsf@hydra.bayview.thirdcreek.com>
	<000001c5cad5$b6b83ee0$a105a044@oemcomputer>
	<ca471dc20510061934jfb4f01frecaf562291a57038@mail.gmail.com>
Message-ID: <2mwtkpddlf.fsf@starship.python.net>

Guido van Rossum <guido at python.org> writes:

> How does this sound to the non-AST-branch developers who have to
> suffer the inevitable post-merge instability? I think it's now or
> never -- waiting longer isn't going to make this thing easier (not
> with several more language changes approved: with-statement, extended
> import, what else...)

It sounds OK to me.

Cheers,
mwh

-- 
  To summarise the summary of the summary:- people are a problem.
                   -- The Hitch-Hikers Guide to the Galaxy, Episode 12

From martin at v.loewis.de  Fri Oct  7 09:59:50 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 07 Oct 2005 09:59:50 +0200
Subject: [Python-Dev] PyObject_Init documentation
Message-ID: <43462AF6.3080103@v.loewis.de>

says

If type  indicates that the object participates in the cyclic garbage 
detector, it is added to the detector's set of observed objects.

Is this really correct? I thought you need to invoke PyObject_GC_TRACK
explicitly?

Regards,
Martin

From ncoghlan at iinet.net.au  Fri Oct  7 11:57:17 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Fri, 07 Oct 2005 19:57:17 +1000
Subject: [Python-Dev] Proposed changes to PEP 343
Message-ID: <4346467D.5010005@iinet.net.au>

Based on Jason's comments regarding decimal.Context, and to explicitly cover 
the terminology agreed on during the documentation discussion back in July, 
I'm proposing a number of changes to PEP 343. I'll be updating the checked in 
PEP assuming there aren't any objections in the next week or so (and assuming 
I get CVS access sorted out ;).

The idea of dropping __enter__/__exit__ and defining the with statement solely 
in terms of coroutines is *not* included in the suggested changes, but I added 
a new item under "Resolved Open Issues" to cover some of the reasons why.

Cheers,
Nick.

1. Amend the statement specification such that:

       with EXPR as VAR:
           BLOCK

is translated as:

       abc = (EXPR).__with__()
       exc = (None, None, None)
       VAR = abc.__enter__()
       try:
           try:
               BLOCK
           except:
               exc = sys.exc_info()
               raise
       finally:
           abc.__exit__(*exc)

2. Add the following to the subsequent explanation:

     The call to the __with__ method serves a similar purpose to the __iter__
   method for iterables and iterators. An object such as threading.Lock may
   provide its own __enter__ and __exit__ methods, and simply return 'self'
   from its __with__ method. A more complex object such as decimal.Context may
   return a distinct context manager which takes care of setting and restoring
   the appropriate decimal context in the thread.

3. Update ContextWrapper in the "Generator Decorator" section to include:

      def __with__(self):
          return self

4. Add a paragraph to the end of the "Generator Decorator" section:

     By applying the @contextmanager decorator to a context's __with__ method,
   it is as easy to write a generator-based context manager for the context as
   it is to write a generator-based iterator for an iterable (see the
   decimal.Context example below).


5. Add three items under "Resolved Open Issues":

     2.  After this PEP was originally approved, a subsequent discussion on
       python-dev [4] settled on the term "context manager" for objects which
       provide __enter__ and __exit__ methods, and "context management
       protocol" for the protocol itself. With the addition of the __with__
       method to the protocol, a natural extension is to call objects which
       provide only a __with__ method "contexts" (or "manageable contexts" in
       situations where the general term "context" would be ambiguous).
         The distinction between a context and a context manager is very
       similar to the distinction between an iterable and an iterator.

     3.  The originally approved version of this PEP did not include a __with__
       method - the method was only added to the PEP after Jason Orendorff
       pointed out the difficulty of writing appropriate __enter__ and __exit__
       methods for decimal.Context [5].
          This approach allows a class to use the @contextmanager decorator
       to define a native context manager using generator syntax. It also
       allows a class to use an existing independent context manager as its
       native context manager by applying the independent context manager to
       'self' in its __with__ method. It even allows a class written in C to
       use a coroutine based context manager written in Python.
          The __with__ method parallels the __iter__ method which forms part of
       the iterator protocol.

     4.  The suggestion was made by Jason Orendorff that the __enter__ and
       __exit__ methods could be removed from the context management protocol,
       and the protocol instead defined directly in terms of the coroutine
       interface described in PEP 342 (or a cleaner version of that interface
       with start() and finish() convenience methods) [6].
         Guido rejected this idea [7]. The following are some of the benefits of
       keeping the __enter__ and __exit__ methods:
           - it makes it easy to implement a simple context manager in C
             without having to rely on a separate coroutine builder
           - it makes it easy to provide a low-overhead implementation for
             context managers which don't need to maintain any special state
             between the __enter__ and __exit__ methods (having to use a
             coroutine for these would impose unnecessary overhead without any
             compensating benefit)
           - it makes it possible to understand how the with statement works
             without having to first understand the concept of a coroutine

6. Add new references:

   [4] http://mail.python.org/pipermail/python-dev/2005-July/054658.html
   [5] http://mail.python.org/pipermail/python-dev/2005-October/056947.html
   [6] http://mail.python.org/pipermail/python-dev/2005-October/056969.html
   [7] http://mail.python.org/pipermail/python-dev/2005-October/057018.html

7. Update Example 4 to include a __with__ method:

      def __with__(self):
          return self

8. Replace Example 9 with the following example:

     9. Here's a proposed native context manager for decimal.Context:

         # This would be a new decimal.Context method
         @contextmanager
         def __with__(self):
             # We set the thread context to a copy of this context
             # to ensure that changes within the block are kept
             # local to the block. This also gives us thread safety
             # and supports nested usage of a given context.
             newctx = self.copy()
             oldctx = decimal.getcontext()
             decimal.setcontext(newctx)
             try:
                 yield newctx
             finally:
                 decimal.setcontext(oldctx)

        Sample usage:

         def sin(x):
             with decimal.getcontext() as ctx:
                 ctx.prec += 2
                 # Rest of sin calculation algorithm
                 # uses a precision 2 greater than normal
             return +s # Convert result to normal precision

         def sin(x):
             with decimal.ExtendedContext:
                 # Rest of sin calculation algorithm
                 # uses the Extended Context from the
                 # General Decimal Arithmetic Specification
             return +s # Convert result to normal context
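
The translation in item 1 and the decimal example in item 8 can be tried
out together with the sketch below. Since the __with__ method is only a
proposal here, the statement is emulated by a helper function: run_with
and ManagedDecimalContext are hypothetical names, and the wrapper class
stands in for a decimal.Context that has gained the proposed method.

```python
import decimal
import sys
from contextlib import contextmanager


def run_with(context, block):
    """Emulate the proposed translation of 'with EXPR as VAR: BLOCK'.

    `context` plays the role of EXPR; `block` is a callable that
    receives VAR and stands in for BLOCK.
    """
    mgr = context.__with__()
    exc = (None, None, None)
    var = mgr.__enter__()
    try:
        try:
            return block(var)
        except BaseException:
            exc = sys.exc_info()
            raise
    finally:
        mgr.__exit__(*exc)


class ManagedDecimalContext:
    """Hypothetical stand-in for decimal.Context with a __with__ method."""

    def __init__(self, ctx):
        self.ctx = ctx

    @contextmanager
    def __with__(self):
        # Work on a copy so changes made in the block stay local to it.
        newctx = self.ctx.copy()
        oldctx = decimal.getcontext()
        decimal.setcontext(newctx)
        try:
            yield newctx
        finally:
            decimal.setcontext(oldctx)


def one_third(ctx):
    ctx.prec += 2  # two extra digits, only inside the block
    return decimal.Decimal(1) / decimal.Decimal(3)


prec_before = decimal.getcontext().prec
result = run_with(ManagedDecimalContext(decimal.getcontext()), one_third)
prec_after = decimal.getcontext().prec
print(result, +result, prec_before == prec_after)
```

The unary plus at the end rounds the result back to the precision of the
restored context, as in the sin() samples above.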

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From fredrik at pythonware.com  Fri Oct  7 12:26:34 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 7 Oct 2005 12:26:34 +0200
Subject: [Python-Dev] Proposed changes to PEP 343
References: <4346467D.5010005@iinet.net.au>
Message-ID: <di5igq$41k$1@sea.gmane.org>

Nick Coghlan wrote:

>     9. Here's a proposed native context manager for decimal.Context:
>
>         # This would be a new decimal.Context method
>         @contextmanager
>         def __with__(self):

wouldn't it be better if the ContextWrapper class (or some variation thereof) could
be used as a base class for the decimal.Context class?  using decorators on methods
to provide "is a" behaviour for the class doesn't really feel pythonic...

</F> 




From ncoghlan at iinet.net.au  Fri Oct  7 13:50:42 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Fri, 07 Oct 2005 21:50:42 +1000
Subject: [Python-Dev] PEP 342 suggestion: start(),
	__call__() and unwind_call() methods
Message-ID: <43466112.5050608@iinet.net.au>

I'm lifting Jason's PEP 342 suggestions out of the recent PEP 343 thread, in 
case some of the folks interested in coroutines stopped following that discussion.

Jason suggested two convenience methods, .start() and .finish().

start() simply asserted that the generator hadn't been started yet, and I find 
the parallel with "Thread.start()" appealing:

         def start(self):
             """ Convenience method -- exactly like next(), but
             assert that this coroutine hasn't already been started.
             """
             if self.__started:
                 raise RuntimeError("Coroutine already started")
             return self.next()


I've embellished Jason's suggested finish() method quite a bit though.
   1. Use send() rather than next()
   2. Call it __call__() rather than finish()
   3. Add an unwind_call() variant that gives similar semantics for throw()
   4. Support getting a return value from the coroutine
      using the syntax "raise StopIteration(val)"
   5. Add an exception "ContinueIteration" that is used to indicate the
      generator hasn't finished yet, rather than expecting the generator to
      finish and raising RuntimeError if it doesn't

It ends up looking like this:

         def __call__(self, value=None):
             """ Call a generator as a coroutine

             Returns the first argument supplied to StopIteration or
             None if no argument was supplied.
             Raises ContinueIteration with the value yielded as the
             argument if the generator yields a value
             """
             if not self.__started:
                 raise RuntimeError("Coroutine not started")
             try:
                 # Exceptions are injected via unwind_call(); this
                 # method only ever sends a value
                 yield_val = self.send(value)
             except (StopIteration), ex:
                 if ex.args:
                     return ex.args[0]
             else:
                 raise ContinueIteration(yield_val)

         def unwind_call(self, *exc):
             """Raise an exception in a generator used as a coroutine.

             Returns the first argument supplied to StopIteration or
             None if no argument was supplied.
             Raises ContinueIteration with the value yielded as the
             argument if the generator yields a value
             """
             try:
                 yield_val = self.throw(*exc)
             except (StopIteration), ex:
                 if ex.args:
                     return ex.args[0]
             else:
                 raise ContinueIteration(yield_val)
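
The methods above are written as additions to the generator type itself.
To experiment with the same semantics without changing the interpreter,
they can be approximated by a wrapper class, as in the sketch below. The
Coroutine wrapper is a hypothetical name, ContinueIteration is the
exception proposed in this message rather than a builtin, and the sketch
assumes a Python version in which a generator's "return val" supplies the
argument to StopIteration.

```python
class ContinueIteration(Exception):
    """Raised when the wrapped generator yields instead of finishing."""


class Coroutine:
    """Wrap a generator with start(), __call__() and unwind_call()."""

    def __init__(self, gen):
        self._gen = gen
        self._started = False

    def start(self):
        """Like next(), but only allowed once, before anything else."""
        if self._started:
            raise RuntimeError("Coroutine already started")
        self._started = True
        return next(self._gen)

    def _resume(self, resume):
        if not self._started:
            raise RuntimeError("Coroutine not started")
        try:
            yield_val = resume()
        except StopIteration as ex:
            return ex.value  # the generator's return value, or None
        # The generator yielded, so it has not finished yet.
        raise ContinueIteration(yield_val)

    def __call__(self, value=None):
        """Send a value in; return the final result or raise ContinueIteration."""
        return self._resume(lambda: self._gen.send(value))

    def unwind_call(self, exc):
        """Raise `exc` inside the generator, with the same result handling."""
        return self._resume(lambda: self._gen.throw(exc))


def averager():
    """Running average; finishes when sent None."""
    total, count = 0, 0
    value = yield
    while value is not None:
        total += value
        count += 1
        value = yield total / count
    return total / count


co = Coroutine(averager())
co.start()
try:
    co(10)
except ContinueIteration as ex:
    print("still running, yielded", ex.args[0])
```

A caller distinguishes "finished with a result" from "yielded and wants to
continue" purely by whether ContinueIteration was raised.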

Now here's the trampoline scheduler from PEP 342 using this idea:

         import collections
         import sys

         class Trampoline:
             """Manage communications between coroutines"""

             running = False

             def __init__(self):
                 self.queue = collections.deque()

             def add(self, coroutine):
                 """Request that a coroutine be executed"""
                 self.schedule(coroutine)

             def run(self):
                 result = None
                 self.running = True
                 try:
                     while self.running and self.queue:
                         func = self.queue.popleft()
                         result = func()
                     return result
                 finally:
                     self.running = False

             def stop(self):
                 self.running = False

             def schedule(self, coroutine, stack=(), call_result=None, *exc):
                 # Define the new pseudothread
                 def pseudothread():
                     try:
                         if exc:
                             result = coroutine.unwind_call(call_result, *exc)
                         else:
                             result = coroutine(call_result)
                     except (ContinueIteration), ex:
                         # Called another coroutine
                         callee = ex.args[0]
                         self.schedule(callee, (coroutine,stack))
                     except:
                         if stack:
                             # send the error back to the caller
                             caller = stack[0]
                             prev_stack = stack[1]
                             self.schedule(
                                 caller, prev_stack, *sys.exc_info()
                             )
                         else:
                             # Nothing left in this pseudothread to
                             # handle it, let it propagate to the
                             # run loop
                             raise
                     else:
                         if stack:
                             # Finished, so pop the stack and send the
                             # result to the caller
                             caller = stack[0]
                             prev_stack = stack[1]
                             self.schedule(caller, prev_stack, result)

                 # Add the new pseudothread to the execution queue
                 self.queue.append(pseudothread)

Notice how a non-coroutine callable can be yielded, and it will still work 
happily with the scheduler, because the desire to continue execution is 
indicated by the ContinueIteration exception, rather than by the type of the 
returned value.

With this relatively dumb scheduler, that doesn't provide any particular 
benefit - the specific pseudothread doesn't block, but eventually the 
scheduler itself blocks when it executes the non-coroutine call.

However, it wouldn't take too much to make the scheduler smarter and give it a 
physical thread pool that it used whenever it encountered a non-coroutine call.

And that's the real trick here: with these additions to PEP 342, the decision 
of how to deal with blocking calls could be made in the scheduler, without 
affecting the individual coroutines. All the coroutine writers would need to 
remember to do is to write any potentially blocking operations as yielded 
lambda expressions or functional.partial invocations, rather than as direct 
function calls.
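
To make that concrete, here is a rough sketch of such a scheduler. It is 
written for current Python 3 rather than against the proposal above, so the 
names run_all, add and worker are invented for illustration, 
functools.partial stands in for functional.partial, and the standard 
concurrent.futures pool plays the part of the physical thread pool. It is 
one possible shape under those assumptions, not a reference implementation.

```python
import functools
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_all(coroutines, max_workers=4):
    """Run generators to completion; yielded callables execute in a pool."""
    results = [None] * len(coroutines)
    waiting = {}  # future -> (index, coroutine) blocked on that future

    def step(index, coro, value, pool):
        # Resume the coroutine and hand whatever it yields to the pool.
        try:
            call = coro.send(value)
        except StopIteration as stop:
            results[index] = stop.value
        else:
            waiting[pool.submit(call)] = (index, coro)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for index, coro in enumerate(coroutines):
            step(index, coro, None, pool)
        while waiting:
            done, _ = wait(list(waiting), return_when=FIRST_COMPLETED)
            for future in done:
                index, coro = waiting.pop(future)
                step(index, coro, future.result(), pool)
    return results


def add(a, b):
    # Hypothetical stand-in for a potentially blocking operation.
    return a + b


def worker(start):
    # The coroutine never calls add() itself; it only yields the request.
    total = yield functools.partial(add, start, 1)
    total = yield functools.partial(add, total, 10)
    return total


totals = run_all([worker(0), worker(100)])
```

The coroutines themselves never mention threads, so swapping the pool for 
something else would only touch run_all.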

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Fri Oct  7 14:38:19 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 07 Oct 2005 22:38:19 +1000
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <di5igq$41k$1@sea.gmane.org>
References: <4346467D.5010005@iinet.net.au> <di5igq$41k$1@sea.gmane.org>
Message-ID: <43466C3B.50704@gmail.com>

Fredrik Lundh wrote:
> Nick Coghlan wrote:
> 
> 
>>    9. Here's a proposed native context manager for decimal.Context:
>>
>>        # This would be a new decimal.Context method
>>        @contextmanager
>>        def __with__(self):
> 
> 
> wouldn't it be better if the ContextWrapper class (or some variation thereof) could
> be used as a base class for the decimal.Context class?  using decorators on methods
> to provide "is a" behaviour for the class doesn't really feel pythonic...

That's not what the decorator is for - it's there to turn the generator used 
to implement the __with__ method into a context manager, rather than saying 
anything about decimal.Context as a whole.

However, requiring a decorator to get a slot to work right looks pretty ugly 
to me, too.

What if we simply special-cased the __with__ slot in type(), such that if it 
is populated with a generator object, that object is automatically wrapped 
using the @contextmanager decorator? (Jason actually suggested this idea 
previously)

I initially didn't like the idea because of EIBTI, but I've realised that "def 
__with__(self):" is pretty darn explicit in its own right. I've also realised 
that defining __with__ using a generator, but forgetting to add the 
@contextmanager to the front would be a lovely source of bugs, particularly if 
generators are given a default __exit__() method that simply invokes self.close().

On the other hand, if __with__ is special-cased, then the slot definition 
wouldn't look ugly, and we'd still be free to define a generator's normal with 
statement semantics as:

   def __exit__(self, *exc):
       self.close()

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Fri Oct  7 14:43:07 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 07 Oct 2005 22:43:07 +1000
Subject: [Python-Dev] PEP 342 suggestion: start(),
 __call__() and unwind_call() methods
In-Reply-To: <43466112.5050608@iinet.net.au>
References: <43466112.5050608@iinet.net.au>
Message-ID: <43466D5B.7050103@gmail.com>

Nick Coghlan wrote:
> It ends up looking like this:
> 
>          def __call__(self, value=None):
>              """ Call a generator as a coroutine
> 
>              Returns the first argument supplied to StopIteration or
>              None if no argument was supplied.
>              Raises ContinueIteration with the value yielded as the
>              argument if the generator yields a value
>              """
>              if not self.__started:
>                  raise RuntimeError("Coroutine not started")
>              try:
>                  if exc:
>                      yield_val = self.throw(value, *exc)
>                  else:
>                      yield_val = self.send(value)
>              except (StopIteration), ex:
>                  if ex.args:
>                      return args[0]
>              else:
>                  raise ContinueIteration(yield_val)

Oops, I didn't finish fixing this after I added unwind_call(). Try this 
version instead:

           def __call__(self, value=None):
               """ Call a generator as a coroutine

               Returns the first argument supplied to StopIteration or
               None if no argument was supplied.
               Raises ContinueIteration with the value yielded as the
               argument if the generator yields a value
               """
               try:
                   yield_val = self.send(value)
               except (StopIteration), ex:
                   if ex.args:
                       return ex.args[0]
               else:
                   raise ContinueIteration(yield_val)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From mwh at python.net  Fri Oct  7 15:02:12 2005
From: mwh at python.net (Michael Hudson)
Date: Fri, 07 Oct 2005 14:02:12 +0100
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <43466C3B.50704@gmail.com> (Nick Coghlan's message of "Fri, 07
	Oct 2005 22:38:19 +1000")
References: <4346467D.5010005@iinet.net.au> <di5igq$41k$1@sea.gmane.org>
	<43466C3B.50704@gmail.com>
Message-ID: <2mfyrdcwff.fsf@starship.python.net>

Nick Coghlan <ncoghlan at gmail.com> writes:

> What if we simply special-cased the __with__ slot in type(), such that if it 
> is populated with a generator object, that object is automatically wrapped 
> using the @contextmanager decorator? (Jason actually suggested this idea 
> previously)

<nit>
You don't want to check if it's a generator, you want to check if it's
a function whose func_code has the relevant bit set.
</nit>

Seems a bit magical to me, but haven't thought about it hard.
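
For reference, the check being described might look roughly like this. The 
sketch uses today's spelling, where func_code is reached as __code__, and 
the helper name is made up for the example; inspect.CO_GENERATOR is the 
flag in question.

```python
import inspect


def is_generator_function(func):
    """Return True if func's code object has the generator flag set."""
    code = getattr(func, "__code__", None)
    return code is not None and bool(code.co_flags & inspect.CO_GENERATOR)


def plain():
    return 1


def gen():
    yield 1


# The slot would hold the function; only calling it produces a generator.
flag_on_plain = is_generator_function(plain)
flag_on_gen = is_generator_function(gen)
```

The standard inspect.isgeneratorfunction() helper gives the same answer for 
ordinary functions like these two.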

Cheers,
mwh

-- 
  I think my standards have lowered enough that now I think ``good
  design'' is when the page doesn't irritate the living fuck out of 
  me.                        -- http://www.jwz.org/gruntle/design.html

From ajm at flonidan.dk  Fri Oct  7 15:23:47 2005
From: ajm at flonidan.dk (Anders J. Munch)
Date: Fri, 7 Oct 2005 15:23:47 +0200
Subject: [Python-Dev] Proposed changes to PEP 343
Message-ID: <9B1795C95533CA46A83BA1EAD4B01030031F0B@flonidanmail.flonidan.net>

Nick Coghlan did a +1 job to write:
> 1. Amend the statement specification such that:
> 
>        with EXPR as VAR:
>            BLOCK
> 
> is translated as:
> 
>        abc = (EXPR).__with__()
>        exc = (None, None, None)
>        VAR = abc.__enter__()
>        try:
>            try:
>                BLOCK
>            except:
>                exc = sys.exc_info()
>                raise
>        finally:
>            abc.__exit__(*exc)

Note that __with__ and __enter__ could be combined into one with no
loss of functionality:

        abc,VAR = (EXPR).__with__()
        exc = (None, None, None)
        try:
            try:
                BLOCK
            except:
                exc = sys.exc_info()
                raise
        finally:
            abc.__exit__(*exc)
 
- Anders

From eric.nieuwland at xs4all.nl  Fri Oct  7 15:38:20 2005
From: eric.nieuwland at xs4all.nl (Eric Nieuwland)
Date: Fri, 7 Oct 2005 15:38:20 +0200
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <4346467D.5010005@iinet.net.au>
References: <4346467D.5010005@iinet.net.au>
Message-ID: <3479d2bd94a07f4dd6e06a6874ca74f0@xs4all.nl>

Nick Coghlan wrote:

> 1. Amend the statement specification such that:
>
>        with EXPR as VAR:
>            BLOCK
>
> is translated as:
>
>        abc = (EXPR).__with__()
>        exc = (None, None, None)
>        VAR = abc.__enter__()
>        try:
>            try:
>                BLOCK
>            except:
>                exc = sys.exc_info()
>                raise
>        finally:
>            abc.__exit__(*exc)

Is this correct?
What happens to

	with 40*13+2 as X:
		print X

?

--eric


From ncoghlan at gmail.com  Fri Oct  7 16:09:48 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 08 Oct 2005 00:09:48 +1000
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <3479d2bd94a07f4dd6e06a6874ca74f0@xs4all.nl>
References: <4346467D.5010005@iinet.net.au>
	<3479d2bd94a07f4dd6e06a6874ca74f0@xs4all.nl>
Message-ID: <434681AC.2060503@gmail.com>

Eric Nieuwland wrote:
> What happens to
> 
>     with 40*13+2 as X:
>         print X

It would fail with a TypeError because the relevant slot in the type object 
was NULL - the TypeError checks aren't shown for simplicity's sake.

This behaviour isn't really any different from the existing PEP 343 - the only 
difference is that the statement looks for a __with__ slot on the original 
EXPR, rather than looking directly for an __enter__ slot.
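
Spelled out as ordinary code with the check included, the translation would 
behave roughly like the sketch below. run_with and Managed are names 
invented for the illustration, the block is passed in as a callable, and 
__with__ is the slot proposed in this thread rather than anything Python 
provides.

```python
import sys


def run_with(expr, block):
    """Emulate the proposed translation of 'with EXPR as VAR: BLOCK'."""
    with_method = getattr(type(expr), "__with__", None)
    if with_method is None:
        # The check omitted from the specification for simplicity.
        raise TypeError(
            "%s object has no __with__ slot" % type(expr).__name__
        )
    abc = with_method(expr)
    exc = (None, None, None)
    var = abc.__enter__()
    try:
        try:
            return block(var)
        except:
            exc = sys.exc_info()
            raise
    finally:
        abc.__exit__(*exc)


class Managed:
    """Toy object that acts as its own context manager."""

    def __init__(self):
        self.log = []

    def __with__(self):
        return self

    def __enter__(self):
        self.log.append("enter")
        return "resource"

    def __exit__(self, *exc):
        self.log.append("exit")
```

Calling run_with(40*13+2, ...) fails at the first step, before any 
__enter__ lookup happens.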

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Fri Oct  7 16:12:39 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 08 Oct 2005 00:12:39 +1000
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <9B1795C95533CA46A83BA1EAD4B01030031F0B@flonidanmail.flonidan.net>
References: <9B1795C95533CA46A83BA1EAD4B01030031F0B@flonidanmail.flonidan.net>
Message-ID: <43468257.9030008@gmail.com>

Anders J. Munch wrote:
> Note that __with__ and __enter__ could be combined into one with no
> loss of functionality:
> 
>         abc,VAR = (EXPR).__with__()

They can't be combined, because they're invoked on different objects. It would 
be like trying to combine __iter__() and next() into the same method for 
iterators. . .
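
A small sketch of the iterator analogy, in current Python where next() is 
spelled as a builtin: the container hands out fresh iterator objects, and 
the stepping happens on those, not on the container.

```python
numbers = [1, 2, 3]

first_pass = iter(numbers)    # calls numbers.__iter__()
second_pass = iter(numbers)   # a second, independent iterator

a = next(first_pass)          # advances first_pass only
b = next(first_pass)
c = next(second_pass)         # second_pass still starts at the beginning
```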

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Fri Oct  7 16:16:20 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 08 Oct 2005 00:16:20 +1000
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <2mfyrdcwff.fsf@starship.python.net>
References: <4346467D.5010005@iinet.net.au>
	<di5igq$41k$1@sea.gmane.org>	<43466C3B.50704@gmail.com>
	<2mfyrdcwff.fsf@starship.python.net>
Message-ID: <43468334.2000307@gmail.com>

Michael Hudson wrote:
> <nit>
> You don't want to check if it's a generator, you want to check if it's
> a function whose func_code has the relevant bit set.
> </nit>

Fair point :)

> Seems a bit magical to me, but haven't thought about it hard.

Same here - I'm just starting to think that the alternative is worse, because 
it leaves open the nonsensical possibility of writing a __with__ method as a 
generator *without* applying the contextmanager decorator, and that would just 
be bizarre - if you want to get an iterable, why aren't you writing an 
__iter__ method instead?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at iinet.net.au  Fri Oct  7 16:20:07 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Sat, 08 Oct 2005 00:20:07 +1000
Subject: [Python-Dev] Sourceforge CVS access
Message-ID: <43468417.4000701@iinet.net.au>

Could one of the Sourceforge powers-that-be grant me check in access so I can 
update PEP 343 directly?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From eric.nieuwland at xs4all.nl  Fri Oct  7 16:21:37 2005
From: eric.nieuwland at xs4all.nl (Eric Nieuwland)
Date: Fri, 7 Oct 2005 16:21:37 +0200
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <434681AC.2060503@gmail.com>
References: <4346467D.5010005@iinet.net.au>
	<3479d2bd94a07f4dd6e06a6874ca74f0@xs4all.nl>
	<434681AC.2060503@gmail.com>
Message-ID: <be48f2ed84179d498e1c1b5199985b07@xs4all.nl>

Nick Coghlan wrote:
> Eric Nieuwland wrote:
>> What happens to
>>
>>     with 40*13+2 as X:
>>         print X
>
> It would fail with a TypeError because the relevant slot in the type 
> object
> was NULL - the TypeError checks aren't shown for simplicity's sake.
>
> This behaviour isn't really any different from the existing PEP 343 - 
> the only
> difference is that the statement looks for a __with__ slot on the 
> original
> EXPR, rather than looking directly for an __enter__ slot.

Hmmm I hadn't noticed that.
In my memory a partial implementation of the protocol was possible.
Thus, __enter__/__exit__ would only be called if they exist.

Oh well, I'll just add some empty methods.

--eric


From guido at python.org  Fri Oct  7 18:14:12 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 7 Oct 2005 09:14:12 -0700
Subject: [Python-Dev] Sourceforge CVS access
In-Reply-To: <43468417.4000701@iinet.net.au>
References: <43468417.4000701@iinet.net.au>
Message-ID: <ca471dc20510070914h404045c1r8ab1d88ccb2c64de@mail.gmail.com>

I will, if you tell me your sourceforge username.

On 10/7/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> Could one of the Sourceforge powers-that-be grant me check in access so I can
> update PEP 343 directly?


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fredrik at pythonware.com  Fri Oct  7 18:12:31 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 7 Oct 2005 18:12:31 +0200
Subject: [Python-Dev] Proposed changes to PEP 343
References: <4346467D.5010005@iinet.net.au> <di5igq$41k$1@sea.gmane.org>
	<43466C3B.50704@gmail.com>
Message-ID: <di66pi$f3t$1@sea.gmane.org>

Nick Coghlan wrote:

> That's not what the decorator is for - it's there to turn the generator used
> to implement the __with__ method into a context manager, rather than saying
> anything about decimal.Context as a whole.

possibly, but using a decorated __with__ method doesn't make much
sense if the purpose isn't to turn the class into something that can be
used with the "with" statement.

> However, requiring a decorator to get a slot to work right looks pretty ugly
> to me, too.

the whole concept might be perfectly fine on the "this construct
corresponds to this code" level, but if you immediately end up with things
that are not what they seem, and names that don't mean what they say,
either the design or the description of it needs work.

 ("yes, I know you can use this class to manage the context, but it's not
really a context manager, because it's that method that's a manager, not
the class itself.  yes, all the information that belongs to the context is
managed by the class, but that doesn't make... oh, shut up and read the
PEP")

</F>




From BruceEckel-Python3234 at mailblocks.com  Fri Oct  7 18:47:51 2005
From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel)
Date: Fri, 7 Oct 2005 10:47:51 -0600
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <20051006221436.2892.JCARLSON@uci.edu>
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>
	<20051006221436.2892.JCARLSON@uci.edu>
Message-ID: <415220344.20051007104751@MailBlocks.com>

Early in this thread there was a comment to the effect that "if you
don't know how to use threads, don't use them," which I pointedly
avoided responding to because it seemed to me to simply be
inflammatory. But Ian Bicking just posted a weblog entry:
http://blog.ianbicking.org/concurrency-and-processes.html where he
says "threads aren't as hard as they imply" and "An especially poor
argument is one that tells me that I'm currently being beaten with a
stick, but apparently don't know it."

I always have a problem with this. After many years of studying
concurrency on-and-off, I continue to believe that threading is very
difficult (indeed, the more I study it, the more difficult I
understand it to be). And I admit this. The comments I sometimes get
back are to the effect that "threading really isn't that hard." Thus,
I am just too dense to get it.

It's hard to know how to answer. I've met enough brilliant people to
know that it's just possible that the person posting really does
easily grok concurrency issues and thus I must seem irreconcilably
thick. This may actually be one of those people for whom threading is
obvious (and Ian has always seemed like a smart guy, for example).

But. I do happen to have contact with a lot of people who are at the
forefront of the threading world, and *none* of them (many of whom
have written the concurrency libraries for Java 5, for example) ever
imply that threading is easy. In fact, they generally go out of their
way to say that it's insanely difficult.

And Java has taken until version 5 to (apparently) get it right,
partly by defining a new memory model in order to accurately describe
what goes on with threading issues. This same model is being adapted
for the next version of C++. This is not stuff that was already out
there, that everyone knew about -- this is new stuff.

Also, look at the work that Scott Meyers, Andrei Alexandrescu, et al
did on the "Double Checked Locking" idiom, showing that it was broken
under threading. That was by no means "trivial and obvious" during all
the years that people thought that it worked.

My own experience in discussions with folks who think that threading
is transparent usually uncovers, after a few appropriate questions,
that said person doesn't actually understand the depth of the issues
involved. A common story is someone who has written a few programs and
convinced themselves that these programs work (the "it works for me"
proof of correctness). Thus, concurrency must be easy.

I know about this because I have learned the hard way throughout many
years, over and over again. Every time I've thought that I understood
concurrency, something new has popped up and shown me a whole new
aspect of things that I have heretofore missed. Then I start thinking
"OK, now I finally understand concurrency."

One example: when I was rewriting the threading chapter for the 3rd
(previous) edition of Thinking in Java, I decided to get a
dual-processor machine so I could really test things. This way, I
discovered that the behavior of a program on a single-processor
machine could be dramatically different than the same program on a
multiprocessor machine. That seems obvious, now, but at the time I
thought I was writing pretty reasonable code. In addition, it turns
out that some things in Java concurrency were broken (even the people
who were creating thread support in the language weren't getting it
right) so that threw in extra monkey wrenches. And when you start
studying the new memory model, which takes into account instruction
reordering and cache coherency issues, you realize that it's
mind-numbingly far from trivial.

Or maybe not, for those who think it's easy. But my experience is that
the people who really do understand concurrency never suggest that
it's easy.

Bruce Eckel    http://www.BruceEckel.com   mailto:BruceEckel-Python3234 at mailblocks.com
Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e"
Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel
Subscribe to my newsletter:
http://www.mindview.net/Newsletter
My schedule can be found at:
http://www.mindview.net/Calendar




From gustavo at niemeyer.net  Fri Oct  7 19:22:37 2005
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Fri, 7 Oct 2005 14:22:37 -0300
Subject: [Python-Dev] Extending tuple unpacking
Message-ID: <20051007172237.GA13288@localhost.localdomain>

Not sure if this has been proposed before, but one thing
I occasionally miss regarding tuple unpack is being able
to do:

  first, second, *rest = something

Also in for loops:

  for first, second, *rest in iterator:
      pass

This seems to match the current meaning for starred
variables in other contexts.

What do you think?

-- 
Gustavo Niemeyer
http://niemeyer.net

From jeremy at alum.mit.edu  Fri Oct  7 19:32:29 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 7 Oct 2005 13:32:29 -0400
Subject: [Python-Dev] Python 2.5 and ast-branch
In-Reply-To: <5.1.1.6.0.20051006224020.01f76148@mail.telecommunity.com>
References: <87irwadwma.fsf@hydra.bayview.thirdcreek.com>
	<000001c5cad5$b6b83ee0$a105a044@oemcomputer>
	<5.1.1.6.0.20051006224020.01f76148@mail.telecommunity.com>
Message-ID: <e8bf7a530510071032m682b9234y5b2848b17d024763@mail.gmail.com>

On 10/6/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 07:34 PM 10/6/2005 -0700, Guido van Rossum wrote:
> >How does this sound to the non-AST-branch developers who have to
> >suffer the inevitable post-merge instability? I think it's now or
> >never -- waiting longer isn't going to make this thing easier (not
> >with several more language changes approved: with-statement, extended
> >import, what else...)
>
> Do the AST branch changes affect the interface of the "parser" module?  Or
> do they just add new functionality?

It doesn't affect the parser module.  For now, the same parser is
used, so the parser module can still work the way it does.  If we
changed the parser in the future, well, the parser module would
change, too.  I'd also like to add an analogous ast module that
exposed the abstract syntax tree for manipulation, along the lines of
the parser module.  Not sure if we'll actually get to it for this
release.

Jeremy

From guido at python.org  Fri Oct  7 19:34:05 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 7 Oct 2005 10:34:05 -0700
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <20051007172237.GA13288@localhost.localdomain>
References: <20051007172237.GA13288@localhost.localdomain>
Message-ID: <ca471dc20510071034y39b3facpe1d43e34b11e69db@mail.gmail.com>

On 10/7/05, Gustavo Niemeyer <gustavo at niemeyer.net> wrote:
> Not sure if this has been proposed before, but one thing
> I occasionally miss regarding tuple unpack is being able
> to do:
>
>   first, second, *rest = something
>
> Also in for loops:
>
>   for first, second, *rest in iterator:
>       pass
>
> This seems to match the current meaning for starred
> variables in other contexts.

Someone should really write up a PEP -- this was just discussed a week
or two ago.

I personally think this is adequately handled by writing:

  (first, second), rest = something[:2], something[2:]

I believe that this wish is an example of "hypergeneralization" -- an
incorrect generalization based on a misunderstanding of the underlying
principle.

Argument lists are not tuples [*] and features of argument lists
should not be confused with features of tuple unpackings.

[*] Proof: f(1) is equivalent to f(1,) even though (1) is an int but
(1,) is a tuple.
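
Both points can be checked in a few lines. This is only a sketch: the 
sample values are arbitrary, f is a throwaway function, and the islice 
variant for one-shot iterators is an addition that assumes slicing is not 
available on the object being unpacked.

```python
from itertools import islice

something = [1, 2, 3, 4]

# The slicing spelling; needs an object that supports slicing.
(first, second), rest = something[:2], something[2:]

# For a one-shot iterator, islice can take the head instead.
stream = iter(something)
head_a, head_b = islice(stream, 2)
tail = list(stream)


def f(*args):
    return args


# An argument list is not a tuple display: the trailing comma changes
# nothing in a call, but it changes the type of a parenthesised value.
same_call = f(1) == f(1,)
int_type = type((1))
tuple_type = type((1,))
```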

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From aahz at pythoncraft.com  Fri Oct  7 19:45:45 2005
From: aahz at pythoncraft.com (Aahz)
Date: Fri, 7 Oct 2005 10:45:45 -0700
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <415220344.20051007104751@MailBlocks.com>
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>
	<20051006221436.2892.JCARLSON@uci.edu>
	<415220344.20051007104751@MailBlocks.com>
Message-ID: <20051007174545.GA8369@panix.com>

On Fri, Oct 07, 2005, Bruce Eckel wrote:
>
> I always have a problem with this. After many years of studying
> concurrency on-and-off, I continue to believe that threading is very
> difficult (indeed, the more I study it, the more difficult I
> understand it to be). And I admit this. The comments I sometimes get
> back are to the effect that "threading really isn't that hard." Thus,
> I am just too dense to get it.

What I generally say is that threading isn't too hard if you stick with
some fairly simple idioms and tools -- and make absolutely certain to
follow some rules about sharing data.  But it's certainly true that
threading (and concurrency) in general is mind-numbingly complex.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From pje at telecommunity.com  Fri Oct  7 19:51:15 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 07 Oct 2005 13:51:15 -0400
Subject: [Python-Dev] PEP 342 suggestion: start(),
 __call__() and  unwind_call() methods
In-Reply-To: <43466112.5050608@iinet.net.au>
Message-ID: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>

At 09:50 PM 10/7/2005 +1000, Nick Coghlan wrote:
>Notice how a non-coroutine callable can be yielded, and it will still work
>happily with the scheduler, because the desire to continue execution is
>indicated by the ContinueIteration exception, rather than by the type of the
>returned value.

Whaaaa?  You raise an exception to indicate the *normal* case?  That seems, 
um...  well, a Very Bad Idea.

I also don't see any point to start(), or understand what finish() does or 
why you'd want it.

Last, but far from least, as far as I can tell you can implement all of 
these semantics using PEP 342 as it sits.  That is, it's very simple to 
make decorators or classes that add those semantics.  I don't see anything 
that requires them to be part of Python.
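
As a rough illustration of that last point, a wrapper class along these 
lines supplies the __call__() behaviour without touching the generator 
type. It is written for current Python 3, where a generator's return value 
arrives as StopIteration.value; the Coroutine class and the doubler example 
are invented for the sketch, and ContinueIteration is the exception name 
from Nick's message.

```python
class ContinueIteration(Exception):
    """Signals that the wrapped generator yielded instead of finishing."""


class Coroutine:
    """Wrap a generator so that calling it behaves like the proposal."""

    def __init__(self, generator):
        self._generator = generator

    def __call__(self, value=None):
        try:
            yield_val = self._generator.send(value)
        except StopIteration as ex:
            # Finished: hand back the generator's return value (or None).
            return ex.value
        else:
            raise ContinueIteration(yield_val)


def doubler():
    received = yield "ready"
    return received * 2


co = Coroutine(doubler())
try:
    co()                      # first call runs up to the first yield
except ContinueIteration as ex:
    yielded = ex.args[0]
final = co(21)                # generator finishes, so this returns
```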


From gustavo at niemeyer.net  Fri Oct  7 19:56:10 2005
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Fri, 7 Oct 2005 14:56:10 -0300
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <ca471dc20510071034y39b3facpe1d43e34b11e69db@mail.gmail.com>
References: <20051007172237.GA13288@localhost.localdomain>
	<ca471dc20510071034y39b3facpe1d43e34b11e69db@mail.gmail.com>
Message-ID: <20051007175610.GA13795@localhost.localdomain>

> Someone should really write up a PEP -- this was just discussed a week
> or two ago.

Heh.. I should follow the list more closely.

> I personally think this is adequately handled by writing:
> 
>   (first, second), rest = something[:2], something[2:]

That's an alternative indeed. But the proposed way does look better:

  for item in iterator:
      (first, second), rest = item[:2], item[2:]
      ...

vs.

  for first, second, *rest in iterator:
     ...

> I believe that this wish is an example of "hypergeneralization" -- an
> incorrect generalization based on a misunderstanding of the underlying
> principle.

Thanks for trying so hard to say in a nice way that this is not
a good idea. :-)

> Argument lists are not tuples [*] and features of argument lists
> should not be confused with features of tuple unpackings.

Do you agree that the concepts are related?

For instance:

  >>> def f(first, second, *rest):
  ...   print first, second, rest
  ...
  >>> f(1,2,3,4)
  1 2 (3, 4)

  >>> first, second, *rest = (1,2,3,4)
  >>> print first, second, rest
  1 2 (3, 4)

> [*] Proof: f(1) is equivalent to f(1,) even though (1) is an int but
> (1,) is a tuple.

"Extended *tuple* unpacking" was a wrong subject indeed. This is
general unpacking, since it's supposed to work with any sequence.

-- 
Gustavo Niemeyer
http://niemeyer.net

From pje at telecommunity.com  Fri Oct  7 20:07:23 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 07 Oct 2005 14:07:23 -0400
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <415220344.20051007104751@MailBlocks.com>
References: <20051006221436.2892.JCARLSON@uci.edu>
	<20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>
	<20051006221436.2892.JCARLSON@uci.edu>
Message-ID: <5.1.1.6.0.20051007140112.03b2fc80@mail.telecommunity.com>

At 10:47 AM 10/7/2005 -0600, Bruce Eckel wrote:
>Also, look at the work that Scott Meyers, Andrei Alexandrescu, et al
>did on the "Double Checked Locking" idiom, showing that it was broken
>under threading. That was by no means "trivial and obvious" during all
>the years that people thought that it worked.

One of the nice things about the GIL is that it means double-checked 
locking *does* work in Python.  :)
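
For anyone who hasn't seen the idiom, a minimal sketch follows. The Lazy 
class is invented for the example, and the unlocked first check leans on 
the assumption that CPython's GIL makes the attribute stores visible to 
other threads in order; it is not a claim about other implementations.

```python
import threading


class Lazy:
    """Create a value at most once, checking twice around a lock."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._value = None
        self._ready = False

    def get(self):
        if not self._ready:              # first check, no lock taken
            with self._lock:
                if not self._ready:      # second check, under the lock
                    self._value = self._factory()
                    self._ready = True
        return self._value
```

The flag is set only after the value is stored, so a thread that skips the 
lock never sees a half-initialised object.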


>My own experience in discussions with folks who think that threading
>is transparent usually uncovers, after a few appropriate questions,
>that said person doesn't actually understand the depth of the issues
>involved. A common story is someone who has written a few programs and
>convinced themselves that these programs work (the "it works for me"
>proof of correctness). Thus, concurrency must be easy.
>
>I know about this because I have learned the hard way throughout many
>years, over and over again. Every time I've thought that I understood
>concurrency, something new has popped up and shown me a whole new
>aspect of things that I have heretofore missed. Then I start thinking
>"OK, now I finally understand concurrency."

The day when I knew, beyond all shadow of a doubt, that the people who say 
threading is easy are full of it, is when I wrote an event-driven 
co-operative multitasking system in Python and managed to create a race 
condition in *single-threaded code*.

Of course, due to its nature, a race condition in an event-driven system is 
at least reproducible given the same sequence of events, and it's fixable 
using "turns" (as described in a paper posted here yesterday).  With 
threads, it's not anything like reproducible, because pre-emptive threading 
is non-deterministic.
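
Here is a toy version of that kind of bug, using plain generators as the 
co-operative tasks. The deposit task and the round-robin loop are made up 
for the illustration and are not the system described above.

```python
def deposit(account, amount):
    balance = account["balance"]            # read
    yield                                   # give up control mid-update
    account["balance"] = balance + amount   # write back a stale value


def run_round_robin(tasks):
    """Step each task in turn until all of them have finished."""
    tasks = list(tasks)
    while tasks:
        task = tasks.pop(0)
        try:
            next(task)
        except StopIteration:
            continue
        tasks.append(task)


account = {"balance": 0}
run_round_robin([deposit(account, 10), deposit(account, 5)])
```

Both tasks read the balance before either writes it, so one deposit is 
lost. With this fixed task order the same deposit is lost on every run, 
which is exactly the reproducibility threads don't give you.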

What the GIL-ranters don't get is that the GIL actually gives you just 
enough determinism to be able to write threaded programs that don't crash, 
and that maybe will even work if you treat every point of interaction 
between threads as a minefield and program with appropriate care.  So, if 
threads are "easy" in Python compared to other languages, it's *because of* 
the GIL, not in spite of it.


From shane at hathawaymix.org  Fri Oct  7 20:42:02 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Fri, 07 Oct 2005 12:42:02 -0600
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <415220344.20051007104751@MailBlocks.com>
References: <20051006143740.287E.JCARLSON@uci.edu>	<200510070145.17284.ms@cerenity.org>	<20051006221436.2892.JCARLSON@uci.edu>
	<415220344.20051007104751@MailBlocks.com>
Message-ID: <4346C17A.2090204@hathawaymix.org>

Bruce Eckel wrote:
> But. I do happen to have contact with a lot of people who are at the
> forefront of the threading world, and *none* of them (many of whom
> have written the concurrency libraries for Java 5, for example) ever
> imply that threading is easy. In fact, they generally go out of their
> way to say that it's insanely difficult.

What's insanely difficult is really locking, and locking is driven by 
concurrency in general, not just threads.  It's hard to reason about 
locks.  There are only general rules about how to apply locking 
correctly, efficiently, and without deadlocks.  Personally, to be 
absolutely certain I've applied locks correctly, I have to think for 
hours.  Even then, it's hard to express my conclusions, so it's hard to 
be sure future maintainers will keep the locking correct.

Java uses locks very liberally, which is to be expected of a language 
that provides locking using a keyword.  This forces Java programmers to 
deal with the burden of locking everywhere.  It also forces the 
developers of the language and its core libraries to make locking 
extremely fast yet safe.  Java threads would be easy if there wasn't so 
much locking going on.

Zope, OTOH, is far more conservative with locks.  There is some code 
that dispatches HTTP requests to a worker thread, and other code that 
reads and writes an object database, but most Zope code isn't aware of 
concurrency.  Thus locking is hardly an issue in Zope, and as a result, 
threading is quite easy in Zope.

Recently, I've been simulating high concurrency on a PostgreSQL 
database, and I've discovered that the way you reason about row and 
table locks is very similar to the way you reason about locking among 
threads.  The big difference is the consequence of incorrect locking: in 
PostgreSQL, using the serializable mode, incorrect locking generally 
only leads to aborted transactions; while in Python and most programming 
languages, incorrect locking instantly causes corruption and chaos. 
That's what hurts developers.  I want a concurrency model in Python that 
acknowledges the need for locking while punishing incorrect locking with 
an exception rather than corruption.  *That* would be cool, IMHO.
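
As a rough sketch of what "an exception rather than corruption" could look
like at the object level (purely hypothetical; `Guarded` and `LockingError`
are invented names, not an existing API), a wrapper can refuse any access
made by a thread that does not currently hold its lock:

```python
import threading

class LockingError(RuntimeError):
    """Raised when guarded state is touched without holding its lock."""

class Guarded:
    """Hypothetical wrapper: the value is reachable only while locked.

    Uses a plain (non-reentrant) Lock, so nesting `with` blocks on the
    same object in one thread is not supported in this sketch.
    """

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()
        self._owner = None          # ident of the thread holding the lock

    def __enter__(self):
        self._lock.acquire()
        self._owner = threading.get_ident()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._owner = None
        self._lock.release()
        return False                # never swallow exceptions

    def _check(self):
        if self._owner != threading.get_ident():
            raise LockingError("accessed without holding the lock")

    def get(self):
        self._check()
        return self._value

    def set(self, value):
        self._check()
        self._value = value

counter = Guarded(0)
with counter:
    counter.set(counter.get() + 1)   # fine: lock is held by this thread
```

Calling `counter.get()` outside a `with counter:` block raises `LockingError`
instead of silently reading unprotected state.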

Shane

From barry at python.org  Fri Oct  7 20:58:25 2005
From: barry at python.org (Barry Warsaw)
Date: Fri, 07 Oct 2005 14:58:25 -0400
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <4346C17A.2090204@hathawaymix.org>
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>	<20051006221436.2892.JCARLSON@uci.edu>
	<415220344.20051007104751@MailBlocks.com>
	<4346C17A.2090204@hathawaymix.org>
Message-ID: <1128711505.9875.19.camel@geddy.wooz.org>

On Fri, 2005-10-07 at 14:42, Shane Hathaway wrote:

> What's insanely difficult is really locking, and locking is driven by 
> concurrency in general, not just threads.  It's hard to reason about 
> locks.  

I think that's a very interesting observation!  I have not built a
tremendous number of concurrent apps, but even the dumb locking that
Mailman does (which is not a great model of granularity ;) has burned
many bch's (brain cell hours) to get right.

Where I have used more concurrency, I generally try to structure my apps
into the one-producer-many-independent-consumers architecture that was
outlined in a previous message.  In that case, if you can narrow your
touch points to the Queue module for example, then yeah, threading is
easy.  A gaggle of independent workers isn't that hard to get right in
Python.
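
A minimal sketch of that shape, assuming a trivial job (squaring numbers) and
written for current Python, where the Queue module is spelled `queue`. The
function names are made up for the example. The only shared objects are the
two queues.

```python
import queue
import threading

def worker(jobs, results):
    """Consume jobs until a None sentinel arrives."""
    while True:
        item = jobs.get()
        if item is None:            # sentinel: no more work for this worker
            break
        results.put(item * item)

def square_all(numbers, n_workers=4):
    """One producer (this function) feeding independent consumer threads."""
    jobs, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for n in numbers:
        jobs.put(n)
    for _ in threads:
        jobs.put(None)              # one sentinel per worker
    for t in threads:
        t.join()
    collected = []
    while not results.empty():      # safe: all workers have finished
        collected.append(results.get())
    return sorted(collected)

print(square_all(range(5)))
```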

-Barry

From solipsis at pitrou.net  Fri Oct  7 21:13:08 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 07 Oct 2005 21:13:08 +0200
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <4346C17A.2090204@hathawaymix.org>
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>	<20051006221436.2892.JCARLSON@uci.edu>
	<415220344.20051007104751@MailBlocks.com>
	<4346C17A.2090204@hathawaymix.org>
Message-ID: <1128712388.6251.21.camel@fsol>


Hi,

(my 2 cents, probably not very constructive)

> Recently, I've been simulating high concurrency on a PostgreSQL 
> database, and I've discovered that the way you reason about row and 
> table locks is very similar to the way you reason about locking among 
> threads.  The big difference is the consequence of incorrect locking: in 
> PostgreSQL, using the serializable mode, incorrect locking generally 
> only leads to aborted transactions; while in Python and most programming 
> languages, incorrect locking instantly causes corruption and chaos. 
> That's what hurts developers.  I want a concurrency model in Python that 
> acknowledges the need for locking while punishing incorrect locking with 
> an exception rather than corruption.  *That* would be cool, IMHO.

A relational database has a very strict and regular data model. Also, it
has transactions. This makes it easy to precisely define concurrency at
the engine level.

To apply the same thing to Python you would at least need :
  1. a way to define a subset of the current bag of reachable objects
which has to stay consistent w.r.t. transactions that are applied to it
(of course, you would have several such subsets in any non-trivial
application)
  2. a way to start and end a transaction on a bag of objects (begin /
commit / rollback)
  3. a precise definition of the semantics of "consistency" here : for
example, only one thread could modify a bag of objects at any given
time, and other threads would continue to see the frozen, stable version
of that bag until the next version is committed by the writing thread

For 1), a helpful paradigm would be to define an object as being the
"root" of a bag, and all its properties would automatically and
recursively (or not ?) belong to this bag. One has to be careful that no
property "leaks" and makes the bag become the set of all reachable
Python objects (one could provide a means to say that a specific
property must not be transitively put in the bag). Then, use
my_object.begin_transaction() and my_object.commit_transaction().

The implementation of 3) does not look very obvious ;-S
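
One naive way to get the semantics in 3), sketched only to make the idea
concrete: the writer works on a deep copy of the bag and swaps it in at
commit time, so readers keep seeing the last committed version. `Bag` and its
methods are hypothetical names, and copying the whole bag on every
transaction is an assumption made for brevity, not a proposal for how it
should really be done.

```python
import copy
import threading

class Bag:
    """Hypothetical bag of objects with single-writer transactions."""

    def __init__(self, root):
        self._committed = root              # version visible to readers
        self._working = None                # writer's private copy
        self._write_lock = threading.Lock() # only one writer at a time

    def snapshot(self):
        """Return the last committed version (treat it as read-only)."""
        return self._committed

    def begin_transaction(self):
        self._write_lock.acquire()
        self._working = copy.deepcopy(self._committed)
        return self._working

    def commit_transaction(self):
        self._committed, self._working = self._working, None
        self._write_lock.release()

    def rollback_transaction(self):
        self._working = None
        self._write_lock.release()

bag = Bag({"items": [1]})
before = bag.snapshot()
work = bag.begin_transaction()
work["items"].append(2)
during = copy.deepcopy(bag.snapshot())   # readers still see the old version
bag.commit_transaction()
```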
 
Regards

Antoine.



From Martin.Maly at microsoft.com  Fri Oct  7 21:15:04 2005
From: Martin.Maly at microsoft.com (Martin Maly)
Date: Fri, 7 Oct 2005 12:15:04 -0700
Subject: [Python-Dev] __doc__ behavior in class definitions
Message-ID: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com>

Hello Python-Dev,
 
My name is Martin Maly and I am a developer at Microsoft, working on the
IronPython project with Jim Hugunin. I am spending a lot of time making
IronPython compatible with Python to the extent possible.

I came across a case where I am not sure whether the behavior is by design
or a bug in Python (Python 2.4.1 (#65, Mar 30 2005, 09:13:57)). Consider the
following Python module:

# module begin
"module doc"

class c:
    print __doc__
    __doc__ = "class doc"           # (1)
    print __doc__

print c.__doc__
# module end

When run, it prints:

module doc
class doc
class doc

Based on the binding rules described in the Python documentation, I
would expect the code to throw because binding created on the line (1)
is local to the class block and all the other __doc__ uses should
reference that binding. Apparently, it is not the case.

Is this a bug in Python or are __doc__ strings in classes subject to some
additional rules?

Thanks
Martin

From fredrik at pythonware.com  Fri Oct  7 22:18:14 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 7 Oct 2005 22:18:14 +0200
Subject: [Python-Dev] __doc__ behavior in class definitions
References: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com>
Message-ID: <di6l68$vt6$1@sea.gmane.org>

Martin Maly wrote:

> I came across a case which I am not sure if by design or a bug in Python
> (Python 2.4.1 (#65, Mar 30 2005, 09:13:57)). Consider following Python
> module:
>
> # module begin
> "module doc"
>
> class c:
>     print __doc__
>     __doc__ = "class doc" (1)
>     print __doc__
>
> print c.__doc__
> # module end
>
> When ran, it prints:
>
> module doc
> class doc
> class doc
>
> Based on the binding rules described in the Python documentation, I
> would expect the code to throw because binding created on the line (1)
> is local to the class block and all the other __doc__ uses should
> reference that binding. Apparently, it is not the case.
>
> Is this bug in Python or are __doc__ strings in classes subject to some
> additional rules?

it's not limited to __doc__ strings, or, for that matter, to attributes:

    spam = "spam"

    class c:
        print spam
        spam = "bacon"
        print spam

        print len(spam)

        def len(self):
            return 10

    print c.spam

the language reference uses the term "local scope" for both class and
def-statements, but it's not really the same thing.  the former is more
like a temporary extra global scope with a (class, global) search path,
names are resolved when they are found (just as in the global scope);
there's no preprocessing step.
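
the difference can be seen side by side in a short sketch (written in current
Python syntax; the names are arbitrary). in the class body the first lookup
falls through to the module-level name, while the function raises because the
later assignment makes the name local for the whole function body.

```python
spam = "module spam"

class C:
    seen_before = spam      # no class-level binding yet: the global is found
    spam = "class spam"
    seen_after = spam       # the class-level binding is now found first

def f():
    seen = spam             # 'spam' is local to f because of the line below
    spam = "function spam"
    return seen

try:
    f()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(C.seen_before, "|", C.seen_after, "|", outcome)
```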

for additional class issues, see the "Discussion" in the nested scopes
PEP:

    http://www.python.org/peps/pep-0227.html

hope this helps!

</F>




From pje at telecommunity.com  Fri Oct  7 22:28:33 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 07 Oct 2005 16:28:33 -0400
Subject: [Python-Dev] __doc__ behavior in class definitions
Message-ID: <5.1.1.6.0.20051007162832.01f7c080@mail.telecommunity.com>

At 12:15 PM 10/7/2005 -0700, Martin Maly wrote:
>Based on the binding rules described in the Python documentation, I
>would expect the code to throw because binding created on the line (1)
>is local to the class block and all the other __doc__ uses should
>reference that binding. Apparently, it is not the case.

Correct - the scoping rules about local bindings causing a symbol to be 
local only apply to *function* scopes.  Class scopes are able to refer to 
module-level names until the name is shadowed in the class scope.


>Is this bug in Python or are __doc__ strings in classes subject to some
>additional rules?

Neither; the behavior you're seeing doesn't have anything to do with 
docstrings per se, it's just normal Python binding behavior, coupled with 
the fact that the class' docstring isn't set until the class suite is 
completed.

It's currently acceptable (if questionable style) to do things like this in 
today's Python:

     X = 1

     class X:
         X = X + 1

     print X.X  # this will print "2"

More commonly, and less questionably, this would manifest as something like:

     def function_taking_foo(foo, bar):
         ...

     class Foo(blah):
         function_taking_foo = function_taking_foo

This makes it possible to call 'function_taking_foo(aFooInstance, someBar)' 
or 'aFooInstance.function_taking_foo(someBar)'.  I've used this pattern a 
couple times myself, and I believe there may actually be cases in the 
standard library that do something like this, although maybe not binding 
the method under the same name as the function.


From steve at holdenweb.com  Fri Oct  7 22:33:57 2005
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 07 Oct 2005 21:33:57 +0100
Subject: [Python-Dev] __doc__ behavior in class definitions
In-Reply-To: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com>
References: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com>
Message-ID: <di6m3b$2su$1@sea.gmane.org>

Martin Maly wrote:
> Hello Python-Dev,
>  
> My name is Martin Maly and I am a developer at Microsoft, working on the
> IronPython project with Jim Hugunin. I am spending lot of time making
> IronPython compatible with Python to the extent possible.
> 
> I came across a case which I am not sure if by design or a bug in Python
> (Python 2.4.1 (#65, Mar 30 2005, 09:13:57)). Consider following Python
> module:
> 
> # module begin
> "module doc"
> 
> class c:
>     print __doc__
>     __doc__ = "class doc"			(1)
>     print __doc__
> 
> print c.__doc__
> # module end
> 
> When ran, it prints:
> 
> module doc
> class doc
> class doc
> 
> Based on the binding rules described in the Python documentation, I
> would expect the code to throw because binding created on the line (1)
> is local to the class block and all the other __doc__ uses should
> reference that binding. Apparently, it is not the case.
> 
> Is this bug in Python or are __doc__ strings in classes subject to some
> additional rules?
> 
Well, it's nothing to do with __doc__, as the following example shows:

crud = "module crud"

class c:
     print crud
     crud = "class crud"
     print crud

print c.crud

As you might by now expect, this outputs

module crud
class crud
class crud

Clearly the rules for class scopes aren't quite the same as those for 
function scopes, as the module

crud = "module crud"

def f():
     print crud
     crud = "function crud"
     print crud

f()

does indeed raise an UnboundLocalError exception.

I'm not enough of a language lawyer to determine exactly why this is, 
but it's clear that class variables aren't scoped in the same way as 
function locals.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/


From jack at performancedrivers.com  Fri Oct  7 22:52:37 2005
From: jack at performancedrivers.com (Jack Diederich)
Date: Fri, 7 Oct 2005 16:52:37 -0400
Subject: [Python-Dev] __doc__ behavior in class definitions
In-Reply-To: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com>
References: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com>
Message-ID: <20051007205237.GL6255@performancedrivers.com>

On Fri, Oct 07, 2005 at 12:15:04PM -0700, Martin Maly wrote:
> Hello Python-Dev,
>  
> My name is Martin Maly and I am a developer at Microsoft, working on the
> IronPython project with Jim Hugunin. I am spending lot of time making
> IronPython compatible with Python to the extent possible.
> 
> I came across a case which I am not sure if by design or a bug in Python
> (Python 2.4.1 (#65, Mar 30 2005, 09:13:57)). Consider following Python
> module:
> 
> # module begin
> "module doc"
> 
> class c:
>     print __doc__
>     __doc__ = "class doc"			(1)
>     print __doc__
>
[snip]
>
> Based on the binding rules described in the Python documentation, I
> would expect the code to throw because binding created on the line (1)
> is local to the class block and all the other __doc__ uses should
> reference that binding. Apparently, it is not the case.
> 
> Is this bug in Python or are __doc__ strings in classes subject to some
> additional rules?

Classes behave just like you would expect them to, for proper variations
of what to expect *wink*.

The class body is evaluated first with the same local/global name lookups
as would happen inside another scope (e.g. a function).  The results
of that evaluation are then passed to the class constructor as a dict.
The __new__ method of metaclasses and the less used 'new' module highlight
the final step that turns a bucket of stuff in a namespace into a class.

>>> import new
>>> A = new.classobj('w00t', (object,), {'__doc__':"no help at all", 'myself':lambda x:x})
>>> a = A()
>>> a.myself()
<__main__.w00t object at 0xb7bc32cc>
>>> a
<__main__.w00t object at 0xb7bc32cc>
>>> help(a)
Help on w00t in module __main__ object:

class w00t(__builtin__.object)
 |  no help at all
 |  
 |  Methods defined here:
 |  
 |  lambdax
 |
>>>
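
For readers on Python 3, where the 'new' module no longer exists, the same
final step can be sketched with the three-argument form of the built-in
type(). The class name and namespace below simply mirror the session above.

```python
# Build a class from a name, a tuple of bases, and a namespace dict.
namespace = {
    "__doc__": "no help at all",
    "myself": lambda self: self,   # becomes an ordinary method
}
W00t = type("w00t", (object,), namespace)

a = W00t()
print(a.myself() is a)
print(W00t.__doc__)
```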

Hope that helps,

-jackdied

From shane at hathawaymix.org  Fri Oct  7 22:55:58 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Fri, 07 Oct 2005 14:55:58 -0600
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <1128712388.6251.21.camel@fsol>
References: <20051006143740.287E.JCARLSON@uci.edu>	<200510070145.17284.ms@cerenity.org>	<20051006221436.2892.JCARLSON@uci.edu>	<415220344.20051007104751@MailBlocks.com>	<4346C17A.2090204@hathawaymix.org>
	<1128712388.6251.21.camel@fsol>
Message-ID: <4346E0DE.70502@hathawaymix.org>

Antoine Pitrou wrote:
> A relational database has a very strict and regular data model. Also, it
> has transactions. This makes it easy to precisely define concurrency at
> the engine level.
> 
> To apply the same thing to Python you would at least need :
>   1. a way to define a subset of the current bag of reachable objects
> which has to stay consistent w.r.t. transactions that are applied to it
> (of course, you would have several such subsets in any non-trivial
> application)
>   2. a way to start and end a transaction on a bag of objects (begin /
> commit / rollback)
>   3. a precise definition of the semantics of "consistency" here : for
> example, only one thread could modify a bag of objects at any given
> time, and other threads would continue to see the frozen, stable version
> of that bag until the next version is committed by the writing thread
> 
> For 1), a helpful paradigm would be to define an object as being the
> "root" of a bag, and all its properties would automatically and
> recursively (or not ?) belong to this bag. One has to be careful that no
> property "leaks" and makes the bag become the set of all reachable
> Python objects (one could provide a means to say that a specific
> property must not be transitively put in the bag). Then, use
> my_object.begin_transaction() and my_object.commit_transaction().
> 
> The implementation of 3) does not look very obvious ;-S

Well, I think you just described ZODB. ;-)  I'd be happy to explain how 
ZODB solves those problems, if you're interested.

However, ZODB doesn't provide locking, and that bothers me somewhat.  If 
two threads try to modify an object at the same time, one of the threads 
will be forced to abort, unless a method has been defined for resolving 
the conflict.  If there are too many writers, ZODB crawls.  ZODB's 
strategy works fine when there aren't many conflicting, concurrent 
changes, but the complex locking done by relational databases seems to 
be required for handling a lot of concurrent writers.
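
To make the "one of the threads will be forced to abort" behaviour concrete,
here is a toy sketch of optimistic conflict detection. It is not ZODB code or
its API; `VersionedStore` and this `ConflictError` are invented for the
example. A write succeeds only if the object has not changed since the
writer read it.

```python
import threading

class ConflictError(Exception):
    """Raised when a write is based on an out-of-date read."""

class VersionedStore:
    """Toy store: every key carries a serial number bumped on each write."""

    def __init__(self):
        self._data = {}
        self._serial = {}
        self._lock = threading.Lock()   # makes check-and-write atomic

    def load(self, key):
        with self._lock:
            return self._data.get(key), self._serial.get(key, 0)

    def store(self, key, value, serial_read):
        with self._lock:
            if self._serial.get(key, 0) != serial_read:
                raise ConflictError(key)    # someone else wrote first
            self._data[key] = value
            self._serial[key] = serial_read + 1

store = VersionedStore()
_, serial_a = store.load("x")       # writer A reads
_, serial_b = store.load("x")       # writer B reads the same version
store.store("x", "from A", serial_a)
try:
    store.store("x", "from B", serial_b)
    conflicted = False
except ConflictError:
    conflicted = True               # B must abort and retry
```

With many writers hitting the same keys, most attempts end in the `except`
branch, which is the "ZODB crawls" case described above.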

Shane

From ms at cerenity.org  Fri Oct  7 23:02:38 2005
From: ms at cerenity.org (Michael Sparks)
Date: Fri, 7 Oct 2005 22:02:38 +0100
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <20051006221436.2892.JCARLSON@uci.edu>
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>
	<20051006221436.2892.JCARLSON@uci.edu>
Message-ID: <200510072202.39129.ms@cerenity.org>

[ Possibly overlengthy reply. However given multiple sets of cans of
  worms... ]
On Friday 07 October 2005 07:25, Josiah Carlson wrote:
> One thing I notice is absent from the Kamaelia page is benchmarks.

That's largely for one simple reason: we haven't done any yet. 

At least not anything I'd call a benchmark. "There's lies, damn lies,
statistics and then there's benchmarks."

//Theoretically// I suspect that the system /could/ perform as well as
traditional approaches to dealing with concurrent problems single threaded 
(and multi-thread/process). This is based on the recognition of two things:

    * Event systems (often implementing state machine type behaviour, not
      always though), often have intermediate buffers between states &
      operations. Some systems divide a problem into multiple reactors and
      stages and have communication between them, though this can sometimes
      be hidden. All we've done is make this much more explicit.

    * Event systems (and state machine based approaches) can often be used
      to effectively say "I want to stop and wait here, come back to me later"
      or simply "I'm doing something processor intensive, but I'm being nice
      and letting something else have a go". The use of generators here simply
      makes that particular behaviour more explicit. This is a nice bonus of
      python.

[neither is a negative really, just different. The first bullet has implicit
 buffers in the system, the latter has a more implicit state machine
 in the system. ICBVW here of course.]

However, COULD is not is, and whilst I say "in theory", I am painfully aware 
that theory and practice often have a big gulf between them.

Also, I'm certain that at present our performance is nowhere near optimal.
We've focussed on trying to find what works from a few perspectives rather
than performance (one possible definition of correctness here, but certainly
not the only one). Along the way we've made compromises in favour of clarity
as to what's going on, rather than performance.

For example, one area we know we can optimise is the handling of
message delivery. The mini-axon tutorial represents delivery between
active components as being performed by an independent party - a
postman. This is precisely what happens in the current system.

That can be optimised for example by collapsing outboxes into inboxes
(ie removing one of the lists when a linkage is made and changing the
reference), and at that point you have a single intermediate buffer (much
like an event/state system communicating between subsystems). We haven't
done this yet. Whilst it would partly simplify things, it makes other
areas more complex, and seems like premature optimisation.
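
A rough sketch of the two delivery models, using plain lists as boxes. This
is an illustration only and not the Axon/Kamaelia API; `Component`,
`postman_deliver` and `collapse` are hypothetical names.

```python
class Component:
    """Hypothetical component with one inbox and one outbox."""
    def __init__(self):
        self.inbox = []
        self.outbox = []

def postman_deliver(source, sink):
    """Current model: a third party copies messages outbox -> inbox."""
    while source.outbox:
        sink.inbox.append(source.outbox.pop(0))

def collapse(source, sink):
    """Optimisation: the source's outbox *becomes* the sink's inbox."""
    sink.inbox.extend(source.outbox)   # keep anything already queued
    source.outbox = sink.inbox         # one list, one intermediate buffer

# Postman model: a message needs an explicit delivery step.
a, b = Component(), Component()
a.outbox.append("hello")
postman_deliver(a, b)

# Collapsed model: appending to the outbox is already delivery.
c, d = Component(), Component()
collapse(c, d)
c.outbox.append("hello")
```

The cost hinted at above shows up when a linkage is removed: the shared list
has to be split again.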

However I have performed an //informal comparison// between the use of a 
Kamaelia type approach and a traditional approach not using any framework at 
all for implementing a trivial game. (Cats bouncing around the screen 
scaling, rotating, etc, controlled by a user) The reason I say Kamaelia-type 
approach is because it was a mini-axon based experiment using collapsed 
outboxes to inboxes (as above).

The measure I used was simply framerate. This is a fair real value and has a 
real use - if it drops too low, the system is simply unusable. I measured the 
framerate before transforming the simplistic game to work well in the 
framework, and after transforming it. The differences were:
   * 5% drop in performance/framerate
   * The ability to reuse much of the code in other systems and environments.

From that perspective it seems acceptable (for now). This *isn't* as you would
probably say a rigorous or trustable benchmark, but was a useful "smoke test"
if you like of the approach.

From a *pragmatic* perspective, currently the system is fast enough for simple 
games (say a hundred, two hundred, maybe more, sprites active at once),
for interactive applications, video players, realtime audio mixing and a
variety of other things, so currently we're leaving that aside.

Also from an even more pragmatic perspective, I would say if you're after 
performance and throughput then I'd say use Twisted, since it's a proven 
technology.

**If** our stuff turns out to be useful, we'd like to find a way of making our
stuff available inside twisted -- if they'd like it (*) --  since we're not 
the least bit interested in competing with anyone :-) So far *we're* finding 
it useful, which is all I'd personally claim, and hope that it's useful to 
others.
   (*) The all too brief conversation I had with Tommi Virtanen at Europython
       suggested that he at least thought the pipeline/graphline idea was
       worth taking - so I'd like to do that at some point, even if it 
       sidelines our work to date.

Once we've validated the model though (which I expect to take some time,
you only learn if it's validated by building things IMO), then we'll look at
optimisation.  (if the model is validated :-)

All that said, I'm open to suggestion as to what sort of benchmark you'd like 
to see. I'm more interested in benchmarks that actually mean something rather 
than say X is better than Y though.

Summarising then, no benchmarks, yet. If you're after speed, I'm certain
you can find that elsewhere. If you're after an easy way of dealing with a
concurrent problem, that's where we're starting from, and then optimising. 
We're very open to suggestions for improvement on both usability/learnability 
and on keeping doors open/open doors to performance though. I'd hate to
have to rewrite everything in another language later simply due to poor
design decisions. 

[ Network controlled Networked Audio Mixing Matrix ]
> Very neat.  How much data?  What kind of throughput?  What kinds of
> latencies?

For the test system we tested with 3 raw PCM audio data streams. That's
3 x 44.1kHz, 16 bit stereo - which is around 4.2Mbit/s of data from the
network being processed realtime and output back to the network at
1.4Mbit/s. So, not huge numbers, but not insignificant amounts of data
either. I suppose one thing I can take more time with now is to look at
the specific latency of the mixer. It didn't *appear* to be large however.
(there appeared to be similar latency in the system with or without the
mixer)
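
The figures above follow from the stream format; a quick check, assuming
uncompressed PCM with no framing or network overhead counted:

```python
SAMPLE_RATE_HZ = 44_100     # 44.1kHz
BITS_PER_SAMPLE = 16
CHANNELS = 2                # stereo
STREAMS_IN = 3

# Bit rate of one raw stream, which is also the mixed output rate.
per_stream_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
# Total rate arriving from the network.
input_bps = STREAMS_IN * per_stream_bps

print("in:  %.2f Mbit/s" % (input_bps / 1e6))
print("out: %.2f Mbit/s" % (per_stream_bps / 1e6))
```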

[[The aim of the rapid prototyping session was to see what could be done
  rather than to measure the results. The total time taken for coding the
  mixing matrix was 2.5 days. About 1/2 day spent on finding an issue we had
  with network resends regarding non-blocking sockets. A day with me totally
  misunderstanding how mixing raw audio byte streams works. The backplane
  was written during that 3 day time period. The control protocol for
  switching on/off mixes and querying the system though was ~1.5 hours
  from start to finish, including testing. To experiment with what dataflow
  architecture might work, I knocked up a command line controlled dynamic
  graph viewer (add nodes, link nodes, delete nodes) in about 5 minutes and
  then experimented with what the system would look like if done naively. The
   backplane idea became clear as useful here because we wanted to allow
  multiple mixers. ]]

A more interesting effect we found was dealing with mouse movement in pygame
where we found that *huge* numbers of messages being sent one at a time and
processed one at a time (with yields after each) became a huge bottleneck.

It made more sense to batch the events and pass them to client surfaces.
(If that makes no sense we allow pygame components to act as if they have
control of the display by giving them a surface from a pygame display service.
This acts essentially as a simplistic window manager. That means pygame events 
need to be passed through quickly and cleanly.)
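
A toy illustration of why batching helps, counting scheduler turns only.
The generator names are invented and no pygame calls are involved; the point
is simply that one yield per message scales with the number of events, while
a batch costs a single turn.

```python
def deliver_one_at_a_time(events, inbox):
    """One message and one yield per event."""
    for event in events:
        inbox.append(event)
        yield

def deliver_batched(events, inbox):
    """All pending events travel as one message, with one yield."""
    inbox.append(list(events))
    yield

def count_turns(task):
    """Run a generator to completion and count how often it yielded."""
    return sum(1 for _ in task)

mouse_events = [("motion", x, x) for x in range(1000)]
single_turns = count_turns(deliver_one_at_a_time(mouse_events, []))
batch_turns = count_turns(deliver_batched(mouse_events, []))
print(single_turns, batch_turns)
```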

The reason I like using pygame for these things is because a) it's relatively 
raw and fast b) games are another often /naturally/ concurrent system. Also 
it normally allows other senses beyond reading numbers/graphs to kick in when 
evaluating changes "that looks better/worse", "There's something wrong 
there".

> I have two recent posts about the performance and features of a (hacked
> together) tuple space system 

Great :-) I'll have a dig around.

> The only thing that it is missing is a prioritization mechanism (fifo,
> numeric priority, etc.), which would get us a job scheduling kernel. Not
> bad  for a "message passing"/"tuple space"/"IPC" library.   

Sounds interesting. I'll try and find some time to have a look and have a 
play. FWIW, we're also missing a prioritisation mechanism right now. Though 
currently I have Simon Wittber's latest release of Nanothreads on my stack of 
to look at. I do have a soft spot for Linda type approaches though :-)

Best Regards,


Michael.
-- 
Michael Sparks, Senior R&D Engineer, Digital Media Group
Michael.Sparks at rd.bbc.co.uk, http://kamaelia.sourceforge.net/
British Broadcasting Corporation, Research and Development
Kingswood Warren, Surrey KT20 6NP

This e-mail may contain personal views which are not the views of the BBC.

From jim at zope.com  Fri Oct  7 23:07:58 2005
From: jim at zope.com (Jim Fulton)
Date: Fri, 07 Oct 2005 17:07:58 -0400
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <4346E0DE.70502@hathawaymix.org>
References: <20051006143740.287E.JCARLSON@uci.edu>	<200510070145.17284.ms@cerenity.org>	<20051006221436.2892.JCARLSON@uci.edu>	<415220344.20051007104751@MailBlocks.com>	<4346C17A.2090204@hathawaymix.org>	<1128712388.6251.21.camel@fsol>
	<4346E0DE.70502@hathawaymix.org>
Message-ID: <4346E3AE.6020506@zope.com>

Shane Hathaway wrote:
> Antoine Pitrou wrote:
> 
>>A relational database has a very strict and regular data model. Also, it
>>has transactions. This makes it easy to precisely define concurrency at
>>the engine level.
>>
>>To apply the same thing to Python you would at least need :
>>  1. a way to define a subset of the current bag of reachable objects
>>which has to stay consistent w.r.t. transactions that are applied to it
>>(of course, you would have several such subsets in any non-trivial
>>application)
>>  2. a way to start and end a transaction on a bag of objects (begin /
>>commit / rollback)
>>  3. a precise definition of the semantics of "consistency" here : for
>>example, only one thread could modify a bag of objects at any given
>>time, and other threads would continue to see the frozen, stable version
>>of that bag until the next version is committed by the writing thread
>>
>>For 1), a helpful paradigm would be to define an object as being the
>>"root" of a bag, and all its properties would automatically and
>>recursively (or not ?) belong to this bag. One has to be careful that no
>>property "leaks" and makes the bag become the set of all reachable
>>Python objects (one could provide a means to say that a specific
>>property must not be transitively put in the bag). Then, use
>>my_object.begin_transaction() and my_object.commit_transaction().
>>
>>The implementation of 3) does not look very obvious ;-S
> 
> 
> Well, I think you just described ZODB. ;-)  I'd be happy to explain how 
> ZODB solves those problems, if you're interested.
> 
> However, ZODB doesn't provide locking, and that bothers me somewhat.  If 
> two threads try to modify an object at the same time, one of the threads 
> will be forced to abort, unless a method has been defined for resolving 
> the conflict.  If there are too many writers, ZODB crawls.  ZODB's 
> strategy works fine when there aren't many conflicting, concurrent 
> changes, but the complex locking done by relational databases seems to 
> be required for handling a lot of concurrent writers.

I don't think it would be all that hard to use a locking (rather than
a time-stamp) strategy for ZODB, although ZEO would make this
extra challenging.

In any case, the important thing to agree on here is that transactions
provide a useful approach to concurrency control in the case where

- separate control flows are independent, and

- we need to mediate access to shared resources.

Someone else pointed out essentially the same thing at the beginning
of this thread.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From solipsis at pitrou.net  Fri Oct  7 23:19:25 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 07 Oct 2005 23:19:25 +0200
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <4346E0DE.70502@hathawaymix.org>
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>	<20051006221436.2892.JCARLSON@uci.edu>
	<415220344.20051007104751@MailBlocks.com>
	<4346C17A.2090204@hathawaymix.org> <1128712388.6251.21.camel@fsol>
	<4346E0DE.70502@hathawaymix.org>
Message-ID: <1128719965.6251.46.camel@fsol>


> Well, I think you just described ZODB. ;-)

*gasp*

> I'd be happy to explain how 
> ZODB solves those problems, if you're interested.

Well, yes, I'm interested :)
(I don't know anything about Zope internals though, and I've never even used
it)




From shane at hathawaymix.org  Sat Oct  8 00:12:13 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Fri, 07 Oct 2005 16:12:13 -0600
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <1128719965.6251.46.camel@fsol>
References: <20051006143740.287E.JCARLSON@uci.edu>	<200510070145.17284.ms@cerenity.org>	<20051006221436.2892.JCARLSON@uci.edu>	<415220344.20051007104751@MailBlocks.com>	<4346C17A.2090204@hathawaymix.org>
	<1128712388.6251.21.camel@fsol>	<4346E0DE.70502@hathawaymix.org>
	<1128719965.6251.46.camel@fsol>
Message-ID: <4346F2BD.8060301@hathawaymix.org>

Antoine Pitrou wrote:
>>I'd be happy to explain how 
>>ZODB solves those problems, if you're interested.
> 
> 
> Well, yes, I'm interested :)
> (I don't know anything about Zope internals though, and I've never even used
> it)

Ok.  Quoting your list:

 > To apply the same thing to Python you would at least need :
 >   1. a way to define a subset of the current bag of reachable objects
 > which has to stay consistent w.r.t. transactions that are applied
 > to it (of course, you would have several such subsets in any
 > non-trivial application)

ZODB holds a tree of objects.  When you add an attribute to an object 
managed by ZODB, you're expanding the tree.  Consistency comes from 
several features:

   - Each thread has its own lazy copy of the object tree.

   - The application doesn't see changes to the object tree except at 
transaction boundaries.

   - The ZODB store keeps old revisions, and the new MVCC feature lets 
the application see the object system as it was at the beginning of the 
transaction.

   - If you make a change to the object tree that conflicts with a 
concurrent change, all changes to that copy of the object tree are aborted.
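
A toy model of the four points above, not ZODB's actual API: the names
Store, Connection and ConflictError are defined here purely for
illustration. Each connection keeps a private, lazily filled copy of
what it has read, and a commit is refused if some other commit changed
the same key in the meantime.

```python
import threading


class ConflictError(Exception):
    """A commit collided with a concurrent change to the same key."""


class Store:
    """Shared storage mapping key -> (revision, value)."""

    def __init__(self):
        self._lock = threading.Lock()  # guards only the short commit step
        self._objects = {}

    def load(self, key):
        return self._objects.get(key, (0, None))

    def commit(self, changes, seen):
        with self._lock:
            # Refuse the whole commit if any changed key moved on since it was read.
            for key in changes:
                if self.load(key)[0] != seen[key]:
                    raise ConflictError(key)
            for key, value in changes.items():
                self._objects[key] = (seen[key] + 1, value)


class Connection:
    """One per thread: a lazy private copy of the objects it has touched."""

    def __init__(self, store):
        self.store = store
        self._reset()

    def _reset(self):
        self._seen = {}     # key -> revision this copy is based on
        self._cache = {}    # key -> private value
        self._changes = {}  # key -> value to write at commit

    def get(self, key):
        if key not in self._cache:
            self._seen[key], self._cache[key] = self.store.load(key)
        return self._cache[key]

    def set(self, key, value):
        self.get(key)  # remember which revision the change is based on
        self._cache[key] = value
        self._changes[key] = value

    def commit(self):
        try:
            self.store.commit(self._changes, self._seen)
        finally:
            self._reset()  # on conflict, every change in this copy is dropped


store = Store()
a, b = Connection(store), Connection(store)
a.set("counter", 1)
b.set("counter", 2)   # b does not see a's uncommitted change
a.commit()
try:
    b.commit()
    conflicted = False
except ConflictError:
    conflicted = True
```

The second commit is the one that loses: it was based on a revision
that no longer exists, so the whole private copy is thrown away and the
caller would normally retry the transaction.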

 >   2. a way to start and end a transaction on a bag of objects (begin /
 > commit / rollback)

ZODB includes a transaction module that does just that.  In fact, the 
module is so useful that I think it belongs in the standard library.

 >   3. a precise definition of the semantics of "consistency" here : for
 > example, only one thread could modify a bag of objects at any given
 > time, and other threads would continue to see the frozen,
 > stable version of that bag until the next version is committed by the
 > writing thread

As mentioned above, the key is that ZODB maintains a copy of the objects 
per thread.  A fair amount of RAM is lost that way, but the benefit in 
simplicity is tremendous.

You also talked about the risk that applications would accidentally pull 
a lot of objects into the tree just by setting an attribute.  That can 
and does happen, but the most common case is already solved by the 
pickle machinery: if you pickle something global like a class, the 
pickle stores the name and location of the class instead of the class 
itself.
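
The pickle behaviour is easy to check with the standard library alone.
This is only a sketch of the by-reference rule for classes; it says
nothing about how ZODB itself lays out its records.

```python
import collections
import pickle

# Pickling a class stores a reference (module name + class name),
# not the class body.
payload = pickle.dumps(collections.OrderedDict)

# Unpickling looks the name up again and returns the very same object.
restored = pickle.loads(payload)
```

The payload is a few dozen bytes because it holds only the two names.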

Shane

From BruceEckel-Python3234 at mailblocks.com  Sat Oct  8 00:26:47 2005
From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel)
Date: Fri, 7 Oct 2005 16:26:47 -0600
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <200510072202.39129.ms@cerenity.org>
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>
	<20051006221436.2892.JCARLSON@uci.edu>
	<200510072202.39129.ms@cerenity.org>
Message-ID: <1245045308.20051007162647@MailBlocks.com>

> //Theoretically// I suspect that the system /could/ perform as well as
> traditional approaches to dealing with concurrent problems single threaded
> (and multi-thread/process).

I also think it's important to factor in the possibility of
multiprocessors. If Kamaelia (for example) has a very safe and
straightforward programming model so that more people are easily able
to use it, but it has some performance impact over more complex
systems, I think the ease of use issue opens up far greater
possibilities if you include multiprocessing -- because if you can
easily write concurrent programs in Python, then Python could gain a
significant advantage over less agile languages when multiprocessors
become common. That is, with multiprocessors, it could be way easier
to write a program in Python that also runs way faster than the
competition. Yes, of course given enough time they might theoretically
be able to write a program that is as fast or faster using their
threading mechanism, but it would be so hard by comparison that
they'll either never get it done or never be sure if it's reliable.

That's what I'm looking for.

Bruce Eckel    http://www.BruceEckel.com   mailto:BruceEckel-Python3234 at mailblocks.com
Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e"
Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel
Subscribe to my newsletter:
http://www.mindview.net/Newsletter
My schedule can be found at:
http://www.mindview.net/Calendar




From ms at cerenity.org  Sat Oct  8 00:47:13 2005
From: ms at cerenity.org (Michael Sparks)
Date: Fri, 7 Oct 2005 23:47:13 +0100
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <1245045308.20051007162647@MailBlocks.com>
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510072202.39129.ms@cerenity.org>
	<1245045308.20051007162647@MailBlocks.com>
Message-ID: <200510072347.14006.ms@cerenity.org>

On Friday 07 October 2005 23:26, Bruce Eckel wrote:
>  I think the ease of use issue opens up far greater possibilities if you
>  include multiprocessing  
...
> That's what I'm looking for.

In which case that's an area we need to push our work into sooner rather
than later. After all, the PS3 and CELL arrive next year. Sun already has
some interesting stuff shipping. I'd like to use that kit effectively, and
more importantly make using that kit effectively available to colleagues
sooner rather than later. That really means multiprocess "now" not later.

BTW, I hope it's clear that I'm not saying concurrency is easy per se (noting
your previous post ;-) but rather that it /should/ be made as simple as is
humanly possible.

Thanks!


Michael.
-- 
Michael Sparks, Senior R&D Engineer, Digital Media Group
Michael.Sparks at rd.bbc.co.uk, http://kamaelia.sourceforge.net/
British Broadcasting Corporation, Research and Development
Kingswood Warren, Surrey KT20 6NP

This e-mail may contain personal views which are not the views of the BBC.

From ms at cerenity.org  Sat Oct  8 00:49:42 2005
From: ms at cerenity.org (Michael Sparks)
Date: Fri, 7 Oct 2005 23:49:42 +0100
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <1093762964.20051006140637@MailBlocks.com>
References: <fb6fbf5605092915471cbb1189@mail.gmail.com>
	<200510062054.56985.ms@cerenity.org>
	<1093762964.20051006140637@MailBlocks.com>
Message-ID: <200510072349.42943.ms@cerenity.org>

On Thursday 06 October 2005 21:06, Bruce Eckel wrote:
> So yes indeed, this is quite high on my list to research. Looks like
> people there have been doing some interesting work.
>
> Right now I'm just trying to cast a net, so that people can put in
> ideas, for when the Java book is done and I can spend more time on it.

Thanks for your kind words. Hopefully it's of use!

:-)


Michael.

From jason.orendorff at gmail.com  Sat Oct  8 00:51:21 2005
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Fri, 7 Oct 2005 18:51:21 -0400
Subject: [Python-Dev] __doc__ behavior in class definitions
In-Reply-To: <1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com>
References: <AcXLc2wvV9cl6RTqSm2fiVTMVP0PXw==>
	<1DFB396200705E46B5338CA4B2E25BDE9E6FF5@DF-BANDIT-MSG.exchange.corp.microsoft.com>
Message-ID: <bb8868b90510071551p650cdcr36c0005fed0320f6@mail.gmail.com>

Martin,

These two cases generate different bytecode.

    def foo():     # foo.func_code.co_flags == 0x43
        print x    # LOAD_FAST 0
        x = 3

    class Foo:     # <code object>.co_flags == 0x40
        print x    # LOAD_NAME 'x'
        x = 3

In functions, local variables are just numbered slots. (co_flags bits
1 and 2 indicate this.)  The LOAD_FAST opcode is used.  If the slot is
empty, LOAD_FAST raises UnboundLocalError.

In other code, the local variables are actually stored in a
dictionary.  LOAD_NAME is used.  This does a locals dictionary lookup;
failing that, it falls back on the globals dictionary; and failing
that, it falls back on builtins.
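
The difference can be inspected with the dis module. The sketch below
is in Python 3 syntax (print is a function there), and the helper
load_ops is a name invented for this example. Exact opcode names vary
between CPython versions, so it only checks the opcode family.

```python
import dis
import types

SRC = '''
def foo():
    print(x)
    x = 3

class Foo:
    print(x)
    x = 3
'''


def load_ops(code, name="x"):
    """Return the names of the opcodes that load `name` in one code object."""
    return [ins.opname for ins in dis.get_instructions(code)
            if ins.opname.startswith("LOAD_") and ins.argval == name]


# Compile without executing; the function and the class body each get
# their own code object, stored among the module's constants.
module_code = compile(SRC, "<example>", "exec")
bodies = {c.co_name: c for c in module_code.co_consts
          if isinstance(c, types.CodeType)}

func_ops = load_ops(bodies["foo"])    # a LOAD_FAST variant (numbered slot)
class_ops = load_ops(bodies["Foo"])   # LOAD_NAME (dictionary lookup)
```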

Why the discrepancy?  Beats me.  I would definitely implement what
CPython does up to this point, if that's your question.

Btw, functions that use 'exec' are in their own category way out
there:

    def foo2():     # foo2.func_code.co_flags == 0x42
        print x     # LOAD_NAME 'x'
        exec "x=3"  # don't ever do this, it screws everything up
        print x

Pretty weird.  Jython seems to implement this.

-j

From ncoghlan at gmail.com  Sat Oct  8 00:54:16 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 08 Oct 2005 08:54:16 +1000
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <415220344.20051007104751@MailBlocks.com>
References: <20051006143740.287E.JCARLSON@uci.edu>	<200510070145.17284.ms@cerenity.org>	<20051006221436.2892.JCARLSON@uci.edu>
	<415220344.20051007104751@MailBlocks.com>
Message-ID: <4346FC98.5050504@gmail.com>

Bruce Eckel wrote:
> I always have a problem with this. After many years of studying
> concurrency on-and-off, I continue to believe that threading is very
> difficult (indeed, the more I study it, the more difficult I
> understand it to be). And I admit this. The comments I sometimes get
> back are to the effect that "threading really isn't that hard." Thus,
> I am just too dense to get it.

The few times I have encountered anyone saying anything resembling "threading 
is easy", it was because the full sentence went something like "threading is 
easy if you use message passing and copy-on-send or release-reference-on-send 
to communicate between threads, and limit the shared data structures to those 
required to support the messaging infrastructure". And most of the time there 
was an implied "compared to using semaphores and locks directly, " at the start.
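
A minimal sketch of that style, using only the standard library
(Python 3 module names): the two threads share nothing except the two
queues, and None is used as an end-of-stream marker by convention in
this example only.

```python
import queue
import threading


def worker(inbox, outbox):
    # Pull messages until the None sentinel arrives; touch no shared state.
    for item in iter(inbox.get, None):
        outbox.put(item * item)
    outbox.put(None)  # tell the consumer we are done


inbox, outbox = queue.Queue(), queue.Queue()
thread = threading.Thread(target=worker, args=(inbox, outbox))
thread.start()

for n in (1, 2, 3):
    inbox.put(n)
inbox.put(None)

results = list(iter(outbox.get, None))
thread.join()
```

The only locks involved are the ones inside queue.Queue, which is the
"messaging infrastructure" in the sentence above.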

Which is obviously a far cry from simply saying "threading is easy". If I 
encountered anyone who thought it was easy *in general*, then I would fear any 
threaded code they wrote, because they clearly weren't thinking about the 
problem hard enough ;)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Sat Oct  8 01:31:48 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 08 Oct 2005 09:31:48 +1000
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <di66pi$f3t$1@sea.gmane.org>
References: <4346467D.5010005@iinet.net.au>
	<di5igq$41k$1@sea.gmane.org>	<43466C3B.50704@gmail.com>
	<di66pi$f3t$1@sea.gmane.org>
Message-ID: <43470564.6040903@gmail.com>

Fredrik Lundh wrote:
> Nick Coghlan wrote:
>>However, requiring a decorator to get a slot to work right looks pretty ugly
>>to me, too.
> 
> 
> the whole concept might be perfectly fine on the "this construct corre-
> sponds to this code" level, but if you immediately end up with things that
> are not what they seem, and names that don't mean what they say, either
> the design or the description of it needs work.
> 
>  ("yes, I know you can use this class to manage the context, but it's not
> really a context manager, because it's that method that's a manager, not
> the class itself.  yes, all the information that belongs to the context are
> managed by the class, but that doesn't make... oh, shut up and read the
> PEP")

Heh. OK, my current inclination is to make the new paragraph at the end of 
the "Generator Decorator" section read like this:

4. Add a paragraph to the end of the "Generator Decorator" section:

      If a generator is used to write a context's __with__ method, then
    Python's type machinery will automatically take care of applying this
    decorator. This means that it is just as easy to write a generator-based
    context manager for a context as it is to write a generator-based iterator
    for an iterable (see the decimal.Context example below).

And then update the decimal.Context example to remove the @contextmanager 
decorator.
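
For comparison, here is what a generator-based context manager looks
like with the explicit decorator, written against contextlib as it
exists in today's standard library rather than the __with__ slot
discussed in this thread. The function name extra_precision is made up
for the example.

```python
import decimal
from contextlib import contextmanager


@contextmanager
def extra_precision(extra=2):
    """Temporarily raise the precision of the current decimal context."""
    ctx = decimal.getcontext()
    saved = ctx.prec
    ctx.prec = saved + extra
    try:
        yield ctx            # the body of the with statement runs here
    finally:
        ctx.prec = saved     # restored even if the body raises


before = decimal.getcontext().prec
with extra_precision(3) as ctx:
    inside = ctx.prec
after = decimal.getcontext().prec
```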

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From rhamph at gmail.com  Sat Oct  8 02:12:31 2005
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 7 Oct 2005 18:12:31 -0600
Subject: [Python-Dev] Sandboxed Threads in Python
Message-ID: <aac2c7cb0510071712k7149ff07t6fb97108cff5a79@mail.gmail.com>

Okay, basic principle first.  You start with a sandboxed thread that
has access to nothing.  No modules, no builtins, *nothing*.  This
means it can run without the GIL but it can't do any work.  To make it
do something useful we need to give it two things: first, immutable
types that can be safely accessed without locks, and second a
thread-safe queue to coordinate.  With those you can bring modules and
builtins back into the picture, either by making them immutable or
using a proxy that handles all the methods in a single thread.

Unfortunately python has a problem with immutable types.  For the most
part it uses an honor system, trusting programmers not to make a class
that claims to be immutable yet changes state anyway.  We need more
than that, and "freezing" a dict would work well enough, so it's not
the problem.  The problem is the reference counting, and even if we do
it "safely" all the memory writes just kill performance so we need to
avoid it completely.

Turns out it's quite easy and it doesn't harm performance of existing
code or require modification (but a recompile is necessary).  The idea
is to only use a cyclic garbage collector for cleaning them up, which
means we need to disable the reference counting.  That requires we
modify Py_INCREF and Py_DECREF to be a no-op if ob_refcnt is set to a
magic constant (probably a negative value).
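
The real change would be to the C macros; the following is only a toy
model in Python of the intended semantics. IMMORTAL, Obj, incref and
decref are names invented for this sketch.

```python
IMMORTAL = -1  # hypothetical magic refcount value


class Obj:
    def __init__(self, refcnt=1):
        self.refcnt = refcnt
        self.freed = False


def incref(obj):
    # No memory write at all for objects carrying the magic value.
    if obj.refcnt != IMMORTAL:
        obj.refcnt += 1


def decref(obj):
    if obj.refcnt != IMMORTAL:
        obj.refcnt -= 1
        if obj.refcnt == 0:
            obj.freed = True  # stands in for calling the deallocator


normal, shared = Obj(), Obj(IMMORTAL)
for o in (normal, shared):
    incref(o)
    decref(o)
    decref(o)
```

After the loop the ordinary object has been freed, while the object
with the magic count is untouched, so something other than reference
counting has to reclaim it.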

That's all it takes.  Modify Py_INCREF and Py_DECREFs to check for a
magic constant.  Ahh, but the performance?  See for yourself.

Normal Py_INCREF/Py_DECREF
rhamph at factor:~/src/Python-2.4.1$ ./python Lib/test/pystone.py 500000
Pystone(1.1) time for 500000 passes = 13.34
This machine benchmarks at 37481.3 pystones/second

Modified Py_INCREF/Py_DECREF with magic constant
rhamph at factor:~/src/Python-2.4.1-sandbox$ ./python Lib/test/pystone.py 500000
Pystone(1.1) time for 500000 passes = 13.38
This machine benchmarks at 37369.2 pystones/second

The numbers aren't significantly different.  In fact the second one is
often slightly faster, which shows the difference is smaller than the
statistical noise.

So to sum up, by prohibiting mutable objects from being transferred
between sandboxes we can achieve scalability on multiple CPUs, making
threaded programming easier and more reliable, as a bonus get secure
sandboxes[1], and do that all while maintaining single-threaded
performance and requiring minimal changes to existing C modules
(recompiling).

A "proof of concept" patch to Py_INCREF/Py_DECREF (only demonstrates
performance effects, does not create or utilize any new functionality)
can be found here:
https://sourceforge.net/tracker/index.php?func=detail&aid=1316653&group_id=5470&atid=305470

[1] We need to remove any backdoor methods of getting to mutable
objects outside of your sandbox, which gets us most of the way towards
a restricted execution environment.

--
Adam Olsen, aka Rhamphoryncus

From pje at telecommunity.com  Sat Oct  8 02:51:41 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 07 Oct 2005 20:51:41 -0400
Subject: [Python-Dev] Sandboxed Threads in Python
In-Reply-To: <aac2c7cb0510071712k7149ff07t6fb97108cff5a79@mail.gmail.com>
Message-ID: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com>

At 06:12 PM 10/7/2005 -0600, Adam Olsen wrote:
>Okay, basic principle first.  You start with a sandboxed thread that
>has access to nothing.  No modules, no builtins, *nothing*.  This
>means it can run without the GIL but it can't do any work.

It sure can't.  You need at least the threadstate and a builtins dictionary 
to do any work.


>   To make it
>do something useful we need to give it two things: first, immutable
>types that can be safely accessed without locks,

This is harder than it sounds.  Integers, for example, have a custom 
allocator and a free list, not to mention a small-integer cache.  You would 
somehow need to duplicate all that for each sandbox, or else you have to 
make those integers immortal using your "magic constant".


>Turns out it's quite easy and it doesn't harm performance of existing
>code or require modification (but a recompile is necessary).  The idea
>is to only use a cyclic garbage collector for cleaning them up,

Um, no, actually.  You need a mark-and-sweep GC or something of that 
ilk.  Python's GC only works with objects that *have refcounts*, and it 
works by clearing objects that are in cycles.  The clearing causes 
DECREF-ing, which then causes objects to be freed.  If you have objects 
without refcounts, they would be immortal and utterly unrecoverable.


>which
>means we need to disable the reference counting.  That requires we
>modify Py_INCREF and Py_DECREF to be a no-op if ob_refcnt is set to a
>magic constant (probably a negative value).

And any object with the magic refcount will live *forever*, unless you 
manually deallocate it.



>That's all it takes.  Modify Py_INCREF and Py_DECREFs to check for a
>magic constant.  Ahh, but the performance?  See for yourself.

First, you need to implement a garbage collection scheme that can deal with 
not having refcounts.  Otherwise you're not comparing apples to apples 
here, and your programs will leak like crazy.

Note that implementing a root-based GC for Python is non-trivial, since 
extension modules can store pointers to PyObjects anywhere they 
like.  Further, many Python objects don't even support being tracked by the 
current cycle collector.

So, changing this would probably require a lot of C extensions to be 
rewritten to support the needed API changes for the new garbage collection 
strategy.


>So to sum up, by prohibiting mutable objects from being transferred
>between sandboxes we can achieve scalability on multiple CPUs, making
>threaded programming easier and more reliable, as a bonus get secure
>sandboxes[1], and do that all while maintaining single-threaded
>performance and requiring minimal changes to existing C modules
>(recompiling).

Unfortunately, you have only succeeded in restating the problem, not 
reducing its complexity.  :)  In fact, you may have increased the 
complexity, since now you need a threadsafe garbage collector, too.

Oh, and don't forget - newstyle classes keep weak references to all their 
subclasses, which means for example that every time you subclass 'dict', 
you're modifying the "immutable" 'dict' class.  So, unless you recreate all 
the classes in each sandbox, you're back to needing locking.  And if you 
recreate everything in each sandbox, well, I think you've just reinvented 
"processes".  :)


From ncoghlan at gmail.com  Sat Oct  8 03:06:28 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 08 Oct 2005 11:06:28 +1000
Subject: [Python-Dev] Sandboxed Threads in Python
In-Reply-To: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com>
References: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com>
Message-ID: <43471B94.60104@gmail.com>

Phillip J. Eby wrote:
> Oh, and don't forget - newstyle classes keep weak references to all their 
> subclasses, which means for example that every time you subclass 'dict', 
> you're modifying the "immutable" 'dict' class.  So, unless you recreate all 
> the classes in each sandbox, you're back to needing locking.  And if you 
> recreate everything in each sandbox, well, I think you've just reinvented 
> "processes".  :)

After all, there's a reason Bruce Eckel's recent post about multi-processing 
attracted a fair amount of interest.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From rhamph at gmail.com  Sat Oct  8 03:17:01 2005
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 7 Oct 2005 19:17:01 -0600
Subject: [Python-Dev] Sandboxed Threads in Python
In-Reply-To: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com>
References: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com>
Message-ID: <aac2c7cb0510071817w28b14768t5cedbaac27c354a@mail.gmail.com>

On 10/7/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 06:12 PM 10/7/2005 -0600, Adam Olsen wrote:
> >Okay, basic principle first.  You start with a sandboxed thread that
> >has access to nothing.  No modules, no builtins, *nothing*.  This
> >means it can run without the GIL but it can't do any work.
>
> It sure can't.  You need at least the threadstate and a builtins dictionary
> to do any work.
>
>
> >   To make it
> >do something useful we need to give it two things: first, immutable
> >types that can be safely accessed without locks,
>
> This is harder than it sounds.  Integers, for example, have a custom
> allocator and a free list, not to mention a small-integer cache.  You would
> somehow need to duplicate all that for each sandbox, or else you have to
> make those integers immortal using your "magic constant".

Yes, we'd probably want some per-sandbox allocators.  I'm no expert on
that but I know it can be done.


> >Turns out it's quite easy and it doesn't harm performance of existing
> >code or require modification (but a recompile is necessary).  The idea
> >is to only use a cyclic garbage collector for cleaning them up,
>
> Um, no, actually.  You need a mark-and-sweep GC or something of that
> ilk.  Python's GC only works with objects that *have refcounts*, and it
> works by clearing objects that are in cycles.  The clearing causes
> DECREF-ing, which then causes objects to be freed.  If you have objects
> without refcounts, they would be immortal and utterly unrecoverable.

Perhaps I wasn't clear enough, I was assuming appropriate changes to
the GC would be done.  The important thing is it can be done without
changing the interface that the existing modules use.


> >which
> >means we need to disable the reference counting.  That requires we
> >modify Py_INCREF and Py_DECREF to be a no-op if ob_refcnt is set to a
> >magic constant (probably a negative value).
>
> And any object with the magic refcount will live *forever*, unless you
> manually deallocate it.

See above.


> >That's all it takes.  Modify Py_INCREF and Py_DECREFs to check for a
> >magic constant.  Ahh, but the performance?  See for yourself.
>
> First, you need to implement a garbage collection scheme that can deal with
> not having refcounts.  Otherwise you're not comparing apples to apples
> here, and your programs will leak like crazy.
>
> Note that implementing a root-based GC for Python is non-trivial, since
> extension modules can store pointers to PyObjects anywhere they
> like.  Further, many Python objects don't even support being tracked by the
> current cycle collector.
>
> So, changing this would probably require a lot of C extensions to be
> rewritten to support the needed API changes for the new garbage collection
> strategy.

They only need to be rewritten if you want them to provide an
immutable type that can be transferred between sandboxes.  Short of
that you can make the module object itself immutable, and from it
create mutable instances that are private to each sandbox and not
sharable.

If you make no changes at all the module still works, but is only
usable from the main thread.  That allows us to transition
incrementally.


> >So to sum up, by prohibiting mutable objects from being transferred
> >between sandboxes we can achieve scalability on multiple CPUs, making
> >threaded programming easier and more reliable, as a bonus get secure
> >sandboxes[1], and do that all while maintaining single-threaded
> >performance and requiring minimal changes to existing C modules
> >(recompiling).
>
> Unfortunately, you have only succeeded in restating the problem, not
> reducing its complexity.  :)  In fact, you may have increased the
> complexity, since now you need a threadsafe garbage collector, too.
>
> Oh, and don't forget - newstyle classes keep weak references to all their
> subclasses, which means for example that every time you subclass 'dict',
> you're modifying the "immutable" 'dict' class.  So, unless you recreate all
> the classes in each sandbox, you're back to needing locking.  And if you
> recreate everything in each sandbox, well, I think you've just reinvented
> "processes".  :)

I was aware that weakrefs needed some special handling (I just forgot
to mention it), but I didn't know it was used by subclassing. 
Unfortunately I don't know what purpose it serves so I can't
contemplate how to deal with it.
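
The subclass bookkeeping is visible from Python through
type.__subclasses__(). A short sketch (SandboxDict is a made-up name)
showing that defining a subclass changes what the base class reports:

```python
class SandboxDict(dict):
    pass


# The act of subclassing registered the new class on 'dict' itself.
registered = SandboxDict in dict.__subclasses__()
```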

I need to stress that *only* the new, immutable and "thread-safe
mark-and-sweep" types would be affected by these changes.  Everything
else would continue to exist as it did before, and the benchmark
exists to show they can coexist without killing performance.

--
Adam Olsen, aka Rhamphoryncus

From ncoghlan at gmail.com  Sat Oct  8 03:19:58 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 08 Oct 2005 11:19:58 +1000
Subject: [Python-Dev] PEP 342 suggestion: start(),
 __call__() and  unwind_call() methods
In-Reply-To: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>
Message-ID: <43471EBE.40401@gmail.com>

Phillip J. Eby wrote:
> At 09:50 PM 10/7/2005 +1000, Nick Coghlan wrote:
> 
>> Notice how a non-coroutine callable can be yielded, and it will still 
>> work
>> happily with the scheduler, because the desire to continue execution is
>> indicated by the ContinueIteration exception, rather than by the type 
>> of the
>> returned value.
> 
> 
> Whaaaa?  You raise an exception to indicate the *normal* case?  That 
> seems, um...  well, a Very Bad Idea.

The sheer backwardness of my idea occurred to me after I'd got some sleep :)

> Last, but far from least, as far as I can tell you can implement all of 
> these semantics using PEP 342 as it sits.  That is, it's very simple to 
> make decorators or classes that add those semantics.  I don't see 
> anything that requires them to be part of Python.

Yeah, I've now realised that you can do all of this more simply by doing it 
directly in the scheduler using StopIteration to indicate when the coroutine 
is done, and using yield to indicate "I'm not done yet".

So with a bit of thought, I came up with a scheduler that has all the benefits 
I described, and only uses the existing PEP 342 methods.

When writing a coroutine for this scheduler, you can do 6 things via the 
scheduler:

   1. Raise StopIteration to indicate "I'm done" and return None to your caller
   2. Raise StopIteration with a single argument to return a value other than 
None to your caller
   3. Raise a different exception and have that exception propagate up to your 
caller
   4. Yield None to allow other coroutines to be executed
   5. Yield a coroutine to request a call to that coroutine
   6. Yield a callable to request an asynchronous call using that object

Yielding anything else, or trying to raise StopIteration with more than one 
argument results in a TypeError being raised *at the point of the offending 
yield or raise statement*, rather than taking out the scheduler itself.
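
A much reduced sketch of the same idea, in Python 3 syntax and with a
hypothetical function name run. It covers only returning a value,
yielding None and yielding another coroutine, and it uses "return
value" inside the generators because current Python turns a
StopIteration raised inside a generator into a RuntimeError.

```python
import collections
import types


def run(root):
    """Run `root`; generators it yields are called like subroutines."""
    # Each entry: (coroutine, caller stack, value to send in next)
    queue = collections.deque([(root, (), None)])
    result = None
    while queue:
        coro, stack, value = queue.popleft()
        try:
            callee = coro.send(value)
        except StopIteration as stop:
            if stack:
                # Hand the return value back to the waiting caller.
                caller, prev_stack = stack
                queue.append((caller, prev_stack, stop.value))
            else:
                result = stop.value
            continue
        if callee is None:
            # Plain yield: give other coroutines a turn, then resume.
            queue.append((coro, stack, None))
        elif isinstance(callee, types.GeneratorType):
            # Call request: run the callee with us on its stack.
            queue.append((callee, (coro, stack), None))
        else:
            raise TypeError("Illegal argument to yield")
    return result


def add(a, b):
    yield              # let something else run first
    return a + b


def main():
    total = yield add(1, 2)
    return total * 10


answer = run(main())
```

Unlike the scheduler described above, this sketch lets the TypeError
escape from run() instead of throwing it into the offending coroutine.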

The more I explore the possibilities of PEP 342, the more impressed I am by 
the work that went into it!

Cheers,
Nick.

P.S. Here's the Trampoline scheduler described above:

         import collections
         import sys
         import types

         class Trampoline:
             """Manage communications between coroutines"""

             running = False

             def __init__(self):
                 self.queue = collections.deque()

             def add(self, coroutine):
                 """Request that a coroutine be executed"""
                 self.schedule(coroutine)

             def run(self):
                 result = None
                 self.running = True
                 try:
                     while self.running and self.queue:
                         func = self.queue.popleft()
                         result = func()
                     return result
                 finally:
                     self.running = False

             def stop(self):
                 self.running = False

             def schedule(self, coroutine, stack=(), call_result=None, *exc):
                 # Define the new pseudothread
                 def pseudothread():
                     try:
                         if exc:
                             callee = coroutine.throw(call_result, *exc)
                         else:
                             callee = coroutine.send(call_result)
                     except (StopIteration), ex:
                         # Coroutine finished cleanly
                         if stack:
                             # Send the result to the caller
                             caller = stack[0]
                             prev_stack = stack[1]
                             if len(ex.args) > 1:
                                 # Raise a TypeError in the current coroutine
                                 self.schedule(coroutine, stack,
                                      TypeError,
                                      "Too many arguments to StopIteration"
                                 )
                             elif ex.args:
                                 self.schedule(caller, prev_stack, ex.args[0])
                             else:
                                 self.schedule(caller, prev_stack)
                     except:
                         # Coroutine finished with an exception
                         if stack:
                             # send the error back to the caller
                             caller = stack[0]
                             prev_stack = stack[1]
                             self.schedule(
                                  caller, prev_stack, *sys.exc_info()
                             )
                         else:
                             # Nothing left in this pseudothread to
                             # handle it, let it propagate to the
                             # run loop
                             raise
                     else:
                         # Coroutine isn't finished yet
                         if callee is None:
                             # Reschedule the current coroutine
                             self.schedule(coroutine, stack)
                         elif isinstance(callee, types.GeneratorType):
                             # Requested a call to another coroutine
                             self.schedule(callee, (coroutine,stack))
                         elif callable(callee):
                             # Requested an asynchronous call
                             self._make_async_call(callee, coroutine, stack)
                         else:
                             # Raise a TypeError in the current coroutine
                             self.schedule(coroutine, stack,
                                  TypeError, "Illegal argument to yield"
                             )

                 # Add the new pseudothread to the execution queue
                 self.queue.append(pseudothread)

             def _make_async_call(self, blocking_call, caller, stack):
                 # Assume @threaded decorator takes care of
                 #   - returning a function with a call method which
                 #     kick starts the function execution and returns
                 #     a Future object to give access to the result.
                 #   - farming call out to a physical thread pool
                 #   - keeping the Thread object executing the async
                 #     call alive after this function exits
                 @threaded
                 def async_call():
                     try:
                         result = blocking_call()
                     except:
                         # send the error back to the caller
                         self.schedule(
                             caller, stack, *sys.exc_info()
                         )
                     else:
                         # Send the result back to the caller
                         self.schedule(caller, stack, result)

                 # Start the asynchronous call
                 async_call()
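
The listing above only assumes a @threaded decorator without showing one.
A minimal sketch of what it might look like in present-day Python follows.
The pool size, the module-level pool and the use of concurrent.futures are
all assumptions made for illustration, not part of the original proposal.

```python
import functools
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared pool of physical threads; the size is arbitrary.
_pool = ThreadPoolExecutor(max_workers=4)

def threaded(func):
    """Return a callable that starts func in the pool and returns a Future."""
    @functools.wraps(func)
    def start(*args, **kwargs):
        # submit() kick-starts execution and hands back a Future that
        # gives access to the result (or the raised exception).
        return _pool.submit(func, *args, **kwargs)
    return start

@threaded
def slow_answer():
    return 42

future = slow_answer()      # returns immediately
answer = future.result()    # blocks until the worker thread is done
```

The pool keeps its worker threads alive after the decorated call returns,
which covers the third assumption listed in the comment.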


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From jcarlson at uci.edu  Sat Oct  8 04:05:57 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 07 Oct 2005 19:05:57 -0700
Subject: [Python-Dev] Sandboxed Threads in Python
In-Reply-To: <aac2c7cb0510071817w28b14768t5cedbaac27c354a@mail.gmail.com>
References: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com>
	<aac2c7cb0510071817w28b14768t5cedbaac27c354a@mail.gmail.com>
Message-ID: <20051007183215.28A4.JCARLSON@uci.edu>


Adam Olsen <rhamph at gmail.com> wrote:
> I need to stress that *only* the new, immutable and "thread-safe
> mark-and-sweep" types would be affected by these changes.  Everything
> else would continue to exist as it did before, and the benchmark
> exists to show they can coexist without killing performance.

All the benchmark showed was that checking for a constant in the
refcount during in/decrefing, and not garbage collecting those objects,
didn't adversely affect performance.

As an aside, there's also the ugly bit about being able to guarantee
that an object is immutable.  I personally mutate Python strings in my C
code all the time (long story, not to be discussed here), and if I can
do it now, then any malicious or "inventive" person can do the same in
this "sandboxed thread" Python "of the future".

At least in the case of integers, one could work the tagged integer idea
to bypass the freelist issue that Phillip offered, but in general, I
don't believe there exists a truly immutable type as long as there are C
extensions and/or ctypes.  Further, the work to actually implement a new
garbage collector for Python in order to handle these 'immutable' types
seems to me to be more trouble than it is worth.

 - Josiah


From foom at fuhm.net  Sat Oct  8 04:50:11 2005
From: foom at fuhm.net (James Y Knight)
Date: Fri, 7 Oct 2005 22:50:11 -0400
Subject: [Python-Dev] Proposal for 2.5: Returning values from PEP 342
	enhanced generators
In-Reply-To: <4340C76E.8020502@satori.za.net>
References: <4340C76E.8020502@satori.za.net>
Message-ID: <4047F300-5065-4573-9D39-3117FC3D6E2B@fuhm.net>

On Oct 3, 2005, at 1:53 AM, Piet Delport wrote:
> For generators written in this style, "yield" means "suspend  
> execution of the
> current call until the requested result/resource can be provided", and
> "return" regains its full conventional meaning of "terminate the  
> current call
> with a given result".
>
> The simplest / most straightforward implementation would be for  
> "return Foo"
> to translate to "raise StopIteration, Foo". This is consistent with  
> "return"
> translating to "raise StopIteration", and does not break any existing
> generator code.
>
> (Another way to think about this change is that if a plain  
> StopIteration means
> "the iterator terminated", then a valued StopIteration, by  
> extension, means
> "the iterator terminated with the given value".)
>

It sounds like a nice idea to me. Of course, it is only useful to  
functions calling ".next()" explicitly; in something like a for loop,  
the return value would just be ignored.
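
A short sketch of that difference, assuming the semantics that Python 3.3
later adopted through PEP 380 (a "return value" inside a generator raises
StopIteration carrying that value). The generator name is made up for the
example.

```python
def countdown(n):
    """Yield n, n-1, ..., 1 and then return a final result."""
    while n > 0:
        yield n
        n -= 1
    return "done"

# A for loop (or list()) consumes the yielded items and silently
# discards the value attached to StopIteration.
seen = list(countdown(3))

# Calling next() explicitly lets the caller pick the value up.
gen = countdown(1)
first = next(gen)
try:
    next(gen)
except StopIteration as stop:
    result = stop.value
```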

James

From jcarlson at uci.edu  Sat Oct  8 05:05:20 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 07 Oct 2005 20:05:20 -0700
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <200510072202.39129.ms@cerenity.org>
References: <20051006221436.2892.JCARLSON@uci.edu>
	<200510072202.39129.ms@cerenity.org>
Message-ID: <20051007190739.28AA.JCARLSON@uci.edu>


Michael Sparks <ms at cerenity.org> wrote:
> [ Possibly overlengthy reply. However given a multiple sets of cans of
>   worms... ]
> On Friday 07 October 2005 07:25, Josiah Carlson wrote:
> > One thing I notice is absent from the Kamaelia page is benchmarks.
> 
> That's largely for one simple reason: we haven't done any yet. 

Perfectly reasonable.  If you ever do, I'd be happy to know!


> At least not anything I'd call a benchmark. "There's lies, damn lies,
> statistics and then there's benchmarks."

Indeed.  But it does allow people to get an idea whether a system could
handle their workload.


> The measure I used was simply framerate. This is a fair real value and has a 
> real use - if it drops too low, the system is simply unusable. I measured the 
> framerate before transforming the simplistic game to work well in the 
> framework, and after transforming it. The differences were:
>    * 5% drop in performance/framerate
>    * The ability to reuse much of the code in other systems and environments.

Single process?  Multi-process single machine?  Multiprocess multiple
machine?


> Also from an even more pragmatic perspective, I would say if you're after 
> performance and throughput then I'd say use Twisted, since it's a proven 
> technology.

I'm just curious.  I keep my fingers away from Twisted as a matter of
personal taste (I'm sure it's great, but it's not for me).


> All that said, I'm open to suggestion as to what sort of benchmark you'd like 
> to see. I'm more interested in benchmarks that actually mean something rather 
> than say X is better than Y though.

I wouldn't dream of saying that X was better or worse than Y, unless one
was obvious crap (since it works for you, and you've gotten new users to
use it successfully, that is obviously not the case).

There are five benchmarks that I think would be interesting to see:
1. Send ~500 bytes of data round-trip from process A to process B and
back on the same machine as fast as you can (simulates synchronous
message passing and exposes transfer latencies) a few (tens of)
thousands of times (A doesn't send message i until it has received
message i-1 back from B).

2. Increase the number of processes that round trip with B.  A quick
chart of #senders vs. messages/second would be far more than adequate.

3. Have process B send ~500 byte messages to many listening processes
via whatever is the fastest method (direct connections, multiple
subscriptions to a 'channel', etc.).  Knowing #listeners vs.
messages/second would be cool.

4. Send blocks of data from process A to process B (any size you want).
B immediately discards the data, but you pay attention to how much
data/second B receives (a dual processor machine with proper processor
affinities would be fine here).

5. Start increasing the number of processes that send data to B.  A
quick chart of #senders vs. total bytes/second would be far more than
adequate.


I'm just offering the above as example benchmarks (you certainly don't
need to do them to satisfy me, but I'll be doing those when my tuple
space implementation is closer to being done). They are certainly not
exhaustive, but they do offer a method by which one can measure
latencies, message volume throughput, data volume throughput, and
ability to handle many senders and/or recipients.
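
As a rough illustration of benchmark 1, here is a minimal sketch in
present-day Python. It assumes plain OS pipes to a child interpreter as the
transport; the function name, the echo child and the default counts are
invented for the example and say nothing about how Kamaelia or a tuple
space would move the data.

```python
import subprocess
import sys
import time

# Process B: echo fixed-size messages back until stdin is closed.
ECHO_CHILD = """
import sys
size = int(sys.argv[1])
inp, out = sys.stdin.buffer, sys.stdout.buffer
while True:
    msg = inp.read(size)
    if not msg:
        break
    out.write(msg)
    out.flush()
"""

def round_trips_per_second(count=1000, size=500):
    """Process A: send message i only after message i-1 has come back."""
    payload = b"x" * size
    cmd = [sys.executable, "-c", ECHO_CHILD, str(size)]
    with subprocess.Popen(cmd, stdin=subprocess.PIPE,
                          stdout=subprocess.PIPE) as child:
        start = time.perf_counter()
        for _ in range(count):
            child.stdin.write(payload)
            child.stdin.flush()
            reply = child.stdout.read(size)
            if reply != payload:
                raise RuntimeError("echo mismatch")
        elapsed = time.perf_counter() - start
        child.stdin.close()     # lets the child see EOF and exit
    return count / elapsed
```

Benchmark 2 would start several such senders against one B and chart the
combined rate; that needs a transport B can multiplex, which pipes to a
single child do not provide.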

> [ Network controlled Networked Audio Mixing Matrix ]
> > Very neat.  How much data?  What kind of throughput?  What kinds of
> > latencies?
> 
> For the test system we tested with 3 raw PCM audio data streams. That's
> 3 x 44.1kHz, 16 bit stereo - which is around 4.2Mbit/s of data from the
> network being processed realtime and output back to the network at
> 1.4Mbit/s. So, not huge numbers, but not insignificant amounts of data
> either. I suppose one thing I can take more time with now is to look at
> the specific latency of the mixer. It didn't *appear* to be large however.
> (there appeared to be similar latency in the system with or without the
> mixer)

530 kbytes/second in, 176 kbytes/second out.  Not bad (I imagine you are
using a C library/extension of some sort to do the mixing...perhaps
numarray, Numeric, ...).  How large are the blocks of data that you are
shuffling around at one time?  1, 5, 10, 50, 150 kbytes?
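
For reference, the byte rates follow directly from the quoted stream
format. The sketch below assumes the mixed output is a single stream in
the same format as each input, which matches the quoted 1.4Mbit/s figure.

```python
STREAMS_IN = 3
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2            # stereo

bits_per_stream = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
bits_in = STREAMS_IN * bits_per_stream   # about 4.2 Mbit/s
bytes_in = bits_in // 8                  # about 530 kbytes/s
bytes_out = bits_per_stream // 8         # about 176 kbytes/s
```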

> A more interesting effect we found was dealing with mouse movement in pygame
> where we found that *huge* numbers of messages being sent one at a time and
> processed one at a time (with yields after each) became a huge bottleneck.

I can imagine.

> The reason I like using pygame for these things is because a) it's relatively 
> raw and fast b) games are another often /naturally/ concurrent system. Also 
> it normally allows other senses beyond reading numbers/graphs to kick in when 
> evaluating changes "that looks better/worse", "There's something wrong 
> there".

Indeed.  I should get my fingers into PyGame, but haven't yet due to
other responsibilities.


> > I have two recent posts about the performance and features of a (hacked
> > together) tuple space system 
> 
> Great :-) I'll have a dig around.

Make that 3.


> > The only thing that it is missing is a prioritization mechanism (fifo,
> > numeric priority, etc.), which would get us a job scheduling kernel. Not
> > bad  for a "message passing"/"tuple space"/"IPC" library.   
> 
> Sounds interesting. I'll try and find some time to have a look and have a 
> play. FWIW, we're also missing a prioritisation mechanism right now. Though 
> currently I have Simon Wittber's latest release of Nanothreads on my stack of 
> to look at. I do have a soft spot for Linda type approaches though :-)

I've not yet released anything.  The version I'm working on essentially
indexes tuples in a set of specialized structures to make certain kinds
of matching fast (both insertions and removals are also fast), which has
a particular kind of queue at the 'leaf' (if one were to look at it as a
tree).

Those queues also support listeners which want to be notified about one
or many tuples which happen to match up with the pattern, resulting in
the tuple being consumed by one listener, broadcast to all listeners,
etc.

In the case of no listeners, but someone who just wants one tuple, one
can prioritize tuple fetches based on fifo, numeric priority, lifo, or
whatever other useful semantic that I get around to putting in there for
whatever set of tuples matches it.
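
To make the fetch policies concrete, here is a deliberately naive sketch.
It is not the unreleased implementation described above: the class and
method names are hypothetical, matching is a linear scan rather than an
index, None is assumed to be the wildcard, and the first field is assumed
to hold the numeric priority.

```python
class TinyTupleSpace:
    """Toy tuple space: linear-scan matching, pluggable fetch policy."""

    def __init__(self):
        self._tuples = []           # kept in insertion order

    def put(self, tup):
        self._tuples.append(tup)

    @staticmethod
    def _matches(pattern, tup):
        # None in the pattern matches any value in that position.
        return len(pattern) == len(tup) and all(
            p is None or p == v for p, v in zip(pattern, tup))

    def take(self, pattern, policy="fifo"):
        """Remove and return one matching tuple, or None if none match."""
        matches = [t for t in self._tuples if self._matches(pattern, t)]
        if not matches:
            return None
        if policy == "fifo":
            chosen = matches[0]
        elif policy == "lifo":
            chosen = matches[-1]
        elif policy == "priority":
            chosen = min(matches, key=lambda t: t[0])   # lowest number first
        else:
            raise ValueError("unknown policy: %r" % (policy,))
        self._tuples.remove(chosen)
        return chosen

space = TinyTupleSpace()
for job in [(2, "job", "a"), (1, "job", "b"), (3, "job", "c")]:
    space.put(job)
oldest = space.take((None, "job", None), "fifo")
newest = space.take((None, "job", None), "lifo")
urgent = space.take((None, "job", None), "priority")
```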


 - Josiah


From pje at telecommunity.com  Sat Oct  8 05:17:08 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 07 Oct 2005 23:17:08 -0400
Subject: [Python-Dev] Sandboxed Threads in Python
In-Reply-To: <aac2c7cb0510071817w28b14768t5cedbaac27c354a@mail.gmail.com>
References: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com>
	<5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051007231002.02aae820@mail.telecommunity.com>

At 07:17 PM 10/7/2005 -0600, Adam Olsen wrote:
>On 10/7/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 06:12 PM 10/7/2005 -0600, Adam Olsen wrote:
> > >Turns out it's quite easy and it doesn't harm performance of existing
> > >code or require modification (but a recompile is necessary).  The idea
> > >is to only use a cyclic garbage collector for cleaning them up,
> >
> > Um, no, actually.  You need a mark-and-sweep GC or something of that
> > ilk.  Python's GC only works with objects that *have refcounts*, and it
> > works by clearing objects that are in cycles.  The clearing causes
> > DECREF-ing, which then causes objects to be freed.  If you have objects
> > without refcounts, they would be immortal and utterly unrecoverable.
>
>Perhaps I wasn't clear enough, I was assuming appropriate changes to
>the GC would be done.  The important thing is it can be done without
>changing the interface that the existing modules use.

No, it can't.  See more below.


> > >That's all it takes.  Modify Py_INCREF and Py_DECREFs to check for a
> > >magic constant.  Ahh, but the performance?  See for yourself.
> >
> > First, you need to implement a garbage collection scheme that can deal with
> > not having refcounts.  Otherwise you're not comparing apples to apples
> > here, and your programs will leak like crazy.
> >
> > Note that implementing a root-based GC for Python is non-trivial, since
> > extension modules can store pointers to PyObjects anywhere they
> > like.  Further, many Python objects don't even support being tracked by the
> > current cycle collector.
> >
> > So, changing this would probably require a lot of C extensions to be
> > rewritten to support the needed API changes for the new garbage collection
> > strategy.
>
>They only need to be rewritten if you want them to provide an
>immutable type that can be transferred between sandboxes.

No.  You're missing my point.  If they are able to *reference* these 
objects, then the garbage collector has to know about it, or else it can't 
know when to reclaim them.  Ergo, these objects will leak, or else 
extensions will crash when they refer to the deallocated memory.

In other words, you can't handwave the whole problem away by assuming "a 
garbage collector".  The garbage collector has to actually be able to work, 
and you haven't specified *how* it can work without changing the C API.
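
The dependence of the current collector on refcounts can be observed from
Python itself. The sketch below assumes CPython; the Node class and the
demo function are invented for the example. Two objects in a cycle
survive the loss of their names, and go away only once the cycle
collector clears the cycle and the resulting DECREFs free them.

```python
import gc
import weakref

class Node:
    pass

def cycle_demo():
    gc.disable()                     # keep automatic collection out of the way
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a      # reference cycle: a <-> b
        probe = weakref.ref(a)       # observes a without owning it
        del a, b                     # refcounts stay above zero
        alive_before = probe() is not None
        gc.collect()                 # cycle collector clears the cycle
        alive_after = probe() is not None
    finally:
        gc.enable()
    return alive_before, alive_after

before, after = cycle_demo()
```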


>I was aware that weakrefs needed some special handling (I just forgot
>to mention it), but I didn't know it was used by subclassing.
>Unfortunately I don't know what purpose it serves so I can't
>contemplate how to deal with it.

It allows changes to a supertype's C-level slots to propagate to subclasses.
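
A small sketch of that propagation as seen from Python code, assuming
CPython, where each type tracks its subclasses through weak references.
The class names are made up for the example.

```python
class Base:
    pass

class Child(Base):
    pass

# The supertype knows its live subclasses (held weakly in CPython).
registered = Base.__subclasses__()

def three(self):
    return 3

# Assigning a special method on the supertype updates the matching
# C-level slot, and the change is pushed down to existing subclasses.
Base.__len__ = three
length = len(Child())
```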


From ncoghlan at gmail.com  Sat Oct  8 05:34:41 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 08 Oct 2005 13:34:41 +1000
Subject: [Python-Dev] Sourceforge CVS access
In-Reply-To: <ca471dc20510070914h404045c1r8ab1d88ccb2c64de@mail.gmail.com>
References: <43468417.4000701@iinet.net.au>
	<ca471dc20510070914h404045c1r8ab1d88ccb2c64de@mail.gmail.com>
Message-ID: <43473E51.4010103@gmail.com>

Guido van Rossum wrote:
> I will, if you tell me your sourceforge username.

Sorry, forgot about that little detail ;)

Anyway, it's ncoghlan, same as the gmail account.

Cheers,
Nick.


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From kbk at shore.net  Sat Oct  8 06:34:33 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Sat, 8 Oct 2005 00:34:33 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200510080434.j984YXHG031113@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  341 open ( +4) /  2953 closed ( +6) /  3294 total (+10)
Bugs    :  884 open (-28) /  5321 closed (+43) /  6205 total (+15)
RFE     :  196 open ( +1) /   187 closed ( +0) /   383 total ( +1)

New / Reopened Patches
______________________

Make fcntl work properly on AMD64  (2005-09-30)
       http://python.org/sf/1309352  opened by  Brad Hards

A wait4() implementation  (2005-09-30)
       http://python.org/sf/1309579  opened by  chads

httplib error handling and HTTP/0.9 fix  (2005-10-04)
       http://python.org/sf/1312980  opened by  Yusuke Shinyama

Speedup PyUnicode_DecodeCharmap  (2005-10-05)
       http://python.org/sf/1313939  opened by  Walter Dörwald

os.makedirs - robust against partial path  (2005-10-05)
       http://python.org/sf/1314067  opened by  Jim Jewett

ensure lock is released if exception is raised  (2005-10-06)
       http://python.org/sf/1314396  opened by  Eric Blossom

ToolTip.py: fix main() function  (2005-10-06)
       http://python.org/sf/1315161  opened by  sebastien blanchet

Py_INCREF/Py_DECREF with magic constant demo  (2005-10-07)
       http://python.org/sf/1316653  opened by  Adam Olsen

Patches Closed
______________

pyexpat.c: Two line fix for decoding crash  (2005-09-29)
       http://python.org/sf/1309009  closed by  nnorwitz

patch IDLE to allow running anonymous code in editor window  (2005-05-13)
       http://python.org/sf/1201522  closed by  kbk

BSD-style wait4 implementation  (2004-07-29)
       http://python.org/sf/1000267  closed by  nnorwitz

Patch for (Doc) bug #1219273  (2005-06-25)
       http://python.org/sf/1227568  closed by  nnorwitz

Greatly enhanced webbrowser.py  (2003-06-13)
       http://python.org/sf/754022  closed by  birkenfeld

Patch for [ 1163563 ] Sub threads execute in restricted mode  (2005-05-17)
       http://python.org/sf/1203393  closed by  sf-robot

New / Reopened Bugs
___________________

linecache module returns wrong results  (2005-09-30)
       http://python.org/sf/1309567  opened by  Thomas Heller

2.4.2 make problems  (2005-10-03)
       http://python.org/sf/1311579  opened by  Paul Mothersdill

broken link in sets page  (2005-10-03)
CLOSED http://python.org/sf/1311674  opened by  Fernando Canizo

python.exe 2.4.2 compiled with VS2005 crashes  (2005-10-03)
       http://python.org/sf/1311784  opened by  Peter Klotz

Python breakdown in windows (uses apsw)  (2005-10-03)
CLOSED http://python.org/sf/1311993  opened by  Benjamin Hinrichs

mac_roman codec missing "apple" codepoint  (2005-10-04)
       http://python.org/sf/1313051  opened by  Tony Nelson

urlparse "caches" parses regardless of encoding  (2005-10-04)
       http://python.org/sf/1313119  opened by  Ken Kinder

bisect C replacement doesn't accept named args  (2005-10-04)
CLOSED http://python.org/sf/1313496  opened by  Drew Perttula

Compile fails for configure "--without-threads"  (2005-10-05)
CLOSED http://python.org/sf/1313974  opened by  seidl

Issue in unicode args in logging   (2005-10-05)
CLOSED http://python.org/sf/1314107  reopened by  tungwaiyip

Issue in unicode args in logging   (2005-10-05)
CLOSED http://python.org/sf/1314107  opened by  Wai Yip Tung

crash in longobject (invalid memory access)  (2005-10-05)
CLOSED http://python.org/sf/1314182  opened by  Jon Nelson

logging run into deadlock in some error handling situation  (2005-10-05)
CLOSED http://python.org/sf/1314519  opened by  Wai Yip Tung

Trailing slash redirection for SimpleHTTPServer  (2005-10-05)
       http://python.org/sf/1314572  opened by  Josiah Carlson

inspect.getsourcelines() broken  (2005-10-07)
       http://python.org/sf/1315961  opened by  Walter Dörwald

gzip.GzipFile.seek missing second argument  (2005-10-07)
       http://python.org/sf/1316069  opened by  Neil Schemenauer

Segmentation fault with invalid "coding"  (2005-10-07)
       http://python.org/sf/1316162  opened by  Humberto Diógenes

Bugs Closed
___________

Datagram Socket Timeouts  (2005-09-29)
       http://python.org/sf/1308042  closed by  nnorwitz

Unsatisfied symbols: _PyGILState_NoteThreadState (code)  (2005-09-29)
       http://python.org/sf/1307978  closed by  mwh

subprocess.Popen locks on Cygwin  (2005-09-29)
       http://python.org/sf/1307798  closed by  jlt63

test_posix fails on cygwin  (2005-04-10)
       http://python.org/sf/1180147  closed by  jlt63

can't import thru cygwin symlink  (2005-04-08)
       http://python.org/sf/1179412  closed by  jlt63

popen4/cygwin ssh hangs  (2005-01-13)
       http://python.org/sf/1101756  closed by  jlt63

2.3.3 tests on cygwin  (2003-12-22)
       http://python.org/sf/864374  closed by  jlt63

Datagram Socket Timeouts  (2005-09-28)
       http://python.org/sf/1307357  closed by  nnorwitz

log() on a big number fails on powerpc64  (2005-07-26)
       http://python.org/sf/1245381  closed by  nnorwitz

__getnewargs__ is in pickle docs but not in code  (2005-09-30)
       http://python.org/sf/1309724  closed by  nnorwitz

Win registry problem  (2005-07-15)
       http://python.org/sf/1239120  closed by  birkenfeld

unknown encoding -> MemoryError  (2003-07-17)
       http://python.org/sf/772896  closed by  nnorwitz

'Expression' AST Node not documented  (2005-06-12)
       http://python.org/sf/1219273  closed by  nnorwitz

segfault if redirecting directory  (2004-01-30)
       http://python.org/sf/887946  closed by  nnorwitz

Acroread aborts printing PDF documentation  (2004-05-30)
       http://python.org/sf/963321  closed by  hgolden

Python hangs up on I/O operations on the latest FreeBSD 4.10  (2004-06-09)
       http://python.org/sf/969492  closed by  birkenfeld

empty raise after handled exception  (2004-06-15)
       http://python.org/sf/973103  closed by  nnorwitz

Unhelpful error message when mtime of a module is -1  (2004-06-21)
       http://python.org/sf/976608  closed by  nnorwitz

compile of code with incorrect encoding yields MemoryError  (2004-06-25)
       http://python.org/sf/979739  closed by  nnorwitz

os.access reports true for read-only directories  (2004-07-15)
       http://python.org/sf/991735  closed by  nnorwitz

seg fault when calling bsddb.hashopen()  (2004-07-19)
       http://python.org/sf/994100  closed by  montanaro

os.major() os.minor() example and description change  (2004-08-12)
       http://python.org/sf/1008310  closed by  nnorwitz

test_pep277 fails  (2004-09-16)
       http://python.org/sf/1029561  closed by  nnorwitz

broken link in sets page  (2005-10-03)
       http://python.org/sf/1311674  closed by  fdrake

Python breakdown in windows (uses apsw)  (2005-10-03)
       http://python.org/sf/1311993  closed by  birkenfeld

Missing sk_SK in windows_locale  (2005-07-13)
       http://python.org/sf/1237015  deleted by  luks

bisect C replacement doesn't accept named args  (2005-10-05)
       http://python.org/sf/1313496  closed by  rhettinger

Compile fails for configure "--without-threads"  (2005-10-06)
       http://python.org/sf/1313974  closed by  perky

Issue in unicode args in logging   (2005-10-05)
       http://python.org/sf/1314107  closed by  vsajip

Issue in unicode args in logging   (2005-10-05)
       http://python.org/sf/1314107  closed by  vsajip

crash in longobject (invalid memory access)  (2005-10-05)
       http://python.org/sf/1314182  closed by  tim_one

logging run into deadlock in some error handling situation  (2005-10-06)
       http://python.org/sf/1314519  closed by  vsajip

A command history for the idle interactive shell  (2005-09-23)
       http://python.org/sf/1302267  closed by  kbk

New / Reopened RFE
__________________

Add os.path.relpath  (2005-09-30)
       http://python.org/sf/1309676  opened by  Reinhold Birkenfeld


From pjd at satori.za.net  Sat Oct  8 08:20:30 2005
From: pjd at satori.za.net (Piet Delport)
Date: Sat, 08 Oct 2005 08:20:30 +0200
Subject: [Python-Dev] PEP 342 suggestion: start(),
 __call__() and  unwind_call() methods
In-Reply-To: <43471EBE.40401@gmail.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>
	<43471EBE.40401@gmail.com>
Message-ID: <4347652E.1090705@satori.za.net>

Nick Coghlan wrote:
> Phillip J. Eby wrote:
>> Nick Coghlan wrote:
> 
[...]
> 
>> Last, but far from least, as far as I can tell you can implement all of 
>> these semantics using PEP 342 as it sits.  That is, it's very simple to 
>> make decorators or classes that add those semantics.  I don't see 
>> anything that requires them to be part of Python.
> 
> 
> Yeah, I've now realised that you can do all of this more simply by doing it 
> directly in the scheduler using StopIteration to indicate when the coroutine 
> is done, and using yield to indicate "I'm not done yet".

Earlier this week, I proposed legalizing "return Result" inside a generator,
and making it act like "raise StopIteration( Result )", for exactly this reason.

IMHO, this is an elegant and straightforward extension of the current
semantics of returns inside generators, and is the final step toward making
generator-based concurrent tasks[1] look just like the equivalent synchronous
code (with the only difference, more-or-less, being the need for appropriate
"yield" keywords, and a task runner/scheduler loop).

This change would make a huge difference to the practical usability of these
generator-based tasks.  I think they're much less likely to catch on if you
have to write "raise StopIteration( Result )" (or "_return( Result )") all the
time.
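
A minimal sketch of such a task runner, assuming the "return Result"
semantics proposed here (which Python 3.3 later adopted via PEP 380). The
run() function and the example tasks are invented for illustration; a
real scheduler would also interleave many tasks and forward exceptions.

```python
import types

def run(task):
    """Drive a generator-based task to completion and return its result."""
    callers = []        # suspended tasks waiting on a sub-task
    value = None        # what to send into the task being resumed
    while True:
        try:
            request = task.send(value)
        except StopIteration as stop:
            if not callers:
                return stop.value            # outermost task finished
            # Resume the caller with the sub-task's return value.
            task, value = callers.pop(), stop.value
            continue
        if isinstance(request, types.GeneratorType):
            callers.append(task)             # "call" another task
            task, value = request, None
        else:
            value = None                     # plain yield: just resume

def add(a, b):
    yield               # stand-in for waiting on some resource
    return a + b

def main():
    x = yield add(1, 2)
    y = yield add(x, 10)
    return y

total = run(main())
```

Apart from the "yield" keywords and the run() loop, main() reads like the
equivalent synchronous code.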

[1] a.k.a. coroutines, which I don't think is an accurate name, anymore.

From ncoghlan at iinet.net.au  Sat Oct  8 10:23:57 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Sat, 08 Oct 2005 18:23:57 +1000
Subject: [Python-Dev] test_cmd_line failure on Kubuntu 5.10 with GCC 4.0
Message-ID: <4347821D.1070105@iinet.net.au>

Anyone else seeing any problems with test_cmd_line? I've got a few failures in 
test_cmd_line on Kubuntu 5.10 with GCC 4.0 relating to a missing "\n" line ending.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From martin at v.loewis.de  Sat Oct  8 12:32:00 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 08 Oct 2005 12:32:00 +0200
Subject: [Python-Dev] PythonCore\CurrentVersion
Message-ID: <4347A020.2050008@v.loewis.de>

What happened to the CurrentVersion registry entry documented at

http://www.python.org/windows/python/registry.html

AFAICT, even the python15.wse file did not fill a value in this
entry (perhaps I'm misinterpreting the wse file, though).

So was this ever used? Why is it documented, and who documented it
(unfortunately, registry.html is not in cvs/subversion, either)?

Regards,
Martin

From ms at cerenity.org  Sat Oct  8 12:44:00 2005
From: ms at cerenity.org (Michael Sparks)
Date: Sat, 8 Oct 2005 11:44:00 +0100
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <20051007190739.28AA.JCARLSON@uci.edu>
References: <20051006221436.2892.JCARLSON@uci.edu>
	<200510072202.39129.ms@cerenity.org>
	<20051007190739.28AA.JCARLSON@uci.edu>
Message-ID: <200510081144.02122.ms@cerenity.org>

On Saturday 08 October 2005 04:05, Josiah Carlson wrote:
[ simplistic, informal benchmark of a test optimised version of the system,
 based on bouncing, scaling, rotating sprites around the screen. ]
> Single process?  Multi-process single machine?  Multiprocess multiple
> machine?

Single process, single CPU, not very recent machine (a 600MHz Crusoe-based 
machine). That machine wasn't hardware accelerated though, so was only able 
to handle several dozen sprites before slowing down. The slowdown was due to 
the hardware not being able to keep up with pygame's drawing requests though, 
rather than the framework.

> I'm just offering the above as example benchmarks (you certainly don't
> need to do them to satisfy me, but I'll be doing those when my tuple
> space implementation is closer to being done).

I'll note them as things worth doing - they look like reasonable and
interesting benchmarks. I can think of a few modifications I might make
though. For example in 3 you say "fastest". I might have that as a 3b. 3a
could be "simplest to use/read" or "most likely to pick". Obviously there's
a good chance that's not the fastest. (Could be optimised to be under the
hood I suppose, but that wouldn't be the point of the test.)

> > [ Network controlled Networked Audio Mixing Matrix ]
> I imagine you are using a C library/extension of some sort to do the
> mixing...perhaps numarray, Numeric, ... 

Nope, just plain old python (I'm now using a 1.6GHz Centrino machine
though). My mixing function is particularly naive as well. To me that says
more about python than my code. I did consider using pyrex to wrap (or
write) an optimised version, but there didn't seem to be any need for
it last week (though for a non-prototype something faster would be
nice :).

I'll save responding to the Linda things until I have a chance to read in detail
what you've written. It sounds very promising though - having multiple
approaches to different styles of concurrency that work nicely with each
other safely is always a positive thing IMO.

Thanks for the suggestions and best regards,


Michael.
--
"Though we are not now that which in days of old moved heaven and earth, 
   that which we are, we are: one equal temper of heroic hearts made 
     weak by time and fate but strong in will to strive, to seek, 
          to find and not to yield" -- "Ulysses", Tennyson

From rhamph at gmail.com  Sat Oct  8 14:29:25 2005
From: rhamph at gmail.com (Adam Olsen)
Date: Sat, 8 Oct 2005 06:29:25 -0600
Subject: [Python-Dev] Sandboxed Threads in Python
In-Reply-To: <5.1.1.6.0.20051007231002.02aae820@mail.telecommunity.com>
References: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com>
	<5.1.1.6.0.20051007231002.02aae820@mail.telecommunity.com>
Message-ID: <aac2c7cb0510080529j699cb6d4k46fc4e27ffb64b93@mail.gmail.com>

On 10/7/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 07:17 PM 10/7/2005 -0600, Adam Olsen wrote:
> >On 10/7/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> > > Note that implementing a root-based GC for Python is non-trivial, since
> > > extension modules can store pointers to PyObjects anywhere they
> > > like.  Further, many Python objects don't even support being tracked by the
> > > current cycle collector.
> > >
> > > So, changing this would probably require a lot of C extensions to be
> > > rewritten to support the needed API changes for the new garbage collection
> > > strategy.
> >
> >They only need to be rewritten if you want them to provide an
> >immutable type that can be transferred between sandboxes.
>
> No.  You're missing my point.  If they are able to *reference* these
> objects, then the garbage collector has to know about it, or else it can't
> know when to reclaim them.  Ergo, these objects will leak, or else
> extensions will crash when they refer to the deallocated memory.
>
> In other words, you can't handwave the whole problem away by assuming "a
> garbage collector".  The garbage collector has to actually be able to work,
> and you haven't specified *how* it can work without changing the C API.

Unfortunately the ramifications of your original statement didn't set
in until well after I sent my reply.  You are right, it does make it
impossible without changing the C API, so that much of the idea is
dead.

I wonder if it would be possible to use a wrapper around the immutable
type instead.. something to ponder anyway.


> >I was aware that weakrefs needed some special handling (I just forgot
> >to mention it), but I didn't know it was used by subclassing.
> >Unfortunately I don't know what purpose it serves so I can't
> >contemplate how to deal with it.
>
> It allows changes to a supertype's C-level slots to propagate to subclasses.

I see.  Well, I would have required the supertype to be immutable, so
there couldn't be any changes to the C-level slots.

--
Adam Olsen, aka Rhamphoryncus

From rhamph at gmail.com  Sat Oct  8 14:34:19 2005
From: rhamph at gmail.com (Adam Olsen)
Date: Sat, 8 Oct 2005 06:34:19 -0600
Subject: [Python-Dev] Sandboxed Threads in Python
In-Reply-To: <20051007183215.28A4.JCARLSON@uci.edu>
References: <5.1.1.6.0.20051007203427.02005e30@mail.telecommunity.com>
	<aac2c7cb0510071817w28b14768t5cedbaac27c354a@mail.gmail.com>
	<20051007183215.28A4.JCARLSON@uci.edu>
Message-ID: <aac2c7cb0510080534p971f52ey6f1425d34e53891d@mail.gmail.com>

On 10/7/05, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> Adam Olsen <rhamph at gmail.com> wrote:
> > I need to stress that *only* the new, immutable and "thread-safe
> > mark-and-sweep" types would be affected by these changes.  Everything
> > else would continue to exist as it did before, and the benchmark
> > exists to show they can coexist without killing performance.
>
> All the benchmark showed was that checking for a constant in the
> refcount during in/decrefing, and not garbage collecting those objects,
> didn't adversely affect performance.
>
> As an aside, there's also the ugly bit about being able to guarantee
> that an object is immutable.  I personally mutate Python strings in my C
> code all the time (long story, not to be discussed here), and if I can
> do it now, then any malicious or "inventive" person can do the same in
> this "sandboxed thread" Python "of the future".

Malicious use is hardly a serious concern.  Someone using C code could
just as well crash the interpreter.

Modifying a python string you just created before you expose it to
python code should be fine.  If that's not what you're doing.. I'm not
sure I want to know *wink*


> At least in the case of integers, one could work the tagged integer idea
> to bypass the freelist issue that Phillip offered, but in general, I
> don't believe there exists a truly immutable type as long as there are C
> extensions and/or ctypes.  Further, the work to actually implement a new
> garbage collector for Python in order to handle these 'immutable' types
> seems to me to be more trouble than it is worth.

Maybe.. I'm not convinced.  There's a lot of payback IMO.

--
Adam Olsen, aka Rhamphoryncus

From hyeshik at gmail.com  Sat Oct  8 16:23:06 2005
From: hyeshik at gmail.com (Hye-Shik Chang)
Date: Sat, 8 Oct 2005 23:23:06 +0900
Subject: [Python-Dev] test_cmd_line failure on Kubuntu 5.10 with GCC 4.0
In-Reply-To: <4347821D.1070105@iinet.net.au>
References: <4347821D.1070105@iinet.net.au>
Message-ID: <4f0b69dc0510080723s2585ae2cw23cbfbc71941cf92@mail.gmail.com>

On 10/8/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> Anyone else seeing any problems with test_cmd_line? I've got a few failures in
> test_cmd_line on Kubuntu 5.10 with GCC 4.0 relating to a missing "\n" line ending.
>

Same problem here. (FreeBSD 6.0 with GCC 3.4.4)
In my short inspection, popen2.popen4.read() returned just an empty string.
I'll investigate more this weekend.

Hye-Shik

From ncoghlan at gmail.com  Sat Oct  8 18:18:27 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 09 Oct 2005 02:18:27 +1000
Subject: [Python-Dev] test_cmd_line failure on Kubuntu 5.10 with GCC 4.0
In-Reply-To: <4f0b69dc0510080723s2585ae2cw23cbfbc71941cf92@mail.gmail.com>
References: <4347821D.1070105@iinet.net.au>
	<4f0b69dc0510080723s2585ae2cw23cbfbc71941cf92@mail.gmail.com>
Message-ID: <4347F153.8050904@gmail.com>

Hye-Shik Chang wrote:
> On 10/8/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> 
>>Anyone else seeing any problems with test_cmd_line? I've got a few failures in
>>test_cmd_line on Kubuntu 5.10 with GCC 4.0 relating to a missing "\n" line ending.
>>
> 
> 
> Same problem here. (FreeBSD 6.0 with GCC 3.4.4)
> In my short inspection, popen2.popen4.read() returned just an empty string.

Good to know it isn't just a system quirk, as that's the same behaviour I'm 
getting.

I noticed that the ones which appear to be failing (-E, -O, -S, -Q) are the 
ones which expect an interactive session to open. The tests which pass (-V, 
-h, directory as argument or stdin) are the ones which don't actually start 
the interpreter.

If I explicitly write Ctrl-D to the subprocess's stdin for the tests which 
open the interpreter, then the tests pass. So it looks like some sort of 
buffering problem with standard out not getting flushed before the test tries 
to read the data.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From jcarlson at uci.edu  Sat Oct  8 20:03:31 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 08 Oct 2005 11:03:31 -0700
Subject: [Python-Dev] Sandboxed Threads in Python
In-Reply-To: <aac2c7cb0510080534p971f52ey6f1425d34e53891d@mail.gmail.com>
References: <20051007183215.28A4.JCARLSON@uci.edu>
	<aac2c7cb0510080534p971f52ey6f1425d34e53891d@mail.gmail.com>
Message-ID: <20051008104605.28B3.JCARLSON@uci.edu>


Adam Olsen <rhamph at gmail.com> wrote:
> On 10/7/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> > Adam Olsen <rhamph at gmail.com> wrote:
> > > I need to stress that *only* the new, immutable and "thread-safe
> > > mark-and-sweep" types would be affected by these changes.  Everything
> > > else would continue to exist as it did before, and the benchmark
> > > exists to show they can coexist without killing performance.
> >
> > All the benchmark showed was that checking for a constant in the
> > refcount during in/decrefing, and not garbage collecting those objects,
> > didn't adversely affect performance.
> >
> > As an aside, there's also the ugly bit about being able to guarantee
> > that an object is immutable.  I personally mutate Python strings in my C
> > code all the time (long story, not to be discussed here), and if I can
> > do it now, then any malicious or "inventive" person can do the same in
> > this "sandboxed thread" Python "of the future".
> 
> Malicious use is hardly a serious concern.  Someone using C code could
> just as well crash the interpreter.

Your malicious user is my inventive colleague.  Here's one: performing
zero-copy inter-thread IPC by modifying shared immutables. Attempting to
enforce a policy of "don't do that, it's not supported" is not going to
be effective, especially when doing unsupported things increases speed.

People have known for decades that having anything run in kernel space
beyond the kernel is dangerous, but they still do because it is faster.

I can (but won't) point out examples for days of bad decisions made for
the sake of speed, or policy that has been ignored for the sake of speed
(some of these overlap and some don't).


> Modifying a python string you just created before you expose it to
> python code should be fine.  If that's not what you're doing.. I'm not
> sure I want to know *wink*

You really don't want to know.


> Maybe.. I'm not convinced.  There's a lot of payback IMO.

You've not convinced me either.  Good luck in getting a group together
to make it happen.

 - Josiah


From jcarlson at uci.edu  Sat Oct  8 20:42:32 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 08 Oct 2005 11:42:32 -0700
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <200510081144.02122.ms@cerenity.org>
References: <20051007190739.28AA.JCARLSON@uci.edu>
	<200510081144.02122.ms@cerenity.org>
Message-ID: <20051008100848.28B0.JCARLSON@uci.edu>


Michael Sparks <ms at cerenity.org> wrote:
> On Saturday 08 October 2005 04:05, Josiah Carlson wrote:
> > I'm just offering the above as example benchmarks (you certainly don't
> > need to do them to satisfy me, but I'll be doing those when my tuple
> > space implementation is closer to being done).
> 
> I'll note them as things worth doing - they look reasonable and interesting
> benchmarks. (I can think of a few modifications I might make though. For
> example in 3 you say "fastest". I might have that as a 3b. 3a could be
> "simplest to use/read" or "most likely to pick". Obviously there's a good
> chance that's not the fastest. (Could be optimised to be under the hood
> I suppose, but that wouldn't be the point of the test)

Good point.
3a. Use 1024 byte blocks...
3b. Use whatever makes your system perform best (if you have the time to
tune it)...


> > > [ Network controlled Networked Audio Mixing Matrix ]
> > I imagine you are using a C library/extension of some sort to do the
> > mixing...perhaps numarray, Numeric, ... 
> 
> Nope, just plain old python (I'm now using a 1.6Ghz centrino machine
> though). My mixing function is particularly naive as well. To me that says
> more about python than my code. I did consider using pyrex to wrap (or
> write) an optimised version, but there didn't seem to be any need for
> last week (Though for a non-prototype something faster would be
> nice :).

Indeed.  A quick array.array('h',...) implementation is able to run 7-8x
real time on 3->1 stream mixing on my 1.3 ghz laptop.  Maybe numeric or
numarray isn't necessary.
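
A minimal sketch of what an array.array('h') mixer of this kind might
look like. The function name mix_streams, the equal-length requirement
and the clipping policy are assumptions made for illustration; this is
not the code that was benchmarked above.

```python
from array import array

def mix_streams(streams):
    """Mix equal-length blocks of signed 16-bit samples into one block."""
    if not streams:
        return array('h')
    length = len(streams[0])
    if any(len(s) != length for s in streams):
        raise ValueError("all streams must have the same number of samples")
    mixed = array('h', [0]) * length
    for i in range(length):
        # Sum the i-th sample of every stream, then clip to the int16 range
        # so the result always fits back into an 'h' array.
        total = sum(s[i] for s in streams)
        mixed[i] = max(-32768, min(32767, total))
    return mixed
```

Summing sample by sample in pure Python is about as naive as it gets,
which is the point: it gives a baseline before reaching for Numeric or
numarray.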


> I'll save responding to the Linda things until I have a chance to read in detail
> what you've written. It sounds very promising though - having multiple
> approaches to different styles of concurrency that work nicely with each
> other safely is always a positive thing IMO.
> 
> Thanks for the suggestions and best regards,

Thank you for the interesting and informative discussion.

 - Josiah


From nnorwitz at gmail.com  Sat Oct  8 20:52:52 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sat, 8 Oct 2005 11:52:52 -0700
Subject: [Python-Dev] test_cmd_line failure on Kubuntu 5.10 with GCC 4.0
In-Reply-To: <4347F153.8050904@gmail.com>
References: <4347821D.1070105@iinet.net.au>
	<4f0b69dc0510080723s2585ae2cw23cbfbc71941cf92@mail.gmail.com>
	<4347F153.8050904@gmail.com>
Message-ID: <ee2a432c0510081152q4f13a00ek1abed945475c419f@mail.gmail.com>

On 10/8/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Hye-Shik Chang wrote:
> > On 10/8/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> >
> >>Anyone else seeing any problems with test_cmd_line? I've got a few failures in
> >>test_cmd_line on Kubuntu 5.10 with GCC 4.0 relating to a missing "\n" line ending.
>
> If I explicitly write Ctrl-D to the subprocess's stdin for the tests which
> open the interpreter, then the tests pass. So it looks like some sort of
> buffering problem with standard out not getting flushed before the test tries
> to read the data.

Sorry, that's a new test I added recently.  It works for me on gentoo.
 The test is very simple and shouldn't be hard to fix.  Can you fix
it?  I assume Guido (or someone) added you as a developer.  If not, if
you can give me enough info, I can try to fix it.

n

From rhamph at gmail.com  Sat Oct  8 21:00:38 2005
From: rhamph at gmail.com (Adam Olsen)
Date: Sat, 8 Oct 2005 13:00:38 -0600
Subject: [Python-Dev] Sandboxed Threads in Python
In-Reply-To: <20051008104605.28B3.JCARLSON@uci.edu>
References: <20051007183215.28A4.JCARLSON@uci.edu>
	<aac2c7cb0510080534p971f52ey6f1425d34e53891d@mail.gmail.com>
	<20051008104605.28B3.JCARLSON@uci.edu>
Message-ID: <aac2c7cb0510081200u6d0b132cke66f85f2bb050339@mail.gmail.com>

On 10/8/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> Your malicious user is my inventive colleague.  Here's one: performing
> zero-copy inter-thread IPC by modifying shared immutables. Attempting to
> enforce a policy of "don't do that, it's not supported" is not going to
> be effective, especially when doing unsupported things increases speed.

I actually have no problem with that, so long as you use a custom
type.  It may not technically be immutable but it does provide its own
clearly defined semantics for simultaneous modification, and that's
enough.

Anyway, the idea as I presented it is dead at this point, so I'll
leave it at that.

--
Adam Olsen, aka Rhamphoryncus

From BruceEckel-Python3234 at mailblocks.com  Sat Oct  8 21:14:25 2005
From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel)
Date: Sat, 8 Oct 2005 13:14:25 -0600
Subject: [Python-Dev] Sandboxed Threads in Python
In-Reply-To: <20051008104605.28B3.JCARLSON@uci.edu>
References: <20051007183215.28A4.JCARLSON@uci.edu>
	<aac2c7cb0510080534p971f52ey6f1425d34e53891d@mail.gmail.com>
	<20051008104605.28B3.JCARLSON@uci.edu>
Message-ID: <1377773721.20051008131425@MailBlocks.com>

> I can (but won't) point out examples for days of bad decisions made for
> the sake of speed, or policy that has been ignored for the sake of speed
> (some of these overlap and some don't).

As long as you've entered premature-optimization land, how about
decisions made because it's *assumed* that (A) We must have speed here
and (B) This will make it happen.

My hope would be that we could find a solution that would by default
keep you out of trouble when writing concurrent programs, but provide
a back door if you wanted to do something special. If you choose to go
in the back door, you have to do it consciously and take
responsibility for the outcome.

With Java, in contrast, as soon as you step into the world of
concurrency (even if you step in by accident, which is not uncommon),
lots of rules change. What was an ordinary method call before is now
something risky that can cause great damage. Should I make this
variable volatile? Is an operation atomic? You have to learn a lot of
things all over again.

I don't want that for Python. I'd like the move into concurrency to be
a gentle slope, not a sudden reality-shift. If a novice decides they
want to try game programming with concurrency, I want there to be
training wheels on by default, so that their first experience will be
a successful one, and they can then start learning more features and
ideas incrementally, without trying a feature and suddenly having the
whole thing get weird and crash down on their heads and cause them to
run screaming away ...

I know there have been some technologies that have already been
mentioned on this list and I hope that we can continue to experiment
with and discuss those and also new ideas until we shake out the
fundamental issues and maybe even come up with a list of possible
solutions.


Bruce Eckel    http://www.BruceEckel.com   mailto:BruceEckel-Python3234 at mailblocks.com
Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e"
Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel
Subscribe to my newsletter:
http://www.mindview.net/Newsletter
My schedule can be found at:
http://www.mindview.net/Calendar




From guido at python.org  Sat Oct  8 22:28:26 2005
From: guido at python.org (Guido van Rossum)
Date: Sat, 8 Oct 2005 13:28:26 -0700
Subject: [Python-Dev] test_cmd_line failure on Kubuntu 5.10 with GCC 4.0
In-Reply-To: <ee2a432c0510081152q4f13a00ek1abed945475c419f@mail.gmail.com>
References: <4347821D.1070105@iinet.net.au>
	<4f0b69dc0510080723s2585ae2cw23cbfbc71941cf92@mail.gmail.com>
	<4347F153.8050904@gmail.com>
	<ee2a432c0510081152q4f13a00ek1abed945475c419f@mail.gmail.com>
Message-ID: <ca471dc20510081328g1f489055w3e4f77c8b95d3d11@mail.gmail.com>

On 10/8/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> On 10/8/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > Hye-Shik Chang wrote:
> > > On 10/8/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> > >
> > >>Anyone else seeing any problems with test_cmd_line? I've got a few failures in
> > >>test_cmd_line on Kubuntu 5.10 with GCC 4.0 relating to a missing "\n" line ending.
> >
> > If I explicitly write Ctrl-D to the subprocess's stdin for the tests which
> > open the interpreter, then the tests pass. So it looks like some sort of
> > buffering problem with standard out not getting flushed before the test tries
> > to read the data.
>
> Sorry, that's a new test I added recently.  It works for me on gentoo.
>  The test is very simple and shouldn't be hard to fix.  Can you fix
> it?  I assume Guido (or someone) added you as a developer.  If not, if
> you can give me enough info, I can try to fix it.

I guess Neal's test was expecting at least one line of output from
python at all times, but on most systems it is completely silent when
the input is empty. I fixed the test (also in 2.4) to allow empty
input as well as input ending in \n.
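
A rough sketch of the relaxed check, for illustration only. It is not
the actual test_cmd_line code: the helper names are hypothetical, and it
uses the subprocess module rather than popen2. The interpreter is
started with an already-exhausted stdin, stdout and stderr are merged
(as popen4 does), and the result is accepted if it is either empty or
ends in a newline.

```python
import subprocess
import sys

def run_with_empty_stdin(*options):
    """Run this interpreter with the given options and an empty stdin."""
    proc = subprocess.run(
        [sys.executable, *options],
        input="",                      # stdin hits EOF immediately
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,      # merge the streams, like popen4
        universal_newlines=True,
    )
    return proc.stdout

def output_is_acceptable(data):
    """Accept silence, or output that is terminated by a newline."""
    return data == "" or data.endswith("\n")
```

For example, output_is_acceptable(run_with_empty_stdin("-E")) should
hold whether or not the platform prints anything.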

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jcarlson at uci.edu  Sat Oct  8 22:38:45 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 08 Oct 2005 13:38:45 -0700
Subject: [Python-Dev] Sandboxed Threads in Python
In-Reply-To: <1377773721.20051008131425@MailBlocks.com>
References: <20051008104605.28B3.JCARLSON@uci.edu>
	<1377773721.20051008131425@MailBlocks.com>
Message-ID: <20051008125655.28B8.JCARLSON@uci.edu>


Bruce Eckel <BruceEckel-Python3234 at mailblocks.com> wrote:
> 
> > I can (but won't) point out examples for days of bad decisions made for
> > the sake of speed, or policy that has been ignored for the sake of speed
> > (some of these overlap and some don't).
> 
> As long as you've entered premature-optimization land, how about
> decisions made because it's *assumed* that (A) We must have speed here
> and (B) This will make it happen.

A. From what I understand about sandboxing threads, the point was to
remove the necessity for the GIL, so that every thread can go out on its
own and run on its own processor.

B. Shared memory vs. queues vs. pipes vs. ...  Concurrency without
communication is almost totally worthless.  Historically, shared memory
has tended to be one of the fastest (if not the fastest) communication
methods available.  Whether or not mutable shared memory would be faster
or slower than queues is unknown, but I'm going to stick with my
experience until I am proved wrong by this mythical free threaded system
with immutables.


> My hope would be that we could find a solution that would by default
> keep you out of trouble when writing concurrent programs, but provide
> a back door if you wanted to do something special. If you choose to go
> in the back door, you have to do it consciously and take
> responsibility for the outcome.
> 
> With Java, in contrast, as soon as you step into the world of
> concurrency (even if you step in by accident, which is not uncommon),
> lots of rules change. What was an ordinary method call before is now
> something risky that can cause great damage. Should I make this
> variable volatile? Is an operation atomic? You have to learn a lot of
> things all over again.
>
> I don't want that for Python. I'd like the move into concurrency to be
> a gentle slope, not a sudden reality-shift. If a novice decides they
> want to try game programming with concurrency, I want there to be
> training wheels on by default, so that their first experience will be
> a successful one, and they can then start learning more features and
> ideas incrementally, without trying a feature and suddenly having the
> whole thing get weird and crash down on their heads and cause them to
> run screaming away ...

I don't want to get into an argument here.  While I agree that
concurrent programming should be easier, my experience with MPI (and
other similar systems) and writing parallel algorithms leads me to
believe that even if you have a simple method for communication, even if
you can guarantee that thread/process A won't clobber thread/process B,
actually writing software which executes in some way which made the
effort of making the software concurrent worthwhile, is less than easy.
I'd love to be proved wrong (I'm hoping to do it myself with tuple
spaces).

I do, however, doubt that free threading approaches will be the future
for concurrent programming in CPython.

 - Josiah


From guido at python.org  Sat Oct  8 23:29:21 2005
From: guido at python.org (Guido van Rossum)
Date: Sat, 8 Oct 2005 14:29:21 -0700
Subject: [Python-Dev] PEP 342 suggestion: start(),
	__call__() and unwind_call() methods
In-Reply-To: <4347652E.1090705@satori.za.net>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>
	<43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net>
Message-ID: <ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>

On 10/7/05, Piet Delport <pjd at satori.za.net> wrote:
> Earlier this week, i proposed legalizing "return Result" inside a generator,
> and making it act like "raise StopIteration( Result )", for exactly this reason.
>
> IMHO, this is an elegant and straightforward extension of the current
> semantics of returns inside generators, and is the final step toward making
> generator-based concurrent tasks[1] look just like the equivalent synchronous
> code (with the only difference, more-or-less, being the need for appropriate
> "yield" keywords, and a task runner/scheduler loop).
>
> This change would make a huge difference to the practical usability of these
> generator-based tasks.  I think they're much less likely to catch on if you
> have to write "raise StopIteration( Result )" (or "_return( Result )") all the
> time.
>
> [1] a.k.a. coroutines, which i don't think is an accurate name, anymore.

Before we do this I'd like to see you show some programming examples
that show how this would be used. I'm having a hard time understanding
where you would need this but I realize I haven't used this paradigm
enough to have a good feel for it, so I'm open for examples.

At least this makes more sense than mapping "return X" into "yield X;
return" as someone previously proposed. :)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at gmail.com  Sun Oct  9 03:10:56 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 09 Oct 2005 11:10:56 +1000
Subject: [Python-Dev] PEP 342 suggestion: start(),
 __call__() and unwind_call() methods
In-Reply-To: <ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>	<43471EBE.40401@gmail.com>
	<4347652E.1090705@satori.za.net>
	<ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>
Message-ID: <43486E20.3010908@gmail.com>

Guido van Rossum wrote:
>>This change would make a huge difference to the practical usability of these
>>generator-based tasks.  I think they're much less likely to catch on if you
>>have to write "raise StopIteration( Result )" (or "_return( Result )") all the
>>time.
>>
>>[1] a.k.a. coroutines, which i don't think is an accurate name, anymore.
> 
> 
> Before we do this I'd like to see you show some programming examples
> that show how this would be used. I'm having a hard time understanding
> where you would need this but I realize I haven't used this paradigm
> enough to have a good feel for it, so I'm open for examples.
> 
> At least this makes more sense than mapping "return X" into "yield X;
> return" as someone previously proposed. :)

It would be handy when the generators are being used as true pseudothreads 
with a scheduler like the one I posted earlier in this discussion. It allows 
these pseudothreads to "call" each other by yielding the call as a lambda or 
partial function application that produces a zero-argument callable. The 
called pseudothread can then yield as many times as it wants (either making 
its own calls, or just being a well-behaved member of a cooperatively MT 
environment), and then finally returning the value that the original caller 
requested.
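
A minimal sketch of that calling convention, under stated assumptions.
It is not the scheduler posted earlier in this discussion; the names
run, add and main are hypothetical. It is also written for later Python
versions (3.3 and up), where "return value" inside a generator is legal
and the value arrives as the value attribute of StopIteration.

```python
def run(task):
    """Drive a generator-based pseudothread to completion; return its result."""
    stack = [task]          # call stack of suspended pseudothreads
    to_send = None          # value delivered to the pseudothread on top
    while stack:
        try:
            request = stack[-1].send(to_send)
        except StopIteration as ex:
            # The callee finished: hand its result back to its caller.
            stack.pop()
            to_send = ex.value
            continue
        to_send = None
        if callable(request):
            # A yielded zero-argument callable is a "call" to another task.
            stack.append(request())
        # Anything else is a plain cooperative yield; just resume later.
    return to_send

def add(a, b):
    yield                   # be a well-behaved cooperative task
    return a + b

def main():
    x = yield (lambda: add(1, 2))
    y = yield (lambda: add(x, 10))
    return y
```

Here run(main()) walks through both calls and returns 13. A real
scheduler would interleave several such stacks instead of running one
to completion.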

Using 'return' for this is actually a nice idea, and if we ever do make it 
legal to use 'return' in generators, these are the semantics it should have.

However, I'm not sure it's something we should be adding *right now* as part of 
PEP 342 - writing "raise StopIteration" and "raise StopIteration(result)", and 
saying that a generator includes an implied "raise StopIteration" after its 
last line of code really isn't that difficult to understand, and is completely 
explicit about what is going on.

My basic concern is that I think replacing "raise StopIteration" with "return" 
and "raise StopIteration(EXPR)" with "return EXPR" would actually make such 
code easier to write at the expense of making it harder to *read*, because the 
fact that an exception is being raised is obscured. Consider the following two 
code snippets:

   def function():
      try:
         return
      except StopIteration:
         print "We never get here."

   def generator():
      yield
      try:
         return
      except StopIteration:
         print "But we would get here!"


So, instead of having "return" automatically map to "raise StopIteration" 
inside generators, I'd like to suggest we keep it illegal to use "return" 
inside a generator, and instead add a new attribute "result" to StopIteration 
instances such that the following three conditions hold:

      # Result is None if there is no argument to StopIteration
      try:
         raise StopIteration
      except StopIteration, ex:
         assert ex.result is None

      # Result is the argument if there is exactly one argument
      try:
         raise StopIteration(expr)
      except StopIteration, ex:
         assert ex.result == ex.args[0]

      # Result is the argument tuple if there are multiple arguments
      try:
         raise StopIteration(expr1, expr2)
      except StopIteration, ex:
         assert ex.result == ex.args

This precisely parallels the behaviour of return statements:
   return                 # Call returns None
   return expr            # Call returns expr
   return expr1, expr2    # Call returns (expr1, expr2)


Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Sun Oct  9 03:25:31 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 09 Oct 2005 11:25:31 +1000
Subject: [Python-Dev] Sandboxed Threads in Python
In-Reply-To: <20051008125655.28B8.JCARLSON@uci.edu>
References: <20051008104605.28B3.JCARLSON@uci.edu>	<1377773721.20051008131425@MailBlocks.com>
	<20051008125655.28B8.JCARLSON@uci.edu>
Message-ID: <4348718B.9000502@gmail.com>

Josiah Carlson wrote:
> I do, however, doubt that free threading approaches will be the future
> for concurrent programming in CPython.

Hear, hear! IMO, it's the combination of the GIL with a compiler which never 
decides to change the code execution order under the covers that makes 
threading *not* a pain in Python (so long as one remembers to release the GIL 
around blocking calls to external libraries, and to use Queue.Queue to get 
info between threads wherever possible).
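
A small sketch of the queue-based style, as an illustration rather than
a recommendation of any particular design. The names worker and
square_all are hypothetical, and the module is spelled queue here
because that is its name in Python 3 (it is Queue in 2.x).

```python
import queue
import threading

def worker(inbox, outbox):
    """Square every item from inbox until the None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            break
        outbox.put(item * item)

def square_all(values):
    """Hand the values to a worker thread and collect its answers."""
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    for value in values:
        inbox.put(value)
    inbox.put(None)                 # tell the worker to stop
    thread.join()
    return [outbox.get() for _ in values]
```

The two threads never touch each other's data directly; everything
crosses through the queues, so no explicit locking appears in the code.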

The desire to change that seems to be a classic case of wanting to write 
C/C++/Java/whatever in Python, rather than writing Python in Python.

And thanks to Bruce for starting the recent multi-processing discussion - 
hopefully one day we will have mechanisms in the standard library that scale 
relatively smoothly from PEP 342 based logical threads, through 
threading.Thread based physical threads, to <something> based subprocesses.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From guido at python.org  Sun Oct  9 03:25:39 2005
From: guido at python.org (Guido van Rossum)
Date: Sat, 8 Oct 2005 18:25:39 -0700
Subject: [Python-Dev] PEP 342 suggestion: start(),
	__call__() and unwind_call() methods
In-Reply-To: <43486E20.3010908@gmail.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>
	<43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net>
	<ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>
	<43486E20.3010908@gmail.com>
Message-ID: <ca471dc20510081825l1f8b271du21b04c69d5eadda6@mail.gmail.com>

> Guido van Rossum wrote:
> > Before we do this I'd like to see you show some programming examples
> > that show how this would be used. I'm having a hard time understanding
> > where you would need this but I realize I haven't used this paradigm
> > enough to have a good feel for it, so I'm open for examples.

On 10/8/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> It would be handy when the generators are being used as true pseudothreads
> with a scheduler like the one I posted earlier in this discussion. It allows
> these pseudothreads to "call" each other by yielding the call as a lambda or
> partial function application that produces a zero-argument callable. The
> called pseudothread can then yield as many times as it wants (either making
> its own calls, or just being a well-behaved member of a cooperatively MT
> environment), and then finally returning the value that the original caller
> requested.
>
> Using 'return' for this is actually a nice idea, and if we ever do make it
> legal to use 'return' in generators, these are the semantics it should have.
>
> However, I'm not sure it's something we should be adding *right now* as part of
> PEP 342 - writing "raise StopIteration" and "raise StopIteration(result)", and
> saying that a generator includes an implied "raise StopIteration" after its
> last line of code really isn't that difficult to understand, and is completely
> explicit about what is going on.
>
> My basic concern is that I think replacing "raise StopIteration" with "return"
> and "raise StopIteration(EXPR)" with "return EXPR" would actually make such
> code easier to write at the expense of making it harder to *read*, because the
> fact that an exception is being raised is obscured. Consider the following two
> code snippets:
>
>    def function():
>       try:
>          return
>       except StopIteration:
>          print "We never get here."
>
>    def generator():
>       yield
>       try:
>          return
>       except StopIteration:
>          print "But we would get here!"

Right.  Plus, Piet also remarked that the value is silently ignored
when the generator is used in a for-loop. Since that's likely to be
the majority of generators, I'd worry that accepting "return X" would
increase the occurrence of bugs caused by someone habitually writing
"return X" where they meant "yield X". (Assuming there's another yield
in the generator, otherwise it wouldn't be a generator and the error
would reveal itself very differently.)
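
To illustrate the concern with a sketch: this only runs on later Python
versions (3.3 and up) where "return X" in a generator is accepted, and
the function name countdown is made up for the example. The for-loop
consumes the yielded values and the returned one simply vanishes.

```python
def countdown(n):
    """Yield n, n-1, ..., 1 and then 'return' a final value."""
    while n > 0:
        yield n
        n -= 1
    return "done"           # a for-loop never sees this value

seen = []
for value in countdown(3):
    seen.append(value)
# seen now holds only the yielded values; "done" was silently dropped.
```

Someone who wrote "return X" out of habit where "yield X" was intended
would get no error here, just a missing item.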

> So, instead of having "return" automatically map to "raise StopIteration"
> inside generators, I'd like to suggest we keep it illegal to use "return"
> inside a generator, and instead add a new attribute "result" to StopIteration
> instances such that the following three conditions hold:
>
>       # Result is None if there is no argument to StopIteration
>       try:
>          raise StopIteration
>       except StopIteration, ex:
>          assert ex.result is None
>
>       # Result is the argument if there is exactly one argument
>       try:
>          raise StopIteration(expr)
>       except StopIteration, ex:
>          assert ex.result == ex.args[0]
>
>       # Result is the argument tuple if there are multiple arguments
>       try:
>          raise StopIteration(expr1, expr2)
>       except StopIteration, ex:
>          assert ex.result == ex.args
>
> This precisely parallels the behaviour of return statements:
>    return                 # Call returns None
>    return expr            # Call returns expr
>    return expr1, expr2    # Call returns (expr1, expr2)

This seems a bit overdesigned; I'd expect that the trampoline
scheduler could easily enough pick the args tuple apart to get the
same effect without adding another attribute unique to StopIteration.
I'd like to keep StopIteration really lightweight so it doesn't slow
down its use in other places.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at gmail.com  Sun Oct  9 04:43:34 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 09 Oct 2005 12:43:34 +1000
Subject: [Python-Dev] PEP 342 suggestion: start(),
 __call__() and unwind_call() methods
In-Reply-To: <ca471dc20510081825l1f8b271du21b04c69d5eadda6@mail.gmail.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>	
	<43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net>	
	<ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>	
	<43486E20.3010908@gmail.com>
	<ca471dc20510081825l1f8b271du21b04c69d5eadda6@mail.gmail.com>
Message-ID: <434883D6.80009@gmail.com>

Guido van Rossum wrote:
> On 10/8/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>So, instead of having "return" automatically map to "raise StopIteration"
>>inside generators, I'd like to suggest we keep it illegal to use "return"
>>inside a generator, and instead add a new attribute "result" to StopIteration
>>instances such that the following three conditions hold:
>>
>>      # Result is None if there is no argument to StopIteration
>>      try:
>>         raise StopIteration
>>      except StopIteration, ex:
>>         assert ex.result is None
>>
>>      # Result is the argument if there is exactly one argument
>>      try:
>>         raise StopIteration(expr)
>>      except StopIteration, ex:
>>         assert ex.result == ex.args[0]
>>
>>      # Result is the argument tuple if there are multiple arguments
>>      try:
>>         raise StopIteration(expr1, expr2)
>>      except StopIteration, ex:
>>         assert ex.result == ex.args
>>
>>This precisely parallels the behaviour of return statements:
>>   return                 # Call returns None
>>   return expr            # Call returns expr
>>   return expr1, expr2    # Call returns (expr1, expr2)
> 
> 
> This seems a bit overdesigned; I'd expect that the trampoline
> scheduler could easily enough pick the args tuple apart to get the
> same effect without adding another attribute unique to StopIteration.
> I'd like to keep StopIteration really lightweight so it doesn't slow
> down its use in other places.

True. And it would be easy enough for a framework to have a utility function 
that looked like:

   def getresult(ex):
     args = ex.args
     if not args:
         return None
     elif len(args) == 1:
         return args[0]
     else:
         return args

Although, if StopIteration.result was a read-only property with the above 
definition, wouldn't that give us the benefit of "one obvious way" to return a 
value from a coroutine without imposing any runtime cost on normal use of 
StopIteration to finish an iterator?
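
A minimal sketch of that idea, written in present-day Python 3 syntax
rather than the Python 2.4 of this thread. The names ResultStopIteration
and result_of are hypothetical; the suggestion above is to put the
property on StopIteration itself, which a subclass can only approximate.

```python
class ResultStopIteration(StopIteration):
    """Hypothetical stand-in for a StopIteration with a 'result' property."""

    @property
    def result(self):
        # Same logic as getresult() above, computed only on access,
        # so creating and raising the exception does no extra work.
        args = self.args
        if not args:
            return None
        elif len(args) == 1:
            return args[0]
        else:
            return args


def result_of(*args):
    """Raise and catch the exception, returning its computed result."""
    try:
        raise ResultStopIteration(*args)
    except ResultStopIteration as ex:
        return ex.result
```

Because the value is derived from ex.args on demand, nothing is stored
when the exception is used simply to finish an iterator.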

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From foom at fuhm.net  Sun Oct  9 04:54:10 2005
From: foom at fuhm.net (James Y Knight)
Date: Sat, 8 Oct 2005 22:54:10 -0400
Subject: [Python-Dev] PEP 342 suggestion: start(),
	__call__() and unwind_call() methods
In-Reply-To: <43486E20.3010908@gmail.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>	<43471EBE.40401@gmail.com>
	<4347652E.1090705@satori.za.net>
	<ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>
	<43486E20.3010908@gmail.com>
Message-ID: <953B6108-C621-46C3-B492-6A726595403B@fuhm.net>


On Oct 8, 2005, at 9:10 PM, Nick Coghlan wrote:
> So, instead of having "return" automatically map to "raise  
> StopIteration"
> inside generators, I'd like to suggest we keep it illegal to use  
> "return"
> inside a generator

Only one issue with that: it's _not currently illegal_ to use return
inside a generator. From the outside, it already effectively maps to
"raise StopIteration". But not on the inside, just as you'd expect.
The only proposed change to the semantics is to also allow a value to
be provided with the return.

>    def generator():
>       yield
>       try:
>          return
>       except StopIteration:
>          print "But we would get here!"

 >>> def generator():
...  yield 5
...  try:
...   return
...  except StopIteration:
...   print "But we would get here!"
...
 >>> x=generator()
 >>> x.next()
5
 >>> x.next()
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
StopIteration
 >>>


James


From ncoghlan at gmail.com  Sun Oct  9 05:26:15 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 09 Oct 2005 13:26:15 +1000
Subject: [Python-Dev] PEP 342 suggestion: start(),
 __call__() and unwind_call() methods
In-Reply-To: <953B6108-C621-46C3-B492-6A726595403B@fuhm.net>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>	<43471EBE.40401@gmail.com>
	<4347652E.1090705@satori.za.net>
	<ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>
	<43486E20.3010908@gmail.com>
	<953B6108-C621-46C3-B492-6A726595403B@fuhm.net>
Message-ID: <43488DD7.9010700@gmail.com>

James Y Knight wrote:
> 
> On Oct 8, 2005, at 9:10 PM, Nick Coghlan wrote:
> 
>> So, instead of having "return" automatically map to "raise  
>> StopIteration"
>> inside generators, I'd like to suggest we keep it illegal to use  
>> "return"
>> inside a generator
> 
> 
> Only one issue with that: it's _not currently illegal_ to use return  
> inside a generator. From the view of the outsider, it currently  
> effectively does currently map to "raise StopIteration". But not on  the 
> inside, just like you'd expect to happen. The only proposed  change to 
> the semantics is to also allow a value to be provided with  the return.

Huh. I'd have sworn I'd tried that and it didn't work. Maybe I was using a 
value with the return, and had forgotten the details of the error message.

In that case, I have far less of an objection to the idea - particularly since 
it *does* forcibly terminate the generator's block without triggering any 
exception handlers. I was forgetting that the StopIteration exception is 
actually raised external to the generator code block - it's created by the 
surrounding generator object once the code block terminates.

That means the actual change being proposed is smaller than I thought:
   1. Change the compiler to allow an argument to return inside a generator
   2. Change generator objects to use the value returned by their internal 
code block as the argument to the StopIteration exception they create if the 
block terminates

Note that this would change the behaviour of normal generators - they will 
raise "StopIteration(None)", rather than the current "StopIteration()".

I actually kind of like that - it means that generators become even more like 
functions, with their return value being held in ex.args[0].
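
For illustration only: later Python versions (3.3 onward) accept
"return value" inside a generator, so the behaviour described above can
be sketched in modern syntax. The helper name run() is an assumption
made for this example and is not part of the proposal.

```python
def generator():
    yield 5
    try:
        return "done"          # ends the generator's code block
    except StopIteration:
        # Never reached: the exception is created by the generator
        # object after the block has finished, not inside the block.
        yield "handler ran"


def run(gen):
    """Drive gen by hand; return (yielded values, StopIteration args)."""
    yielded = []
    while True:
        try:
            yielded.append(next(gen))
        except StopIteration as ex:
            return yielded, ex.args
```

The returned value ends up in ex.args[0], and the except clause inside
the generator is never triggered.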

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From mhammond at skippinet.com.au  Sun Oct  9 10:29:27 2005
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Sun, 9 Oct 2005 18:29:27 +1000
Subject: [Python-Dev] PythonCore\CurrentVersion
In-Reply-To: <4347A020.2050008@v.loewis.de>
Message-ID: <DAELJHBGPBHPJKEBGGLNKEBMHDAD.mhammond@skippinet.com.au>

> What happened to the CurrentVersion registry entry documented at
>
> http://www.python.org/windows/python/registry.html
>
> AFAICT, even the python15.wse file did not fill a value in this
> entry (perhaps I'm misinterpreting the wse file, though).
>
> So was this ever used? Why is it documented, and who documented it
> (unfortunately, registry.html is not in cvs/subversion, either)?

I believe I documented it many moons ago.  I don't think CurrentVersion was
ever implemented (or possibly was for a very short time before being
removed).  The "registered modules" concept was misguided and AFAIK is not
used by anyone - IMO it should be deprecated (if not just removed!).
Further, I believe the documentation in the file for PYTHONPATH is, as said
in those docs, out of date, but that the comments in getpathp.c are correct.

Cheers,

Mark


From ncoghlan at gmail.com  Sun Oct  9 15:08:32 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 09 Oct 2005 23:08:32 +1000
Subject: [Python-Dev] New PEP 342 suggestion: result() and allow "return
 with arguments" in generators (was Re: PEP 342 suggestion: start(),
 __call__() and unwind_call() methods)
In-Reply-To: <434883D6.80009@gmail.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>		<43471EBE.40401@gmail.com>
	<4347652E.1090705@satori.za.net>		<ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>		<43486E20.3010908@gmail.com>	<ca471dc20510081825l1f8b271du21b04c69d5eadda6@mail.gmail.com>
	<434883D6.80009@gmail.com>
Message-ID: <43491650.1020704@gmail.com>

Nick Coghlan wrote:
> Although, if StopIteration.result was a read-only property with the above 
> definition, wouldn't that give us the benefit of "one obvious way" to return a 
> value from a coroutine without imposing any runtime cost on normal use of 
> StopIteration to finish an iterator?

Sometimes I miss the obvious. There's a *much*, *much* better place to store 
the return value of a generator than on the StopIteration exception that it 
raises when it finishes. Just save the return value in the *generator*.

And then provide a method on generators that is the functional equivalent of:

     def result(self):
         # Finish the generator if it isn't finished already
         for step in self:
             pass
         return self._result # Return the result saved when the block finished

It doesn't matter that a for loop swallows the StopIteration exception any 
more, because the return value is retrieved directly from the generator.

I also like that this interface could still be used even if the work of 
getting the result is actually farmed off to a separate thread or process 
behind the scenes.
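
A rough sketch of what such a result() method could look like, done
here as a wrapper class in modern Python 3 because generator objects
themselves cannot be extended from Python code. ResultGenerator and
countdown are hypothetical names, and the sketch relies on the later
"return value" support in generators rather than on anything in PEP 342.

```python
class ResultGenerator:
    """Hypothetical wrapper that remembers a generator's return value."""

    _UNSET = object()

    def __init__(self, gen):
        self._gen = gen
        self._result = self._UNSET

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self._gen)
        except StopIteration as ex:
            # Save the value the first time the block finishes, then
            # let the StopIteration propagate as usual.
            if self._result is self._UNSET:
                self._result = ex.value
            raise

    def result(self):
        # Finish the generator if it isn't finished already
        for step in self:
            pass
        return self._result


def countdown(n):
    while n:
        yield n
        n -= 1
    return "liftoff"
```

A for loop over the wrapper still swallows StopIteration, but the value
stays available through result() afterwards.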

Cheers,
Nick.

P.S. Here's what a basic trampoline scheduler without builtin asynchronous 
call support would look like if coroutines could return values directly. The 
bits that it cleans up are marked "NEW":

          import collections
          import sys
          import types

          class Trampoline:
              """Manage communications between coroutines"""

              running = False

              def __init__(self):
                  self.queue = collections.deque()

              def add(self, coroutine):
                  """Request that a coroutine be executed"""
                  self.schedule(coroutine)

              def run(self):
                  result = None
                  self.running = True
                  try:
                      while self.running and self.queue:
                          func = self.queue.popleft()
                          result = func()
                      return result
                  finally:
                      self.running = False

              def stop(self):
                  self.running = False

              def schedule(self, coroutine, stack=(), call_result=None, *exc):
                  # Define the new pseudothread
                  def pseudothread():
                      try:
                          if exc:
                              callee = coroutine.throw(call_result, *exc)
                          else:
                              callee = coroutine.send(call_result)
                      except StopIteration: # NEW: no need to name exception
                          # Coroutine finished cleanly
                          if stack:
                              # Send the result to the caller
                              caller = stack[0]
                              prev_stack = stack[1]
                              # NEW: get result directly from the finished
                              # coroutine (callee is unbound if send() raised)
                              self.schedule(
                                   caller, prev_stack, coroutine.result()
                              )
                      except:
                          # Coroutine finished with an exception
                          if stack:
                              # send the error back to the caller
                              caller = stack[0]
                              prev_stack = stack[1]
                              self.schedule(
                                   caller, prev_stack, *sys.exc_info()
                              )
                          else:
                              # Nothing left in this pseudothread to
                              # handle it, let it propagate to the
                              # run loop
                              raise
                      else:
                          # Coroutine isn't finished yet
                          if callee is None:
                              # Reschedule the current coroutine
                              self.schedule(coroutine, stack)
                          elif isinstance(callee, types.GeneratorType):
                              # Make a call to another coroutine
                              self.schedule(callee, (coroutine,stack))
                          elif callable(callee):
                              # Make a blocking call in a separate thread
                              self.schedule(
                                   threaded(callee), (coroutine,stack)
                              )
                          else:
                              # Raise a TypeError in the current coroutine
                              self.schedule(coroutine, stack,
                                   TypeError, "Illegal argument to yield"
                              )

                  # Add the new pseudothread to the execution queue
                  self.queue.append(pseudothread)


P.P.S. Here's the simple coroutine that threads out a call to support 
asynchronous calls with the above scheduler:

   import threading

   def threaded(func):
       class run_func(threading.Thread):
           def __init__(self):
               super(run_func, self).__init__()
               self.finished = False
           def run(self):
               print "Making call"
               self.result = func()
               self.finished = True
               print "Made call"
       call = run_func()
       call.start()
       print "Started call"
       while not call.finished:
           yield # Not finished yet so reschedule
       print "Finished call"
       return call.result

I tried this out by replacing 'yield' with 'yield None' and 'return 
call.result' with 'print call.result':

Py> x = threaded(lambda: "Hi there!")
Py> x.next()
Started call
Making call
Made call
Py> x.next()
Finished call
Hi there!
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
StopIteration

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From andersjm at inbound.dk  Sun Oct  9 16:00:04 2005
From: andersjm at inbound.dk (Anders J. Munch)
Date: Sun, 09 Oct 2005 16:00:04 +0200
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <43468257.9030008@gmail.com>
References: <9B1795C95533CA46A83BA1EAD4B01030031F0B@flonidanmail.flonidan.net>
	<43468257.9030008@gmail.com>
Message-ID: <43492264.5080403@inbound.dk>

Nick Coghlan wrote:
 >Anders J. Munch wrote:
 >
 >>Note that __with__ and __enter__ could be combined into one with no
 >>loss of functionality:
 >>
 >>        abc,VAR = (EXPR).__with__()
 >>    
 >
 >They can't be combined, because they're invoked on different objects.
 >

Sure they can.  The combined method first does what __with__ would
have done to create abc, and then does whatever abc.__enter__ would
have done.  Since the type of 'abc' is always known to the author of
__with__, this is trivial.

Strictly speaking there's no guarantee that the type of 'abc' is known
to the author of __with__, but I can't imagine an example where that
would not be the case.

 >It would
 >be like trying to combine __iter__() and next() into the same method for
 >iterators. . .

The with-statement needs two pieces of information from the
expression: Which object to bind to the user's variable (VAR) and
which object takes care of block-exit cleanup (abc).  A combined
method would give these two equal standing rather than deriving one
from the other. Nothing ugly about that.

- Anders


From guido at python.org  Sun Oct  9 16:28:29 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 9 Oct 2005 07:28:29 -0700
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <43492264.5080403@inbound.dk>
References: <9B1795C95533CA46A83BA1EAD4B01030031F0B@flonidanmail.flonidan.net>
	<43468257.9030008@gmail.com> <43492264.5080403@inbound.dk>
Message-ID: <ca471dc20510090728m32aa88eev67ce703fa2405332@mail.gmail.com>

On 10/9/05, Anders J. Munch <andersjm at inbound.dk> wrote:
> Nick Coghlan wrote:
>  >Anders J. Munch wrote:
>  >
>  >>Note that __with__ and __enter__ could be combined into one with no
>  >>loss of functionality:
>  >>
>  >>        abc,VAR = (EXPR).__with__()
>  >>
>  >
>  >They can't be combined, because they're invoked on different objects.
>  >
>
> Sure they can.  The combined method first does what __with__ would
> have done to create abc, and then does whatever abc.__enter__ would
> have done.  Since the type of 'abc' is always known to the author of
> __with__, this is trivial.

I'm sure it can be done, but I find this ugly API design. While I'm
not keen on complicating the API, the decimal context example has
convinced me that it's necessary. The separation into __with__ which
asks EXPR for a context manager and __enter__ / __exit__ which handle
try/finally feels right. An API returning a tuple is asking for bugs.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sun Oct  9 16:46:09 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 9 Oct 2005 07:46:09 -0700
Subject: [Python-Dev] New PEP 342 suggestion: result() and allow "return
	with arguments" in generators (was Re: PEP 342 suggestion:
	start(), __call__() and unwind_call() methods)
In-Reply-To: <43491650.1020704@gmail.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>
	<43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net>
	<ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>
	<43486E20.3010908@gmail.com>
	<ca471dc20510081825l1f8b271du21b04c69d5eadda6@mail.gmail.com>
	<434883D6.80009@gmail.com> <43491650.1020704@gmail.com>
Message-ID: <ca471dc20510090746p369845fdgc73a3d4c59478c30@mail.gmail.com>

On 10/9/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Sometimes I miss the obvious. There's a *much*, *much* better place to store
> the return value of a generator than on the StopIteration exception that it
> raises when it finishes. Just save the return value in the *generator*.
>
> And then provide a method on generators that is the functional equivalent of:
>
>      def result(self):
>          # Finish the generator if it isn't finished already
>          for step in self:
>              pass
>          return self._result # Return the result saved when the block finished
>
> It doesn't matter that a for loop swallows the StopIteration exception any
> more, because the return value is retrieved directly from the generator.

Actually, I don't like this at all. It harks back to earlier proposals
where state was stored on the generator (e.g. PEP 288).

> I also like that this interface could still be used even if the work of
> getting the result is actually farmed off to a separate thread or process
> behind the scenes.

That seems an odd use case for generators, better addressed by
creating an explicit helper object when the need exists. I bet that
object will need to exist anyway to hold other information related to
the exchange of information between threads (like a lock or a Queue).

Looking at your example, I have to say that I find the trampoline
example from PEP 342 really hard to understand. It took me several
days to get it after Phillip first put it in the PEP, and that was
after having reconstructed the same functionality independently. (I
have plans to replace or augment it with a different set of examples,
but haven't gotten the time. Old story...) I don't think that
something like that ought to be motivating generator extensions. I
also think that using a thread for async I/O is the wrong approach --
if you wanted to use threads you should be using threads and you
wouldn't be dealing with generators. There's a solution that uses
select() which can handle as many sockets as you want without threads
and without the clumsy polling ("is it ready yet? is it ready yet? is
it ready yet?").
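
As a small sketch of the select()-based approach mentioned above, in
modern Python 3: one call blocks until any of the given sockets is
readable, with no helper threads and no polling loop. The function
names wait_readable and demo are assumptions for this example, and
socket.socketpair() merely stands in for real network connections.

```python
import select
import socket


def wait_readable(socks, timeout=1.0):
    """Block until at least one socket has data, or the timeout expires."""
    readable, _, _ = select.select(socks, [], [], timeout)
    return readable


def demo():
    # Two connected pairs stand in for network connections.
    a_send, a_recv = socket.socketpair()
    b_send, b_recv = socket.socketpair()
    try:
        b_send.sendall(b"ping")             # only the second pair has data
        ready = wait_readable([a_recv, b_recv])
        return [s.recv(16) for s in ready]
    finally:
        for s in (a_send, a_recv, b_send, b_recv):
            s.close()
```

A scheduler built this way would resume whichever coroutine is waiting
on each socket that select() reports as ready.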

I urge you to leave well enough alone. There's room for extensions
after people have built real systems with the raw material provided by
PEP 342 and 343.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jim at zope.com  Sun Oct  9 18:33:12 2005
From: jim at zope.com (Jim Fulton)
Date: Sun, 09 Oct 2005 12:33:12 -0400
Subject: [Python-Dev] defaultproperty (was: Re:  RFC: readproperty)
In-Reply-To: <433BA3CF.1090205@zope.com>
References: <433AA5AC.6040509@zope.com>	<ca471dc205092807477ff2d0f1@mail.gmail.com>
	<433BA3CF.1090205@zope.com>
Message-ID: <43494648.6040904@zope.com>

Based on the discussion, I think I'd go with defaultproperty.

Questions:

- Should this be in builtins, alongside property, or in
   a library module? (Oleg suggested propertytools.)

- Do we need a short PEP?

Jim

Jim Fulton wrote:
> Guido van Rossum wrote:
> 
>>On 9/28/05, Jim Fulton <jim at zope.com> wrote:
>>
> 
> ...
> 
>>I think we need to be real careful with choosing a name -- in Jim's
>>example, *anyone* could assign to Spam().eggs to override the value.
>>The name "readproperty" is too close to "readonlyproperty",
> 
> 
> In fact, property creates read-only properties for new-style classes.
> (I hadn't realized, until reading this thread, that for classic
> classes, you could still set the attribute.)
> 
>  > but
> 
>>read-only it ain't! "Lazy" also doesn't really describe what's going
>>on.
> 
> 
> I agree.
> 
> 
>>I believe some folks use a concept of "memo functions" which resemble
>>this proposal except the notation is different: IIRC a memo function
>>is always invoked as a function, but stores its result in a private
>>instance variable, which it returns upon subsequent calls. This is a
>>common pattern. Jim's proposal differs because the access looks like
>>an attribute, not a method call. Still, perhaps memoproperty would be
>>a possible name.
>>
>>Another way to look at the naming problem is to recognize that the
>>provided function really computes a default value if the attribute
>>isn't already set. So perhaps defaultproperty?
> 
> 
> Works for me.
> 
> Oleg Broytmann wrote:
>  > On Wed, Sep 28, 2005 at 10:16:12AM -0400, Jim Fulton wrote:
>  >
>  >>   class readproperty(object):
>  >
>  > [skip]
>  >
>  >>I do this often enough
>  >
>  >
>  >    I use it since about 2000 often enough under the name CachedAttribute:
>  >
>  > http://cvs.sourceforge.net/viewcvs.py/ppa/qps/qUtils.py
> 
> Steven Bethard wrote:
>  > Jim Fulton wrote:
>  >
> ...
>  > I've also needed behavior like this a few times, but I use a variant
>  > of Scott David Daniel's recipe[1]:
>  >
>  > class _LazyAttribute(object):
> 
> 
> Yup, the Zope 3 sources have something very similar:
> 
> http://svn.zope.org/Zope3/trunk/src/zope/cachedescriptors/property.py?view=markup
> 
> I actually think this does too much.  All it saves me, compared to what I proposed
> is one assignment.  I'd rather make that assignment explicit.
> 
> Anyway, all I wanted with readproperty was a property that implemented only
> __get__, as opposed to property, which implements __get__, __set__, and __delete__.
> 
> I'd be happy to call it readproperty or getproperty or defaultproperty or whatever. :)
> 
> I'd prefer that its semantics stay fairly simple though.
> 
> 
> Jim
> 
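
A minimal sketch of the descriptor being discussed, assuming it does
nothing more than implement __get__ as described in the quoted text.
The body below is an illustration rather than the code originally
posted; Spam and eggs are taken from the example names used earlier in
the thread, and the base attribute is made up.

```python
class defaultproperty(object):
    """Non-data descriptor: computes a default until the attribute is set."""

    def __init__(self, func):
        self.func = func

    def __get__(self, inst, cls=None):
        if inst is None:
            return self             # accessed on the class itself
        return self.func(inst)


class Spam(object):
    @defaultproperty
    def eggs(self):
        return 2 * self.base

    def __init__(self, base):
        self.base = base
```

Since there is no __set__ or __delete__, an assignment to the instance
attribute simply shadows the computed default, and deleting that
attribute brings the default back.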


-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From solipsis at pitrou.net  Sun Oct  9 21:02:16 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 09 Oct 2005 21:02:16 +0200
Subject: [Python-Dev] async IO and helper threads
In-Reply-To: <ca471dc20510090746p369845fdgc73a3d4c59478c30@mail.gmail.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>
	<43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net>
	<ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>
	<43486E20.3010908@gmail.com>
	<ca471dc20510081825l1f8b271du21b04c69d5eadda6@mail.gmail.com>
	<434883D6.80009@gmail.com> <43491650.1020704@gmail.com>
	<ca471dc20510090746p369845fdgc73a3d4c59478c30@mail.gmail.com>
Message-ID: <1128884536.6142.10.camel@fsol>

On Sunday 09 October 2005 at 07:46 -0700, Guido van Rossum wrote:
> I
> also think that using a thread for async I/O is the wrong approach --
> if you wanted to use threads you should be using threads and you
> wouldn't be dealing with generators. There's a solution that uses
> select() which can handle as many sockets as you want without threads
> and without the clumsy polling

select() works with sockets. But nothing else if you want to stay
cross-platform, so async file IO and other things remain open questions.
By the way, you don't need clumsy polling to wait for helper threads ;)
You can just use a condition variable (threading.Condition) from the
threading module (or something else with the same semantics).
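
A short sketch of waiting for a helper thread without polling, in
modern Python 3 and using threading.Condition. The function name
call_in_helper_thread and the state dictionary are assumptions made for
this example; it is not taken from the scheduler linked below.

```python
import threading


def call_in_helper_thread(func):
    """Run func() in a helper thread and block until it has finished."""
    cond = threading.Condition()
    state = {"done": False, "result": None}

    def worker():
        value = func()
        with cond:
            state["result"] = value
            state["done"] = True
            cond.notify_all()       # wake the waiter; no polling loop needed

    threading.Thread(target=worker).start()
    with cond:
        # wait() releases the lock while sleeping and re-acquires it on wake-up
        while not state["done"]:
            cond.wait()
    return state["result"]
```

The sketch does not handle an exception raised by func(); a real
scheduler would need to pass that back to the waiter as well.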


BTW, I'm not arguing at all for the extension proposal. Integrating
async stuff into generators does not need an API extension IMO. I'm
already doing it in my scheduler.
An example which just waits for an external command to finish and
periodically spins a character in the meantime:
http://svn.berlios.de/viewcvs/tasklets/trunk/examples/popen1.py?view=markup 
The scheduler code is here:
http://svn.berlios.de/viewcvs/tasklets/trunk/softlets/core/switcher.py?view=markup

Regards

Antoine.



From greg.ewing at canterbury.ac.nz  Mon Oct 10 02:33:42 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 10 Oct 2005 13:33:42 +1300
Subject: [Python-Dev] Removing the block stack (was Re: PEP 343 and
 __with__)
In-Reply-To: <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com>
References: <5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com>
Message-ID: <4349B6E6.4020804@canterbury.ac.nz>

Phillip J. Eby wrote:

> Clearly, the cost of function calls in Python lies somewhere else, and I'd 
> probably look next at parameter tuple allocation,

For simple calls where there aren't any *args or other
such complications, it seems like it should be possible
to just copy the args from the calling frame straight
into the called one.

Or is this already done these days?

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From guido at python.org  Mon Oct 10 02:35:55 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 9 Oct 2005 17:35:55 -0700
Subject: [Python-Dev] defaultproperty (was: Re: RFC: readproperty)
In-Reply-To: <43494648.6040904@zope.com>
References: <433AA5AC.6040509@zope.com>
	<ca471dc205092807477ff2d0f1@mail.gmail.com>
	<433BA3CF.1090205@zope.com> <43494648.6040904@zope.com>
Message-ID: <ca471dc20510091735h7a6e871ai5bde26b48fc2673b@mail.gmail.com>

On 10/9/05, Jim Fulton <jim at zope.com> wrote:
> Based on the discussion, I think I'd go with defaultproperty.

Great.

> Questions:
>
> - Should this be in builtins, alongside property, or in
>    a library module? (Oleg suggested propertytools.)
>
> - Do we need a short PEP?

I think so. From the responses I'd say there's at most lukewarm
interest (including from me). You might also want to drop it and just
add it to your personal (or Zope's) library.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Mon Oct 10 03:18:30 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 09 Oct 2005 21:18:30 -0400
Subject: [Python-Dev] Removing the block stack (was Re: PEP 343 and
 __with__)
In-Reply-To: <4349B6E6.4020804@canterbury.ac.nz>
References: <5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051009211304.01f30270@mail.telecommunity.com>

At 01:33 PM 10/10/2005 +1300, Greg Ewing wrote:
>Phillip J. Eby wrote:
>
> > Clearly, the cost of function calls in Python lies somewhere else, and I'd
> > probably look next at parameter tuple allocation,
>
>For simple calls where there aren't any *args or other
>such complications, it seems like it should be possible
>to just copy the args from the calling frame straight
>into the called one.
>
>Or is this already done these days?

It's already done, if the number of arguments matches, the code flags are 
just so, etc.


From greg.ewing at canterbury.ac.nz  Mon Oct 10 04:43:20 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 10 Oct 2005 15:43:20 +1300
Subject: [Python-Dev] New PEP 342 suggestion: result() and allow "return
 with arguments" in generators (was Re: PEP 342 suggestion: start(),
 __call__() and unwind_call() methods)
In-Reply-To: <43491650.1020704@gmail.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>
	<43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net>
	<ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>
	<43486E20.3010908@gmail.com>
	<ca471dc20510081825l1f8b271du21b04c69d5eadda6@mail.gmail.com>
	<434883D6.80009@gmail.com> <43491650.1020704@gmail.com>
Message-ID: <4349D548.3030000@canterbury.ac.nz>

Nick Coghlan wrote:

> Sometimes I miss the obvious. There's a *much*, *much* better place to store 
> the return value of a generator than on the StopIteration exception that it 
> raises when it finishes. Just save the return value in the *generator*.

I'm not convinced that this is better, because it would
make value-returning something specific to generators.

On the other hand, raising StopIteration(value) is something
that any iterator can easily do, whether it's implemented
as a generator, a Python class, a C type, or whatever.

Besides, it doesn't smell right to me -- sort of like returning
a value from a function by storing it in a global rather than
using a return statement.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Mon Oct 10 04:43:31 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 10 Oct 2005 15:43:31 +1300
Subject: [Python-Dev] PEP 342 suggestion: start(),
 __call__() and unwind_call() methods
In-Reply-To: <ca471dc20510081825l1f8b271du21b04c69d5eadda6@mail.gmail.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>
	<43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net>
	<ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>
	<43486E20.3010908@gmail.com>
	<ca471dc20510081825l1f8b271du21b04c69d5eadda6@mail.gmail.com>
Message-ID: <4349D553.1060909@canterbury.ac.nz>

Guido van Rossum wrote:

> Plus, Piet also remarked that the value is silently ignored
> when the generator is used in a for-loop. ... I'd worry that accepting
 > "return X" would increase the occurrence of bugs caused by someone
 > habitually writing "return X" where they meant "yield X".

Then have for-loops raise an exception if they get a
StopIteration with something other than None as an
argument.

> I'd like to keep StopIteration really lightweight so it doesn't slow
> down its use in other places.

You could leave StopIteration itself alone altogether
and have a subclass StopIterationWithValue for returning
things. This would make the for-loop situation even safer,
since then you could distinguish between falling off the
end of a generator and executing 'return None' inside it.
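
A rough sketch of how that could fit together, in modern Python 3.
StopIterationWithValue is the subclass name suggested above; strict_for
and Returning are hypothetical helpers that emulate a for-loop and a
value-returning iterator, since the real for statement cannot be
changed from Python code.

```python
class StopIterationWithValue(StopIteration):
    """Hypothetical subclass carrying a coroutine's return value."""


def strict_for(iterator, body):
    """Emulate a for-loop that refuses to discard a returned value."""
    while True:
        try:
            item = next(iterator)
        except StopIterationWithValue as ex:
            raise TypeError("for-loop would discard %r" % (ex.args,))
        except StopIteration:
            return                  # ordinary exhaustion
        body(item)


class Returning:
    """Hand-written iterator that finishes with a value."""

    def __init__(self, items, value):
        self._items = iter(items)
        self._value = value

    def __iter__(self):
        return self

    def __next__(self):
        for item in self._items:
            return item             # hand out the next remaining item
        raise StopIterationWithValue(self._value)
```

Because the subclass is a distinct type, even an explicit value of None
is distinguishable from an iterator that simply ran out of items.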

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Mon Oct 10 04:44:12 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 10 Oct 2005 15:44:12 +1300
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <ca471dc20510071034y39b3facpe1d43e34b11e69db@mail.gmail.com>
References: <20051007172237.GA13288@localhost.localdomain>
	<ca471dc20510071034y39b3facpe1d43e34b11e69db@mail.gmail.com>
Message-ID: <4349D57C.7010509@canterbury.ac.nz>

Guido van Rossum wrote:

> I personally think this is adequately handled by writing:
> 
>   (first, second), rest = something[:2], something[2:]

That's less than satisfying because it violates DRY
three times (once for mentioning 'something' twice,
once for mentioning the index twice, and once for
needing to make sure the index agrees with the number
of items on the LHS).
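
The two spellings can be put side by side. This is only a sketch: it assumes
the starred-target syntax being proposed in this thread, which runs on
Python 3 but not on the 2.x releases current at the time, and the variable
names are made up for the comparison.

```python
something = ["first", "second", "third", "fourth"]

# Slicing form: 'something' and the index 2 each appear twice, and the
# index has to match the number of names on the left-hand side.
(first, second), rest = something[:2], something[2:]

# Starred form: each piece of information is written once.
head1, head2, *tail = something
```

For a list both forms bind the same values; for a tuple the slice stays a
tuple while the starred target always collects a list.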

> Argument lists are not tuples [*] and features of argument lists
> should not be confused with features of tuple unpackings.

I'm aware of the differences, but I still see a strong
similarity where this particular feature is concerned.
The pattern of thinking is the same: "I want to deal
with the first n of these things individually, and the
rest collectively."

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From fdrake at acm.org  Mon Oct 10 05:04:58 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sun, 9 Oct 2005 23:04:58 -0400
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <4349D57C.7010509@canterbury.ac.nz>
References: <20051007172237.GA13288@localhost.localdomain>
	<ca471dc20510071034y39b3facpe1d43e34b11e69db@mail.gmail.com>
	<4349D57C.7010509@canterbury.ac.nz>
Message-ID: <200510092304.58638.fdrake@acm.org>

On Sunday 09 October 2005 22:44, Greg Ewing wrote:
 > I'm aware of the differences, but I still see a strong
 > similarity where this particular feature is concerned.
 > The pattern of thinking is the same: "I want to deal
 > with the first n of these things individually, and the
 > rest collectively."

Well stated.  I'm in complete agreement on this matter.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From ncoghlan at gmail.com  Mon Oct 10 00:28:14 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 10 Oct 2005 08:28:14 +1000
Subject: [Python-Dev] defaultproperty
In-Reply-To: <43494648.6040904@zope.com>
References: <433AA5AC.6040509@zope.com>	<ca471dc205092807477ff2d0f1@mail.gmail.com>	<433BA3CF.1090205@zope.com>
	<43494648.6040904@zope.com>
Message-ID: <4349997E.9010208@gmail.com>

Jim Fulton wrote:
> Based on the discussion, I think I'd go with defaultproperty.
> 
> Questions:
> 
> - Should this be in builtins, alongside property, or in
>    a library module? (Oleg suggested propertytools.)
> 
> - Do we need a short PEP?

The much-discussed never-created decorators module, perhaps?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ironfroggy at gmail.com  Mon Oct 10 07:35:55 2005
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Mon, 10 Oct 2005 01:35:55 -0400
Subject: [Python-Dev] Removing the block stack (was Re: PEP 343 and
	__with__)
In-Reply-To: <5.1.1.6.0.20051006024517.01f6f0d0@mail.telecommunity.com>
References: <5.1.1.6.0.20051003125254.01f95e50@mail.telecommunity.com>
	<5.1.1.6.0.20051003151012.01f95ca0@mail.telecommunity.com>
	<2mu0fxekdz.fsf@starship.python.net>
	<5.1.1.6.0.20051005191114.01f6f018@mail.telecommunity.com>
	<5.1.1.6.0.20051006024517.01f6f0d0@mail.telecommunity.com>
Message-ID: <76fd5acf0510092235u82c5a32vb436e3e10a4118cf@mail.gmail.com>

On 10/6/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 10:09 PM 10/5/2005 -0700, Neal Norwitz wrote:
> >The general idea is to allocate the stack in one big hunk and just
> >walk up/down it as functions are called/returned.  This only means
> >incrementing or decrementing pointers.  This should allow us to avoid
> >a bunch of copying and tuple creation/destruction.  Frames would
> >hopefully be the same size which would help.  Note that even though
> >there is a free list for frames, there could still be
> >PyObject_GC_Resize()s often (or unused memory).  With my idea,
> >hopefully there would be better memory locality, which could speed
> >things up.
>
> Yeah, unfortunately for your idea, generators would have to copy off bits
> of the stack and then copy them back in, making generators slower.  If it
> weren't for that part, the idea would probably be a good one, as arguments,
> locals, cells, and the block and value stacks could all be handled that
> way, with the compiler treating all operations as base-pointer offsets,
> thereby eliminating lots of more-complex pointer management in ceval.c and
> frameobject.c.

If we had these separate stacks for each thread, would it be possible
to also create a stack for generator calls? The current call
operations could possibly do a check to see if the function being
called is a generator (if they don't have a generator bit, could they,
to speed this up?). This generator-specific stack would be used for
the generator's frame and any calls it makes on each iteration. This
may pose the threat of a bottleneck, allocating a new stack in the heap
for every generator call, but generators are generally iterated more
than created and the stacks could be pooled, of course.

I don't know as much as I'd like about the CPython internals, so I'm
just throwing this out there for commenting by those in the know.

From ironfroggy at gmail.com  Mon Oct 10 07:47:32 2005
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Mon, 10 Oct 2005 01:47:32 -0400
Subject: [Python-Dev] Fwd:  defaultproperty
In-Reply-To: <76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com>
References: <433AA5AC.6040509@zope.com>
	<ca471dc205092807477ff2d0f1@mail.gmail.com>
	<433BA3CF.1090205@zope.com> <43494648.6040904@zope.com>
	<4349997E.9010208@gmail.com>
	<76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com>
Message-ID: <76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com>

Sorry, Nick. GMail, for some reason, doesn't follow the reply-to
properly for python-dev. Forwarding to list now...

On 10/9/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Jim Fulton wrote:
> > Based on the discussion, I think I'd go with defaultproperty.
> >
> > Questions:
> >
> > - Should this be in builtins, alongside property, or in
> >    a library module? (Oleg suggested propertytools.)
> >
> > - Do we need a short PEP?
>
> The much-discussed never-created decorators module, perhaps?
>
> Cheers,
> Nick.

Never created for a reason? Lumping things together for having
similar usage semantics, but unrelated purposes, might be something to
avoid and maybe that's why it hasn't happened yet for decorators. If
ever there was a makethreadsafe decorator, it should go in the thread
module, etc. I mean, come on, it's like making a module just to store a
bunch of unrelated types just to lump them together because they're
types. Who wants that?

From fredrik at pythonware.com  Mon Oct 10 10:24:47 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 10 Oct 2005 10:24:47 +0200
Subject: [Python-Dev] defaultproperty
References: <433AA5AC.6040509@zope.com><ca471dc205092807477ff2d0f1@mail.gmail.com><433BA3CF.1090205@zope.com>
	<43494648.6040904@zope.com><4349997E.9010208@gmail.com><76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com>
	<76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com>
Message-ID: <did8gh$nch$1@sea.gmane.org>

Calvin Spealman wrote:

> I mean, come on, it's like making a module just to store a bunch of
> unrelated types just to lump them together because they're types.

import types

</F>




From ncoghlan at gmail.com  Mon Oct 10 11:02:57 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 10 Oct 2005 19:02:57 +1000
Subject: [Python-Dev] New PEP 342 suggestion: result() and allow "return
 with arguments" in generators (was Re: PEP 342 suggestion: start(),
 __call__() and unwind_call() methods)
In-Reply-To: <4349D548.3030000@canterbury.ac.nz>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>	<43471EBE.40401@gmail.com>
	<4347652E.1090705@satori.za.net>	<ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>	<43486E20.3010908@gmail.com>	<ca471dc20510081825l1f8b271du21b04c69d5eadda6@mail.gmail.com>	<434883D6.80009@gmail.com>
	<43491650.1020704@gmail.com> <4349D548.3030000@canterbury.ac.nz>
Message-ID: <434A2E41.7040903@gmail.com>

Greg Ewing wrote:
> Nick Coghlan wrote:
> 
> 
>>Sometimes I miss the obvious. There's a *much*, *much* better place to store 
>>the return value of a generator than on the StopIteration exception that it 
>>raises when it finishes. Just save the return value in the *generator*.
> 
> 
> I'm not convinced that this is better, because it would
> make value-returning something specific to generators.
> 
> On the other hand, raising StopIteration(value) is something
> that any iterator can easily do, whether it's implemented
> as a generator, a Python class, a C type, or whatever.
> 
> Besides, it doesn't smell right to me -- sort of like returning
> a value from a function by storing it in a global rather than
> using a return statement.

Yeah, the various responses have persuaded me that having generators resemble 
threads in that they don't have a defined "return value" isn't a bad thing at all.

Although that means I've gone all the way back to preferring the status quo - 
if you want to pass data back from a generator when it terminates, just use 
StopIteration(result).
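
A sketch of what passing data back through StopIteration looks like in
current Python 3, where a "return" with a value inside a generator is turned
into StopIteration carrying that value. That behaviour postdates this thread,
and summing and drain are hypothetical helper names.

```python
def summing(values):
    """Yield each value, then hand the running total back on exit."""
    total = 0
    for v in values:
        total += v
        yield v
    return total  # surfaces as the 'value' attribute of StopIteration


def drain(gen):
    """Consume a generator by hand so the final value is not lost."""
    items = []
    while True:
        try:
            items.append(next(gen))
        except StopIteration as stop:
            return items, stop.value
```

A for-loop or list() over summing(...) silently drops the total, which is the
hazard raised earlier in the thread; only a caller that catches StopIteration
itself gets to see it.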

I'm starting to think we want to let PEP 342 bake for at least one release 
cycle before deciding what (if any) additional behaviour should be added to 
generators.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Mon Oct 10 11:21:56 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 10 Oct 2005 19:21:56 +1000
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <200510092304.58638.fdrake@acm.org>
References: <20051007172237.GA13288@localhost.localdomain>	<ca471dc20510071034y39b3facpe1d43e34b11e69db@mail.gmail.com>	<4349D57C.7010509@canterbury.ac.nz>
	<200510092304.58638.fdrake@acm.org>
Message-ID: <434A32B4.1050709@gmail.com>

Fred L. Drake, Jr. wrote:
> On Sunday 09 October 2005 22:44, Greg Ewing wrote:
>  > I'm aware of the differences, but I still see a strong
>  > similarity where this particular feature is concerned.
>  > The pattern of thinking is the same: "I want to deal
>  > with the first n of these things individually, and the
>  > rest collectively."
> 
> Well stated.  I'm in complete agreement on this matter.

It also works for situations where "the first n items are mandatory, the rest 
are optional". This usage was brought up in the context of a basic line 
interpreter:

   cmd, *args = input.split()

Another usage is to have a Python function which doesn't support keywords for 
its positional arguments (to avoid namespace clashes in the keyword dict), but 
can still unpack the mandatory arguments easily:

   def func(*args, **kwds):
       arg1, arg2, *rest = args # Unpack the positional arguments

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From tzot at mediconsa.com  Mon Oct 10 13:25:40 2005
From: tzot at mediconsa.com (Christos Georgiou)
Date: Mon, 10 Oct 2005 14:25:40 +0300
Subject: [Python-Dev] PEP 3000 and exec
Message-ID: <didj3k$o0r$1@sea.gmane.org>

This might be minor-- but I didn't see anyone mentioning it so far.  If 
`exec` functionality is to be provided, then I think it still should be a 
keyword for the parser to know; currently bytecode generation is affected if 
`exec` is present.  Even if that changes for Python 3k (we don't know yet), 
the paragraph for exec should be annotated with a note about this issue. 



From jim at zope.com  Mon Oct 10 14:25:59 2005
From: jim at zope.com (Jim Fulton)
Date: Mon, 10 Oct 2005 08:25:59 -0400
Subject: [Python-Dev] defaultproperty
In-Reply-To: <ca471dc20510091735h7a6e871ai5bde26b48fc2673b@mail.gmail.com>
References: <433AA5AC.6040509@zope.com>	
	<ca471dc205092807477ff2d0f1@mail.gmail.com>	
	<433BA3CF.1090205@zope.com> <43494648.6040904@zope.com>
	<ca471dc20510091735h7a6e871ai5bde26b48fc2673b@mail.gmail.com>
Message-ID: <434A5DD7.70707@zope.com>

Guido van Rossum wrote:
> On 10/9/05, Jim Fulton <jim at zope.com> wrote:
> 
>>Based on the discussion, I think I'd go with defaultproperty.
> 
> 
> Great.
> 
> 
>>Questions:
>>
>>- Should this be in builtins, alongside property, or in
>>   a library module? (Oleg suggested propertytools.)
>>
>>- Do we need a short PEP?
> 
> 
> I think so. From the responses I'd say there's at most lukewarm
> interest (including from me).

Hm, I saw several responses from people who'd built something
quite similar.  This suggests to me that this is a common need.

 > You might also want to drop it and just
> add it to your personal (or Zope's) library.

I have something like this in Zope's library.  I end up with a
very small package that isn't logically part of other packages,
but that is a dependency of lots of packages.  I don't like that,
but I guess I should get over it.

I must say that I am of 2 minds about things like this.  On the one
hand, I'd like Python's standard library to be small with packaging
systems to provide "extra batteries".  OTOH, I often find small
tools like this that would be nice to have readily available.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From circlecycle at gmail.com  Sun Oct  9 02:04:13 2005
From: circlecycle at gmail.com (jamesr)
Date: Sat, 8 Oct 2005 20:04:13 -0400
Subject: [Python-Dev]  C.E.R. Thoughts
Message-ID: <78d129adb4581d24b1d07844019a2afe@gmail.com>

Congratulations heartily given. I missed the ternary op in C... Way to 
go! Clean and easy, and now I can do:

if ((sys.argv[1] =='debug') if len(sys.argv) > 1 else False):
	pass

and check variables IF AND ONLY if they exist, in a single line!

but y'all knew that..


From phd at mail2.phd.pp.ru  Mon Oct 10 15:02:48 2005
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Mon, 10 Oct 2005 17:02:48 +0400
Subject: [Python-Dev] C.E.R. Thoughts
In-Reply-To: <78d129adb4581d24b1d07844019a2afe@gmail.com>
References: <78d129adb4581d24b1d07844019a2afe@gmail.com>
Message-ID: <20051010130248.GB19369@phd.pp.ru>

On Sat, Oct 08, 2005 at 08:04:13PM -0400, jamesr wrote:
> if ((sys.argv[1] =='debug') if len(sys.argv) > 1 else False):
> 	pass

   Very good example! Very good example why ternary operators must be
forbidden!

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From exarkun at divmod.com  Mon Oct 10 15:26:15 2005
From: exarkun at divmod.com (Jp Calderone)
Date: Mon, 10 Oct 2005 09:26:15 -0400
Subject: [Python-Dev] C.E.R. Thoughts
In-Reply-To: <78d129adb4581d24b1d07844019a2afe@gmail.com>
Message-ID: <20051010132615.3914.1043000115.divmod.quotient.26309@ohm>

On Sat, 8 Oct 2005 20:04:13 -0400, jamesr <circlecycle at gmail.com> wrote:
>Congratulations heartily given. I missed the ternary op in c... Way to
>go! clean and easy and now i can do:
>
>if ((sys.argv[1] =='debug') if len(sys.argv) > 1 else False):
>	pass
>
>and check variables IF AND ONLY if they exist, in a single line!

if len(sys.argv) > 1 and sys.argv[1] == 'debug':
    ...

usually-wouldn't-but-can't-pass-it-up-ly y'rs,

Jp

From ark at acm.org  Mon Oct 10 15:26:04 2005
From: ark at acm.org (Andrew Koenig)
Date: Mon, 10 Oct 2005 09:26:04 -0400
Subject: [Python-Dev] C.E.R. Thoughts
In-Reply-To: <78d129adb4581d24b1d07844019a2afe@gmail.com>
Message-ID: <000901c5cd9e$2d7275d0$6402a8c0@arkdesktop>

> Congratulations heartily given. I missed the ternary op in c... Way to
> go! clean and easy and now i can do:

> if ((sys.argv[1] =='debug') if len(sys.argv) > 1 else False):
> 	pass

> and check variables IF AND ONLY if they exist, in a single line!

Umm... Is this a joke?




From skip at pobox.com  Mon Oct 10 15:48:30 2005
From: skip at pobox.com (skip@pobox.com)
Date: Mon, 10 Oct 2005 08:48:30 -0500
Subject: [Python-Dev] C.E.R. Thoughts
In-Reply-To: <000901c5cd9e$2d7275d0$6402a8c0@arkdesktop>
References: <78d129adb4581d24b1d07844019a2afe@gmail.com>
	<000901c5cd9e$2d7275d0$6402a8c0@arkdesktop>
Message-ID: <17226.28974.868184.492129@montanaro.dyndns.org>


    Andrew> Umm... Is this a joke?

I hope so.  I must admit the OP's intent didn't make itself known to me with
the cursory glance I gave it.  Jp's formulation is how I would have written
it.  Assuming of course, that was the OP's intent.
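
For comparison, a small sketch with both formulations wrapped in functions so
they can be checked against each other. The function names are hypothetical
and the argument lists stand in for sys.argv.

```python
def is_debug_ternary(argv):
    # The conditional-expression form from the original post.
    return (argv[1] == 'debug') if len(argv) > 1 else False


def is_debug_and(argv):
    # Jp's form: 'and' short-circuits, so argv[1] is only evaluated
    # when the length check has already succeeded.
    return len(argv) > 1 and argv[1] == 'debug'
```

Both return the same boolean for a missing, matching or non-matching
argument.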

Skip

From barry at python.org  Mon Oct 10 16:41:32 2005
From: barry at python.org (Barry Warsaw)
Date: Mon, 10 Oct 2005 10:41:32 -0400
Subject: [Python-Dev] Fwd:  defaultproperty
In-Reply-To: <76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com>
References: <433AA5AC.6040509@zope.com>
	<ca471dc205092807477ff2d0f1@mail.gmail.com> <433BA3CF.1090205@zope.com>
	<43494648.6040904@zope.com> <4349997E.9010208@gmail.com>
	<76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com>
	<76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com>
Message-ID: <1128955292.27841.2.camel@geddy.wooz.org>

On Mon, 2005-10-10 at 01:47, Calvin Spealman wrote:

> Never created for a reason? Lumping things together for having
> similar usage semantics, but unrelated purposes, might be something to
> avoid and maybe that's why it hasn't happened yet for decorators. If
> ever there was a makethreadsafe decorator, it should go in the thread
> module, etc. I mean, come on, it's like making a module just to store a
> bunch of unrelated types just to lump them together because they're
> types. Who wants that?

Like itertools?

+1 for a decorators module.
-Barry

From abo at minkirri.apana.org.au  Mon Oct 10 16:45:33 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Mon, 10 Oct 2005 15:45:33 +0100
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <4346FC98.5050504@gmail.com>
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>	<20051006221436.2892.JCARLSON@uci.edu>
	<415220344.20051007104751@MailBlocks.com> <4346FC98.5050504@gmail.com>
Message-ID: <1128955532.32340.141.camel@parabolic.corp.google.com>

On Fri, 2005-10-07 at 23:54, Nick Coghlan wrote:
[...]
> The few times I have encountered anyone saying anything resembling "threading 
> is easy", it was because the full sentence went something like "threading is 
> easy if you use message passing and copy-on-send or release-reference-on-send 
> to communicate between threads, and limit the shared data structures to those 
> required to support the messaging infrastructure". And most of the time there 
> was an implied "compared to using semaphores and locks directly, " at the start.

LOL! So threading is easy if you restrict inter-thread communication to
message passing... and what makes multi-processing hard is your only
inter-process communication mechanism is message passing :-)

Sounds like yet another reason to avoid threading and use processes
instead... effort spent on threading based message passing
implementations could instead be spent on inter-process messaging.

-- 
Donovan Baarda <abo at minkirri.apana.org.au>


From guido at python.org  Mon Oct 10 16:50:02 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 10 Oct 2005 07:50:02 -0700
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <434A32B4.1050709@gmail.com>
References: <20051007172237.GA13288@localhost.localdomain>
	<ca471dc20510071034y39b3facpe1d43e34b11e69db@mail.gmail.com>
	<4349D57C.7010509@canterbury.ac.nz>
	<200510092304.58638.fdrake@acm.org> <434A32B4.1050709@gmail.com>
Message-ID: <ca471dc20510100750m6f986bd2t7c9a24edcf87c31a@mail.gmail.com>

On 10/10/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> It also works for situations where "the first n items are mandatory, the rest
> are optional". This usage was brought up in the context of a basic line
> interpreter:
>
>    cmd, *args = input.split()

That's a really poor example though.  You really don't want a line
interpreter to bomb if the line is empty!

> Another usage is to have a Python function which doesn't support keywords for
> its positional arguments (to avoid namespace clashes in the keyword dict), but
> can still unpack the mandatory arguments easily:
>
>    def func(*args, **kwds):
>        arg1, arg2, *rest = args # Unpack the positional arguments

Again, I'd be more comfortable if this was preceded by a check for
len(args) >= 2.
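
A sketch of both examples with the guards added. It assumes the starred
unpacking syntax proposed in this thread (it runs on Python 3); parse_line is
a hypothetical name, and returning (None, []) for an empty line is just one
possible policy.

```python
def parse_line(line):
    """Split a command line without bombing on empty input."""
    parts = line.split()
    if not parts:
        # An empty line would otherwise fail to unpack into 'cmd'.
        return None, []
    cmd, *args = parts
    return cmd, args


def func(*args, **kwds):
    """Accept positional-only mandatory arguments, checked up front."""
    if len(args) < 2:
        raise TypeError(
            "func() needs at least 2 positional arguments, got %d" % len(args)
        )
    arg1, arg2, *rest = args  # Unpack the positional arguments
    return arg1, arg2, rest
```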

I should add that I'm just -0 on this. I think proponents ought to
find better motivating examples that aren't made-up.

Perhaps Raymond's requirement would help -- find places in the
standard library where this would make code more
readable/maintainable.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Oct 10 16:51:28 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 10 Oct 2005 07:51:28 -0700
Subject: [Python-Dev] New PEP 342 suggestion: result() and allow "return
	with arguments" in generators (was Re: PEP 342 suggestion:
	start(), __call__() and unwind_call() methods)
In-Reply-To: <434A2E41.7040903@gmail.com>
References: <5.1.1.6.0.20051007134543.03b2f728@mail.telecommunity.com>
	<43471EBE.40401@gmail.com> <4347652E.1090705@satori.za.net>
	<ca471dc20510081429t2f268860s4927ec2d1124e961@mail.gmail.com>
	<43486E20.3010908@gmail.com>
	<ca471dc20510081825l1f8b271du21b04c69d5eadda6@mail.gmail.com>
	<434883D6.80009@gmail.com> <43491650.1020704@gmail.com>
	<4349D548.3030000@canterbury.ac.nz> <434A2E41.7040903@gmail.com>
Message-ID: <ca471dc20510100751l144cb500o3f21183720c78a90@mail.gmail.com>

On 10/10/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> I'm starting to think we want to let PEP 342 bake for at least one release
> cycle before deciding what (if any) additional behaviour should be added to
> generators.

Yes please!

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From abo at minkirri.apana.org.au  Mon Oct 10 17:01:05 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Mon, 10 Oct 2005 16:01:05 +0100
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <415220344.20051007104751@MailBlocks.com>
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>
	<20051006221436.2892.JCARLSON@uci.edu>
	<415220344.20051007104751@MailBlocks.com>
Message-ID: <1128956465.32337.152.camel@parabolic.corp.google.com>

On Fri, 2005-10-07 at 17:47, Bruce Eckel wrote:
> Early in this thread there was a comment to the effect that "if you
> don't know how to use threads, don't use them," which I pointedly
> avoided responding to because it seemed to me to simply be
> inflammatory. But Ian Bicking just posted a weblog entry:
> http://blog.ianbicking.org/concurrency-and-processes.html where he
> says "threads aren't as hard as they imply" and "An especially poor
> argument is one that tells me that I'm currently being beaten with a
> stick, but apparently don't know it."

The problem with threads is at first glance they appear easy, which
seduces many beginning programmers into using them. The hard part is
knowing when and how to lock shared resources... at first glance you
don't even realise you need to do this. So many threaded applications
are broken and don't know it, because this kind of broken-ness is nearly
always intermittent and very hard to reproduce and debug.
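
A minimal sketch of the kind of locking in question, using the standard
threading module. Counter and hammer are hypothetical names; the point is
only that the read-modify-write on the shared value happens while the lock is
held.

```python
import threading


class Counter:
    """Shared counter whose update is protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could read the same old value
        # and one of the increments would be lost.
        with self._lock:
            self.value += 1


def hammer(counter, threads=4, per_thread=10000):
    """Increment the counter from several threads and return the total."""

    def work():
        for _ in range(per_thread):
            counter.increment()

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter.value
```

Dropping the lock may still give the right answer on most runs, which is
exactly why this class of bug tends to go unnoticed.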

One common alternative is async polling frameworks like Twisted. These
scare beginners away because at first glance, they appear hideously
complicated. However, if you take the time to get your head around them,
you get a better feel for all the nasty implications of concurrency, and
end up designing better applications.

This is the reason why, given a choice between an async and a threaded
implementation of an application, I will always choose the async
solution. Not because async is inherently better than threading, but
because the programmer who bothered to grok async is more likely to get
it right.

-- 
Donovan Baarda <abo at minkirri.apana.org.au>


From Michaels at rd.bbc.co.uk  Mon Oct 10 15:58:25 2005
From: Michaels at rd.bbc.co.uk (Michael Sparks)
Date: Mon, 10 Oct 2005 14:58:25 +0100
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <1128955532.32340.141.camel@parabolic.corp.google.com>
References: <20051006143740.287E.JCARLSON@uci.edu> <4346FC98.5050504@gmail.com>
	<1128955532.32340.141.camel@parabolic.corp.google.com>
Message-ID: <200510101458.25644.Michaels@rd.bbc.co.uk>

On Monday 10 Oct 2005 15:45, Donovan Baarda wrote:
> Sounds like yet another reason to avoid threading and use processes
> instead... effort spent on threading based message passing
> implementations could instead be spent on inter-process messaging.

I can't let that pass (even if our threaded component has a couple of warts
at the moment).

# Blocking thread example (uses raw_input) to single threaded pygame
# display ticker. (The display is rate limited to 8 words per second at
# most since it was designed for subtitles)
#
from Axon.ThreadedComponent import threadedcomponent
from Kamaelia.Util.PipelineComponent import pipeline
from Kamaelia.UI.Pygame.Ticker import Ticker

class ConsoleReader(threadedcomponent):
   def __init__(self, prompt=">>> "):
      super(ConsoleReader, self).__init__()
      self.prompt = prompt

   def run(self): # implementation wart, should be "main"
      while 1:
         line = raw_input(self.prompt)
         line = line + "\n"
         self.outqueues["outbox"].put(line)  # implementation wart, should be self.send(line, "outbox")

pipeline(
          ConsoleReader(),
          Ticker() # Single threaded pygame based text ticker
).run()

There are other ways, with other systems, to achieve the same goal.

Inter-process based messaging can be done in various ways. The API though
can look pretty much the same. (There are obviously some implications of
crossing process boundaries, but that's for the system composer
to deal with, not the components).

Regards,


Michael.
-- 
Michael Sparks, Senior R&D Engineer, Digital Media Group
Michael.Sparks at rd.bbc.co.uk, http://kamaelia.sourceforge.net/
British Broadcasting Corporation, Research and Development
Kingswood Warren, Surrey KT20 6NP

This e-mail may contain personal views which are not the views of the BBC.

From robey at lag.net  Mon Oct 10 19:18:13 2005
From: robey at lag.net (Robey Pointer)
Date: Mon, 10 Oct 2005 10:18:13 -0700
Subject: [Python-Dev] C API doc fix
In-Reply-To: <d11dcfba0509291206558d47cb@mail.gmail.com>
References: <4092C34F-5A07-47D0-A27F-1781EBFE887A@lag.net>
	<d11dcfba0509291206558d47cb@mail.gmail.com>
Message-ID: <DEA6BDBE-3722-43EA-AC14-658FC3D0863D@lag.net>


On 29 Sep 2005, at 12:06, Steven Bethard wrote:

> On 9/29/05, Robey Pointer <robey at lag.net> wrote:
>
>> Yesterday I ran into a bug in the C API docs.  The top of this page:
>>
>>      http://docs.python.org/api/unicodeObjects.html
>>
>> says:
>>
>> Py_UNICODE
>>      This type represents a 16-bit unsigned storage type which is
>> used by Python internally as basis for holding Unicode ordinals. On
>> platforms where wchar_t is available and also has 16-bits, Py_UNICODE
>> is a typedef alias for wchar_t to enhance native platform
>> compatibility. On all other platforms, Py_UNICODE is a typedef alias
>> for unsigned short.
>>
>
> I believe this is the same issue that was brought up in May[1].  My
> impression was that people could not agree on a documentation patch.

Would it help if I tried my hand at it?  My impression so far is that  
extension coders should probably try not to worry about the size or  
content of Py_UNICODE.  (The thread seems to have wandered off into  
nowhere again...)


Py_UNICODE
This type represents an unsigned storage type at least 16-bits long  
(but sometimes more) which is used by Python internally as basis for  
holding Unicode ordinals. On platforms where wchar_t is available and  
also has 16-bits, Py_UNICODE is a typedef alias for wchar_t to  
enhance native platform compatibility.  In general, you should use  
PyUnicode_FromEncodedObject and PyUnicode_AsEncodedString to convert  
strings to/from unicode objects, and consider Py_UNICODE to be an  
implementation detail.


robey


From janssen at parc.com  Mon Oct 10 19:59:54 2005
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Oct 2005 10:59:54 PDT
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: Your message of "Mon, 10 Oct 2005 08:01:05 PDT."
	<1128956465.32337.152.camel@parabolic.corp.google.com> 
Message-ID: <05Oct10.105958pdt."58617"@synergy1.parc.xerox.com>

> The problem with threads is at first glance they appear easy...

Anyone who thinks that a "glance" is enough to understand something is
too far gone to worry about.  On the other hand, you might be
referring to a putative brokenness of the Python documentation on
Python threads.  I'm not sure they're broken, though.  They just point
out the threading that Python provides, for folks who want to use
threads.  Are they required to provide a full course in threads?

> ...which seduces many beginning programmers into using them.

Don't worry about this.  That's how "beginning programmers" learn.

> The hard part is knowing when and how to lock shared resources...

Well, I might say the "careful part".

> ...at first glance you don't even realise you need to do this.

Again, I'm not sure why you care what "glancers" do and don't realize.
You could say the same about most algorithms and data structures.

Bill



From skip at pobox.com  Mon Oct 10 20:20:31 2005
From: skip at pobox.com (skip@pobox.com)
Date: Mon, 10 Oct 2005 13:20:31 -0500
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <05Oct10.105958pdt."58617"@synergy1.parc.xerox.com>
References: <1128956465.32337.152.camel@parabolic.corp.google.com>
	<05Oct10.105958pdt."58617"@synergy1.parc.xerox.com>
Message-ID: <17226.45295.661911.542400@montanaro.dyndns.org>


    >> The hard part is knowing when and how to lock shared resources...

    Bill> Well, I might say the "careful part".

With the Mojam middleware stuff I suffered quite a while with a
single-threaded implementation that would hang the entire webserver if a
backend query took too long.  I realized I needed to do something (threads,
asyncore, whatever), but didn't think I understood the issues well enough to
do it right.  Once I finally bit the bullet and switched to a multithreaded
implementation, I didn't have too much trouble.  Of course, the application
was pretty mature at that point and I understood what objects were shared
and needed to be locked.  Oh, and I took Aahz's admonition to heart and
pretty much stuck to using Queues for all synchronization.  It ain't rocket
science, but it can be subtle.
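
A sketch of the Queues-only style, using the standard queue and threading
modules as they are spelled in Python 3. run_queries and handler are
hypothetical names, not part of the Mojam code; the queries are assumed to be
hashable and never None, since None is used here as the shutdown marker.

```python
import queue
import threading


def run_queries(queries, handler, workers=3):
    """Fan queries out to worker threads; share nothing but queues."""
    tasks = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more work for this thread
                break
            results.put((item, handler(item)))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for q in queries:
        tasks.put(q)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # Every query produced exactly one result, so this cannot block.
    return dict(results.get() for _ in range(len(queries)))
```

The workers never touch a shared data structure directly, so there is no
explicit lock to get wrong.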

Skip

From abo at minkirri.apana.org.au  Mon Oct 10 20:39:58 2005
From: abo at minkirri.apana.org.au (Donovan Baarda)
Date: Mon, 10 Oct 2005 19:39:58 +0100
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <05Oct10.105958pdt."58617"@synergy1.parc.xerox.com>
References: <05Oct10.105958pdt."58617"@synergy1.parc.xerox.com>
Message-ID: <1128969598.32345.224.camel@parabolic.corp.google.com>

On Mon, 2005-10-10 at 18:59, Bill Janssen wrote:
> > The problem with threads is at first glance they appear easy...
> 
> Anyone who thinks that a "glance" is enough to understand something is
> too far gone to worry about.  On the other hand, you might be
> referring to a putative brokenness of the Python documentation on
> Python threads.  I'm not sure they're broken, though.  They just point
> out the threading that Python provides, for folks who want to use
> threads.  Are they required to provide a full course in threads?

I was speaking in general, not about Python in particular. If anything,
Python is one of the simplest and safest platforms for threading (thanks
mostly to the GIL). And I find the documentation excellent :-)

> > ...which seduces many beginning programmers into using them.
> 
> Don't worry about this.  That's how "beginning programmers" learn.

Many other things "beginning programmers" learn very quickly break if
you do it wrong, until you learn to do it right. Threads are tricky in
that they can "mostly work", and it can be a long while before you
realise it is actually broken.

I don't know how many bits of other people's code I've had to fix that
worked for years until it was run on hardware fast enough to trigger
that nasty race condition :-)
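
A sketch of the kind of bug meant here, assuming a shared counter (the
`Counter` class and `hammer` helper are made up for illustration). An
unguarded `self.value += 1` is a read-modify-write that can interleave
between threads and lose updates, usually rarely enough that the code appears
to work; holding a lock around it removes the race.

```python
import threading

class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads can read the same old value
        # and one of the two updates is silently lost.
        with self._lock:
            self.value += 1

def hammer(counter, n_threads=4, n_iter=10000):
    def work():
        for _ in range(n_iter):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```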

-- 
Donovan Baarda <abo at minkirri.apana.org.au>


From mal at egenix.com  Mon Oct 10 21:09:58 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 10 Oct 2005 21:09:58 +0200
Subject: [Python-Dev] C API doc fix
In-Reply-To: <DEA6BDBE-3722-43EA-AC14-658FC3D0863D@lag.net>
References: <4092C34F-5A07-47D0-A27F-1781EBFE887A@lag.net>	<d11dcfba0509291206558d47cb@mail.gmail.com>
	<DEA6BDBE-3722-43EA-AC14-658FC3D0863D@lag.net>
Message-ID: <434ABC86.80202@egenix.com>

Robey Pointer wrote:
> On 29 Sep 2005, at 12:06, Steven Bethard wrote:
> 
> 
>>On 9/29/05, Robey Pointer <robey at lag.net> wrote:
>>
>>
>>>Yesterday I ran into a bug in the C API docs.  The top of this page:
>>>
>>>     http://docs.python.org/api/unicodeObjects.html
>>>
>>>says:
>>>
>>>Py_UNICODE
>>>     This type represents a 16-bit unsigned storage type which is
>>>used by Python internally as basis for holding Unicode ordinals. On
>>>platforms where wchar_t is available and also has 16-bits, Py_UNICODE
>>>is a typedef alias for wchar_t to enhance native platform
>>>compatibility. On all other platforms, Py_UNICODE is a typedef alias
>>>for unsigned short.
>>>
>>
>>I believe this is the same issue that was brought up in May[1].  My
>>impression was that people could not agree on a documentation patch.

FYI, I've fixed the Py_UNICODE description now.

-- 
Marc-Andre Lemburg
eGenix.com


From janssen at parc.com  Mon Oct 10 21:26:45 2005
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Oct 2005 12:26:45 PDT
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: Your message of "Mon, 10 Oct 2005 11:20:31 PDT."
	<17226.45295.661911.542400@montanaro.dyndns.org> 
Message-ID: <05Oct10.122654pdt."58617"@synergy1.parc.xerox.com>

Skip,

> With the Mojam middleware stuff I suffered quite a while with a
> single-threaded implementation that would hang the entire webserver if a
> backend query took too long.  I realized I needed to do something (threads,
> asyncore, whatever), but didn't think I understood the issues well enough to
> do it right.

Yes, there's a troublesome meme in the world: "threads are hard".
They aren't, really.  You just have to know what you're doing.  But
that meme seems to keep quite capable people from doing things they
are well qualified to do.

> Once I finally bit the bullet and switched to a multithreaded
> implementation, I didn't have too much trouble.

Yep.

> Of course, the application
> was pretty mature at that point and I understood what objects were shared
> and needed to be locked.

Kind of like managing people, isn't it :-?.

I've done a lot of middleware myself, of course.  ILU was based on a
thread-safe C library and worked with Python threads quite well.
Lately I've been building UpLib (a threaded Python service) on top of
Medusa, which has worked quite well.  UpLib handles calls
sequentially, but uses threads internally to manage underlying data
transformations.  Medusa almost but not quite supports per-request
threads; I'm wondering if I should just fix that and post a patch.

Or would that just be re-creating ZServer, which I admit I haven't
figured out how to look at?

Bill

From paul.dubois at gmail.com  Mon Oct 10 22:14:30 2005
From: paul.dubois at gmail.com (Paul Du Bois)
Date: Mon, 10 Oct 2005 13:14:30 -0700
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <ca471dc20510100750m6f986bd2t7c9a24edcf87c31a@mail.gmail.com>
References: <20051007172237.GA13288@localhost.localdomain>
	<ca471dc20510071034y39b3facpe1d43e34b11e69db@mail.gmail.com>
	<4349D57C.7010509@canterbury.ac.nz>
	<200510092304.58638.fdrake@acm.org> <434A32B4.1050709@gmail.com>
	<ca471dc20510100750m6f986bd2t7c9a24edcf87c31a@mail.gmail.com>
Message-ID: <85f6a31f0510101314x4a1ccfdeu43d3d9436031fe3c@mail.gmail.com>

On 10/10/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
>    cmd, *args = input.split()

These examples also have a reasonable implementation using list.pop(),
albeit one that requires more typing.  On the plus side, it does not violate
DRY and is explicit about the error cases.

  args = input.split()
  try:
    cmd = input.pop(0)
  except IndexError:
    cmd = ''

> def func(*args, **kwds):
>     arg1, arg2, *rest = args # Unpack the positional arguments

  rest = list(args)    # args is a tuple, which has no pop(), so copy it to a list
  try:
    arg1 = rest.pop(0)
    arg2 = rest.pop(0)
  except IndexError:
    raise TypeError("func() takes at least 2 arguments")
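
For comparison, a runnable sketch of both pop()-based snippets, with the
assumptions spelled out: the helper names `split_command` and `func` are
illustrative, the command is popped from `args` (the split result) rather
than from the original string, and `*args` is copied into a list first
because a tuple has no `pop()`.

```python
def split_command(line):
    args = line.split()
    try:
        cmd = args.pop(0)      # first word is the command
    except IndexError:
        cmd = ''               # empty input: no command at all
    return cmd, args

def func(*args, **kwds):
    rest = list(args)          # mutable copy of the positional arguments
    try:
        arg1 = rest.pop(0)
        arg2 = rest.pop(0)
    except IndexError:
        raise TypeError("func() takes at least 2 arguments")
    return arg1, arg2, rest
```

In Python 3 the proposed spelling `cmd, *args = line.split()` is valid syntax
(PEP 3132), though on an empty line it raises ValueError instead of giving
`cmd` a default.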

paul

From BruceEckel-Python3234 at mailblocks.com  Mon Oct 10 22:15:18 2005
From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel)
Date: Mon, 10 Oct 2005 14:15:18 -0600
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <05Oct10.122654pdt."58617"@synergy1.parc.xerox.com>
References: Your message of "Mon, 10 Oct 2005 11:20:31 PDT."
	<17226.45295.661911.542400@montanaro.dyndns.org>
	<05Oct10.122654pdt."58617"@synergy1.parc.xerox.com>
Message-ID: <746109444.20051010141518@MailBlocks.com>

> Yes, there's a troublesome meme in the world: "threads are hard".
> They aren't, really.  You just have to know what you're doing.

I would say that the troublesome meme is that "threads are easy." I
posted an earlier, rather longish message about this. The gist of
which was: "when someone says that threads are easy, I have no idea
what they mean by it."

Perhaps this means "threads in Python are easier than threads in other
languages."

But I just finished a 150-page chapter on Concurrency in Java which
took many months to write, based on a large chapter on Concurrency in
C++ which probably took longer to write. I keep in reasonably good
touch with some of the threading experts. I can't get any of them to
say that it's easy, even though they really do understand the issues
and think about it all the time. *Because* of that, they say that it's
hard.

So alright, I'll take the bait that you've laid down more than once,
now. Perhaps you can go beyond saying that "threads really aren't
hard" and explain the aspects of them that seem so easy to you.
Perhaps you can give a nice clear explanation of cache coherency and
memory barriers in multiprocessor machines? Or explain atomicity,
volatility and visibility? Or, even better, maybe you can come up with
a better concurrency model, which is what I think most of us are
looking for in this discussion.

Bruce Eckel    http://www.BruceEckel.com   mailto:BruceEckel-Python3234 at mailblocks.com
Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e"
Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel
Subscribe to my newsletter:
http://www.mindview.net/Newsletter
My schedule can be found at:
http://www.mindview.net/Calendar




From bcannon at gmail.com  Mon Oct 10 22:29:26 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 10 Oct 2005 13:29:26 -0700
Subject: [Python-Dev] PEP 3000 and exec
In-Reply-To: <didj3k$o0r$1@sea.gmane.org>
References: <didj3k$o0r$1@sea.gmane.org>
Message-ID: <bbaeab100510101329l116efbao369d0b6710331a5d@mail.gmail.com>

On 10/10/05, Christos Georgiou <tzot at mediconsa.com> wrote:
> This might be minor-- but I didn't see anyone mentioning it so far.  If
> `exec` functionality is to be provided, then I think it still should be a
> keyword for the parser to know; currently bytecode generation is affected if
> `exec` is present.  Even if that changes for Python 3k (we don't know yet),
> the paragraph for exec should be annotated with a note about this issue.
>

But the PEP says that 'exec' will become a function and thus no longer
become a built-in, so changing the grammar is not needed.

-Brett

From bcannon at gmail.com  Mon Oct 10 22:33:15 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 10 Oct 2005 13:33:15 -0700
Subject: [Python-Dev] Fwd: defaultproperty
In-Reply-To: <1128955292.27841.2.camel@geddy.wooz.org>
References: <433AA5AC.6040509@zope.com>
	<ca471dc205092807477ff2d0f1@mail.gmail.com>
	<433BA3CF.1090205@zope.com> <43494648.6040904@zope.com>
	<4349997E.9010208@gmail.com>
	<76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com>
	<76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com>
	<1128955292.27841.2.camel@geddy.wooz.org>
Message-ID: <bbaeab100510101333s317d1a98r2b2af3b93b7788ad@mail.gmail.com>

On 10/10/05, Barry Warsaw <barry at python.org> wrote:
> On Mon, 2005-10-10 at 01:47, Calvin Spealman wrote:
>
> > Never created for a reason? lumping things together for having the
> > similar usage semantics, but unrelated purposes, might be something to
> > avoid and maybe that's why it hasn't happened yet for decorators. If
> > ever there was a makethreadsafe decorator, it should go in the thread
> > module, etc. I mean, come on, it's like making a module just to store a
> > bunch of unrelated types just to lump them together because they're
> > types. Who wants that?
>
> Like itertools?
>
> +1 for a decorators module.

+1 from me as well.  And placing defaultproperty in there makes sense
if it is meant to be used as a decorator and not viewed as some spiffy
descriptor.

Should probably work in Michael's update_meta() function as well
(albeit maybe with a different name since I think I remember Guido
saying he didn't like the name).

-Brett

From ianb at colorstudy.com  Mon Oct 10 23:57:24 2005
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 10 Oct 2005 16:57:24 -0500
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <5.1.1.6.0.20051007140112.03b2fc80@mail.telecommunity.com>
References: <20051006221436.2892.JCARLSON@uci.edu>	<20051006143740.287E.JCARLSON@uci.edu>	<200510070145.17284.ms@cerenity.org>	<20051006221436.2892.JCARLSON@uci.edu>
	<5.1.1.6.0.20051007140112.03b2fc80@mail.telecommunity.com>
Message-ID: <434AE3C4.6030401@colorstudy.com>

Phillip J. Eby wrote:
> What the GIL-ranters don't get is that the GIL actually gives you just 
> enough determinism to be able to write threaded programs that don't crash, 
> and that maybe will even work if you treat every point of interaction 
> between threads as a minefield and program with appropriate care.  So, if 
> threads are "easy" in Python compared to other languages, it's *because of* 
> the GIL, not in spite of it.

Three cheers for the GIL!

For the record, since I was quoted at the beginning of this subthread, 
*I* don't think threads are easy.  But among all ways to handle 
concurrency, I just don't think they are so bad.  And unlike many 
alternatives, they are relatively easy to get started with, and you can 
do a lot of work in a threaded system without knowing anything about 
threads.  Of course, threads aren't the only way to accomplish that, 
just one of the easiest.

-- 
Ian Bicking  /  ianb at colorstudy.com  /  http://blog.ianbicking.org


From skip at pobox.com  Tue Oct 11 00:00:48 2005
From: skip at pobox.com (skip@pobox.com)
Date: Mon, 10 Oct 2005 17:00:48 -0500
Subject: [Python-Dev] PEP 3000 and exec
In-Reply-To: <bbaeab100510101329l116efbao369d0b6710331a5d@mail.gmail.com>
References: <didj3k$o0r$1@sea.gmane.org>
	<bbaeab100510101329l116efbao369d0b6710331a5d@mail.gmail.com>
Message-ID: <17226.58512.451743.300957@montanaro.dyndns.org>

    >> This might be minor-- but I didn't see anyone mentioning it so far.
    >> If `exec` functionality is to be provided, then I think it still
    >> should be a keyword for the parser to know; currently bytecode
    >> generation is affected if `exec` is present.  Even if that changes
    >> for Python 3k (we don't know yet), the paragraph for exec should be
    >> annotated with a note about this issue.

    Brett> But the PEP says that 'exec' will become a function and thus no
    Brett> longer become a built-in, so changing the grammar is not needed.

I don't think that was the OP's point though it might not have been terribly
clear.  Today, the presence of the exec statement in a function changes how
non-local load instructions are generated.  Consider f and g with their
dis.dis output:

    >>> def f(a):
    ...   exec "import %s" % a
    ...   print q
    ... 
    >>> def g(a):
    ...   __import__(a)
    ...   print q
    ... 
    >>> dis.dis(f)
      2           0 LOAD_CONST               1 ('import %s')
                  3 LOAD_FAST                0 (a)
                  6 BINARY_MODULO       
                  7 LOAD_CONST               0 (None)
                 10 DUP_TOP             
                 11 EXEC_STMT           

      3          12 LOAD_NAME                1 (q)
                 15 PRINT_ITEM          
                 16 PRINT_NEWLINE       
                 17 LOAD_CONST               0 (None)
                 20 RETURN_VALUE        
    >>> dis.dis(g)
      2           0 LOAD_GLOBAL              0 (__import__)
                  3 LOAD_FAST                0 (a)
                  6 CALL_FUNCTION            1
                  9 POP_TOP             

      3          10 LOAD_GLOBAL              2 (q)
                 13 PRINT_ITEM          
                 14 PRINT_NEWLINE       
                 15 LOAD_CONST               0 (None)
                 18 RETURN_VALUE        

If the exec statement is replaced by a function, how will the bytecode
generator know that q should be looked up using LOAD_NAME instead of
LOAD_GLOBAL?  Maybe it's a non-issue, but even if so, a note to that effect
on the wiki page might be worthwhile.
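
The difference can also be checked programmatically. A small sketch, assuming
Python 3.4 or later for `dis.get_instructions` (the function `g` and the name
`q` mirror the example above): with no exec statement in the body, a name
that is never assigned in the function is compiled as a global lookup.

```python
import dis

def g():
    return q  # q is never assigned in g, so the compiler treats it as global

# Collect the opcode names instead of eyeballing the dis.dis() listing.
opnames = [ins.opname for ins in dis.get_instructions(g)]
print(opnames)
```

The exact opcode list varies between interpreter versions, so only the
presence of LOAD_GLOBAL (and the absence of LOAD_NAME) is worth relying on.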

Skip

From guido at python.org  Tue Oct 11 00:05:56 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 10 Oct 2005 15:05:56 -0700
Subject: [Python-Dev] PEP 3000 and exec
In-Reply-To: <17226.58512.451743.300957@montanaro.dyndns.org>
References: <didj3k$o0r$1@sea.gmane.org>
	<bbaeab100510101329l116efbao369d0b6710331a5d@mail.gmail.com>
	<17226.58512.451743.300957@montanaro.dyndns.org>
Message-ID: <ca471dc20510101505t76a5455ak6af18582ff243b43@mail.gmail.com>

My idea was to make the compiler smarter so that it would recognize
exec() even if it was just a function.

Another idea might be to change the exec() spec so that you are
required to pass in a namespace (and you can't use locals() either!).
Then the whole point becomes moot.
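
A sketch of what the second idea looks like from the caller's side, written
against Python 3, where exec() is an ordinary built-in function
(`run_import` is a hypothetical helper). Because the executed code binds
names in the dict that was passed in, the enclosing function's own name
lookups are unaffected.

```python
def run_import(module_name):
    # Works for plain top-level names such as "math"; a dotted name like
    # "os.path" would bind "os" in the namespace instead.
    namespace = {}
    # The import statement binds module_name inside namespace, not in
    # run_import's locals.
    exec("import %s" % module_name, namespace)
    return namespace[module_name]

math_module = run_import("math")
print(math_module.sqrt(16.0))
```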

On 10/10/05, skip at pobox.com <skip at pobox.com> wrote:
>     >> This might be minor-- but I didn't see anyone mentioning it so far.
>     >> If `exec` functionality is to be provided, then I think it still
>     >> should be a keyword for the parser to know; currently bytecode
>     >> generation is affected if `exec` is present.  Even if that changes
>     >> for Python 3k (we don't know yet), the paragraph for exec should be
>     >> annotated with a note about this issue.
>
>     Brett> But the PEP says that 'exec' will become a function and thus no
>     Brett> longer become a built-in, so changing the grammar is not needed.
>
> I don't think that was the OP's point though it might not have been terribly
> clear.  Today, the presence of the exec statement in a function changes how
> non-local load instructions are generated.  Consider f and g with their
> dis.dis output:
>
>     >>> def f(a):
>     ...   exec "import %s" % a
>     ...   print q
>     ...
>     >>> def g(a):
>     ...   __import__(a)
>     ...   print q
>     ...
>     >>> dis.dis(f)
>       2           0 LOAD_CONST               1 ('import %s')
>                   3 LOAD_FAST                0 (a)
>                   6 BINARY_MODULO
>                   7 LOAD_CONST               0 (None)
>                  10 DUP_TOP
>                  11 EXEC_STMT
>
>       3          12 LOAD_NAME                1 (q)
>                  15 PRINT_ITEM
>                  16 PRINT_NEWLINE
>                  17 LOAD_CONST               0 (None)
>                  20 RETURN_VALUE
>     >>> dis.dis(g)
>       2           0 LOAD_GLOBAL              0 (__import__)
>                   3 LOAD_FAST                0 (a)
>                   6 CALL_FUNCTION            1
>                   9 POP_TOP
>
>       3          10 LOAD_GLOBAL              2 (q)
>                  13 PRINT_ITEM
>                  14 PRINT_NEWLINE
>                  15 LOAD_CONST               0 (None)
>                  18 RETURN_VALUE
>
> If the exec statement is replaced by a function, how will the bytecode
> generator know that q should be looked up using LOAD_NAME instead of
> LOAD_GLOBAL?  Maybe it's a non-issue, but even if so, a note to that effect
> on the wiki page might be worthwhile.
>
> Skip


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Tue Oct 11 00:15:39 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 10 Oct 2005 15:15:39 -0700
Subject: [Python-Dev] PEP 3000 and exec
In-Reply-To: <17226.58512.451743.300957@montanaro.dyndns.org>
References: <didj3k$o0r$1@sea.gmane.org>
	<bbaeab100510101329l116efbao369d0b6710331a5d@mail.gmail.com>
	<17226.58512.451743.300957@montanaro.dyndns.org>
Message-ID: <bbaeab100510101515o109c7c58o9d7114364b1d4d22@mail.gmail.com>

On 10/10/05, skip at pobox.com <skip at pobox.com> wrote:
>     >> This might be minor-- but I didn't see anyone mentioning it so far.
>     >> If `exec` functionality is to be provided, then I think it still
>     >> should be a keyword for the parser to know; currently bytecode
>     >> generation is affected if `exec` is present.  Even if that changes
>     >> for Python 3k (we don't know yet), the paragraph for exec should be
>     >> annotated with a note about this issue.
>
>     Brett> But the PEP says that 'exec' will become a function and thus no
>     Brett> longer become a built-in, so changing the grammar is not needed.
>
> I don't think that was the OP's point though it might not have been terribly
> clear.  Today, the presence of the exec statement in a function changes how
> non-local load instructions are generated.  Consider f and g with their
> dis.dis output:
>
>     >>> def f(a):
>     ...   exec "import %s" % a
>     ...   print q
>     ...
>     >>> def g(a):
>     ...   __import__(a)
>     ...   print q
>     ...
>     >>> dis.dis(f)
>       2           0 LOAD_CONST               1 ('import %s')
>                   3 LOAD_FAST                0 (a)
>                   6 BINARY_MODULO
>                   7 LOAD_CONST               0 (None)
>                  10 DUP_TOP
>                  11 EXEC_STMT
>
>       3          12 LOAD_NAME                1 (q)
>                  15 PRINT_ITEM
>                  16 PRINT_NEWLINE
>                  17 LOAD_CONST               0 (None)
>                  20 RETURN_VALUE
>     >>> dis.dis(g)
>       2           0 LOAD_GLOBAL              0 (__import__)
>                   3 LOAD_FAST                0 (a)
>                   6 CALL_FUNCTION            1
>                   9 POP_TOP
>
>       3          10 LOAD_GLOBAL              2 (q)
>                  13 PRINT_ITEM
>                  14 PRINT_NEWLINE
>                  15 LOAD_CONST               0 (None)
>                  18 RETURN_VALUE
>
> If the exec statement is replaced by a function, how will the bytecode
> generator know that q should be looked up using LOAD_NAME instead of
> LOAD_GLOBAL?  Maybe it's a non-issue, but even if so, a note to that effect
> on the wiki page might be worthwhile.

Ah, OK.  That makes more sense.  For a quick, on-the-spot answer, one
possibility is for the 'exec' function to examine the execution stack,
go back to the caller, and patch the bytecode so that it uses
LOAD_NAME instead of LOAD_GLOBAL.  Total hack, but it would work and
since 'exec' is not exactly performance-critical to begin with
something this expensive wouldn't necessarily be out of the question.

But the better answer is we will just find a way.  =)

-Brett

From tim.peters at gmail.com  Tue Oct 11 00:42:26 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 10 Oct 2005 18:42:26 -0400
Subject: [Python-Dev] PythonCore\CurrentVersion
In-Reply-To: <DAELJHBGPBHPJKEBGGLNKEBMHDAD.mhammond@skippinet.com.au>
References: <4347A020.2050008@v.loewis.de>
	<DAELJHBGPBHPJKEBGGLNKEBMHDAD.mhammond@skippinet.com.au>
Message-ID: <1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com>

[Martin v. Löwis]
>> What happened to the CurrentVersion registry entry documented at
>>
>> http://www.python.org/windows/python/registry.html
>>
>> AFAICT, even the python15.wse file did not fill a value in this
>> entry (perhaps I'm misinterpreting the wse file, though).
>>
>> So was this ever used? Why is it documented, and who documented it
>> (unfortunately, registry.html is not in cvs/subversion, either)?

[Mark Hammond]
> I believe I documented it many moons ago.  I don't think CurrentVersion was
> ever implemented (or possibly was for a very short time before being
> removed).  The "registered modules" concept was misguided and AFAIK is not
> used by anyone - IMO it should be deprecated (if not just removed!).
> Further, I believe the documentation in the file for PYTHONPATH is, as said
> in those docs, out of date, but that the comments in getpathp.c are correct.

It would be good to update that web page ;-)

The construction of PYTHONPATH differs across platforms, which isn't
good.  Here's a key difference:

    playground/
        someother/
            script.py

This is script.py:

"""
import sys
from pprint import pprint

pprint(sys.path)
"""

Suppose we run script.py while playground/ is the current directory. 
I'm using 2.4.2 here, but doubt it matters much.  No Python-related
envars are set.

Windows (and the PIL and pywin32 extensions are installed here):

C:\playground>\python24\python.exe someother\script.py
['C:\\playground\\someother',
 'C:\\python24\\python24.zip',
 'C:\\playground',
 'C:\\python24\\DLLs',
 'C:\\python24\\lib',
 'C:\\python24\\lib\\plat-win',
 'C:\\python24\\lib\\lib-tk',
 'C:\\python24',
 'C:\\python24\\lib\\site-packages',
 'C:\\python24\\lib\\site-packages\\PIL',
 'C:\\python24\\lib\\site-packages\\win32',
 'C:\\python24\\lib\\site-packages\\win32\\lib',
 'C:\\python24\\lib\\site-packages\\Pythonwin']

When PC/getpathp.c says:

   * Python always adds an empty entry at the start, which corresponds
     to the current directory.

I'm not sure what it means.  The directory containing the script we're
_running_ shows up first in sys.path there, while the _current_
directory shows up third.

Linux:  the current directory doesn't show up at all:

[playground]$ ~/Python-2.4.2/python someother/script.py
['/home/tim/playground/someother',
 '/usr/local/lib/python24.zip',
 '/home/tim/Python-2.4.2/Lib',
 '/home/tim/Python-2.4.2/Lib/plat-linux2',
 '/home/tim/Python-2.4.2/Lib/lib-tk',
 '/home/tim/Python-2.4.2/Modules',
 '/home/tim/Python-2.4.2/build/lib.linux-i686-2.4']

I have no concrete suggestion, as any change to sys.path will break
something for someone.  It's nevertheless not good that "current
directory on sys.path?" doesn't have the same answer across platforms
(unsure why, but I've been burned by that several times this year, but
never before this year -- maybe sys.path _used_ to contain the current
directory on Linux?).
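
One way to check this on a given platform is a small script along these
lines. It is only a sketch: `locate` is a hypothetical helper, and it treats
an empty sys.path entry as the current directory.

```python
import os
import sys

def locate(entries, directory):
    """Return the index of directory in entries, or None if it is absent."""
    target = os.path.normcase(os.path.abspath(directory))
    for i, entry in enumerate(entries):
        # An empty entry stands for the current directory.
        candidate = os.path.normcase(os.path.abspath(entry or os.getcwd()))
        if candidate == target:
            return i
    return None

if __name__ == "__main__":
    script_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
    print("script dir index:", locate(sys.path, script_dir))
    print("cwd index:       ", locate(sys.path, os.getcwd()))
```

Run as `python someother/script.py` from playground/, a result of None for
the cwd index means the current directory is not on sys.path.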

From tdelaney at avaya.com  Tue Oct 11 00:50:39 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Tue, 11 Oct 2005 08:50:39 +1000
Subject: [Python-Dev] Extending tuple unpacking
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>

Paul Du Bois wrote:

> On 10/10/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>    cmd, *args = input.split()
> 
> These examples also have a reasonable implementation using list.pop(),
> albeit one that requires more typing.  On the plus side, it does not
> violate 
> DRY and is explicit about the error cases.
> 
>   args = input.split()
>   try:
>     cmd = input.pop(0)
>   except IndexError:
>     cmd = ''

I'd say you violated it right there ... (should have been)::

    args = input.split()

    try:
        cmd = arg.pop()
    except IndexError:
        cmd = ''

FWIW, I've been +1 on * unpacking since I first saw the proposal, and
have yet to see a convincing argument against it other than people
wanting to stick the * anywhere but at the end. Perhaps I'll take the
stdlib challenge (unfortunately, I have to travel this weekend, but I'll
see if I can make time).

Tim Delaney

From tdelaney at avaya.com  Tue Oct 11 00:54:22 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Tue, 11 Oct 2005 08:54:22 +1000
Subject: [Python-Dev] Extending tuple unpacking
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB6B7@au3010avexu1.global.avaya.com>

Delaney, Timothy (Tim) wrote:

>     args = input.split()
> 
>     try:
>         cmd = arg.pop()
                ^^^ args ...
>     except IndexError:
>         cmd = ''

Can't even get it right myself - does that prove a point? <wink>

Tim Delaney

From mhammond at skippinet.com.au  Tue Oct 11 01:20:43 2005
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Tue, 11 Oct 2005 09:20:43 +1000
Subject: [Python-Dev] PythonCore\CurrentVersion
In-Reply-To: <1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com>
Message-ID: <DAELJHBGPBHPJKEBGGLNKEKLHDAD.mhammond@skippinet.com.au>

> Suppose we run script.py while playground/ is the current directory.
> I'm using 2.4.2 here, but doubt it matters much.  No Python-related
> envars are set.
>
> Windows (and the PIL and pywin32 extensions are installed here):
>
> C:\playground>\python24\python.exe someother\script.py
> ['C:\\playground\\someother',
>  'C:\\python24\\python24.zip',
>  'C:\\playground',
...

> When PC/getpathp.c says:
>
>    * Python always adds an empty entry at the start, which corresponds
>      to the current directory.

I believe it used to mean that literally '' was at the start of sys.path,
but all the way back to 1.5.2 it seems that it really is the dirname of the
script.  Up to 2.2 it was as specified in sys.argv, in 2.3 and later it was
made absolute.

> I'm not sure what it means.  The directory containing the script we're
> _running_ shows up first in sys.path there, while the _current_
> directory shows up third.

That's strange - I don't see the current directory at all in any version.  I
get something very close to you when I have PYTHONPATH=. - although it then
turns up as the second entry, consistent with the docs.

Mark


From nyamatongwe at gmail.com  Tue Oct 11 01:52:36 2005
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Tue, 11 Oct 2005 09:52:36 +1000
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <746109444.20051010141518@MailBlocks.com>
References: <17226.45295.661911.542400@montanaro.dyndns.org>
	<746109444.20051010141518@MailBlocks.com>
Message-ID: <50862ebd0510101652y70250d23vdf04d0e19872ee1b@mail.gmail.com>

Bruce Eckel:

> I would say that the troublesome meme is that "threads are easy." I
> posted an earlier, rather longish message about this. The gist of
> which was: "when someone says that threads are easy, I have no idea
> what they mean by it."

   I think you are overcomplicating the issue by looking at too many
levels at once. The memory model is something that implementers of
threading support need to understand. Users of that threading support
just need to know that concurrent access to variables is dangerous and
that they should use locks to access shared variables or use other
forms of packaged inter-thread communication.

   Double Checked Locking is an optimization (removal of a lock) of an
attempt to better modularize code (by automating the helper object
creation). I'd either just leave the lock in or if benchmarking
revealed an unacceptable performance problem, allocate the helper
object before the resource is accessible to more than one thread. For
statics, expose an Init method that gets called when the application
is in the initial one user thread state.
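
A sketch of the "just leave the lock in" option, with hypothetical names
(`Resource`, `_make_helper`): every call takes the lock, so there is no
unlocked fast path to get wrong.

```python
import threading

class Resource:
    def __init__(self):
        self._lock = threading.Lock()
        self._helper = None
        self.creations = 0  # only here so the behaviour can be checked

    def _make_helper(self):
        self.creations += 1
        return object()

    def helper(self):
        # Always lock; no unlocked "is it None?" test beforehand.
        with self._lock:
            if self._helper is None:
                self._helper = self._make_helper()
            return self._helper
```

The alternative mentioned above, creating the helper in `__init__` before the
object is visible to other threads, removes the need for the lock entirely.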

> But I just finished a 150-page chapter on Concurrency in Java which
> took many months to write, based on a large chapter on Concurrency in
> C++ which probably took longer to write. I keep in reasonably good
> touch with some of the threading experts. I can't get any of them to
> say that it's easy, even though they really do understand the issues
> and think about it all the time. *Because* of that, they say that it's
> hard.

   Implementing threading is hard. Using threading is not that hard.
It's a source of complexity but so are many aspects of development. I
get scared by reentrance in UI code.

   Neil

From greg.ewing at canterbury.ac.nz  Tue Oct 11 02:09:03 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 11 Oct 2005 13:09:03 +1300
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <434AE3C4.6030401@colorstudy.com>
References: <20051006221436.2892.JCARLSON@uci.edu>
	<20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>
	<20051006221436.2892.JCARLSON@uci.edu>
	<5.1.1.6.0.20051007140112.03b2fc80@mail.telecommunity.com>
	<434AE3C4.6030401@colorstudy.com>
Message-ID: <434B029F.3050000@canterbury.ac.nz>

Ian Bicking wrote:

> What the GIL-ranters don't get is that the GIL actually gives you just 
>  enough determinism to be able to write threaded programs that don't crash,

The GIL no doubt helps, but your threads can still get
preempted between bytecodes, so I can't see it making
much difference at the Python thought-level.

I'm wondering whether Python threads should be
non-preemptive by default. Preemptive threading is
massive overkill for many applications. You don't
need it, for example, if you just want to use threads
to structure your program, overlap processing with I/O,
etc.

Preemptive threading would still be there as an option
to turn on when you really need it.

Or perhaps there could be a priority system, with a
thread only able to be preempted by a thread of higher
priority. If you ignore priorities, all your threads
default to the same priority, so there's no preemption.
If you want a thread that can preempt others, you give
it a higher priority.
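
For what it's worth, the non-preemptive model can be approximated with
generators. A toy sketch (the `worker` and `run_round_robin` names are made
up): a task runs until it reaches an explicit `yield`, and nothing can
interrupt it between those points.

```python
from collections import deque

def worker(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        yield  # the only place this task can be switched out

def run_round_robin(tasks):
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # task finished; drop it
        ready.append(task)      # otherwise put it at the back of the queue

log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
# log == [("a", 0), ("b", 0), ("a", 1), ("b", 1), ("b", 2)]
```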

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From guido at python.org  Tue Oct 11 02:18:15 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 10 Oct 2005 17:18:15 -0700
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <434B029F.3050000@canterbury.ac.nz>
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>
	<20051006221436.2892.JCARLSON@uci.edu>
	<5.1.1.6.0.20051007140112.03b2fc80@mail.telecommunity.com>
	<434AE3C4.6030401@colorstudy.com> <434B029F.3050000@canterbury.ac.nz>
Message-ID: <ca471dc20510101718y5fdd5af3x1e5a93642a90a20e@mail.gmail.com>

On 10/10/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> I'm wondering whether Python threads should be
> non-preemptive by default. Preemptive threading is
> massive overkill for many applications. You don't
> need it, for example, if you just want to use threads
> to structure your program, overlap processing with I/O,
> etc.

I recall using a non-preemptive system in the past; in Amoeba, to be precise.

Initially it worked great.

But as we added more powerful APIs to the library, we started to run
into bugs that were just as if you had preemptive scheduling: it
wouldn't always be predictable whether a call into the library would
need to do I/O or not (it might use some sort of cache) so it would
sometimes allow other threads to run and sometimes not. Or a change to
the library would change this behavior (making a call that didn't use
to block into sometimes-blocking).

Given the tendency of Python developers to build layers of
abstractions I don't think it will help much.
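
The failure mode described above can be sketched with generators
standing in for cooperative threads. This is only an illustration
written for modern Python 3, not a model of Amoeba; `lookup`, `worker`,
`run` and `trace` are hypothetical names. The library call yields
(lets other tasks run) only on a cache miss, so the interleaving
depends on cache state that the caller cannot see:

```python
def lookup(key, cache):
    """Library call that yields only when it has to do 'I/O'."""
    if key not in cache:
        yield  # simulated blocking I/O: other tasks may run here
        cache[key] = key.upper()
    return cache[key]

def worker(name, key, cache, log):
    log.append(f"{name}: start")
    value = yield from lookup(key, cache)
    log.append(f"{name}: got {value}")

def run(tasks):
    """Minimal round-robin cooperative scheduler."""
    tasks = list(tasks)
    while tasks:
        task = tasks.pop(0)
        try:
            next(task)
            tasks.append(task)  # task yielded: requeue it
        except StopIteration:
            pass                # task finished

def trace(cache):
    log = []
    run([worker("A", "x", cache, log), worker("B", "x", cache, log)])
    return log

cold = trace({})          # cache miss: A and B interleave
warm = trace({"x": "X"})  # cache hit: A runs to completion first
```

With a cold cache the two workers interleave; with a warm cache each
runs to completion before the other starts, even though the calling
code is identical.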

> Preemptive threading would still be there as an option
> to turn on when you really need it.
>
> Or perhaps there could be a priority system, with a
> thread only able to be preempted by a thread of higher
> priority. If you ignore priorities, all your threads
> default to the same priority, so there's no preemption.
> If you want a thread that can preempt others, you give
> it a higher priority.

If you ask me, priorities are worse than the problem they are trying to solve.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From radeex at gmail.com  Tue Oct 11 02:31:09 2005
From: radeex at gmail.com (Christopher Armstrong)
Date: Tue, 11 Oct 2005 11:31:09 +1100
Subject: [Python-Dev] PEP 3000 and exec
In-Reply-To: <ca471dc20510101505t76a5455ak6af18582ff243b43@mail.gmail.com>
References: <didj3k$o0r$1@sea.gmane.org>
	<bbaeab100510101329l116efbao369d0b6710331a5d@mail.gmail.com>
	<17226.58512.451743.300957@montanaro.dyndns.org>
	<ca471dc20510101505t76a5455ak6af18582ff243b43@mail.gmail.com>
Message-ID: <60ed19d40510101731x61359b1evb3256aefc82369b0@mail.gmail.com>

On 10/11/05, Guido van Rossum <guido at python.org> wrote:
> My idea was to make the compiler smarter so that it would recognize
> exec() even if it was just a function.
>
> Another idea might be to change the exec() spec so that you are
> required to pass in a namespace (and you can't use locals() either!).
> Then the whole point becomes moot.

I think that's a great idea. It goes a step towards a more analyzable
Python, and really, I've never found a *good* use case for allowing
this invisible munging of locals. I would guess that it would simplify
the implementation, given that there are currently so many special
cases around exec, including when used with nested scopes.
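
A minimal sketch of the "always pass in a namespace" style, written
for Python 3 where exec() is already a function (`run_snippet` is a
hypothetical helper name, not an existing API). The executed code can
only bind names in the dictionary it is given, so the caller's locals
are never touched:

```python
def run_snippet(source):
    """Execute source in a fresh, explicit namespace and return it."""
    namespace = {}
    exec(source, namespace)
    return namespace

ns = run_snippet("answer = 6 * 7")
answer_value = ns["answer"]
```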

--
 Twisted   |  Christopher Armstrong: International Man of Twistery
  Radix    |    -- http://radix.twistedmatrix.com
           |  Release Manager, Twisted Project
 \\\V///   |    -- http://twistedmatrix.com
  |o O|    |
w----v----w-+

From greg.ewing at canterbury.ac.nz  Tue Oct 11 02:41:05 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 11 Oct 2005 13:41:05 +1300
Subject: [Python-Dev] PEP 3000 and exec
In-Reply-To: <bbaeab100510101515o109c7c58o9d7114364b1d4d22@mail.gmail.com>
References: <didj3k$o0r$1@sea.gmane.org>
	<bbaeab100510101329l116efbao369d0b6710331a5d@mail.gmail.com>
	<17226.58512.451743.300957@montanaro.dyndns.org>
	<bbaeab100510101515o109c7c58o9d7114364b1d4d22@mail.gmail.com>
Message-ID: <434B0A21.6070103@canterbury.ac.nz>

Brett Cannon wrote:

> But the better answer is we will just find a way.  =)

I think the best answer would be just to dump the idea of
exec-in-local-namespace altogether. I don't think I've
ever seen a use case for it that wasn't better done some
other way.

Most often it seems to be used to answer newbie "variable
variable" questions, to which the *correct* answer is
invariably "start thinking in Python, not bash/perl/tcl/PHP/
whatever."
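
For illustration, the usual "variable variable" request and the
dictionary-based alternative might look like this (a sketch only; the
function names are invented for the example):

```python
def exec_style(names):
    """What newcomers often reach for: build assignments as strings."""
    namespace = {}
    for index, name in enumerate(names):
        exec(f"{name} = {index}", namespace)
    # Drop the __builtins__ entry that exec() adds to the namespace.
    return {k: v for k, v in namespace.items() if k != "__builtins__"}

def dict_style(names):
    """The same result with an ordinary dictionary and no exec()."""
    return {name: index for index, name in enumerate(names)}

via_exec = exec_style(["alpha", "beta"])
via_dict = dict_style(["alpha", "beta"])
```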

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From bcannon at gmail.com  Tue Oct 11 02:53:18 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 10 Oct 2005 17:53:18 -0700
Subject: [Python-Dev] PEP 3000 and exec
In-Reply-To: <434B0A21.6070103@canterbury.ac.nz>
References: <didj3k$o0r$1@sea.gmane.org>
	<bbaeab100510101329l116efbao369d0b6710331a5d@mail.gmail.com>
	<17226.58512.451743.300957@montanaro.dyndns.org>
	<bbaeab100510101515o109c7c58o9d7114364b1d4d22@mail.gmail.com>
	<434B0A21.6070103@canterbury.ac.nz>
Message-ID: <bbaeab100510101753u28cba095uaf72b3a348196c90@mail.gmail.com>

On 10/10/05, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Brett Cannon wrote:
>
> > But the better answer is we will just find a way.  =)
>
> I think the best answer would be just to dump the idea of
> exec-in-local-namespace altogether. I don't think I've
> ever seen a use case for it that wasn't better done some
> other way.
>

I agree that 'exec' could really stand to be tweaked.  As it stands
now it is nasty to deal with when it comes to program analysis. 
Anything that will make that easier gets my vote.

-Brett

From radeex at gmail.com  Tue Oct 11 03:01:03 2005
From: radeex at gmail.com (Christopher Armstrong)
Date: Tue, 11 Oct 2005 12:01:03 +1100
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <ca471dc20510101718y5fdd5af3x1e5a93642a90a20e@mail.gmail.com>
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>
	<20051006221436.2892.JCARLSON@uci.edu>
	<5.1.1.6.0.20051007140112.03b2fc80@mail.telecommunity.com>
	<434AE3C4.6030401@colorstudy.com> <434B029F.3050000@canterbury.ac.nz>
	<ca471dc20510101718y5fdd5af3x1e5a93642a90a20e@mail.gmail.com>
Message-ID: <60ed19d40510101801i57e37379n4ed85a9703ba82e9@mail.gmail.com>

On 10/11/05, Guido van Rossum <guido at python.org> wrote:
> I recall using a non-preemptive system in the past; in Amoeba, to be precise.
>
> Initially it worked great.
>
> But as we added more powerful APIs to the library, we started to run
> into bugs that were just as if you had preemptive scheduling: it
> wouldn't always be predictable whether a call into the library would
> need to do I/O or not (it might use some sort of cache) so it would
> sometimes allow other threads to run and sometimes not. Or a change to
> the library would change this behavior (making a call that didn't use
> to block into sometimes-blocking).

I'm going to be giving a talk at OSDC (in Melbourne) this year about
concurrency systems, and I'm going to talk a lot about the subtleties
between these various non-preemptive (let's call them cooperative :)
systems. I advocate a system that gives you really
straightforward-looking code, but still requires you to annotate the
fact that context switches can occur on every frame where they might
occur (i.e., with a yield). I've given examples before of my new
2.5-yield + twisted Deferred code here, but to recap it just means
that you have to do:

def foo():
    x = yield getPage()
    return "Yay"

when you want to download a web page, and the caller of 'foo' would
*also* need to do something like "yay = yield foo()". I think this is
a very worthwhile tradeoff for those obsessed with "natural" code.
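
The shape of that style can be sketched without Twisted. The code
below is not Twisted's API: `get_page` is a stand-in that returns a
plain callable instead of a Deferred, `drive` is a toy driver, and it
assumes Python 3.3 or later, where a generator may return a value.
Every frame that can be suspended is visibly marked with a yield:

```python
import types

def get_page(url):
    """Hypothetical stand-in for a network fetch; returns a thunk."""
    return lambda: f"<html>{url}</html>"

def foo():
    page = yield get_page("http://example.invalid/")
    return f"Yay, {len(page)} bytes"

def caller():
    yay = yield foo()  # the caller must also mark the switch point
    return yay.upper()

def drive(gen):
    """Run a generator to completion, resolving each yielded request."""
    result = None
    while True:
        try:
            request = gen.send(result)
        except StopIteration as stop:
            return stop.value
        if isinstance(request, types.GeneratorType):
            result = drive(request)  # nested generator-based call
        else:
            result = request()       # pretend the I/O has completed

inner = drive(foo())
outer = drive(caller())
```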


--
  Twisted   |  Christopher Armstrong: International Man of Twistery
   Radix    |    -- http://radix.twistedmatrix.com
            |  Release Manager, Twisted Project
  \\\V///   |    -- http://twistedmatrix.com
   |o O|    |
w----v----w-+

From janssen at parc.com  Tue Oct 11 03:05:59 2005
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Oct 2005 18:05:59 PDT
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: Your message of "Mon, 10 Oct 2005 17:18:15 PDT."
	<ca471dc20510101718y5fdd5af3x1e5a93642a90a20e@mail.gmail.com> 
Message-ID: <05Oct10.180605pdt."58617"@synergy1.parc.xerox.com>

Guido writes:
> Given the tendency of Python developers to build layers of
> abstractions I don't think [non-preemptive threads] will help much.

I think that's right, although I think adding priorities to Python's
existing preemptive threads might be useful for real-time programmers
(yes, as machines continue to get faster people are writing real-time
software on top of VMs).

IMO, if one understands the issues of simultaneous memory access by
multiple threads, and understands condition variables (and their
underlying concept of mutexes), threads are pretty easy to use.
Getting into the habit of always writing thread-safe code is a good
idea, too.  It would be nice if some of these programming environments
(IDLE, Emacs, Eclipse, Visual Studio) provided better support for
analysis of threading issues in programs.  I'd love to have the
Interlisp thread inspector for Python.

I sympathize with Bruce's Java experience, though.  Java's original
threading design is one of the many misfeatures of that somewhat
horrible language (along with lack of multiple-inheritance, hybrid
types, omission of unsigned integers, static typing, etc.).
Synchronized methods are a weird way of presenting mutexes, IMO.
Java's condition variables don't (didn't?  has this been fixed?) quite
work.  The emphasis on portability and the resulting notions of
red/green threading packages at the beginning didn't help either.
Read Allen Holub's book.  And Doug Lea's book.  I understand much of
this has been addressed with a new package in Java 1.5.

Bill

From fdrake at acm.org  Tue Oct 11 03:09:37 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 10 Oct 2005 21:09:37 -0400
Subject: [Python-Dev] PythonCore\CurrentVersion
In-Reply-To: <1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com>
References: <4347A020.2050008@v.loewis.de>
	<DAELJHBGPBHPJKEBGGLNKEBMHDAD.mhammond@skippinet.com.au>
	<1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com>
Message-ID: <200510102109.37690.fdrake@acm.org>

On Monday 10 October 2005 18:42, Tim Peters wrote:
 > never before this year -- maybe sys.path _used_ to contain the current
 > directory on Linux?).

It's been a long time since this was the case on Unix of any variety; I 
*think* this changed to the current state back before 2.0.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From nnorwitz at gmail.com  Tue Oct 11 06:15:22 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 10 Oct 2005 21:15:22 -0700
Subject: [Python-Dev] problem with genexp
Message-ID: <ee2a432c0510102115m19581b97h79cc3df6e1dadd27@mail.gmail.com>

There's a problem with genexp's that I think really needs to get
fixed.  See http://python.org/sf/1167751; the details are below.  This
code:

>>> foo(a = i for i in range(10))

generates "NameError: name 'i' is not defined" when run because:

  2           0 LOAD_GLOBAL              0 (foo)
              3 LOAD_CONST               1 ('a')
              6 LOAD_GLOBAL              1 (i)
              9 CALL_FUNCTION          256
             12 POP_TOP
             13 LOAD_CONST               0 (None)
             16 RETURN_VALUE

If you add parens around the code:   foo(a = i for i in range(10))
You get something quite different:

  2           0 LOAD_GLOBAL              0 (foo)
              3 LOAD_CONST               1 ('a')
              6 LOAD_CONST               2 (<code object <generator expression> at 0x2a960baae8, file "<stdin>", line 2>)
              9 MAKE_FUNCTION            0
             12 LOAD_GLOBAL              1 (range)
             15 LOAD_CONST               3 (10)
             18 CALL_FUNCTION            1
             21 GET_ITER
             22 CALL_FUNCTION            1
             25 CALL_FUNCTION          256
             28 POP_TOP
             29 LOAD_CONST               0 (None)
             32 RETURN_VALUE

I agree with the bug report that the code should either raise a
SyntaxError or do the right thing.

n

From rrr at ronadam.com  Tue Oct 11 06:13:53 2005
From: rrr at ronadam.com (Ron Adam)
Date: Tue, 11 Oct 2005 00:13:53 -0400
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>
Message-ID: <434B3C01.5030001@ronadam.com>

Delaney, Timothy (Tim) wrote:
> Paul Du Bois wrote:
> 
> 
>>On 10/10/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>>>   cmd, *args = input.split()
>>
>>These examples also have a reasonable implementation using list.pop(),
>>albeit one that requires more typing.  On the plus side, it does not
>>violate 
>>DRY and is explicit about the error cases.
>>
>>  args = input.split()
>>  try:
>>    cmd = input.pop(0)
>>  except IndexError:
>>    cmd = ''
> 
> 
> I'd say you violated it right there ... (should have been)::
> 
>     args = input.split()
> 
>     try:
>         cmd = args.pop(0)
>     except IndexError:
>         cmd = ''
> 
> FWIW, I've been +1 on * unpacking since I first saw the proposal, and
> have yet to see a convincing argument against it other than people
> wanting to stick the * anywhere but at the end. Perhaps I'll take the
> stdlib challenge (unfortunately, I have to travel this weekend, but I'll
> see if I can make time).
> 
> Tim Delaney

I'm +1 for some way to do partial tuple unpacking, yet -1 on using the * 
symbol for that purpose outside of function calls.

The problem is the '*' means different things depending on where it's 
located.  In a function def, it means to group or to pack, but from the 
calling end it's used to unpack.  I don't expect it to change as it's 
been a part of Python for a long time and as long as it's only used with 
argument passing it's not too difficult to keep straight.

My concern is if it's used outside of functions, then on the left hand
side of assignments, it will be used to pack, but if used on the right
hand side it will be to unpack.  And if it becomes as commonplace as I
think it will, it will present confusing uses and/or situations where
you may have to think, "oh yeah, it's umm... unpacking here and umm...
packing there, but multiplying there".  The point is it could be a
stumbling block, especially for new Python users.  So I think a certain
amount of caution should be in order on this item.  At least check that
it doesn't cause confusing situations.

I really would like some form of easy and efficient tuple unpacking if
possible.  I've played around with using '/' and '-' to split and to
partially unpack lists, but it's probably better to use a named method.
That has the benefit of always reading the same.

Also packing tuples (other than in function defs) isn't needed if you 
have a way to do partial unpacking.

     a,b,c = alist[:2]+[alist[2:]]  # a,b,rest

Not the most efficient way I think, but maybe as a sequence method 
written in C it could be better?

Cheers,
    Ron

From guido at python.org  Tue Oct 11 06:55:58 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 10 Oct 2005 21:55:58 -0700
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <434B3C01.5030001@ronadam.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>
	<434B3C01.5030001@ronadam.com>
Message-ID: <ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>

On 10/10/05, Ron Adam <rrr at ronadam.com> wrote:
> The problem is the '*' means different things depending on where it's
> located.  In a function def, it means to group or to pack, but from the
> calling end it's used to unpack.  I don't expect it to change as it's
> been a part of Python for a long time and as long as it's only used with
> argument passing it's not too difficult to keep straight.
>
> My concern is if it's used outside of functions, then on the left hand
> side of assignments, it will be used to pack, but if used on the right
> hand side it will be to unpack.  And if it becomes as commonplace as I
> think it will, it will present confusing uses and/or situations where
> you may have to think, "oh yeah, it's umm... unpacking here and umm...
> packing there, but multiplying there".  The point is it could be a
> stumbling block, especially for new Python users.  So I think a certain
> amount of caution should be in order on this item.  At least check that
> it doesn't cause confusing situations.

This particular concern, I believe, is a fallacy. If you squint the
right way, using *rest for both packing and unpacking is totally
logical. If

    a, b, *rest = (1, 2, 3, 4, 5)

puts 1 into a, 2 into b, and (3, 4, 5) into rest, then it's totally
logical and symmetrical if after that

    x = a, b, *rest

puts (1, 2, 3, 4, 5) into x.

BTW, what should

    [a, b, *rest] = (1, 2, 3, 4, 5)

do? Should it set rest to (3, 4, 5) or to [3, 4, 5]? Suppose the
latter. Then should we allow

    [*rest] = x

as alternative syntax for

    rest = list(x)

? And then perhaps

    *rest = x

should mean

    rest = tuple(x)

Or should that be disallowed and would we have to write

    *rest, = x

analogous to singleton tuples?

There certainly is a need for doing the same from the end:

    *rest, a, b = (1, 2, 3, 4, 5)

could set rest to (1, 2, 3), a to 4, and b to 5. From there it's a
simple step towards

    a, b, *rest, d, e = (1, 2, 3, 4, 5)

meaning

    a, b, rest, d, e = (1, 2, (3,), 4, 5)

and so on. Where does it stop?
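
For comparison, Python 3 later adopted starred assignment targets
(PEP 3132), and Python 3.5 added starred items in tuple displays
(PEP 448). The sketch below shows how a current Python 3 interpreter
answers the questions above; it is not a statement about what was
decided in this thread, and the variable names are only illustrative.
Note that the starred target always receives a list:

```python
a, b, *rest = (1, 2, 3, 4, 5)        # rest is a list, not a tuple
x = a, b, *rest                      # packing back into a tuple (3.5+)

*front, y, z = (1, 2, 3, 4, 5)       # star at the start
c, d, *middle, e, g = (1, 2, 3, 4, 5)  # star in the middle

[*as_list] = "abc"                   # same result as list("abc")
*also_list, = "abc"                  # the singleton-tuple spelling

# A bare "*rest = x" is rejected at compile time.
try:
    compile("*rest = x", "<sketch>", "exec")
except SyntaxError:
    bare_star_allowed = False
else:
    bare_star_allowed = True
```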

BTW, and quite unrelated, I've always felt uncomfortable that you have to write

    f(a, b, foo=1, bar=2, *args, **kwds)

I've always wanted to write that as

    f(a, b, *args, foo=1, bar=2, **kwds)

but the current grammar doesn't allow it.

Still -0 on the whole thing,

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Tue Oct 11 07:10:18 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Mon, 10 Oct 2005 22:10:18 -0700
Subject: [Python-Dev] problem with genexp
In-Reply-To: <ee2a432c0510102115m19581b97h79cc3df6e1dadd27@mail.gmail.com>
References: <ee2a432c0510102115m19581b97h79cc3df6e1dadd27@mail.gmail.com>
Message-ID: <bbaeab100510102210p6f8014eay1cda07c6aace38af@mail.gmail.com>

On 10/10/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> There's a problem with genexp's that I think really needs to get
> fixed.  See http://python.org/sf/1167751; the details are below.  This
> code:
>
> >>> foo(a = i for i in range(10))
>
> generates "NameError: name 'i' is not defined" when run because:
[SNIP]
> If you add parens around the code:   foo(a = i for i in range(10))
> You get something quite different:

Do you mean having ``(foo(a = i for i in range(10))``?  Otherwise I
see no difference when compared to the first value.

-Brett

From nnorwitz at gmail.com  Tue Oct 11 07:14:44 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 10 Oct 2005 22:14:44 -0700
Subject: [Python-Dev] problem with genexp
In-Reply-To: <bbaeab100510102210p6f8014eay1cda07c6aace38af@mail.gmail.com>
References: <ee2a432c0510102115m19581b97h79cc3df6e1dadd27@mail.gmail.com>
	<bbaeab100510102210p6f8014eay1cda07c6aace38af@mail.gmail.com>
Message-ID: <ee2a432c0510102214s6a5253edo38f9e7fd9a26f04d@mail.gmail.com>

On 10/10/05, Brett Cannon <bcannon at gmail.com> wrote:
> On 10/10/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> > There's a problem with genexp's that I think really needs to get
> > fixed.  See http://python.org/sf/1167751; the details are below.  This
> > code:
> >
> > >>> foo(a = i for i in range(10))
> >
> > generates "NameError: name 'i' is not defined" when run because:
> [SNIP]
> > If you add parens around the code:   foo(a = i for i in range(10))
> > You get something quite different:
>
> Do you mean having ``(foo(a = i for i in range(10))``?  Otherwise I
> see no difference when compared to the first value.

Sorry, I think I put it in the bug report, but forgot to add it here:

>>> foo(a = (i for i in range(10)))

n

From martin at v.loewis.de  Tue Oct 11 08:16:53 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 11 Oct 2005 08:16:53 +0200
Subject: [Python-Dev] PythonCore\CurrentVersion
In-Reply-To: <200510102109.37690.fdrake@acm.org>
References: <4347A020.2050008@v.loewis.de>
	<DAELJHBGPBHPJKEBGGLNKEBMHDAD.mhammond@skippinet.com.au>
	<1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com>
	<200510102109.37690.fdrake@acm.org>
Message-ID: <434B58D5.7020206@v.loewis.de>

Fred L. Drake, Jr. wrote:
> On Monday 10 October 2005 18:42, Tim Peters wrote:
>  > never before this year -- maybe sys.path _used_ to contain the current
>  > directory on Linux?).
> 
> It's been a long time since this was the case on Unix of any variety; I 
> *think* this changed to the current state back before 2.0.

Please check again:

[GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> import sys
 >>> sys.path
['', '/usr/lib/python23.zip', '/usr/lib/python2.3', 
'/usr/lib/python2.3/plat-linux2', '/usr/lib/python2.3/lib-tk', 
'/usr/lib/python2.3/lib-dynload', 
'/usr/local/lib/python2.3/site-packages', 
'/usr/lib/python2.3/site-packages', 
'/usr/lib/python2.3/site-packages/Numeric', 
'/usr/lib/python2.3/site-packages/gtk-2.0', '/usr/lib/site-python']

We still have the empty string in sys.path, and it still
denotes the current directory.

Regards,
Martin

From greg.ewing at canterbury.ac.nz  Tue Oct 11 09:21:06 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 11 Oct 2005 20:21:06 +1300
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <434B3C01.5030001@ronadam.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>
	<434B3C01.5030001@ronadam.com>
Message-ID: <434B67E2.8060909@canterbury.ac.nz>

Ron Adam wrote:

> My concern is if it's used outside of functions, then on the left hand 
> side of assignments, it will be used to pack, but if used on the right 
> hand side it will be to unpack.

I don't see why that should be any more confusing than the
fact that commas denote tuple packing on the right and
unpacking on the left.

Greg


From greg.ewing at canterbury.ac.nz  Tue Oct 11 09:39:38 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 11 Oct 2005 20:39:38 +1300
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>
	<434B3C01.5030001@ronadam.com>
	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>
Message-ID: <434B6C3A.7020001@canterbury.ac.nz>

Guido van Rossum wrote:

> BTW, what should
> 
>     [a, b, *rest] = (1, 2, 3, 4, 5)
> 
> do? Should it set rest to (3, 4, 5) or to [3, 4, 5]?

Whatever type is chosen, it should be the same type, always.
The rhs could be any iterable, not just a tuple or a list.
Making a special case of preserving one or two types doesn't
seem worth it to me.

> Suppose the latter. Then should we allow
> 
>     [*rest] = x
> 
> as alternative syntax for
> 
>     rest = list(x)

That would be a consequence of that choice, yes, but so what?
There are already infinitely many ways of writing any expression.

> ? And then perhaps
> 
>     *rest = x
> 
> should mean
> 
>     rest = tuple(x)
> 
> Or should that be disallowed

Why bother? What harm would result from the ability to write that?

> There certainly is a need for doing the same from the end:
> 
>     *rest, a, b = (1, 2, 3, 4, 5)

I wouldn't mind at all if *rest were only allowed at the end.
There's a pragmatic reason for that if nothing else: the rhs
can be any iterable, and there's no easy way of getting "all
but the last n" items from a general iterable.
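
To make the difficulty concrete: getting "all but the last n" items
from a one-shot iterable means buffering n items ahead before anything
can be released. A minimal sketch (`split_tail` is a hypothetical
helper, not an existing function):

```python
from collections import deque
from itertools import islice

def split_tail(iterable, n):
    """Return (rest, tail), where tail holds the last n items."""
    it = iter(iterable)
    if n == 0:
        return list(it), []
    window = deque(islice(it, n))  # look-ahead buffer of n items
    if len(window) < n:
        raise ValueError("need at least %d items" % n)
    rest = []
    for item in it:
        # An item only leaves the buffer once a newer one has arrived.
        rest.append(window.popleft())
        window.append(item)
    return rest, list(window)

rest, tail = split_tail(iter(range(1, 6)), 2)
```

Here the whole input still has to be consumed before the tail is
known, which is the asymmetry with a trailing *rest.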

> Where does it stop?

For me, it stops with *rest only allowed at the end, and
always yielding a predictable type (which could be either tuple
or list, I don't care).

> BTW, and quite unrelated, I've always felt uncomfortable that you have to write
> 
>     f(a, b, foo=1, bar=2, *args, **kwds)
> 
> I've always wanted to write that as
> 
>     f(a, b, *args, foo=1, bar=2, **kwds)

Yes, I'd like that too, with the additional meaning that
foo and bar can only be specified by keyword, not by
position.
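
Python 3 ended up accepting both the call order and the keyword-only
meaning. A short sketch of how that reads on a current Python 3
interpreter (the function `f` and its values are only illustrative):

```python
def f(a, b, *args, foo=1, bar=2, **kwds):
    """foo and bar follow *args, so they can only be given by keyword."""
    return a, b, args, foo, bar, kwds

# Extra positionals land in args and never reach foo or bar.
by_position = f(1, 2, 3, 4, bar=5, baz=6)

# Keywords may follow *args at the call site as well.
extra = (3, 4)
options = {"baz": 6}
by_keyword = f(1, 2, *extra, foo=10, bar=20, **options)
```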

Greg

From ncoghlan at gmail.com  Tue Oct 11 11:05:36 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 11 Oct 2005 19:05:36 +1000
Subject: [Python-Dev] Fwd: defaultproperty
In-Reply-To: <bbaeab100510101333s317d1a98r2b2af3b93b7788ad@mail.gmail.com>
References: <433AA5AC.6040509@zope.com>	<ca471dc205092807477ff2d0f1@mail.gmail.com>	<433BA3CF.1090205@zope.com>
	<43494648.6040904@zope.com>	<4349997E.9010208@gmail.com>	<76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com>	<76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com>	<1128955292.27841.2.camel@geddy.wooz.org>
	<bbaeab100510101333s317d1a98r2b2af3b93b7788ad@mail.gmail.com>
Message-ID: <434B8060.6070903@gmail.com>

Brett Cannon wrote:
> On 10/10/05, Barry Warsaw <barry at python.org> wrote:
> 
>>On Mon, 2005-10-10 at 01:47, Calvin Spealman wrote:
>>
>>
>>>Never created for a reason? lumping things together for having the
>>>similar usage semantics, but unrelated purposes, might be something to
>>>avoid and maybe that's why it hasn't happened yet for decorators. If
>>>ever there was a makethreadsafe decorator, it should go in the thread
>>>module, etc. I mean, come on, it's like making a module just to store a
>>>bunch of unrelated types just to lump them together because they're
>>>types. Who wants that?
>>
>>Like itertools?
>>
>>+1 for a decorators module.
> 
> 
> +1 from me as well.  And placing defaultproperty in there makes sense
> if it is meant to be used as a decorator and not viewed as some spiffy
> descriptor.
> 
> Should probably work in Michael's update_meta() function as well
> (albeit maybe with a different name since I think I remember Guido
> saying he didn't like the name).

I thought mimic was a nice name:

   @mimic(func)
   def wrapper(*args, **kwds):
      return func(*args, **kwds)
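
One possible shape for such a decorator, as a sketch only (this is an
assumption about what mimic might do, not Michael's update_meta()
code; the standard library's functools.wraps, added in Python 2.5,
covers the same ground):

```python
def mimic(func):
    """Return a decorator that makes a wrapper look like func."""
    def decorator(wrapper):
        wrapper.__name__ = func.__name__
        wrapper.__doc__ = func.__doc__
        wrapper.__dict__.update(func.__dict__)
        return wrapper
    return decorator

def greet(name):
    """Return a greeting."""
    return "hello " + name

@mimic(greet)
def wrapper(*args, **kwds):
    return greet(*args, **kwds)
```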

As a location for this, I would actually suggest a module called something 
like "metatools", rather than "decorators". The thing these have in common is
that they're about manipulating the way functions and the like interact with 
the Python language infrastructure - they're tools to make metaprogramming a 
bit easier.

If "contextmanager" isn't made a builtin, this module would also be the place 
for it.

Ditto for any standard context managers (such as closing()) which aren't made 
builtins.

At the moment, the only location for such things is the builtin namespace 
(e.g. classmethod, staticmethod).

Regardless, a short PEP is needed to:
   a. pick a name for the module
   b. decide precisely what will be in it for Python 2.5

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at iinet.net.au  Tue Oct 11 12:02:39 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Tue, 11 Oct 2005 20:02:39 +1000
Subject: [Python-Dev] Making Queue.Queue easier to use
Message-ID: <434B8DBF.9080509@iinet.net.au>

The multi-processing discussion reminded me that I have a few problems I run 
into every time I try to use Queue objects.

My first problem is finding it:

Py> from threading import Queue # Nope
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
ImportError: cannot import name Queue
Py> from Queue import Queue # Ah, there it is

What do people think of the idea of adding an alias to Queue into the 
threading module so that:
    a) the first line above works; and
    b) Queue can be documented with all of the other threading primitives,
       rather than being off somewhere else in its own top-level section.

My second problem is with the current signatures of the put() and get() 
methods. Specifically, the following code blocks forever instead of raising an 
Empty exception after 500 milliseconds as one might expect:
   from Queue import Queue
   x = Queue()
   x.get(0.5)

I assume the current signature is there for backward compatibility with the 
original version that didn't support timeouts (considering the difficulty of 
telling the difference between "x.get(1)" and "True = 1; x.get(True)" from 
inside the get() method)

However, the need to write "x.get(True, 0.5)" seems seriously redundant, given 
that a single parameter can actually handle all the options (as is currently
the case with Condition.wait()).

The "put_nowait" and "get_nowait" functions are fine, because they serve a 
useful documentation purpose at the calling point (particularly given the 
current clumsy timeout signature).

What do people think of the idea of adding "put_wait" and "get_wait" methods 
with the signatures:
   put_wait(item, [timeout=None])
   get_wait([timeout=None])
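
As a rough sketch of the proposed behaviour, the two methods can be
written as plain helper functions over the existing API. This is not
the patch itself: it uses the Python 3 module name `queue` (the module
is `Queue` in Python 2), and takes the queue as an explicit first
argument instead of being a method:

```python
import queue

def get_wait(q, timeout=None):
    """Block for up to timeout seconds (forever if None)."""
    return q.get(True, timeout)

def put_wait(q, item, timeout=None):
    """Block for up to timeout seconds (forever if None)."""
    q.put(item, True, timeout)

empty_queue = queue.Queue()
try:
    get_wait(empty_queue, 0.05)
    get_timed_out = False
except queue.Empty:
    get_timed_out = True

full_queue = queue.Queue(maxsize=1)
put_wait(full_queue, "first")
try:
    put_wait(full_queue, "second", 0.05)
    put_timed_out = False
except queue.Full:
    put_timed_out = True
```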

Optionally, the existing "put" and "get" methods could be deprecated, with the 
goal of eventually changing their signature to match the put_wait and get_wait 
methods above.

If people are amenable to these ideas, I should be able to work up a patch for 
them this week.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Tue Oct 11 12:04:07 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 11 Oct 2005 20:04:07 +1000
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <1128955532.32340.141.camel@parabolic.corp.google.com>
References: <20051006143740.287E.JCARLSON@uci.edu>	
	<200510070145.17284.ms@cerenity.org>	<20051006221436.2892.JCARLSON@uci.edu>	
	<415220344.20051007104751@MailBlocks.com>
	<4346FC98.5050504@gmail.com>
	<1128955532.32340.141.camel@parabolic.corp.google.com>
Message-ID: <434B8E17.9040105@gmail.com>

Donovan Baarda wrote:
> On Fri, 2005-10-07 at 23:54, Nick Coghlan wrote:
> [...]
> 
>>The few times I have encountered anyone saying anything resembling "threading 
>>is easy", it was because the full sentence went something like "threading is 
>>easy if you use message passing and copy-on-send or release-reference-on-send 
>>to communicate between threads, and limit the shared data structures to those 
>>required to support the messaging infrastructure". And most of the time there 
>>was an implied "compared to using semaphores and locks directly, " at the start.
> 
> 
> LOL! So threading is easy if you restrict inter-thread communication to
> message passing... and what makes multi-processing hard is your only
> inter-process communication mechanism is message passing :-)
> 
> Sounds like yet another reason to avoid threading and use processes
> instead... effort spent on threading based message passing
> implementations could instead be spent on inter-process messaging.
> 

Actually, I think it makes it worth building a decent message-passing paradigm 
(like, oh, PEP 342) that can then be scaled using backends with four different 
levels of complexity:
   - logical threading (generators)
   - physical threading (threading.Thread and Queue.Queue)
   - multiple processing
   - distributed processing
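
The "physical threading" level in the list above might look like the
following sketch, written for Python 3 (`queue` is the Python 3 name
of the Queue module). The thread shares nothing with its caller except
the two queues, and a None message is used here as an arbitrary
shutdown signal:

```python
import queue
import threading

def worker(inbox, outbox):
    """Double every message until the None sentinel arrives."""
    while True:
        message = inbox.get()
        if message is None:
            break
        outbox.put(message * 2)

inbox = queue.Queue()
outbox = queue.Queue()
thread = threading.Thread(target=worker, args=(inbox, outbox))
thread.start()

for number in (1, 2, 3):
    inbox.put(number)
inbox.put(None)   # ask the worker to stop
thread.join()

results = [outbox.get() for _ in range(3)]
```

Swapping the thread and queues for a process and a pipe would leave
the worker loop itself largely unchanged, which is the appeal of the
layering.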

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Tue Oct 11 12:06:31 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 11 Oct 2005 20:06:31 +1000
Subject: [Python-Dev] problem with genexp
In-Reply-To: <ee2a432c0510102115m19581b97h79cc3df6e1dadd27@mail.gmail.com>
References: <ee2a432c0510102115m19581b97h79cc3df6e1dadd27@mail.gmail.com>
Message-ID: <434B8EA7.5040205@gmail.com>

Neal Norwitz wrote:
> There's a problem with genexp's that I think really needs to get
> fixed.  See http://python.org/sf/1167751 the details are below.  This
> code:
> I agree with the bug report that the code should either raise a
> SyntaxError or do the right thing.

I agree it should be a SyntaxError - I believe the AST compiler actually 
raises one in this situation.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Tue Oct 11 12:14:46 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 11 Oct 2005 20:14:46 +1000
Subject: [Python-Dev] C.E.R. Thoughts
In-Reply-To: <78d129adb4581d24b1d07844019a2afe@gmail.com>
References: <78d129adb4581d24b1d07844019a2afe@gmail.com>
Message-ID: <434B9096.5040100@gmail.com>

jamesr wrote:
> Congratulations heartily given. I missed the ternary op in c... Way to 
> go! clean and easy and now i can do:
> 
> if ((sys.argv[1] =='debug') if len(sys.argv) > 1 else False):
> 	pass
> 
> and check variables IF AND ONLY if they exist, in a single line!
> 
> but y'all knew that..

Yep, it was a conscious decision to add a construct with the *potential* to be 
abused for use in places where the existing "and" and "or" expressions *are* 
being abused and resulting in buggy code.

The code in your example is lousy because it's unreadable (and there are far 
more readable alternatives like a simple short-circuiting usage of "and"), but 
at least it's semantically correct (whereas the same can't be said for the 
current abuse of "and" and "or").

If code using a conditional expression is unclear, blame the programmer for 
choosing to write the code, don't blame the existence of the conditional 
expression :)
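
A small sketch of the semantic difference being referred to, assuming nothing beyond the two spellings themselves (the function names are illustrative only):

```python
def pick_and_or(cond, if_true, if_false):
    # Old idiom: silently wrong whenever if_true is itself falsy.
    return cond and if_true or if_false

def pick_conditional(cond, if_true, if_false):
    # Conditional expression: always returns the selected branch.
    return if_true if cond else if_false

print(pick_and_or(True, 0, 99))        # 99 (the bug: 0 was wanted)
print(pick_conditional(True, 0, 99))   # 0
```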

We're-all-adults-here-ly yours,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Tue Oct 11 12:25:44 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 11 Oct 2005 20:25:44 +1000
Subject: [Python-Dev] PEP 3000 and exec
In-Reply-To: <ca471dc20510101505t76a5455ak6af18582ff243b43@mail.gmail.com>
References: <didj3k$o0r$1@sea.gmane.org>	<bbaeab100510101329l116efbao369d0b6710331a5d@mail.gmail.com>	<17226.58512.451743.300957@montanaro.dyndns.org>
	<ca471dc20510101505t76a5455ak6af18582ff243b43@mail.gmail.com>
Message-ID: <434B9328.4030105@gmail.com>

Guido van Rossum wrote:
> My idea was to make the compiler smarter so that it would recognize
> exec() even if it was just a function.
> 
> Another idea might be to change the exec() spec so that you are
> required to pass in a namespace (and you can't use locals() either!).
> Then the whole point becomes moot.

I vote for the latter option. Particularly if something like Namespace objects 
make their way into the standard lib before Py3k (a Namespace object is 
essentially designed to provide attribute style lookup into a string-keyed 
dictionary- you can fake it pretty well with an empty class, but there are a 
few quirks with doing it that way).
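
A minimal sketch of the combination, written for Python 3 where exec() is already a function. The Namespace class here is a hypothetical stand-in for the kind of object meant above, not an existing standard library type:

```python
class Namespace(object):
    """Hypothetical helper: attribute-style access to a string-keyed dict."""
    def __init__(self, mapping):
        # Share the mapping itself, so writes through either view are visible.
        self.__dict__ = mapping

scope = {"x": 20}
exec("y = x + 1", scope)     # explicit namespace; the caller's locals are untouched
ns = Namespace(scope)
print(ns.y)                  # 21
ns.z = 5
print(scope["z"])            # 5
```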

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Tue Oct 11 12:36:59 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 11 Oct 2005 20:36:59 +1000
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <746109444.20051010141518@MailBlocks.com>
References: Your message of "Mon, 10 Oct 2005 11:20:31
	PDT."	<17226.45295.661911.542400@montanaro.dyndns.org>	<05Oct10.122654pdt."58617"@synergy1.parc.xerox.com>
	<746109444.20051010141518@MailBlocks.com>
Message-ID: <434B95CB.5000107@gmail.com>

Bruce Eckel wrote:
>>Yes, there's a troublesome meme in the world: "threads are hard".
>>They aren't, really.  You just have to know what you're doing.
> 
> 
> I would say that the troublesome meme is that "threads are easy." I
> posted an earlier, rather longish message about this. The gist of
> which was: "when someone says that threads are easy, I have no idea
> what they mean by it."
> 
> Perhaps this means "threads in Python are easier than threads in other
> languages."

One key thing is that Python is so dynamic that the compiler can't get too 
fancy with the order in which it does things. However, Python threading has 
its own traps for the unwary (mainly related to badly-behaved C extensions, 
but they're still traps).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From steve at holdenweb.com  Tue Oct 11 13:06:39 2005
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 11 Oct 2005 12:06:39 +0100
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <746109444.20051010141518@MailBlocks.com>
References: Your message of "Mon, 10 Oct 2005 11:20:31
	PDT."	<17226.45295.661911.542400@montanaro.dyndns.org>	<05Oct10.122654pdt."58617"@synergy1.parc.xerox.com>
	<746109444.20051010141518@MailBlocks.com>
Message-ID: <dig6c2$kcg$1@sea.gmane.org>

Bruce Eckel wrote:
[Bill Janssen]
>>Yes, there's a troublesome meme in the world: "threads are hard".
>>They aren't, really.  You just have to know what you're doing.
> 
But that begs the question, because there is a significant amount of 
evidence that when it comes to threads "knowing what you are doing" is 
hard to the point that people can *think* they do when they demonstrably 
don't!
> 
> I would say that the troublesome meme is that "threads are easy." I
> posted an earlier, rather longish message about this. The gist of
> which was: "when someone says that threads are easy, I have no idea
> what they mean by it."
> 
I would suggest that the truth lies in the middle ground, and would say 
that "you can get yourself into a lot of trouble using threads without 
considering the subtleties". It's an area where anything but the most 
simplistic solutions are almost always wrong at some point.

> Perhaps this means "threads in Python are easier than threads in other
> languages."
> 
> But I just finished a 150-page chapter on Concurrency in Java which
> took many months to write, based on a large chapter on Concurrency in
> C++ which probably took longer to write. I keep in reasonably good
> touch with some of the threading experts. I can't get any of them to
> say that it's easy, even though they really do understand the issues
> and think about it all the time. *Because* of that, they say that it's
> hard.
> 
> So alright, I'll take the bait that you've laid down more than once,
> now. Perhaps you can go beyond saying that "threads really aren't
> hard" and explain the aspects of them that seem so easy to you.
> Perhaps you can give a nice clear explanation of cache coherency and
> memory barriers in multiprocessor machines? Or explain atomicity,
> volatility and visibility? Or, even better, maybe you can come up with
> a better concurrency model, which is what I think most of us are
> looking for in this discussion.
> 
The nice thing about Python threads (or rather threading.Thread) is 
that since each thread is an instance it's *relatively* easy to ensure 
that a thread restricts itself to manipulating thread-local resources 
(i.e. instance members).

This makes it possible to write algorithms parameterized for the number 
of "worker threads" where the workers are taking their tasks off a Queue 
with entries generated by a single producer thread. With care, multiple 
producers can be used. More complex inter-thread communications are 
problematic, and arbitrary access to foreign-thread state is a nightmare 
(although the position has been somewhat alleviated by the introduction 
of threading.local).

Beyond the single-producer many-consumers model there is still plenty of 
room to shoot yourself in the foot. In the case of threads true 
sophistication is staying away from the difficult cases, an option which 
unfortunately isn't always available in the real world.
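
The single-producer, many-workers arrangement described above can be sketched roughly as follows. This assumes Python 3, where the Queue module is spelled queue; the function names and the None sentinel are choices made for this example:

```python
import queue
import threading

def worker(tasks, results):
    """Take items off the task queue until the sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:         # sentinel: no more work for this worker
            break
        results.put(item * item)

def run(numbers, num_workers=3):
    numbers = list(numbers)
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for n in numbers:            # the single producer
        tasks.put(n)
    for _ in threads:            # one sentinel per worker
        tasks.put(None)
    for t in threads:
        t.join()
    return sorted(results.get() for _ in numbers)

print(run(range(5)))   # [0, 1, 4, 9, 16]
```

The workers touch only their arguments and the two queues, which is the discipline that keeps this case manageable.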

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/


From jeremy at alum.mit.edu  Tue Oct 11 15:40:56 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Tue, 11 Oct 2005 09:40:56 -0400
Subject: [Python-Dev] problem with genexp
In-Reply-To: <434B8EA7.5040205@gmail.com>
References: <ee2a432c0510102115m19581b97h79cc3df6e1dadd27@mail.gmail.com>
	<434B8EA7.5040205@gmail.com>
Message-ID: <e8bf7a530510110640g11504363jaa1b2a34150d2f76@mail.gmail.com>

On 10/11/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Neal Norwitz wrote:
> > There's a problem with genexp's that I think really needs to get
> > fixed.  See http://python.org/sf/1167751 the details are below.  This
> > code:
> > I agree with the bug report that the code should either raise a
> > SyntaxError or do the right thing.
>
> I agree it should be a SyntaxError - I believe the AST compiler actually
> raises one in this situation.

Could someone add a test for this on the AST branch?

BTW, it looks like doctest is the way to go for SyntaxError tests. 
There are older tests, like test_scope.py, that use separate files
with bad syntax (and lots of extra kludges in the infrastructure to
ignore the fact that those .py files can't be compiled).

Jeremy

From ncoghlan at gmail.com  Tue Oct 11 15:44:59 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 11 Oct 2005 23:44:59 +1000
Subject: [Python-Dev] problem with genexp
In-Reply-To: <434B8EA7.5040205@gmail.com>
References: <ee2a432c0510102115m19581b97h79cc3df6e1dadd27@mail.gmail.com>
	<434B8EA7.5040205@gmail.com>
Message-ID: <434BC1DB.4040806@gmail.com>

Nick Coghlan wrote:
> Neal Norwitz wrote:
> 
>>There's a problem with genexp's that I think really needs to get
>>fixed.  See http://python.org/sf/1167751 the details are below.  This
>>code:
>>I agree with the bug report that the code should either raise a
>>SyntaxError or do the right thing.
> 
> 
> I agree it should be a SyntaxError - I believe the AST compiler actually 
> raises one in this situation.

I was half right. Both the normal compiler and the AST compiler give a 
SyntaxError if you write:

   foo((a=i for i in range(10)))

The problem is definitely on the parser end though:

Py> compiler.parse("foo(x=i for i in range(10))")
Module(None, Stmt([Discard(CallFunc(Name('foo'), [Keyword('x', Name('i'))], 
None, None))]))

It's getting to what looks like a valid keyword argument in "x=i" and throwing 
the rest of it away, when it should be flagging a syntax error (the parser's 
limited lookahead should still be enough to spot the erroneous 'for' keyword 
and bail out). The error will be even more obscure if there is an "i" visible 
from the location of the function call.

Whereas when it's parenthesised correctly, the parse tree looks more like this:
Py> compiler.parse("foo(x=(i for i in range(10)))")
Module(None, Stmt([Discard(CallFunc(Name('foo'), [Keyword('x', 
GenExpr(GenExprInner(Name('i'), [GenExprFor(AssName('i', 'OP_ASSIGN'), 
CallFunc(Name('range'), [Const(10)], None, None), [])])))], None, None))]))
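
For anyone wanting to turn this into a regression test, a sketch along these lines should do. It uses the ast module of current Python 3 rather than the compiler package shown above, so it checks present-day behaviour and not the 2005 parser; parses is a helper invented for the example:

```python
import ast

def parses(source):
    """Return True if the source parses, False on SyntaxError."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True

print(parses("foo(x=i for i in range(10))"))     # False
print(parses("foo(x=(i for i in range(10)))"))   # True
```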

Cheers,
Nick.

P.S. I added a comment showing the parser output to the SF bug report.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From tim.peters at gmail.com  Tue Oct 11 15:51:06 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 11 Oct 2005 09:51:06 -0400
Subject: [Python-Dev] PythonCore\CurrentVersion
In-Reply-To: <434B58D5.7020206@v.loewis.de>
References: <4347A020.2050008@v.loewis.de>
	<DAELJHBGPBHPJKEBGGLNKEBMHDAD.mhammond@skippinet.com.au>
	<1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com>
	<200510102109.37690.fdrake@acm.org> <434B58D5.7020206@v.loewis.de>
Message-ID: <1f7befae0510110651o504958det5d2409b3f724070e@mail.gmail.com>

[Tim Peters]
>>> never before this year -- maybe sys.path _used_ to contain the current
>>> directory on Linux?).

[Fred L. Drake, Jr.]
>> It's been a long time since this was the case on Unix of any variety; I
>> *think* this changed to the current state back before 2.0.

[Martin v. Löwis]
> Please check again:
>
> [GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> import sys
>  >>> sys.path
> ['', '/usr/lib/python23.zip', '/usr/lib/python2.3',
> '/usr/lib/python2.3/plat-linux2', '/usr/lib/python2.3/lib-tk',
> '/usr/lib/python2.3/lib-dynload',
> '/usr/local/lib/python2.3/site-packages',
> '/usr/lib/python2.3/site-packages',
> '/usr/lib/python2.3/site-packages/Numeric',
> '/usr/lib/python2.3/site-packages/gtk-2.0', '/usr/lib/site-python']
>
> We still have the empty string in sys.path, and it still
> denotes the current directory.

Well, that's in interactive mode, and I see sys.path[0] == "" on both
Windows and Linux then.  I don't see "" in sys.path on either box in
batch mode, although I do see the absolutized path to the current
directory in sys.path in batch mode on Windows but not on Linux -- but
Mark Hammond says he doesn't see (any form of) the current directory
in sys.path in batch mode on Windows.

It's a bit confusing ;-)

From fumanchu at amor.org  Tue Oct 11 16:46:40 2005
From: fumanchu at amor.org (Robert Brewer)
Date: Tue, 11 Oct 2005 07:46:40 -0700
Subject: [Python-Dev] Pythonic concurrency
References: Your message of "Mon, 10 Oct 2005 11:20:31
	PDT."	<17226.45295.661911.542400@montanaro.dyndns.org>	<05Oct10.122654pdt."58617"@synergy1.parc.xerox.com>
	<746109444.20051010141518@MailBlocks.com>
	<dig6c2$kcg$1@sea.gmane.org>
Message-ID: <A77618B80CDD2543B705C82B7665D9F9012B0490@ex9.hostedexchange.local>

Steve Holden wrote:
> The nice thing about Python threads (or rather threading.Thread) is
> that since each thread is an instance it's *relatively* easy to ensure
> that a thread restricts itself to manipulating thread-local resources
> (i.e. instance members).
> 
> This makes it possible to write algorithms parameterized for the number
> of "worker threads" where the workers are taking their tasks off a Queue
> with entries generated by a single producer thread. With care, multiple
> producers can be used. More complex inter-thread communications are
> problematic, and arbitrary access to foreign-thread state is a nightmare
> (although the position has been somewhat alleviated by the introduction
> of threading.local).

"Somewhat alleviated" and somewhat worsened. I've had half a dozen conversations in the last year about sharing data between threads; in every case, I've had to work quite hard to convince the other person that threading.local is *not* magic pixie thread dust. Each time, they had come to the conclusion that if they had a global variable, they could just stick a reference to it into a threading.local object and instantly have safe, concurrent access to it.


Robert Brewer
System Architect
Amor Ministries
fumanchu at amor.org

From ncoghlan at gmail.com  Tue Oct 11 16:51:30 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Oct 2005 00:51:30 +1000
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <434B6C3A.7020001@canterbury.ac.nz>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>	<434B3C01.5030001@ronadam.com>	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>
	<434B6C3A.7020001@canterbury.ac.nz>
Message-ID: <434BD172.3030004@gmail.com>

Greg Ewing wrote:
> Guido van Rossum wrote:
> 
> 
>>BTW, what should
>>
>>    [a, b, *rest] = (1, 2, 3, 4, 5)
>>
>>do? Should it set rest to (3, 4, 5) or to [3, 4, 5]?
> 
> 
> Whatever type is chosen, it should be the same type, always.
> The rhs could be any iterable, not just a tuple or a list.
> Making a special case of preserving one or two types doesn't
> seem worth it to me.

And, for consistency with functions, the type chosen should be a tuple.

I'm also trying to figure out why you would ever write:
   [a, b, c, d] = seq

instead of:
   a, b, c, d = seq

or:
   (a, b, c, d) = seq

It's not like the square brackets generate different code:

Py> def foo():
...     x, y = 1, 2
...     (x, y) = 1, 2
...     [x, y] = 1, 2
...
Py> dis.dis(foo)
   2           0 LOAD_CONST               3 ((1, 2))
               3 UNPACK_SEQUENCE          2
               6 STORE_FAST               1 (x)
               9 STORE_FAST               0 (y)

   3          12 LOAD_CONST               4 ((1, 2))
              15 UNPACK_SEQUENCE          2
              18 STORE_FAST               1 (x)
              21 STORE_FAST               0 (y)

   4          24 LOAD_CONST               5 ((1, 2))
              27 UNPACK_SEQUENCE          2
              30 STORE_FAST               1 (x)
              33 STORE_FAST               0 (y)
              36 LOAD_CONST               0 (None)
              39 RETURN_VALUE

So my vote would actually go for deprecating the use of square brackets to 
surround an assignment target list - it makes it look like an actual list 
object should be involved somewhere, but there isn't one.
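
The same comparison can be made without reading disassembly by hand. This is a sketch for current CPython 3, where I would expect all three spellings to compile to identical bytecode; unpack_bytecode is a name made up here:

```python
def unpack_bytecode(target):
    """Compile 'target = seq' and return the raw bytecode."""
    code = compile(target + " = seq", "<example>", "exec")
    return code.co_code

plain = unpack_bytecode("x, y")
print(plain == unpack_bytecode("(x, y)"))
print(plain == unpack_bytecode("[x, y]"))
```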

>>? And then perhaps
>>
>>    *rest = x
>>
>>should mean
>>
>>    rest = tuple(x)
>>
>>Or should that be disallowed
> 
> Why bother? What harm would result from the ability to write that?

Given that:
   def foo(*args):
       print args

is legal, I would have no problem with "*rest = x" being legal.

>>There certainly is a need for doing the same from the end:
>>
>>    *rest, a, b = (1, 2, 3, 4, 5)
> 
> 
> I wouldn't mind at all if *rest were only allowed at the end.
> There's a pragmatic reason for that if nothing else: the rhs
> can be any iterable, and there's no easy way of getting "all
> but the last n" items from a general iterable.

Agreed. The goal here is to make the name binding rules consistent between for 
loops, tuple assignment and function entry, not to create different rules.

>>Where does it stop?
> For me, it stops with *rest only allowed at the end, and
> always yielding a predictable type (which could be either tuple
> or list, I don't care).

For me, it stops when the rules for positional name binding are more 
consistent across operations that bind names (although complete consistency 
isn't possible, given that function calls don't unpack sequences automatically).

Firstly, let's list the operations that permit name binding to a list of 
identifiers:
   - binding of function parameters to function arguments
   - binding of assignment target list to assigned sequence
   - binding of iteration variables to iteration values

However, that function argument case needs to be recognised as a two step 
operation, whereby the arguments are *always* packed into a tuple before being 
bound to the parameters.

That is something very vaguely like:
   if numargs > 0:
     if numargs == 1:
       argtuple = args, # One argument gives singleton tuple
     else:
       argtuple = args # More arguments gives appropriate tuple
     argtuple += tuple(starargs) # Extended arguments are added to the tuple
     param1, param2, *rest = argtuple # Tuple is unpacked to parameters

This means that the current behaviour of function parameters is actually the 
same as assignment target lists and iteration variables, in that the argument 
tuple is *always* unpacked into the parameter list - the only difference is 
that a single argument is always considered a singleton tuple. You can get the 
same behaviour with target lists and iteration variables by only using tuples 
of identifiers as targets (i.e., use "x," rather than just "x").

So the proposal at this stage is simply to mimic the unpacking of the argument 
tuple into the formal parameter list in the other two name list binding cases, 
such that the pseudocode above would actually do the same thing as building an 
argument list and binding it to its formal parameters does.
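
As a concrete comparison, here is how the two bindings look in Python 3, which postdates this thread and did eventually accept a starred assignment target. Note one difference from the proposal as stated here: the starred target is bound to a list, while *args in a function is a tuple. The function name bind is illustrative only:

```python
def bind(param1, param2, *rest):
    """Binding via function entry: rest is a tuple."""
    return param1, param2, rest

argtuple = (1, 2, 3, 4, 5)
print(bind(*argtuple))             # (1, 2, (3, 4, 5))

# Binding via assignment target list: rest is a list.
param1, param2, *rest = argtuple
print((param1, param2, rest))      # (1, 2, [3, 4, 5])
```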

Now, when it comes to tuple *packing* syntax (i.e., extended call syntax), the 
appropriate behaviour would be for:

   1, 2, 3, *range(10)

to translate (roughly) to:

   (1, 2, 3) + tuple(range(10))

However, given that the equivalent code works just fine anywhere it really 
matters (assignment value, return value, yield value), and is clearer about 
what is going on, this option is probably worth avoiding.
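
For reference, the explicit spelling and the starred spelling can be compared directly. The starred form is only valid in later Python 3 releases (3.5 onwards, as far as I know), so treat the first line as an assumption about the interpreter in use:

```python
packed = 1, 2, 3, *range(10)              # starred form, Python 3.5+ only
explicit = (1, 2, 3) + tuple(range(10))   # works everywhere
print(packed == explicit)                 # True
```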

>>BTW, and quite unrelated, I've always felt uncomfortable that you have to write
>>
>>    f(a, b, foo=1, bar=2, *args, **kwds)
>>
>>I've always wanted to write that as
>>
>>    f(a, b, *args, foo=1, bar=2, **kwds)
> 
> 
> Yes, I'd like that too, with the additional meaning that
> foo and bar can only be specified by keyword, not by
> position.

Indeed. It's a (minor) pain that optional flag variables and variable length 
argument lists are currently mutually exclusive. Although, if you had that 
rule, I'd want to be able to write:

   def f(a, b, *, foo=1, bar=2): pass

to get a function which required exactly two positional arguments, but had a 
couple of optional keyword arguments, rather than having to do:

   def f(a, b, *args, foo=1, bar=2):
     if args:
       raise TypeError("f() takes exactly 2 positional arguments (%d given)"
                       % (2 + len(args)))
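
The bare "*" spelling wished for above is accepted by Python 3, so the behaviour can be sketched there. The return value is added purely so the example has something to show:

```python
def f(a, b, *, foo=1, bar=2):
    """Exactly two positional arguments; foo and bar are keyword-only."""
    return a, b, foo, bar

print(f(1, 2, bar=5))        # (1, 2, 1, 5)
try:
    f(1, 2, 3)               # a third positional argument is rejected
except TypeError as exc:
    print("rejected:", exc)
```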

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Tue Oct 11 16:54:19 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Oct 2005 00:54:19 +1000
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <434BD172.3030004@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>	<434B3C01.5030001@ronadam.com>	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>	<434B6C3A.7020001@canterbury.ac.nz>
	<434BD172.3030004@gmail.com>
Message-ID: <434BD21B.8020208@gmail.com>

Nick Coghlan wrote:
> For me, it stops when the rules for positional name binding are more 
> consistent across operations that bind names (although complete consistency 
> isn't possible, given that function calls don't unpack sequences automatically).

Oops - forgot to delete this bit once I realised that functions actually *do* 
unpack the argument tuple automatically. It's just that an argument which is a 
single sequence gets put into a singleton tuple before being unpacked.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Tue Oct 11 16:55:31 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Oct 2005 00:55:31 +1000
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <A77618B80CDD2543B705C82B7665D9F9012B0490@ex9.hostedexchange.local>
References: Your message of "Mon, 10 Oct 2005
	11:20:31	PDT."	<17226.45295.661911.542400@montanaro.dyndns.org>	<05Oct10.122654pdt."58617"@synergy1.parc.xerox.com>	<746109444.20051010141518@MailBlocks.com>	<dig6c2$kcg$1@sea.gmane.org>
	<A77618B80CDD2543B705C82B7665D9F9012B0490@ex9.hostedexchange.local>
Message-ID: <434BD263.40907@gmail.com>

Robert Brewer wrote:
> "Somewhat alleviated" and somewhat worsened. I've had half a dozen 
> conversations in the last year about sharing data between threads; in 
> every case, I've had to work quite hard to convince the other person 
> that threading.local is *not* magic pixie thread dust. Each time, they 
> had come to the conclusion that if they had a global variable, they 
> could just stick a reference to it into a threading.local object and 
> instantly have safe, concurrent access to it.

Ouch. Copy, yes, reference, no. . .
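
A short sketch of the distinction, assuming a module-level list as the "global variable" in question. Storing a reference in a threading.local gives each thread its own name for the same object; only a copy gives each thread its own data:

```python
import threading

shared = []                      # the global everyone wants to protect
local = threading.local()
copies = []                      # collects each thread's private copy

def worker(n):
    local.ref = shared           # per-thread name, same underlying list
    local.ref.append(n)          # mutates the one shared list
    local.copy = [n]             # genuinely thread-private data
    copies.append(local.copy)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(shared))            # [0, 1, 2, 3]
print(sorted(copies))            # [[0], [1], [2], [3]]
```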

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From guido at python.org  Tue Oct 11 17:12:03 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 11 Oct 2005 08:12:03 -0700
Subject: [Python-Dev] Making Queue.Queue easier to use
In-Reply-To: <434B8DBF.9080509@iinet.net.au>
References: <434B8DBF.9080509@iinet.net.au>
Message-ID: <ca471dc20510110812h3485c0eck35d7ab6d4dd2c8d5@mail.gmail.com>

On 10/11/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> The multi-processing discussion reminded me that I have a few problems I run
> into every time I try to use Queue objects.
>
> My first problem is finding it:
>
> Py> from threading import Queue # Nope
> Traceback (most recent call last):
>    File "<stdin>", line 1, in ?
> ImportError: cannot import name Queue
> Py> from Queue import Queue # Ah, there it is

I don't think that's a reason to move it.

>>> from sys import Queue
ImportError: cannot import name Queue
>>> from os import Queue
ImportError: cannot import name Queue
>>> # Well where the heck is it?!

> What do people think of the idea of adding an alias to Queue into the
> threading module so that:
>     a) the first line above works; and

I see no need. Code that *doesn't* need Queue but does use threading
shouldn't have to pay for loading Queue.py.

>     b) Queue can be documented with all of the other threading primitives,
>        rather than being off somewhere else in its own top-level section.

Do top-level sections have to limit themselves to a single module?

Even if they do, I think it's fine to plant a prominent link to the
Queue module. You can't really expect people to learn how to use
threads wisely from reading the library reference anyway.

> My second problem is with the current signatures of the put() and get()
> methods. Specifically, the following code blocks forever instead of raising an
> Empty exception after 500 milliseconds as one might expect:
>    from Queue import Queue
>    x = Queue()
>    x.get(0.5)

I'm not sure if I have much sympathy with a bug due to refusing to
read the docs... :)

> I assume the current signature is there for backward compatibility with the
> original version that didn't support timeouts (considering the difficulty of
> telling the difference between "x.get(1)" and "True = 1; x.get(True)" from
> inside the get() method)

Huh? What a bizarre idea. Why would you do that? I guess I don't
understand where you're coming from.

> However, the need to write "x.get(True, 0.5)" seems seriously redundant, given
> that a single parameter can actually handle all the options (as is currently
> the case with Condition.wait()).

So write x.get(timeout=0.5). That's clear and unambiguous.
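
A minimal sketch of the keyword spelling, using the Python 3 module name queue (Queue in the Python of this thread) and a shorter timeout so it returns quickly:

```python
import queue

q = queue.Queue()
try:
    q.get(timeout=0.1)       # block=True by default, give up after 0.1 s
    timed_out = False
except queue.Empty:
    timed_out = True
print(timed_out)             # True
```

By contrast, q.get(0.1) passes 0.1 as the block argument, which is merely truthy, so the call waits indefinitely.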

> The "put_nowait" and "get_nowait" functions are fine, because they serve a
> useful documentation purpose at the calling point (particularly given the
> current clumsy timeout signature).
>
> What do people think of the idea of adding "put_wait" and "get_wait" methods
> with the signatures:
>    put_wait(item, [timeout=None])
>    get_wait([timeout=None])

-1. I'd rather not tweak the current Queue module at all until Python
3000. Then we could force people to use keyword args.

> Optionally, the existing "put" and "get" methods could be deprecated, with the
> goal of eventually changing their signature to match the put_wait and get_wait
> methods above.

Apart from trying to guess the API without reading the docs (:-), what
are the use cases for using put/get with a timeout? I have a feeling
it's not that common.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Oct 11 17:19:00 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 11 Oct 2005 08:19:00 -0700
Subject: [Python-Dev] PythonCore\CurrentVersion
In-Reply-To: <1f7befae0510110651o504958det5d2409b3f724070e@mail.gmail.com>
References: <4347A020.2050008@v.loewis.de>
	<DAELJHBGPBHPJKEBGGLNKEBMHDAD.mhammond@skippinet.com.au>
	<1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com>
	<200510102109.37690.fdrake@acm.org> <434B58D5.7020206@v.loewis.de>
	<1f7befae0510110651o504958det5d2409b3f724070e@mail.gmail.com>
Message-ID: <ca471dc20510110819l1d11e77eve9fcad783f0dea90@mail.gmail.com>

On 10/11/05, Tim Peters <tim.peters at gmail.com> wrote:
> Well, that's in interactive mode, and I see sys.path[0] == "" on both
> Windows and Linux then.  I don't see "" in sys.path on either box in
> batch mode, although I do see the absolutized path to the current
> directory in sys.path in batch mode on Windows but not on Linux -- but
> Mark Hammond says he doesn't see (any form of) the current directory
> in sys.path in batch mode on Windows.
>
> It's a bit confusing ;-)

How did you test batch mode?

All:

sys.path[0] is *not* defined to be the current directory.

It is defined to be the directory of the script that was used to
invoke python (sys.argv[0], typically). If there is no script, or it
is being read from stdin, the default is ''.
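
One way to check this definition is to run a script that lives in one directory while the shell sits in another. The sketch below does that with temporary directories under Python 3; the helper name script_path0 is invented, and it assumes the interpreter is not started in a mode that suppresses the script directory (for example a safe-path option):

```python
import os
import subprocess
import sys
import tempfile

def script_path0():
    """Run a tiny script from another directory and report its sys.path[0]."""
    with tempfile.TemporaryDirectory() as script_dir, \
         tempfile.TemporaryDirectory() as other_dir:
        script = os.path.join(script_dir, "show_path.py")
        with open(script, "w") as f:
            f.write("import sys\nprint(sys.path[0])\n")
        out = subprocess.run([sys.executable, script], cwd=other_dir,
                             capture_output=True, text=True,
                             check=True).stdout
        # Resolve symlinks so the comparison is not fooled by /tmp aliases.
        return (os.path.realpath(out.strip()),
                os.path.realpath(script_dir),
                os.path.realpath(other_dir))

reported, script_dir, cwd = script_path0()
print(reported == script_dir)    # script directory is sys.path[0]
print(reported == cwd)           # current directory is not
```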

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.peters at gmail.com  Tue Oct 11 18:08:42 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 11 Oct 2005 12:08:42 -0400
Subject: [Python-Dev] PythonCore\CurrentVersion
In-Reply-To: <ca471dc20510110819l1d11e77eve9fcad783f0dea90@mail.gmail.com>
References: <4347A020.2050008@v.loewis.de>
	<DAELJHBGPBHPJKEBGGLNKEBMHDAD.mhammond@skippinet.com.au>
	<1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com>
	<200510102109.37690.fdrake@acm.org> <434B58D5.7020206@v.loewis.de>
	<1f7befae0510110651o504958det5d2409b3f724070e@mail.gmail.com>
	<ca471dc20510110819l1d11e77eve9fcad783f0dea90@mail.gmail.com>
Message-ID: <1f7befae0510110908g2bada0a7vfd98df080c9af745@mail.gmail.com>

[Tim]
>> Well, that's in interactive mode, and I see sys.path[0] == "" on both
>> Windows and Linux then.  I don't see "" in sys.path on either box in
>> batch mode, although I do see the absolutized path to the current
>> directory in sys.path in batch mode on Windows but not on Linux -- but
>> Mark Hammond says he doesn't see (any form of) the current directory
>> in sys.path in batch mode on Windows.
>>
>> It's a bit confusing ;-)

[Guido]
> How did you test batch mode?

I gave full code (it's brief) and screen-scrapes from Windows and
Linux yesterday:

    http://mail.python.org/pipermail/python-dev/2005-October/057162.html

By batch mode, I meant invoking

    path_to_python   path_to_python_script.py

from a shell prompt.

> All:
>
> sys.path[0] is *not* defined to be the current directory.
>
> It is defined to be the directory of the script that was used to
> invoke python (sys.argv[0], typically).

In my runs, sys.argv[0] was the path to the Python executable, not to
the script being run.  The directory of the script being run was
nevertheless in sys.path[0] on both Windows and Linux.  On Windows,
but not on Linux, the _current_ directory (the directory I happened to
be in at the time I invoked Python) was also on sys.path; Mark Hammond
said it was not when he tried, but he didn't show exactly what he did
so I'm not sure what he saw.

> If there is no script, or it is being read from stdin, the default is ''.

I believe everyone sees that.

From guido at python.org  Tue Oct 11 18:22:43 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 11 Oct 2005 09:22:43 -0700
Subject: [Python-Dev] PythonCore\CurrentVersion
In-Reply-To: <1f7befae0510110908g2bada0a7vfd98df080c9af745@mail.gmail.com>
References: <4347A020.2050008@v.loewis.de>
	<DAELJHBGPBHPJKEBGGLNKEBMHDAD.mhammond@skippinet.com.au>
	<1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com>
	<200510102109.37690.fdrake@acm.org> <434B58D5.7020206@v.loewis.de>
	<1f7befae0510110651o504958det5d2409b3f724070e@mail.gmail.com>
	<ca471dc20510110819l1d11e77eve9fcad783f0dea90@mail.gmail.com>
	<1f7befae0510110908g2bada0a7vfd98df080c9af745@mail.gmail.com>
Message-ID: <ca471dc20510110922u5faf6264g40f7928de9af7d71@mail.gmail.com>

On 10/11/05, Tim Peters <tim.peters at gmail.com> wrote:
> [Tim]
> >> Well, that's in interactive mode, and I see sys.path[0] == "" on both
> >> Windows and Linux then.  I don't see "" in sys.path on either box in
> >> batch mode, although I do see the absolutized path to the current
> >> directory in sys.path in batch mode on Windows but not on Linux -- but
> >> Mark Hammond says he doesn't see (any form of) the current directory
> >> in sys.path in batch mode on Windows.
> >>
> >> It's a bit confusing ;-)
>
> [Guido]
> > How did you test batch mode?
>
> I gave full code (it's brief) and screen-scrapes from Windows and
> Linux yesterday:
>
>     http://mail.python.org/pipermail/python-dev/2005-October/057162.html
>
> By batch mode, I meant invoking
>
>     path_to_python   path_to_python_script.py
>
> from a shell prompt.
>
> > All:
> >
> > sys.path[0] is *not* defined to be the current directory.
> >
> > It is defined to be the directory of the script that was used to
> > invoke python (sys.argv[0], typically).
>
> In my runs, sys.argv[0] was the path to the Python executable, not to
> the script being run.

I tried your experiment but added 'print sys.argv[0]' and didn't see
that. sys.argv[0] is the path to the script.

> The directory of the script being run was
> nevertheless in sys.path[0] on both Windows and Linux.  On Windows,
> but not on Linux, the _current_ directory (the directory I happened to
> be in at the time I invoked Python) was also on sys.path; Mark Hammond
> said it was not when he tried, but he didn't show exactly what he did
> so I'm not sure what he saw.

I see what you see.  The first entry is the script's directory, the
2nd is a nonexistent zip file, the 3rd is the current directory, then
the rest is standard library stuff.

I suppose PC/getpathp.c puts it there, per your post quoted above?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steve at holdenweb.com  Tue Oct 11 18:22:06 2005
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 11 Oct 2005 17:22:06 +0100
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <434BD172.3030004@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>	<434B3C01.5030001@ronadam.com>	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>	<434B6C3A.7020001@canterbury.ac.nz>
	<434BD172.3030004@gmail.com>
Message-ID: <digorg$opr$1@sea.gmane.org>

Nick Coghlan wrote:
> Greg Ewing wrote:
> 
>>Guido van Rossum wrote:
>>
>>
>>
>>>BTW, what should
>>>
>>>   [a, b, *rest] = (1, 2, 3, 4, 5)
>>>
>>>do? Should it set rest to (3, 4, 5) or to [3, 4, 5]?
>>
>>
>>Whatever type is chosen, it should be the same type, always.
>>The rhs could be any iterable, not just a tuple or a list.
>>Making a special case of preserving one or two types doesn't
>>seem worth it to me.
> 
> 
> And, for consistency with functions, the type chosen should be a tuple.
> 
> I'm also trying to figure out why you would ever write:
>    [a, b, c, d] = seq
> 
> instead of:
>    a, b, c, d = seq
> 
> or:
>    (a, b, c, d) = seq
> 
[...]
> So my vote would actually go for deprecating the use of square brackets to 
> surround an assignment target list - it makes it look like an actual list 
> object should be involved somewhere, but there isn't one.
> 
But don't forget that at present unpacking can be used at several levels:

 >>> ((a, b), c) = ((1, 2), 3)
 >>> a, b, c
(1, 2, 3)
 >>>

So presumably you'd need to be able to say

   ((a, *b), c, *d) = ((1, 2, 3), 4, 5, 6)

and see

   a, b, c, d == 1, (2, 3), 4, (5, 6)

if we are to retain today's multi-level consistency. And are you also 
proposing to allow

   a, *b = [1]

to put the empty list into b, or is that an unpacking error?


> 
>>>? And then perhaps
>>>
>>>   *rest = x
>>>
>>>should mean
>>>
>>>   rest = tuple(x)
>>>
>>>Or should that be disallowed
>>
>>Why bother? What harm would result from the ability to write that?
> 
> 
> Given that:
>    def foo(*args):
>        print args
> 
> is legal, I would have no problem with "*rest = x" being legal.
> 
Though presumably we'd still be raising TypeError if x weren't a sequence.
> 
>>>There certainly is a need for doing the same from the end:
>>>
>>>   *rest, a, b = (1, 2, 3, 4, 5)
>>
>>
>>I wouldn't mind at all if *rest were only allowed at the end.
>>There's a pragmatic reason for that if nothing else: the rhs
>>can be any iterable, and there's no easy way of getting "all
>>but the last n" items from a general iterable.
> 
> 
> Agreed. The goal here is to make the name binding rules consistent between for 
> loops, tuple assignment and function entry, not to create different rules.
> 
> 
>>>Where does it stop?
>>
>>For me, it stops with *rest only allowed at the end, and
>>always yielding a predictable type (which could be either tuple
>>or list, I don't care).
> 
> 
> For me, it stops when the rules for positional name binding are more 
> consistent across operations that bind names (although complete consistency 
> isn't possible, given that function calls don't unpack sequences automatically).
> 
Hmm. Given that today we can write

 >>> def foo((a, b), c):
...   print a, b, c
...
 >>> foo((1, 2, 3))
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
TypeError: foo() takes exactly 2 arguments (1 given)
 >>> foo((1, 2), 3)
1 2 3
 >>>

does this mean that you'd also like to be able to write

   def foo((a, *b), *c):
     print a, b, c

and then call it like

   foo((1, 2, 3, 4), 5, 6)

to see

   1, (2, 3, 4), (5, 6)

[...]
> 
>>>BTW, and quite unrelated, I've always felt uncomfortable that you have to write
>>>
>>>   f(a, b, foo=1, bar=2, *args, **kwds)
>>>
>>>I've always wanted to write that as
>>>
>>>   f(a, b, *args, foo=1, bar=2, **kwds)
>>
>>
>>Yes, I'd like that too, with the additional meaning that
>>foo and bar can only be specified by keyword, not by
>>position.
> 
> 
> Indeed. It's a (minor) pain that optional flag variables and variable length 
> argument lists are currently mutually exclusive. Although, if you had that 
> rule, I'd want to be able to write:
> 
>    def f(a, b, *, foo=1, bar=2): pass
> 
> to get a function which required exactly two positional arguments, but had a 
> couple of optional keyword arguments, rather than having to do:
> 
>    def f(a, b, *args, foo=1, bar=2):
>      if args:
>        raise TypeError("f() takes exactly 2 positional arguments (%d given)"
>                         % (2 + len(args)))
> 
I do feel that for Python 3 it might be better to make a clean 
separation between keywords and positionals: in other words, if the 
function definition specifies a keyword argument then a keyword must be 
used to present it.

This would allow users to provide an arbitrary number of positionals 
rather than having them become keyword arguments. At present it's 
difficult to specify that.
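
A minimal sketch of how that separation can be approximated without any
new syntax, by pulling the keyword-only options out of **kwds.  The names
f, foo and bar are the illustrative ones used earlier in this thread, and
the return value exists only so the binding can be inspected:

```python
def f(a, b, *args, **kwds):
    # foo and bar can only be supplied by keyword; extra positionals
    # are collected in args instead of spilling into them.
    foo = kwds.pop("foo", 1)
    bar = kwds.pop("bar", 2)
    if kwds:
        # Anything left over is a keyword this function does not know.
        raise TypeError("f() got unexpected keyword arguments: %s"
                        % ", ".join(sorted(kwds)))
    return a, b, args, foo, bar
```

With this, f(1, 2, 3, 4, bar=5) binds args to (3, 4) and leaves foo at
its default, which is the behaviour the positional/keyword split is
after.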

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/


From solipsis at pitrou.net  Tue Oct 11 18:46:18 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 11 Oct 2005 18:46:18 +0200
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <digorg$opr$1@sea.gmane.org>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>
	<434B3C01.5030001@ronadam.com>
	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>
	<434B6C3A.7020001@canterbury.ac.nz> <434BD172.3030004@gmail.com>
	<digorg$opr$1@sea.gmane.org>
Message-ID: <1129049178.6162.5.camel@fsol>


(my own 2 eurocents)

> I do feel that for Python 3 it might be better to make a clean 
> separation between keywords and positionals: in other words, if the 
> function definition specifies a keyword argument then a keyword must be 
> used to present it.

Do you mean it would also be forbidden to invoke a "positional" argument
using its keyword? It would be a huge step back in usability IMO. Some
people like invoking by position (because it's shorter) and some others
prefer invoking by keyword (because it's more explicit). Why should the
implementer of the API have to make a choice for the user of the API ?




From BruceEckel-Python3234 at mailblocks.com  Tue Oct 11 18:53:02 2005
From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel)
Date: Tue, 11 Oct 2005 10:53:02 -0600
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <05Oct10.180605pdt."58617"@synergy1.parc.xerox.com>
References: Your message of "Mon, 10 Oct 2005 17:18:15 PDT."
	<ca471dc20510101718y5fdd5af3x1e5a93642a90a20e@mail.gmail.com>
	<05Oct10.180605pdt."58617"@synergy1.parc.xerox.com>
Message-ID: <638942434.20051011105302@MailBlocks.com>

> Java's condition variables don't (didn't?  has this been fixed?) quite
> work.  The emphasis on portability and the resulting notions of
> red/green threading packages at the beginning didn't help either.
> Read Allen Holub's book.  And Doug Lea's book.  I understand much of
> this has been addressed with a new package in Java 1.5.

Not only are there significant new library components in
java.util.concurrent in J2SE5, but perhaps more important is the new
memory model that deals with issues that are (especially) revealed in
multiprocessor environments. The new memory model represents new work
in the computer science field; apparently the original paper is
written by Ph.D.s and is a bit too theoretical for the normal person
to follow. But the smart threading guys studied this and came up with
the new Java memory model -- so that volatile, for example, which
didn't work quite right before, does now. This is part of J2SE5, and
this work is being incorporated into the upcoming C++0x.

Java concurrency is certainly one of the bad examples of language
design. Apparently, they grabbed stuff from C++ (mostly the volatile
keyword) and combined it with what they knew about pthreads, and
decided that being able to declare a method as synchronized made the
whole thing object-oriented. But you can see how ill-thought-out the
design was because in later versions of Java some fundamental methods:
stop(), suspend(), resume() and destroy(), were deprecated because ...
oops, we didn't really think those out very well. And then finally,
with J2SE5, it *appears* that all the kinks have been fixed, but only
with some really smart folks like Doug Lea, Brian Goetz, and that
gang, working long and hard on all these issues and (we hope) figuring
them all out.

I think threading *can* be much simpler, and I *want* it to be that
way in Python. But that can only happen if the right model is chosen,
and that model is not pthreads. People migrate to pthreads if they
already understand it and so it might seem "simple" to them because of
that. But I think we need something that supports an object-oriented
approach to concurrency that doesn't prevent beginners from using it
safely.

Bruce Eckel



From jcarlson at uci.edu  Tue Oct 11 20:07:24 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 11 Oct 2005 11:07:24 -0700
Subject: [Python-Dev] Making Queue.Queue easier to use
In-Reply-To: <ca471dc20510110812h3485c0eck35d7ab6d4dd2c8d5@mail.gmail.com>
References: <434B8DBF.9080509@iinet.net.au>
	<ca471dc20510110812h3485c0eck35d7ab6d4dd2c8d5@mail.gmail.com>
Message-ID: <20051011105128.28E0.JCARLSON@uci.edu>


Guido van Rossum <guido at python.org> wrote:
> > Optionally, the existing "put" and "get" methods could be deprecated, with the
> > goal of eventually changing their signature to match the put_wait and get_wait
> > methods above.
> 
> Apart from trying to guess the API without reading the docs (:-), what
> are the use cases for using put/get with a timeout? I have a feeling
> it's not that common.

With timeout=0, a shared connection/resource pool (perhaps DB, etc., I
use one in the tuple space implementation I have for connections to the
tuple space). Note that technically speaking, Queue.Queue from Pythons
prior to 2.4 is broken: get_nowait() may not get an object even if the
Queue is full, this is caused by "elif not self.esema.acquire(0):" being
called for non-blocking requests.  Tim did more than simplify the
structure by rewriting it, he fixed this bug.

With block=True, timeout=None, worker threads pulling from a work-to-do
queue, and even a thread which handles the output of those threads via
a result queue.
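
A rough sketch of that block=True, timeout=None pattern, assuming a None
sentinel to shut the workers down.  run_workers and its arguments are
hypothetical names, and for brevity the results are drained by the
calling thread after the workers finish rather than by a dedicated
output thread:

```python
import threading

try:
    import queue              # Python 3 name
except ImportError:
    import Queue as queue     # module name used in this thread


def run_workers(jobs, func, num_workers=3):
    """Apply func to each job using worker threads and two queues."""
    work = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            job = work.get()          # block=True, timeout=None
            if job is None:           # sentinel: no more work
                break
            results.put((job, func(job)))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for job in jobs:
        work.put(job)
    for _ in threads:
        work.put(None)                # one sentinel per worker
    for t in threads:
        t.join()

    out = {}
    while not results.empty():        # safe: all producers have exited
        job, value = results.get()
        out[job] = value
    return out
```

Jobs must be hashable and must not be None in this sketch, since None
is reserved as the shutdown marker.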

 - Josiah


From steven.bethard at gmail.com  Tue Oct 11 20:09:01 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Tue, 11 Oct 2005 12:09:01 -0600
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <434BD172.3030004@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>
	<434B3C01.5030001@ronadam.com>
	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>
	<434B6C3A.7020001@canterbury.ac.nz> <434BD172.3030004@gmail.com>
Message-ID: <d11dcfba0510111109m4e4d4579ydddf3623895e55c0@mail.gmail.com>

Nick Coghlan wrote:
> So my vote would actually go for deprecating the use of square brackets to
> surround an assignment target list - it makes it look like an actual list
> object should be involved somewhere, but there isn't one.

I've found myself using square brackets a few times for more
complicated unpacking, e.g.:

try:
    x, y = args
except ValueError:
    [x], y = args, None

where I thought that

    (x,), y = args, None

would have been more confusing.  OTOH, I usually end up rewriting this to

    x, = args
    y = None

because even the bracketed form is a bit confusing.  So I wouldn't
really be upset if the brackets went away.

STeVe
--
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From john.m.camara at comcast.net  Tue Oct 11 20:09:25 2005
From: john.m.camara at comcast.net (john.m.camara@comcast.net)
Date: Tue, 11 Oct 2005 18:09:25 +0000
Subject: [Python-Dev] Python-Dev Digest, Vol 27, Issue 44
Message-ID: <101120051809.17864.434BFFD10007FB43000045C822007358340E9D0E030E0CD203D202080106@comcast.net>


> Date: Tue, 11 Oct 2005 09:51:06 -0400 
> From: Tim Peters 
> Subject: Re: [Python-Dev] PythonCore\CurrentVersion 
> To: Martin v. Löwis 
> Cc: python-dev at python.org 
> Message-ID: 
> <1f7befae0510110651o504958det5d2409b3f724070e at mail.gmail.com> 
> Content-Type: text/plain; charset=ISO-8859-1 
> 
> [Tim Peters] 
> >>> never before this year -- maybe sys.path _used_ to contain the current 
> >>> directory on Linux?). 
> 
> [Fred L. Drake, Jr.] 
> >> It's been a long time since this was the case on Unix of any variety; I 
> >> *think* this changed to the current state back before 2.0. 
> 
> [Martin v. Löwis] 
> > Please check again: 
> > 
> > [GCC 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)] on linux2 
> > Type "help", "copyright", "credits" or "license" for more information. 
> > >>> import sys 
> > >>> sys.path 
> > ['', '/usr/lib/python23.zip', '/usr/lib/python2.3', 
> > '/usr/lib/python2.3/plat-linux2', '/usr/lib/python2.3/lib-tk', 
> > '/usr/lib/python2.3/lib-dynload', 
> > '/usr/local/lib/python2.3/site-packages', 
> > '/usr/lib/python2.3/site-packages', 
> > '/usr/lib/python2.3/site-packages/Numeric', 
> > '/usr/lib/python2.3/site-packages/gtk-2.0', '/usr/lib/site-python'] 
> > 
> > We still have the empty string in sys.path, and it still 
> > denotes the current directory. 
> 
> Well, that's in interactive mode, and I see sys.path[0] == "" on both 
> Windows and Linux then. I don't see "" in sys.path on either box in 
> batch mode, although I do see the absolutized path to the current 
> directory in sys.path in batch mode on Windows but not on Linux -- but 
> Mark Hammond says he doesn't see (any form of) the current directory 
> in sys.path in batch mode on Windows. 
> 
> It's a bit confusing ;-) 
> 
Been bitten by this in the past.  On Windows, it's a relative path in interactive mode and an absolute path in non-interactive mode.

From jcarlson at uci.edu  Tue Oct 11 20:26:42 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 11 Oct 2005 11:26:42 -0700
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <A77618B80CDD2543B705C82B7665D9F9012B0490@ex9.hostedexchange.local>
References: <dig6c2$kcg$1@sea.gmane.org>
	<A77618B80CDD2543B705C82B7665D9F9012B0490@ex9.hostedexchange.local>
Message-ID: <20051011104737.28DD.JCARLSON@uci.edu>


"Robert Brewer" <fumanchu at amor.org> wrote:
> "Somewhat alleviated" and somewhat worsened. I've had half a dozen
> conversations in the last year about sharing data between threads; in
> every case, I've had to work quite hard to convince the other person
> that threading.local is *not* magic pixie thread dust. Each time, they
> had come to the conclusion that if they had a global variable, they
> could just stick a reference to it into a threading.local object and
> instantly have safe, concurrent access to it.

*boggles* Perhaps there should be an entry in the documentation about
this.  Here is a proposed modification.

Despite desires and assumptions to the contrary, <b>threading.local is
not magic</b>. Placing references to global shared objects into
threading.local <b>will not make them magically threadsafe</b>.  Only by
using threadsafe shared objects (by design with Queue.Queue, or by
desire with lock.acquire()/release() placed around object accesses) will
you have the potential for doing safe things.
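
To make the point concrete, here is a small sketch (all names are made
up for illustration).  Each thread stores a reference to the same global
dict in a threading.local object; the attribute is per-thread, but the
object it refers to is still shared, so the update is protected with an
explicit lock:

```python
import threading

shared = {"count": 0}             # one global object, visible to all threads
shared_lock = threading.Lock()
local = threading.local()


def worker(seen_ids, count):
    local.data = shared           # per-thread *name*, same underlying object
    shared_lock.acquire()
    try:
        seen_ids.append(id(local.data))
    finally:
        shared_lock.release()
    for _ in range(count):
        shared_lock.acquire()     # the lock, not threading.local, gives safety
        try:
            local.data["count"] += 1
        finally:
            shared_lock.release()


def demo(num_threads=4, count=1000):
    shared["count"] = 0
    seen_ids = []
    threads = [threading.Thread(target=worker, args=(seen_ids, count))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Number of distinct objects seen through local.data, and final count.
    return len(set(seen_ids)), shared["count"]
```

demo() reports a single distinct object across all threads, which is
exactly why stashing a global in threading.local buys no thread safety
by itself.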

 - Josiah


From tim.peters at gmail.com  Tue Oct 11 20:35:52 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 11 Oct 2005 14:35:52 -0400
Subject: [Python-Dev] Making Queue.Queue easier to use
In-Reply-To: <20051011105128.28E0.JCARLSON@uci.edu>
References: <434B8DBF.9080509@iinet.net.au>
	<ca471dc20510110812h3485c0eck35d7ab6d4dd2c8d5@mail.gmail.com>
	<20051011105128.28E0.JCARLSON@uci.edu>
Message-ID: <1f7befae0510111135n48591bcawb39d7cae0698d9ff@mail.gmail.com>

[Guido]
>> Apart from trying to guess the API without reading the docs (:-), what
>> are the use cases for using put/get with a timeout? I have a feeling
>> it's not that common.

[Josiah Carlson]
> With timeout=0, a shared connection/resource pool (perhaps DB, etc., I
> use one in the tuple space implementation I have for connections to the
> tuple space).

Passing timeout=0 is goofy:  use {get,put}_nowait() instead.  There's
no difference in semantics.
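
A quick sketch of that equivalence, using a hypothetical helper try_get
that returns None when nothing is available:

```python
try:
    import queue              # Python 3 name
except ImportError:
    import Queue as queue     # module name used in this thread


def try_get(q, use_timeout):
    """Fetch one item without waiting; return None if the queue is empty."""
    try:
        if use_timeout:
            return q.get(True, 0)     # block=True, timeout=0
        return q.get_nowait()         # same as q.get(False)
    except queue.Empty:
        return None
```

Both spellings raise Empty at once on an empty queue and hand back the
item at once otherwise; the helper only hides which one was used.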

> Note that technically speaking, Queue.Queue from Pythons
> prior to 2.4 is broken: get_nowait() may not get an object even if the
> Queue is full, this is caused by "elif not self.esema.acquire(0):" being
> called for non-blocking requests.  Tim did more than simplify the
> structure by rewriting it, he fixed this bug.

I don't agree it was a bug, but I did get fatally weary of arguing
with people who insisted it was ;-)  It's certainly easier to explain
(and the code is easier to read) now.

> With block=True, timeout=None, worker threads pulling from a work-to-do
> queue, and even a thread which handles the output of those threads via
> a result queue.

Guido understands use cases for blocking and non-blocking put/get, and
Queue always supported those possibilities.  The timeout argument got
added later, and it's not really clear _why_ it was added.  timeout=0
isn't a sane use case (because the same effect can be gotten with
non-blocking put/get).

From guido at python.org  Tue Oct 11 20:45:28 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 11 Oct 2005 11:45:28 -0700
Subject: [Python-Dev] Making Queue.Queue easier to use
In-Reply-To: <1f7befae0510111135n48591bcawb39d7cae0698d9ff@mail.gmail.com>
References: <434B8DBF.9080509@iinet.net.au>
	<ca471dc20510110812h3485c0eck35d7ab6d4dd2c8d5@mail.gmail.com>
	<20051011105128.28E0.JCARLSON@uci.edu>
	<1f7befae0510111135n48591bcawb39d7cae0698d9ff@mail.gmail.com>
Message-ID: <ca471dc20510111145i770be9d7wdaca1b1e57f56863@mail.gmail.com>

On 10/11/05, Tim Peters <tim.peters at gmail.com> wrote:
> Guido understands use cases for blocking and non-blocking put/get, and
> Queue always supported those possibilities.  The timeout argument got
> added later, and it's not really clear _why_ it was added.  timeout=0
> isn't a sane use case (because the same effect can be gotten with
> non-blocking put/get).

In the socket world, a similar bifurcation of the API has happened
(also under my supervision, even though the idea and prototype code
were contributed by others). The API there is very different because
the blocking or timeout is an attribute of the socket, not passed in
to every call.

But one lesson we can learn from sockets (or perhaps the reason why
people kept asking for timeout=0 to be "fixed" :) is that timeout=0 is
just a different way to spell blocking=False. The socket module makes
sure that the socket ends up in exactly the same state no matter which
API is used; and in fact the setblocking() API is redundant.
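
For the record, a small sketch that shows the two spellings landing in
the same state, observed through gettimeout().  The function name is
made up, and the socket is never connected:

```python
import socket


def timeout_states():
    """Return the timeout seen after each way of configuring a socket."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.settimeout(0.0)             # timeout API: zero timeout
        via_timeout = s.gettimeout()
        s.setblocking(True)           # back to plain blocking mode
        blocking = s.gettimeout()
        s.setblocking(False)          # older API: non-blocking
        via_setblocking = s.gettimeout()
    finally:
        s.close()
    return via_timeout, blocking, via_setblocking
```

The first and last values are both 0.0, and the blocking case reports
None, i.e. no timeout at all.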

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From reinhold-birkenfeld-nospam at wolke7.net  Tue Oct 11 20:43:59 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Tue, 11 Oct 2005 20:43:59 +0200
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <434B6C3A.7020001@canterbury.ac.nz>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>	<434B3C01.5030001@ronadam.com>	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>
	<434B6C3A.7020001@canterbury.ac.nz>
Message-ID: <dih15g$nih$1@sea.gmane.org>

Greg Ewing wrote:
> Guido van Rossum wrote:
> 
>> BTW, what should
>> 
>>     [a, b, *rest] = (1, 2, 3, 4, 5)
>> 
>> do? Should it set rest to (3, 4, 5) or to [3, 4, 5]?
> 
> Whatever type is chosen, it should be the same type, always.
> The rhs could be any iterable, not just a tuple or a list.
> Making a special case of preserving one or two types doesn't
> seem worth it to me.

I don't think that

[a, b, c] = iterable

is good style right now, so I'd say that

[a, b, *rest] = iterable

should be disallowed or be the same as with parentheses. It's not
intuitive that rest could be a list here.

>> ? And then perhaps
>> 
>>     *rest = x
>> 
>> should mean
>> 
>>     rest = tuple(x)
>> 
>> Or should that be disallowed
> 
> Why bother? What harm would result from the ability to write that?
> 
>> There certainly is a need for doing the same from the end:
>> 
>>     *rest, a, b = (1, 2, 3, 4, 5)
> 
> I wouldn't mind at all if *rest were only allowed at the end.
> There's a pragmatic reason for that if nothing else: the rhs
> can be any iterable, and there's no easy way of getting "all
> but the last n" items from a general iterable.
> 
>> Where does it stop?
> 
> For me, it stops with *rest only allowed at the end, and
> always yielding a predictable type (which could be either tuple
> or list, I don't care).

+1. Tuple is more consistent.
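
Until something like this exists, the "*rest at the end, always a tuple"
rule can be sketched as a helper that works on any iterable.  The name
head_and_rest is hypothetical:

```python
from itertools import islice


def head_and_rest(iterable, n):
    """Return the first n items followed by a tuple of whatever remains."""
    it = iter(iterable)
    head = tuple(islice(it, n))
    if len(head) != n:
        raise ValueError("need at least %d values to unpack, got %d"
                         % (n, len(head)))
    # The remainder is always a tuple, whatever type the input had.
    return head + (tuple(it),)
```

So "a, b, rest = head_and_rest(some_iterable, 2)" plays the role of
"a, b, *rest = some_iterable", and an exhausted input simply leaves
rest as the empty tuple.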

>> BTW, and quite unrelated, I've always felt uncomfortable that you have to write
>> 
>>     f(a, b, foo=1, bar=2, *args, **kwds)
>> 
>> I've always wanted to write that as
>> 
>>     f(a, b, *args, foo=1, bar=2, **kwds)
> 
> Yes, I'd like that too, with the additional meaning that
> foo and bar can only be specified by keyword, not by
> position.

That would be a logical consequence. But one should also be able to give default values
for positional parameters. So:

foo(a, b, c=1, *args, d=2, e=5, **kwargs)
    ^^^^^^^^^         ^^^^^^^^
    positional        only with kw
    or with kw


Reinhold

-- 
Mail address is perfectly valid!


From tim.peters at gmail.com  Tue Oct 11 21:09:02 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 11 Oct 2005 15:09:02 -0400
Subject: [Python-Dev] PythonCore\CurrentVersion
In-Reply-To: <ca471dc20510110922u5faf6264g40f7928de9af7d71@mail.gmail.com>
References: <4347A020.2050008@v.loewis.de>
	<DAELJHBGPBHPJKEBGGLNKEBMHDAD.mhammond@skippinet.com.au>
	<1f7befae0510101542tf65c250ybf93eb7f11c187ae@mail.gmail.com>
	<200510102109.37690.fdrake@acm.org> <434B58D5.7020206@v.loewis.de>
	<1f7befae0510110651o504958det5d2409b3f724070e@mail.gmail.com>
	<ca471dc20510110819l1d11e77eve9fcad783f0dea90@mail.gmail.com>
	<1f7befae0510110908g2bada0a7vfd98df080c9af745@mail.gmail.com>
	<ca471dc20510110922u5faf6264g40f7928de9af7d71@mail.gmail.com>
Message-ID: <1f7befae0510111209j40975a17w98bec7350d6c4874@mail.gmail.com>

[Guido]
> I tried your experiment but added 'print sys.argv[0]' and didn't see
> that. sys.argv[0] is the path to the script.

My mistake!  You're right, sys.argv[0] is the path to the script for me too.

[Tim]
>> The directory of the script being run was
>> nevertheless in sys.path[0] on both Windows and Linux.  On Windows,
>> but not on Linux, the _current_ directory (the directory I happened to
>> be in at the time I invoked Python) was also on sys.path; Mark Hammond
>> said it was not when he tried, but he didn't show exactly what he did
>> so I'm not sure what he saw.

[Guido]
> I see what you see.  The first entry is the script's directory, the
> 2nd is a nonexistent zip file, the 3rd is the current directory, then
> the rest is standard library stuff.

So why doesn't Mark see that?  I'll ask him ;-)

> I suppose PC/getpathp.c puts it there, per your post quoted above?

I don't think it does (although I understand why it's sane to believe
that it must).  Curiously, I do _not_ see the current directory on
sys.path on Windows if I run from current CVS HEAD.  I do see it
running Pythons 2.2.3, 2.3.5 and 2.4.2.  PC/getpathp.c doesn't appear
to have changed in a relevant way.

blor.py:

"""
import sys
from pprint import pprint
print sys.version_info
pprint(sys.path)
"""

C:\>\code\python\PCbuild\python.exe code\blor.py  # C:\ not in sys.path
(2, 5, 0, 'alpha', 0)
['C:\\code',
 'C:\\code\\python\\PCbuild\\python25.zip',
 'C:\\code\\python\\DLLs',
 'C:\\code\\python\\lib',
 'C:\\code\\python\\lib\\plat-win',
 'C:\\code\\python\\lib\\lib-tk',
 'C:\\code\\python\\PCbuild',
 'C:\\code\\python',
 'C:\\code\\python\\lib\\site-packages']

C:\>\python24\python.exe code\blor.py  # C:\ in sys.path
(2, 4, 2, 'final', 0)
['C:\\code',
 'C:\\python24\\python24.zip',
 'C:\\',
 'C:\\python24\\DLLs',
 'C:\\python24\\lib',
 'C:\\python24\\lib\\plat-win',
 'C:\\python24\\lib\\lib-tk',
 'C:\\python24',
 'C:\\python24\\lib\\site-packages',
 'C:\\python24\\lib\\site-packages\\PIL',
 'C:\\python24\\lib\\site-packages\\win32',
 'C:\\python24\\lib\\site-packages\\win32\\lib',
 'C:\\python24\\lib\\site-packages\\Pythonwin']
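
For anyone who wants to repeat the experiment without arranging
directories by hand, here is a rough sketch.  probe.py and
probe_batch_mode are made-up names, and it needs a Python recent enough
to have subprocess.check_output.  It writes a tiny script into one
temporary directory, runs it in batch mode from a different one, and
reports sys.argv[0], sys.path[0], and whether the current directory
showed up anywhere on sys.path:

```python
import os
import shutil
import subprocess
import sys
import tempfile

# Source of the throwaway script that is run in batch mode.
PROBE = (
    "import os, sys\n"
    "print(sys.argv[0])\n"
    "print(sys.path[0])\n"
    "print(os.getcwd() in sys.path)\n"
)


def probe_batch_mode():
    script_dir = tempfile.mkdtemp()   # where the script lives
    run_dir = tempfile.mkdtemp()      # the "current directory" for the run
    try:
        script = os.path.join(script_dir, "probe.py")
        with open(script, "w") as f:
            f.write(PROBE)
        out = subprocess.check_output([sys.executable, script],
                                      cwd=run_dir, universal_newlines=True)
        argv0, path0, cwd_on_path = out.splitlines()
    finally:
        shutil.rmtree(script_dir)
        shutil.rmtree(run_dir)
    return script, argv0, path0, cwd_on_path == "True"
```

The last element of the result is the interesting one here, since that
is where the Windows builds above disagree.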

From jason.orendorff at gmail.com  Tue Oct 11 22:15:03 2005
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Tue, 11 Oct 2005 16:15:03 -0400
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <bb8868b90510111312i46ff76dvd72484ec42d5d26e@mail.gmail.com>
References: <4346467D.5010005@iinet.net.au> <di5igq$41k$1@sea.gmane.org>
	<43466C3B.50704@gmail.com> <di66pi$f3t$1@sea.gmane.org>
	<bb8868b90510111312i46ff76dvd72484ec42d5d26e@mail.gmail.com>
Message-ID: <bb8868b90510111315t156c261eo9362f0c02b351b87@mail.gmail.com>

On 10/7/05, Fredrik Lundh <fredrik at pythonware.com> wrote:
> the whole concept might be perfectly fine on the "this construct corre-
> sponds to this code" level, but if you immediately end up with things that
> are not what they seem, and names that don't mean what they say, either
> the design or the description of it needs work.
>
>  ("yes, I know you can use this class to manage the context, but it's not
> really a context manager, because it's that method that's a manager, not
> the class itself.  yes, all the information that belongs to the context are
> managed by the class, but that doesn't make... oh, shut up and read the
> PEP")

Good points... Maybe it is the description that needs work.

Here is a description of iterators, to illustrate the parallels:
    An object that has an __iter__ method is iterable.  It can plug
    into the Python 'for' statement.  obj.__iter__() returns an
    iterator.  An iterator is a single-use, forward-only view of a
    sequence.  'for' calls __iter__() and uses the resulting
    iterator's next() method.

    (This is just as complicated as PEP343+changes, but not as
    mindboggling, because the terminology is better.  Also because
    we're used to iterators.)

Now contexts, per PEP 343 with Nick's proposed changes:
    An object that has a __with__ method is a context.  It can plug
    into the Python 'with' statement.  obj.__with__() returns a
    context manager.  A context manager is a single-use object that
    manages a single visit into a context.  'with' calls __with__()
    and uses the resulting context manager's __enter__() and
    __exit__() methods.

    A contextmanager is a function that returns a new context manager.

Okay, that last bit is weird.  But note that PEP 343 has this oddness
even without the proposed changes.  Perhaps either "context manager"
or contextmanager should be renamed, regardless of whether Nick's
changes are accepted.

With the changes, context managers will be (conceptually) single-use.
So maybe a different term might be appropriate.  Perhaps "ticket".
"A ticket is a single-use object that manages a single visit into a
context."
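
To make the terminology concrete, here is a sketch of the protocol as
described above.  It assumes the proposed __with__ method, which is only
a proposal in this thread, and emulates the 'with' statement with an
ordinary function; Context, Ticket and run_with are made-up names:

```python
import sys


class Ticket(object):
    """Single-use object managing one visit into a context."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, tb):
        self.log.append("exit")
        return False                  # do not swallow exceptions


class Context(object):
    """Object that can plug into 'with' via the proposed __with__()."""

    def __init__(self):
        self.log = []

    def __with__(self):
        return Ticket(self.log)       # a fresh ticket for every visit


def run_with(context, body):
    """Roughly what 'with context as value: body(value)' would do."""
    manager = context.__with__()
    value = manager.__enter__()
    try:
        result = body(value)
    except BaseException:
        if not manager.__exit__(*sys.exc_info()):
            raise
        return None
    manager.__exit__(None, None, None)
    return result
```

Each call to run_with asks the context for a new ticket, which mirrors
how 'for' asks an iterable for a new iterator.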

-j

From rrr at ronadam.com  Tue Oct 11 22:41:01 2005
From: rrr at ronadam.com (Ron Adam)
Date: Tue, 11 Oct 2005 16:41:01 -0400
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <dih15g$nih$1@sea.gmane.org>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>	<434B3C01.5030001@ronadam.com>	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>	<434B6C3A.7020001@canterbury.ac.nz>
	<dih15g$nih$1@sea.gmane.org>
Message-ID: <434C235D.1060404@ronadam.com>

Reinhold Birkenfeld wrote:
> Greg Ewing wrote:
> 
>>Guido van Rossum wrote:
>>
>>
>>>BTW, what should
>>>
>>>    [a, b, *rest] = (1, 2, 3, 4, 5)
>>>
>>>do? Should it set rest to (3, 4, 5) or to [3, 4, 5]?
>>
>>Whatever type is chosen, it should be the same type, always.
>>The rhs could be any iterable, not just a tuple or a list.
>>Making a special case of preserving one or two types doesn't
>>seem worth it to me.
> 
> 
> I don't think that
> 
> [a, b, c] = iterable
> 
> is good style right now, so I'd say that
> 
> [a, b, *rest] = iterable
> 
> should be disallowed or be the same as with parentheses. It's not
> intuitive that rest could be a list here.

I wonder if something like the following would fulfill the need?


This divides a sequence at given indices by using a divider iterator on 
it.

class xlist(list):
    def div_at(self, *args):
        """ return a divided sequence """
        return list(self.div_iter(*args))
    def div_iter(self, *args):
        """ return a sequence divider-iter """
        s = None                # start index of the current piece
        for n in args:
            yield self[s:n]
            s = n
        yield self[s:]          # remainder; also fine when no indices are given

seq = xlist(range(10))

(a,b),rest = seq.div_at(2)

print a,b,rest          # 0 1 [2, 3, 4, 5, 6, 7, 8, 9]

(a,b),c,(d,e),rest = seq.div_at(2,4,6)

print seq.div_at(2,4,6) # [[0, 1], [2, 3], [4, 5], [6, 7, 8, 9]]
print a,b,c,d,e,rest    # 0 1 [2, 3] 4 5 [6, 7, 8, 9]


This addresses the issue of repeating the name of the iterable.
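
For illustration only, the same division can be sketched as a free function, so that the right-hand side can be any iterable rather than an xlist. The function form, the use of itertools.islice and the requirement that the indices be ascending are assumptions of this sketch, not part of the class above; it is written for Python 3.

```python
from itertools import islice

def div_at(iterable, *indexes):
    """Split iterable into lists at the given ascending indexes.

    Hypothetical helper for illustration; consumes the iterable once.
    """
    it = iter(iterable)
    parts = []
    start = 0
    for stop in indexes:
        # Take the next (stop - start) items from the shared iterator.
        parts.append(list(islice(it, stop - start)))
        start = stop
    # Whatever is left becomes the final "rest" part.
    parts.append(list(it))
    return parts

(a, b), rest = div_at(range(10), 2)
print(a, b, rest)
```

Because the pieces are always plain lists, the question of which type "rest" should have does not depend on the type of the iterable.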

Cheers,
   Ron


From jcarlson at uci.edu  Wed Oct 12 02:41:06 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 11 Oct 2005 17:41:06 -0700
Subject: [Python-Dev] Making Queue.Queue easier to use
In-Reply-To: <1f7befae0510111135n48591bcawb39d7cae0698d9ff@mail.gmail.com>
References: <20051011105128.28E0.JCARLSON@uci.edu>
	<1f7befae0510111135n48591bcawb39d7cae0698d9ff@mail.gmail.com>
Message-ID: <20051011165924.28FC.JCARLSON@uci.edu>


[Guido]
> >> Apart from trying to guess the API without reading the docs (:-), what
> >> are the use cases for using put/get with a timeout? I have a feeling
> >> it's not that common.

[Josiah Carlson]
> > With timeout=0, a shared connection/resource pool (perhaps DB, etc., I
> > use one in the tuple space implementation I have for connections to the
> > tuple space).

[Tim Peters]
> Passing timeout=0 is goofy:  use {get,put}_nowait() instead.  There's
> no difference in semantics.

I understand this, as do many others who use it.  However, having both
manually and automatically tuned timeouts myself in certain applications,
the timeout=0 case is useful.  Uncommon?  Likely, I've not yet seen any
examples of anyone using this particular timeout method at koders.com .


> > Note that technically speaking, Queue.Queue from Pythons
> > prior to 2.4 is broken: get_nowait() may not get an object even if the
> > Queue is full, this is caused by "elif not self.esema.acquire(0):" being
> > called for non-blocking requests.  Tim did more than simplify the
> > structure by rewriting it, he fixed this bug.
> 
> I don't agree it was a bug, but I did get fatally weary of arguing
> with people who insisted it was ;-)  It's certainly easier to explain
> (and the code is easier to read) now.

When getting an object from a non-empty queue fails because some other
thread already had the lock, and it is a fair assumption that the other
thread will release the lock within the next context switch...

Because I still develop on Python 2.3 (I need to support a commercial
codebase made with 2.3), I was working around it by using the timeout
parameter:
    try:
        connection = connection_queue.get(timeout=.000001)
    except Queue.Empty:
        connection = make_new_connection()

With only get_nowait() calls, by the time I hit 3-4 threads, it was
failing to pick up connections even when there were hundreds in the
queue, and I quickly ran into the file handle limit for my platform, not
to mention that the server I was connecting to used asynchronous sockets
and select, which died at the 513th incoming socket.

I have since copied the implementation of 2.4's queue into certain
portions of code which make use of get_nowait() and its variants
(handling the deque reference as necessary).

Any time one needs to work around a "not buggy feature" with some
claimed "unnecessary feature", it tends to smell less than pristine to
my nose.


> > With block=True, timeout=None, worker threads pulling from a work-to-do
> > queue, and even a thread which handles the output of those threads via
> > a result queue.
> 
> Guido understands use cases for blocking and non-blocking put/get, and
> Queue always supported those possibilities.  The timeout argument got
> added later, and it's not really clear _why_ it was added.  timeout=0
> isn't a sane use case (because the same effect can be gotten with
> non-blocking put/get).

def t():
    try:
        #thread state setup...
        while not QUIT:
            try:
                work = q.get(timeout=5)
            except Queue.Empty:
                continue
            #handle work
    finally:
        #thread state cleanup...
        pass

Could the above be daemonized?  Certainly, but then the thread state
wouldn't be cleaned up.  If you can provide me with a way of doing the
above with equivalent behavior, using only get_nowait() and get(), then
put it in the documentation.  If not, then I'd say that the timeout
argument is a necessarily useful feature.
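
A runnable sketch of that pattern, under some assumptions: a threading.Event stands in for the QUIT flag, the "work" is a placeholder doubling of integers, and the module is imported under its Python 3 name with a fallback to the Python 2 one. None of these details come from the code above; they are only there to make the example self-contained.

```python
import threading
try:
    import queue            # Python 3 name
except ImportError:
    import Queue as queue   # Python 2 name

q = queue.Queue()
quit_event = threading.Event()   # stands in for the QUIT flag
results = []
cleaned_up = []

def worker():
    try:
        while not quit_event.is_set():
            try:
                # Wake up periodically so the quit flag gets rechecked.
                work = q.get(timeout=0.05)
            except queue.Empty:
                continue
            results.append(work * 2)   # placeholder for "handle work"
            q.task_done()
    finally:
        cleaned_up.append(True)        # placeholder for state cleanup

t = threading.Thread(target=worker)
t.start()
for i in range(3):
    q.put(i)
q.join()            # wait until every item has been handled
quit_event.set()    # the worker notices within one timeout period
t.join()
```

The timeout bounds how long the thread can sit in get() before it rechecks the flag, which is what lets the finally clause run at shutdown.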

[Guido]
> But one lesson we can learn from sockets (or perhaps the reason why
> people kept asking for timeout=0 to be "fixed" :) is that timeout=0 is
> just a different way to spell blocking=False. The socket module makes
> sure that the socket ends up in exactly the same state no matter which
> API is used; and in fact the setblocking() API is redundant.

This would suggest to me that at least for sockets, setblocking() could
be deprecated, as could the block parameter in Queue.  I wouldn't vote
for either deprecation, but it would seem to make more sense than to
remove the timeout arguments from both.


 - Josiah


From greg.ewing at canterbury.ac.nz  Wed Oct 12 03:26:58 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 12 Oct 2005 14:26:58 +1300
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <ca471dc20510110812h3485c0eck35d7ab6d4dd2c8d5@mail.gmail.com>
References: <434B8DBF.9080509@iinet.net.au>
	<ca471dc20510110812h3485c0eck35d7ab6d4dd2c8d5@mail.gmail.com>
Message-ID: <434C6662.4040503@canterbury.ac.nz>

Guido van Rossum wrote:

> I see no need. Code that *doesn't* need Queue but does use threading
> shouldn't have to pay for loading Queue.py.

However, it does seem awkward to have a whole module
providing just one small class that logically is so
closely related to other threading facilities.

What we want in this kind of situation is some sort
of autoloading mechanism, so you can import something
from a module and have it trigger the loading of another
module behind the scenes to provide it.

Another place I'd like this is in my PyGUI library,
where I want all the commonly-used class names to appear
in the top-level package, but ideally not import the
code to implement them until they're actually used.

There are various ways of hacking up such functionality
today, but it would be nice if there were some kind of
language or library support for it. Maybe something like
a descriptor mechanism for lookups in module namespaces.
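
One of the ways of hacking this up can be sketched as a module subclass whose listed names are imported on first attribute access. The LazyModule class, its "lazy" mapping and the "threading_ns" name are all hypothetical, chosen for illustration, and the sketch is written for Python 3 (where the Queue module is spelled "queue").

```python
import importlib
import types

class LazyModule(types.ModuleType):
    """Module whose listed attributes are imported on first access.

    Hypothetical helper: `lazy` maps attribute name -> (module, attribute).
    """
    def __init__(self, name, lazy):
        super().__init__(name)
        self._lazy = dict(lazy)

    def __getattr__(self, attr):
        # Only called when normal lookup fails, i.e. before first load.
        lazy = self.__dict__.get("_lazy", {})
        if attr not in lazy:
            raise AttributeError(attr)
        modname, objname = lazy[attr]
        value = getattr(importlib.import_module(modname), objname)
        setattr(self, attr, value)   # cache so __getattr__ is not hit again
        return value

# A stand-in namespace that offers Queue without importing it up front.
threading_ns = LazyModule("threading_ns", {"Queue": ("queue", "Queue")})
loaded_before = "Queue" in vars(threading_ns)
q = threading_ns.Queue()
loaded_after = "Queue" in vars(threading_ns)
```

The name is only bound in the module's namespace after the first lookup, so code that never touches it never pays for the import.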

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Wed Oct 12 04:10:27 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 12 Oct 2005 15:10:27 +1300
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <bb8868b90510111315t156c261eo9362f0c02b351b87@mail.gmail.com>
References: <4346467D.5010005@iinet.net.au> <di5igq$41k$1@sea.gmane.org>
	<43466C3B.50704@gmail.com> <di66pi$f3t$1@sea.gmane.org>
	<bb8868b90510111312i46ff76dvd72484ec42d5d26e@mail.gmail.com>
	<bb8868b90510111315t156c261eo9362f0c02b351b87@mail.gmail.com>
Message-ID: <434C7093.9070802@canterbury.ac.nz>

Jason Orendorff wrote:

>     A contextmanager is a function that returns a new context manager.
> 
> Okay, that last bit is weird.

If the name of the decorator is to be 'contextmanager', it
really needs to say something like

   The contextmanager decorator turns a generator into a
   function that returns a context manager.

So maybe the decorator should be called 'contextmanagergenerator'.
Or perhaps not, since that's getting rather too much of an
eyeful to parse...

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Wed Oct 12 04:15:51 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 12 Oct 2005 15:15:51 +1300
Subject: [Python-Dev] Fwd: defaultproperty
In-Reply-To: <434B8060.6070903@gmail.com>
References: <433AA5AC.6040509@zope.com>
	<ca471dc205092807477ff2d0f1@mail.gmail.com> <433BA3CF.1090205@zope.com>
	<43494648.6040904@zope.com> <4349997E.9010208@gmail.com>
	<76fd5acf0510092246q1861403flda2dd49ddbe02d6c@mail.gmail.com>
	<76fd5acf0510092247m4d8383e8o40260de7950906@mail.gmail.com>
	<1128955292.27841.2.camel@geddy.wooz.org>
	<bbaeab100510101333s317d1a98r2b2af3b93b7788ad@mail.gmail.com>
	<434B8060.6070903@gmail.com>
Message-ID: <434C71D7.5090703@canterbury.ac.nz>

Nick Coghlan wrote:

> As a location for this, I would actually suggest a module called something 
> like "metatools",

-1, too vague and meaningless a name. If "decorator" is the
official term for this kind of function, then calling the
module "decorators" is precise and helpful. Other kinds of
meta-level tools should go in their own suitably-named
modules.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Wed Oct 12 04:16:16 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 12 Oct 2005 15:16:16 +1300
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <434BD172.3030004@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>
	<434B3C01.5030001@ronadam.com>
	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>
	<434B6C3A.7020001@canterbury.ac.nz> <434BD172.3030004@gmail.com>
Message-ID: <434C71F0.9000609@canterbury.ac.nz>

Nick Coghlan wrote:

> I'm also trying to figure out why you would ever write:
>    [a, b, c, d] = seq

I think the ability to use square brackets is a
holdover from some ancient Python version where you had
to match the type of the thing being unpacked with
the appropriate syntax on the lhs. It was a silly
requirement from the beginning, and it became
unworkable as soon as things other than lists and
tuples could be unpacked.

In Py3k I expect that [...] for unpacking will
no longer be allowed.

> Indeed. It's a (minor) pain that optional flag variables and variable length 
> argument lists are currently mutually exclusive. Although, if you had that 
> rule, I'd want to be able to write:
> 
>    def f(a, b, *, foo=1, bar=2): pass

Yes, I agree.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Wed Oct 12 04:17:02 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 12 Oct 2005 15:17:02 +1300
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <digorg$opr$1@sea.gmane.org>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>
	<434B3C01.5030001@ronadam.com>
	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>
	<434B6C3A.7020001@canterbury.ac.nz> <434BD172.3030004@gmail.com>
	<digorg$opr$1@sea.gmane.org>
Message-ID: <434C721E.6090802@canterbury.ac.nz>

Steve Holden wrote:

> So presumably you'd need to be able to say
> 
>    ((a, *b), c, *d) = ((1, 2, 3), 4, 5, 6)

Yes.

>    a, *b = [1]
> 
> to put the empty list into b, or is that an unpacking error?

Empty sequence in b (of whatever type is chosen).


> does this mean that you'd also like to be able to write
> 
>    def foo((a, *b), *c):

That would follow, yes.

> I do feel that for Python 3 it might be better to make a clean 
> separation between keywords and positionals: in other words, if the 
> function definition specifies a keyword argument then a keyword must be 
> used to present it.

But then how would you give a positional arg a default value
without turning it into a keyword arg?

It seems to me that the suggested extension covers all the
possibilities quite nicely. You can have named positional args
with or without default values, optional extra positional
args with *, named keyword-only args with or without default
values, and unnamed extra keyword-only args with **.

The only thing it doesn't give you directly is mandatory
positional-only args, and you can get that by catching them
with * and unpacking them afterwards. This would actually
synergise nicely with * in tuple unpacking:

   def f(*args):
     a, b, *rest = args

And with one further small extension, you could even get
that into the argument list as well:

   def f(*(a, b, *rest)):
     ...
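
As a sketch of the first form, here is how it behaves in Python 3, where this spelling of unpacking was eventually adopted and the starred name receives a list. The return value and the sample calls are only there for illustration.

```python
def f(*args):
    # a and b are mandatory and can only be passed positionally;
    # any further positionals are collected into rest.
    a, b, *rest = args
    return a, b, rest

print(f(1, 2, 3, 4))   # (1, 2, [3, 4])

try:
    f(1)               # too few positionals: the unpacking fails
except ValueError as exc:
    print("too few arguments:", exc)
```

Note that a missing argument shows up as a ValueError from the unpacking rather than the TypeError a normal signature would give.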

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From jspoerri+dated+1129516162.13b96a at earthlink.net  Wed Oct 12 04:35:17 2005
From: jspoerri+dated+1129516162.13b96a at earthlink.net (Joshua Spoerri)
Date: Wed, 12 Oct 2005 02:35:17 +0000 (UTC)
Subject: [Python-Dev] Pythonic concurrency
References: <2mll1ghsjc.fsf@starship.python.net>
	<397621172.20050927111836@MailBlocks.com>
	<433AE8A8.3010500@v.loewis.de>
	<329633301.20050929074337@MailBlocks.com>
	<2mll1ghsjc.fsf@starship.python.net>
	<5.1.1.6.0.20050929121236.0399ed88@mail.telecommunity.com>
	<160502469.20050929104837@MailBlocks.com>
Message-ID: <loom.20051012T040249-299@post.gmane.org>

that stm paper isn't the end.

there's a java implementation which seems to be exactly what we want:
http://research.microsoft.com/~tharris/papers/2003-oopsla.pdf


From ncoghlan at gmail.com  Wed Oct 12 11:23:58 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Oct 2005 19:23:58 +1000
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <434C7093.9070802@canterbury.ac.nz>
References: <4346467D.5010005@iinet.net.au>
	<di5igq$41k$1@sea.gmane.org>	<43466C3B.50704@gmail.com>
	<di66pi$f3t$1@sea.gmane.org>	<bb8868b90510111312i46ff76dvd72484ec42d5d26e@mail.gmail.com>	<bb8868b90510111315t156c261eo9362f0c02b351b87@mail.gmail.com>
	<434C7093.9070802@canterbury.ac.nz>
Message-ID: <434CD62E.4020901@gmail.com>

Greg Ewing wrote:
> Jason Orendorff wrote:
> 
> 
>>    A contextmanager is a function that returns a new context manager.
>>
>>Okay, that last bit is weird.
> 
> 
> If the name of the decorator is to be 'contextmanager', it
> really needs to say something like
> 
>    The contextmanager decorator turns a generator into a
>    function that returns a context manager.
> 
> So maybe the decorator should be called 'contextmanagergenerator'.
> Or perhaps not, since that's getting rather too much of an
> eyeful to parse...

Strictly speaking this fits in with the existing confusion of "generator 
factory" and "generator":

Py> def g():
...     yield None
...
Py> type(g)
<type 'function'>
Py> type(g())
<type 'generator'>

Most people would call "g" a generator, even though it's really just a factory 
function that returns generator objects.

So technically, the "contextmanager" decorator turns a generator factory 
function into a context manager factory function.

But it's easier to simply say that the decorator turns a generator into a 
context manager, even if that's technically incorrect.
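
The distinction can be seen in a small sketch. It assumes the decorator as it exists in today's contextlib module, which may differ in detail from the draft under discussion; the "tag" generator and the "events" list are made up for the example.

```python
from contextlib import contextmanager

events = []

@contextmanager
def tag(name):
    events.append("enter " + name)
    try:
        yield name
    finally:
        events.append("exit " + name)

# tag itself is an ordinary function (the factory) ...
is_function = callable(tag)
# ... and each call returns a fresh context manager object.
cm = tag("x")
has_protocol = hasattr(cm, "__enter__") and hasattr(cm, "__exit__")

with cm as value:
    events.append("body " + value)
```

So "tag" plays the role of "g" above, and "tag('x')" plays the role of "g()".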

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Wed Oct 12 11:41:56 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Oct 2005 19:41:56 +1000
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <digorg$opr$1@sea.gmane.org>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>	<434B3C01.5030001@ronadam.com>	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>	<434B6C3A.7020001@canterbury.ac.nz>	<434BD172.3030004@gmail.com>
	<digorg$opr$1@sea.gmane.org>
Message-ID: <434CDA64.9070302@gmail.com>

Steve Holden wrote:
> But don't forget that at present unpacking can be used at several levels:
> 
>  >>> ((a, b), c) = ((1, 2), 3)
>  >>> a, b, c
> (1, 2, 3)
>  >>>
> 
> So presumably you'd need to be able to say
> 
>    ((a, *b), c, *d) = ((1, 2, 3), 4, 5, 6)
> 
> and see
> 
>    a, b, c, d == 1, (2, 3), 4, (5, 6)
> 
> if we are to retain today's multi-level consistency.

That seems reasonable enough. I'd considered such code bad style though.

And are you also
> proposing to allow
> 
>    a, *b = [1]
> 
> to put the empty list into b, or is that an unpacking error?

It does the same as function parameter unpacking does, by making b the empty 
tuple.

> This would allow users to provide an arbitrary number of positionals 
> rather than having them become keyword arguments. At present it's 
> difficult to specify that.

That's the reasoning behind the "* without a name" idea:

    def f(a, b, c=default, *, foo=1, bar=2): pass

Here, c is a positional argument with a default value, while foo and bar are 
forced to be keyword arguments.
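
A sketch of how such a signature behaves, using Python 3, where the bare "*" was eventually accepted. "None" stands in for the unspecified "default", and the returned tuple is only there to make the binding visible.

```python
def f(a, b, c=None, *, foo=1, bar=2):
    return (a, b, c, foo, bar)

print(f(1, 2))              # (1, 2, None, 1, 2)
print(f(1, 2, 3, bar=5))    # (1, 2, 3, 1, 5)

try:
    f(1, 2, 3, 4)           # a fourth positional cannot reach foo
except TypeError as exc:
    print("rejected:", exc)
```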

Completely nuts idea #576 would involve extending this concept past the keyword 
dict as well to get function default values which aren't arguments:

    def f(pos1, pos2, pos3=default, *,
          kw1=1, kw2=2, **,
          const="Nutty idea"):
        pass

Py> f(const=1)
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
TypeError: f() got an unexpected keyword argument 'const'

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Wed Oct 12 11:51:36 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Oct 2005 19:51:36 +1000
Subject: [Python-Dev] Making Queue.Queue easier to use
In-Reply-To: <ca471dc20510110812h3485c0eck35d7ab6d4dd2c8d5@mail.gmail.com>
References: <434B8DBF.9080509@iinet.net.au>
	<ca471dc20510110812h3485c0eck35d7ab6d4dd2c8d5@mail.gmail.com>
Message-ID: <434CDCA8.7050705@gmail.com>

Guido van Rossum wrote:
> Apart from trying to guess the API without reading the docs (:-), what
> are the use cases for using put/get with a timeout? I have a feeling
> it's not that common.

Actually, I think wanting to use a timeout is an artifact of a history of 
dealing with too many C libraries which don't provide a proper event-based or 
select-style interface (which means the calls have to time out periodically in 
order to respond gracefully to program shutdown requests).

However, because Queues are multi-producer, that isn't a problem - I just have 
to remember to push the shutdown request in through the Queue.
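
A minimal sketch of that approach, assuming a private sentinel object as the shutdown request and upper-casing strings as placeholder work; the names are illustrative, and the module is imported under its Python 3 name.

```python
import queue
import threading

_SHUTDOWN = object()   # hypothetical sentinel used as the shutdown request

def worker(q, out):
    while True:
        item = q.get()          # plain blocking get, no timeout needed
        if item is _SHUTDOWN:
            break
        out.append(item.upper())   # placeholder for real work

q = queue.Queue()
out = []
t = threading.Thread(target=worker, args=(q, out))
t.start()
for item in ("a", "b"):
    q.put(item)
q.put(_SHUTDOWN)       # any producer may request the shutdown
t.join()
```

With several workers on one queue, one sentinel per worker would be needed, since each get() consumes the item it returns.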

Basically, I'd fallen into the "trying-to-write-C-in-Python" trap and I simply 
didn't notice until I read the responses in this thread :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Wed Oct 12 12:09:24 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Oct 2005 20:09:24 +1000
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <434C235D.1060404@ronadam.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>	<434B3C01.5030001@ronadam.com>	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>	<434B6C3A.7020001@canterbury.ac.nz>	<dih15g$nih$1@sea.gmane.org>
	<434C235D.1060404@ronadam.com>
Message-ID: <434CE0D4.3070809@gmail.com>

Ron Adam wrote:
> I wonder if something like the following would fulfill the need?

Funny you should say that. . .

A pre-PEP proposing itertools.iunpack (amongst other things):
http://mail.python.org/pipermail/python-dev/2004-November/050043.html

And the reason that PEP was never actually created:
http://mail.python.org/pipermail/python-dev/2004-November/050068.html

Obviously, I've changed my views over the last year or so ;)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Wed Oct 12 12:25:18 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Oct 2005 20:25:18 +1000
Subject: [Python-Dev] Proposed changes to PEP 343
In-Reply-To: <bb8868b90510120256l313ce556k9eaa7447ab2700bb@mail.gmail.com>
References: <4346467D.5010005@iinet.net.au> <di5igq$41k$1@sea.gmane.org>	
	<43466C3B.50704@gmail.com> <di66pi$f3t$1@sea.gmane.org>	
	<bb8868b90510111312i46ff76dvd72484ec42d5d26e@mail.gmail.com>	
	<bb8868b90510111315t156c261eo9362f0c02b351b87@mail.gmail.com>	
	<434C7093.9070802@canterbury.ac.nz> <434CD62E.4020901@gmail.com>
	<bb8868b90510120256l313ce556k9eaa7447ab2700bb@mail.gmail.com>
Message-ID: <434CE48E.9090305@gmail.com>

Jason Orendorff wrote:
> On 10/12/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
>>Strictly speaking this fits in with the existing confusion of "generator
>>factory" and "generator":
>>
>>Py> def g():
>>...     yield None
>>...
>>Py> type(g)
>><type 'function'>
>>Py> type(g())
>><type 'generator'>
>>
>>Most people would call "g" a generator, even though it's really just a factory
>>function that returns generator objects.
> 
> 
> Not the same.  A precise term exists for "g": it's a generator function.
> PEP 255 explicitly talks about this:
> 
>     "...Note that when
>     the intent is clear from context, the unqualified name "generator" may
>     be used to refer either to a generator-function or a generator-
>     iterator."
> 
> What would the corresponding paragraph be for PEP 343?


      "...Note that when the intent is clear from context, the unqualified name
      'context manager' may be used to refer either to a 'context manager
       function' or to an actual 'context manager object'. This distinction is
       primarily relevant for generator-based context managers, and is similar
       to that between a normal generator-function and a generator-iterator."

Basically, a context manager object is an object with __enter__ and __exit__ 
methods, while the __with__ method itself is a context manager function.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From john.m.camara at comcast.net  Wed Oct 12 13:03:58 2005
From: john.m.camara at comcast.net (john.m.camara@comcast.net)
Date: Wed, 12 Oct 2005 11:03:58 +0000
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
Message-ID: <101220051103.1985.434CED9E0009C73C000007C122007503300E9D0E030E0CD203D202080106@comcast.net>

Greg Ewing wrote:

> 
> Guido van Rossum wrote:
> 
> > I see no need. Code that *doesn't* need Queue but does use threading
> > shouldn't have to pay for loading Queue.py.
> 
> However, it does seem awkward to have a whole module
> providing just one small class that logically is so
> closely related to other threading facilities.
> 
> What we want in this kind of situation is some sort
> of autoloading mechanism, so you can import something
> from a module and have it trigger the loading of another
> module behind the scenes to provide it.
> 

Bad idea unless it is tied to a namespace, so that users know where this auto-loaded functionality is coming from.  Otherwise it's just as bad as 'from xxx import *'.

John M. Camara

From mcherm at mcherm.com  Wed Oct 12 13:35:18 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed, 12 Oct 2005 04:35:18 -0700
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
Message-ID: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>

> Guido van Rossum writes:
> Code that *doesn't* need Queue but does use threading
> shouldn't have to pay for loading Queue.py.

Greg Ewing responds:
> What we want in this kind of situation is some sort
> of autoloading mechanism, so you can import something
> from a module and have it trigger the loading of another
> module behind the scenes to provide it.

John Camara comments:
> Bad idea unless it is tied to a namespace.  So that users knows
> where this auto-loaded functionality is coming from.  Otherwise
> it's just as bad as 'from xxx import *'.

John, I think what Greg is suggesting is that we include Queue
in the threading module, but that we use a Clever Trick(TM) to
address Guido's point by not actually loading the Queue code
until the first time (if ever) that it is used.

I'm not familiar with the clever trick Greg is proposing, but I
do agree that _IF_ everything else were equal, then Queue seems
to belong in the threading module. My biggest reason is that I
think anyone who is new to threading probably shouldn't use any
communication mechanism OTHER than Queue or something similar
which has been carefully designed by someone knowledgeable.

-- Michael Chermside


From mcherm at mcherm.com  Wed Oct 12 13:54:38 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed, 12 Oct 2005 04:54:38 -0700
Subject: [Python-Dev] Extending tuple unpacking
Message-ID: <20051012045438.gz4jb9pc1wwskwcw@login.werra.lunarpages.com>

Steve Holden writes:
> I do feel that for Python 3 it might be better to make a clean
> separation between keywords and positionals: in other words, if the
> function definition specifies a keyword argument then a keyword must be
> used to present it.
>
> This would allow users to provide an arbitrary number of positionals
> rather than having them become keyword arguments. At present it's
> difficult to specify that.

Antoine Pitrou already responded:
> Do you mean it would also be forbidden to invoke a "positional" argument
> using its keyword? It would be a huge step back in usability IMO. Some
> people like invoking by position (because it's shorter) and some others
> prefer invoking by keyword (because it's more explicit). Why should the
> implementer of the API have to make a choice for the user of the API ?

I strongly agree with Antoine here, but the combination of "keyword
arguments after the star":

> foo(a, b, c=1, *args, d=2, e=5, **kwargs)
>     ^^^^^^^^^         ^^^^^^^^
>     positional        only with kw
>     or with kw

with "star without a name":

>     def f(a, b, c=default, *, foo=1, bar=2): pass
>
> Here, c is a positional argument with a default value, while foo and bar are
> forced to be keyword arguments.

is quite tempting. It satisfies Steve by allowing the implementer of the
function to require keyword arguments. It satisfies Antoine and myself
by also allowing the implementor of the function to permit positional OR
keyword use, and making this the default behavior. It is logically
consistent. There's just one big problem that I know of:

Guido writes:
> I've always wanted to write that as
>
>     f(a, b, *args, foo=1, bar=2, **kwds)
>
> but the current grammar doesn't allow it.

Hmm.... why doesn't the current grammar allow it, and can we fix that?
I don't see that it's a limitation of the one-token-lookahead. Could
we permit this syntax by rearranging bits of the grammar?

-- Michael Chermside




From cludwig at cdc.informatik.tu-darmstadt.de  Wed Oct 12 14:09:18 2005
From: cludwig at cdc.informatik.tu-darmstadt.de (Christoph Ludwig)
Date: Wed, 12 Oct 2005 14:09:18 +0200
Subject: [Python-Dev] [C++-sig]  GCC version compatibility
In-Reply-To: <20050716101357.GC3607@lap200.cdc.informatik.tu-darmstadt.de>
References: <42CDA654.2080106@v.loewis.de> <uu0j6p7z1.fsf@boost-consulting.com>
	<20050708072807.GC3581@lap200.cdc.informatik.tu-darmstadt.de>
	<u8y0hl45u.fsf@boost-consulting.com> <42CEF948.3010908@v.loewis.de>
	<20050709102010.GA3836@lap200.cdc.informatik.tu-darmstadt.de>
	<42D0D215.9000708@v.loewis.de>
	<20050710125458.GA3587@lap200.cdc.informatik.tu-darmstadt.de>
	<42D15DB2.3020300@v.loewis.de>
	<20050716101357.GC3607@lap200.cdc.informatik.tu-darmstadt.de>
Message-ID: <20051012120917.GA11058@lap200.cdc.informatik.tu-darmstadt.de>

Hi,

this is to continue a discussion started back in July by a posting by 
Dave Abrahams <url:http://thread.gmane.org/gmane.comp.python.devel/69651>
regarding the compiler (C vs. C++) used to compile python's main() and to link
the executable.


On Sat, Jul 16, 2005 at 12:13:58PM +0200, Christoph Ludwig wrote:
> On Sun, Jul 10, 2005 at 07:41:06PM +0200, "Martin v. Löwis" wrote:
> > Maybe. For Python 2.4, feel free to contribute a more complex test. For
> > Python 2.5, I would prefer if the entire code around ccpython.cc was
> > removed.
> 
> I submitted patch #1239112 that implements the test involving two TUs for
> Python 2.4. I plan to work on a more comprehensive patch for Python 2.5 but
> that will take some time.


I finally had the spare time to look into this problem again and submitted
patch #1324762. The proposed patch implements the following:

1) The configure option --with-cxx is renamed --with-cxx-main. This was done
to avoid surprising the user by the changed meaning. Furthermore, it is now
possible that CXX has a different value than provided by --with-cxx-main, so
the old name would have been confusing. 
 
2) The compiler used to translate python's main() function is stored in the
configure / Makefile variable MAINCC. By default, MAINCC=$(CC). If
--with-cxx-main is given (without an appended compiler name), then
MAINCC=$(CXX). If --with-cxx-main=<compiler> is on the configure command line,
then MAINCC=<compiler>. Additionally, configure sets CXX=<compiler> unless CXX
was already set on the configure command line. 
 
3) The command used to link the python executable is (as before) stored in
LINKCC. By default, LINKCC='$(PURIFY) $(MAINCC)', i.e. the linker front-end is
the compiler used to translate main(). If necessary, LINKCC can be set on the
configure command line in which case it won't be altered. 
 
4) If CXX is not set by the user (on the command line or via --with-cxx-main),
then configure tries several likely C++ compiler names. CXX is assigned the
first name that refers to a callable program in the system. (CXX is set even
if python is built with a C compiler only, so distutils can build C++
extensions.)  
 
5) Modules/ccpython.cc is no longer used and can be removed. 

I think that makes it possible to build python appropriately on every
platform:

- By default, python is built with the C compiler only; CXX is assigned the
  name of a "likely" C++ compiler. This works fine, e.g., on ELF systems like
  x86 / Linux where  python should not have any dependency on the C++
  runtime to avoid conflicts with C++ extensions. distutils can still build
  C++ extensions since CXX names a callable C++ compiler.

- On platforms that require main() to be a C++ function if C++ extensions are
  to be imported, the user can configure python --with-cxx-main. On platforms
  where one must compile main() with a C++ compiler, but does not need to link
  the executable with the same compiler, the user can specify both
  --with-cxx-main and LINKCC on the configure command line.

Best regards

Christoph

-- 
http://www.informatik.tu-darmstadt.de/TI/Mitarbeiter/cludwig.html
LiDIA: http://www.informatik.tu-darmstadt.de/TI/LiDIA/Welcome.html


From john.m.camara at comcast.net  Wed Oct 12 15:12:05 2005
From: john.m.camara at comcast.net (john.m.camara@comcast.net)
Date: Wed, 12 Oct 2005 13:12:05 +0000
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
Message-ID: <101220051312.4429.434D0BA5000808D40000114D22007503300E9D0E030E0CD203D202080106@comcast.net>

> > Guido van Rossum writes:
> > Code that *doesn't* need Queue but does use threading
> > shouldn't have to pay for loading Queue.py.
> 
> Greg Ewing responds:
> > What we want in this kind of situation is some sort
> > of autoloading mechanism, so you can import something
> > from a module and have it trigger the loading of another
> > module behind the scenes to provide it.
> 
> John Camara comments:
> > Bad idea unless it is tied to a namespace, so that users know
> > where this auto-loaded functionality is coming from.  Otherwise
> > it's just as bad as 'from xxx import *'.
> 
> Michael Chermside comments:
> John, I think what Greg is suggesting is that we include Queue
> in the threading module, but that we use a Clever Trick(TM) to
> address Guido's point by not actually loading the Queue code
> until the first time (if ever) that it is used.
> 
> I'm not familiar with the clever trick Greg is proposing, but I
> do agree that _IF_ everything else were equal, then Queue seems
> to belong in the threading module. My biggest reason is that I
> think anyone who is new to threading probably shouldn't use any
> communication mechanism OTHER than Queue or something similar
> which has been carefully designed by someone knowledgeable.
> 
I guess from Greg’s comments I’m not sure if he wants to

import threading

and as a result

‘Queue’ becomes available in the local namespace and bound/loaded when it is first needed and thus saves himself from typing ‘import Queue’ immediately after ‘import threading’

or

Queue becomes part of the threading namespace and bound/loaded when it is first needed.  Queue then becomes accessible through ‘threading.Queue’

When Greg says

> However, it does seem awkward to have a whole module
> providing just one small class that logically is so
> closely related to other threading facilities.

It sounds like he feels Queue should just be part of threading, but queues
can be used in other contexts besides threading.  So having separate
modules is a good thing.

The idea of delaying an import until it’s needed sounds great, and built-in
support for it would be welcome.  Here are 2 possible spellings for the
import statement:

import Queue asneeded
delayedimport Queue     # can't think of a better name at this time

But having one module auto-load another on behalf of a client just doesn’t
sit well with me.  Think of the confusion it would cause: is Queue in the
threading module a reference to the Queue in the Queue module, or a new
class altogether?  If we go down this slippery slope, we will see modules
like array, struct, etc. getting referenced and auto-loaded on behalf of
the client.  Where will it end?

John M. Camara



From mcherm at mcherm.com  Wed Oct 12 16:25:28 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed, 12 Oct 2005 07:25:28 -0700
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
Message-ID: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com>

John Camara writes:
> It sounds like he feels Queue should just be part of threading but queues
> can be used in other contexts besides threading.  So having separate
> modules is a good thing.

Perhaps I am wrong here, but the Queue.Queue class is designed specifically
for synchronization, and I have always been under the impression that
it was probably NOT the best tool for normal queues that have nothing
to do with threading. Why incur the overhead of synchronization locks
when you don't intend to use them? I would advise against using Queue.Queue
in any context besides threading.

continued...
> I guess from Greg’s comments I’m not sure if he wants to [...]

I'm going to stop trying to channel Greg here, he can speak for himself.
But I will be quite surprised if _anyone_ supports the idea of having
a module modify the local namespace of its importer when it is imported.

and later...
> Here are 2 possible suggestions for the import statements
>
> import Queue asneeded
> delayedimport Queue     # can't think of a better name at this time


Woah! There is no need for new syntax here! If you want to import
Queue only when needed use this (currently legal) syntax:

    if queueIsNeeded:
        import Queue

If you want to add a module (call it "Queue") to the namespace, but
delay executing some of the code for now, then just use "import Queue"
and modify the module so that it doesn't do all its work at import
time, but delays some of it until needed. That too is possible today:

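    # start of module
    initialized = False

    def initialize():
        global initialized
        # ... do the expensive setup here ...
        initialized = True

    def doSomething():
        if not initialized:
            initialize()
        # ...

    # ...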

Python today is incredibly dynamic and flexible... despite the usual
tenor of conversations on python-dev, it is very rare to encounter a
problem that cannot be solved (and readably so) using the existing
tools and constructs.
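
A minimal sketch of the first option, deferring the import itself until
the code path that needs it actually runs. The function name make_queue
is made up for illustration, and the module is spelled "queue" here as in
Python 3 (it is "Queue" in the Python 2 under discussion):

```python
import sys


def make_queue():
    # The import statement runs only when make_queue() is first called;
    # later calls find the module already cached in sys.modules.
    import queue  # spelled "Queue" in Python 2
    return queue.Queue()


q = make_queue()
q.put("job")
```

Code that never calls make_queue() never triggers this import.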

-- Michael Chermside

From guido at python.org  Wed Oct 12 16:32:17 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 12 Oct 2005 07:32:17 -0700
Subject: [Python-Dev] Europeans attention please!
Message-ID: <ca471dc20510120732q51c514f1mfd1ecb5f517de417@mail.gmail.com>

I have some 65%-off passes to EuroOSCON which starts next Monday in
Amsterdam. Anybody interested?

http://conferences.oreillynet.com/eurooscon/grid/

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at pobox.com  Wed Oct 12 16:46:52 2005
From: skip at pobox.com (skip@pobox.com)
Date: Wed, 12 Oct 2005 09:46:52 -0500
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
Message-ID: <17229.8668.114534.151179@montanaro.dyndns.org>


    Michael> I'm not familiar with the clever trick Greg is proposing, but I
    Michael> do agree that _IF_ everything else were equal, then Queue seems
    Michael> to belong in the threading module. My biggest reason is that I
    Michael> think anyone who is new to threading probably shouldn't use any
    Michael> communication mechanism OTHER than Queue or something similar
    Michael> which has been carefully designed by someone knowledgeable.

Is the Queue class very useful outside a multithreaded context?  The notion
of a queue as a data structure has meaning outside of threaded applications.
Its presence might seduce a new programmer into thinking it is subtly
different than it really is.  A cursory test suggests that it works, though
q.get() on an empty queue seems a bit counterproductive.  Also, Queue objects
are probably quite a bit less efficient than lists.  Taken as a whole,
perhaps a stronger attachment with the threading module isn't such a bad
idea.
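
To illustrate the point about an empty queue: in a single-threaded
program a plain q.get() would wait forever, because no other thread exists
to put an item. The sketch below uses the non-blocking variant instead.
It assumes the Python 3 spelling of the module ("queue"); the variable
names are illustrative only.

```python
import queue  # named "Queue" in Python 2

q = queue.Queue()
try:
    # A plain q.get() would block here indefinitely, since nothing
    # will ever call q.put(); get_nowait() raises queue.Empty instead.
    q.get_nowait()
    result = "got an item"
except queue.Empty:
    result = "empty"
```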

Skip

From guido at python.org  Wed Oct 12 16:58:10 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 12 Oct 2005 07:58:10 -0700
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
Message-ID: <ca471dc20510120758n6697cd66h46d903814985ba1a@mail.gmail.com>

On 10/12/05, Michael Chermside <mcherm at mcherm.com> wrote:
> I'm not familiar with the clever trick Greg is proposing, but I
> do agree that _IF_ everything else were equal, then Queue seems
> to belong in the threading module. My biggest reason is that I
> think anyone who is new to threading probably shouldn't use any
> communication mechanism OTHER than Queue or something similar
> which has been carefully designed by someone knowledgeable.

I *still* disagree. At some level, Queue is just an application of
threading, while the threading module provides the basic API (never
mind that there's an even more basic API, the thread module -- it's
too low-level to consider and we actively recommend against it, at
least I hope we do).

While at this point there may be no other "applications" of threading
in the standard library, that may not remain the case; it's quite
possible that some of the discussions of threading APIs will eventually
lead to a PEP proposing a different threading paradigm built on top of
the threading module.

I'm using the word "application" loosely here because I realize one
person's application is another's primitive operation. But I object to
the idea that just because A and B are often used together or A is
recommended for programs using B that A and B should live in the same
module. We don't put urllib and httplib in the socket module either!

Now, if we had a package structure, I would sure like to see threading
and Queue end up as neighbors in the same package. But I don't think
it's right to package them all up in the same module.

(Not to say that autoloading is a bad idea; I'm -0 on it for myself,
but I can see use cases; but it doesn't change my mind on whether
Queue should become threading.Queue. I guess I didn't articulate my
reasoning for being against that well previously and tried to hide
behind the load time argument.)

BTW, Queue.Queue violates a recent module naming standard; it is now
considered bad style to name the class and the module the same.
Modules and packages should have short all-lowercase names, classes
should be CapWords. Even the same but different case is bad style.
(I'd suggest queueing.Queue except nobody can type that right. :)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Wed Oct 12 17:00:30 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 12 Oct 2005 08:00:30 -0700
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <17229.8668.114534.151179@montanaro.dyndns.org>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<17229.8668.114534.151179@montanaro.dyndns.org>
Message-ID: <ca471dc20510120800k24359140o6fa6166e51aa6e40@mail.gmail.com>

On 10/12/05, skip at pobox.com <skip at pobox.com> wrote:
> Is the Queue class very useful outside a multithreaded context?

No. It was designed specifically for inter-thread communication.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Wed Oct 12 17:19:00 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 12 Oct 2005 11:19:00 -0400
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <loom.20051012T040249-299@post.gmane.org>
References: <2mll1ghsjc.fsf@starship.python.net>
	<397621172.20050927111836@MailBlocks.com>
	<433AE8A8.3010500@v.loewis.de>
	<329633301.20050929074337@MailBlocks.com>
	<2mll1ghsjc.fsf@starship.python.net>
	<5.1.1.6.0.20050929121236.0399ed88@mail.telecommunity.com>
	<160502469.20050929104837@MailBlocks.com>
Message-ID: <5.1.1.6.0.20051012111425.01f3eec8@mail.telecommunity.com>

At 02:35 AM 10/12/2005 +0000, Joshua Spoerri wrote:
>that stm paper isn't the end.
>
>there's a java implementation which seems to be exactly what we want:
>http://research.microsoft.com/~tharris/papers/2003-oopsla.pdf

There's already a Python implementation of what's described in the 
paper.  It's called ZODB.  :)  Just use the memory backend if you don't 
want the objects to persist.

Granted, if you want automatic retry you'll need to create a decorator that 
catches conflict errors.  But basically, ZODB implements a similar 
optimistic conflict management transaction algorithm to that described in 
the paper.  Certainly, it's the closest thing you can get in CPython 
without a complete redesign of the VM.


From skromta at gmail.com  Sat Oct  8 09:55:43 2005
From: skromta at gmail.com (Kalle Anke)
Date: Sat, 8 Oct 2005 09:55:43 +0200
Subject: [Python-Dev] Pythonic concurrency
References: <20051006143740.287E.JCARLSON@uci.edu>
	<200510070145.17284.ms@cerenity.org>
	<20051006221436.2892.JCARLSON@uci.edu>
	<415220344.20051007104751@MailBlocks.com>
Message-ID: <0001HW.BF6D481F0162024EF0407550@news.gmane.org>

On Fri, 7 Oct 2005 18:47:51 +0200, Bruce Eckel wrote
(in article <415220344.20051007104751 at MailBlocks.com>):

> It's hard to know how to answer. I've met enough brilliant people to
> know that it's just possible that the person posting really does
> easily grok concurrency issues and thus I must seem irreconcilably
> thick. This may actually be one of those people for whom threading is
> obvious (and Ian has always seemed like a smart guy, for example).

I think it depends on which "level" you're talking about. Concurrency IS very 
easy and "natural" at a conceptual level. It's also quite easy for doing 
basic stuff ... but it can become very complicated if you introduce different 
requirements and/or the system becomes complex and/or you're going to 
implement the actual mechanism.

That's my limited experience (personally, I really like concurrency ... and 
to be honest, some people can't really understand the concept at all while 
others have no problem so it's a "personal thing" also)



From john.m.camara at comcast.net  Wed Oct 12 17:37:24 2005
From: john.m.camara at comcast.net (john.m.camara@comcast.net)
Date: Wed, 12 Oct 2005 15:37:24 +0000
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
Message-ID: <101220051537.5148.434D2DB3000F41170000141C22007503300E9D0E030E0CD203D202080106@comcast.net>

> John Camara writes:
> > It sounds like he feels Queue should just be part of threading but queues
> > can be used in other contexts besides threading.  So having separate
> > modules is a good thing.
>
> Michael Chermside
> Perhaps I am wrong here, but the Queue.Queue class is designed specifically
> for synchronization, and I have always been under the impression that
> it was probably NOT the best tool for normal queues that have nothing
> to do with threading. Why incur the overhead of synchronization locks
> when you don't intend to use them? I would advise against using Queue.Queue
> in any context besides threading.

I haven't used the Queue class before as I normally use a list for a queue.  
I just assumed a Queue was just a queue that was perhaps optimized for 
performance.  I guess I would have expected the Queue class as defined 
in the standard library to have a different name if it wasn't just a queue.
Well, I should have known better than to make assumptions on this list. :)

I now see where Greg is coming from but I'm still not comfortable having 
it in the threading module.  To me threads and queues are two different 
beasts.

John M. Camara




From skip at pobox.com  Wed Oct 12 18:02:38 2005
From: skip at pobox.com (skip@pobox.com)
Date: Wed, 12 Oct 2005 11:02:38 -0500
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <ca471dc20510120758n6697cd66h46d903814985ba1a@mail.gmail.com>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<ca471dc20510120758n6697cd66h46d903814985ba1a@mail.gmail.com>
Message-ID: <17229.13214.981827.304999@montanaro.dyndns.org>


    Guido> At some level, Queue is just an application of threading, while
    Guido> the threading module provides the basic API ...

While Queue is built on top of threading Lock and Condition objects, it is a
highly useful synchronization mechanism in its own right, and is almost
certainly easier to use correctly (at least for novices) than the
lower-level synchronization objects the threading module provides.  If
threading is the "friendly" version of thread, perhaps Queue should be
considered the "friendly" synchronization object.

(I'm playing the devil's advocate here.  I'm fine with Queue being where it
is.)
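
A minimal sketch of Queue as the "friendly" synchronization object: one
worker thread and the main thread exchange data without touching a Lock
or Condition directly. The worker function, the None sentinel and the
variable names are assumptions made for this example, and the module is
spelled "queue" as in Python 3.

```python
import queue      # "Queue" in Python 2
import threading


def worker(jobs, results):
    while True:
        item = jobs.get()       # blocks until a job is available
        if item is None:        # sentinel: no more work
            break
        results.put(item * item)


jobs = queue.Queue()
results = queue.Queue()
t = threading.Thread(target=worker, args=(jobs, results))
t.start()
for n in range(5):
    jobs.put(n)
jobs.put(None)                  # tell the worker to stop
t.join()
squares = [results.get() for _ in range(5)]
```

With a single worker the results come back in submission order; with
several workers that ordering would no longer be guaranteed.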

Skip

From john.m.camara at comcast.net  Wed Oct 12 18:04:13 2005
From: john.m.camara at comcast.net (john.m.camara@comcast.net)
Date: Wed, 12 Oct 2005 16:04:13 +0000
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
Message-ID: <101220051604.20801.434D33FD000335700000514122007614380E9D0E030E0CD203D202080106@comcast.net>

> Skip writes:
> Is the Queue class very useful outside a multithreaded context?  The notion
> of a queue as a data structure has meaning outside of threaded applications.
> Its presence might seduce a new programmer into thinking it is subtly
> different than it really is.  A cursory test suggests that it works, though
> q.get() on an empty queue seems a bit counterproductive.  Also, Queue objects
> are probably quite a bit less efficient than lists.  Taken as a whole,
> perhaps a stronger attachment with the threading module isn't such a bad
> idea.
> 
Maybe Queue belongs in a module called synchronize to avoid any confusion.

John M. Camara



From solipsis at pitrou.net  Wed Oct 12 18:11:40 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 12 Oct 2005 18:11:40 +0200
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <101220051604.20801.434D33FD000335700000514122007614380E9D0E030E0CD203D202080106@comcast.net>
References: <101220051604.20801.434D33FD000335700000514122007614380E9D0E030E0CD203D202080106@comcast.net>
Message-ID: <1129133500.6178.16.camel@fsol>


> Maybe Queue belongs in a module called synchronize to avoid any confusion.

Why not /just/ make the doc a little bit more explicit?
Instead of saying:
        It is especially useful in threads programming when information
        must be exchanged safely between multiple threads.
Replace it with:
        It is dedicated to threads programming for safe exchange of
        information between multiple threads. On the other hand, if you
        are only looking for a single-thread queue structure, use the
        built-in list type, or the deque class from the collections
        module.
If necessary, put it in bold ;)
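
For the single-thread case mentioned in the proposed wording, a short
sketch of collections.deque used as a plain FIFO queue (the variable
names are illustrative):

```python
from collections import deque

pending = deque()
pending.append("first")
pending.append("second")
served = pending.popleft()   # O(1); list.pop(0) would be O(n)
```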




From aahz at pythoncraft.com  Wed Oct 12 19:47:25 2005
From: aahz at pythoncraft.com (Aahz)
Date: Wed, 12 Oct 2005 10:47:25 -0700
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <434C6662.4040503@canterbury.ac.nz>
References: <434B8DBF.9080509@iinet.net.au>
	<ca471dc20510110812h3485c0eck35d7ab6d4dd2c8d5@mail.gmail.com>
	<434C6662.4040503@canterbury.ac.nz>
Message-ID: <20051012174725.GA26101@panix.com>

On Wed, Oct 12, 2005, Greg Ewing wrote:
> Guido van Rossum wrote:
>> 
>> I see no need. Code that *doesn't* need Queue but does use threading
>> shouldn't have to pay for loading Queue.py.

I'd argue that such code is rare enough (given the current emphasis on
Queue) that the performance issue doesn't matter.

> However, it does seem awkward to have a whole module providing
> just one small class that logically is so closely related to other
> threading facilities.

The problem is that historically Queue did not use ``threading``; it was
built directly on top of ``thread``, and people were told to use Queue
regardless of whether they were using ``thread`` or ``threading``.
Obviously, there is no use case for putting Queue into ``thread``, so off
it went into its own module.  At this point, my opinion is that we should
leave reorganizing the thread stuff until Python 3.0.  (Python 3.0
should "deprecate" ``thread`` by renaming it to ``_thread``).
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From guido at python.org  Wed Oct 12 19:55:06 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 12 Oct 2005 10:55:06 -0700
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <20051012174725.GA26101@panix.com>
References: <434B8DBF.9080509@iinet.net.au>
	<ca471dc20510110812h3485c0eck35d7ab6d4dd2c8d5@mail.gmail.com>
	<434C6662.4040503@canterbury.ac.nz> <20051012174725.GA26101@panix.com>
Message-ID: <ca471dc20510121055h43bd4c62je95b348dda7fda2b@mail.gmail.com>

On 10/12/05, Aahz <aahz at pythoncraft.com> wrote:
>  (Python 3.0
> should "deprecate" ``thread`` by renaming it to ``_thread``).

+1. (We could even start doing this before 3.0.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From mcherm at mcherm.com  Wed Oct 12 21:33:18 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed, 12 Oct 2005 12:33:18 -0700
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
Message-ID: <20051012123318.vb9uvd4meu40sc0c@login.werra.lunarpages.com>

Aahz writes:
> (Python 3.0 should "deprecate" ``thread`` by renaming it to ``_thread``).

Guido says:
> +1. (We could even start doing this before 3.0.)

Before 3.0, let's deprecate it by listing it in the Deprecated modules
section within the documentation... no need to gratuitously break code
by renaming it until 3.0 arrives.

-- Michael Chermside


From rrr at ronadam.com  Wed Oct 12 22:52:20 2005
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 12 Oct 2005 16:52:20 -0400
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <434CE0D4.3070809@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>	<434B3C01.5030001@ronadam.com>	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>	<434B6C3A.7020001@canterbury.ac.nz>	<dih15g$nih$1@sea.gmane.org>	<434C235D.1060404@ronadam.com>
	<434CE0D4.3070809@gmail.com>
Message-ID: <434D7784.7020209@ronadam.com>

Nick Coghlan wrote:
> Ron Adam wrote:
> 
>>I wonder if something like the following would fulfill the need?
> 
> 
> Funny you should say that. . .
> 
> A pre-PEP proposing itertools.iunpack (amongst other things):
> http://mail.python.org/pipermail/python-dev/2004-November/050043.html
> 
> And the reason that PEP was never actually created:
> http://mail.python.org/pipermail/python-dev/2004-November/050068.html
> 
> Obviously, I've changed my views over the last year or so ;)
> 
> Cheers,
> Nick.

It looked like the PEP didn't get created because there wasn't enough
interest at the time, not because there was anything wrong with the
idea.  And the motivation was, surprisingly, that this would be
discussed again, and here it is.  ;-)

I reversed my view in the other direction in the past 6 months or so. 
Mostly because when chaining methods or functions with * and **, my mind 
(which often doesn't get enough sleep) wants to think they mean the 
same thing in both ends of the method.

For example... (with small correction from the previous example)

     def div_at(self, *args):
         return self.__class__(self.div_iter(*args))

This would read better to me if it was.

     # (just an example, not a suggestion.)

     def div_at(self, *args):
         return self.__class__(self.div_iter(/args))

But I may be one of only a few for whom this is a minor annoyance.  <shrug>

I wonder if you make '*' work outside of function argument lists, if
requests to do the same for '**' would follow?
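
To make the two roles of '*' concrete, here is a small sketch in current
syntax. The function names collect and forward are made up for this
example and are not the div_at/div_iter methods above.

```python
def collect(*args):
    # In a def, '*' packs the positional arguments into a tuple.
    return args


def forward(*args):
    # In a call, '*' unpacks that tuple back into positional arguments.
    return collect(*args)


packed = forward(1, 2, 3)
```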

Cheers,
    Ron


From aahz at pythoncraft.com  Wed Oct 12 23:02:41 2005
From: aahz at pythoncraft.com (Aahz)
Date: Wed, 12 Oct 2005 14:02:41 -0700
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <20051012123318.vb9uvd4meu40sc0c@login.werra.lunarpages.com>
References: <20051012123318.vb9uvd4meu40sc0c@login.werra.lunarpages.com>
Message-ID: <20051012210241.GA887@panix.com>

On Wed, Oct 12, 2005, Michael Chermside wrote:
> Guido says:
>> Aahz writes:
>>>
>>> (Python 3.0 should "deprecate" ``thread`` by renaming it to ``_thread``).
>> 
>> +1. (We could even start doing this before 3.0.)
> 
> Before 3.0, let's deprecate it by listing it in the Deprecated modules
> section within the documentation... no need to gratuitously break code
> by renaming it until 3.0 arrives.

Note carefully the deprecation in quotes.  It's not going to be
literally deprecated, only renamed, similar to the way _socket and
socket work together.  We could also rename to _threading, but I prefer
the simpler change of only a prepended underscore.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From guido at python.org  Wed Oct 12 23:24:49 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 12 Oct 2005 14:24:49 -0700
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <20051012210241.GA887@panix.com>
References: <20051012123318.vb9uvd4meu40sc0c@login.werra.lunarpages.com>
	<20051012210241.GA887@panix.com>
Message-ID: <ca471dc20510121424q39c03d28xab78de7aefd6503b@mail.gmail.com>

On 10/12/05, Aahz <aahz at pythoncraft.com> wrote:
> Note carefully the deprecation in quotes.  It's not going to be
> literally deprecated, only renamed, similar to the way _socket and
> socket work together.  We could also rename to _threading, but I prefer
> the simpler change of only a prepended underscore.

Could you specify exactly what you have in mind? How would backwards
compatibility be maintained in 2.x?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From aahz at pythoncraft.com  Wed Oct 12 23:48:30 2005
From: aahz at pythoncraft.com (Aahz)
Date: Wed, 12 Oct 2005 14:48:30 -0700
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <ca471dc20510121424q39c03d28xab78de7aefd6503b@mail.gmail.com>
References: <20051012123318.vb9uvd4meu40sc0c@login.werra.lunarpages.com>
	<20051012210241.GA887@panix.com>
	<ca471dc20510121424q39c03d28xab78de7aefd6503b@mail.gmail.com>
Message-ID: <20051012214830.GA24007@panix.com>

On Wed, Oct 12, 2005, Guido van Rossum wrote:
> On 10/12/05, Aahz <aahz at pythoncraft.com> wrote:
>>
>> Note carefully the deprecation in quotes.  It's not going to be
>> literally deprecated, only renamed, similar to the way _socket and
>> socket work together.  We could also rename to _threading, but I prefer
>> the simpler change of only a prepended underscore.
> 
> Could you specify exactly what you have in mind? How would backwards
> compatibility be maintained in 2.x?

I'm suggesting that we add a doc note that using the thread module is
discouraged and that it will be renamed in 3.0.

-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From mcherm at mcherm.com  Thu Oct 13 00:00:17 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed, 12 Oct 2005 15:00:17 -0700
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
Message-ID: <20051012150017.njcyu2ftthc0wosk@login.werra.lunarpages.com>

Aahz writes:
> I'm suggesting that we add a doc note that using the thread module is
> discouraged and that it will be renamed in 3.0.

Then we're apparently all in agreement.

-- Michael Chermside


From eyal.lotem at gmail.com  Tue Oct 11 23:31:42 2005
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Tue, 11 Oct 2005 23:31:42 +0200
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
Message-ID: <b64f365b0510111431g7fdc573dpc59e4e304ec4e425@mail.gmail.com>

I would like to re-suggest a suggestion I have made in the past, but
with a mild difference, and a narrower scope.

Name: Attribute access for all namespaces

Rationale: globals() access is conceptually the same as setting the
module's attributes but uses a different idiom (access of the dict
directly).  Also, locals() returns a dict, which implies it can affect
the local scope, but quietly ignores changes.  Using attribute access
to access the local/global namespaces just as that is used in instance
namespaces and other modules' namespaces, could reduce the mental
footprint of Python.

Method: All namespace accesses are attribute accesses, and not direct
__dict__ accesses. Thus globals() is replaced by a "module" keyword
(or "magic variable"?) that evaluates to the module object.  Thus,
reading/writing globals in module X, uses getattr/setattr on the
module object, just like doing so in module Y would be.  locals()
would be replaced by a function that returns the frame object
(or a weaker equivalent of a frame object) of the currently running
function.  This object will represent the local namespace and will
allow attribute getting/setting to read/write attributes. Or it can
disallow attribute setting.

Examples:

       global x ; x = 1
Replaced by:
       module.x = 1

or:

      globals()[x] = 1
Replaced by:
      setattr(module, x, 1)


      locals()['x'] = 1 # Quietly fails!
Replaced by:
      frame.x = 1 # Raises error

      x = locals()[varname]
Replaced by:
      x = getattr(frame, varname)


Advantages:
  - Python becomes more consistent w.r.t namespacing and scopes.
Disadvantages:
  - "module" is already possible by importing one's own module, but that is:
    * Confusing and unnecessarily requires naming one's self
      redundantly (Making renaming of the module a bit more difficult).
    * Not easily possible in a __main__/importable module.
    * No equivalent for locals()
  - Automatic script conversion may be difficult in some use cases of
    globals()/locals()
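
The two behaviours the rationale relies on can be checked with existing
Python. This is only a sketch: the "module" keyword and "frame" object
proposed above do not exist, so a throwaway module object named demo
stands in for "the current module", and try_to_set_local is a made-up
helper.

```python
import types


def try_to_set_local():
    x = 0
    locals()["x"] = 1    # writes to a dict, not to the local variable
    return x


demo = types.ModuleType("demo")   # stand-in for "the current module"
setattr(demo, "x", 1)             # attribute access on the module object...
same_thing = vars(demo)["x"]      # ...stores into the module's namespace dict
```

The function still returns 0, which is the "quietly fails" case, while
setting an attribute on a module and writing to its dict are the same
operation.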

From greg.ewing at canterbury.ac.nz  Thu Oct 13 02:40:44 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Oct 2005 13:40:44 +1300
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
Message-ID: <434DAD0C.3060507@canterbury.ac.nz>

Michael Chermside wrote:

> John, I think what Greg is suggesting is that we include Queue
> in the threading module, but that we use a Clever Trick(TM) to
> address Guido's point by not actually loading the Queue code
> until the first time (if ever) that it is used.

I wasn't actually going so far as to suggest doing this,
rather pointing out that, if we had an autoloading mechanism,
this would be an obvious use case for it.

> I'm not familiar with the clever trick Greg is proposing,

I'll see if I can cook up an example of it to show. Be
warned, it is very hackish...

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Thu Oct 13 02:47:29 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Oct 2005 13:47:29 +1300
Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading?
 (Making Queue.Queue easier to use))
In-Reply-To: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
Message-ID: <434DAEA1.9000904@canterbury.ac.nz>

I just tried to implement an autoloader using a technique
I'm sure I used in an earlier Python version, but it no
longer seems to be allowed.

I'm trying to change the __class__ of a newly-imported
module to a subclass of types.ModuleType, but I'm getting

   TypeError: __class__ assignment: only for heap types

Have the rules concerning assignment to __class__ been
made more restrictive recently?

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From pje at telecommunity.com  Thu Oct 13 04:16:57 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 12 Oct 2005 22:16:57 -0400
Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading?
 (Making Queue.Queue easier to use))
In-Reply-To: <434DAEA1.9000904@canterbury.ac.nz>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
Message-ID: <5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com>

At 01:47 PM 10/13/2005 +1300, Greg Ewing wrote:
>I just tried to implement an autoloader using a technique
>I'm sure I used in an earlier Python version, but it no
>longer seems to be allowed.
>
>I'm trying to change the __class__ of a newly-imported
>module to a subclass of types.ModuleType, but I'm getting
>
>    TypeError: __class__ assignment: only for heap types
>
>Have the rules concerning assignment to __class__ been
>made more restrictive recently?

It happened in Python 2.3, actually.  The best way to work around this is 
to add an instance of your subclass to sys.modules *first*, then call 
reload() on it to make the normal import process work.  PEAK uses this to 
implement lazy loading.

Actually, for your purposes, you might be able to just replace the module 
object and copy its contents to the new module's dictionary.
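
A minimal sketch of that second suggestion, under stated assumptions: the names LazyModule, replace_module and the _lazy mapping are hypothetical, invented here for illustration, and this is not PEAK's implementation. It builds an instance of a types.ModuleType subclass, copies the old module's contents into it, and installs it in sys.modules.

```python
import sys
import types


class LazyModule(types.ModuleType):
    """Hypothetical module subclass that fills in missing names on demand."""

    def __getattr__(self, name):
        # Reached only when normal attribute lookup has already failed.
        factories = self.__dict__.get('_lazy', {})
        if name not in factories:
            raise AttributeError(name)
        value = factories[name]()
        setattr(self, name, value)   # cache, so the factory runs only once
        return value


def replace_module(name, lazy):
    """Swap sys.modules[name] for a LazyModule holding the same contents."""
    old = sys.modules[name]
    new = LazyModule(name)
    new.__dict__.update(old.__dict__)   # copy the old module's contents
    new.__dict__['_lazy'] = lazy        # attribute name -> zero-argument factory
    new.__dict__['_original'] = old     # keep the old module object alive
    sys.modules[name] = new
    return new
```

One caveat of copying rather than sharing the dictionary: functions defined in the original module keep the original dictionary as their globals, so a name rebound later on the replacement is not seen by them.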


From greg.ewing at canterbury.ac.nz  Thu Oct 13 05:44:12 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Oct 2005 16:44:12 +1300
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <101220051537.5148.434D2DB3000F41170000141C22007503300E9D0E030E0CD203D202080106@comcast.net>
References: <101220051537.5148.434D2DB3000F41170000141C22007503300E9D0E030E0CD203D202080106@comcast.net>
Message-ID: <434DD80C.80207@canterbury.ac.nz>

john.m.camara at comcast.net wrote:

> I now see where Greg is coming from but I'm still not comfortable having 
> it in the threading module.  To me threads and queues are two different 
> beasts.

All right then, how about putting it in a module called
threadutils or something like that, which is clearly
related to threading, but is open for the addition
of future thread-related features that might arise.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Thu Oct 13 05:44:21 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Oct 2005 16:44:21 +1300
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com>
References: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com>
Message-ID: <434DD815.8070909@canterbury.ac.nz>

Michael Chermside wrote:

>     # start of module
>     initialized = False
> 
>     def doSomething():
>         if not initialized:
>             initialize()

But how do you do this if the thing in question is a
class rather than a function?

The module could export a function getSomeClass()
that clients were required to use instead of just
referencing the class, but that would be klunky
in the extreme.

BTW, I agree that special *syntax* isn't necessarily
needed. But it does seem to me that some sort of
hook is needed somewhere to make this doable
smoothly, that doesn't exist today.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Thu Oct 13 05:44:25 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Oct 2005 16:44:25 +1300
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <101220051312.4429.434D0BA5000808D40000114D22007503300E9D0E030E0CD203D202080106@comcast.net>
References: <101220051312.4429.434D0BA5000808D40000114D22007503300E9D0E030E0CD203D202080106@comcast.net>
Message-ID: <434DD819.7090705@canterbury.ac.nz>

john.m.camara at comcast.net wrote:

> I guess from Greg's comments I'm not sure if he wants to
> 
> import threading
> 
> and as a result
> 
> 'Queue' becomes available in the local namespace

No!!!

> Queue becomes part of the threading namespace and bound/loaded
> when it is first needed.  Queue then becomes accessible through
> 'threading.Queue'

Yes.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Thu Oct 13 07:20:58 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Oct 2005 18:20:58 +1300
Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading?
 (Making Queue.Queue easier to use))
In-Reply-To: <5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com>
Message-ID: <434DEEBA.7050905@canterbury.ac.nz>

Phillip J. Eby wrote:
> At 01:47 PM 10/13/2005 +1300, Greg Ewing wrote:
> 
>> I'm trying to change the __class__ of a newly-imported
>> module to a subclass of types.ModuleType
> 
> It happened in Python 2.3, actually.

Is there a discussion anywhere about the reason this was
done? It would be useful if this capability could be
regained somehow without breaking things.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Thu Oct 13 07:25:58 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Oct 2005 18:25:58 +1300
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <434DAD0C.3060507@canterbury.ac.nz>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<434DAD0C.3060507@canterbury.ac.nz>
Message-ID: <434DEFE6.7030909@canterbury.ac.nz>

I wrote:

> I'll see if I can cook up an example of it to show. Be
> warned, it is very hackish...

Well, here it is. It's even slightly uglier than I thought
it would be due to the inability to change the class of a
module these days.

When you run it, you should get

Imported my_module
Loading the spam module
Glorious processed meat product!
Glorious processed meat product!

#--------------------------------------------------------------

#
#  test.py
#

import my_module

print "Imported my_module"

my_module.spam()
my_module.spam()

#
#  my_module.py
#

import autoloading
autoloading.register(__name__, {'spam': 'spam_module'})

#
#  spam_module.py
#

print "Loading the spam module"

def spam():
   print "Glorious processed meat product!"

#
#  autoloading.py
#

import sys

class AutoloadingModule(object):

   def __getattr__(self, name):
     modname = self.__dict__['_autoload'][name]
     module = __import__(modname, self.__dict__, {}, [name])
     value = getattr(module, name)
     setattr(self, name, value)
     return value

def register(module_name, mapping):
   module = sys.modules[module_name]
   m2 = AutoloadingModule()
   m2.__name__ = module.__name__
   m2.__dict__ = module.__dict__
   # Drop all references to the original module before assigning
   # the _autoload attribute. Otherwise, when the original module
   # gets cleared, _autoload is set to None.
   sys.modules[module_name] = m2
   del module
   m2._autoload = mapping

#--------------------------------------------------------------

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From ncoghlan at gmail.com  Thu Oct 13 11:47:31 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 13 Oct 2005 19:47:31 +1000
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <434D7784.7020209@ronadam.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>	<434B3C01.5030001@ronadam.com>	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>	<434B6C3A.7020001@canterbury.ac.nz>	<dih15g$nih$1@sea.gmane.org>	<434C235D.1060404@ronadam.com>	<434CE0D4.3070809@gmail.com>
	<434D7784.7020209@ronadam.com>
Message-ID: <434E2D33.50204@gmail.com>

Ron Adam wrote:
> I wonder if you make '*' work outside of function argument lists, if
> requests to do the same for '**' would follow?

Only if keyword unpacking were to be permitted elsewhere first. That is:

Py> data = dict(a=1, b=2, c=3)
Py> (a, b, c) = **data
Py> print a, b, c
1 2 3
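
The hypothetical syntax above is not valid Python. A rough equivalent that does run, shown only for comparison, uses operator.itemgetter and names the keys explicitly:

```python
from operator import itemgetter

data = dict(a=1, b=2, c=3)

# Pull the named keys out of the mapping in one step; the key names
# must be spelled out, which the hypothetical "**data" form would avoid.
a, b, c = itemgetter('a', 'b', 'c')(data)
```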

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Thu Oct 13 11:54:15 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 13 Oct 2005 19:54:15 +1000
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <20051012045438.gz4jb9pc1wwskwcw@login.werra.lunarpages.com>
References: <20051012045438.gz4jb9pc1wwskwcw@login.werra.lunarpages.com>
Message-ID: <434E2EC7.5060101@gmail.com>

Michael Chermside wrote:
> Guido writes:
> 
>>I've always wanted to write that as
>>
>>    f(a, b, *args, foo=1, bar=2, **kwds)
>>
>>but the current grammar doesn't allow it.
> 
> 
> Hmm.... why doesn't the current grammar allow it, and can we fix that?
> I don't see that it's a limitation of the one-token-lookahead, could
> we permit this syntax by rearranging bits of the grammar?

I griped about this a while back, and got the impression from Guido that 
fixing it was possible, but it had simply never bugged anyone enough for them 
to actually get around to fixing it.
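
As a sketch of the ordering that is accepted, the explicit keyword arguments can be written before *args. The function f and the values here are made up for illustration.

```python
def f(a, b, *args, **kwds):
    """Return everything received, so the binding is easy to inspect."""
    return a, b, args, kwds


args = (3, 4)
kwds = {'baz': 5}

# Explicit keywords placed before *args: this ordering is accepted.
# The extra positionals are still bound before the keywords are applied.
result = f(1, 2, foo=1, bar=2, *args, **kwds)
```

The call reads less naturally than the form Guido wanted, since the positional extras appear after the keywords even though they are bound first.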

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Thu Oct 13 12:46:42 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 13 Oct 2005 20:46:42 +1000
Subject: [Python-Dev] threadtools (was Re: Autoloading? (Making Queue.Queue
 easier to use))
In-Reply-To: <17229.13214.981827.304999@montanaro.dyndns.org>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>	<ca471dc20510120758n6697cd66h46d903814985ba1a@mail.gmail.com>
	<17229.13214.981827.304999@montanaro.dyndns.org>
Message-ID: <434E3B12.1000108@gmail.com>

skip at pobox.com wrote:
>     Guido> At some level, Queue is just an application of threading, while
>     Guido> the threading module provides the basic API ...
> 
> While Queue is built on top of threading Lock and Condition objects, it is a
> highly useful synchronization mechanism in its own right, and is almost
> certainly easier to use correctly (at least for novices) than the
> lower-level synchronization objects the threading module provides.  If
> threading is the "friendly" version of thread, perhaps Queue should be
> considered the "friendly" synchronization object.
> 
> (I'm playing the devil's advocate here.  I'm fine with Queue being where it
> is.)

If we *don't* make Queue a part of the basic threading API (and I think Guido 
is right that it doesn't need to be), then I suggest we create a threadtools 
module.

So the thread-related API would actually have three layers:
   - _thread (currently "thread") for the low-level guts
   - threading for the basic thread API that any threaded app needs
   - threadtools for the more complex "application-specific" items

Initially threadtools would just contain Queue, but other candidates for 
inclusion in the future might be standard implementations of:
   - PeriodicTimer (see below)
   - FutureCall (threading out a call, only blocking when you need the result)
   - QueueThread (a thread with "inbox" and "outbox" Queues)
   - ThreadPool (up to the application to make sure the Threads are reusable)
   - threading related decorators

Cheers,
Nick.

P.S. PeriodicTimer would be a variant of threading.Timer which simply replaces 
the run method with:
   def run(self):
       while 1:
           self.finished.wait(self.interval)
           if self.finished.isSet():
               break
           self.function(*self.args, **self.kwargs)
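
A self-contained sketch of the same idea follows. It is written as a plain Thread subclass rather than a Timer subclass so that it does not depend on Timer's internals; the constructor signature and the cancel() method are assumptions modelled on Timer, not an agreed API.

```python
import threading


class PeriodicTimer(threading.Thread):
    """Call function(*args, **kwargs) every `interval` seconds until cancelled."""

    def __init__(self, interval, function, args=(), kwargs=None):
        threading.Thread.__init__(self)
        self.daemon = True            # don't keep the process alive on exit
        self.interval = interval
        self.function = function
        self.args = args
        self.kwargs = kwargs or {}
        self.finished = threading.Event()

    def cancel(self):
        """Stop the timer; takes effect at the next wake-up at the latest."""
        self.finished.set()

    def run(self):
        while True:
            # wait() returns early if cancel() is called during the interval.
            self.finished.wait(self.interval)
            if self.finished.is_set():
                break
            self.function(*self.args, **self.kwargs)
```

Note that the interval is measured from the end of one call to the start of the next, so a slow function makes the period drift.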


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Thu Oct 13 12:51:05 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 13 Oct 2005 20:51:05 +1000
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <101220051312.4429.434D0BA5000808D40000114D22007503300E9D0E030E0CD203D202080106@comcast.net>
References: <101220051312.4429.434D0BA5000808D40000114D22007503300E9D0E030E0CD203D202080106@comcast.net>
Message-ID: <434E3C19.10101@gmail.com>

john.m.camara at comcast.net wrote:
> It sounds like he feels Queue should just be part of threading but queues
> can be used in other contexts besides threading.  So having separate
> modules is a good thing.

If threads aren't involved, you should use "collections.deque" directly, 
rather than going through "Queue.Queue". The latter jumps through a lot of 
hoops in order to be thread-safe.
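
A small sketch of the single-threaded case, with made-up item names:

```python
from collections import deque

pending = deque()

# Producer side: append to the right-hand end.
pending.append('first job')
pending.append('second job')

# Consumer side: pop from the left-hand end for FIFO order.
job = pending.popleft()
```

Unlike Queue.Queue.get(), popleft() never blocks: on an empty deque it raises IndexError straight away.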

This confusion is one of the reasons I have a problem with the current name of 
the Queue module.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From steve at holdenweb.com  Thu Oct 13 12:55:52 2005
From: steve at holdenweb.com (Steve Holden)
Date: Thu, 13 Oct 2005 11:55:52 +0100
Subject: [Python-Dev] Extending tuple unpacking
In-Reply-To: <434E2D33.50204@gmail.com>
References: <2773CAC687FD5F4689F526998C7E4E5F4DB6B6@au3010avexu1.global.avaya.com>	<434B3C01.5030001@ronadam.com>	<ca471dc20510102155p7f17fd14h5fe6043ea64717f6@mail.gmail.com>	<434B6C3A.7020001@canterbury.ac.nz>	<dih15g$nih$1@sea.gmane.org>	<434C235D.1060404@ronadam.com>	<434CE0D4.3070809@gmail.com>	<434D7784.7020209@ronadam.com>
	<434E2D33.50204@gmail.com>
Message-ID: <dilefp$et5$1@sea.gmane.org>

Nick Coghlan wrote:
> Ron Adam wrote:
> 
>>I wonder if you make '*' work outside of function argument lists, if
>>requests to do the same for '**' would follow?
> 
> 
> Only if keyword unpacking were to be permitted elsewhere first. That is:
> 
> Py> data = dict(a=1, b=2, c=3)
> Py> (a, b, c) = **data
> Py> print a, b, c
> 1 2 3
> 
> Cheers,
> Nick.
> 
This gets too weird, though. What about:

   (a, **d) = **data

Should this be equivalent to

   a = 1
   d = dict(b=2, c=3)

? Basically I suspect we are heading towards the outer limits here.
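
For comparison, the effect asked about can already be spelled out by hand. This is a sketch of one reading of the hypothetical syntax, not a proposal:

```python
data = dict(a=1, b=2, c=3)

d = dict(data)     # work on a copy so data itself is left untouched
a = d.pop('a')     # take 'a' out; whatever remains plays the role of **d
```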

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/


From ncoghlan at gmail.com  Thu Oct 13 13:41:31 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 13 Oct 2005 21:41:31 +1000
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <434DD815.8070909@canterbury.ac.nz>
References: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com>
	<434DD815.8070909@canterbury.ac.nz>
Message-ID: <434E47EB.8090909@gmail.com>

Greg Ewing wrote:
> BTW, I agree that special *syntax* isn't necessarily
> needed. But it does seem to me that some sort of
> hook is needed somewhere to make this doable
> smoothly, that doesn't exist today.

Having module attribute access obey the descriptor protocol (__get__, __set__, 
__delete__) sounds like a pretty good option to me.

It would even be pretty backwards compatible, as I'd be hard-pressed to think 
why anyone would have a descriptor *instance* as a top-level object in a 
module (descriptor definition, yes, but not an instance).

Consider lazy instance attributes:

Py> def lazyattr(func):
...     class wrapper(object):
...         def __get__(self, instance, cls):
...             val = func()
...             setattr(instance, func.__name__, val)
...             return val
...     return wrapper()
...
Py> class test(object):
...     @lazyattr
...     def foo():
...         print "Evaluating foo!"
...         return "Instance attribute"
...
Py> t = test()
Py> t.foo
Evaluating foo!
'Instance attribute'
Py> t.foo
'Instance attribute'

And lazy class attributes:

Py> def lazyclassattr(func):
...     class wrapper(object):
...         def __get__(self, instance, cls):
...             val = func()
...             setattr(cls, func.__name__, val)
...             return val
...     return wrapper()
...
Py> class test(object):
...     @lazyclassattr
...     def bar():
...         print "Evaluating bar!"
...         return "Class attribute"
...
Py> test.bar
Evaluating bar!
'Class attribute'
Py> test.bar
'Class attribute'


Unfortunately, that trick doesn't work at the module level:

Py> def lazymoduleattr(func):
...     class wrapper(object):
...         def __get__(self, instance, cls):
...             val = func()
...             globals()[func.__name__] = val
...             return val
...     return wrapper()
...
Py> @lazymoduleattr
... def baz():
...     print "Evaluating baz!"
...     return "Module attribute"
...
Py> baz # Descriptor not invoked
<__main__.wrapper object at 0x00B9E3B0>
Py> import sys
Py> main = sys.modules["__main__"]
Py> main.baz # Descriptor STILL not invoked :(
<__main__.wrapper object at 0x00B9E3B0>

But putting the exact same descriptor in a class lets it work its magic:

Py> class lazy(object):
...   baz = baz
...
Py> lazy.baz
Evaluating baz!
'Module attribute'
Py> baz
'Module attribute'

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From gjc at inescporto.pt  Thu Oct 13 15:07:06 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Thu, 13 Oct 2005 14:07:06 +0100
Subject: [Python-Dev] Making Queue.Queue easier to use
In-Reply-To: <434B8DBF.9080509@iinet.net.au>
References: <434B8DBF.9080509@iinet.net.au>
Message-ID: <1129208826.31838.13.camel@localhost>

  I'd just like to point out that Queue is not quite as useful as people
seem to think in this thread.  The main problem is that I can't
integrate Queue into a select/poll based main loop.

  The other day I wanted to extend a Python main loop, which uses poll(),
to be thread safe, so I could queue idle functions from separate
threads.  Obviously Queue doesn't work (no file descriptor to poll), so
I just ended up creating a pipe, to which I send a single byte when I
want to "wake up" the main loop to make it realize changes in its
configuration, such as a new callback added.
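
The pipe trick can be sketched roughly as follows. PollableQueue is a hypothetical name, the sketch is Unix-only (select() on pipes), and it is a simplified illustration rather than the code described above.

```python
import os
import select
import threading
from collections import deque


class PollableQueue(object):
    """Hypothetical FIFO whose read end can be handed to select() or poll()."""

    def __init__(self):
        self._items = deque()
        self._rfd, self._wfd = os.pipe()

    def fileno(self):
        # Lets select.select() accept the queue object itself.
        return self._rfd

    def put(self, item):
        self._items.append(item)    # store the item first...
        os.write(self._wfd, b'x')   # ...then send one wake-up byte

    def get(self):
        os.read(self._rfd, 1)       # consume one wake-up byte (may block)
        return self._items.popleft()
```

The main loop adds the queue to the descriptors it already watches; when it shows up as readable, get() returns without blocking.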

  I guess this is partly a Unix problem.  There's no system call that says
something like "wake me up when one of these descriptors has data OR when this
condition variable is set".  Windows has WaitForMultipleObjects, which I
suspect is quite a bit more powerful.

  Regards.

-- 
Gustavo J. A. M. Carneiro
<gjc at inescporto.pt> <gustavo at users.sourceforge.net>
The universe is always one step beyond logic.


From mwh at python.net  Thu Oct 13 16:36:21 2005
From: mwh at python.net (Michael Hudson)
Date: Thu, 13 Oct 2005 15:36:21 +0100
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <638942434.20051011105302@MailBlocks.com> (Bruce Eckel's
	message of "Tue, 11 Oct 2005 10:53:02 -0600")
References: <ca471dc20510101718y5fdd5af3x1e5a93642a90a20e@mail.gmail.com>
	<05Oct10.180605pdt."58617"@synergy1.parc.xerox.com>
	<638942434.20051011105302@MailBlocks.com>
Message-ID: <2machdbi1m.fsf@starship.python.net>

Bruce Eckel <BruceEckel-Python3234 at mailblocks.com> writes:

> Not only are there significant new library components in
> java.util.concurrent in J2SE5, but perhaps more important is the new
> memory model that deals with issues that are (especially) revealed in
> multiprocessor environments. The new memory model represents new work
> in the computer science field; apparently the original paper is
> written by Ph.D.s and is a bit too theoretical for the normal person
> to follow. But the smart threading guys studied this and came up with
> the new Java memory model -- so that volatile, for example, which
> didn't work quite right before, does now. This is part of J2SE5, and
> this work is being incorporated into the upcoming C++0x.

Do you have a link that explains this sort of thing for the layman?

Cheers,
mwh

-- 
  When physicists speak of a TOE, they don't really mean a theory
  of *everything*.  Taken literally, "Everything" covers a lot of
  ground, including biology, art, decoherence and the best way to
  barbecue ribs.                    -- John Baez, sci.physics.research

From mwh at python.net  Thu Oct 13 17:02:17 2005
From: mwh at python.net (Michael Hudson)
Date: Thu, 13 Oct 2005 16:02:17 +0100
Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading?
 (Making Queue.Queue easier to use))
In-Reply-To: <434DEEBA.7050905@canterbury.ac.nz> (Greg Ewing's message of
	"Thu, 13 Oct 2005 18:20:58 +1300")
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com>
	<434DEEBA.7050905@canterbury.ac.nz>
Message-ID: <2m1x2pbgue.fsf@starship.python.net>

Greg Ewing <greg.ewing at canterbury.ac.nz> writes:

> Phillip J. Eby wrote:
>> At 01:47 PM 10/13/2005 +1300, Greg Ewing wrote:
>> 
>>> I'm trying to change the __class__ of a newly-imported
>>> module to a subclass of types.ModuleType
>> 
>> It happened in Python 2.3, actually.
>
> Is there a discussion anywhere about the reason this was
> done? It would be useful if this capability could be
> regained somehow without breaking things.

Well, I think it's undesirable that you be able to do this to, e.g.,
strings.  Modules are something of a greyer area, I guess.

Cheers,
mwh

-- 
  You sound surprised.  We're talking about a government department
  here - they have procedures, not intelligence.
                                            -- Ben Hutchings, cam.misc

From skip at pobox.com  Thu Oct 13 18:07:07 2005
From: skip at pobox.com (skip@pobox.com)
Date: Thu, 13 Oct 2005 11:07:07 -0500
Subject: [Python-Dev] Threading and synchronization primitives
In-Reply-To: <434DD80C.80207@canterbury.ac.nz>
References: <101220051537.5148.434D2DB3000F41170000141C22007503300E9D0E030E0CD203D202080106@comcast.net>
	<434DD80C.80207@canterbury.ac.nz>
Message-ID: <17230.34347.295074.10528@montanaro.dyndns.org>


    Greg> All right then, how about putting it in a module called
    Greg> threadutils or something like that, which is clearly related to
    Greg> threading, but is open for the addition of future thread-related
    Greg> features that might arise.

Then Lock, RLock, Semaphore, etc. belong there instead of in threading, don't
they?

We have two things here, the basic thread object and the stuff it does (run,
start, etc) and the synchronization primitives.  Thread objects come in two
levels of abstraction: thread.thread and threading.Thread.  The
synchronization primitives come in three levels of abstraction: thread.lock,
threading.{Lock,Semaphore,...} and Queue.Queue.  Each level of abstraction
builds on the level below.

In the typical case I think we want to encourage programmers to use the
highest levels of abstraction available and leave the lower level stuff to
the real pros.  That means most programmers using threads should use
threading.Thread and Queue.Queue.  Partitioning the various classes to
different modules might look like this:

        Module        Thread Classes            Sync Primitives
        ------        --------------            ---------------
        _thread       thread                    lock
        threadutils                             Lock, RLock, Semaphore
        thread        Thread                    Queue

Programmers would clearly be discouraged from using the _thread module
(currently thread).  The typical case would be to import the thread module
(currently threading) and use its Thread and Queue objects.  For specialized
use, the programmer can import the threadutils module to get at
the synchronization primitives it contains.

Skip

From skip at pobox.com  Thu Oct 13 18:15:15 2005
From: skip at pobox.com (skip@pobox.com)
Date: Thu, 13 Oct 2005 11:15:15 -0500
Subject: [Python-Dev] threadtools (was Re: Autoloading? (Making
 Queue.Queue easier to use))
In-Reply-To: <434E3B12.1000108@gmail.com>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<ca471dc20510120758n6697cd66h46d903814985ba1a@mail.gmail.com>
	<17229.13214.981827.304999@montanaro.dyndns.org>
	<434E3B12.1000108@gmail.com>
Message-ID: <17230.34835.844742.854871@montanaro.dyndns.org>


    Nick> So the thread-related API would actually have three layers:
    Nick>    - _thread (currently "thread") for the low-level guts
    Nick>    - threading for the basic thread API that any threaded app needs
    Nick>    - threadtools for the more complex "application-specific" items

    Nick> Initially threadtools would just contain Queue, but other candidates for 
    Nick> inclusion in the future might be standard implementations of:
    Nick>    - PeriodicTimer (see below)
    Nick>    - FutureCall (threading out a call, only blocking when you need the result)
    Nick>    - QueueThread (a thread with "inbox" and "outbox" Queues)
    Nick>    - ThreadPool (up to the application to make sure the Threads are reusable)
    Nick>    - threading related decorators

Given your list of stuff to go in a threadtools module, I still think you
need something to hold Lock, RLock, Condition and Semaphore.  See my
previous post (subject: Threading and synchronization primitives) about a
threadutils module to hold these somewhat lower-level sync primitives.  In
most cases I don't think programmers need them.  OTOH, providing some higher
level abstractions seems to make sense.  (I have to admit I have no idea
what a QueueThread's outbox queue would be used for.  Queues are generally
multi-producer, single-consumer objects.  It makes sense for a thread to
have an inbox.  I'm not so sure about an outbox.)

Skip

From aahz at pythoncraft.com  Thu Oct 13 19:08:17 2005
From: aahz at pythoncraft.com (Aahz)
Date: Thu, 13 Oct 2005 10:08:17 -0700
Subject: [Python-Dev] threadtools (was Re: Autoloading? (Making
	Queue.Queue easier to use))
In-Reply-To: <17230.34835.844742.854871@montanaro.dyndns.org>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<ca471dc20510120758n6697cd66h46d903814985ba1a@mail.gmail.com>
	<17229.13214.981827.304999@montanaro.dyndns.org>
	<434E3B12.1000108@gmail.com>
	<17230.34835.844742.854871@montanaro.dyndns.org>
Message-ID: <20051013170817.GA20568@panix.com>

On Thu, Oct 13, 2005, skip at pobox.com wrote:
> 
> Given your list of stuff to go in a threadtools module, I still think
> you need something to hold Lock, RLock, Condition and Semaphore.  See
> my previous post (subject: Threading and synchronization primitives)
> about a threadutils module to hold these somewhat lower-level sync
> primitives.  In most cases I don't think programmers need them.  OTOH,
> providing some higher level abstractions seems to make sense.  (I
> have to admit I have no idea what a QueueThread's outbox queue would
> be used for.  Queues are generally multi-producer, single-consumer
> objects.  It makes sense for a thread to have an inbox.  I'm not so
> sure about an outbox.)

If you look at my thread tutorial, the spider thread pool uses a
single-producer, multiple-consumer queue to feed URLs to the retrieving
threads.
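
A rough sketch of that shape follows. It is not the tutorial's code: the worker body is a stand-in for fetching a URL, the names are made up, and the import shim is only there so the sketch also runs where the module has been renamed to queue.

```python
import threading

try:
    import queue                # name used by later Python versions
except ImportError:
    import Queue as queue       # name used in this thread


def worker(inbox, results):
    """Consumer: take URLs from the shared inbox until told to stop."""
    while True:
        url = inbox.get()
        if url is None:                 # sentinel: no more work
            break
        results.append(url.upper())     # stand-in for retrieving the page


def crawl(urls, nworkers=3):
    """Single producer feeding several consumer threads through one Queue."""
    inbox = queue.Queue()
    results = []
    threads = [threading.Thread(target=worker, args=(inbox, results))
               for _ in range(nworkers)]
    for t in threads:
        t.start()
    for url in urls:                    # the one producer
        inbox.put(url)
    for _ in threads:                   # one sentinel per consumer
        inbox.put(None)
    for t in threads:
        t.join()
    return results
```

The order of the results depends on thread scheduling, so callers should not rely on it.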
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From BruceEckel-Python3234 at mailblocks.com  Thu Oct 13 19:10:03 2005
From: BruceEckel-Python3234 at mailblocks.com (Bruce Eckel)
Date: Thu, 13 Oct 2005 11:10:03 -0600
Subject: [Python-Dev] Pythonic concurrency
In-Reply-To: <2machdbi1m.fsf@starship.python.net>
References: <ca471dc20510101718y5fdd5af3x1e5a93642a90a20e@mail.gmail.com>
	<05Oct10.180605pdt."58617"@synergy1.parc.xerox.com>
	<638942434.20051011105302@MailBlocks.com>
	<2machdbi1m.fsf@starship.python.net>
Message-ID: <1569974120.20051013111003@MailBlocks.com>

I don't know of anything that exists. There is an upcoming book that
may help:

Java Concurrency in Practice, by Brian Goetz, Tim Peierls, Joshua
Bloch, Joseph Bowbeer, David Holmes, and Doug Lea (Addison-Wesley
2006).

I have had assistance from some of the authors, but don't know if it
introduces the concepts from the research paper. Estimated publication
is February.

However, you might get something from Scott Meyers' analysis of the
concurrency issues surrounding the double-checked locking algorithm:
http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf

Thursday, October 13, 2005, 8:36:21 AM, Michael Hudson wrote:

> Bruce Eckel <BruceEckel-Python3234 at mailblocks.com> writes:

>> Not only are there significant new library components in
>> java.util.concurrent in J2SE5, but perhaps more important is the new
>> memory model that deals with issues that are (especially) revealed in
>> multiprocessor environments. The new memory model represents new work
>> in the computer science field; apparently the original paper is
>> written by Ph.D.s and is a bit too theoretical for the normal person
>> to follow. But the smart threading guys studied this and came up with
>> the new Java memory model -- so that volatile, for example, which
>> didn't work quite right before, does now. This is part of J2SE5, and
>> this work is being incorporated into the upcoming C++0x.

> Do you have a link that explains this sort of thing for the layman?

> Cheers,
> mwh



Bruce Eckel    http://www.BruceEckel.com   mailto:BruceEckel-Python3234 at mailblocks.com
Contains electronic books: "Thinking in Java 3e" & "Thinking in C++ 2e"
Web log: http://www.artima.com/weblogs/index.jsp?blogger=beckel
Subscribe to my newsletter:
http://www.mindview.net/Newsletter
My schedule can be found at:
http://www.mindview.net/Calendar




From pje at telecommunity.com  Thu Oct 13 19:46:28 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 13 Oct 2005 13:46:28 -0400
Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading?
 (Making Queue.Queue easier to use))
In-Reply-To: <2m1x2pbgue.fsf@starship.python.net>
References: <434DEEBA.7050905@canterbury.ac.nz>
	<20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com>
	<434DEEBA.7050905@canterbury.ac.nz>
Message-ID: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com>

At 04:02 PM 10/13/2005 +0100, Michael Hudson wrote:
>Greg Ewing <greg.ewing at canterbury.ac.nz> writes:
>
> > Phillip J. Eby wrote:
> >> At 01:47 PM 10/13/2005 +1300, Greg Ewing wrote:
> >>
> >>> I'm trying to change the __class__ of a newly-imported
> >>> module to a subclass of types.ModuleType
> >>
> >> It happened in Python 2.3, actually.
> >
> > Is there a discussion anywhere about the reason this was
> > done? It would be useful if this capability could be
> > regained somehow without breaking things.
>
>Well, I think it's undesirable that you be able to do this to, e.g.,
>strings.  Modules are something of a greyer area, I guess.

Actually, it's desirable to be *able* to do it for anything.  But certainly 
for otherwise-immutable objects it can lead to aliasing issues.

For mutable objects, it's *very* desirable, and I think the rules added in 
2.3 might have been overly strict, as they disallow you changing any 
built-in type to a non built-in type, even if the allocator is the 
same.  It seems to me the safety check could perhaps be reduced to just 
checking whether the old and new classes have the same tp_free.  (Apart 
from the layout and other inheritance-related checks, I mean.)
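
A minimal sketch of the distinction being drawn here, assuming a current
CPython 3 interpreter (the exact rules in 2.3 differed in detail). The class
names and the try_reclass helper are made up for illustration; the point is
only that reassigning __class__ works between ordinary Python classes with
the same layout and is refused for an instance of a built-in immutable type.

```python
class Plain(object):
    def describe(self):
        return "plain"


class Observed(Plain):
    """Hypothetical subclass with the same instance layout as Plain."""

    def describe(self):
        return "observed"


class LoudStr(str):
    """Hypothetical str subclass; not an accepted __class__ target for a str."""


def try_reclass(obj, new_class):
    """Return True if obj.__class__ could be reassigned, False on TypeError."""
    try:
        obj.__class__ = new_class
    except TypeError:
        return False
    return True


p = Plain()
ok_for_instance = try_reclass(p, Observed)    # ordinary class to ordinary class
ok_for_string = try_reclass("spam", LoudStr)  # instance of a built-in immutable
```

After this runs, p behaves as an Observed instance, while the string is left
untouched because the interpreter rejects the assignment.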

(By the way, for an example use case other than modules, note that somebody 
wrote an "observables" package that could detect mutation of lists and 
dictionaries in Python 2.2 using __class__ changes, which then became 
useless as of Python 2.3.)


From eyal.lotem at gmail.com  Thu Oct 13 19:52:32 2005
From: eyal.lotem at gmail.com (Eyal Lotem)
Date: Thu, 13 Oct 2005 19:52:32 +0200
Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading?
	(Making Queue.Queue easier to use))
In-Reply-To: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com>
	<434DEEBA.7050905@canterbury.ac.nz>
	<2m1x2pbgue.fsf@starship.python.net>
	<5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com>
Message-ID: <b64f365b0510131052w48a37821m9893441af464a941@mail.gmail.com>

Why not lazily import modules by importing them when they are needed
(i.e., inside functions), and not in the top-level module scope?


On 10/13/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 04:02 PM 10/13/2005 +0100, Michael Hudson wrote:
> >Greg Ewing <greg.ewing at canterbury.ac.nz> writes:
> >
> > > Phillip J. Eby wrote:
> > >> At 01:47 PM 10/13/2005 +1300, Greg Ewing wrote:
> > >>
> > >>> I'm trying to change the __class__ of a newly-imported
> > >>> module to a subclass of types.ModuleType
> > >>
> > >> It happened in Python 2.3, actually.
> > >
> > > Is there a discussion anywhere about the reason this was
> > > done? It would be useful if this capability could be
> > > regained somehow without breaking things.
> >
> >Well, I think it's undesirable that you be able to do this to, e.g.,
> >strings.  Modules are something of a greyer area, I guess.
>
> Actually, it's desirable to be *able* to do it for anything.  But certainly
> for otherwise-immutable objects it can lead to aliasing issues.
>
> For mutable objects, it's *very* desirable, and I think the rules added in
> 2.3 might have been overly strict, as they disallow you changing any
> built-in type to a non built-in type, even if the allocator is the
> same.  It seems to me the safety check could perhaps be reduced to just
> checking whether the old and new classes have the same tp_free.  (Apart
> from the layout and other inheritance-related checks, I mean.)
>
> (By the way, for an example use case other than modules, note that somebody
> wrote an "observables" package that could detect mutation of lists and
> dictionaries in Python 2.2 using __class__ changes, which then became
> useless as of Python 2.3.)
>

From jcarlson at uci.edu  Thu Oct 13 20:13:23 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 13 Oct 2005 11:13:23 -0700
Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading?
	(Making Queue.Queue easier to use))
In-Reply-To: <b64f365b0510131052w48a37821m9893441af464a941@mail.gmail.com>
References: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com>
	<b64f365b0510131052w48a37821m9893441af464a941@mail.gmail.com>
Message-ID: <20051013110837.918C.JCARLSON@uci.edu>


Eyal Lotem <eyal.lotem at gmail.com> wrote:
> Why not lazily import modules by importing them when they are needed
> (i.e., inside functions), and not in the top-level module scope?

Because then it wouldn't be automatic.

The earlier portion of this discussion came from...

    import module
    #module.foo does not reference a module
    module.foo
    #now module.foo references a module

The discussion is about how we can get that kind of behavior.
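
One way that behavior could be sketched, assuming CPython 3.5 or later (where
assigning a ModuleType subclass to a module's __class__ is permitted). The
AutoloadModule class, the _autoload mapping, and the choice of json as the
thing being loaded are all illustrative assumptions, not anything proposed in
this thread.

```python
import importlib
import types


class AutoloadModule(types.ModuleType):
    """Hypothetical module subclass that imports named attributes on first use."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        targets = self.__dict__.get("_autoload", {})
        if name not in targets:
            raise AttributeError(name)
        value = importlib.import_module(targets[name])
        setattr(self, name, value)  # cache so __getattr__ is not hit again
        return value


# Stand-in for "import module"; a real package would do this in its __init__.
module = types.ModuleType("module")
module._autoload = {"foo": "json"}  # attribute name -> module to load (assumed)
module.__class__ = AutoloadModule

before = "foo" in vars(module)  # module.foo does not reference a module yet
loaded = module.foo             # first access triggers the import
after = "foo" in vars(module)   # now module.foo references a module
```

The value stored under "foo" need not be a module; the same hook could return
a class or any other object.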

 - Josiah


From fredrik at pythonware.com  Thu Oct 13 20:07:08 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu, 13 Oct 2005 20:07:08 +0200
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<ca471dc20510120758n6697cd66h46d903814985ba1a@mail.gmail.com>
Message-ID: <dim7oe$cdc$1@sea.gmane.org>

Guido van Rossum wrote:

> BTW, Queue.Queue violates a recent module naming standard; it is now
> considered bad style to name the class and the module the same.
> Modules and packages should have short all-lowercase names, classes
> should be CapWords. Even the same but different case is bad style.

unfortunately, this standard seems to result in generic "spamtools" modules
into which people throw everything that's even remotely related to "spam",
followed by complaints about bloat and performance from users, followed by
various more or less stupid attempts to implement lazy loading of hidden
internal modules, followed by more complaints from users who no longer have
a clear view of what's really going on in there...

I think I'll stick to the old standard for a few more years...

</F>




From guido at python.org  Thu Oct 13 20:37:34 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 13 Oct 2005 11:37:34 -0700
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <dim7oe$cdc$1@sea.gmane.org>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<ca471dc20510120758n6697cd66h46d903814985ba1a@mail.gmail.com>
	<dim7oe$cdc$1@sea.gmane.org>
Message-ID: <ca471dc20510131137ybd1a277y2fa2c3381eb89cc2@mail.gmail.com>

On 10/13/05, Fredrik Lundh <fredrik at pythonware.com> wrote:
> Guido van Rossum wrote:
>
> > BTW, Queue.Queue violates a recent module naming standard; it is now
> > considered bad style to name the class and the module the same.
> > Modules and packages should have short all-lowercase names, classes
> > should be CapWords. Even the same but different case is bad style.
>
> unfortunately, this standard seems to result in generic "spamtools" modules
> into which people throw everything that's even remotely related to "spam",
> followed by complaints about bloat and performance from users, followed by
> various more or less stupid attempts to implement lazy loading of hidden
> internal modules, followed by more complaints from users who no longer have
> a clear view of what's really going on in there...
>
> I think I'll stick to the old standard for a few more years...

Yeah, until you've learned to use packages. :(

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From solipsis at pitrou.net  Thu Oct 13 20:40:28 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 13 Oct 2005 20:40:28 +0200
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <dim7oe$cdc$1@sea.gmane.org>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<ca471dc20510120758n6697cd66h46d903814985ba1a@mail.gmail.com>
	<dim7oe$cdc$1@sea.gmane.org>
Message-ID: <1129228828.7198.24.camel@fsol>


> unfortunately, this standard seems to result in generic "spamtools" modules
> into which people throw everything that's even remotely related to "spam",
> followed by complaints about bloat and performance from users, followed by
> various more or less stupid attempts to implement lazy loading of hidden
> internal modules, followed by more complaints from users who no longer have
> a clear view of what's really going on in there...

BTW, what's the performance problem in importing unnecessary stuff
(assuming pyc files are already generated)?
Has it been evaluated somewhere?



From guido at python.org  Thu Oct 13 20:42:01 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 13 Oct 2005 11:42:01 -0700
Subject: [Python-Dev] Threading and synchronization primitives
In-Reply-To: <17230.34347.295074.10528@montanaro.dyndns.org>
References: <101220051537.5148.434D2DB3000F41170000141C22007503300E9D0E030E0CD203D202080106@comcast.net>
	<434DD80C.80207@canterbury.ac.nz>
	<17230.34347.295074.10528@montanaro.dyndns.org>
Message-ID: <ca471dc20510131142s151dac54kebc43da37a02583d@mail.gmail.com>

On 10/13/05, skip at pobox.com <skip at pobox.com> wrote:
>
>     Greg> All right then, how about putting it in a module called
>     Greg> threadutils or something like that, which is clearly related to
>     Greg> threading, but is open for the addition of future thread-related
>     Greg> features that might arise.
>
> Then Lock, RLock, Semaphore, etc belong there instead of in threading don't
> they?

No. Locks and semaphores are the lowest-level threading primitives.
They go in the basic module.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Thu Oct 13 20:44:05 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 13 Oct 2005 11:44:05 -0700
Subject: [Python-Dev] Making Queue.Queue easier to use
In-Reply-To: <1129208826.31838.13.camel@localhost>
References: <434B8DBF.9080509@iinet.net.au>
	<1129208826.31838.13.camel@localhost>
Message-ID: <ca471dc20510131144l2b46dcbfpe66fadb077bb8ee0@mail.gmail.com>

On 10/13/05, Gustavo J. A. M. Carneiro <gjc at inescporto.pt> wrote:
>   I'd just like to point out that Queue is not quite as useful as people
> seem to think in this thread.  The main problem is that I can't
> integrate Queue into a select/poll based main loop.

Well, you're mixing two incompatible paradigms there, so that's to be
expected, right? Either you're using async I/O or you're using
threads. Mixing the two causes confusion and bugs no matter what you
try.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jeremy at alum.mit.edu  Thu Oct 13 22:52:14 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 13 Oct 2005 16:52:14 -0400
Subject: [Python-Dev] AST branch update
Message-ID: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>

Neil and I have been working on the AST branch for the last week. 
We're nearly ready to merge the changes to the head.  I imagine we'll
do it this weekend, barring last minute glitches.

There are a few open issues that remain.  I'd like to merge the branch
before resolving them.  Please let me know if you disagree.

The current status of the AST branch is that only two tests fail:
test_trace and test_symtable.  The causes for these failures are
described below.  We did not merge the current head to the branch
again, but I diffed the test suite between head and branch and did not
see any substantive changes since the last merge.

Some of the finer points of generating the line number table (lnotab)
are wrong.  There is some very delicate code to support single
stepping with the debugger.  We'll get that fixed soon, but we'd like
to temporarily disable the failing tests in test_trace.

The symtable module exposed parts of the internal representation of
the old symbol table.  The representation changed, and the module is
going to need to change.  The old module was poorly documented and
tested, so I'd like to start over.  Again, I'd like to disable a
couple of failing tests until after the merge occurs.

I don't think the current test suite covers all of the possible syntax
errors that can be raised.  I'd like to add a new test suite that
covers all of the remaining cases, perhaps moving some existing tests
into this module as well.  I'd like to do that after the merge, which
means there may be some loose ends where syntax errors aren't handled
gracefully.

For those of you familiar with the ast work, I'll summarize the recent changes:

We added line numbers to expressions in the AST.  There are many cases
where a statement spans multiple lines.  We couldn't generate a
correct lnotab without knowing the lines that expressions occur on.
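
A small illustration of why per-expression line numbers matter, written
against a current CPython 3 rather than the 2005 branch (an assumption; the
table format has changed since then). It compiles one statement that spans
several lines and uses dis.findlinestarts() to list which source lines the
bytecode is attributed to.

```python
import dis

# A single assignment statement spread over four source lines.
source = (
    "result = max(\n"
    "    1,\n"
    "    len('abc'),\n"
    ")\n"
)
code = compile(source, "<example>", "exec")

# findlinestarts() yields (bytecode offset, line number) pairs decoded from
# the code object's line table; some versions may report None for a line.
line_starts = [
    (offset, line)
    for offset, line in dis.findlinestarts(code)
    if line is not None
]
lines_seen = sorted({line for _, line in line_starts})
```

If only statements carried line numbers, every instruction here would be
attributed to the first line, which is what makes tracebacks and single
stepping inside multi-line statements hard to get right.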

We merged the peephole optimizer into the new compiler and restored
PyNode_Compile() so that the parser module works again.  The parser
module will still expose the old parse trees (just what its users
want).  We should probably provide a similar AST module, but I'm not
sure if we'll get to that.

We fixed some flawed logic in the symbol table for handling nested
scopes.  Luckily, the test cases for nested scopes are pretty
thorough.  They all pass now.

Jeremy

From jepler at unpythonic.net  Fri Oct 14 00:08:41 2005
From: jepler at unpythonic.net (jepler@unpythonic.net)
Date: Thu, 13 Oct 2005 17:08:41 -0500
Subject: [Python-Dev] AST branch update
In-Reply-To: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
References: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
Message-ID: <20051013220841.GB8826@unpythonic.net>

I'm excited to see work continuing (resuming?) on the AST tree.

I don't know how many machines you've been able to test the AST branch on.  I
have a linux/amd64 machine handy and I've tried to run the test suite with a
fresh copy of the ast-branch.

test_trace segfaults consistently, even when run alone.   You didn't give me the
impression that the failure was a segfault, so I'll include more information
about it below.

With '-x test_trace -x test_codecencodings_kr', I get through the testsuite run.
Compared to a build of HEAD, also from today, I get additional failures in
	test_genexps test_grp test_pwd test_symtable
and additional unexpected skips of:
	test_email test_email_codecs

The 'pwd' and 'grp' failures look like they're due to a change not merged from
HEAD.

I'm not sure what to make of the 'genexps' failure.  Is it just a harmless
output difference?  I didn't see you mention that in your message.

Here is some of the relevant-looking output:
    $ ./python -E -tt ./Lib/test/regrtest.py
    [...]
    **********************************************************************
    File "/usr/src/python-ast/Lib/test/test_genexps.py", line ?, in test.test_genexps.__test__.doctests
    Failed example:
	(y for y in (1,2)) = 10
    Expected:
	Traceback (most recent call last):
	   ...
	SyntaxError: assign to generator expression not possible
    Got:
	Traceback (most recent call last):
	  File "/usr/src/python-ast/Lib/doctest.py", line 1243, in __run
	    compileflags, 1) in test.globs
	  File "<doctest test.test_genexps.__test__.doctests[38]>", line 1
	 SyntaxError: assignment to generator expression not possible (<doctest test.test_genexps.__test__.doctests[38]>, line 1)
    **********************************************************************
    File "/usr/src/python-ast/Lib/test/test_genexps.py", line ?, in test.test_genexps.__test__.doctests
    Failed example:
	(y for y in (1,2)) += 10
    Expected:
	Traceback (most recent call last):
	   ...
	SyntaxError: augmented assign to tuple literal or generator expression not possible
    Got:
	Traceback (most recent call last):
	  File "/usr/src/python-ast/Lib/doctest.py", line 1243, in __run
	    compileflags, 1) in test.globs
	  File "<doctest test.test_genexps.__test__.doctests[39]>", line 1
	 SyntaxError: augmented assignment to generator expression not possible (<doctest test.test_genexps.__test__.doctests[39]>, line 1)
    **********************************************************************
    [...]
    test test_grp failed -- Traceback (most recent call last):
      File "/usr/src/python-ast/Lib/test/test_grp.py", line 29, in test_values
	e2 = grp.getgrgid(e.gr_gid)
    OverflowError: signed integer is greater than maximum
    [...]
    test test_pwd failed -- Traceback (most recent call last):
      File "/usr/src/python-ast/Lib/test/test_pwd.py", line 42, in test_values
	self.assert_(pwd.getpwuid(e.pw_uid) in entriesbyuid[e.pw_uid])
    OverflowError: signed integer is greater than maximum

The segfault in test_trace looks like this:
    $ gdb ./python
    (gdb) source Misc/gdbinit
    (gdb) run Lib/test/test_trace.py
    [...]
    test_10_no_jump_to_except_1 (__main__.JumpTestCase) ... FAIL

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 46912496260768 (LWP 11945)]
    PyEval_EvalFrame (f=0x652c30) at Python/ceval.c:1994
    [1967			case COMPARE_OP:)
    1994				Py_DECREF(v);
    (gdb) print oparg
    $1 = 10             [PyCmp_EXC_MATCH?]
    (gdb) pyo v
    NULL
    $2 = void
    #0  PyEval_EvalFrame (f=0x652c30) at Python/ceval.c:1994
    #1  0x0000000000475800 in PyEval_EvalFrame (f=0x697390) at Python/ceval.c:3618
    #2  0x0000000000475800 in PyEval_EvalFrame (f=0x694f10) at Python/ceval.c:3618
    #3  0x0000000000475800 in PyEval_EvalFrame (f=0x649fa0) at Python/ceval.c:3618
    [...]
    #50 0x00000000004113bb in Py_Main (argc=Variable "argc" is not available.) at Modules/main.c:484
    (gdb) pystack
    Lib/test/test_trace.py (447): no_jump_to_except_2
    Lib/test/test_trace.py (447): run_test
    Lib/test/test_trace.py (557): test_11_no_jump_to_except_2
    /usr/src/python-ast/Lib/unittest.py (581): run
    /usr/src/python-ast/Lib/unittest.py (280): __call__
    /usr/src/python-ast/Lib/unittest.py (420): run
    /usr/src/python-ast/Lib/unittest.py (427): __call__
    /usr/src/python-ast/Lib/unittest.py (420): run
    /usr/src/python-ast/Lib/unittest.py (427): __call__
    /usr/src/python-ast/Lib/unittest.py (692): run
    /usr/src/python-ast/Lib/test/test_support.py (692): run_suite
    /usr/src/python-ast/Lib/test/test_support.py (278): run_unittest
    Lib/test/test_trace.py (600): test_main
    Lib/test/test_trace.py (600): <module>
I'm not sure what other information from gdb to furnish.

Jeff

From nas at arctrix.com  Fri Oct 14 00:16:51 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 13 Oct 2005 16:16:51 -0600
Subject: [Python-Dev] AST branch update
In-Reply-To: <20051013220841.GB8826@unpythonic.net>
References: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<20051013220841.GB8826@unpythonic.net>
Message-ID: <20051013221650.GA8676@mems-exchange.org>

On Thu, Oct 13, 2005 at 05:08:41PM -0500, jepler at unpythonic.net wrote:
> test_trace segfaults consistently, even when run alone.

That's a bug in frame_setlineno(), IMO.  It's failing to detect an
invalid jump because the lnotab generated by the new compiler is
slightly different (DUP_TOP opcode corresponds to a different line).

> I'm not sure what to make of the 'genexps' failure.  Is it just a harmless
> output difference?  I didn't see you mention that in your message.

It's a bug in the traceback.py module, IMO.  See bug 1326077.

  Neil

From greg.ewing at canterbury.ac.nz  Fri Oct 14 02:32:34 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2005 13:32:34 +1300
Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading?
 (Making Queue.Queue easier to use))
In-Reply-To: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com>
References: <434DEEBA.7050905@canterbury.ac.nz>
	<20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com>
	<434DEEBA.7050905@canterbury.ac.nz>
	<5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com>
Message-ID: <434EFCA2.4080100@canterbury.ac.nz>

Phillip J. Eby wrote:

> Actually, it's desirable to be *able* to do it for anything.  But certainly 
> for otherwise-immutable objects it can lead to aliasing issues.

Even for immutables, it could be useful to be able to
add behaviour that doesn't mutate anything.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Fri Oct 14 02:43:28 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2005 13:43:28 +1300
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
References: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
Message-ID: <434EFF30.3040703@canterbury.ac.nz>

Jeremy Hylton wrote:

> Some of the finer points of generating the line number table (lnotab)
> are wrong.  There is some very delicate code to support single
> stepping with the debugger.

With disk and memory sizes being what they are nowadays,
is it still worth making heroic efforts to compress the
lnotab table? How about getting rid of all the delicate
code and replacing it with something much simpler?


From greg.ewing at canterbury.ac.nz  Fri Oct 14 02:49:44 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2005 13:49:44 +1300
Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading?
 (Making Queue.Queue easier to use))
In-Reply-To: <20051013110837.918C.JCARLSON@uci.edu>
References: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com>
	<b64f365b0510131052w48a37821m9893441af464a941@mail.gmail.com>
	<20051013110837.918C.JCARLSON@uci.edu>
Message-ID: <434F00A8.1080306@canterbury.ac.nz>

Josiah Carlson wrote:

> The earlier portion of this discussion came from...
> 
>     import module
>     #module.foo does not reference a module
>     module.foo
>     #now module.foo references a module

Or more generally, module.foo now references *something*,
not necessarily a module. (In my use case it's a class.)


From pje at telecommunity.com  Fri Oct 14 02:59:36 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 13 Oct 2005 20:59:36 -0400
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <434EFF30.3040703@canterbury.ac.nz>
References: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
Message-ID: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>

At 01:43 PM 10/14/2005 +1300, Greg Ewing wrote:
>Jeremy Hylton wrote:
>
> > Some of the finer points of generating the line number table (lnotab)
> > are wrong.  There is some very delicate code to support single
> > stepping with the debugger.
>
>With disk and memory sizes being what they are nowadays,
>is it still worth making heroic efforts to compress the
>lnotab table? How about getting rid of all the delicate
>code and replacing it with something much simpler?

+1.  I'd be especially interested in lifting the current requirement that 
line ranges and byte ranges both increase monotonically.  Even better if 
the lines for a particular piece of code don't have to all come from the 
same file.  It'd be nice to be able to do the equivalent of '#line' 
directives for Python code that's generated by other tools, such as parser 
generators and the like.


From greg.ewing at canterbury.ac.nz  Fri Oct 14 03:25:26 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2005 14:25:26 +1300
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
References: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
Message-ID: <434F0906.6090508@canterbury.ac.nz>

Phillip J. Eby wrote:

> +1.  I'd be especially interested in lifting the current requirement 
> that line ranges and byte ranges both increase monotonically.  Even 
> better if the lines for a particular piece of code don't have to all 
> come from the same file.

How about an array of:

   +----------------+----------------+----------------+
   | bytecode index |     file no.   |    line no.    |
   +----------------+----------------+----------------+

Entries are sorted by bytecode index, with each entry
applying from that bytecode position up to the position
of the next entry. The file no. indexes a tuple of file
names attached to the code object. All entries are 32-bit
integers.

Easy to generate, easy to look up with a binary search,
should be big enough for everyone except those generating
obscenely huge code objects on 64-bit platforms.
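
A rough sketch of the table described above, assuming little-endian unsigned
32-bit fields packed with the struct module. The function names, the sample
file names, and the sample entries are hypothetical; nothing here reflects an
actual CPython data structure.

```python
import struct

# One row per entry: (bytecode index, file no., line no.), 12 bytes in total.
ENTRY = struct.Struct("<III")


def pack_table(entries):
    """Pack (bytecode index, file no., line no.) rows, sorted by bytecode index."""
    return b"".join(ENTRY.pack(*row) for row in sorted(entries))


def lookup(table, filenames, offset):
    """Binary-search the packed table for the entry covering a bytecode offset."""
    lo, hi = 0, len(table) // ENTRY.size
    while lo < hi:
        mid = (lo + hi) // 2
        start = ENTRY.unpack_from(table, mid * ENTRY.size)[0]
        if start <= offset:
            lo = mid + 1
        else:
            hi = mid
    if lo == 0:
        raise LookupError("offset precedes the first entry")
    # lo - 1 is the last entry whose bytecode index is <= offset.
    _, file_no, line_no = ENTRY.unpack_from(table, (lo - 1) * ENTRY.size)
    return filenames[file_no], line_no


# Tuple of file names attached to the (imaginary) code object.
filenames = ("grammar.y", "parser.py")
table = pack_table([(0, 1, 1), (6, 0, 40), (14, 0, 41), (30, 1, 7)])
```

Note that neither the file numbers nor the line numbers need to increase
along the table; only the bytecode index is sorted.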


From pje at telecommunity.com  Fri Oct 14 03:42:32 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 13 Oct 2005 21:42:32 -0400
Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading?
 (Making Queue.Queue easier to use))
In-Reply-To: <434EFCA2.4080100@canterbury.ac.nz>
References: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com>
	<434DEEBA.7050905@canterbury.ac.nz>
	<20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com>
	<434DEEBA.7050905@canterbury.ac.nz>
	<5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051013214118.0312e0e8@mail.telecommunity.com>

At 01:32 PM 10/14/2005 +1300, Greg Ewing wrote:
>Phillip J. Eby wrote:
>
> > Actually, it's desirable to be *able* to do it for anything.  But certainly
> > for otherwise-immutable objects it can lead to aliasing issues.
>
>Even for immutables, it could be useful to be able to
>add behaviour that doesn't mutate anything.

I meant that just changing its class is a mutation, and since immutables 
can be shared or cached, that could lead to problems.  So I do think it's a 
reasonable implementation limit to disallow changing the __class__ of an 
immutable.


From pinard at iro.umontreal.ca  Fri Oct 14 03:41:35 2005
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Thu, 13 Oct 2005 21:41:35 -0400
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
References: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
Message-ID: <20051014014135.GA22105@alcyon.progiciels-bpi.ca>

[Phillip J. Eby]

> It'd be nice to be able to do the equivalent of '#line' directives for
> Python code that's generated by other tools, such as parser generators
> and the like.

I had such a need a few times in the past, and it was tedious having to
do indirections through generated Python for finding the real source as
referenced by comments.

Yet, granted also that the need has not been frequent, for me.

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From pje at telecommunity.com  Fri Oct 14 03:55:20 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 13 Oct 2005 21:55:20 -0400
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <434F0906.6090508@canterbury.ac.nz>
References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com>

At 02:25 PM 10/14/2005 +1300, Greg Ewing wrote:
>Phillip J. Eby wrote:
>
> > +1.  I'd be especially interested in lifting the current requirement
> > that line ranges and byte ranges both increase monotonically.  Even
> > better if the lines for a particular piece of code don't have to all
> > come from the same file.
>
>How about an array of:
>
>    +----------------+----------------+----------------+
>    | bytecode index |     file no.   |    line no.    |
>    +----------------+----------------+----------------+
>
>Entries are sorted by bytecode index, with each entry
>applying from that bytecode position up to the position
>of the next entry. The file no. indexes a tuple of file
>names attached to the code object. All entries are 32-bit
>integers.

The file number could be 16-bit - I don't see a use case for referring to 
65,000 different filenames.  ;)  But that doesn't save much space.

Anyway, in the common case, this scheme will use 10 more bytes per line of 
Python code, which translates to a megabyte or so for the standard 
library.  I definitely like the simplicity, but a meg's a meg.  A more 
compact scheme is possible, by using two tables - a bytecode->line number 
table, and a line number-> file table.  In the single-file case, you can 
omit the second table, and the first table then only uses 6 more bytes per 
line than we're currently using.  Not fantastic, but probably more acceptable.

If you have to encode multiple files, you just offset their line numbers by 
the size of the other files, and put entries in the line->file table to 
match.  When computing the line number, you subtract the matching entry in 
the line->file table to get the actual line number within that file.
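
The two-table variant might look roughly like this. It is a sketch under the
assumption that each file's line numbers are shifted by the total length of
the files listed before it; the helper names, file names, and line counts are
invented for the example.

```python
import bisect


def build_line_file_table(files):
    """files: sequence of (filename, line_count). Returns (bases, names)."""
    bases, names, total = [], [], 0
    for name, line_count in files:
        bases.append(total)  # offset added to this file's line numbers
        names.append(name)
        total += line_count
    return bases, names


def encode_line(bases, names, filename, line):
    """Combined line number stored in the bytecode -> line table."""
    return bases[names.index(filename)] + line


def resolve(code_offsets, code_lines, bases, names, offset):
    """Map a bytecode offset to (filename, line within that file)."""
    i = bisect.bisect_right(code_offsets, offset) - 1
    if i < 0:
        raise LookupError("offset precedes the first entry")
    combined = code_lines[i]
    # Last file whose base is strictly below the combined (1-based) line.
    j = bisect.bisect_left(bases, combined) - 1
    return names[j], combined - bases[j]


bases, names = build_line_file_table([("parser.py", 200), ("grammar.y", 50)])
code_offsets = [0, 6, 14, 30]
code_lines = [
    encode_line(bases, names, "parser.py", 1),
    encode_line(bases, names, "grammar.y", 40),
    encode_line(bases, names, "grammar.y", 41),
    encode_line(bases, names, "parser.py", 7),
]
```

In the single-file case the second table collapses to one entry with base 0,
so it can be omitted and the combined line number is the real one.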


From greg.ewing at canterbury.ac.nz  Fri Oct 14 05:14:08 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2005 16:14:08 +1300
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
In-Reply-To: <b64f365b0510111431g7fdc573dpc59e4e304ec4e425@mail.gmail.com>
References: <b64f365b0510111431g7fdc573dpc59e4e304ec4e425@mail.gmail.com>
Message-ID: <434F2280.2020000@canterbury.ac.nz>

Eyal Lotem wrote:

>       locals()['x'] = 1 # Quietly fails!
> Replaced by:
>       frame.x = 1 # Raises error

Or even better, replaced by

    frame.x = 1 # Does the right thing

The frame object knows enough to be able to find
the correct locals slot and update it, so there's
no need for this to fail.
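
For reference, the quiet failure being discussed can be reproduced with a
few lines. This is only a sketch of the existing locals() behavior inside a
function in CPython; the function name is made up, and it does not
demonstrate the proposed frame.x spelling, which does not exist.

```python
def assign_via_locals():
    x = 0
    # Writes to a dictionary, not to the function's fast local slot, so the
    # assignment is silently lost.
    locals()["x"] = 1
    return x


unchanged = assign_via_locals()
```

No exception is raised, and x keeps its original value.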


From falcon at intercable.ru  Tue Oct 11 08:33:43 2005
From: falcon at intercable.ru (Sokolov Yura)
Date: Tue, 11 Oct 2005 10:33:43 +0400
Subject: [Python-Dev] PEP 3000 and exec
Message-ID: <434B5CC7.2030009@intercable.ru>

Agree.
 >>>i=1
 >>>def a():
    i=2
    def b():
        print i
    return b
    
 >>>a()()
2
 >>>def a():
    i=2
    def b():
        exec "print i"
    return b
    
 >>>a()()
1


From falcon at intercable.ru  Tue Oct 11 08:55:41 2005
From: falcon at intercable.ru (Sokolov Yura)
Date: Tue, 11 Oct 2005 10:55:41 +0400
Subject: [Python-Dev] Pythonic concurrency - offtopic
Message-ID: <434B61ED.4080503@intercable.ru>

Offtopic:

Microsoft Windows [Version 5.2.3790]
(C) Copyright 1985-2003 Microsoft Corp.

G:\Working\1>c:\Python24\python
Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on 
win32
Type "help", "copyright", "credits" or "license" for more information.
 >>> from os import fork
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: cannot import name fork
 >>>


From support at intercable.ru  Tue Oct 11 08:57:03 2005
From: support at intercable.ru (Technical Support of Intercable Co)
Date: Tue, 11 Oct 2005 10:57:03 +0400
Subject: [Python-Dev]  C.E.R. Thoughts
Message-ID: <434B623F.9030000@intercable.ru>

And why not
if len(sys.argv) > 1 take sys.argv[1] == 'debug':
    ...

It was not so bad :-)

A = len(sys.argv)==0 take None or sys.argv[1]

Sorry for being noisy :-)



From tonynelson at georgeanelson.com  Wed Oct 12 04:10:29 2005
From: tonynelson at georgeanelson.com (Tony Nelson)
Date: Tue, 11 Oct 2005 22:10:29 -0400
Subject: [Python-Dev] Unicode charmap decoders slow
Message-ID: <v04020a03bf72162a051d@[192.168.123.162]>

I have written my fastcharmap decoder and encoder.  It's not meant to be
better than the patch and other changes to come in a future version of
Python, but it does work now with the current codecs.  Using Hye-Shik
Chang's benchmark, decoding is about 4.3x faster than the base, and
encoding is about 2x faster than the base (that's comparing the base and
the fast versions on my machine).  If fastcharmap would be useful, please
tell me where I should make it available, and any changes that are needed.
I would also need to write an installer (distutils I guess).

<http://georgeanelson.com/fastcharmap.d.tar.gz>

Fastcharmap is written in Python and Pyrex 0.9.3, and the .pyx file will
need to be compiled before use.  I used:

pyrexc _fastcharmap.pyx
gcc -c -fPIC -I/usr/include/python2.3/ _fastcharmap.c
gcc -shared _fastcharmap.o -o _fastcharmap.so

To use, hook each codec to be sped up:

    import fastcharmap
    help(fastcharmap)
    fastcharmap.hook('name_of_codec')
    u = unicode('some text', 'name_of_codec')
    s = u.encode('name_of_codec')

No codecs were rewritten.  It took me a while to learn enough to do this
(Pyrex, more Python, some Python C API), and there were some surprises.
Hooking in is grosser than I would have liked.  I've only used it on Python
2.3 on FC3.  Still, it should work going forward, and, if the dicts are
replaced by something else, fastcharmap will know to leave everything
alone.  There are still a few debugging print statements in it.


>At 8:36 AM +0200 10/5/05, Martin v. Löwis wrote:
>>Tony Nelson wrote:
> ...
>>> Encoding can be made fast using a simple hash table with external chaining.
>>> There are max 256 codepoints to encode, and they will normally be well
>>> distributed in their lower 8 bits.  Hash on the low 8 bits (just mask), and
>>> chain to an area with 256 entries.  Modest storage, normally short chains,
>>> therefore fast encoding.
>>
>>This is what is currently done: a hash map with 256 keys. You are
>>complaining about the performance of that algorithm. The issue of
>>external chaining is likely irrelevant: there likely are no collisions,
>>even though Python uses open addressing.
>
>I think I'm complaining about the implementation, though on decode, not
>encode.
>
>In any case, there are likely to be collisions in my scheme.  Over the
>next few days I will try to do it myself, but I will need to learn Pyrex,
>some of the Python C API, and more about Python to do it.
>
>
>>>>...I suggest instead just /caching/ the translation in C arrays stored
>>>>with the codec object.  The cache would be invalidated on any write to the
>>>>codec's mapping dictionary, and rebuilt the next time anything was
>>>>translated.  This would maintain the present semantics, work with current
>>>>codecs, and still provide the desired speed improvement.
>>
>>That is not implementable. You cannot catch writes to the dictionary.
>
>I should have been more clear.  I am thinking about using a proxy object
>in the codec's 'encoding_map' and 'decoding_map' slots, that will forward
>all the dictionary stuff.  The proxy will delete the cache on any call
>which changes the dictionary contents.  There are proxy classes and
>dictproxy (don't know how it's implemented yet) so it seems doable, at
>least as far as I've gotten so far.
>
>
>>> Note that this caching is done by new code added to the existing C
>>> functions (which, if I have it right, are in unicodeobject.c).  No
>>> architectural changes are made; no existing codecs need to be changed;
>>> everything will just work
>>
>>Please try to implement it. You will find that you cannot. I don't
>>see how regenerating/editing the codecs could be avoided.
>
>Will do!
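
The cache-invalidation idea in the quoted text can be sketched in modern
Python 3 as follows. This uses a dict subclass rather than a forwarding
proxy object, and the class name, the table() method and the 256-entry
list are all assumptions made for illustration; it is not the fastcharmap
implementation.

```python
class InvalidatingMap(dict):
    """Dict that drops a derived lookup table whenever it is modified."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._table = None  # cached 256-entry table, built on demand

    def _invalidate(self):
        self._table = None

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._invalidate()

    def __delitem__(self, key):
        super().__delitem__(key)
        self._invalidate()

    def update(self, *args, **kwargs):
        super().update(*args, **kwargs)
        self._invalidate()

    # A complete version would wrap pop, popitem, clear and setdefault
    # in the same way.

    def table(self):
        """Return the cached table, rebuilding it after any write."""
        if self._table is None:
            self._table = [self.get(i) for i in range(256)]
        return self._table
```

Reads go through the cached list; any write through the wrapped methods
discards it, so the next translation rebuilds it from the mapping.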
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson at georgeanelson.com>
      '                              <http://www.georgeanelson.com/>

From falcon at intercable.ru  Thu Oct 13 10:48:56 2005
From: falcon at intercable.ru (Sokolov Yura)
Date: Thu, 13 Oct 2005 12:48:56 +0400
Subject: [Python-Dev]  Autoloading? (Making Queue.Queue easier to use)
Message-ID: <434E1F78.7020504@intercable.ru>

Maybe allow modules to define __getattr__?

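def __getattr__(thing):
    try:
        # placeholder for the normal module attribute lookup
        return __some_standart_way__(thing)
    except AttributeError:
        if thing == "Queue":
            import sys
            from Queue import Queue
            # cache the class on the module so later lookups are direct
            setattr(sys.modules[__name__], "Queue", Queue)
            return Queue
        raise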
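
For comparison, a sketch of how this idea looks with the module-level
__getattr__ hook that later Python versions provide (PEP 562, Python
3.7+). The _LAZY table, the function name module_getattr and the stand-in
module "demo" are assumptions made for the example; in a real module file
the function would simply be defined at top level as __getattr__.

```python
import importlib
import types

# Hypothetical table: attribute name -> (module to import, attribute in it)
_LAZY = {"Queue": ("queue", "Queue")}


def module_getattr(name):
    """Called only when normal module attribute lookup fails."""
    try:
        module_name, attr = _LAZY[name]
    except KeyError:
        raise AttributeError(name) from None
    value = getattr(importlib.import_module(module_name), attr)
    setattr(demo, name, value)  # cache so later lookups skip this hook
    return value


# Stand-in for a real module file.
demo = types.ModuleType("demo")
demo.__getattr__ = module_getattr
```

The first access to demo.Queue triggers the import; afterwards the class
sits in the module's namespace like any other attribute.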



From raymond.hettinger at verizon.net  Fri Oct 14 07:03:28 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 14 Oct 2005 01:03:28 -0400
Subject: [Python-Dev] AST branch update
In-Reply-To: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
Message-ID: <005601c5d07c$9ea717c0$1fac958d@oemcomputer>

> Neil and I have been working on the AST branch for the last week.
> We're nearly ready to merge the changes to the head. 

Nice work.



> I don't think the current test suite covers all of the possible syntax
> errors that can be raised. 

Does the AST branch generate a syntax error for:

   foo(a = i for i in range(10))

?



Raymond


From jcarlson at uci.edu  Fri Oct 14 07:11:49 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 13 Oct 2005 22:11:49 -0700
Subject: [Python-Dev] Pythonic concurrency - offtopic
In-Reply-To: <434B61ED.4080503@intercable.ru>
References: <434B61ED.4080503@intercable.ru>
Message-ID: <20051013220748.9195.JCARLSON@uci.edu>


Sokolov Yura <falcon at intercable.ru> wrote:
> 
> Offtopic:
> 
> Microsoft Windows [Version 5.2.3790]
> (C) Copyright 1985-2003 Microsoft Corp.
> 
> G:\Working\1>c:\Python24\python
> Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on 
> win32
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> from os import fork
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> ImportError: cannot import name fork
>  >>>

Python for Windows, if I remember correctly, has never supported forking. 
This is because the underlying process execution code does not have
support for the standard copy-on-write semantics which make unix fork
fast.

Cygwin Python does support fork, but I believe this is through a literal
copying of the memory space, which is far slower than unix fork.

Until Microsoft adds kernel support for fork, don't expect standard
Windows Python to support it.
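
Code that has to run on both kinds of platform can test for the function
rather than assume it. A minimal sketch; the helper name can_fork is made
up for the example.

```python
import os


def can_fork():
    """Return True if this Python build exposes os.fork."""
    return hasattr(os, "fork")
```

On standard Windows Python this returns False, matching the ImportError
shown above.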

 - Josiah


From nas at arctrix.com  Fri Oct 14 07:11:47 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 13 Oct 2005 23:11:47 -0600
Subject: [Python-Dev] AST branch update
In-Reply-To: <005601c5d07c$9ea717c0$1fac958d@oemcomputer>
References: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<005601c5d07c$9ea717c0$1fac958d@oemcomputer>
Message-ID: <20051014051147.GA9906@mems-exchange.org>

On Fri, Oct 14, 2005 at 01:03:28AM -0400, Raymond Hettinger wrote:
> Does the AST branch generate a syntax error for:
> 
>    foo(a = i for i in range(10))

No.  It generates the same broken code as the current compiler.

  Neil

From jcarlson at uci.edu  Fri Oct 14 07:15:06 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 13 Oct 2005 22:15:06 -0700
Subject: [Python-Dev] C.E.R. Thoughts
In-Reply-To: <434B623F.9030000@intercable.ru>
References: <434B623F.9030000@intercable.ru>
Message-ID: <20051013221327.9198.JCARLSON@uci.edu>


Technical Support of Intercable Co <support at intercable.ru> wrote:
> 
> And why not
> if len(sys.argv) > 1 take sys.argv[1] == 'debug':
>     ...
> 
> It was not so bad :-)
> 
> A = len(sys.argv)==0 take None or sys.argv[1]
> 
> Sorry for being noisy :-)

The syntax for 2.5 has already been decided upon.  Except for an act by
Guido, it is likely to stay (None if len(sys.argv) == 0 else sys.argv[1]).
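
A small sketch of that conditional-expression form in use. The function
name is made up, and the test is written as len(argv) > 1 on the
assumption that argv[0] holds the script name, so the list is normally
never empty.

```python
def first_argument(argv):
    # Form: <value if true> if <condition> else <value if false>
    return argv[1] if len(argv) > 1 else None
```

Only the selected branch is evaluated, so argv[1] is never indexed when
the list is too short.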

 - Josiah


From fredrik at pythonware.com  Fri Oct 14 07:34:50 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Fri, 14 Oct 2005 07:34:50 +0200
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com><ca471dc20510120758n6697cd66h46d903814985ba1a@mail.gmail.com><dim7oe$cdc$1@sea.gmane.org>
	<ca471dc20510131137ybd1a277y2fa2c3381eb89cc2@mail.gmail.com>
Message-ID: <ding1q$dle$1@sea.gmane.org>

Guido van Rossum wrote:

> > > BTW, Queue.Queue violates a recent module naming standard; it is now
> > > considered bad style to name the class and the module the same.
> > > Modules and packages should have short all-lowercase names, classes
> > > should be CapWords. Even the same but different case is bad style.
> >
> > unfortunately, this standard seems to result in generic "spamtools" modules
> > into which people throw everything that's even remotely related to "spam",
> > followed by complaints about bloat and performance from users, followed by
> > various more or less stupid attempts to implement lazy loading of hidden
> > internal modules, followed by more complaints from users who no longer have
> > a clear view of what's really going on in there...
> >
> > I think I'll stick to the old standard for a few more years...
>
> Yeah, until you've learned to use packages. :(

what do packages have to do with this?  does this new module naming
standard only apply to toplevel package names?

</F>




From ironfroggy at gmail.com  Fri Oct 14 08:16:16 2005
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Fri, 14 Oct 2005 02:16:16 -0400
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
In-Reply-To: <b64f365b0510111431g7fdc573dpc59e4e304ec4e425@mail.gmail.com>
References: <b64f365b0510111431g7fdc573dpc59e4e304ec4e425@mail.gmail.com>
Message-ID: <76fd5acf0510132316x6a8bcc8ck1c3d5a812abd447e@mail.gmail.com>

On 10/11/05, Eyal Lotem <eyal.lotem at gmail.com> wrote:
>       locals()['x'] = 1 # Quietly fails!
> Replaced by:
>       frame.x = 1 # Raises error

What about the possibility of making this hypothetical frame object
indexable, such that frame[0] is the current scope, frame[1] is the
calling scope, etc.? On the same lines, what about closure[0] for the
current frame, while closure[1] resolves to the closure the function
was defined in? These would ensure that you could reliably access any
namespace you would need, without nasty stack tricks and such, and
would make it easier to work around some of the limitations of closures, when
you have such a need. One might even consider a __resolve__ to be
defined in any namespace, allowing all the namespace resolution rules
to be overridden by code at any level.

From guido at python.org  Fri Oct 14 08:18:45 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 13 Oct 2005 23:18:45 -0700
Subject: [Python-Dev] AST branch update
In-Reply-To: <005601c5d07c$9ea717c0$1fac958d@oemcomputer>
References: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<005601c5d07c$9ea717c0$1fac958d@oemcomputer>
Message-ID: <ca471dc20510132318u5889cb35w524f417dc0b6b98c@mail.gmail.com>

[Jeremy]
> > Neil and I have been working on the AST branch for the last week.
> > We're nearly ready to merge the changes to the head.

[Raymond]
> Nice work.

Indeed. I should've threatened to kill the AST branch long ago! :)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Fri Oct 14 08:50:29 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2005 19:50:29 +1300
Subject: [Python-Dev] Assignment to __class__ of module? (Autoloading?
 (Making Queue.Queue easier to use))
In-Reply-To: <5.1.1.6.0.20051013214118.0312e0e8@mail.telecommunity.com>
References: <5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com>
	<434DEEBA.7050905@canterbury.ac.nz>
	<20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<5.1.1.6.0.20051012221440.01f41e10@mail.telecommunity.com>
	<434DEEBA.7050905@canterbury.ac.nz>
	<5.1.1.6.0.20051013134103.01f23a18@mail.telecommunity.com>
	<5.1.1.6.0.20051013214118.0312e0e8@mail.telecommunity.com>
Message-ID: <434F5535.4010201@canterbury.ac.nz>

Phillip J. Eby wrote:

> I meant that just changing its class is a mutation, and since immutables 
> can be shared or cached, that could lead to problems.  So I do think 
> it's a reasonable implementation limit to disallow changing the 
> __class__ of an immutable.

That's a fair point.

Although I was actually thinking recently of a use
case for changing the class of a tuple, inside a
Pyrex module for database access. The idea was that
the user would be able to supply a custom subclass
of tuple for returning the records. To avoid extra
copying of the data, I was going to create a normal
uninitialised tuple, stuff the data into it, and then
change its class to the user-supplied one.

But seeing as all this would be happening in Pyrex
where the normal restrictions don't apply anyway, I
suppose it wouldn't matter if user code wasn't allowed
to do this.

Greg

From greg.ewing at canterbury.ac.nz  Fri Oct 14 08:52:34 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2005 19:52:34 +1300
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <434E1F78.7020504@intercable.ru>
References: <434E1F78.7020504@intercable.ru>
Message-ID: <434F55B2.1030106@canterbury.ac.nz>

Sokolov Yura wrote:
> May be allow modules to define __getattr__ ?

I think I like the descriptor idea better. Besides
being more in keeping with modern practice, it would
allow for things like

   from autoloading import autoload

   Foo = autoload('foomodule', 'Foo')
   Blarg = autoload('blargmodule', 'Blarg')

where autoload is defined as a suitable descriptor
subclass.
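
A rough sketch of what such a descriptor could look like in modern
Python 3. Descriptors only take effect when they live on a type, so this
assumes a hypothetical module subclass (LazyModule) to hold them; the
class names and the caching strategy are illustrative, not a proposed API.

```python
import importlib
import types


class autoload:
    """Non-data descriptor that imports its target on first access."""

    def __init__(self, module_name, attr_name):
        self.module_name = module_name
        self.attr_name = attr_name

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        module = importlib.import_module(self.module_name)
        value = getattr(module, self.attr_name)
        # Cache on the instance; since this descriptor defines no __set__,
        # the instance attribute shadows it on later lookups.
        obj.__dict__[self.name] = value
        return value


class LazyModule(types.ModuleType):
    # Hypothetical module type; names resolve on first use.
    Queue = autoload("queue", "Queue")
    OrderedDict = autoload("collections", "OrderedDict")
```

Each name costs one import at first use and an ordinary attribute lookup
afterwards.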

I guess we could do with a PEP on this...

Greg

From martin at v.loewis.de  Fri Oct 14 09:14:23 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 14 Oct 2005 09:14:23 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <v04020a03bf72162a051d@[192.168.123.162]>
References: <v04020a03bf72162a051d@[192.168.123.162]>
Message-ID: <434F5ACF.3000802@v.loewis.de>

Tony Nelson wrote:
> I have written my fastcharmap decoder and encoder.  It's not meant to be
> better than the patch and other changes to come in a future version of
> Python, but it does work now with the current codecs.

It's an interesting solution.

> To use, hook each codec to be sped up:
> 
>     import fastcharmap
>     help(fastcharmap)
>     fastcharmap.hook('name_of_codec')
>     u = unicode('some text', 'name_of_codec')
>     s = u.encode('name_of_codec')
> 
> No codecs were rewritten.  It took me a while to learn enough to do this
> (Pyrex, more Python, some Python C API), and there were some surprises.
> Hooking in is grosser than I would have liked.  I've only used it on Python
> 2.3 on FC3.

Indeed, and I would claim that you did not completely achieve your "no 
changes necessary" goal: you still have to install the hooks explicitly.
I also think overriding codecs.charmap_{encode,decode} is really ugly.

Even if this could be simplified if you would modify the existing
codecs, I still don't think supporting changes to the encoding dict
is worthwhile. People will probably want to update the codecs in-place,
but I don't think we need to make a guarantee that such an approach
works independent of the Python version. People would be much better off
writing their own codecs if they think the distributed ones are
incorrect.

Regards,
Martin

From greg.ewing at canterbury.ac.nz  Fri Oct 14 09:07:05 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2005 20:07:05 +1300
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com>
References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
	<5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com>
Message-ID: <434F5919.6040002@canterbury.ac.nz>

Phillip J. Eby wrote:

> A more 
> compact scheme is possible, by using two tables - a bytecode->line 
> number table, and a line number-> file table.  
> 
> If you have to encode multiple files, you just offset their line numbers 
> by the size of the other files,

More straightforwardly, the second table could just be a
bytecode -> file number mapping. The filename is likely to
change much less often than the line number, so this file
would contain far fewer entries than the line number table.
In the case of only one file, it would contain just a single
entry, so it probably wouldn't even be worth the bother of
special-casing that.

You could save a bit more by having two kinds of line number
table, "small" (16-bit entries) and "large" (32-bit entries)
depending on the size of the code object and range of line
numbers. The small one would be sufficient for almost all
code objects, so the most common case would use only about
4 bytes per line of code. That's only twice as much as the
current scheme uses.
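
A sketch of the small/large choice, assuming each entry is a (bytecode
offset, line number) pair packed little-endian with struct. The function
name and the exact layout are assumptions for illustration only.

```python
import struct


def pack_line_table(entries):
    """Pack (bytecode offset, line number) pairs.

    Returns (kind, data), where kind is "small" (16-bit fields) or
    "large" (32-bit fields).
    """
    biggest = max((max(off, line) for off, line in entries), default=0)
    if biggest <= 0xFFFF:
        kind, fmt = "small", "<HH"   # 4 bytes per entry
    else:
        kind, fmt = "large", "<II"   # 8 bytes per entry
    data = b"".join(struct.pack(fmt, off, line) for off, line in entries)
    return kind, data
```

With one entry per source line, the small form comes to the 4 bytes per
line of code mentioned above.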

Greg

From jcarlson at uci.edu  Fri Oct 14 09:23:44 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 14 Oct 2005 00:23:44 -0700
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
In-Reply-To: <76fd5acf0510132316x6a8bcc8ck1c3d5a812abd447e@mail.gmail.com>
References: <b64f365b0510111431g7fdc573dpc59e4e304ec4e425@mail.gmail.com>
	<76fd5acf0510132316x6a8bcc8ck1c3d5a812abd447e@mail.gmail.com>
Message-ID: <20051014000927.919E.JCARLSON@uci.edu>


Calvin Spealman <ironfroggy at gmail.com> wrote:
> 
> On 10/11/05, Eyal Lotem <eyal.lotem at gmail.com> wrote:
> >       locals()['x'] = 1 # Quietly fails!
> > Replaced by:
> >       frame.x = 1 # Raises error
> 
> What about the possibility of making this hypothetical frame object
> indexable, such that frame[0] is the current scope, frame[1] is the
> calling scope, etc.? On the same lines, what about closure[0] for the
> current frame, while closure[1] resolves to the closure the function
> was defined in? These would ensure that you could reliably access any
> namespace you would need, without nasty stack tricks and such, and
> would make it easier to work around some of the limitations of closures, when
> you have such a need. One might even consider a __resolve__ to be
> defined in any namespace, allowing all the namespace resolution rules
> to be overridden by code at any level.

-1000  If you want a namespace, create one and pass it around.  If the
writer of a function or method wanted you monkeying around with a
namespace, they would have given you one to work with.

As for closure monkeywork, you've got to be kidding.  Closures in Python
are a clever and interesting way of keeping around certain things, but
are actually unnecessary with the existence of class and instance
namespaces. Every example of a closure can be re-done as a
class/instance, and many end up looking better.
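
To make the closure/instance equivalence concrete, here is one small
example written both ways in modern Python 3 (nonlocal did not exist in
2005). The names make_counter and Counter are invented for the sketch.

```python
def make_counter():
    """Closure version: state lives in the enclosing scope."""
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


class Counter:
    """Instance version: the same state lives in an attribute."""

    def __init__(self):
        self.count = 0

    def __call__(self):
        self.count += 1
        return self.count
```

Both objects are called the same way; the instance version also exposes
its state as counter.count without any frame or closure inspection.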

 - Josiah


From nnorwitz at gmail.com  Fri Oct 14 09:46:14 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Fri, 14 Oct 2005 00:46:14 -0700
Subject: [Python-Dev] AST branch update
In-Reply-To: <ca471dc20510132318u5889cb35w524f417dc0b6b98c@mail.gmail.com>
References: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<005601c5d07c$9ea717c0$1fac958d@oemcomputer>
	<ca471dc20510132318u5889cb35w524f417dc0b6b98c@mail.gmail.com>
Message-ID: <ee2a432c0510140046r21c7c87flf7f8d17e762b25d4@mail.gmail.com>

On 10/13/05, Guido van Rossum <guido at python.org> wrote:
>
> Indeed. I should've threatened to kill the AST branch long ago! :)

:-)

I decreased a lot of the memory leaks.  Here are some more to work on.
 I doubt this list is complete, but it's a start:

PyObject_Malloc (obmalloc.c:717)
_PyObject_DebugMalloc (obmalloc.c:1014)
compiler_enter_scope (newcompile.c:1204)
compiler_mod (newcompile.c:1894)
PyAST_Compile (newcompile.c:471)
Py_CompileStringFlags (pythonrun.c:1240)
builtin_compile (bltinmodule.c:391)


Tuple (Python-ast.c:907)
ast_for_testlist (ast.c:1782)
ast_for_classdef (ast.c:2677)
ast_for_stmt (ast.c:2758)
PyAST_FromNode (ast.c:233)
PyParser_ASTFromFile (pythonrun.c:1291)
parse_source_module (import.c:762)
load_source_module (import.c:886)


new_arena (obmalloc.c:500)
PyObject_Malloc (obmalloc.c:699)
PyObject_Realloc (obmalloc.c:837)
_PyObject_DebugRealloc (obmalloc.c:1077)
PyNode_AddChild (node.c:95)
shift (parser.c:112)
PyParser_AddToken (parser.c:244)
parsetok (parsetok.c:165)
PyParser_ParseFileFlags (parsetok.c:89)
PyParser_ASTFromFile (pythonrun.c:1288)
parse_source_module (import.c:762)
load_source_module (import.c:886)


Lambda (Python-ast.c:610)
ast_for_lambdef (ast.c:859)
ast_for_expr (ast.c:1443)
ast_for_testlist (ast.c:1776)
ast_for_expr_stmt (ast.c:1845)
ast_for_stmt (ast.c:2716)
PyAST_FromNode (ast.c:233)
PyParser_ASTFromString (pythonrun.c:1271)
Py_CompileStringFlags (pythonrun.c:1237)
builtin_compile (bltinmodule.c:391)


BinOp (Python-ast.c:557)
ast_for_binop (ast.c:1389)
ast_for_expr (ast.c:1531)
ast_for_testlist (ast.c:1776)
ast_for_expr_stmt (ast.c:1845)
ast_for_stmt (ast.c:2716)
PyAST_FromNode (ast.c:233)
PyParser_ASTFromString (pythonrun.c:1271)
Py_CompileStringFlags (pythonrun.c:1237)
builtin_compile (bltinmodule.c:391)


Name (Python-ast.c:865)
ast_for_atom (ast.c:1201)
ast_for_expr (ast.c:1555)
ast_for_testlist (ast.c:1776)
ast_for_expr_stmt (ast.c:1798)
ast_for_stmt (ast.c:2716)
PyAST_FromNode (ast.c:233)
PyParser_ASTFromString (pythonrun.c:1271)
Py_CompileStringFlags (pythonrun.c:1237)
builtin_compile (bltinmodule.c:391)

From nnorwitz at gmail.com  Fri Oct 14 09:55:59 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Fri, 14 Oct 2005 00:55:59 -0700
Subject: [Python-Dev] AST branch update
In-Reply-To: <ee2a432c0510140046r21c7c87flf7f8d17e762b25d4@mail.gmail.com>
References: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<005601c5d07c$9ea717c0$1fac958d@oemcomputer>
	<ca471dc20510132318u5889cb35w524f417dc0b6b98c@mail.gmail.com>
	<ee2a432c0510140046r21c7c87flf7f8d17e762b25d4@mail.gmail.com>
Message-ID: <ee2a432c0510140055w78508383n2b9282c970ab969c@mail.gmail.com>

On 10/14/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
>
> I decreased a lot of the memory leaks.  Here are some more to work on.
>  I doubt this list is complete, but it's a start:

Oh and since I fixed the memory leaks in a generated file
Python/Python-ast.c, the changes still need to be implemented in the
right place (ie, Parser/asdl_c.py).

Valgrind didn't report any invalid uses of memory, though there is
also a lot of potentially leaked memory.  It seemed a lot higher than
what I remembered, so I'm not sure if it's an issue or not.  I'll look
into that after we get the definite memory leaks plugged.

n

From mwh at python.net  Fri Oct 14 10:23:40 2005
From: mwh at python.net (Michael Hudson)
Date: Fri, 14 Oct 2005 09:23:40 +0100
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
	(Phillip J. Eby's message of "Thu, 13 Oct 2005 20:59:36 -0400")
References: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
Message-ID: <2mwtkga4mr.fsf@starship.python.net>

"Phillip J. Eby" <pje at telecommunity.com> writes:

> Even better if the lines for a particular piece of code don't have
> to all come from the same file.

This seems _fairly_ esoteric to me.  Why do you need it?

I can think of two uses for lnotab information: printing source lines
and locating source lines on the filesystem.  For both, I think I'd
rather see some kind of defined protocol (methods on the code object,
maybe?) rather than inventing some kind of insane
too-general-for-the-common-case data structure.

Cheers,
mwh

-- 
42. You can measure a programmer's perspective by noting his
    attitude on the continuing vitality of FORTRAN.
  -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html

From pje at telecommunity.com  Fri Oct 14 17:20:45 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 14 Oct 2005 11:20:45 -0400
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <434F5919.6040002@canterbury.ac.nz>
References: <5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com>
	<5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
	<5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051014111829.01f89288@mail.telecommunity.com>

At 08:07 PM 10/14/2005 +1300, Greg Ewing wrote:
>Phillip J. Eby wrote:
>
> > A more
> > compact scheme is possible, by using two tables - a bytecode->line
> > number table, and a line number-> file table.
> >
> > If you have to encode multiple files, you just offset their line numbers
> > by the size of the other files,
>
>More straightforwardly, the second table could just be a
>bytecode -> file number mapping.

That would use more space in any case involving multiple files.


>In the case of only one file, it would contain just a single
>entry, so it probably wouldn't even be worth the bother of
>special-casing that.

A line->file mapping would also have only one entry in that case.


>You could save a bit more by having two kinds of line number
>table, "small" (16-bit entries) and "large" (32-bit entries)
>depending on the size of the code object and range of line
>numbers. The small one would be sufficient for almost all
>code objects, so the most common case would use only about
>4 bytes per line of code. That's only twice as much as the
>current scheme uses.

That'd probably work.


From pje at telecommunity.com  Fri Oct 14 17:28:01 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 14 Oct 2005 11:28:01 -0400
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <2mwtkga4mr.fsf@starship.python.net>
References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051014112056.021b5eb0@mail.telecommunity.com>

At 09:23 AM 10/14/2005 +0100, Michael Hudson wrote:
>"Phillip J. Eby" <pje at telecommunity.com> writes:
>
> > Even better if the lines for a particular piece of code don't have
> > to all come from the same file.
>
>This seems _fairly_ esoteric to me.  Why do you need it?

Compilers that inline function calls, but want the code to still be 
debuggable.  AOP tools that weave bytecode.  Overloaded functions 
implemented by combining bytecode.

Okay, those are fairly esoteric use cases, I admit.  :)  However, PyPy 
already has some inlining capability in its optimizer, so it's not all that 
crazy of an idea that Python in general will need it.


>I can think of two uses for lnotab information: printing source lines
>and locating source lines on the filesystem.  For both, I think I'd
>rather see some kind of defined protocol (methods on the code object,
>maybe?) rather than inventing some kind of insane
>too-general-for-the-common-case data structure.

Certainly a protocol would be nice; right now one is forced to interpret 
the data structure directly.  Being able to say, "give me the file and line 
number for a given byte offset" would be handy in any case.

However, since you can't subclass code objects, the capability would have 
to be part of the core.
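
As a sketch of the "give me the line number for a given byte offset"
query, the helper below leans on dis.findlinestarts from the standard
library, written for modern Python 3. The helper name line_for_offset is
an assumption; it covers the single-file case only and says nothing about
multiple source files.

```python
import dis


def line_for_offset(code, offset):
    """Return the source line for a bytecode offset, or None if unknown."""
    line = None
    # findlinestarts yields (offset, lineno) pairs in increasing offset order.
    for start, lineno in dis.findlinestarts(code):
        if start > offset:
            break
        line = lineno
    return line
```

A caller would pass a code object such as some_function.__code__ together
with an offset taken from a frame's f_lasti.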


From mwh at python.net  Fri Oct 14 17:53:17 2005
From: mwh at python.net (Michael Hudson)
Date: Fri, 14 Oct 2005 16:53:17 +0100
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <5.1.1.6.0.20051014112056.021b5eb0@mail.telecommunity.com>
	(Phillip J. Eby's message of "Fri, 14 Oct 2005 11:28:01 -0400")
References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
	<5.1.1.6.0.20051014112056.021b5eb0@mail.telecommunity.com>
Message-ID: <2moe5s9jte.fsf@starship.python.net>

"Phillip J. Eby" <pje at telecommunity.com> writes:

> At 09:23 AM 10/14/2005 +0100, Michael Hudson wrote:
>>"Phillip J. Eby" <pje at telecommunity.com> writes:
>>
>> > Even better if the lines for a particular piece of code don't have
>> > to all come from the same file.
>>
>>This seems _fairly_ esoteric to me.  Why do you need it?
>
> Compilers that inline function calls, but want the code to still be 
> debuggable.  AOP tools that weave bytecode.  Overloaded functions 
> implemented by combining bytecode.

Err...

> Okay, those are fairly esoteric use cases, I admit.  :)  However, PyPy 
> already has some inlining capability in its optimizer, so it's not all that 
> crazy of an idea that Python in general will need it.

Um.  Well, _I_ still think it's pretty crazy.

>>I can think of two uses for lnotab information: printing source lines
>>and locating source lines on the filesystem.  For both, I think I'd
>>rather see some kind of defined protocol (methods on the code object,
>>maybe?) rather than inventing some kind of insane
>>too-general-for-the-common-case data structure.
>
> Certainly a protocol would be nice; right now one is forced to interpret 
> the data structure directly.  Being able to say, "give me the file and line 
> number for a given byte offset" would be handy in any case.
>
> However, since you can't subclass code objects, the capability would have 
> to be part of the core.

Clearly, but any changes to co_lnotab would have to be part of the
core too.   Let's not make a complicated situation _worse_.

Something I didn't say was that a protocol like this would also let us
remove the horrors of functions like inspect.getsourcelines() (see SF
bugs passim).

Cheers,
mwh

-- 
  There's an aura of unholy black magic about CLISP.  It works, but
  I have no idea how it does it.  I suspect there's a goat involved
  somewhere.                     -- Johann Hibschman, comp.lang.scheme

From walter at livinglogic.de  Fri Oct 14 18:26:37 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 14 Oct 2005 18:26:37 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <434F5ACF.3000802@v.loewis.de>
References: <v04020a03bf72162a051d@[192.168.123.162]>
	<434F5ACF.3000802@v.loewis.de>
Message-ID: <434FDC3D.9040202@livinglogic.de>

Martin v. Löwis wrote:

> Tony Nelson wrote:
> 
>> I have written my fastcharmap decoder and encoder.  It's not meant to be
>> better than the patch and other changes to come in a future version of
>> Python, but it does work now with the current codecs.
> 
> It's an interesting solution.

I like the fact that encoding doesn't need a special data structure.

>> To use, hook each codec to be sped up:
>>
>>     import fastcharmap
>>     help(fastcharmap)
>>     fastcharmap.hook('name_of_codec')
>>     u = unicode('some text', 'name_of_codec')
>>     s = u.encode('name_of_codec')
>>
>> No codecs were rewritten.  It took me a while to learn enough to do this
>> (Pyrex, more Python, some Python C API), and there were some surprises.
>> Hooking in is grosser than I would have liked.  I've only used it on
>> Python 2.3 on FC3.
> 
> Indeed, and I would claim that you did not completely achieve your "no 
> changes necessary" goal: you still have to install the hooks explicitly.
> I also think overriding codecs.charmap_{encode,decode} is really ugly.
> 
> Even if this could be simplified if you would modify the existing
> codecs, I still don't think supporting changes to the encoding dict
> is worthwhile. People will probably want to update the codecs in-place,
> but I don't think we need to make a guarantee that such an approach
> works independent of the Python version. People would be much better off
> writing their own codecs if they think the distributed ones are
> incorrect.

Exactly. If you need another codec, write your own instead of patching an 
existing one on the fly!

Of course we can't accept Pyrex code in the Python core, so it would be 
great to rewrite the encoder as a patch to PyUnicode_EncodeCharmap(). 
This version must be able to cope with encoding tables that are random 
strings without crashing.

We've already taken care of decoding. What we still need is a new 
gencodec.py and regenerated codecs.

Bye,
    Walter Dörwald

From mal at egenix.com  Fri Oct 14 19:03:54 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 14 Oct 2005 19:03:54 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <434FDC3D.9040202@livinglogic.de>
References: <v04020a03bf72162a051d@[192.168.123.162]>	<434F5ACF.3000802@v.loewis.de>
	<434FDC3D.9040202@livinglogic.de>
Message-ID: <434FE4FA.8000307@egenix.com>

Walter Dörwald wrote:
> We've already taken care of decoding. What we still need is a new 
> gencodec.py and regenerated codecs.

I'll take care of that; just haven't gotten around to it yet.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 14 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From reinhold-birkenfeld-nospam at wolke7.net  Fri Oct 14 19:30:05 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Fri, 14 Oct 2005 19:30:05 +0200
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <434E1F78.7020504@intercable.ru>
References: <434E1F78.7020504@intercable.ru>
Message-ID: <dioput$mq2$1@sea.gmane.org>

Sokolov Yura wrote:
> Maybe allow modules to define __getattr__?
> 
> def __getattr__(thing):
>     try:
>         return __some_standard_way__(thing)
>     except AttributeError:
>         if thing == "Queue":
>             import sys
>             from Queue import Queue
>             setattr(sys.modules[__name__], "Queue", Queue)
>             return Queue
>         raise

I proposed something like this in the RFE tracker a while ago, but no
one commented on it.
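
For readers on a much later Python: module-level __getattr__ does exist
from Python 3.7 on (PEP 562), so the idea can be sketched in runnable form.
The following is an illustration under assumptions, not the proposal from
the tracker: the module name lazy_demo and the helper _module_getattr are
made up, and Python 3 spells the standard module "queue", not "Queue".

```python
import importlib
import types

# A throwaway module object standing in for a real package module.
lazy = types.ModuleType("lazy_demo")


def _module_getattr(name):
    """Called only when normal attribute lookup on the module fails."""
    if name == "Queue":
        value = importlib.import_module("queue").Queue
        setattr(lazy, name, value)  # cache so later lookups skip this hook
        return value
    raise AttributeError(
        "module %r has no attribute %r" % (lazy.__name__, name))


# PEP 562: a function named __getattr__ in the module namespace is the hook.
lazy.__getattr__ = _module_getattr
```

The first access to lazy.Queue triggers the import; after that the class
lives in the module's namespace like any other attribute.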

Reinhold

-- 
Mail address is perfectly valid!


From cfbolz at gmx.de  Fri Oct 14 19:25:45 2005
From: cfbolz at gmx.de (Carl Friedrich Bolz)
Date: Fri, 14 Oct 2005 19:25:45 +0200
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <5.1.1.6.0.20051014112056.021b5eb0@mail.telecommunity.com>
References: <5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>	<5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
	<2mwtkga4mr.fsf@starship.python.net>
	<5.1.1.6.0.20051014112056.021b5eb0@mail.telecommunity.com>
Message-ID: <diopkg$mhf$1@sea.gmane.org>

Hi!

Phillip J. Eby wrote:
[snip]
> Okay, those are fairly esoteric use cases, I admit.  :)  However, PyPy 
> already has some inlining capability in its optimizer, so it's not all that 
> crazy of an idea that Python in general will need it.

It's kind of strange to argue from PyPy's inlining capabilities, since 
inlining in PyPy happens on a completely different level that has 
nothing at all to do with Python code objects any more. So your proposed 
changes would not make a difference for PyPy, let alone bring it any 
benefit.

[snip]

cheers,

Carl Friedrich Bolz


From raymond.hettinger at verizon.net  Fri Oct 14 20:41:24 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 14 Oct 2005 14:41:24 -0400
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <2moe5s9jte.fsf@starship.python.net>
Message-ID: <000e01c5d0ee$e1c97f80$1fac958d@oemcomputer>

> >> > Even better if the lines for a particular piece of code don't have
> >> > to all come from the same file.
> >>
> >>This seems _fairly_ esoteric to me.  Why do you need it?
> >
> > Compilers that inline function calls, but want the code to still be
> > debuggable.  AOP tools that weave bytecode.  Overloaded functions
> > implemented by combining bytecode.
> 
> Err...
> 
> > Okay, those are fairly esoteric use cases, I admit.  :)  However, PyPy
> > already has some inlining capability in its optimizer, so it's not all that
> > crazy of an idea that Python in general will need it.
> 
> Um.  Well, _I_ still think it's pretty crazy.

YAGNI



Raymond


From pje at telecommunity.com  Fri Oct 14 21:43:43 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 14 Oct 2005 15:43:43 -0400
Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes
Message-ID: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com>

I ran across an interesting paper about some VM optimizations yesterday:

    http://www.object-arts.com/Papers/TheInterpreterIsDead.PDF

One thing mentioned was that saving even one cycle in their 'PUSH_SELF' 
opcode improved interpreter performance by 5%.  I thought that was pretty 
cool, and then I realized CPython doesn't even *have* a PUSH_SELF opcode.

So, today, I took a stab at implementing one, by converting "LOAD_FAST 0" 
calls to a "LOAD_SELF" opcode.  Pystone and Parrotbench improved by about 
2% or so.  That wasn't great, so I added a "SELF_ATTR" opcode that combines 
a LOAD_SELF and a LOAD_ATTR in the same opcode while avoiding extra stack 
and refcount manipulation.  This raised the total improvement for pystone 
to about 5%, but didn't seem to improve parrotbench any further.  I guess 
parrotbench doesn't do much self.attr stuff in places that really count, 
and looking at the code it indeed seems that most self.* stuff is done at 
higher levels of the parsing benchmark, not the innermost loops.

Indeed, even pystone doesn't do much attribute access on the first argument 
of most of its functions, especially not those in inner loops.  Only 
Proc1() and the Record.copy() method do anything that would be helped by 
SELF_ATTR.  But it seems to me that this is very unusual for 
object-oriented code, and that more common uses of Python should be helped 
a lot more by this.  Do we have any benchmarks that don't use 'foo = 
self.foo' type shortcuts in their inner loops?

Anyway, my main question is, do these sound like worthwhile 
optimizations?  The code isn't that complex; the only tricky thing I did 
was having the opcodes' error case (unbound local) fall through to the 
LOAD_FAST opcode so as not to duplicate the error handling code, in the 
hopes of keeping the eval loop size down.


From pinard at iro.umontreal.ca  Fri Oct 14 21:45:07 2005
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Fri, 14 Oct 2005 15:45:07 -0400
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <2mwtkga4mr.fsf@starship.python.net>
References: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
	<2mwtkga4mr.fsf@starship.python.net>
Message-ID: <20051014194507.GA6435@alcyon.progiciels-bpi.ca>

[Michael Hudson]
> "Phillip J. Eby" <pje at telecommunity.com> writes:

> > Even better if the lines for a particular piece of code don't have
> > to all come from the same file.

> This seems _fairly_ esoteric to me.  Why do you need it?

For when Python code is generated from more than one original source
file (referring to the `#line' directive message, a little earlier this
week).  For example, think include files.

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From pinard at iro.umontreal.ca  Fri Oct 14 21:46:29 2005
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Fri, 14 Oct 2005 15:46:29 -0400
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <000e01c5d0ee$e1c97f80$1fac958d@oemcomputer>
References: <2moe5s9jte.fsf@starship.python.net>
	<000e01c5d0ee$e1c97f80$1fac958d@oemcomputer>
Message-ID: <20051014194629.GB6435@alcyon.progiciels-bpi.ca>

[Raymond Hettinger]

> > >> > Even better if the lines for a particular piece of code don't
> > >> > have to all come from the same file.

> YAGNI

I surely needed it, more than once.  Don't be so assertive. :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From pje at telecommunity.com  Fri Oct 14 22:05:13 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 14 Oct 2005 16:05:13 -0400
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <000e01c5d0ee$e1c97f80$1fac958d@oemcomputer>
References: <2moe5s9jte.fsf@starship.python.net>
Message-ID: <5.1.1.6.0.20051014160208.01f73c70@mail.telecommunity.com>

At 02:41 PM 10/14/2005 -0400, Raymond Hettinger wrote:
>YAGNI

If the feature were there, I'd have used it already, so I wouldn't consider 
it YAGNI.  In the cases where I would've used it, I instead split generated 
code into separate functions so I could compile() each one with a different 
filename.
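
A minimal sketch of that workaround, assuming each generated function is
available as its own source string.  The helper name compile_function and
the filename used in the test are made up for illustration.

```python
def compile_function(source, filename, name):
    """Compile one generated function so tracebacks point at `filename`.

    Hypothetical helper: each function gets its own compile() call, which
    is what lets every function carry a different co_filename.
    """
    namespace = {}
    exec(compile(source, filename, "exec"), namespace)
    return namespace[name]
```

The cost is that all lines of any one function must still come from a
single file, which is the limitation the lnotab discussion is about.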

Also, I notice that the peephole optimizer contains stuff to avoid making 
co_lnotab "too complex", although I haven't looked at it to be sure it'd 
actually benefit from an expanded lnotab format.


From skip at pobox.com  Sat Oct 15 00:20:52 2005
From: skip at pobox.com (skip@pobox.com)
Date: Fri, 14 Oct 2005 17:20:52 -0500
Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes
In-Reply-To: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com>
References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com>
Message-ID: <17232.12100.667664.919625@montanaro.dyndns.org>


    Phillip> Indeed, even pystone doesn't do much attribute access on the
    Phillip> first argument of most of its functions, especially not those
    Phillip> in inner loops.  Only Proc1() and the Record.copy() method do
    Phillip> anything that would be helped by SELF_ATTR.  But it seems to me
    Phillip> that this is very unusual for object-oriented code, and that
    Phillip> more common uses of Python should be helped a lot more by this.
    Phillip> Do we have any benchmarks that don't use 'foo = self.foo' type
    Phillip> shortcuts in their inner loops?

(Just thinking out loud...)

Maybe we should create an alternate "object-oriented" version of pystone as
a way to inject more attribute access into a convenient benchmark.  Even if
it's completely artificial and has no connection to other versions of the
Dhrystone benchmark, it might be useful for testing improvements to
attribute access.
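
Something along these lines, perhaps.  This is a toy sketch and not a
proposed benchmark: the Point class and the bench() helper are invented
here, and the only point is that the timed method does nothing but
attribute access on self.

```python
import timeit


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm_squared(self):
        # Four attribute loads on self and very little else.
        return self.x * self.x + self.y * self.y


def bench(loops=100000):
    """Return the seconds taken by `loops` calls of a self-heavy method."""
    p = Point(3, 4)
    return timeit.timeit(p.norm_squared, number=loops)
```

Comparing bench() across interpreter builds would show whether a change
to self attribute access is visible at all, although call overhead is
part of what gets measured.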

Skip

From martin at v.loewis.de  Sat Oct 15 00:22:37 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 15 Oct 2005 00:22:37 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <434FDC3D.9040202@livinglogic.de>
References: <v04020a03bf72162a051d@[192.168.123.162]>
	<434F5ACF.3000802@v.loewis.de> <434FDC3D.9040202@livinglogic.de>
Message-ID: <43502FAD.2030500@v.loewis.de>

Walter Dörwald wrote:
> Of course we can't accept Pyrex code in the Python core, so it would be 
> great to rewrite the encoder as a patch to PyUnicode_EncodeCharmap(). 
> This version must be able to cope with encoding tables that are random 
> strings without crashing.

I don't think this will be necessary. I personally dislike the decoding
tables, as I think a straight-forward trie will do better than a
hashtable.

Regards,
Martin

From martin at v.loewis.de  Sat Oct 15 00:33:24 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 15 Oct 2005 00:33:24 +0200
Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes
In-Reply-To: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com>
References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com>
Message-ID: <43503234.4030902@v.loewis.de>

Phillip J. Eby wrote:
> Anyway, my main question is, do these sound like worthwhile 
> optimizations? 

In the past, I think the analysis was always "no". It adds
an opcode, so increases the size of the switch, causing
more pressure on the cache, with an overall questionable
effect.

As for measuring the effect of the change: how often
does that pattern occur in the standard library?
(compared to what total number of LOAD_ATTR)

Regards,
Martin

From pje at telecommunity.com  Sat Oct 15 01:33:44 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 14 Oct 2005 19:33:44 -0400
Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes
In-Reply-To: <43503234.4030902@v.loewis.de>
References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com>
	<5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051014190111.0382c220@mail.telecommunity.com>

At 12:33 AM 10/15/2005 +0200, Martin v. Löwis wrote:
>Phillip J. Eby wrote:
> > Anyway, my main question is, do these sound like worthwhile
> > optimizations?
>
>In the past, I think the analysis was always "no". It adds
>an opcode, so increases the size of the switch, causing
>more pressure on the cache, with an overall questionable
>effect.

Hm.  I'd have thought 5% pystone and 2% pybench is nothing to sneeze at, 
for such a minor change.  I thought Skip's peephole optimizer originally 
only produced a 5% or so speedup.

In any case, in relation to this specific kind of optimization, this is the 
only thing I found:

     http://mail.python.org/pipermail/python-dev/2002-February/019854.html

which is a proposal by Guido to do the same thing, but also speeding up the 
actual attribute lookup.  I didn't find any follow-up suggesting that 
anybody tried this, but perhaps it was put on hold pending the AST branch?  :)


>As for measuring the effect of the change: how often
>does that pattern occur in the standard library?
>(compared to what total number of LOAD_ATTR)

[pje at ns src]$ grep 'self\.[A-Za-z_]' Lib/*.py | wc -l
    9919

[pje at ns src]$ grep '[a-zA-Z_][a-zA-Z_0-9]*\.[a-zA-Z_]' Lib/*.py | wc -l
   19804

So, something like 50% of lines doing an attribute access include a 'self' 
attribute access.  This very rough estimate may be thrown off by:

* Import statements (causing an error in favor of more non-self attributes)
* Functions whose first argument isn't 'self' (error in favor of non-self 
attributes)
* Comments or docstrings talking about attributes or modules (could go 
either way)
* Multiple attribute accesses on the same line (could go either way)
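
A less rough count is possible by walking the syntax tree instead of
grepping lines.  The sketch below uses the ast module from later Python
versions; the function name count_attribute_access is made up.  It avoids
the import, comment/docstring and multiple-accesses-per-line errors listed
above, but it still misses methods whose first argument isn't named 'self'.

```python
import ast


def count_attribute_access(source):
    """Return (self_accesses, total_accesses) over Attribute nodes."""
    self_count = total = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute):
            total += 1
            # Count only direct accesses of the form self.<name>.
            if isinstance(node.value, ast.Name) and node.value.id == "self":
                self_count += 1
    return self_count, total
```

Like the grep, this is a static count of the source and says nothing about
how often each access actually executes.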

The parrotbench code shows a similar ratio of self to non-self attribute 
usage, but nearly all of parrotbench's self-attribute usage is in b0.py, 
and not called in the innermost loop.

That also suggests that the volume of usage of 'self.' isn't the best way 
to determine the performance impact, because pystone has almost no 'self.' 
usage at all, but still got a 5% total boost.


From skip at pobox.com  Sat Oct 15 02:22:53 2005
From: skip at pobox.com (skip@pobox.com)
Date: Fri, 14 Oct 2005 19:22:53 -0500
Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes
In-Reply-To: <5.1.1.6.0.20051014190111.0382c220@mail.telecommunity.com>
References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com>
	<5.1.1.6.0.20051014190111.0382c220@mail.telecommunity.com>
Message-ID: <17232.19421.938597.152797@montanaro.dyndns.org>


    >> Phillip J. Eby wrote:
    >> > Anyway, my main question is, do these sound like worthwhile
    >> > optimizations?
    >> 
    >> In the past, I think the analysis was always "no". It adds an opcode,
    >> so increases the size of the switch, causing more pressure on the
    >> cache, with an overall questionable effect.

    Phillip> Hm.  I'd have thought 5% pystone and 2% pybench is nothing to
    Phillip> sneeze at, for such a minor change.

We've added lots of new opcodes over the years.  CPU caches have grown
steadily in that time as well, from maybe 128KB-256KB in the early 90's to
around 1MB today.  I suspect cache size has kept up with the growth of
Python's VM inner loop.  At any rate, each change has to be judged on its
own merits.  If it speeds things up and is uncontroversial
implementation-wise, I see no reason it should be rejected out-of-hand.
(Send it to Raymond H.  He'll probably sneak it in when Martin's not
looking. <wink>)

Skip

From falcon at intercable.ru  Fri Oct 14 13:05:30 2005
From: falcon at intercable.ru (Sokolov Yura)
Date: Fri, 14 Oct 2005 15:05:30 +0400
Subject: [Python-Dev] Pythonic concurrency - offtopic
In-Reply-To: <20051013220748.9195.JCARLSON@uci.edu>
References: <434B61ED.4080503@intercable.ru>
	<20051013220748.9195.JCARLSON@uci.edu>
Message-ID: <434F90FA.6060007@intercable.ru>

Josiah Carlson wrote:

>Sokolov Yura <falcon at intercable.ru> wrote:
>  
>
>>Offtopic:
>>
>>Microsoft Windows [Version 5.2.3790]
>>(C) Copyright 1985-2003 Microsoft Corp.
>>
>>G:\Working\1>c:\Python24\python
>>Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on 
>>win32
>>Type "help", "copyright", "credits" or "license" for more information.
>> >>> from os import fork
>>Traceback (most recent call last):
>>  File "<stdin>", line 1, in ?
>>ImportError: cannot import name fork
>> >>>
>>    
>>
>
>Python for Windows, if I remember correctly, has never supported forking. 
>This is because the underlying process execution code does not have
>support for the standard copy-on-write semantic which makes unix fork
>fast.
>
>Cygwin Python does support fork, but I believe this is through a literal
>copying of the memory space, which is far slower than unix fork.
>
>Until Microsoft adds kernel support for fork, don't expect standard
>Windows Python to support it.
>
> - Josiah
>
>
>
>  
>
That is what I meant...

sorry for being noisy...


From tcdelaney at optusnet.com.au  Sat Oct 15 02:30:07 2005
From: tcdelaney at optusnet.com.au (Tim Delaney)
Date: Sat, 15 Oct 2005 10:30:07 +1000
Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes
Message-ID: <000501c5d11f$98368b70$0201a8c0@ryoko>

Sorry I can't reply to the message (I'm at home, and don't currently have 
python-dev sent there).

I have a version of Raymond's constant binding recipe:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/277940

that also binds all attribute accesses all the way down into a single 
constant call e.g.

    LOAD_FAST  0
    LOAD_ATTR  'a'
    LOAD_ATTR  'b'
    LOAD_ATTR  'c'
    LOAD_ATTR  'd'

is bound to a single constant:

    LOAD_CONST  5

where constant 5 is the object obtained from `self.a.b.c.d`. Unfortunately, 
I think it's at work - don't seem to have a copy here :(

Obviously, this isn't applicable to as many cases, but it might be 
interesting to compare what kind of results this produces compared to 
LOAD_SELF/SELF_ATTR.

Tim Delaney 


From blais at furius.ca  Sat Oct 15 08:50:21 2005
From: blais at furius.ca (Martin Blais)
Date: Sat, 15 Oct 2005 02:50:21 -0400
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <2mbr27f0th.fsf@starship.python.net>
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>
	<2mbr27f0th.fsf@starship.python.net>
Message-ID: <8393fff0510142350l81ba453md20cc47a445642ce@mail.gmail.com>

On 10/3/05, Michael Hudson <mwh at python.net> wrote:
> Martin Blais <blais at furius.ca> writes:
>
> > How hard would that be to implement?
>
> import sys
> reload(sys)
> sys.setdefaultencoding('undefined')

Hmmm any particular reason for the call to reload() here?

From martin at v.loewis.de  Sat Oct 15 10:03:32 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 15 Oct 2005 10:03:32 +0200
Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes
In-Reply-To: <17232.19421.938597.152797@montanaro.dyndns.org>
References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com>
	<5.1.1.6.0.20051014190111.0382c220@mail.telecommunity.com>
	<17232.19421.938597.152797@montanaro.dyndns.org>
Message-ID: <4350B7D4.4000102@v.loewis.de>

skip at pobox.com wrote:
> (Send it to Raymond H.  He'll probably sneak it in when Martin's not
> looking. <wink>)

I'm not personally objecting :-) I just recall that there was that kind
of objection when I proposed similar changes myself a couple of years
ago.

Regards,
Martin

From reinhold-birkenfeld-nospam at wolke7.net  Sat Oct 15 10:01:14 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Sat, 15 Oct 2005 10:01:14 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <8393fff0510142350l81ba453md20cc47a445642ce@mail.gmail.com>
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>	<2mbr27f0th.fsf@starship.python.net>
	<8393fff0510142350l81ba453md20cc47a445642ce@mail.gmail.com>
Message-ID: <diqd0a$o1q$1@sea.gmane.org>

Martin Blais wrote:
> On 10/3/05, Michael Hudson <mwh at python.net> wrote:
>> Martin Blais <blais at furius.ca> writes:
>>
>> > How hard would that be to implement?
>>
>> import sys
>> reload(sys)
>> sys.setdefaultencoding('undefined')
> 
> Hmmm any particular reason for the call to reload() here?

Yes. setdefaultencoding() is removed from sys by site.py. To get it
again you must reload sys.

Reinhold

-- 
Mail address is perfectly valid!


From mwh at python.net  Sat Oct 15 10:17:36 2005
From: mwh at python.net (Michael Hudson)
Date: Sat, 15 Oct 2005 09:17:36 +0100
Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes
In-Reply-To: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com>
	(Phillip J. Eby's message of "Fri, 14 Oct 2005 15:43:43 -0400")
References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com>
Message-ID: <2mk6gf9otb.fsf@starship.python.net>

"Phillip J. Eby" <pje at telecommunity.com> writes:

> Indeed, even pystone doesn't do much attribute access on the first argument 
> of most of its functions, especially not those in inner loops.  Only 
> Proc1() and the Record.copy() method do anything that would be helped by 
> SELF_ATTR.  But it seems to me that this is very unusual for 
> object-oriented code, and that more common uses of Python should be helped 
> a lot more by this.

Is it that unusual though?  I don't think it's that unreasonable to
suppose that 'typical smalltalk code' sends messages to self a good
deal more often than 'typical python code'.  I'm not saying that this
*is* the case, but my intuition is that it might be (not all Python
code is that object oriented, after all).

Cheers,
mwh

-- 
  The source passes lint without any complaint (if invoked with
  >/dev/null).    -- Daniel Fischer, http://www.ioccc.org/1998/df.hint

From greg.ewing at canterbury.ac.nz  Sat Oct 15 11:58:12 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 15 Oct 2005 22:58:12 +1300
Subject: [Python-Dev] Simplify lnotab? (AST branch update)
In-Reply-To: <5.1.1.6.0.20051014111829.01f89288@mail.telecommunity.com>
References: <5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com>
	<5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
	<5.1.1.6.0.20051013205631.02f38e00@mail.telecommunity.com>
	<5.1.1.6.0.20051013214331.02f320a0@mail.telecommunity.com>
	<5.1.1.6.0.20051014111829.01f89288@mail.telecommunity.com>
Message-ID: <4350D2B4.2040401@canterbury.ac.nz>

Phillip J. Eby wrote:
> At 08:07 PM 10/14/2005 +1300, Greg Ewing wrote:
> 
>> More straightforwardly, the second table could just be a
>> bytecode -> file number mapping.
> 
> That would use more space in any case involving multiple files.

Are you sure? Most of the time you're going to have
chunks of contiguous lines coming from the same file,
and the bytecode->filename table will only have an
entry for the first bytecode of the first line of
each chunk. I don't see how that works out differently
from mapping bytecodes->lines and then lines->files.
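
To make the size argument concrete, here is a sketch of such a sparse
table, with one entry per chunk rather than one per line.  The class name
FileTable and the example offsets and filenames are made up for
illustration; nothing like this exists in the core.

```python
import bisect


class FileTable:
    """Sparse map from bytecode offset to source filename.

    Holds one (start_offset, filename) entry per contiguous chunk.
    """

    def __init__(self, entries):
        entries = sorted(entries)
        self._offsets = [offset for offset, _ in entries]
        self._files = [filename for _, filename in entries]

    def file_for(self, offset):
        # Find the last chunk that starts at or before `offset`.
        index = bisect.bisect_right(self._offsets, offset) - 1
        if index < 0:
            raise ValueError("offset precedes the first chunk")
        return self._files[index]
```

Code that switches files twice needs only three entries here, however
many lines each chunk contains.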

> That'd probably work.

Greg



From fredrik at pythonware.com  Sat Oct 15 14:35:17 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sat, 15 Oct 2005 14:35:17 +0200
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com><ca471dc20510120758n6697cd66h46d903814985ba1a@mail.gmail.com><dim7oe$cdc$1@sea.gmane.org>
	<1129228828.7198.24.camel@fsol>
Message-ID: <diqt26$rkd$1@sea.gmane.org>

Antoine Pitrou wrote:

> > unfortunately, this standard seems to result in generic "spamtools" modules
> > into which people throw everything that's even remotely related to "spam",
> > followed by complaints about bloat and performance from users, followed by
> > various more or less stupid attempts to implement lazy loading of hidden
> > internal modules, followed by more complaints from users who no longer have
> > a clear view of what's really going on in there...
>
> BTW, what's the performance problem in importing unnecessary stuff
> (assuming pyc files are already generated) ?

larger modules can easily take 0.1-0.2 seconds to import (at least if they
use enough external dependencies).  that may not be a lot of time in itself,
but it can result in several seconds extra startup time for a larger program.

importing unneeded modules also adds to the process size, of course.  you
don't need to import too many modules to gobble up a couple of megabytes...
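
a quick way to check numbers like these for a given module is to time the
import directly.  this is a rough sketch (the helper name time_import is
made up): a module that is already in sys.modules returns almost
instantly, so measure in a fresh interpreter.

```python
import importlib
import time


def time_import(module_name):
    """Return (module, seconds) for a single import of module_name."""
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    return module, time.perf_counter() - start
```

on Python 3.7 and later, running "python -X importtime" prints a
per-module breakdown without any extra code.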

</F>




From skip at pobox.com  Sat Oct 15 15:15:43 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sat, 15 Oct 2005 08:15:43 -0500
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <diqt26$rkd$1@sea.gmane.org>
References: <20051012043518.x88h3g8pehsgo84c@login.werra.lunarpages.com>
	<ca471dc20510120758n6697cd66h46d903814985ba1a@mail.gmail.com>
	<dim7oe$cdc$1@sea.gmane.org> <1129228828.7198.24.camel@fsol>
	<diqt26$rkd$1@sea.gmane.org>
Message-ID: <17233.255.298868.972667@montanaro.dyndns.org>


    >> BTW, what's the performance problem in importing unnecessary stuff
    >> (assuming pyc files are already generated) ?

    Fredrik> larger modules can easily take 0.1-0.2 seconds to import (at
    Fredrik> least if they use enough external dependencies).

I wish it was that short.  At work we use lots of SWIG-wrapped C++
libraries.  Whole lotta dynamic linking goin' on...  In our case I don't
think autoloading would help all that much.  We actually use all that stuff.
The best we could do would be to defer the link step for a couple seconds.

Skip


From pje at telecommunity.com  Sat Oct 15 18:24:33 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 15 Oct 2005 12:24:33 -0400
Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes
In-Reply-To: <2mk6gf9otb.fsf@starship.python.net>
References: <5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com>
	<5.1.1.6.0.20051014151215.01f76bb0@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051015121403.01f2fbf0@mail.telecommunity.com>

At 09:17 AM 10/15/2005 +0100, Michael Hudson wrote:
>"Phillip J. Eby" <pje at telecommunity.com> writes:
>
> > Indeed, even pystone doesn't do much attribute access on the first argument
> > of most of its functions, especially not those in inner loops.  Only
> > Proc1() and the Record.copy() method do anything that would be helped by
> > SELF_ATTR.  But it seems to me that this is very unusual for
> > object-oriented code, and that more common uses of Python should be helped
> > a lot more by this.
>
>Is it that unusual though?  I don't think it's that unreasonable to
>suppose that 'typical smalltalk code' sends messages to self a good
>deal more often than 'typical python code'.  I'm not saying that this
>*is* the case, but my intuition is that it might be (not all Python
>code is that object oriented, after all).

Well, my greps on the stdlib suggest that about 50% of all lines doing 
attribute access include an attribute access on 'self'.  So for the 
stdlib, it's darn common.  Plus, all functions benefit a tiny bit from 
faster access to their first argument via the LOAD_SELF opcode, which is 
what produced the roughly 2% improvement of parrotbench.

The overall performance question has more to do with whether any of those 
accesses to self or self attributes are in loops.  A person who's 
experienced at doing Python performance tuning will probably lift as many 
of them out of the innermost loops as possible, thereby lessening the 
impact of the change somewhat.  But someone who doesn't know to do that, or 
just hasn't done it yet, will get more benefit from the change, but not as 
much as they'd get by lifting out the attribute access.
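
To show what "lifting out" means here, a small sketch with an invented
Accumulator class: both methods compute the same result, but only the
first one touches attributes of self inside the loop.

```python
class Accumulator:
    def __init__(self, step):
        self.step = step
        self.total = 0

    def run_untuned(self, n):
        # Two attribute loads and one attribute store on self per pass.
        for _ in range(n):
            self.total += self.step
        return self.total

    def run_tuned(self, n):
        # Attribute accesses lifted out of the loop into local variables.
        step = self.step
        total = self.total
        for _ in range(n):
            total += step
        self.total = total
        return total
```

A SELF_ATTR-style opcode could only help run_untuned(); run_tuned()
already pays for self.step and self.total once per call.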

Thus my guess is that it'll speed up "typical", un-tuned Python code by a 
few %, and is unlikely to slow anything down - even compilation.  (The 
compiler changes are extremely minimal and localized to the relevant 
bytecode emission points.)  Seems like a freebie, all in all.


From tcdelaney at optusnet.com.au  Sat Oct 15 22:24:30 2005
From: tcdelaney at optusnet.com.au (Tim Delaney)
Date: Sun, 16 Oct 2005 06:24:30 +1000
Subject: [Python-Dev] LOAD_SELF and SELF_ATTR opcodes
Message-ID: <026101c5d1c6$72d1d090$0201a8c0@ryoko>

Tim Delaney wrote:

> that also binds all attribute accesses all the way down into a single
> constant call e.g.
>
>    LOAD_FAST  0
>    LOAD_ATTR  'a'
>    LOAD_ATTR  'b'
>    LOAD_ATTR  'c'
>    LOAD_ATTR  'd'
>
> is bound to a single constant:
>
>    LOAD_CONST  5

D'oh. I'm a moron - of course it can't do that. It'll do it for LOAD_GLOBAL 
followed by multiple LOAD_ATTR, and (I think) also for LOAD_NAME.

Tim Delaney 


From tonynelson at georgeanelson.com  Sun Oct 16 02:12:23 2005
From: tonynelson at georgeanelson.com (Tony Nelson)
Date: Sat, 15 Oct 2005 20:12:23 -0400
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <434F5ACF.3000802@v.loewis.de>
References: <v04020a03bf72162a051d@[192.168.123.162]>
	<v04020a03bf72162a051d@[192.168.123.162]>
Message-ID: <v04020a02bf774089e195@[192.168.123.162]>

I have put up a new, packaged version of my fast charmap module at
<http://georgeanelson.com/fastcharmap> .  Hopefully it is packaged properly
and works properly (it works on my FC3 Python 2.3.4 system).  This version
is over 5 times faster than the base codec according to Hye-Shik Chang's
benchmark (mostly from compiling it with -O3).

I bring it up here mostly because I mention in its docs that improved
faster charmap codecs are coming from the Python developers.  Is it OK to
say that, and have I said it right?  I'll take that out if you folks want.

I understand that my module is not favored by Martin v. Löwis, and I don't
advocate it becoming part of Python.  My web page and docs say that it may
be useful until Python has the faster codecs.  It allows changing the
mappings because that is part of the current semantics -- a new version of
Python can certainly change those semantics.

I want to thank you all for so quickly going to work on the problem of
making charmap codecs faster.  It's to the benefit of Python users
everywhere to have faster charmap codecs in Python.  Your quickness
impressed me.

BTW, Martin, if you care to, would you explain to me how a Trie would be
used for charmap encoding?  I know a couple of approaches, but I don't know
how to do it fast.  (I've never actually had the occasion to use a Trie.)
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson at georgeanelson.com>
      '                              <http://www.georgeanelson.com/>

From ncoghlan at gmail.com  Sun Oct 16 05:01:24 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 16 Oct 2005 13:01:24 +1000
Subject: [Python-Dev] Sourceforge CVS access
In-Reply-To: <ca471dc20510080923q7842e939x9197d07d323b2970@mail.gmail.com>
References: <43468417.4000701@iinet.net.au>	
	<ca471dc20510070914h404045c1r8ab1d88ccb2c64de@mail.gmail.com>	
	<43473E51.4010103@gmail.com>
	<ca471dc20510080923q7842e939x9197d07d323b2970@mail.gmail.com>
Message-ID: <4351C284.1040303@gmail.com>

Guido van Rossum wrote:
> You're in. Use it wisely. Let me know if there are things you still
> cannot do. (But I'm not used to being SF project admin any more; other
> admins may be able to help you quicker...)

Almost there - checking out over SSH failed to work. I checked the python SF 
admin page, and I still only have read access to the CVS repository. So if one 
of the SF admins could flip that last switch, that would be great :)

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From kbk at shore.net  Sun Oct 16 06:34:07 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Sun, 16 Oct 2005 00:34:07 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200510160434.j9G4Y7HG022965@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  344 open ( +3) /  2955 closed ( +2) /  3299 total ( +5)
Bugs    :  883 open ( -1) /  5341 closed (+20) /  6224 total (+19)
RFE     :  201 open ( +5) /   187 closed ( +0) /   388 total ( +5)

New / Reopened Patches
______________________

Compiling and linking main() with C++ compiler  (2005-10-12)
       http://python.org/sf/1324762  opened by  Christoph Ludwig

Adding redblack tree to collections module  (2005-10-12)
       http://python.org/sf/1324770  opened by  Hye-Shik Chang

Letting "build_ext --libraries" take more than one lib  (2005-10-13)
       http://python.org/sf/1326113  opened by  Stephen A. Langer

VS NET 2003 Project Files contain per file compiler settings  (2005-10-15)
       http://python.org/sf/1327377  opened by  Juan-Carlos Lopez-Garcia

Static Windows Build fails to locate existing installation  (2005-10-15)
       http://python.org/sf/1327594  opened by  Doe, Baby

Patches Closed
______________

Encoding alias "unicode-1-1-utf-7"  (2005-07-26)
       http://python.org/sf/1245379  closed by  doerwalter

Py_INCREF/Py_DECREF with magic constant demo  (2005-10-07)
       http://python.org/sf/1316653  closed by  rhamphoryncus

New / Reopened Bugs
___________________

irregular behavior within class using __setitem__  (2005-10-08)
CLOSED http://python.org/sf/1317376  opened by  capnSTABN

Missing Library Modules  (2005-10-09)
CLOSED http://python.org/sf/1321736  opened by  George LeCompte

Minor error in the Library Reference doc  (2005-10-10)
CLOSED http://python.org/sf/1323294  opened by  Colin J. Williams

getwindowsversion() constants in sys module  (2005-10-11)
       http://python.org/sf/1323369  opened by  Tony Meyer

C API doc for PySequence_Tuple duplicated  (2005-10-11)
CLOSED http://python.org/sf/1323739  opened by  George Yoshida

MSI installer not working  (2005-10-11)
CLOSED http://python.org/sf/1323810  opened by  Eric Rucker

ISO8859-9 broken  (2005-10-11)
       http://python.org/sf/1324237  opened by  Eray Ozkural

Curses module doesn't install on Solaris 2.8  (2005-10-12)
       http://python.org/sf/1324799  opened by  Andrew Koenig

"as" keyword sometimes highlighted in strings  (2005-10-12)
       http://python.org/sf/1325071  opened by  Artur de Sousa Rocha

binary code made by freeze results "unknown encoding"  (2005-10-13)
CLOSED http://python.org/sf/1325491  opened by  greatPython

Curses,h  (2005-10-13)
CLOSED http://python.org/sf/1325611  reopened by  rbrenner

Curses,h  (2005-10-13)
CLOSED http://python.org/sf/1325611  opened by  hafnium

Curses,h  (2005-10-13)
CLOSED http://python.org/sf/1325903  opened by  hafnium

traceback.py formats SyntaxError differently  (2005-10-13)
       http://python.org/sf/1326077  opened by  Neil Schemenauer

odd behaviour when making lists of lambda forms  (2005-10-13)
CLOSED http://python.org/sf/1326195  opened by  Johan Hidding

itertools.count wraps around after maxint  (2005-10-13)
       http://python.org/sf/1326277  opened by  paul rubin

pdb breaks programs which import from __main__  (2005-10-13)
       http://python.org/sf/1326406  opened by  Ilya Sandler

set.__getstate__ is not overriden  (2005-10-14)
       http://python.org/sf/1326448  opened by  George Sakkis

SIGALRM alarm signal kills interpreter  (2005-10-14)
CLOSED http://python.org/sf/1326841  opened by  paul rubin

wrong TypeError traceback in generator expressions  (2005-10-14)
       http://python.org/sf/1327110  opened by  Yusuke Shinyama

title() uppercases latin1 strings after accented letters  (2005-10-14)
CLOSED http://python.org/sf/1327233  opened by  Humberto Diógenes

Bugs Closed
___________

Segmentation fault with invalid "coding"  (2005-10-07)
       http://python.org/sf/1316162  closed by  birkenfeld

irregular behavior within class using __setitem__  (2005-10-08)
       http://python.org/sf/1317376  closed by  ncoghlan

python.exe 2.4.2 compiled with VS2005 crashes  (2005-10-03)
       http://python.org/sf/1311784  closed by  loewis

2.4.2 make problems  (2005-10-03)
       http://python.org/sf/1311579  closed by  loewis

2.4.1 windows MSI has no _socket  (2005-09-24)
       http://python.org/sf/1302793  closed by  loewis

The _ssl build process for 2.3.5 is broken  (2005-09-16)
       http://python.org/sf/1292634  closed by  loewis

codecs.readline sometimes removes newline chars  (2005-04-02)
       http://python.org/sf/1175396  closed by  doerwalter

Missing Library Modules  (2005-10-09)
       http://python.org/sf/1321736  closed by  nnorwitz

inspect.getsourcelines() broken  (2005-10-07)
       http://python.org/sf/1315961  closed by  doerwalter

Minor error in the Library Reference doc  (2005-10-10)
       http://python.org/sf/1323294  closed by  nnorwitz

failure to build RPM on rhel 3  (2005-07-28)
       http://python.org/sf/1246900  closed by  jafo

C API doc for PySequence_Tuple duplicated  (2005-10-11)
       http://python.org/sf/1323739  closed by  nnorwitz

MSI installer not working  (2005-10-11)
       http://python.org/sf/1323810  closed by  loewis

[AST] Patch [ 1190012 ] should've checked for SyntaxWarnings  (2005-05-04)
       http://python.org/sf/1195576  closed by  nascheme

binary code made by freeze results "unknown encoding"  (2005-10-13)
       http://python.org/sf/1325491  closed by  perky

Curses,h  (2005-10-13)
       http://python.org/sf/1325611  closed by  birkenfeld

Curses,h  (2005-10-13)
       http://python.org/sf/1325611  closed by  perky

Curses,h  (2005-10-13)
       http://python.org/sf/1325903  closed by  birkenfeld

odd behaviour when making lists of lambda forms  (2005-10-13)
       http://python.org/sf/1326195  closed by  rhettinger

SIGALRM alarm signal kills interpreter  (2005-10-14)
       http://python.org/sf/1326841  closed by  loewis

title() uppercases latin1 strings after accented letters  (2005-10-15)
       http://python.org/sf/1327233  closed by  perky

New / Reopened RFE
__________________

itemgetter built-in?  (2005-10-10)
       http://python.org/sf/1322308  opened by  capnSTABN

fix for ms stdio tables   (2005-10-11)
       http://python.org/sf/1324176  opened by  Vambola Kotkas

__name__ available during class dictionary build  (2005-10-12)
       http://python.org/sf/1324261  opened by  Adal Chiriliuc

python scratchpad  (2005-10-14)
       http://python.org/sf/1326830  opened by  paul rubin


From guido at python.org  Sun Oct 16 06:39:02 2005
From: guido at python.org (Guido van Rossum)
Date: Sat, 15 Oct 2005 21:39:02 -0700
Subject: [Python-Dev] Sourceforge CVS access
In-Reply-To: <4351C284.1040303@gmail.com>
References: <43468417.4000701@iinet.net.au>
	<ca471dc20510070914h404045c1r8ab1d88ccb2c64de@mail.gmail.com>
	<43473E51.4010103@gmail.com>
	<ca471dc20510080923q7842e939x9197d07d323b2970@mail.gmail.com>
	<4351C284.1040303@gmail.com>
Message-ID: <ca471dc20510152139p7fd051d6y6c0756730d89295f@mail.gmail.com>

Somebody help Nick! This is beyond my SF-fu! :-(

On 10/15/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Guido van Rossum wrote:
> > You're in. Use it wisely. Let me know if there are things you still
> > cannot do. (But I'm not used to being SF project admin any more; other
> > admins may be able to help you quicker...)
>
> Almost there - checking out over SSH failed to work. I checked the python SF
> admin page, and I still only have read access to the CVS repository. So if one
> of the SF admins could flip that last switch, that would be great :)
>
> Regards,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> ---------------------------------------------------------------
>              http://boredomandlaziness.blogspot.com
>
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sun Oct 16 06:45:41 2005
From: guido at python.org (Guido van Rossum)
Date: Sat, 15 Oct 2005 21:45:41 -0700
Subject: [Python-Dev] Sourceforge CVS access
In-Reply-To: <ca471dc20510152139p7fd051d6y6c0756730d89295f@mail.gmail.com>
References: <43468417.4000701@iinet.net.au>
	<ca471dc20510070914h404045c1r8ab1d88ccb2c64de@mail.gmail.com>
	<43473E51.4010103@gmail.com>
	<ca471dc20510080923q7842e939x9197d07d323b2970@mail.gmail.com>
	<4351C284.1040303@gmail.com>
	<ca471dc20510152139p7fd051d6y6c0756730d89295f@mail.gmail.com>
Message-ID: <ca471dc20510152145k6f463773p2bb5c6e2f51f7449@mail.gmail.com>

With Neal's help I've fixed Nick's permissions. Enjoy, Nick!

On 10/15/05, Guido van Rossum <guido at python.org> wrote:
> Somebody help Nick! This is beyond my SF-fu! :-(
>
> On 10/15/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > Guido van Rossum wrote:
> > > You're in. Use it wisely. Let me know if there are things you still
> > > cannot do. (But I'm not used to being SF project admin any more; other
> > > admins may be able to help you quicker...)
> >
> > Almost there - checking out over SSH failed to work. I checked the python SF
> > admin page, and I still only have read access to the CVS repository. So if one
> > of the SF admins could flip that last switch, that would be great :)
> >
> > Regards,
> > Nick.
> >
> > --
> > Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> > ---------------------------------------------------------------
> >              http://boredomandlaziness.blogspot.com
> >
> >
>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at gmail.com  Sun Oct 16 06:54:44 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 16 Oct 2005 14:54:44 +1000
Subject: [Python-Dev] Sourceforge CVS access
In-Reply-To: <ca471dc20510152145k6f463773p2bb5c6e2f51f7449@mail.gmail.com>
References: <43468417.4000701@iinet.net.au>	
	<ca471dc20510070914h404045c1r8ab1d88ccb2c64de@mail.gmail.com>	
	<43473E51.4010103@gmail.com>	
	<ca471dc20510080923q7842e939x9197d07d323b2970@mail.gmail.com>	
	<4351C284.1040303@gmail.com>	
	<ca471dc20510152139p7fd051d6y6c0756730d89295f@mail.gmail.com>
	<ca471dc20510152145k6f463773p2bb5c6e2f51f7449@mail.gmail.com>
Message-ID: <4351DD14.4050302@gmail.com>

Guido van Rossum wrote:
> With Neal's help I've fixed Nick's permissions. Enjoy, Nick!

Thanks folks - it looks to be all good now.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From jeremy at alum.mit.edu  Sun Oct 16 07:30:26 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Sun, 16 Oct 2005 01:30:26 -0400
Subject: [Python-Dev] AST branch merge status
Message-ID: <e8bf7a530510152230h7d0382ccx1ff7f1727877bd08@mail.gmail.com>

I just merged the head back to the AST branch for what I hope is the
last time.  I plan to merge the branch to the head on Sunday evening. 
I'd appreciate it if folks could hold off on making changes on the
trunk until that merge happens.

If this is a non-trivial inconvenience for anyone, go ahead with the
changes but send me mail to make sure that I don't lose the changes
when I do the merge.  Regardless, the compiler and Grammar are off
limits.  I'll blow away any changes you make there <0.1 wink>.

Jeremy

From martin at v.loewis.de  Sun Oct 16 11:56:01 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 16 Oct 2005 11:56:01 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <v04020a02bf774089e195@[192.168.123.162]>
References: <v04020a03bf72162a051d@[192.168.123.162]>
	<v04020a03bf72162a051d@[192.168.123.162]>
	<v04020a02bf774089e195@[192.168.123.162]>
Message-ID: <435223B1.2020209@v.loewis.de>

Tony Nelson wrote:
> BTW, Martin, if you care to, would you explain to me how a Trie would be
> used for charmap encoding?  I know a couple of approaches, but I don't know
> how to do it fast.  (I've never actually had the occasion to use a Trie.)

I currently envision a three-level trie, with 5, 4, and 7 bits. You take
the Unicode character (only characters below U+FFFF supported), and take
the uppermost 5 bits, as an index into an array. There you find the base
of a second array, to which you add the next 4 bits, which gives you an
index into a third array, where you add the last 7 bits. This gives
you the character, or 0 if it is unmappable.

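/* Py_UNICODE comes from Python.h. */
struct encoding_map {
    unsigned char level0[32];  /* top 5 bits -> level1 block; 0xFF = unmapped */
    unsigned char *level1;     /* 16 entries per block -> level2 block */
    unsigned char *level2;     /* 128 entries per block -> encoded byte */
};

/* Return the encoded byte, or -1 if the character is unmappable. */
static int
encode_char(const struct encoding_map *table, Py_UNICODE character)
{
    int level1, level2, mapped;

    level1 = table->level0[character >> 11];
    if (level1 == 0xFF)
        return -1;
    level2 = table->level1[16*level1 + ((character >> 7) & 0xF)];
    if (level2 == 0xFF)
        return -1;
    mapped = table->level2[128*level2 + (character & 0x7F)];
    return mapped == 0 ? -1 : mapped;
}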

Over a hashtable, this has the advantage of not having to deal with
collisions. Instead, it guarantees you a constant-time lookup.

It is also quite space-efficient: all tables use bytes as indices.
As each level0 entry deals with 2048 characters, most character maps
will only use one or two level1 blocks, meaning 16 or 32 bytes
for level1. The number of level2 blocks required depends on
the number of 128-character rows which the encoding spans;
for most encodings, three or four such blocks will be sufficient
(with ASCII spanning one such block typically), causing the
entire memory consumption for many encodings to be less than
600 bytes.
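
The 5-4-7 layout can be sketched in Python to show how the three tables
are filled and walked. This is only an illustration under stated
assumptions: build_trie, encode_char and charmap are hypothetical names,
the mapping is derived from a stdlib codec, and byte 0 is left out because
0 marks "unmappable" in the last level.

```python
UNMAPPED = 0xFF

def charmap(encoding):
    """Code point -> byte for every byte 1..255 the codec can decode."""
    mapping = {}
    for byte in range(1, 256):
        try:
            mapping[ord(bytes([byte]).decode(encoding))] = byte
        except UnicodeDecodeError:
            pass  # byte is undefined in this encoding
    return mapping

def build_trie(mapping):
    level0 = [UNMAPPED] * 32   # indexed by the top 5 bits
    level1 = bytearray()       # blocks of 16, indexed by the next 4 bits
    level2 = bytearray()       # blocks of 128, indexed by the low 7 bits
    for cp, byte in sorted(mapping.items()):
        top, mid, low = cp >> 11, (cp >> 7) & 0xF, cp & 0x7F
        if level0[top] == UNMAPPED:
            level0[top] = len(level1) // 16      # allocate a level1 block
            level1.extend([UNMAPPED] * 16)
        slot = 16 * level0[top] + mid
        if level1[slot] == UNMAPPED:
            level1[slot] = len(level2) // 128    # allocate a level2 block
            level2.extend([0] * 128)
        level2[128 * level1[slot] + low] = byte
    return bytes(level0), bytes(level1), bytes(level2)

def encode_char(trie, cp):
    """Return the encoded byte for code point cp, or None if unmappable."""
    level0, level1, level2 = trie
    if cp > 0xFFFF:
        return None
    block1 = level0[cp >> 11]
    if block1 == UNMAPPED:
        return None
    block2 = level1[16 * block1 + ((cp >> 7) & 0xF)]
    if block2 == UNMAPPED:
        return None
    return level2[128 * block2 + (cp & 0x7F)] or None

trie = build_trie(charmap("cp1252"))
```

Every lookup touches exactly three table entries, whichever character is
asked for, which is the constant-time property mentioned above.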

It would be possible to remove one level of indirection (giving
some more speed) at the expense of additional memory: for
example, an 8-8 trie would use 256 bytes for level 0, and
then 256 bytes for each Unicode row where the encoding has
characters, likely resulting in 1 KiB for a typical encoding.

Hye-Shik Chang's version of the fast codec uses such an 8-8
trie, but conserves space by observing that most 256-char
rows are only sparsely used by encodings (e.g. if you have
only ASCII, you use only 128 characters from row 0). He
therefore strips unused cells from the row, by also recording
the first and last character per row. This brings some
space back, at the expense of some slow-down again.
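
The row-trimming idea can be sketched as follows. This is not Hye-Shik
Chang's actual code, only an illustration of the description above:
build_rows and encode_char_88 are hypothetical names, and level 0 is a
dict here rather than a 256-byte array, purely for brevity.

```python
def build_rows(mapping):
    """Group code point -> byte pairs by high byte, keeping first..last only."""
    rows = {}
    for cp, byte in mapping.items():
        rows.setdefault(cp >> 8, {})[cp & 0xFF] = byte
    trimmed = {}
    for high, cells in rows.items():
        first, last = min(cells), max(cells)
        # Cells between first and last with no mapping are stored as 0.
        data = bytes(cells.get(low, 0) for low in range(first, last + 1))
        trimmed[high] = (first, last, data)
    return trimmed

def encode_char_88(rows, cp):
    """Return the encoded byte for code point cp, or None if unmappable."""
    row = rows.get(cp >> 8)
    if row is None:
        return None
    first, last, data = row
    low = cp & 0xFF
    if not first <= low <= last:   # the extra range check is the slow-down
        return None
    return data[low - first] or None
```

With an ASCII-only mapping, row 0 stores only the cells between its first
and last mapped character instead of all 256.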

Regards,
Martin

From ncoghlan at iinet.net.au  Sun Oct 16 14:46:44 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Sun, 16 Oct 2005 22:46:44 +1000
Subject: [Python-Dev] PEP 343 updated
Message-ID: <43524BB4.7040808@iinet.net.au>

PEP 343 has been updated on python.org.

Highlights of the changes:

   - changed the name of the PEP to be simply "The 'with' Statement"
   - added __with__() method
   - added section on standard terminology (that is, contexts/context managers)
   - changed generator context decorator name to "context"
   - Updated "Resolved Issues" section
   - updated decimal.Context() example
   - updated closing() example so it works for objects without close methods

I also added a new Open Issues section with the questions:

   - should the decorator be called "context" or something else, such as the
     old "contextmanager"? (The PEP currently says "context")
   - should the decorator be a builtin? (The PEP currently says yes)
   - should the decorator be applied automatically to generators used to write
     __with__ methods? (The PEP currently says yes)
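
For readers on a later Python: the generator decorator discussed here
shipped in the standard library as contextlib.contextmanager, not as a
builtin named "context". A minimal sketch of a generator-based context
manager under that assumption (the tag function and the log list are
hypothetical names used only for illustration):

```python
from contextlib import contextmanager

@contextmanager
def tag(log, name):
    log.append("<%s>" % name)       # runs on entry to the with block
    try:
        yield
    finally:
        log.append("</%s>" % name)  # runs on exit, even after an exception

log = []
with tag(log, "h1"):
    log.append("hello")
```

The code before the yield plays the role of setup and the code in the
finally clause plays the role of cleanup.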

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at iinet.net.au  Sun Oct 16 15:56:10 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Sun, 16 Oct 2005 23:56:10 +1000
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
Message-ID: <43525BFA.9090309@iinet.net.au>

On and off, I've been looking for an elegant way to handle properties using 
decorators.

It hasn't really worked, because decorators are inherently single function, 
and properties span multiple functions.

However, it occurred to me that Python already contains a construct for 
grouping multiple related functions together: classes.

And that thought led me to this decorator:

   def def_property(cls_func):
       cls = cls_func()
       try:
           fget = cls.get.im_func
       except AttributeError:
           fget = None
       try:
           fset = cls.set.im_func
       except AttributeError:
           fset = None
       try:
           fdel = cls.delete.im_func
       except AttributeError:
           fdel = None
       return property(fget, fset, fdel, cls.__doc__)

Obviously, this decorator can only be used by decorating a function that 
returns the class of interest:

   class Demo(object):
      @def_property
      def test():
          class prop:
              """This is a test property"""
              def get(self):
                  print "Getting attribute on instance"
              def set(self, value):
                  print "Setting attribute on instance"
              def delete(self):
                  print "Deleting attribute on instance"
          return prop

Which gives the following behaviour:

Py> Demo.test
<property object at 0x00B9CC38>
Py> Demo().test
Getting attribute on instance
Py> Demo().test = 1
Setting attribute on instance
Py> del Demo().test
Deleting attribute on instance
Py> help(Demo.test)
Help on property:

     This is a test property

     <get> = get(self)

     <set> = set(self, value)

     <delete> = delete(self)

If we had class decorators, though, the decorator could be modified to skip 
the function invocation:

   def def_property(cls):
       try:
           fget = cls.get.im_func
       except AttributeError:
           fget = None
       try:
           fset = cls.set.im_func
       except AttributeError:
           fset = None
       try:
           fdel = cls.delete.im_func
       except AttributeError:
           fdel = None
       return property(fget, fset, fdel, cls.__doc__)

And the usage would be much more direct:

   class Demo(object):
      @def_property
      class test:
          """This is a test property"""
          def get(self):
              print "Getting attribute on instance"
          def set(self, value):
              print "Setting attribute on instance"
          def delete(self):
              print "Deleting attribute on instance"
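
Class decorators were later added to the language, so the second form can
be tried directly on Python 3. The sketch below is an adaptation, not the
code from this message: on Python 3, functions looked up on a class are
plain functions, so im_func is not needed, and the accessor bodies store a
hypothetical _test attribute instead of printing.

```python
def def_property(cls):
    # getattr with a default replaces the three try/except blocks.
    fget = getattr(cls, "get", None)
    fset = getattr(cls, "set", None)
    fdel = getattr(cls, "delete", None)
    return property(fget, fset, fdel, cls.__doc__)

class Demo(object):
    @def_property
    class test:
        """This is a test property"""
        def get(self):
            return getattr(self, "_test", None)
        def set(self, value):
            self._test = value
        def delete(self):
            del self._test
```

After the decorator runs, Demo.test is an ordinary property object and the
inner class itself is discarded.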


Comments? Screams of horror?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From solipsis at pitrou.net  Sun Oct 16 16:08:09 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 16 Oct 2005 16:08:09 +0200
Subject: [Python-Dev] Definining properties - a use case for
	class	decorators?
In-Reply-To: <43525BFA.9090309@iinet.net.au>
References: <43525BFA.9090309@iinet.net.au>
Message-ID: <1129471690.6133.9.camel@fsol>


>    class Demo(object):
>       @def_property
>       class test:
>           """This is a test property"""
>           def get(self):
>               print "Getting attribute on instance"
>           def set(self, value):
>               print "Setting attribute on instance"
>           def delete(self):
>               print "Deleting attribute on instance"

The code looks like "self" refers to a test instance, but it will
actually refer to a Demo instance. This is quite misleading.
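
A small sketch that makes this concrete, assuming Python 3 (where class
decorators exist) and a cut-down, hypothetical def_property that only
handles the getter:

```python
def def_property(cls):
    return property(getattr(cls, "get", None), doc=cls.__doc__)

class Demo(object):
    @def_property
    class test:
        """This is a test property"""
        def get(self):
            # Written inside "class test", but self is a Demo instance.
            return type(self).__name__

owner_name = Demo().test
```

Here owner_name is the string "Demo", because the property calls get with
the Demo instance it was looked up on.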




From gary at modernsongs.com  Sun Oct 16 16:18:54 2005
From: gary at modernsongs.com (Gary Poster)
Date: Sun, 16 Oct 2005 10:18:54 -0400
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <43525BFA.9090309@iinet.net.au>
References: <43525BFA.9090309@iinet.net.au>
Message-ID: <414317A5-0B04-4250-B458-A8B8A74AC221@modernsongs.com>


On Oct 16, 2005, at 9:56 AM, Nick Coghlan wrote:

> On and off, I've been looking for an elegant way to handle  
> properties using
> decorators.

This isn't my idea, and it might have been brought up here in the  
past to the same sorts of screams of horror to which you refer later,  
but I use the 'apply' pattern without too many internal objections  
for this:

class Foo(object):
     # just a simple example, practically pointless
     _my_property = None
     @apply
     def my_property():
         def get(self):
             return self._my_property
         def set(self, value):
             self._my_property = value
         return property(get, set)
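
The apply builtin was removed in Python 3, so trying this pattern there
needs a stand-in. A minimal sketch, assuming a one-line replacement named
apply that simply calls the decorated function with no arguments:

```python
def apply(func):
    """Stand-in for the Python 2 builtin: call func with no arguments."""
    return func()

class Foo(object):
    _my_property = None

    @apply
    def my_property():
        # The accessors live in this function's local scope only.
        def get(self):
            return self._my_property
        def set(self, value):
            self._my_property = value
        return property(get, set)
```

The decorator replaces the function with its return value, so
Foo.my_property ends up bound to the property and get/set never appear in
the class namespace.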

IMHO, I find this easier to parse than either of your two examples.

Apologies if this has already been screamed at. :-)

Gary

From guido at python.org  Sun Oct 16 17:19:07 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 16 Oct 2005 08:19:07 -0700
Subject: [Python-Dev] PEP 343 updated
In-Reply-To: <43524BB4.7040808@iinet.net.au>
References: <43524BB4.7040808@iinet.net.au>
Message-ID: <ca471dc20510160819n47bc5697w19c3bf939795e2d8@mail.gmail.com>

On 10/16/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> PEP 343 has been updated on python.org.
>
> Highlights of the changes:
>
>    - changed the name of the PEP to be simply "The 'with' Statement"
>    - added __with__() method
>    - added section on standard terminology (that is, contexts/context managers)
>    - changed generator context decorator name to "context"
>    - Updated "Resolved Issues" section
>    - updated decimal.Context() example
>    - updated closing() example so it works for objects without close methods
>
> I also added a new Open Issues section with the questions:
>
>    - should the decorator be called "context" or something else, such as the
>      old "contextmanager"? (The PEP currently says "context")
>    - should the decorator be a builtin? (The PEP currently says yes)
>    - should the decorator be applied automatically to generators used to write
>      __with__ methods? (The PEP currently says yes)

I hope you reverted the status to "Proposed"...

On the latter: I think it shouldn't; I don't like this kind of magic.
I'll have to read it before I can comment on the rest.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sun Oct 16 17:23:15 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 16 Oct 2005 08:23:15 -0700
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <43525BFA.9090309@iinet.net.au>
References: <43525BFA.9090309@iinet.net.au>
Message-ID: <ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>

On 10/16/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> On and off, I've been looking for an elegant way to handle properties using
> decorators.
>
> It hasn't really worked, because decorators are inherently single function,
> and properties span multiple functions.
>
> However, it occurred to me that Python already contains a construct for
> grouping multiple related functions together: classes.

Nick, and everybody else trying to find a "solution" for this
"problem", please don't. There's nothing wrong with having the three
accessor methods explicitly in the namespace, it's clear, and probably
less typing (and certainly less indenting!). Just write this:

    class C:
        def getFoo(self): ...
        def setFoo(self): ...
        foo = property(getFoo, setFoo)
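
A filled-in version of the explicit style, as a sketch only: the method
bodies and the _foo storage attribute are assumptions, the setter is given
the value parameter that property() will pass to it, and the class derives
from object because property setters only take effect on new-style classes
in Python 2.

```python
class C(object):
    def __init__(self):
        self._foo = None

    def getFoo(self):
        return self._foo

    def setFoo(self, value):
        self._foo = value

    # Both accessors stay visible in the class namespace.
    foo = property(getFoo, setFoo)
```

Both c.foo = 5 and c.setFoo(5) work on an instance, since the accessors
remain ordinary methods.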

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ark at acm.org  Sun Oct 16 17:26:09 2005
From: ark at acm.org (Andrew Koenig)
Date: Sun, 16 Oct 2005 11:26:09 -0400
Subject: [Python-Dev] PEP 343 updated
In-Reply-To: <43524BB4.7040808@iinet.net.au>
Message-ID: <000f01c5d265$f232c2f0$6402a8c0@arkdesktop>

> PEP 343 has been updated on python.org.

> Highlights of the changes:

>    - changed the name of the PEP to be simply "The 'with' Statement"

Do you mean PEP 346, perchance?  PEP 343 is something else entirely.




From ironfroggy at gmail.com  Sun Oct 16 17:51:30 2005
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sun, 16 Oct 2005 11:51:30 -0400
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
Message-ID: <76fd5acf0510160851l2aeacc6x156a203ca0c7ca60@mail.gmail.com>

On 10/16/05, Guido van Rossum <guido at python.org> wrote:
> On 10/16/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> > On and off, I've been looking for an elegant way to handle properties using
> > decorators.
> >
> > It hasn't really worked, because decorators are inherently single function,
> > and properties span multiple functions.
> >
> > However, it occurred to me that Python already contains a construct for
> > grouping multiple related functions together: classes.
>
> Nick, and everybody else trying to find a "solution" for this
> "problem", please don't. There's nothing wrong with having the three
> accessor methods explicitly in the namespace, it's clear, and probably
> less typing (and certainly less indenting!). Just write this:
>
>     class C:
>         def getFoo(self): ...
>         def setFoo(self): ...
>         foo = property(getFoo, setFoo)

Does this necessarily mean a 'no' still for class decorators, or do
you just not like this particular use case for them? Or, are you
perhaps against this proposal due to its use of nested classes?

From ironfroggy at gmail.com  Sun Oct 16 17:56:36 2005
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sun, 16 Oct 2005 11:56:36 -0400
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
In-Reply-To: <20051014000927.919E.JCARLSON@uci.edu>
References: <b64f365b0510111431g7fdc573dpc59e4e304ec4e425@mail.gmail.com>
	<76fd5acf0510132316x6a8bcc8ck1c3d5a812abd447e@mail.gmail.com>
	<20051014000927.919E.JCARLSON@uci.edu>
Message-ID: <76fd5acf0510160856r72f87f7fj51ad48c97003b810@mail.gmail.com>

On 10/14/05, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> Calvin Spealman <ironfroggy at gmail.com> wrote:
> >
> > On 10/11/05, Eyal Lotem <eyal.lotem at gmail.com> wrote:
> > >       locals()['x'] = 1 # Quietly fails!
> > > Replaced by:
> > >       frame.x = 1 # Raises error
> >
> > What about the possibility of making this hypothetical frame object an
> > indexable, such that frame[0] is the current scope, frame[1] is the
> > calling scope, etc.? On the same lines, what about closure[0] for the
> > current frame, while closure[1] resolves to the closure the function
> > was defined in? These would ensure that you could reliably access any
> > namespace you would need, without nasty stack tricks and such, and
> > would make it easier to work around some of the limitations of closures,
> > when you have such a need. One might even consider a __resolve__ to be
> > defined in any namespace, allowing all the namespace resolution rules
> > to be overridden by code at any level.
>
> -1000  If you want a namespace, create one and pass it around.  If the
> writer of a function or method wanted you monkeying around with a
> namespace, they would have given you one to work with.

Whether or not they want you monkeying around with their namespace, you
can do so with various tricks introspecting the frame stack and other
internals. I was merely suggesting this as something more
standardized, perhaps across the various Python implementations. It
would also provide a single point of restriction when you want to
disable such things.
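
The behaviour under discussion can be seen in CPython today. A minimal
sketch, assuming CPython (sys._getframe is an implementation detail) and
hypothetical function names:

```python
import sys

def assign_via_locals():
    x = 0
    locals()['x'] = 1   # quietly fails to rebind the real local variable
    return x

def caller_locals():
    # Frame 1 is the frame of whoever called this function.
    return dict(sys._getframe(1).f_locals)

def demo():
    secret = 42
    return caller_locals()
```

assign_via_locals() still returns 0, while demo() returns a dict holding
the caller's secret, which is the kind of stack trick referred to above.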

> As for closure monkeywork, you've got to be kidding.  Closures in Python
> are a clever and interesting way of keeping around certain things, but
> are actually unnecessary with the existence of class and instance
> namespaces. Every example of a closure can be re-done as a
> class/instance, and many end up looking better.

I mostly put that in there for completeness.
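
For reference, the closure-versus-instance equivalence Josiah describes
can be sketched with a counter. Both names are hypothetical, and the
closure keeps its state in a one-element list so the example does not
depend on any rebinding syntax:

```python
def make_counter():
    count = [0]                 # state captured by the closure
    def increment():
        count[0] += 1
        return count[0]
    return increment

class Counter(object):
    def __init__(self):
        self.count = 0          # the same state, held on the instance
    def __call__(self):
        self.count += 1
        return self.count
```

Both objects are called the same way; the class version additionally
exposes its state as counter.count.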

From guido at python.org  Sun Oct 16 18:20:13 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 16 Oct 2005 09:20:13 -0700
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <76fd5acf0510160851l2aeacc6x156a203ca0c7ca60@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<76fd5acf0510160851l2aeacc6x156a203ca0c7ca60@mail.gmail.com>
Message-ID: <ca471dc20510160920u348eb8a2r504840c9baaac168@mail.gmail.com>

On 10/16/05, Calvin Spealman <ironfroggy at gmail.com> wrote:
> On 10/16/05, Guido van Rossum <guido at python.org> wrote:
> > Nick, and everybody else trying to find a "solution" for this
> > "problem", please don't. There's nothing wrong with having the three
> > accessor methods explicitly in the namespace, it's clear, and probably
> > less typing (and certainly less indenting!). Just write this:
> >
> >     class C:
> >         def getFoo(self): ...
> >         def setFoo(self): ...
> >         foo = property(getFoo, setFoo)
>
> Does this necessarily mean a 'no' still for class decorators, or do
> you just not like this particular use case for them? Or, are you
> perhaps against this proposal due to its use of nested classes?

I'm still -0 on class decorators pending good use cases. I'm -1 on
using a class decorator (if we were to introduce them) for get/set
properties; it doesn't save writing and it doesn't save reading.
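
For reference, a minimal runnable sketch of the explicit pattern quoted
above. The backing attribute `_foo` and the `value` parameter are
assumptions added for illustration; they are not part of the original
snippet.

```python
class C:
    def getFoo(self):
        return self._foo

    def setFoo(self, value):
        self._foo = value

    # Both accessors remain visible in the class namespace.
    foo = property(getFoo, setFoo)


c = C()
c.foo = 42        # routed through setFoo
print(c.foo)      # read back through getFoo
```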

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Sun Oct 16 18:53:24 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 16 Oct 2005 18:53:24 +0200
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <v04020a02bf7823813cdb@[192.168.123.162]>
References: <v04020a02bf774089e195@[192.168.123.162]>
	<v04020a03bf72162a051d@[192.168.123.162]>
	<v04020a03bf72162a051d@[192.168.123.162]>
	<v04020a02bf774089e195@[192.168.123.162]>
	<v04020a02bf7823813cdb@[192.168.123.162]>
Message-ID: <43528584.7090306@v.loewis.de>

Tony Nelson wrote:
> Umm, 0 (NUL) is a valid output character in most of the 8-bit character
> sets.  It could be handled by having a separate "exceptions" string of the
> unicode code points that actually map to the exception char.

Yes. But only U+0000 should normally map to 0. It could be special-cased
altogether.

> As you are concerned about pathological cases for hashing (that would make
> the hash chains long), it is worth noting that in such cases this data
> structure could take 64K bytes.  Of course, no such case occurs in standard
> encodings, and 64K is affordable as long as it is rare.

I'm not concerned with long hash chains, I dislike having collisions in 
the first place (even if they are only for two code points). Having to
deal with collisions makes the code more complicated, and less predictable.

It's primarily a matter of taste: avoid hashtables if you can :-)

Regards,
Martin

From jcarlson at uci.edu  Sun Oct 16 19:22:14 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 16 Oct 2005 10:22:14 -0700
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
In-Reply-To: <76fd5acf0510160856r72f87f7fj51ad48c97003b810@mail.gmail.com>
References: <20051014000927.919E.JCARLSON@uci.edu>
	<76fd5acf0510160856r72f87f7fj51ad48c97003b810@mail.gmail.com>
Message-ID: <20051016100016.37E3.JCARLSON@uci.edu>


Calvin Spealman <ironfroggy at gmail.com> wrote:
> 
> On 10/14/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> >
> > Calvin Spealman <ironfroggy at gmail.com> wrote:
> > >
> > > On 10/11/05, Eyal Lotem <eyal.lotem at gmail.com> wrote:
> > > >       locals()['x'] = 1 # Quietly fails!
> > > > Replaced by:
> > > >       frame.x = 1 # Raises error
> > >
> > > What about the possibility of making this hypothetical frame object
> > > indexable, such that frame[0] is the current scope, frame[1] is the
> > > calling scope, etc.? Along the same lines, what about closure[0] for the
> > > current frame, while closure[1] resolves to the closure the function
> > > was defined in? These would ensure that you could reliably access any
> > > namespace you need, without nasty stack tricks and such, and would
> > > make it easier to work around some of the limitations of closures when
> > > you have such a need. One might even consider allowing a __resolve__ to
> > > be defined in any namespace, so that the namespace resolution rules
> > > could be overridden by code at any level.
> >
> > -1000  If you want a namespace, create one and pass it around.  If the
> > writer of a function or method wanted you monkeying around with a
> > namespace, they would have given you one to work with.
> 
> If they want you monkeying around with their namespace or not, you can
> do so with various tricks introspecting the frame stack and other
> internals. I was merely suggesting this as something more
> standardized, perhaps across the various Python implementations. It
> would also provide a single point of restriction when you want to
> disable such things.

What I'm saying is that whether or not you can modify the contents of
stack frames via tricks, you shouldn't.  Why?  Because as I said, if the
writer wanted you to be hacking around with a namespace, they should
have passed you a shared namespace.

From what I understand, there are very few (good) reasons why a user
should muck with stack frames, among them because it is quite convenient
to write custom traceback printers (like web CGI, etc.), and if one is
tricky, limit the callers of a function/method to those "allowable". 
There may be other good reasons, but until you offer a use-case that is
compelling for reasons why it should be easier to access and/or modify
the contents of stack frames, I'm going to remain at -1000.
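
A small sketch of the distinction drawn above, assuming CPython. The
function names are illustrative only. Writing through locals() inside a
function does not rebind the real local variable, passing an explicit
namespace works as expected, and sys._getframe (a CPython implementation
detail) covers the read-only introspection case such as a custom traceback
printer.

```python
import sys


def write_via_locals():
    x = 0
    locals()['x'] = 1   # does not rebind the real local variable on CPython
    return x


def fill(namespace):
    # The caller opted in by handing over a namespace explicitly.
    namespace['x'] = 1


def caller_name():
    # Read-only frame introspection: report who called us.
    return sys._getframe(1).f_code.co_name


def outer():
    return caller_name()


shared = {}
fill(shared)
print(write_via_locals(), shared['x'], outer())
```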

 - Josiah


From ironfroggy at gmail.com  Sun Oct 16 19:37:17 2005
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sun, 16 Oct 2005 13:37:17 -0400
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
In-Reply-To: <76fd5acf0510161036i4ab09e2cu39bd6961a60df783@mail.gmail.com>
References: <20051014000927.919E.JCARLSON@uci.edu>
	<76fd5acf0510160856r72f87f7fj51ad48c97003b810@mail.gmail.com>
	<20051016100016.37E3.JCARLSON@uci.edu>
	<76fd5acf0510161036i4ab09e2cu39bd6961a60df783@mail.gmail.com>
Message-ID: <76fd5acf0510161037v477874b0w5595e3edffe71511@mail.gmail.com>

On 10/16/05, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> Calvin Spealman <ironfroggy at gmail.com> wrote:
> >
> > On 10/14/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> > >
> > > Calvin Spealman <ironfroggy at gmail.com> wrote:
> > > >
> > > > On 10/11/05, Eyal Lotem <eyal.lotem at gmail.com> wrote:
> > > > >       locals()['x'] = 1 # Quietly fails!
> > > > > Replaced by:
> > > > >       frame.x = 1 # Raises error
> > > >
> > > > > What about the possibility of making this hypothetical frame object
> > > > > indexable, such that frame[0] is the current scope, frame[1] is the
> > > > > calling scope, etc.? Along the same lines, what about closure[0] for the
> > > > > current frame, while closure[1] resolves to the closure the function
> > > > > was defined in? These would ensure that you could reliably access any
> > > > > namespace you need, without nasty stack tricks and such, and would
> > > > > make it easier to work around some of the limitations of closures when
> > > > > you have such a need. One might even consider allowing a __resolve__ to
> > > > > be defined in any namespace, so that the namespace resolution rules
> > > > > could be overridden by code at any level.
> > >
> > > -1000  If you want a namespace, create one and pass it around.  If the
> > > writer of a function or method wanted you monkeying around with a
> > > namespace, they would have given you one to work with.
> >
> > If they want you monkeying around with their namespace or not, you can
> > do so with various tricks introspecting the frame stack and other
> > internals. I was merely suggesting this as something more
> > standardized, perhaps across the various Python implementations. It
> > would also provide a single point of restriction when you want to
> > disable such things.
>
> What I'm saying is that whether or not you can modify the contents of
> stack frames via tricks, you shouldn't.  Why?  Because as I said, if the
> writer wanted you to be hacking around with a namespace, they should
> have passed you a shared namespace.
>
> From what I understand, there are very few (good) reasons why a user
> should muck with stack frames, among them because it is quite convenient
> to write custom traceback printers (like web CGI, etc.), and if one is
> tricky, limit the callers of a function/method to those "allowable".
> There may be other good reasons, but until you offer a use-case that is
> compelling for reasons why it should be easier to access and/or modify
> the contents of stack frames, I'm going to remain at -1000.

I think I was wording this badly. I meant to suggest this as a way to
define nested functions (or classes?) and probably access names from
various levels of scope. In this way, a nested function would be able
to say "bind the name 'a' in the namespace in which I am defined to
this object", thus offering a more fine-grained approach than the
current global keyword. I know there has been talk of this issue
before, but I don't know if it works with or against anything said for
this previously.
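
The rebinding described here is what Python 3 later provided through the
`nonlocal` statement (PEP 3104); that statement did not exist when this
message was written. A minimal sketch, with `make_counter`, `bump` and `a`
as illustrative names:

```python
def make_counter():
    a = 0

    def bump():
        nonlocal a      # rebind 'a' in the scope where bump() was defined
        a += 1
        return a

    return bump


counter = make_counter()
counter()
print(counter())
```

Unlike `global`, the statement targets the nearest enclosing function scope
that already binds the name, which is the finer-grained control asked for
above.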

From blais at furius.ca  Mon Oct 17 02:26:43 2005
From: blais at furius.ca (Martin Blais)
Date: Sun, 16 Oct 2005 20:26:43 -0400
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <diqd0a$o1q$1@sea.gmane.org>
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>
	<2mbr27f0th.fsf@starship.python.net>
	<8393fff0510142350l81ba453md20cc47a445642ce@mail.gmail.com>
	<diqd0a$o1q$1@sea.gmane.org>
Message-ID: <8393fff0510161726i6fd5c798u38aa3875c8f6ac4d@mail.gmail.com>

On 10/15/05, Reinhold Birkenfeld <reinhold-birkenfeld-nospam at wolke7.net> wrote:
> Martin Blais wrote:
> > On 10/3/05, Michael Hudson <mwh at python.net> wrote:
> >> Martin Blais <blais at furius.ca> writes:
> >>
> >> > How hard would that be to implement?
> >>
> >> import sys
> >> reload(sys)
> >> sys.setdefaultencoding('undefined')
> >
> > Hmmm any particular reason for the call to reload() here?
>
> Yes. setdefaultencoding() is removed from sys by site.py. To get it
> again you must reload sys.

Thanks.

cheers,

From greg.ewing at canterbury.ac.nz  Mon Oct 17 03:42:02 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 17 Oct 2005 14:42:02 +1300
Subject: [Python-Dev] Definining properties - a use case for class
 decorators?
In-Reply-To: <ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
Message-ID: <4353016A.1010707@canterbury.ac.nz>

Guido van Rossum wrote:

> Nick, and everybody else trying to find a "solution" for this
> "problem", please don't.

Denying that there's a problem isn't going to make it
go away. Many people, including me, have the feeling that
the standard way of defining properties at the moment leaves
something to be desired, for all the same reasons that have
led to @-decorators.

However, I agree that trying to keep the accessor method
names out of the class namespace isn't necessary, and may
not even be desirable. The way I'm defining properties in
PyGUI at the moment looks like this:

   class C:

     foo = overridable_property('foo', "The foo property")

     def get_foo(self):
       ...

     def set_foo(self, x):
       ...

This has the advantage that the accessor methods can be
overridden in subclasses with the expected effect. This
is particularly important in PyGUI, where I have a generic
class definition which establishes the valid properties
and their docstrings, and implementation subclasses for
different platforms which supply the accessor methods.

The only wart is the necessity of mentioning the property
name twice, once on the lhs and once as an argument.
I haven't thought of a good solution to that, yet.
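
One way such a helper could be written is sketched below. This is an
assumption about the general technique (looking the accessors up by name at
access time), not PyGUI's actual implementation; `Widget` and `LoudWidget`
are hypothetical classes.

```python
def overridable_property(name, doc=None):
    """Property whose accessors are looked up by name on each access."""
    getter_name = 'get_' + name
    setter_name = 'set_' + name

    def fget(self):
        return getattr(self, getter_name)()

    def fset(self, value):
        getattr(self, setter_name)(value)

    return property(fget, fset, None, doc)


class Widget:
    foo = overridable_property('foo', "The foo property")

    def get_foo(self):
        return self._foo

    def set_foo(self, x):
        self._foo = x


class LoudWidget(Widget):
    def get_foo(self):          # override takes effect without redefining foo
        return self._foo.upper()
```

Because the lookup goes through getattr on the instance, a subclass only
has to redefine the accessor method; the property object itself is
inherited unchanged.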

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From tdelaney at avaya.com  Mon Oct 17 04:39:15 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Mon, 17 Oct 2005 12:39:15 +1000
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB6E2@au3010avexu1.global.avaya.com>

Greg Ewing wrote:

>    class C:
> 
>      foo = overridable_property('foo', "The foo property")
> 
>      def get_foo(self):
>        ...
> 
>      def set_foo(self, x):
>        ...
> 
> This has the advantage that the accessor methods can be
> overridden in subclasses with the expected effect.

This is a point I was going to bring up.

> The only wart is the necessity of mentioning the property
> name twice, once on the lhs and once as an argument.
> I haven't thought of a good solution to that, yet.

Have a look at my comment to Steven Bethard's recipe:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/408713

Tim Delaney

From guido at python.org  Mon Oct 17 05:06:23 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 16 Oct 2005 20:06:23 -0700
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <4353016A.1010707@canterbury.ac.nz>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
Message-ID: <ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>

[Guido]
> > Nick, and everybody else trying to find a "solution" for this
> > "problem", please don't.

[Greg Ewing]
> Denying that there's a problem isn't going to make it
> go away. Many people, including me, have the feeling that
> the standard way of defining properties at the moment leaves
> something to be desired, for all the same reasons that have
> led to @-decorators.

My challenge to many people, including you, is to make that feeling
more concrete. Sometimes when you have such a feeling it just means
you haven't drunk the kool-aid yet. :)

With decorators there was a concrete issue: the modifier trailed after
the function body, in a real sense "hiding" from the reader. I don't
see such an issue with properties. Certainly the proposed solutions so
far are worse than the problem.

> However, I agree that trying to keep the accessor method
> names out of the class namespace isn't necessary, and may
> not even be desirable. The way I'm defining properties in
> PyGUI at the moment looks like this:
>
>    class C:
>
>      foo = overridable_property('foo', "The foo property")
>
>      def get_foo(self):
>        ...
>
>      def set_foo(self, x):
>        ...
>
> This has the advantage that the accessor methods can be
> overridden in subclasses with the expected effect. This
> is particularly important in PyGUI, where I have a generic
> class definition which establishes the valid properties
> and their docstrings, and implementation subclasses for
> different platforms which supply the accessor methods.

But since you define the API, are you sure that you need properties at
all? Maybe the users would be happy to write widget.get_foo() and
widget.set_foo(x) instead of widget.foo or widget.foo = x?

> The only wart is the necessity of mentioning the property
> name twice, once on the lhs and once as an argument.
> I haven't thought of a good solution to that, yet.

To which Tim Delaney responded, "have a look at my response here:"
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/408713

I looked at that, and now I believe it's actually *better* to mention
the property name twice, at least compared to Tim's approach. Looking
at that version, I think it's obscuring the semantics; it (ab)uses the
fact that a function's name is accessible through its __name__
attribute. But (unlike Greg's version) it breaks down when one of the
arguments is not a plain function. This makes it brittle in the
context of renaming operations, e.g.:

    getx = lambda self: 42
    def sety(self, value): self._y = value
    setx = sety
    x = LateBindingProperty(getx, setx)
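
To make the objection concrete, the sketch below shows only what __name__
reports for the two accessors in the example; it does not reproduce the
recipe. Any helper that resolves accessors through __name__ would look up
'<lambda>' and 'sety' here, not 'getx' and 'setx'.

```python
getx = lambda self: 42

def sety(self, value):
    self._y = value

setx = sety   # a second name for the same function object

print(getx.__name__)   # '<lambda>'
print(setx.__name__)   # 'sety', not 'setx'
```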

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nnorwitz at gmail.com  Mon Oct 17 05:07:37 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 16 Oct 2005 20:07:37 -0700
Subject: [Python-Dev] Guido v. Python, Round 1
Message-ID: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>

We all know Guido likes Python.  But the real question is do pythons like Guido?

  http://python.org/neal/

n

From jeremy at alum.mit.edu  Mon Oct 17 05:26:31 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Sun, 16 Oct 2005 23:26:31 -0400
Subject: [Python-Dev] AST branch merge status
In-Reply-To: <e8bf7a530510152230h7d0382ccx1ff7f1727877bd08@mail.gmail.com>
References: <e8bf7a530510152230h7d0382ccx1ff7f1727877bd08@mail.gmail.com>
Message-ID: <e8bf7a530510162026m22e29e91p70fc8c7a016ebeea@mail.gmail.com>

Real life interfered with the planned merge tonight.  I hope you'll
all excuse the delay and wait until tomorrow night.

Jeremy

On 10/16/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
> I just merged the head back to the AST branch for what I hope is the
> last time.  I plan to merge the branch to the head on Sunday evening.
> I'd appreciate it if folks could hold off on making changes on the
> trunk until that merge happens.
>
> If this is a non-trivial inconvenience for anyone, go ahead with the
> changes but send me mail to  make sure that I don't lose the changes
> when I do the merge.  Regardless, the compiler and Grammar are off
> limits.  I'll blow away any changes you make there <0.1 wink>.
>
> Jeremy
>
>

From steven.bethard at gmail.com  Mon Oct 17 06:21:31 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Sun, 16 Oct 2005 22:21:31 -0600
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <434E47EB.8090909@gmail.com>
References: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com>
	<434DD815.8070909@canterbury.ac.nz> <434E47EB.8090909@gmail.com>
Message-ID: <d11dcfba0510162121y14c50a41n653067c390eed3b6@mail.gmail.com>

Nick Coghlan wrote:
> Having module attribute access obey the descriptor protocol (__get__, __set__,
> __delete__) sounds like a pretty good option to me.
>
> It would even be pretty backwards compatible, as I'd be hardpressed to think
> why anyone would have a descriptor *instance* as a top-level object in a
> module (descriptor definition, yes, but not an instance).

Aren't all functions descriptors?

py> def baz():
...     print "Evaluating baz!"
...     return "Module attribute"
...
py> baz()
Evaluating baz!
'Module attribute'
py> baz.__get__(__import__(__name__), None)
<bound method ?.baz of <module '__main__' (built-in)>>
py> baz.__get__(__import__(__name__), None)()
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
TypeError: baz() takes no arguments (1 given)

How would your proposal change the invocation of module-level functions?
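
The same experiment can be repeated on Python 3, where the session above
would not run as written (print is a function there and unbound methods are
gone). This is a sketch; `mod` is a throwaway module object standing in for
the defining module.

```python
import types


def baz():
    return "Module attribute"


mod = types.ModuleType('demo')           # stand-in for the defining module
bound = baz.__get__(mod, type(mod))      # what descriptor-style lookup would yield

print(hasattr(baz, '__get__'))           # plain functions implement __get__
try:
    bound()                              # mod arrives as an unexpected argument
except TypeError as exc:
    print('TypeError:', exc)
```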

STeVe
--
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From tdelaney at avaya.com  Mon Oct 17 06:53:55 2005
From: tdelaney at avaya.com (Delaney, Timothy (Tim))
Date: Mon, 17 Oct 2005 14:53:55 +1000
Subject: [Python-Dev] Definining properties - a use case for
	classdecorators?
Message-ID: <2773CAC687FD5F4689F526998C7E4E5F4DB6E3@au3010avexu1.global.avaya.com>

Guido van Rossum wrote:

> To which Tim Delaney responded, "have a look at my response here:"
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/408713
> 
> I looked at that, and now I believe it's actually *better* to mention
> the property name twice, at least compared to Tim's approach.

I never said it was a *good* approach - just *an* approach ;)

Tim Delaney

From nnorwitz at gmail.com  Mon Oct 17 07:21:00 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Sun, 16 Oct 2005 22:21:00 -0700
Subject: [Python-Dev] problem with genexp
In-Reply-To: <ee2a432c0510102115m19581b97h79cc3df6e1dadd27@mail.gmail.com>
References: <ee2a432c0510102115m19581b97h79cc3df6e1dadd27@mail.gmail.com>
Message-ID: <ee2a432c0510162221g31fcc32dwb0ac3cbf16fcc89d@mail.gmail.com>

On 10/10/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> There's a problem with genexp's that I think really needs to get
> fixed.  See http://python.org/sf/1167751 the details are below.  This
> code:
>
> >>> foo(a = i for i in range(10))
>
> I agree with the bug report that the code should either raise a
> SyntaxError or do the right thing.

The change to Grammar/Grammar below seems to fix the problem and all
the tests pass.  Can anyone comment on whether this fix is
correct/appropriate?  Is there a better way to fix the problem?

-argument: [test '='] test [gen_for] # Really [keyword '='] test
+argument: test [gen_for] | test '=' test ['(' gen_for ')'] # Really [keyword '='] test
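
For comparison, a quick check of how a Python 3 interpreter treats the
three spellings. This is a sketch that says nothing about the compiler
under discussion in this thread; `compiles` is a hypothetical helper, and
`foo` is never called because the source is only compiled.

```python
def compiles(source):
    """Return True if source parses as an expression, False on SyntaxError."""
    try:
        compile(source, '<example>', 'eval')
    except SyntaxError:
        return False
    return True


print(compiles("foo(a = i for i in range(10))"))    # bare genexp as keyword value
print(compiles("foo(a=(i for i in range(10)))"))    # parenthesized genexp
print(compiles("foo(i for i in range(10))"))        # sole positional genexp
```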

n

From martin at v.loewis.de  Mon Oct 17 08:36:17 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 17 Oct 2005 08:36:17 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicit	conversions).
In-Reply-To: <8393fff0510161726i6fd5c798u38aa3875c8f6ac4d@mail.gmail.com>
References: <8393fff0510022309l5977e899led262cb3c1a7a805@mail.gmail.com>	<2mbr27f0th.fsf@starship.python.net>	<8393fff0510142350l81ba453md20cc47a445642ce@mail.gmail.com>	<diqd0a$o1q$1@sea.gmane.org>
	<8393fff0510161726i6fd5c798u38aa3875c8f6ac4d@mail.gmail.com>
Message-ID: <43534661.9050804@v.loewis.de>

Martin Blais wrote:
>>Yes. setdefaultencoding() is removed from sys by site.py. To get it
>>again you must reload sys.
> 
> 
> Thanks.

Actually, I should take the opportunity to advise people that
setdefaultencoding doesn't really work. With the default default
encoding, strings and Unicode objects hash equal when they are
equal. If you change the default encoding, this property
goes away (perhaps unless you change it to Latin-1). As a result,
dictionaries where you mix string and Unicode keys won't work:
you might not find a value for a string key when looking up
with a Unicode object, and vice versa.

Regards,
Martin

From murman at gmail.com  Mon Oct 17 09:09:12 2005
From: murman at gmail.com (Michael Urman)
Date: Mon, 17 Oct 2005 02:09:12 -0500
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <43525BFA.9090309@iinet.net.au>
References: <43525BFA.9090309@iinet.net.au>
Message-ID: <dcbbbb410510170009j971ad5bsc5c1e0d70638caf7@mail.gmail.com>

On 10/16/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> On and off, I've been looking for an elegant way to handle properties using
> decorators.

Why use decorators when a metaclass will already do the trick, and
save you a line? This doesn't necessarily get around Antoine's
complaint that it looks like self refers to the wrong type, but I'm
not convinced anyone would be confused.

class MetaProperty(type):
    def __new__(cls, name, bases, dct):
        if bases[0] is object: # allow us to create class Property
            return type.__new__(cls, name, bases, dct)
        return property(dct.get('get'), dct.get('set'),
                dct.get('delete'), dct.get('__doc__'))

    def __init__(cls, name, bases, dct):
        if bases[0] is object:
            return type.__init__(cls, name, bases, dct)

class Property(object):
    __metaclass__ = MetaProperty


class Test(object):
    class foo(Property):
        """The foo property"""
        def get(self): return self._foo
        def set(self, val): self._foo = val
        def delete(self): del self._foo

test = Test()
test.foo = 'Yay!'
assert test._foo == 'Yay!'
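
The code above relies on the Python 2 `__metaclass__` attribute. A sketch
of the same trick in Python 3 syntax follows; the only assumptions are the
`metaclass=` keyword and the use of an empty `bases` tuple to detect the
`Property` base class itself.

```python
class MetaProperty(type):
    def __new__(mcls, name, bases, dct):
        if not bases:  # building the Property base class itself
            return super().__new__(mcls, name, bases, dct)
        # For subclasses, hand back a property instead of a class.
        return property(dct.get('get'), dct.get('set'),
                        dct.get('delete'), dct.get('__doc__'))


class Property(metaclass=MetaProperty):
    pass


class Test:
    class foo(Property):
        """The foo property"""
        def get(self): return self._foo
        def set(self, val): self._foo = val
        def delete(self): del self._foo


test = Test()
test.foo = 'Yay!'
assert test._foo == 'Yay!'
```

No `__init__` is needed on the metaclass in this version: when `__new__`
returns an object that is not an instance of the metaclass, `__init__` is
not called.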

From seojiwon at gmail.com  Mon Oct 17 10:59:20 2005
From: seojiwon at gmail.com (Jiwon Seo)
Date: Mon, 17 Oct 2005 01:59:20 -0700
Subject: [Python-Dev] problem with genexp
In-Reply-To: <ee2a432c0510162221g31fcc32dwb0ac3cbf16fcc89d@mail.gmail.com>
References: <ee2a432c0510102115m19581b97h79cc3df6e1dadd27@mail.gmail.com>
	<ee2a432c0510162221g31fcc32dwb0ac3cbf16fcc89d@mail.gmail.com>
Message-ID: <b008462b0510170159i3d078a3do319c394a1f7d5fde@mail.gmail.com>

On 10/16/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> On 10/10/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> > There's a problem with genexp's that I think really needs to get
> > fixed.  See http://python.org/sf/1167751 the details are below.  This
> > code:
> >
> > >>> foo(a = i for i in range(10))
> >
> > I agree with the bug report that the code should either raise a
> > SyntaxError or do the right thing.
>
> The change to Grammar/Grammar below seems to fix the problem and all
> the tests pass.  Can anyone comment on whether this fix is
> correct/appropriate?  Is there a better way to fix the problem?
>
> -argument: [test '='] test [gen_for] # Really [keyword '='] test
> +argument: test [gen_for] | test '=' test ['(' gen_for ')'] # Really [keyword '='] test

The other option would be changes in the Python/compile.c (somewhat)
like following

diff -r2.352 compile.c
6356c6356,6362
-               if (TYPE(n) == argument && NCH(n) == 3) {
---
+               if (TYPE(n) == argument && NCH(n) == 4) {
+                       PyErr_SetString(PyExc_SyntaxError,
+                                                       "invalid syntax");
+                       symtable_error(st, n->n_lineno);
+                       return;
+               }
+               else if (TYPE(n) == argument && NCH(n) == 3) {

but IMO, changing the Grammar looks more obvious.

>
> n

From ncoghlan at gmail.com  Mon Oct 17 11:06:47 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 17 Oct 2005 19:06:47 +1000
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <dcbbbb410510170009j971ad5bsc5c1e0d70638caf7@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<dcbbbb410510170009j971ad5bsc5c1e0d70638caf7@mail.gmail.com>
Message-ID: <435369A7.1020601@gmail.com>

Michael Urman wrote:
> class Test(object):
>     class foo(Property):
>         """The foo property"""
>         def get(self): return self._foo
>         def set(self, val): self._foo = val
>         def delete(self): del self._foo
> 
> test = Test()
> test.foo = 'Yay!'
> assert test._foo == 'Yay!'

Thus proving once again, that metaclasses are the one true way to monkey with 
classes ;)

Cheers,
Nick.

P.S. I think I need an email program that disables the send button after 11 
pm. . .

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Mon Oct 17 11:32:33 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 17 Oct 2005 19:32:33 +1000
Subject: [Python-Dev] PEP 343 updated
In-Reply-To: <ca471dc20510160819n47bc5697w19c3bf939795e2d8@mail.gmail.com>
References: <43524BB4.7040808@iinet.net.au>
	<ca471dc20510160819n47bc5697w19c3bf939795e2d8@mail.gmail.com>
Message-ID: <43536FB1.7080505@gmail.com>

Guido van Rossum wrote:
> On 10/16/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> I hope you reverted the status to "Proposed"...

I hadn't, but I've now fixed that in CVS (both in the PEP and the PEP index), 
and added some text into the PEP saying why it was reverted to Draft.

> On the latter: I think it shouldn't; I don't like this kind of magic.
> I'll have to read it before I can comment on the rest.

I don't particularly like treating __with__ specially either, but I'm not sure 
I like the alternative.

The alternative is that we'd never be able to safely define a __with__ method 
directly on generators - the reason is that we would want a "def __with__" 
where the @context decorator has been forgotten to trigger a TypeError when it 
is used. If generator-iterators were to provide a context manager to 
automatically invoke close(), then leaving out "@context" would result in a 
very obscure bug (as g.close() would be used to finish the context, instead of 
g.next() or g.throw()).

On the other hand, if the context decorator is invoked automatically when a 
generator function is supplied to populate the __with__ slot, then using a 
generator to define a __with__ method will "just work", instead of "only works 
if you also apply the context decorator".

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Mon Oct 17 11:35:34 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 17 Oct 2005 19:35:34 +1000
Subject: [Python-Dev] PEP 343 updated
In-Reply-To: <000f01c5d265$f232c2f0$6402a8c0@arkdesktop>
References: <000f01c5d265$f232c2f0$6402a8c0@arkdesktop>
Message-ID: <43537066.8050009@gmail.com>

Andrew Koenig wrote:
>> PEP 343 has been updated on python.org.
> 
>> Highlights of the changes:
> 
>>    - changed the name of the PEP to be simply "The 'with' Statement"
> 
> Do you mean PEP 346, perchance?  PEP 343 is something else entirely.

No, I mean PEP 343 - it describes Guido's proposal for a "with" statement. The 
old name made perfect sense if you'd been part of the PEP 340 discussion, but 
was rather obscure otherwise.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Mon Oct 17 11:55:13 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 17 Oct 2005 19:55:13 +1000
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <d11dcfba0510162121y14c50a41n653067c390eed3b6@mail.gmail.com>
References: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com>	<434DD815.8070909@canterbury.ac.nz>
	<434E47EB.8090909@gmail.com>
	<d11dcfba0510162121y14c50a41n653067c390eed3b6@mail.gmail.com>
Message-ID: <43537501.6080501@gmail.com>

Steven Bethard wrote:
> Nick Coghlan wrote:
>> Having module attribute access obey the descriptor protocol (__get__, __set__,
>> __delete__) sounds like a pretty good option to me.
>>
>> It would even be pretty backwards compatible, as I'd be hardpressed to think
>> why anyone would have a descriptor *instance* as a top-level object in a
>> module (descriptor definition, yes, but not an instance).
> 
> Aren't all functions descriptors?

So Josh pointed out.

> py> def baz():
> ...     print "Evaluating baz!"
> ...     return "Module attribute"
> ...
> py> baz()
> Evaluating baz!
> 'Module attribute'
> py> baz.__get__(__import__(__name__), None)
> <bound method ?.baz of <module '__main__' (built-in)>>
> py> baz.__get__(__import__(__name__), None)()
> Traceback (most recent call last):
>   File "<interactive input>", line 1, in ?
> TypeError: baz() takes no arguments (1 given)
> 
> How would your proposal change the invocation of module-level functions?

It would, alas, break it. And now that I think about it, functions have to 
work the way they do, otherwise binding an arbitrary function to a class 
variable wouldn't work properly.

So the class descriptor protocol can't be used as is at the module level, 
because functions are descriptor instances.
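
The class-level behaviour referred to here can be seen in a few lines. This
is a sketch with illustrative names (`describe`, `C`): a plain function
assigned to a class variable becomes a method only because attribute lookup
on the instance calls the function's __get__.

```python
def describe(self):
    return 'instance of ' + type(self).__name__


class C:
    pass


C.describe = describe                        # bind an arbitrary function to a class variable
print(C().describe())                        # __get__ supplies the instance as 'self'
print(C.__dict__['describe'] is describe)    # the stored object is still the plain function
```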

Ah well, another idea runs aground on the harsh rocks of reality.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From steve at holdenweb.com  Mon Oct 17 13:55:00 2005
From: steve at holdenweb.com (Steve Holden)
Date: Mon, 17 Oct 2005 12:55:00 +0100
Subject: [Python-Dev] Guido v. Python, Round 1
In-Reply-To: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>
References: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>
Message-ID: <dj03ei$q5$1@sea.gmane.org>

Neal Norwitz wrote:
> We all know Guido likes Python.  But the real question is do pythons like Guido?
> 
>   http://python.org/neal/
> 
Neal:

Getting a 404 on this one right now.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/


From phd at mail2.phd.pp.ru  Mon Oct 17 14:14:54 2005
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Mon, 17 Oct 2005 16:14:54 +0400
Subject: [Python-Dev] Guido v. Python, Round 1
In-Reply-To: <dj03ei$q5$1@sea.gmane.org>
References: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>
	<dj03ei$q5$1@sea.gmane.org>
Message-ID: <20051017121454.GA1213@phd.pp.ru>

On Mon, Oct 17, 2005 at 12:55:00PM +0100, Steve Holden wrote:
> >   http://python.org/neal/
> > 
> Getting a 404 on this one right now.

   No problems here, very nice photos! :)

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From jimjjewett at gmail.com  Mon Oct 17 17:19:22 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 17 Oct 2005 11:19:22 -0400
Subject: [Python-Dev]  PEP 3000 and exec
Message-ID: <fb6fbf560510170819w2c596c71x9da5100399d17c65@mail.gmail.com>

Guido van Rossum wrote:

> Another idea might be to change the exec() spec so that you are
> required to pass in a namespace (and you can't use locals() either!).
> Then the whole point becomes moot.

I think of exec as having two major uses:

(1)  A run-time compiler
(2)  A way to change the local namespace, based on run-time
information (such as a config file).

By turning exec into a function with its own namespace (and
enforcing a readonly locals()), the second use is eliminated.

Is this intentional for security/style/efficiency/predictability?

If so, could exec/eval at least

(1)  Be treatable as nested functions, so that they can *read* the
current namespace.
(2)  Grow a return value, so that they can more easily pass
information back to at least a (tuple of) known variable name(s).

-jJ

From guido at python.org  Mon Oct 17 17:40:59 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Oct 2005 08:40:59 -0700
Subject: [Python-Dev] Autoloading? (Making Queue.Queue easier to use)
In-Reply-To: <43537501.6080501@gmail.com>
References: <20051012072528.68vls5q6143k08c0@login.werra.lunarpages.com>
	<434DD815.8070909@canterbury.ac.nz> <434E47EB.8090909@gmail.com>
	<d11dcfba0510162121y14c50a41n653067c390eed3b6@mail.gmail.com>
	<43537501.6080501@gmail.com>
Message-ID: <ca471dc20510170840v664f7cfmd5a668149b63f29a@mail.gmail.com>

On 10/17/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Ah well, another idea runs aground on the harsh rocks of reality.

I should point out that it's intentional that there are very few
similarities between modules and classes. Many attempts have been made
to unify the two, but these never work right, because the module can't
decide whether it behaves like a class or like an instance. Also the
direct access to global variables prevents you from putting any kind of
code in the get-attribute path.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Oct 17 17:49:44 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Oct 2005 08:49:44 -0700
Subject: [Python-Dev] PEP 3000 and exec
In-Reply-To: <fb6fbf560510170819w2c596c71x9da5100399d17c65@mail.gmail.com>
References: <fb6fbf560510170819w2c596c71x9da5100399d17c65@mail.gmail.com>
Message-ID: <ca471dc20510170849o564bc73bg6dddc8ced5176cf2@mail.gmail.com>

On 10/17/05, Jim Jewett <jimjjewett at gmail.com> wrote:
> Guido van Rossum wrote:
>
> > Another idea might be to change the exec() spec so that you are
> > required to pass in a namespace (and you can't use locals() either!).
> > Then the whole point becomes moot.
>
> I think of exec as having two major uses:
>
> (1)  A run-time compiler
> (2)  A way to change the local namespace, based on run-time
> information (such as a config file).
>
> By turning exec into a function with its own namespace (and
> enforcing a readonly locals()), the second use is eliminated.
>
> Is this intentional for security/style/efficiency/predictability?

Yes, there are lots of problems with (2); both the human reader and
the compiler often don't quite know what the intended effect is.

> If so, could exec/eval at least
>
> (1)  Be treatable as nested functions, so that they can *read* the
> current namespace.

There will be a way to get the current namespace (similar to locals()
but without its bugs). But it's probably better to create an empty
namespace and explicitly copy into it only those things that you are
willing to expose to the exec'ed code (or the things it needs).

> (2)  Grow a return value, so that they can more easily pass
> information back to at least a (tuple of) known variable name(s).

You can easily build that functionality yourself; after running
exec(), you just pick certain things out of the namespace that you
expect it to create.
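
A minimal sketch of that approach, assuming the function form of exec()
as it later shipped in Python 3. The helper name run_snippet and the
variable names are hypothetical:

```python
def run_snippet(source, exposed):
    """Run source in a fresh namespace seeded only with `exposed`."""
    namespace = dict(exposed)  # copy, so the caller's dict is not mutated
    exec(source, namespace)    # exec also adds __builtins__ to namespace
    return namespace


inputs = {"width": 3, "height": 4}
ns = run_snippet("area = width * height", inputs)

# "Return value": pick out the names the code was expected to create.
area = ns["area"]
```

The caller decides both what the exec'ed code may see and which results
it reads back, so nothing leaks into or out of the local scope by
accident.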

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at pobox.com  Mon Oct 17 18:06:36 2005
From: skip at pobox.com (skip@pobox.com)
Date: Mon, 17 Oct 2005 11:06:36 -0500
Subject: [Python-Dev] Guido v. Python, Round 1
In-Reply-To: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>
References: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>
Message-ID: <17235.52236.458715.854015@montanaro.dyndns.org>


    Neal> We all know Guido likes Python.  But the real question is do
    Neal> pythons like Guido? 

    Neal>   http://python.org/neal/

Like Steve (and unlike Oleg), I get 404s for this page.  I also tried
"www.python.org" and "~neal".

Skip


From steve at holdenweb.com  Mon Oct 17 18:27:52 2005
From: steve at holdenweb.com (Steve Holden)
Date: Mon, 17 Oct 2005 17:27:52 +0100
Subject: [Python-Dev] Guido v. Python, Round 1
In-Reply-To: <17235.52236.458715.854015@montanaro.dyndns.org>
References: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>
	<17235.52236.458715.854015@montanaro.dyndns.org>
Message-ID: <4353D108.1060107@holdenweb.com>

skip at pobox.com wrote:
>     Neal> We all know Guido likes Python.  But the real question is do
>     Neal> pythons like Guido? 
> 
>     Neal>   http://python.org/neal/
> 
> Like Steve (and unlike Oleg), I get 404s for this page.  I also tried
> "www.python.org" and "~neal".
> 
This appears to be a DNS issue: the stuff is on creosote, 
213.84.134.214, not www or dinsdale.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/

From tim.peters at gmail.com  Mon Oct 17 18:34:03 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 17 Oct 2005 12:34:03 -0400
Subject: [Python-Dev] Guido v. Python, Round 1
In-Reply-To: <17235.52236.458715.854015@montanaro.dyndns.org>
References: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>
	<17235.52236.458715.854015@montanaro.dyndns.org>
Message-ID: <1f7befae0510170934h21d072dbn2037d634af7d17f@mail.gmail.com>

[Skip]
> Like Steve (and unlike Oleg), I get 404s for this page.  I also tried
> "www.python.org" and "~neal".

The original

    http://python.org/neal/

worked fine for me, and still does.  OTOH,

     http://www.python.org/neal/

gets a 404, and (the original without the trailing slash)

    http://python.org/neal

"changes itself" <wink> to the 404 on <http://www.python.org/neal/>.

From arigo at tunes.org  Mon Oct 17 18:52:09 2005
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 17 Oct 2005 18:52:09 +0200
Subject: [Python-Dev] AST branch update
In-Reply-To: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
References: <e8bf7a530510131352xb123101uba24367a8dbe09b2@mail.gmail.com>
Message-ID: <20051017165209.GA14358@code1.codespeak.net>

Hi Jeremy,

On Thu, Oct 13, 2005 at 04:52:14PM -0400, Jeremy Hylton wrote:
> I don't think the current test suite covers all of the possible syntax
> errors that can be raised.  I'd like to add a new test suite that
> covers all of the remaining cases, perhaps moving some existing tests
> into this module as well.

You might be interested in PyPy's test suite here.  In particular,
http://codespeak.net/svn/pypy/dist/pypy/interpreter/test/test_syntax.py
contains a list of syntactically valid and invalid corner cases.

If you are willing to check out the whole of PyPy (i.e.
http://codespeak.net/svn/pypy/dist) you should also be able to run the
whole test suite, or at least the following tests:

   python test_all.py pypy/interpreter/test/test_compiler.py
   python test_all.py pypy/interpreter/pyparser/

which compare CPython's builtin compiler with our own compilers; as of
PyPy revision 18722 these tests pass on all CPython versions (2.3.5,
2.4.2, HEAD).


A bientot,

Armin.

From jimjjewett at gmail.com  Mon Oct 17 19:06:20 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 17 Oct 2005 13:06:20 -0400
Subject: [Python-Dev] PEP 3000 and exec
Message-ID: <fb6fbf560510171006h64367b51t856931a164055e2a@mail.gmail.com>

For communicating with an exec/eval child, once exec
cannot run in the current namespace, I asked that it be
possible to pass a read-only "current view" and to see
a return value.

(Guido):
>... it's probably better to create an empty namespace and
> explicitly copy into it ...

> ... just pick certain things out of the namespace [afterwards]

Yes and no.

If the exec'ed code is well defined (and it needs to be if
security is a concern), then that works well.

For more exploratory code, it can be hard to know in
advance what the code will need, or to agree on the
names of return variables.

The simplest general API that I can come up with is

"You're allowed to see anything I can" (even if it is
in a nested scope or base class, and I realize that
you *probably* won't need it).

"Return value is whatever you explicitly choose to
return"  (Lisp's "last result" might be even simpler,
but would probably lead to confusion other places.)

-jJ

From nnorwitz at gmail.com  Mon Oct 17 19:11:43 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Mon, 17 Oct 2005 10:11:43 -0700
Subject: [Python-Dev] Guido v. Python, Round 1
In-Reply-To: <1f7befae0510170934h21d072dbn2037d634af7d17f@mail.gmail.com>
References: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>
	<17235.52236.458715.854015@montanaro.dyndns.org>
	<1f7befae0510170934h21d072dbn2037d634af7d17f@mail.gmail.com>
Message-ID: <ee2a432c0510171011i5cae4d9axd0f17319afa00e45@mail.gmail.com>

On 10/17/05, Tim Peters <tim.peters at gmail.com> wrote:
>
> [Skip]
> > Like Steve (and unlike Oleg), I get 404s for this page. I also tried
> > "www.python.org" and "~neal".
>
> The original
>
> http://python.org/neal/
>
> worked fine for me, and still does. OTOH,
>
> http://www.python.org/neal/
>
> gets a 404, and (the original without the trailing slash)
>

Yup, as most people already pointed out, I only put this up on creosote and
should have added that to the URL. I don't have an account on dinsdale and
can't copy stuff up there AFAIK. This URL should work for a while longer.

http://creosote.python.org/neal/

n

From tonynelson at georgeanelson.com  Sun Oct 16 18:33:25 2005
From: tonynelson at georgeanelson.com (Tony Nelson)
Date: Sun, 16 Oct 2005 12:33:25 -0400
Subject: [Python-Dev] Unicode charmap decoders slow
In-Reply-To: <435223B1.2020209@v.loewis.de>
References: <v04020a02bf774089e195@[192.168.123.162]>
	<v04020a03bf72162a051d@[192.168.123.162]>
	<v04020a03bf72162a051d@[192.168.123.162]>
	<v04020a02bf774089e195@[192.168.123.162]>
Message-ID: <v04020a02bf7823813cdb@[192.168.123.162]>

At 11:56 AM +0200 10/16/05, Martin v. Löwis wrote:
>Tony Nelson wrote:
>> BTW, Martin, if you care to, would you explain to me how a Trie would be
>> used for charmap encoding?  I know a couple of approaches, but I don't know
>> how to do it fast.  (I've never actually had the occasion to use a Trie.)
>
>I currently envision a three-level trie, with 5, 4, and 7 bits. You take
>the Unicode character (only characters below U+FFFF supported), and take
>the uppermost 5 bits, as index in an array. There you find the base
>of a second array, to which you add the next 4 bits, which gives you an
>index into a third array, where you add the last 7 bits. This gives
>you the character, or 0 if it is unmappable.

Umm, 0 (NUL) is a valid output character in most of the 8-bit character
sets.  It could be handled by having a separate "exceptions" string of the
unicode code points that actually map to the exception char.  Usually
"exceptions" would be a string of length 1.  Suggested changes below.


>struct encoding_map{
>   unsigned char level0[32];
>   unsigned char *level1;
>   unsigned char *level2;

    Py_UNICODE *exceptions;

>};
>
>struct encoding_map *table;
>Py_UNICODE character;
>int level1 = table->level0[character>>11];
>if(level1==0xFF)raise unmapped;
>int level2 = table->level1[16*level1 + ((character>>7) & 0xF)];
>if(level2==0xFF)raise unmapped;
>int mapped = table->level2[128*level2 + (character & 0x7F)];

change:

>if(mapped==0)raise unmapped;

to:

if(mapped==0) {
    Py_UNICODE *ep;
    for(ep=table->exceptions; *ep; ep++)
        if(*ep==character)
            break;
    if(!*ep)raise unmapped;
}


>Over a hashtable, this has the advantage of not having to deal with
>collisions. Instead, it guarantees you a lookup in a constant time.

OK, I see the benefit.  Your code is about the same amount of work as the
hash table lookup in instructions, indirections, and branches, normally
uses less of the data cache, and has a fixed running time.  It may use one
more branch, but its branches are easily predicted.  Thank you for
explaining it.
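
A rough Python sketch of the lookup described above, including the
suggested "exceptions" check. Python lists and a set stand in for the C
arrays and the exceptions string; the names build_tables and encode_char
are illustrative assumptions, and the table-building step is not part of
the original proposal:

```python
UNMAPPED = 0xFF  # sentinel meaning "no block allocated" in level0/level1


def build_tables(encoding_map):
    """Build 5/4/7-bit lookup tables from a {code point: byte} dict.

    Block numbers must stay below 0xFF, as in the byte-indexed C tables.
    """
    level0 = [UNMAPPED] * 32
    level1 = []          # grows in blocks of 16 entries
    level2 = []          # grows in blocks of 128 entries
    exceptions = set()   # code points that genuinely encode to byte 0
    for cp, byte in encoding_map.items():
        top, mid, low = cp >> 11, (cp >> 7) & 0xF, cp & 0x7F
        if level0[top] == UNMAPPED:
            level0[top] = len(level1) // 16
            level1.extend([UNMAPPED] * 16)
        slot = 16 * level0[top] + mid
        if level1[slot] == UNMAPPED:
            level1[slot] = len(level2) // 128
            level2.extend([0] * 128)
        level2[128 * level1[slot] + low] = byte
        if byte == 0:
            exceptions.add(cp)
    return level0, level1, level2, exceptions


def encode_char(tables, cp):
    """Return the byte for code point cp, or raise LookupError."""
    level0, level1, level2, exceptions = tables
    if cp > 0xFFFF:
        raise LookupError("outside the range covered by the tables")
    block1 = level0[cp >> 11]
    if block1 == UNMAPPED:
        raise LookupError("unmapped")
    block2 = level1[16 * block1 + ((cp >> 7) & 0xF)]
    if block2 == UNMAPPED:
        raise LookupError("unmapped")
    mapped = level2[128 * block2 + (cp & 0x7F)]
    if mapped == 0 and cp not in exceptions:
        raise LookupError("unmapped")
    return mapped


# A Latin-1-like identity map: one level1 block, two level2 blocks.
latin1 = build_tables({cp: cp for cp in range(256)})
```

Every lookup performs the same three index operations, whichever
character is being encoded; only a result of 0 takes the extra
exceptions check.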


>It is also quite space-efficient: all tables use bytes as indices.
>As each level0 entry deals with 2048 characters, most character maps
>will only use 1 or two level1 blocks, meaning 16 or 32 bytes
>for level1. The number of level2 blocks required depends on
>the number of 128-character rows which the encoding spans;
>for most encodings, 3 or four such blocks will be sufficient
>(with ASCII spanning one such block typically), causing the
>entire memory consumption for many encodings to be less than
>600 bytes.
 ...

As you are concerned about pathological cases for hashing (that would make
the hash chains long), it is worth noting that in such cases this data
structure could take 64K bytes.  Of course, no such case occurs in standard
encodings, and 64K is affordable as long as it is rare.
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson at georgeanelson.com>
      '                              <http://www.georgeanelson.com/>

From skip at pobox.com  Mon Oct 17 20:14:00 2005
From: skip at pobox.com (skip@pobox.com)
Date: Mon, 17 Oct 2005 13:14:00 -0500
Subject: [Python-Dev] Guido v. Python, Round 1
In-Reply-To: <ee2a432c0510171011i5cae4d9axd0f17319afa00e45@mail.gmail.com>
References: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>
	<17235.52236.458715.854015@montanaro.dyndns.org>
	<1f7befae0510170934h21d072dbn2037d634af7d17f@mail.gmail.com>
	<ee2a432c0510171011i5cae4d9axd0f17319afa00e45@mail.gmail.com>
Message-ID: <17235.59880.819873.541201@montanaro.dyndns.org>


    Neal> This URL should work for a while longer.

    Neal> http://creosote.python.org/neal/

Ah, the vagaries of URL redirection.  Thanks...

Skip

From greg.ewing at canterbury.ac.nz  Tue Oct 18 02:42:50 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 18 Oct 2005 13:42:50 +1300
Subject: [Python-Dev] Guido v. Python, Round 1
In-Reply-To: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>
References: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>
Message-ID: <4354450A.2020400@canterbury.ac.nz>

Neal Norwitz wrote:
> We all know Guido likes Python.  But the real question is do pythons like Guido?
> 
>   http://python.org/neal/

??? I get a 404 for this.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Tue Oct 18 03:15:45 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 18 Oct 2005 14:15:45 +1300
Subject: [Python-Dev] Definining properties - a use case for class
 decorators?
In-Reply-To: <ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
Message-ID: <43544CC1.5050204@canterbury.ac.nz>

Guido van Rossum wrote:

> With decorators there was a concrete issue: the modifier trailed after
> the function body, in a real sense "hiding" from the reader.

A similar thing happens with properties, the property
definition (which is the public interface) trailing
after the accessor methods (which are an implementation
detail).

 > Certainly the proposed solutions so far are worse than
 > the problem.

I agree with that (except for mine, of course :-).

I still feel that the ultimate solution lies in some form
of syntactic support, although I haven't decided what
yet.

> But since you define the API, are you sure that you need properties at
> all? Maybe the users would be happy to write widget.get_foo() and
> widget.set_foo(x) instead of widget.foo or widget.foo = x?

I'm one of my main users, and I wouldn't be happy with
it. I *have* thought about this quite carefully. An early
version of the PyGUI API (predating properties) did things
that way, and people complained. After re-doing it with
properties, and getting some experience using the result,
I'm convinced that properties are the way to go for this
particular application.

> To which Tim Delaney responded, "have a look at my response here:"
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/408713
> 
> I looked at that, and now I believe it's actually *better* to mention
> the property name twice, at least compared to Tim's approach.

I'm inclined to agree. Passing functions that you're not
going to use as functions but just use the name of doesn't
seem right.

And in my version, it's not *really* redundant, since the
name is only used to derive the names of the accessor methods.
It doesn't *have* to be the same as the property name, although
using anything else could justifiably be regarded as insane...

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From guido at python.org  Tue Oct 18 03:55:48 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Oct 2005 18:55:48 -0700
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <43544CC1.5050204@canterbury.ac.nz>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
Message-ID: <ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>

[Guido]
> > I looked at that, and now I believe it's actually *better* to mention
> > the property name twice, at least compared to Tim's approach.

[Greg Ewing]
> I'm inclined to agree. Passing functions that you're not
> going to use as functions but just use the name of doesn't
> seem right.
>
> And in my version, it's not *really* redundant, since the
> name is only used to derive the names of the accessor methods.
> It doesn't *have* to be the same as the property name, although
> using anything else could justifiably be regarded as insane...

OK, so how's this for a radical proposal.

Let's change the property built-in so that its arguments can be either
functions or strings (or None). If they are functions or None, it
behaves exactly like it always has.

If an argument is a string, it should be a method name, and the method
is looked up by that name each time the property is used. Because this
is late binding, it can be put before the method definitions, and a
subclass can override the methods. Example:

class C:

    foo = property('getFoo', 'setFoo', None, 'the foo property')

    def getFoo(self):
        return self._foo

    def setFoo(self, foo):
        self._foo = foo

What do you think?
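
For illustration only, the proposed behaviour can be approximated today
with a small descriptor. This is a sketch in Python 3 syntax, not a
patch to the built-in; the class name named_property and the subclass D
are hypothetical:

```python
class named_property:
    """Property-like descriptor whose accessors may be method names."""

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget, self.fset, self.fdel = fget, fset, fdel
        self.__doc__ = doc

    @staticmethod
    def _call(obj, accessor, *args):
        # A string is looked up on the instance at each access (late
        # binding); a function is called directly, as property() does.
        if isinstance(accessor, str):
            return getattr(obj, accessor)(*args)
        return accessor(obj, *args)

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self._call(obj, self.fget)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self._call(obj, self.fset, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self._call(obj, self.fdel)


class C:
    foo = named_property('getFoo', 'setFoo', None, 'the foo property')

    def getFoo(self):
        return self._foo

    def setFoo(self, foo):
        self._foo = foo


class D(C):
    def getFoo(self):  # overrides the getter without repeating the property
        return 2 * self._foo


c = C()
c.foo = 21
d = D()
d.foo = 21
```

Here c.foo reads back 21 while d.foo reads back 42, because the getter
name is resolved on the instance at access time.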

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Tue Oct 18 04:08:12 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 18 Oct 2005 15:08:12 +1300
Subject: [Python-Dev] Definining properties - a use case for class
 decorators?
In-Reply-To: <ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
Message-ID: <4354590C.5070708@canterbury.ac.nz>

Guido van Rossum wrote:

> Let's change the property built-in so that its arguments can be either
> functions or strings (or None).
> 
> If an argument is a string, it should be a method name, and the method
> is looked up by that name each time the property is used.

That sounds reasonable.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From barry at python.org  Tue Oct 18 04:08:48 2005
From: barry at python.org (Barry Warsaw)
Date: Mon, 17 Oct 2005 22:08:48 -0400
Subject: [Python-Dev] Definining properties - a use case for
	class	decorators?
In-Reply-To: <ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
Message-ID: <1129601328.9405.13.camel@geddy.wooz.org>

On Mon, 2005-10-17 at 21:55, Guido van Rossum wrote:

> Let's change the property built-in so that its arguments can be either
> functions or strings (or None). If they are functions or None, it
> behaves exactly like it always has.
> 
> If an argument is a string, it should be a method name, and the method
> is looked up by that name each time the property is used. Because this
> is late binding, it can be put before the method definitions, and a
> subclass can override the methods. Example:
> 
> class C:
> 
>     foo = property('getFoo', 'setFoo', None, 'the foo property')
> 
>     def getFoo(self):
>         return self._foo
> 
>     def setFoo(self, foo):
>         self._foo = foo
> 
> What do you think?

Ick, for all the reasons that strings are less appealing than names. 
IMO, there's not enough advantage in having the property() call before
the functions than after.

-Barry


From greg.ewing at canterbury.ac.nz  Tue Oct 18 04:15:43 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 18 Oct 2005 15:15:43 +1300
Subject: [Python-Dev] Definining properties - a use case for class
 decorators?
In-Reply-To: <1129601328.9405.13.camel@geddy.wooz.org>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
Message-ID: <43545ACF.7060004@canterbury.ac.nz>

Barry Warsaw wrote:

> Ick, for all the reasons that strings are less appealing than names. 
> IMO, there's not enough advantage in having the property() call before
> the functions than after.

That's not the only benefit - you also get overridability
of the accessor methods.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From guido at python.org  Tue Oct 18 04:24:36 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Oct 2005 19:24:36 -0700
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <1129601328.9405.13.camel@geddy.wooz.org>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
Message-ID: <ca471dc20510171924q2b7735cby77e3c6fe083523b9@mail.gmail.com>

[Guido]
> > Let's change the property built-in so that its arguments can be either
> > functions or strings (or None). If they are functions or None, it
> > behaves exactly like it always has.
> >
> > If an argument is a string, it should be a method name, and the method
> > is looked up by that name each time the property is used. Because this
> > is late binding, it can be put before the method definitions, and a
> > subclass can override the methods. Example:
> >
> > class C:
> >
> >     foo = property('getFoo', 'setFoo', None, 'the foo property')
> >
> >     def getFoo(self):
> >         return self._foo
> >
> >     def setFoo(self, foo):
> >         self._foo = foo
> >
> > What do you think?

[Barry]
> Ick, for all the reasons that strings are less appealing than names.

I usually wholeheartedly agree with that argument, but here I don't
see an alternative.

> IMO, there's not enough advantage in having the property() call before
> the functions than after.

Maybe you didn't see the use case that Greg had in mind? He wants to
be able to override the getter and/or setter in a subclass, without
changing the docstring or having to repeat the property() call. That
requires us to do a late binding lookup based on a string.

Tim Delaney had a different solution where you would pass in the
functions but all it did was use their __name__ attribute to look up
the real function at runtime. The problem with that is that the
__name__ attribute may not be what you expect, as it may not
correspond to the name of the object passed in. Example:

class C:
    def getx(self): ...something...
    gety = getx
    y = property(gety)

class D(C):
    def gety(self): ...something else...

Here, the intention is clearly to override the way the property's
value is computed, but it doesn't work right -- gety.__name__ is
'getx', and D doesn't override getx, so D().y calls C.getx() instead
of D.gety().
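
A runnable version of that example (Python 3 syntax, with placeholder
return values standing in for the elided method bodies) shows the
mismatch directly:

```python
class C:
    def getx(self):
        return "C.getx"

    gety = getx  # alias: the same function object, so __name__ stays 'getx'


class D(C):
    def gety(self):
        return "D.gety"


name = C.gety.__name__           # the name a __name__-based lookup would use
by_name = getattr(D(), name)()   # resolves to C.getx, not D.gety
direct = D().gety()              # what the author of D intended
```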

If you can think of a solution that looks better than mine, you're a genius.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From barry at python.org  Tue Oct 18 05:10:50 2005
From: barry at python.org (Barry Warsaw)
Date: Mon, 17 Oct 2005 23:10:50 -0400
Subject: [Python-Dev] Definining properties - a use case for
	class	decorators?
In-Reply-To: <ca471dc20510171924q2b7735cby77e3c6fe083523b9@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<ca471dc20510171924q2b7735cby77e3c6fe083523b9@mail.gmail.com>
Message-ID: <1129605050.9405.29.camel@geddy.wooz.org>

On Mon, 2005-10-17 at 22:24, Guido van Rossum wrote:

> > IMO, there's not enough advantage in having the property() call before
> > the functions than after.
> 
> Maybe you didn't see the use case that Greg had in mind? He wants to
> be able to override the getter and/or setter in a subclass, without
> changing the docstring or having to repeat the property() call. That
> requires us to do a late binding lookup based on a string.

True, I missed that use case.  But can't you already support
override-ability just by refactoring the getter and setter into separate
methods?  IOW, the getter and setter aren't overridden, but they call
other methods that implement the core functionality and that /are/
overridden.  Okay, that means a few extra methods per property, but that
still doesn't seem too bad.

> If you can think of a solution that looks better than mine, you're a genius.

Oh, I know that's not the case, but it's such a tempting challenge, I'll
try anyway :).

class A(object):
    def __init__(self):
        self._x = 0

    def set_x(self, x):
        self._set_x(x)

    def _set_x(self, x):
        print 'A._set_x()'
        self._x = x

    def get_x(self):
        return self._get_x()

    def _get_x(self):
        print 'A._get_x()'
        return self._x

    x = property(get_x, set_x)


class B(A):
    def _set_x(self, x):
        print 'B._set_x()'
        super(B, self)._set_x(x)

    def _get_x(self):
        print 'B._get_x()'
        return super(B, self)._get_x()


a = A()
b = B()
a.x = 7
b.x = 9
print a.x
print b.x

Basically A.get_x() and A.set_x() are just wrappers to make the property
machinery work the way you want.

-Barry


From guido at python.org  Tue Oct 18 05:46:47 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Oct 2005 20:46:47 -0700
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <1129605050.9405.29.camel@geddy.wooz.org>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<ca471dc20510171924q2b7735cby77e3c6fe083523b9@mail.gmail.com>
	<1129605050.9405.29.camel@geddy.wooz.org>
Message-ID: <ca471dc20510172046h19eebb5ese9727e2a5bd8a188@mail.gmail.com>

[Barry]
> > > IMO, there's not enough advantage in having the property() call before
> > > the functions than after.

[Guido]
> > Maybe you didn't see the use case that Greg had in mind? He wants to
> > be able to override the getter and/or setter in a subclass, without
> > changing the docstring or having to repeat the property() call. That
> > requires us to do a late binding lookup based on a string.

[Barry]
> True, I missed that use case.  But can't you already support
> override-ability just by refactoring the getter and setter into separate
> methods?  IOW, the getter and setter aren't overridden, but they call
> other methods that implement the core functionality and that /are/
> overridden.  Okay, that means a few extra methods per property, but that
> still doesn't seem too bad.
>
> > If you can think of a solution that looks better than mine, you're a genius.
>
> Oh, I know that's not the case, but it's such a tempting challenge, I'll
> try anyway :).
[...]

Nice try. I guess it's similar to this, which is a bit more concise
and doesn't require as many underscores:

class B(object):
    def get_x(self): return self._x
    def set_x(self, x): self._x = x
    x = property(lambda self: self.get_x(), lambda self, x: self.set_x(x))

But I still like the version with strings better:

    x = property('get_x', 'set_x')

This trades two lambdas for two pairs of string quotes; a good deal IMO!
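
For readers who want to experiment, here is a minimal sketch of how the
string form could be emulated with a small descriptor. The name
late_property and the example classes are hypothetical (the builtin
property() does not accept strings), and the code is written for
Python 3 rather than the Python 2 used elsewhere in this thread:

```python
class late_property(object):
    """Hypothetical stand-in for a property() that accepts method names."""

    def __init__(self, fget=None, fset=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.__doc__ = doc

    def _call(self, obj, accessor, *args):
        # A string is looked up on the instance at access time (late
        # binding), so subclass overrides are found; a callable is used
        # directly, as with the builtin property.
        if isinstance(accessor, str):
            return getattr(obj, accessor)(*args)
        return accessor(obj, *args)

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self._call(obj, self.fget)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self._call(obj, self.fset, value)


class Base(object):
    # The declaration can precede the methods it names.
    x = late_property('get_x', 'set_x')

    def get_x(self):
        return self._x

    def set_x(self, x):
        self._x = x


class Doubling(Base):
    # Overrides only the getter; the property is not redeclared.
    def get_x(self):
        return 2 * self._x
```

Because the lookup happens on each access, Doubling picks up its own
get_x() without touching the declaration of x in Base.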

Now, if I were to follow Paul Graham's recommendations strictly
(http://www.paulgraham.com/diff.html), point 7 says that Python should
have a symbol type. I've always maintained that this is unnecessary
and that we can just as well use regular strings. This makes it easy
to construct names on the fly that you pass to getattr() and
setattr() using standard string operations. Suppose the symbol type
were written as \foo (meaning a quoted reference to the identifier
'foo'). Then the above could be written like this:

    x = property(\get_x, \set_x)

But I'm not sure this buys us anything, so I still believe that using
'set_x' and 'get_x' is just fine here. Greg Ewing, whose taste in
language features is hard to beat, seems to agree.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From aahz at pythoncraft.com  Tue Oct 18 06:11:12 2005
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 17 Oct 2005 21:11:12 -0700
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
Message-ID: <20051018041112.GA14975@panix.com>

On Mon, Oct 17, 2005, Guido van Rossum wrote:
>
> If an argument is a string, it should be a method name, and the method
> is looked up by that name each time the property is used. Because this
> is late binding, it can be put before the method definitions, and a
> subclass can override the methods. Example:
> 
> class C:
>     foo = property('getFoo', 'setFoo', None, 'the foo property')

+1

The only other alternative is to introduce some kind of block.  This is
a good solution that is not particularly intrusive; it leaves the door
open to a well-designed block structure later on.  The one niggle I have
is that it's going to be a little unwieldy to explain, but people who
create properties really ought to understand Python well enough to deal
with it.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From pje at telecommunity.com  Tue Oct 18 06:35:19 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 18 Oct 2005 00:35:19 -0400
Subject: [Python-Dev] Definining properties - a use case for class
 decorators?
In-Reply-To: <ca471dc20510172046h19eebb5ese9727e2a5bd8a188@mail.gmail.com>
References: <1129605050.9405.29.camel@geddy.wooz.org>
	<43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<ca471dc20510171924q2b7735cby77e3c6fe083523b9@mail.gmail.com>
	<1129605050.9405.29.camel@geddy.wooz.org>
Message-ID: <5.1.1.6.0.20051018001632.01eec1d0@mail.telecommunity.com>

At 08:46 PM 10/17/2005 -0700, Guido van Rossum wrote:
>Now, if I were to follow Paul Graham's recommendations strictly
>(http://www.paulgraham.com/diff.html), point 7 says that Python should
>have a symbol type. I've always maintained that this is unnecessary
>and that we can just as well use regular strings.

Well, unless you're going to also do #8 ("a notation for code"), I'd agree.  :)

But then again, Graham also lists #6 ("programs composed of expressions"), 
and even though I'm often tempted by the desire to write something as a big 
expression, the truth is that most people's brains (mine included) just 
don't have enough stack space for it.  The people that have that much 
mental stack space can already write lambda+listcomp atrocities for the 
rest of us to boggle at.  :)

Logix (http://livelogix.net/logix/) basically adds everything on Graham's 
list to Python, and then compiles it to Python bytecode.  But the result is 
something that still doesn't seem very Pythonic to me.

Of course, with good restraint, it seems to me that Logix allows some very 
tasteful language extensions (John Landahl created a nice syntax sugar for 
generic functions with it), but making full-tilt use of Graham's 9 features 
seems to result in a very Lisp-like experience, even without the parentheses.

At the same time, I would note that Ruby does seem to have an edge on 
Python in terms of ability to create "little languages" of the sort that 
Logix also excels at.  Compare SCons (Python) with Rakefiles (Ruby), for 
example, or SQLObject (Python) to Rails' ActiveRecord.  In each case, the 
Python DSL syntax is okay, but Ruby's is better.  Even PEP 340 in its 
heyday wasn't going to improve on it much, because Ruby DSLs benefit 
mainly from being able to pass the blocks to functions which could then 
hold on to them for later use.  (Also, in an ironic twist, Ruby requires 
fewer parentheses than Python for such operations, so the invocation looks 
more like user-defined syntax.)


From steven.bethard at gmail.com  Tue Oct 18 06:46:12 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 17 Oct 2005 22:46:12 -0600
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <1129601328.9405.13.camel@geddy.wooz.org>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
Message-ID: <d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>

Barry Warsaw wrote:
> On Mon, 2005-10-17 at 21:55, Guido van Rossum wrote:
>
> > Let's change the property built-in so that its arguments can be either
> > functions or strings (or None). If they are functions or None, it
> > behaves exactly like it always has.
> >
> > If an argument is a string, it should be a method name, and the method
> > is looked up by that name each time the property is used. Because this
> > is late binding, it can be put before the method definitions, and a
> > subclass can override the methods. Example:
> >
> > class C:
> >
> >     foo = property('getFoo', 'setFoo', None, 'the foo property')
> >
> >     def getFoo(self):
> >         return self._foo
> >
> >     def setFoo(self, foo):
> >         self._foo = foo
> >
> > What do you think?
>
> Ick, for all the reasons that strings are less appealing than names.

I'm not sure if you'll like it any better, but I combined Michael
Urman's suggestion with my late-binding property recipe to get:
    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/442418
It solves the name-repetition problem and the late-binding problem (I
believe), at the cost of either adding an extra argument to the
functions forming the property or confusing the "self" argument a
little.

STeVe
--
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From guido at python.org  Tue Oct 18 06:59:18 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Oct 2005 21:59:18 -0700
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>
Message-ID: <ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>

On 10/17/05, Steven Bethard <steven.bethard at gmail.com> wrote:
> I'm not sure if you'll like it any better, but I combined Michael
> Urman's suggestion with my late-binding property recipe to get:
>     http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/442418
> It solves the name-repetition problem and the late-binding problem (I
> believe), at the cost of either adding an extra argument to the
> functions forming the property or confusing the "self" argument a
> little.

That is probably as good as you can get it *if* you prefer the nested
class over a property call with string arguments. Personally, I find
the nested class inheriting from Property a lot more "magical" than
the call to property() with string arguments.

I wonder if at some point in the future Python will have to develop a
macro syntax so that you can write

    Property foo:
        def get(self): return self._foo
        ...etc...

which would somehow translate into code similar to your recipe.

But until then, I prefer the simplicity of

    foo = property('get_foo', 'set_foo')

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gjc at inescporto.pt  Tue Oct 18 13:01:28 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Tue, 18 Oct 2005 12:01:28 +0100
Subject: [Python-Dev] Coroutines, generators, function calling
Message-ID: <1129633289.12510.24.camel@localhost>

  There's one thing about coroutines using Python generators that is
still troubling, and I was wondering if anyone sees any potential
solution at the language level...

  Suppose you have a complex coroutine (this is just an example, not so
complex, but you get the idea, I hope):

def show_message(msg):
    win = create_window(msg)

    # slide down
    for y in xrange(10):
        win.move(0, y*20)
        yield Timeout(0.1)

    # timeout
    yield Timeout(3)

    # slide up
    for y in xrange(10, 0, -1):
        win.move(0, y*20)
        yield Timeout(0.1)

    win.destroy()

  Suppose now I want to move the window animation into a function, to
factor out the repeated code:

def animate(win, steps):
    for y in steps:
        win.move(0, y*20)
        yield Timeout(0.1)

def show_message(msg):
    win = create_window(msg)
    animate(win, xrange(10)) # slide down
    yield Timeout(3)
    animate(win, xrange(10, 0, -1)) # slide up
    win.destroy()

  This obviously doesn't work, because calling animate() produces
another generator, instead of calling the function.  In coroutines
context, it's like it produces another coroutine, while all I wanted was
to call a function.
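
  To make the failure concrete, here is a minimal sketch in Python 3.
The animate() below is a simplified stand-in that records each step in
a list instead of moving a window; the list name is made up for the
illustration:

```python
steps_run = []  # hypothetical log of the steps the body actually executed


def animate(steps):
    for y in steps:
        steps_run.append(y)
        yield y


# Calling the generator function only creates a generator object;
# none of the body runs until something iterates over it.
animate(range(3))
ran_after_bare_call = list(steps_run)

# Iterating drives the body to completion.
for _ in animate(range(3)):
    pass
ran_after_iteration = list(steps_run)
```

After the bare call the log is still empty; only the explicit loop
causes the three steps to run.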

  I don't suppose there could be a way to make the yield inside the
subfunction have the same effect as if it was inside the function that
called it?  Perhaps some special notation, either at function calling or
at function definition?

-- 
Gustavo J. A. M. Carneiro
<gjc at inescporto.pt> <gustavo at users.sourceforge.net>
The universe is always one step beyond logic.


From ncoghlan at gmail.com  Tue Oct 18 15:36:21 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 18 Oct 2005 23:36:21 +1000
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <1129633289.12510.24.camel@localhost>
References: <1129633289.12510.24.camel@localhost>
Message-ID: <4354FA55.4040909@gmail.com>

Gustavo J. A. M. Carneiro wrote:
>   I don't suppose there could be a way to make the yield inside the
> subfunction have the same effect as if it was inside the function that
> called it?  Perhaps some special notation, either at function calling or
> at function definition?

You mean like a for loop? ;)

   def show_message(msg):
       win = create_window(msg)
       for step in animate(win, xrange(10)): # slide down
           yield step
       yield Timeout(3)
       for step in animate(win, xrange(10, 0, -1)): # slide up
           yield step
       win.destroy()

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From jimjjewett at gmail.com  Tue Oct 18 15:37:12 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 18 Oct 2005 09:37:12 -0400
Subject: [Python-Dev] Defining properties - a use case for class decorators?
Message-ID: <fb6fbf560510180637l1f4a3ce1p533ab9353ebeab21@mail.gmail.com>

Greg Ewing wrote:

>> ... the standard way of defining properties at the moment
>> leaves something to be desired, for all the same reasons
>> that have led to @-decorators.

Guido wrote:

> ... make that feeling more concrete. ...

> With decorators there was a concrete issue: the modifier
> trailed after the function body, in a real sense "hiding"
> from the reader. I don't see such an issue with properties.

For me, the property declaration (including the function
declarations) is too verbose, and ends up hiding the rest
of the class.  My ideal syntax would look something like:

    # Declare "x" to name a property.  Create a storage slot,
    # with the generic get/set/del/doc.  (doc == "property x"?)
    property(x)

    # Change the setter, possibly in a subclass
    property(x) set:
        if x<5:
            __x = x

If I don't want anything special on the get, I shouldn't have
to add any "get" boilerplate to my code.

An alternative might be

    slots=[x, y, z]

to automatically create default properties for x, y, and z,
while declaring that instances won't have arbitrary fields.

That said, I'm not sure the benefit is enough to justify the
extra complications, and your suggestion of allowing strings
for method names may be close enough.  I agree that the
use of strings is awkward, but ... probably no worse than
using them with __dict__ today.

-jJ

From gjc at inescporto.pt  Tue Oct 18 15:47:08 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Tue, 18 Oct 2005 14:47:08 +0100
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <fb6fbf560510180607w19e327e3g609f0d5ed992b673@mail.gmail.com>
References: <fb6fbf560510180607w19e327e3g609f0d5ed992b673@mail.gmail.com>
Message-ID: <1129643229.12510.37.camel@localhost>

On Tue, 2005-10-18 at 09:07 -0400, Jim Jewett wrote:
>   Suppose now I want to move the window animation to a function, to
> factorize the code:
> 
> def animate(win, steps):
>     for y in steps:
>         win.move(0, y*20)
>         yield Timeout(0.1)
> 
> def show_message(msg):
>     win = create_window(msg)
>     animate(win, xrange(10)) # slide down
>     yield Timeout(3)
>     animate(win, xrange(10, 0, -1)) # slide up
>     win.destroy()
> 
>   This obviously doesn't work, because calling animate() produces
> another generator, instead of calling the function.  In coroutines
> context, it's like it produces another coroutine, while all I wanted was
> to call a function.
> 
>   I don't suppose there could be a way to make the yield inside the
> subfunction have the same effect as if it was inside the function that
> called it?  Perhaps some special notation, either at function calling or
> at function definition?
> 
> ---------------------------------
> 
> I may be missing something, but to me the answer looks like:
> 
> def show_message(msg):
>     win = create_window(msg)
>     for v in animate(win, xrange(10)):  # slide down
>         yield v
>     yield Timeout(3)
>     for v in animate(win, xrange(10, 0, -1)): # slide up
>         yield v
>     win.destroy()

  Sure, that would work.  Or even this, if the scheduler would
automatically recognize generator objects being yielded and so would run
the nested coroutine until it finishes:

def show_message(msg):
    win = create_window(msg)
    yield animate(win, xrange(10)) # slide down
    yield Timeout(3)
    yield animate(win, xrange(10, 0, -1)) # slide up
    win.destroy()

  Sure, it could work, but still... I wish for something that would
avoid creating a nested coroutine.  Maybe I'm asking for too much, I
don't know.  Just trying to get some feedback...

  Regards.

-- 
Gustavo J. A. M. Carneiro
<gjc at inescporto.pt> <gustavo at users.sourceforge.net>
The universe is always one step beyond logic.


From ncoghlan at gmail.com  Tue Oct 18 15:59:03 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 18 Oct 2005 23:59:03 +1000
Subject: [Python-Dev] Defining properties - a use case for class
	decorators?
In-Reply-To: <fb6fbf560510180637l1f4a3ce1p533ab9353ebeab21@mail.gmail.com>
References: <fb6fbf560510180637l1f4a3ce1p533ab9353ebeab21@mail.gmail.com>
Message-ID: <4354FFA7.6020204@gmail.com>

Jim Jewett wrote:
> That said, I'm not sure the benefit is enough to justify the
> extra complications, and your suggestion of allowing strings
> for method names may be close enough.  I agree that the
> use of strings is awkward, but ... probably no worse than
> using them with __dict__ today.

An idea that was kicked around on c.l.p a long while back was "statement local 
variables", where you could define some extra names just for a single simple 
statement:

   x = property(get, set, delete, doc) given:
       doc = "Property x (must be less than 5)"
       def get(self):
           try:
               return self._x
           except AttributeError:
               self._x = 0
               return 0
       def set(self, value):
           if value >= 5: raise ValueError("value too big")
           self._x = value
       def delete(self):
           del self._x

As I recall, the idea died due to problems with figuring out how to allow the 
simple statement to both see the names from the nested block and modify the 
surrounding namespace, but prevent the names from the nested block from 
affecting the surrounding namespace after the statement was completed.

Another option would be to allow attribute reference targets when binding 
function names:

   x = property("Property x (must be less than 5)")

   def x.get(instance):
       try:
           return instance._x
       except AttributeError:
           instance._x = 0
           return 0
   def x.set(instance, value):
       if value >= 5: raise ValueError("value too big")
       instance._x = value

   def x.delete(instance):
       del instance._x

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ark at acm.org  Tue Oct 18 16:04:36 2005
From: ark at acm.org (Andrew Koenig)
Date: Tue, 18 Oct 2005 10:04:36 -0400
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <1129643229.12510.37.camel@localhost>
Message-ID: <004801c5d3ec$e29b5360$6402a8c0@arkdesktop>

>   Sure, that would work.  Or even this, if the scheduler would
> automatically recognize generator objects being yielded and so would run
> the nested coroutine until it finishes:

This idea has been discussed before.  I think the problem with recognizing
generators as the subject of "yield" statements is that then you can't yield
a generator even if you want to.

The best syntax I can think of without adding a new keyword looks like this:

	yield from x

which would be equivalent to

	for i in x:
	    yield i

Note that this equivalence would imply that x can be any iterable, not just
a generator.  For example:

	yield from ['Hello', 'world']

would be equivalent to

	yield 'Hello'
	yield 'world'
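
For reference, this spelling was later adopted: "yield from" is valid
syntax from Python 3.3 onward (PEP 380). A minimal sketch of the
equivalence described above, applied to the animation example from
earlier in the thread. Timeout and FakeWindow are hypothetical
stand-ins so the code runs without a GUI or a scheduler:

```python
class Timeout(object):
    """Hypothetical marker object a scheduler would wait on."""
    def __init__(self, seconds):
        self.seconds = seconds


class FakeWindow(object):
    """Hypothetical window that just records where it was moved."""
    def __init__(self):
        self.positions = []

    def move(self, x, y):
        self.positions.append((x, y))


def explicit_loop(x):
    for i in x:
        yield i


def delegating(x):
    yield from x  # any iterable works, not just a generator


def animate(win, steps):
    for y in steps:
        win.move(0, y * 20)
        yield Timeout(0.1)


def show_message(win):
    yield from animate(win, range(3))         # slide down
    yield Timeout(3)
    yield from animate(win, range(3, 0, -1))  # slide up
```

The yields inside animate() surface directly to whoever iterates over
show_message(), which is the effect the original question asked for.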




From ncoghlan at gmail.com  Tue Oct 18 16:17:22 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 19 Oct 2005 00:17:22 +1000
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <004801c5d3ec$e29b5360$6402a8c0@arkdesktop>
References: <004801c5d3ec$e29b5360$6402a8c0@arkdesktop>
Message-ID: <435503F2.2090103@gmail.com>

Andrew Koenig wrote:
>>   Sure, that would work.  Or even this, if the scheduler would
>> automatically recognize generator objects being yielded and so would run
>> the nested coroutine until it finishes:
> 
> This idea has been discussed before.  I think the problem with recognizing
> generators as the subject of "yield" statements is that then you can't yield
> a generator even if you want to.
> 
> The best syntax I can think of without adding a new keyword looks like this:
> 
> 	yield from x
> 
> which would be equivalent to
> 
> 	for i in x:
> 	    yield i
> 
> Note that this equivalence would imply that x can be any iterable, not just
> a generator.  For example:
> 
> 	yield from ['Hello', 'world']
> 
> would be equivalent to
> 
> 	yield 'Hello'
> 	yield 'world'

Hmm, I actually quite like that. The best I came up with was "yield for", and 
that just didn't read correctly. Whereas "yield from seq" says exactly what it 
is doing.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From barry at python.org  Tue Oct 18 16:57:01 2005
From: barry at python.org (Barry Warsaw)
Date: Tue, 18 Oct 2005 10:57:01 -0400
Subject: [Python-Dev] Definining properties - a use case for
	class	decorators?
In-Reply-To: <ca471dc20510172046h19eebb5ese9727e2a5bd8a188@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<ca471dc20510171924q2b7735cby77e3c6fe083523b9@mail.gmail.com>
	<1129605050.9405.29.camel@geddy.wooz.org>
	<ca471dc20510172046h19eebb5ese9727e2a5bd8a188@mail.gmail.com>
Message-ID: <1129647421.24013.7.camel@geddy.wooz.org>

On Mon, 2005-10-17 at 23:46, Guido van Rossum wrote:

> But I still like the version with strings better:
> 
>     x = property('get_x', 'set_x')
> 
> This trades two lambdas for two pairs of string quotes; a good deal IMO!

You could of course "just" do the wrapping in property().  I put that in
quotes because you'd have the problem of knowing when to wrap and when
not to, but there would be ways to solve that.  But I won't belabor the
point any longer, except to ask what happens when you typo one of those
strings?  What kind of exception do you get and when do you get it?

-Barry


From solipsis at pitrou.net  Tue Oct 18 17:05:34 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 18 Oct 2005 17:05:34 +0200
Subject: [Python-Dev] Definining properties - a use case
	for	class	decorators?
In-Reply-To: <1129647421.24013.7.camel@geddy.wooz.org>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<ca471dc20510171924q2b7735cby77e3c6fe083523b9@mail.gmail.com>
	<1129605050.9405.29.camel@geddy.wooz.org>
	<ca471dc20510172046h19eebb5ese9727e2a5bd8a188@mail.gmail.com>
	<1129647421.24013.7.camel@geddy.wooz.org>
Message-ID: <1129647934.6135.23.camel@fsol>

On Tuesday, 18 October 2005 at 10:57 -0400, Barry Warsaw wrote:
> On Mon, 2005-10-17 at 23:46, Guido van Rossum wrote:
> 
> > But I still like the version with strings better:
> > 
> >     x = property('get_x', 'set_x')
> > 
> > This trades two lambdas for two pairs of string quotes; a good deal IMO!
> 
> You could of course "just" do the wrapping in property().  I put that in
> quotes because you'd have the problem of knowing when to wrap and when
> not to, but there would be ways to solve that.  But I won't belabor the
> point any longer, except to ask what happens when you typo one of those
> strings?  What kind of exception do you get and when do you get it?

AttributeError when actually accessing the property, no?
Guido's proposal seems quite nice to me. It helps group all property
declarations at the beginning, and having accessor methods at the end
with other non-public methods.

Currently I never use properties, because they make classes much less
readable, for the same kind of reasons that Jim gave:

On Tuesday, 18 October 2005 at 09:37 -0400, Jim Jewett wrote:
> For me, the property declaration (including the function
> declarations) is too verbose, and ends up hiding the rest
> of the class.




From pje at telecommunity.com  Tue Oct 18 17:26:04 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 18 Oct 2005 11:26:04 -0400
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <1129633289.12510.24.camel@localhost>
Message-ID: <5.1.1.6.0.20051018112412.01f1f9e8@mail.telecommunity.com>

At 12:01 PM 10/18/2005 +0100, Gustavo J. A. M. Carneiro wrote:
>def show_message(msg):
>     win = create_window(msg)
>     animate(win, xrange(10)) # slide down
>     yield Timeout(3)
>     animate(win, xrange(10, 0, -1)) # slide up
>     win.destroy()
>
>   This obviously doesn't work, because calling animate() produces
>another generator, instead of calling the function.  In coroutines
>context, it's like it produces another coroutine, while all I wanted was
>to call a function.

Just 'yield animate(win, xrange(10))' and have the trampoline recognize 
generators.  See the PEP 342 trampoline example, which does this.  When the 
animate() is exhausted, it'll resume the "calling" function.
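
A much-reduced sketch of that idea, in Python 3. This is not the PEP 342
example itself (which also passes values and exceptions back into the
coroutines); run(), animate() and show_message() here are simplified,
hypothetical versions that only show the "yielded generator means
nested call" rule:

```python
import types


def run(coroutine):
    """Drive a coroutine, treating any yielded generator as a nested call."""
    stack = [coroutine]
    events = []
    while stack:
        try:
            item = next(stack[-1])
        except StopIteration:
            stack.pop()  # nested generator exhausted: resume its caller
            continue
        if isinstance(item, types.GeneratorType):
            stack.append(item)   # "call" the nested coroutine
        else:
            events.append(item)  # a real scheduler would wait on this
    return events


def animate(steps):
    for y in steps:
        yield ('move', y)


def show_message():
    yield animate(range(2))         # slide down
    yield ('wait', 3)
    yield animate(range(2, 0, -1))  # slide up
```

Here run(show_message()) collects the moves and the wait in the order a
plain function call would have produced them.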


>   I don't suppose there could be a way to make the yield inside the
>subfunction have the same effect as if it was inside the function that
>called it?  Perhaps some special notation, either at function calling or
>at function definition?

Yes, it's 'yield' at the function calling.  :)


From pje at telecommunity.com  Tue Oct 18 17:31:32 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 18 Oct 2005 11:31:32 -0400
Subject: [Python-Dev] Defining properties - a use case for class
 decorators?
In-Reply-To: <4354FFA7.6020204@gmail.com>
References: <fb6fbf560510180637l1f4a3ce1p533ab9353ebeab21@mail.gmail.com>
	<fb6fbf560510180637l1f4a3ce1p533ab9353ebeab21@mail.gmail.com>
Message-ID: <5.1.1.6.0.20051018112713.01f10250@mail.telecommunity.com>

At 11:59 PM 10/18/2005 +1000, Nick Coghlan wrote:
>An idea that was kicked around on c.l.p a long while back was "statement 
>local
>variables", where you could define some extra names just for a single simple
>statement:
>
>    x = property(get, set, delete, doc) given:
>        doc = "Property x (must be less than 5)"
>        def get(self):
>            try:
>                return self._x
>            except AttributeError:
>                self._x = 0
>                return 0
>...
>
>As I recall, the idea died due to problems with figuring out how to allow the
>simple statement to both see the names from the nested block and modify the
>surrounding namespace, but prevent the names from the nested block from
>affecting the surrounding namespace after the statement was completed.

Haskell's "where" statement does this, but the block *doesn't* modify the 
surrounding namespace; it's strictly local.  With those semantics, the 
Python translation of the above could just be something like:

         def _tmp():
            doc = "blah"
            def get(self):
                # etc.
            # ...
            return property(get,set,delete,doc)

         x = _tmp()

Which works great except for the part that co_lnotab won't let you identify 
that "return" line as being the original expression line, due to the 
monotonically-increasing bit.  ;)
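
Filled in so that it runs under Python 3, that translation could look
like the sketch below. The class name C and the accessor bodies are
assumptions taken from the quoted example, not part of any proposal:

```python
class C(object):
    def _tmp():
        # Everything defined here is local to _tmp() and never leaks
        # into the class namespace.
        doc = "Property x (must be less than 5)"

        def get(self):
            return getattr(self, '_x', 0)

        def set(self, value):
            if value >= 5:
                raise ValueError("value too big")
            self._x = value

        def delete(self):
            del self._x

        return property(get, set, delete, doc)

    x = _tmp()
    del _tmp  # remove the helper so only x remains on the class
```

The only name the block leaves behind is x, which is the strictly
local behaviour described above.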

Note that a "where" or "given" statement like this could make it a little 
easier to drop lambda.


From michele.simionato at gmail.com  Tue Oct 18 17:38:40 2005
From: michele.simionato at gmail.com (Michele Simionato)
Date: Tue, 18 Oct 2005 15:38:40 +0000
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <1129647934.6135.23.camel@fsol>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<ca471dc20510171924q2b7735cby77e3c6fe083523b9@mail.gmail.com>
	<1129605050.9405.29.camel@geddy.wooz.org>
	<ca471dc20510172046h19eebb5ese9727e2a5bd8a188@mail.gmail.com>
	<1129647421.24013.7.camel@geddy.wooz.org>
	<1129647934.6135.23.camel@fsol>
Message-ID: <4edc17eb0510180838u2fab1cebv6e8975525ece9944@mail.gmail.com>

On 10/18/05, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Tuesday, 18 October 2005 at 10:57 -0400, Barry Warsaw wrote:
> Currently I never use properties, because it makes classes much less
> readable for the same kind of reasons as what Jim wrote.

Me too, I never use properties directly. However I have experimented
with using helper functions to generate the properties:

_dic = {}

def makeproperty(x):
    def getx(self):
        return _dic[self, x]
    def setx(self, value):
        _dic[self, x] = value
    return property(getx, setx)

class C(object):
    x = makeproperty('x')

c = C()
c.x = 1
print c.x

But in general I prefer to write a custom descriptor class, since it
gives me much more control.
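
As an illustration of what such a descriptor might look like, here is a
minimal Python 3 sketch. The class name Bounded, its arguments and the
"less than 5" rule (borrowed from the example elsewhere in this thread)
are all hypothetical:

```python
class Bounded(object):
    """Hypothetical descriptor: per-instance value that must stay below a limit."""

    def __init__(self, name, limit, default=0):
        self.storage = '_' + name  # attribute used on each instance
        self.limit = limit
        self.default = default

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class itself
        return getattr(obj, self.storage, self.default)

    def __set__(self, obj, value):
        if value >= self.limit:
            raise ValueError("value too big")
        setattr(obj, self.storage, value)


class C(object):
    x = Bounded('x', 5)
    y = Bounded('y', 100, default=1)
```

The validation logic lives in one place and can be reused for several
attributes, which is part of the extra control a descriptor class gives
over a one-off property() call.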

        Michele Simionato

From aahz at pythoncraft.com  Tue Oct 18 17:49:00 2005
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 18 Oct 2005 08:49:00 -0700
Subject: [Python-Dev] Definining properties - a use case for
	class	decorators?
In-Reply-To: <1129647421.24013.7.camel@geddy.wooz.org>
References: <ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<ca471dc20510171924q2b7735cby77e3c6fe083523b9@mail.gmail.com>
	<1129605050.9405.29.camel@geddy.wooz.org>
	<ca471dc20510172046h19eebb5ese9727e2a5bd8a188@mail.gmail.com>
	<1129647421.24013.7.camel@geddy.wooz.org>
Message-ID: <20051018154900.GB22469@panix.com>

On Tue, Oct 18, 2005, Barry Warsaw wrote:
>
> You could of course "just" do the wrapping in property().  I put that in
> quotes because you'd have the problem of knowing when to wrap and when
> not to, but there would be ways to solve that.  But I won't belabor the
> point any longer, except to ask what happens when you typo one of those
> strings?  What kind of exception do you get and when do you get it?

AttributeError, just like this:

    class C: pass
    C().foo()

Last night I was thinking that maybe TypeError would be better, but
AttributeError is going to be what the internal machinery raises, and I
decided there was no point trying to translate it.
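
A rough sketch of the behaviour, assuming the late binding is emulated
with a hand-written descriptor. LateProperty and the misspelled 'getBaar'
are hypothetical names used only for this illustration, not the proposed
built-in:

```python
class LateProperty(object):
    """Hypothetical stand-in for a property() that accepts method names."""

    def __init__(self, fget_name, fset_name=None, doc=None):
        self.fget_name = fget_name
        self.fset_name = fset_name
        self.__doc__ = doc

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Looked up on every access, so a subclass can override the method.
        return getattr(obj, self.fget_name)()

    def __set__(self, obj, value):
        if self.fset_name is None:
            raise AttributeError("can't set attribute")
        getattr(obj, self.fset_name)(value)


class C(object):
    foo = LateProperty('getFoo', 'setFoo', 'the foo property')
    bar = LateProperty('getBaar')     # typo: the method is called getBar

    def getFoo(self):
        return self._foo

    def setFoo(self, value):
        self._foo = value

    def getBar(self):
        return 42
```

The class statement itself succeeds despite the typo. The AttributeError
only appears when C().bar is first read, because that is when the name is
looked up.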
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From jcarlson at uci.edu  Tue Oct 18 18:55:56 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 18 Oct 2005 09:55:56 -0700
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <20051018041112.GA14975@panix.com>
References: <ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<20051018041112.GA14975@panix.com>
Message-ID: <20051018094159.37EE.JCARLSON@uci.edu>


Aahz <aahz at pythoncraft.com> wrote:
> 
> On Mon, Oct 17, 2005, Guido van Rossum wrote:
> >
> > If an argument is a string, it should be a method name, and the method
> > is looked up by that name each time the property is used. Because this
> > is late binding, it can be put before the method definitions, and a
> > subclass can override the methods. Example:
> > 
> > class C:
> >     foo = property('getFoo', 'setFoo', None, 'the foo property')
> 
> +1
> 
> The only other alternative is to introduce some kind of block.  This is
> a good solution that is not particularly intrusive; it leaves the door
> open to a well-designed block structure later on.  The one niggle I have
> is that it's going to be a little unwieldy to explain, but people who
> create properties really ought to understand Python well enough to deal
> with it.

I remember posing an unanswered question back when blocks were being
offered, and being that you brought up blocks again, I'll ask a more
specific variant of my original question:
    What would this mythical block statement look like that would make
properties easier to write than the above late-binding or the subclass
Property recipe?


 - Josiah


From solipsis at pitrou.net  Tue Oct 18 19:17:14 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 18 Oct 2005 19:17:14 +0200
Subject: [Python-Dev] properties and block statement
In-Reply-To: <20051018094159.37EE.JCARLSON@uci.edu>
References: <ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<20051018041112.GA14975@panix.com>
	<20051018094159.37EE.JCARLSON@uci.edu>
Message-ID: <1129655834.8244.9.camel@fsol>


>     What would this mythical block statement look like that would make
> properties easier to write than the above late-binding or the subclass
> Property recipe?

I suppose something like:

class C(object):
    x = prop:
        """ Yay for property x! """
        def __get__(self):
            return self._x
        def __set__(self, value):
            self._x = value

and then:

def prop(@block):
    return property(
        fget=block.get("__get__"),
        fset=block.get("__set__"),
        fdel=block.get("__delete__"),
        doc=block.get("__doc__", ""),
    )

(where "@block" would be the syntax to refer to block args as a dict,
the same way "**kwargs" already exists)




From jcarlson at uci.edu  Tue Oct 18 19:23:59 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 18 Oct 2005 10:23:59 -0700
Subject: [Python-Dev] Defining properties - a use case for class
	decorators?
In-Reply-To: <4354FFA7.6020204@gmail.com>
References: <fb6fbf560510180637l1f4a3ce1p533ab9353ebeab21@mail.gmail.com>
	<4354FFA7.6020204@gmail.com>
Message-ID: <20051018095537.37F1.JCARLSON@uci.edu>


Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> Jim Jewett wrote:
> > That said, I'm not sure the benefit is enough to justify the
> > extra complications, and your suggestion of allowing strings
> > for method names may be close enough.  I agree that the
> > use of strings is awkward, but ... probably no worse than
> > using them with __dict__ today.
> 
> An idea that was kicked around on c.l.p a long while back was "statement local 
> variables", where you could define some extra names just for a single simple 
> statement:
> 
>    x = property(get, set, delete, doc) given:
>        doc = "Property x (must be less than 5)"
>        def get(self):
>            try:
>                return self._x
>            except AttributeError:
>                self._x = 0
>                return 0
>        def set(self, value):
>            if value >= 5: raise ValueError("value too big")
>            self._x = value
>        def delete(self):
>            del self._x
> 
> As I recall, the idea died due to problems with figuring out how to allow the 
> simple statement to both see the names from the nested block and modify the 
> surrounding namespace, but prevent the names from the nested block from 
> affecting the surrounding namespace after the statement was completed.

You wouldn't be able to write to the surrounding namespace, but a
closure would work fine for this.

    def Property(fcn):
        ns = fcn()
        return property(ns.get('get'), ns.get('set'), ns.get('delete'), ns.get('doc'))

    class foo(object):
        @Property
        def x():
            doc = "Property x (must be less than 5)"
            def get(self):
                try:
                    return self._x
                except AttributeError:
                    self._x = 0
                    return 0
            def set(self, value):
                if value >= 5: raise ValueError("value too big")
                self._x = value
            def delete(self):
                del self._x
            return locals()

In an actual 'given:' statement, one could create a local function
namespace with the proper func_closure attribute (which is automatically
executed), then execute the lookup of the arguments to the statement in
the 'given:' line from this closure, but assign to surrounding scope.
Then again, maybe the above function and decorator approach are better.


An unfortunate side-effect of with statement early-binding of 'as VAR'
is that unless one works quite hard at mucking about with frames, the
following has a wholly ugly implementation (whether or not one cares
about the persistence of the variables defined within the block, you
still need to modify x when you are done, which may as well cause a
cleanup of the objects defined within the block...if such things are
possible)...

with Property as x:
    ...


> Another option would be to allow attribute reference targets when binding 
> function names:

*shivers at the proposal*  That's scary.  It took me a few minutes just
to figure out what the heck that was supposed to do.

 - Josiah


From solipsis at pitrou.net  Tue Oct 18 19:32:46 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 18 Oct 2005 19:32:46 +0200
Subject: [Python-Dev] properties and block statement
In-Reply-To: <1129655834.8244.9.camel@fsol>
References: <ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<20051018041112.GA14975@panix.com>
	<20051018094159.37EE.JCARLSON@uci.edu>
	<1129655834.8244.9.camel@fsol>
Message-ID: <1129656766.8244.20.camel@fsol>


On Tuesday 18 October 2005 at 19:17 +0200, Antoine Pitrou wrote:
> >     What would this mythical block statement look like that would make
> > properties easier to write than the above late-binding or the subclass
> > Property recipe?
> 
> I suppose something like:
> 
> class C(object):
>     x = prop:
>         """ Yay for property x! """
>         def __get__(self):
>             return self._x
>         def __set__(self, value):
>             self._x = value

An example of applying this scheme to Twisted:

domain_name = "www.google.com"
reactor.resolve(domain_name):
    def callback(value):
        print "%s resolves to %s" % (domain_name, value)
    def errback(error):
        print "failed to resolve %s!" % domain_name

And then in the Reactor class:

def resolve(self, name, @block):
    ...
    d = defer.Deferred()
    cb = block.pop('callback', None)
    if cb is not None:
        d.addCallback(cb)
    eb = block.pop('errback', None)
    if eb is not None:
        d.addErrback(eb)
    if block:
        raise ValueError("spurious blockargs: %s" %
            str(block))
    return d




From jcarlson at uci.edu  Tue Oct 18 21:56:32 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 18 Oct 2005 12:56:32 -0700
Subject: [Python-Dev] properties and block statement
In-Reply-To: <1129655834.8244.9.camel@fsol>
References: <20051018094159.37EE.JCARLSON@uci.edu>
	<1129655834.8244.9.camel@fsol>
Message-ID: <20051018124010.37F4.JCARLSON@uci.edu>


Antoine Pitrou <solipsis at pitrou.net> wrote:
> 
> 
> >     What would this mythical block statement look like that would make
> > properties easier to write than the above late-binding or the subclass
> > Property recipe?
> 
> I suppose something like:
> 
> class C(object):
>     x = prop:
>         """ Yay for property x! """
>         def __get__(self):
>             return self._x
>         def __set__(self, value):
>             self._x = value
> 
> and then:
> 
> def prop(@block):
>     return property(
>         fget=block.get("__get__"),
>         fset=block.get("__set__"),
>         fdel=block.get("__delete__"),
>         doc=block.get("__doc__", ""),
>     )
> 
> (where "@block" would be the syntax to refer to block args as a dict,
> the same way "**kwargs" already exists)

You are saving 3 lines over the decorator/function approach at the cost
of possible confusion over blocks and an easily forgotten/not read @
just after an open paren.

Thanks, but I'll stick to the Property decorator, Property subclass,
property late bindings, or even a Property metaclass*, and not need to
modify Python syntax.

 - Josiah

* Property metaclass in an embedded class definition:

class Property(type):
    def __init__(*args):
        pass
    def __new__(cls, name, bases, dct):
        return property(dct.get('get'),
                        dct.get('set'),
                        dct.get('delete'),
                        dct.get('__doc__', ''))

class foo(object):
    class x(object):
        'hello'
        __metaclass__ = Property
        def get(self):
            try:
                return self._x
            except AttributeError:
                self._x = 0
                return 0
        def set(self, value):
            if value >= 5: raise ValueError("value too big")
            self._x = value
        def delete(self):
            del self._x


From solipsis at pitrou.net  Tue Oct 18 22:48:10 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 18 Oct 2005 22:48:10 +0200
Subject: [Python-Dev] properties and block statement
In-Reply-To: <20051018124010.37F4.JCARLSON@uci.edu>
References: <20051018094159.37EE.JCARLSON@uci.edu>
	<1129655834.8244.9.camel@fsol> <20051018124010.37F4.JCARLSON@uci.edu>
Message-ID: <1129668490.8464.16.camel@fsol>

On Tuesday 18 October 2005 at 12:56 -0700, Josiah Carlson wrote:
> You are saving 3 lines over the decorator/function approach [...]

Well, obviously, the point of a block statement or construct is that it
can be applied to many other things than properties. Otherwise it is
overkill as you imply.

(I'm not actively advocating this by the way, I was just answering a
request for an example)



From martin at v.loewis.de  Wed Oct 19 00:19:01 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 19 Oct 2005 00:19:01 +0200
Subject: [Python-Dev] Migrating to subversion
Message-ID: <435574D5.2040604@v.loewis.de>

AFAICT, everything is now set up to actually switch to subversion.
The infrastructure is complete, the conversion procedure is complete,
and Guido pronounced that the migration could happen.

One open issue is where to do the hosting: whether to pay a commercial
hosting company (i.e. wush.net), or whether to let it be
volunteer-hosted on svn.python.org. Guido was undecided, like several
other developers; the majority of the rest apparently was in favour
of trying it on svn.python.org. Anthony Baxter specifically told me
that he would now be fine with hosting it on svn.python.org as it gives
us more control. If it doesn't work out, we can still go to a commercial
hoster.

So I would like to start a conversion in the near future. One 
complication apparently is that SF doesn't manage to create nightly
CVS tarballs anymore; the one I just got is 5 days old. So we would
submit a support request that they manually trigger tarball
generation to shorten the freeze period.

If people want to test the installation before the switch happens,
this would be the time to do it.

Regards,
Martin

From pje at telecommunity.com  Wed Oct 19 00:43:53 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 18 Oct 2005 18:43:53 -0400
Subject: [Python-Dev] Migrating to subversion
In-Reply-To: <435574D5.2040604@v.loewis.de>
Message-ID: <5.1.1.6.0.20051018184316.01fa0d90@mail.telecommunity.com>

At 12:19 AM 10/19/2005 +0200, Martin v. Löwis wrote:
>If people want to test the installation before the switch happens,
>this would be the time to do it.

What will the procedure be for getting a login?  I assume our SF logins 
won't simply be transferred/transferrable.


From skip at pobox.com  Wed Oct 19 01:16:36 2005
From: skip at pobox.com (skip@pobox.com)
Date: Tue, 18 Oct 2005 18:16:36 -0500
Subject: [Python-Dev] Migrating to subversion
In-Reply-To: <435574D5.2040604@v.loewis.de>
References: <435574D5.2040604@v.loewis.de>
Message-ID: <17237.33364.928502.548360@montanaro.dyndns.org>


    Martin> If people want to test the installation before the switch
    Martin> happens, this would be the time to do it.

Martin,

Can you let us know again the magic incantation to check out the source from
the repository?

Thx,

Skip

From greg.ewing at canterbury.ac.nz  Wed Oct 19 01:20:32 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 19 Oct 2005 12:20:32 +1300
Subject: [Python-Dev] Property syntax for Py3k (properties and block
	statement)
In-Reply-To: <1129655834.8244.9.camel@fsol>
References: <ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<20051018041112.GA14975@panix.com>
	<20051018094159.37EE.JCARLSON@uci.edu>
	<1129655834.8244.9.camel@fsol>
Message-ID: <43558340.1010008@canterbury.ac.nz>

Antoine Pitrou wrote:

> I suppose something like:
> 
> class C(object):
>     x = prop:
>         """ Yay for property x! """
>         def __get__(self):
>             return self._x
>         def __set__(self, value):
>             self._x = value

I've just looked at Steven Bethard's recipe, and it seems
to give you something very like this. Two problems with it
are the abuse of 'class' to define something that's not
really used as a class, and the need to explicitly inherit
from the base class's property descriptor.

In Py3k, I'd like to see 'property' renamed to 'Property',
and 'property' become a keyword used something like

   class C:

     property x:

       """This is the x property."""

       def __get__(self):
         ...

       def __set__(self, value):
         ...

       def __del__(self):
         ...

The accessors should be overridable in subclasses, so
you can do

   class D(C):

     property x:

       def __get__(self):
         """This overrides the __get__ property for x in C"""
         ...
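
For comparison, a sketch of how a single accessor can be overridden in a
subclass without new syntax, assuming a Python version in which property
objects provide the getter and setter methods (they did not exist when
this message was written). The class bodies are filled in only for
illustration:

```python
class C(object):
    def __init__(self):
        self._x = 1

    @property
    def x(self):
        """This is the x property."""
        return self._x

    @x.setter
    def x(self, value):
        self._x = value


class D(C):
    # Copy C's property and replace only the getter; the setter carried by
    # the copied property object is kept as it is.
    @C.x.getter
    def x(self):
        return self._x * 10
```

This still requires naming the base class's property explicitly (C.x),
which is the same wart noted above for the recipe.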

Greg

From michel at cignex.com  Wed Oct 19 02:01:47 2005
From: michel at cignex.com (Michel Pelletier)
Date: Tue, 18 Oct 2005 17:01:47 -0700
Subject: [Python-Dev] Guido v. Python, Round 1
In-Reply-To: <17235.59880.819873.541201@montanaro.dyndns.org>
References: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>	<17235.52236.458715.854015@montanaro.dyndns.org>	<1f7befae0510170934h21d072dbn2037d634af7d17f@mail.gmail.com>	<ee2a432c0510171011i5cae4d9axd0f17319afa00e45@mail.gmail.com>
	<17235.59880.819873.541201@montanaro.dyndns.org>
Message-ID: <43558CEB.2020304@cignex.com>

skip at pobox.com wrote:
>     Neal> This URL should work for a while longer.
> 
>     Neal> http://creosote.python.org/neal/
> 
> Ah, the vagaries of URL redirection.  Thanks...

The front of his shirt says ++ungood;  Is that the whole joke or is the 
punchline on the back?

-Michel


From bcannon at gmail.com  Wed Oct 19 02:39:49 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Tue, 18 Oct 2005 17:39:49 -0700
Subject: [Python-Dev] Migrating to subversion
In-Reply-To: <17237.33364.928502.548360@montanaro.dyndns.org>
References: <435574D5.2040604@v.loewis.de>
	<17237.33364.928502.548360@montanaro.dyndns.org>
Message-ID: <bbaeab100510181739i60a27c5fk840c8a9d2a64526a@mail.gmail.com>

On 10/18/05, skip at pobox.com <skip at pobox.com> wrote:
>
>     Martin> If people want to test the installation before the switch
>     Martin> happens, this would be the time to do it.
>
> Martin,
>
> Can you let us know again the magic incantation to check out the source from
> the repository?
>

And any other problems people come across or questions they have about
Subversion and its use, please do ask.  I will try to start a new
section in the dev FAQ for svn-specific issues.

-Brett

From dave at pythonapocrypha.com  Wed Oct 19 05:28:14 2005
From: dave at pythonapocrypha.com (Dave Brueck)
Date: Tue, 18 Oct 2005 21:28:14 -0600
Subject: [Python-Dev] Guido v. Python, Round 1
In-Reply-To: <43558CEB.2020304@cignex.com>
References: <ee2a432c0510162007j25a6222dr2a88c2407e5840dc@mail.gmail.com>	<17235.52236.458715.854015@montanaro.dyndns.org>	<1f7befae0510170934h21d072dbn2037d634af7d17f@mail.gmail.com>	<ee2a432c0510171011i5cae4d9axd0f17319afa00e45@mail.gmail.com>	<17235.59880.819873.541201@montanaro.dyndns.org>
	<43558CEB.2020304@cignex.com>
Message-ID: <4355BD4E.9010506@pythonapocrypha.com>

Michel Pelletier wrote:
> skip at pobox.com wrote:
> 
>>    Neal> This URL should work for a while longer.
>>
>>    Neal> http://creosote.python.org/neal/
>>
>>Ah, the vagaries of URL redirection.  Thanks...
> 
> 
> The front of his shirt says ++ungood;  Is that the whole joke or is the 
> punchline on the back?

http://www.ee.surrey.ac.uk/Personal/L.Wood/double-plus-ungood/

From martin at v.loewis.de  Wed Oct 19 07:50:58 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 19 Oct 2005 07:50:58 +0200
Subject: [Python-Dev] Migrating to subversion
In-Reply-To: <5.1.1.6.0.20051018184316.01fa0d90@mail.telecommunity.com>
References: <5.1.1.6.0.20051018184316.01fa0d90@mail.telecommunity.com>
Message-ID: <4355DEC2.3030103@v.loewis.de>

Phillip J. Eby wrote:
> What will the procedure be for getting a login?  I assume our SF logins 
> won't simply be transferred/transferrable.

You should send your SSH2 public key along with your preferred logname
(firstname.lastname) to pydotorg at python.org.

Regards,
Martin

From martin at v.loewis.de  Wed Oct 19 07:54:42 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 19 Oct 2005 07:54:42 +0200
Subject: [Python-Dev] Migrating to subversion
In-Reply-To: <17237.33364.928502.548360@montanaro.dyndns.org>
References: <435574D5.2040604@v.loewis.de>
	<17237.33364.928502.548360@montanaro.dyndns.org>
Message-ID: <4355DFA2.3000801@v.loewis.de>

skip at pobox.com wrote:
> Can you let us know again the magic incantation to check out the source from
> the repository?

See

http://www.python.org/dev/svn.html

It's (say)

svn co svn+ssh://pythondev at svn.python.org/python/trunk/Misc

for read-write access, and

svn co http://svn.python.org/projects/python/trunk/Misc

for read-only access; viewcvs is at

http://svn.python.org/view

Regards,
Martin

From martin at v.loewis.de  Wed Oct 19 07:55:57 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 19 Oct 2005 07:55:57 +0200
Subject: [Python-Dev] Migrating to subversion
In-Reply-To: <bbaeab100510181739i60a27c5fk840c8a9d2a64526a@mail.gmail.com>
References: <435574D5.2040604@v.loewis.de>	
	<17237.33364.928502.548360@montanaro.dyndns.org>
	<bbaeab100510181739i60a27c5fk840c8a9d2a64526a@mail.gmail.com>
Message-ID: <4355DFED.2020004@v.loewis.de>

Brett Cannon wrote:
> And any other problems people come across or questions they have about
> Subversion and its use, please do ask.  I will try to start a new
> section in the dev FAQ for svn-specific issues.

Please integrate

http://www.python.org/dev/svn.html

(linked from 1.3 of devfaq.html) if possible.

Regards,
Martin

From stefan.rank at ofai.at  Wed Oct 19 09:01:21 2005
From: stefan.rank at ofai.at (Stefan Rank)
Date: Wed, 19 Oct 2005 09:01:21 +0200
Subject: [Python-Dev] properties and block statement
In-Reply-To: <1129655834.8244.9.camel@fsol>
References: <ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>	<20051018041112.GA14975@panix.com>	<20051018094159.37EE.JCARLSON@uci.edu>
	<1129655834.8244.9.camel@fsol>
Message-ID: <4355EF41.4010803@ofai.at>

on 18.10.2005 19:17 Antoine Pitrou said the following:
>>    What would this mythical block statement look like that would make
>>properties easier to write than the above late-binding or the subclass
>>Property recipe?
> 
> I suppose something like:
> 
> class C(object):
>     x = prop:
>         """ Yay for property x! """
>         def __get__(self):
>             return self._x
>         def __set__(self, value):
>             self._x = value
> 
> and then:
> 
> def prop(@block):
>     return property(
>         fget=block.get("__get__"),
>         fset=block.get("__set__"),
>         fdel=block.get("__delete__"),
>         doc=block.get("__doc__", ""),
>     )
> 
> (where "@block" would be the syntax to refer to block args as a dict,
> the same way "**kwargs" already exists)
> 

I think there is no need for a special @syntax for this to work.

I suppose it would be possible to allow a trailing block after any 
function invocation, with the effect of creating a new namespace that 
gets treated as containing keyword arguments.

No additional function needed for the property example::

   class C(object):
       x = property():
           doc = """ Yay for property x! """
           def fget(self):
               return self._x
           def fset(self, value):
               self._x = value


(This does not help with the problem of overridability though...)

A drawback is that such a "keyword block" would only be possible for the 
last function invocation of a statement.
Although the block could also be inside the method invocation 
parentheses? I do not think that this is a pretty sight but I'll spell 
it out anyways ;-) ::

   class C(object):
       x = property(:
           doc = """ Yay for property x! """
           def fget(self):
               return self._x
           def fset(self, value):
               self._x = value
       )


--stefan


From michele.simionato at gmail.com  Wed Oct 19 10:51:50 2005
From: michele.simionato at gmail.com (Michele Simionato)
Date: Wed, 19 Oct 2005 08:51:50 +0000
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>
	<ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>
Message-ID: <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>

On 10/18/05, Guido van Rossum <guido at python.org> wrote:
> I wonder if at some point in the future Python will have to develop a
> macro syntax so that you can write
>
>     Property foo:
>         def get(self): return self._foo
>         ...etc...

This reminds me of an idea I have kept in my drawer for a couple of years or so.
Here is my proposition: we could have the statement syntax

<callable> <name> <tuple>:
   <definitions>

to be syntactic sugar for

<name> = <callable>(<name>, <tuple>, <dict-of-definitions>)

For instance properties could be defined as follows:

def Property(name, args, dic):
    return property(
       dic.get('fget'), dic.get('fset'), dic.get('fdel'), dic.get('__doc__'))

Property p():
    "I am a property"
    def fget(self):
        pass
    def fset(self, value):
        pass
    def fdel(self):
        pass

Another typical use case could be a dispatcher:

class Dispatcher(object):
    def __init__(self, name, args, dic):
        self.dic = dic
    def __call__(self, action, *args, **kw):
        return self.dic.get(action)(*args, **kw)

Dispatcher dispatch(action):
  def do_this():
     pass
  def do_that():
     pass
  def default():
     pass

dispatch('do_this')

Notice that the proposal is already implementable by abusing the class
statement:

class <name> <tuple>:
   __metaclass__ = <callable>
   <definitions>

But abusing metaclasses for this task is ugly. BTW, if the proposal was
implemented, the 'class' would become redundant and could be replaced
by 'type':

class <classname> <bases>:
   <definitions>

<=>

type <classname> <bases>:
   <definitions>


;)
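
The class-statement workaround mentioned above can be sketched as
follows. This version uses the metaclass= keyword of later Python
versions rather than __metaclass__, and the fallback to 'default' is an
assumption about the intended behaviour, not something stated in the
proposal:

```python
class Dispatcher(object):
    def __init__(self, name, args, dic):
        self.name = name
        self.dic = dic            # namespace produced by the class body

    def __call__(self, action, *args, **kw):
        # Assumed behaviour: unknown actions fall back to 'default'.
        handler = self.dic.get(action, self.dic['default'])
        return handler(*args, **kw)


# The class statement calls Dispatcher('dispatch', (), <namespace>) and
# binds the result, so 'dispatch' is a Dispatcher instance, not a class.
class dispatch(metaclass=Dispatcher):
    def do_this():
        return 'this'

    def do_that():
        return 'that'

    def default():
        return 'default'
```

With this in place, dispatch('do_this') calls the do_this function that
was defined in the body.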

               Michele Simionato

From duncan.booth at suttoncourtenay.org.uk  Wed Oct 19 11:11:16 2005
From: duncan.booth at suttoncourtenay.org.uk (Duncan Booth)
Date: Wed, 19 Oct 2005 10:11:16 +0100
Subject: [Python-Dev] properties and block statement
References: <1129655834.8244.9.camel@fsol> <4355EF41.4010803@ofai.at>
Message-ID: <n2m-g.Xns96F467A1938F5duncanrcpcouk@127.0.0.1>

Stefan Rank <stefan.rank at ofai.at> wrote in news:4355EF41.4010803 at ofai.at:

> I think there is no need for a special @syntax for this to work.
> 
> I suppose it would be possible to allow a trailing block after any 
> function invocation, with the effect of creating a new namespace that 
> gets treated as containing keyword arguments.
> 

I suspect that without any syntax changes at all it will be possible (for 
some stack crawling implementation of 'propertycontext', and assuming 
nobody makes property objects immutable) to do:

   class C(object):
       with propertycontext() as x:
           doc = """ Yay for property x! """
           def fget(self):
               return self._x
           def fset(self, value):
               self._x = value

for inheritance you would have to specify the base property:

    class D(C):
       with propertycontext(C.x) as x:
           def fset(self, value):
               self._x = value+1


propertycontext could look something like:

import sys
from contextlib import contextmanager
@contextmanager
def propertycontext(parent=None):
    classframe = sys._getframe(2)
    cvars = classframe.f_locals
    marker = object()
    keys = ('fget', 'fset', 'fdel', 'doc')
    old = [cvars.get(key, marker) for key in keys]

    if parent:
        pvars = [getattr(parent, key) for key in
            ('fget', 'fset', 'fdel', '__doc__')]
    else:
        pvars = [None]*4

    args = dict(zip(keys, pvars))

    prop = property()
    try:
        yield prop
        for key, orig in zip(keys, old):
            v = cvars.get(key, marker)
            if v is not orig:
                args[key] = v
        prop.__init__(**args)
    finally:
        for k, v in zip(keys, old):
            if v is marker:
                if k in cvars:
                    del cvars[k]
            else:
                cvars[k] = v

From ncoghlan at gmail.com  Wed Oct 19 11:25:05 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 19 Oct 2005 19:25:05 +1000
Subject: [Python-Dev] Defining properties - a use case for class
	decorators?
In-Reply-To: <20051018095537.37F1.JCARLSON@uci.edu>
References: <fb6fbf560510180637l1f4a3ce1p533ab9353ebeab21@mail.gmail.com>
	<4354FFA7.6020204@gmail.com> <20051018095537.37F1.JCARLSON@uci.edu>
Message-ID: <435610F1.60207@gmail.com>

Josiah Carlson wrote:
>> Another option would be to allow attribute reference targets when binding 
>> function names:
> 
> *shivers at the proposal*  That's scary.  It took me a few minutes just
> to figure out what the heck that was supposed to do.

Yeah, I think it's a concept with many, many more downsides than upsides. A 
"given" or "where" clause based solution would be far easier to read:

   x.get = f given:
       def f(): pass

A given clause has its own problems though (the out-of-order execution it 
involves being the one which seems to raise the most hackles).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From p.f.moore at gmail.com  Wed Oct 19 11:42:25 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 19 Oct 2005 10:42:25 +0100
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>
	<ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
Message-ID: <79990c6b0510190242l10dee1b8va08d9245252dcf8d@mail.gmail.com>

On 10/19/05, Michele Simionato <michele.simionato at gmail.com> wrote:
> On 10/18/05, Guido van Rossum <guido at python.org> wrote:
> > I wonder if at some point in the future Python will have to develop a
> > macro syntax so that you can write
> >
> >     Property foo:
> >         def get(self): return self._foo
> >         ...etc...
>
> This reminds me of an idea I have kept in my drawer for a couple of years or so.
> Here is my proposition: we could have the statement syntax
>
> <callable> <name> <tuple>:
>   <definitions>
>
> to be syntactic sugar for
>
> <name> = <callable>(<name>, <tuple>, <dict-of-definitions>)

Cor. That looks like very neat/scary stuff. I'm not sure if I feel
that that is a good thing or a bad thing :-)

One question - in the expansion, "name" is used on both sides of the
assignment. Consider

    something name():
        <definitions>

This expands to

    name = something(name, (), <dict>)

What should happen if name wasn't defined before? A literal
translation will result in a NameError. Maybe an expansion

    name = something('name', (), <dict>)

would be better (ie, the callable gets the *name* of the target as an
argument, rather than the old value).

Also, the <definitions> bit needs some clarification. I'm guessing
that it would be a suite, executed in a new, empty namespace, and the
<dict-of-definitions> is the resulting modified namespace (with
__builtins__ removed?)

In other words, take <definitions>, and do

    d = {}
    exec <definitions> in d
    del d['__builtins__']

then <dict-of-definitions> is the resulting value of d.
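
A runnable sketch of that expansion, written with the function form of
exec from later Python versions. The helper name run_definitions and the
sample source string are invented for the illustration:

```python
source = '''
doc = "I am a property"
def fget(self):
    return self._p
def fset(self, value):
    self._p = value
'''


def run_definitions(source):
    """Execute a suite in a fresh namespace and return what it defined."""
    d = {}
    exec(source, d)               # exec adds '__builtins__' to d
    del d['__builtins__']
    return d


definitions = run_definitions(source)


class C(object):
    # What a callable receiving the dict of definitions might build.
    p = property(definitions['fget'], definitions['fset'],
                 None, definitions['doc'])
```

After the call, definitions holds only the names bound by the suite
(doc, fget and fset here).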

Interesting idea...

Paul.

From ncoghlan at gmail.com  Wed Oct 19 11:47:17 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 19 Oct 2005 19:47:17 +1000
Subject: [Python-Dev] Defining properties - a use case for class
	decorators?
In-Reply-To: <5.1.1.6.0.20051018112713.01f10250@mail.telecommunity.com>
References: <fb6fbf560510180637l1f4a3ce1p533ab9353ebeab21@mail.gmail.com>
	<fb6fbf560510180637l1f4a3ce1p533ab9353ebeab21@mail.gmail.com>
	<5.1.1.6.0.20051018112713.01f10250@mail.telecommunity.com>
Message-ID: <43561625.4030902@gmail.com>

Phillip J. Eby wrote:
> Note that a "where" or "given" statement like this could make it a 
> little easier to drop lambda.

I think the "lambda will disappear in Py3k" concept might have been what 
triggered the original 'where' statement discussion.

The idea was to be able to lift an arbitrary subexpression out of a function 
call or assignment statement without having to worry about affecting the 
surrounding namespace, and without distracting attention from the original 
statement. Basically, let a local refactoring *stay* local.

The discussion wandered fairly far afield from that original goal though.

One reason it fell apart was trying to answer the seemingly simple question 
"What would this print?":

   def f():
      a = 1
      b = 2
      print 1, locals()
      print 3, locals() given:
          a = 2
          c = 3
          print 2, locals()
      print 4, locals()

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From michele.simionato at gmail.com  Wed Oct 19 12:10:53 2005
From: michele.simionato at gmail.com (Michele Simionato)
Date: Wed, 19 Oct 2005 10:10:53 +0000
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <79990c6b0510190242l10dee1b8va08d9245252dcf8d@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au> <4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>
	<ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
	<79990c6b0510190242l10dee1b8va08d9245252dcf8d@mail.gmail.com>
Message-ID: <4edc17eb0510190310o637d141bvaaad939b0f3c072b@mail.gmail.com>

On 10/19/05, Paul Moore <p.f.moore at gmail.com> wrote:
>
> One question - in the expansion, "name" is used on both sides of the
> assignment. Consider
>
>     something name():
>         <definitions>
>
> This expands to
>
>     name = something(name, (), <dict>)
>
> What should happen if name wasn't defined before? A literal
> translation will result in a NameError. Maybe an expansion
>
>     name = something('name', (), <dict>)
>
> would be better (ie, the callable gets the *name* of the target as an
> argument, rather than the old value).
>
> Also, the <definitions> bit needs some clarification. I'm guessing
> that it would be a suite, executed in a new, empty namespace, and the
> <dict-of-definitions> is the resulting modified namespace (with
> __builtins__ removed?)
>
> In other words, take <definitions>, and do
>
>     d = {}
>     exec <definitions> in d
>     del d['__builtins__']
>
> then <dict-of-definitions> is the resulting value of d.
>
> Interesting idea...
>
> Paul.
>

<name> would be a string and <dict-of-definitions> a dictionary.
As I said, the semantics would be exactly the same as the current
way of doing it:

class <name> <args>:
    __metaclass__ = <callable>

I am just advocating syntactic sugar; the functionality is already there.
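
A small sketch of the "already there" functionality, written with the
Python 3 spelling `metaclass=` instead of the `__metaclass__` attribute.
The `Property` helper below is an assumption made for illustration; it is
not an existing builtin:

```python
def Property(name, bases, namespace):
    # Called as Property('x', (), {...}); the name arrives as a string.
    return property(
        namespace.get("fget"),
        namespace.get("fset"),
        namespace.get("fdel"),
        namespace.get("__doc__"),
    )


class C(object):
    class x(metaclass=Property):
        "I am a property"

        def fget(self):
            return self._x

        def fset(self, value):
            self._x = value


c = C()
c.x = 42
print(c.x)            # 42
print(C.x.__doc__)    # I am a property
```

Any callable taking (name, bases, dict) can stand in for the metaclass
here; the class statement simply binds whatever it returns.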

           Michele Simionato

From solipsis at pitrou.net  Wed Oct 19 13:12:26 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 19 Oct 2005 13:12:26 +0200
Subject: [Python-Dev] Definining properties - a use case for
	class	decorators?
In-Reply-To: <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>
	<ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
Message-ID: <1129720346.6215.18.camel@fsol>


Hi Michele,

> Property p():
>     "I am a property"
>     def fget(self):
>         pass
>     def fset(self):
>         pass
>     def fdel(self):
>         pass

In effect this is quite similar to the proposal I made (except that
you've reversed the traditional assignment order from "p = Property()"
to "Property p()")

If others find it interesting and Guido doesn't frown on it, maybe we
should sit down and start writing a PEP ?

ciao

Antoine.



From janc13 at gmail.com  Wed Oct 19 13:59:59 2005
From: janc13 at gmail.com (JanC)
Date: Wed, 19 Oct 2005 13:59:59 +0200
Subject: [Python-Dev] Pythonic concurrency - offtopic
In-Reply-To: <20051013220748.9195.JCARLSON@uci.edu>
References: <434B61ED.4080503@intercable.ru>
	<20051013220748.9195.JCARLSON@uci.edu>
Message-ID: <984838bf0510190459o695d99bcqa44e7eb072cf230d@mail.gmail.com>

On 10/14/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> Until Microsoft adds kernel support for fork, don't expect standard
> Windows Python to support it.

AFAIK the NT kernel has support for fork, but the Win32 subsystem
doesn't support it (you can only use it with the POSIX subsystem).

--
JanC

From jimjjewett at gmail.com  Wed Oct 19 15:23:35 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 19 Oct 2005 09:23:35 -0400
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
Message-ID: <fb6fbf560510190623g7d979666n42abf86b95d1323a@mail.gmail.com>

(In http://mail.python.org/pipermail/python-dev/2005-October/057251.html)
Eyal Lotem wrote:

> Name: Attribute access for all namespaces ...

>       global x ; x = 1
> Replaced by:
>       module.x = 1

Attribute access as an option would be nice, but might be slower.

Also note that one common use for a __dict__ is that you don't
know what keys are available; meeting this use case with
attribute access would require some extra machinery, such as
an iterator over attributes.

-jJ

From jimjjewett at gmail.com  Wed Oct 19 15:44:09 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 19 Oct 2005 09:44:09 -0400
Subject: [Python-Dev] Defining properties - a use case for class decorators?
Message-ID: <fb6fbf560510190644u67bed4f8w796b0abec495914d@mail.gmail.com>

(In http://mail.python.org/pipermail/python-dev/2005-October/057409.html,)
Nick Coghlan suggested allowing attribute references as binding targets.

>    x = property("Property x (must be less than 5)")

>    def x.get(instance):  ...

Josiah shivered and said it was hard to tell what was even intended, and
(in http://mail.python.org/pipermail/python-dev/2005-October/057437.html)
Nick agreed that it was worse than

>    x.get = f given:
>        def f(): ...

Could someone explain to me why it is worse?

I understand not wanting to modify object x outside of its definition.

I understand that there is some trickiness about instancemethods
and bound variables.

But these objections seem equally strong for both forms, as well
as for the current "equivalent" of

    def f(): ...
    x.get = f

The first form (def x.get) at least avoids repeating (or even creating)
the temporary function name.

The justification for decorators was to solve this very problem within
a module or class.  How is this different?  Is it just that attributes
shouldn't be functions, and this might encourage the practice?

-jJ

From steven.bethard at gmail.com  Wed Oct 19 17:38:12 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed, 19 Oct 2005 09:38:12 -0600
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>
	<ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
Message-ID: <d11dcfba0510190838g3629f183r616485df4f8ea252@mail.gmail.com>

Michele Simionato wrote:
> This reminds me of an idea I have kept in my drawer for a couple of years or so.
> Here is my proposition: we could have the statement syntax
>
> <callable> <name> <tuple>:
>    <definitions>
>
> to be syntactic sugar for
>
> <name> = <callable>(<name>, <tuple>, <dict-of-definitions>)
>
[snip]
> BTW, if the proposal was implemented, the 'class' would become
> redundant and could be replaced by 'type':
>
> class <classname> <bases>:
>    <definitions>
>
> <=>
>
> type <classname> <bases>:
>    <definitions>

Wow, that's really neat.  And you save a keyword! ;-)

I'd definitely like to see a PEP.
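
The class/type equivalence can already be seen with the three-argument
form of type(). A short sketch (the names `Base`, `Foo` and `shout` are
just examples):

```python
class Base(object):
    def greet(self):
        return "hello from " + type(self).__name__


def shout(self):
    return self.greet().upper()


# Roughly what `class Foo(Base): ...` with a `shout` method boils down to.
Foo = type("Foo", (Base,), {"shout": shout})

print(Foo().shout())   # HELLO FROM FOO
```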

STeVE
--
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From pje at telecommunity.com  Wed Oct 19 17:42:05 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 19 Oct 2005 11:42:05 -0400
Subject: [Python-Dev] Defining properties - a use case for class
 decorators?
In-Reply-To: <43561625.4030902@gmail.com>
References: <5.1.1.6.0.20051018112713.01f10250@mail.telecommunity.com>
	<fb6fbf560510180637l1f4a3ce1p533ab9353ebeab21@mail.gmail.com>
	<fb6fbf560510180637l1f4a3ce1p533ab9353ebeab21@mail.gmail.com>
	<5.1.1.6.0.20051018112713.01f10250@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051019113729.01fa79f8@mail.telecommunity.com>

At 07:47 PM 10/19/2005 +1000, Nick Coghlan wrote:
>Phillip J. Eby wrote:
> > Note that a "where" or "given" statement like this could make it a
> > little easier to drop lambda.
>
>I think the "lambda will disappear in Py3k" concept might have been what
>triggered the original 'where' statement discussion.
>
>The idea was to be able to lift an arbitrary subexpression out of a function
>call or assignment statement without having to worry about affecting the
>surrounding namespace, and without distracting attention from the original
>statement. Basically, let a local refactoring *stay* local.
>
>The discussion wandered fairly far afield from that original goal though.
>
>One reason it fell apart was trying to answer the seemingly simple question
>"What would this print?":
>
>    def f():
>       a = 1
>       b = 2
>       print 1, locals()
>       print 3, locals() given:
>           a = 2
>           c = 3
>           print 2, locals()
>       print 4, locals()

It would print "SyntaxError", because the 'given' or 'where' clause should 
only work on an expression or assignment statement, not print.  :)

In Python 3000, where print is a function, it should print the numbers in 
sequence, with 1+4 showing the outer locals, and 2+3 showing the inner 
locals (not including 'b', since b is not a local variable in the nested 
block).

I don't see what's hard about the question, if you view the block as syntax 
sugar for a function definition and invocation on the right hand side of an 
assignment.

Of course, if you assume it can occur on *any* statement (e.g. print), I 
suppose things could seem more hairy.
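
A sketch of that reading, with the block written out by hand as a nested
function (Python 3 syntax; `given_block` is a made-up name, and the
snapshots are returned rather than printed so they can be inspected):

```python
def f():
    a = 1
    b = 2
    before = dict(locals())        # step 1: outer locals, a and b

    def given_block():
        a = 2
        c = 3
        return dict(locals())      # steps 2 and 3: inner locals, a and c

    inside = given_block()
    after = dict(locals())         # step 4: outer a and b are unchanged
    return before, inside, after


before, inside, after = f()
print(before)   # {'a': 1, 'b': 2}
print(inside)   # {'a': 2, 'c': 3}
```

'b' is absent from the inner snapshot because the nested function never
refers to it, so it is not one of its local or free variables.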


From jeremy at alum.mit.edu  Wed Oct 19 17:46:01 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Wed, 19 Oct 2005 11:46:01 -0400
Subject: [Python-Dev] AST branch merge status
In-Reply-To: <e8bf7a530510162026m22e29e91p70fc8c7a016ebeea@mail.gmail.com>
References: <e8bf7a530510152230h7d0382ccx1ff7f1727877bd08@mail.gmail.com>
	<e8bf7a530510162026m22e29e91p70fc8c7a016ebeea@mail.gmail.com>
Message-ID: <e8bf7a530510190846m28f3310dje32f32f44bbd22bf@mail.gmail.com>

I'm still making slow progress on this front.  I have a version
merged to the CVS head.  I'd like to make a final pass over the patch.
I'd upload it to SF, but I can't connect to a web server there.  If
anyone would like to eyeball that patch before I commit it, I can
email it to you.

Jeremy

On 10/16/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
> Real life interfered with the planned merge tonight.  I hope you'll
> all excuse me and wait until tomorrow night.
>
> Jeremy
>
> On 10/16/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
> > I just merged the head back to the AST branch for what I hope is the
> > last time.  I plan to merge the branch to the head on Sunday evening.
> > I'd appreciate it if folks could hold off on making changes on the
> > trunk until that merge happens.
> >
> > If this is a non-trivial inconvenience for anyone, go ahead with the
> > changes but send me mail to  make sure that I don't lose the changes
> > when I do the merge.  Regardless, the compiler and Grammar are off
> > limits.  I'll blow away any changes you make there <0.1 wink>.
> >
> > Jeremy
> >
> >
>

From skip at pobox.com  Wed Oct 19 18:43:59 2005
From: skip at pobox.com (skip@pobox.com)
Date: Wed, 19 Oct 2005 11:43:59 -0500
Subject: [Python-Dev] Definining properties - a use case for class
 decorators?
In-Reply-To: <d11dcfba0510190838g3629f183r616485df4f8ea252@mail.gmail.com>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>
	<ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
	<d11dcfba0510190838g3629f183r616485df4f8ea252@mail.gmail.com>
Message-ID: <17238.30671.916987.907009@montanaro.dyndns.org>

    >> <callable> <name> <tuple>:
    >>     <definitions>
    ...

    Steve> Wow, that's really neat.  And you save a keyword! ;-)

Two if you add a builtin called "function" (get rid of "def").

Skip

From pje at telecommunity.com  Wed Oct 19 18:49:47 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 19 Oct 2005 12:49:47 -0400
Subject: [Python-Dev] Definining properties - a use case for class
 decorators?
In-Reply-To: <17238.30671.916987.907009@montanaro.dyndns.org>
References: <d11dcfba0510190838g3629f183r616485df4f8ea252@mail.gmail.com>
	<43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>
	<ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
	<d11dcfba0510190838g3629f183r616485df4f8ea252@mail.gmail.com>
Message-ID: <5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com>

At 11:43 AM 10/19/2005 -0500, skip at pobox.com wrote:
>     >> <callable> <name> <tuple>:
>     >>     <definitions>
>     ...
>
>     Steve> Wow, that's really neat.  And you save a keyword! ;-)
>
>Two if you add a builtin called "function" (get rid of "def").

Not unless the tuple is passed in as an abstract syntax tree or something.


From jcarlson at uci.edu  Wed Oct 19 20:01:06 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 19 Oct 2005 11:01:06 -0700
Subject: [Python-Dev] Pythonic concurrency - offtopic
In-Reply-To: <984838bf0510190459o695d99bcqa44e7eb072cf230d@mail.gmail.com>
References: <20051013220748.9195.JCARLSON@uci.edu>
	<984838bf0510190459o695d99bcqa44e7eb072cf230d@mail.gmail.com>
Message-ID: <20051019103148.380E.JCARLSON@uci.edu>


JanC <janc13 at gmail.com> wrote:
> 
> On 10/14/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> > Until Microsoft adds kernel support for fork, don't expect standard
> > Windows Python to support it.
> 
> AFAIK the NT kernel has support for fork, but the Win32 subsystem
> doesn't support it (you can only use it with the POSIX subsystem).

Good to know.  But if I remember subsystem semantics properly, you can
use only a single subsystem at a time, so if one wanted to use fork from
the POSIX subsystem, one would necessarily have to massage the rest of
Python into NT's POSIX subsystem, which could be a problem because
NT/2K's POSIX subsystem doesn't support network interfaces, memory
mapped files, ...
    http://msdn2.microsoft.com/en-us/library/y23kc048

Based on this page:
    http://www.cygwin.com/cygwin-ug-net/highlights.html
...it does seem possible to borrow cygwin's implementation of fork for
use on win32, but I would guess that most people would be disappointed
with its performance in comparison to unix fork.

 - Josiah


From jcarlson at uci.edu  Wed Oct 19 20:03:13 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 19 Oct 2005 11:03:13 -0700
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
In-Reply-To: <fb6fbf560510190623g7d979666n42abf86b95d1323a@mail.gmail.com>
References: <fb6fbf560510190623g7d979666n42abf86b95d1323a@mail.gmail.com>
Message-ID: <20051019105405.3817.JCARLSON@uci.edu>


Jim Jewett <jimjjewett at gmail.com> wrote:
> 
> (In http://mail.python.org/pipermail/python-dev/2005-October/057251.html)
> Eyal Lotem wrote:
> 
> > Name: Attribute access for all namespaces ...
> 
> >       global x ; x = 1
> > Replaced by:
> >       module.x = 1
> 
> Attribute access as an option would be nice, but might be slower.
> 
> Also note that one common use for a __dict__ is that you don't
> know what keys are available; meeting this use case with
> attribute access would require some extra machinery, such as
> an iterator over attributes.

This particular use case is easily handled.  Put the following once at
the top of the module...

module = __import__(__name__)

Then one can access (though perhaps not quickly) the module-level
variables for that module.  To access attributes, it is a quick scan
through module.__dict__, dir(), or vars().
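
A small sketch of that scan. To keep it self-contained it uses a stand-in
module object built with types.ModuleType rather than
`__import__(__name__)`; the name `settings` is an assumption made for the
example:

```python
import types

settings = types.ModuleType("settings")   # stand-in for a real module
settings.x = 1                             # attribute-style "global x; x = 1"
settings.y = 2

# vars() exposes the same dict as settings.__dict__, so the available
# names can be discovered without knowing them in advance.
public = sorted(n for n in vars(settings) if not n.startswith("__"))
print(public)                  # ['x', 'y']
print(settings.__dict__["x"])  # 1
```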


Want to make that automatic?  Write an import hook that puts a reference
to the module in the module itself on load/reload.

 - Josiah


From jcarlson at uci.edu  Wed Oct 19 20:58:06 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 19 Oct 2005 11:58:06 -0700
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
In-Reply-To: <76fd5acf0510161037v477874b0w5595e3edffe71511@mail.gmail.com>
References: <76fd5acf0510161036i4ab09e2cu39bd6961a60df783@mail.gmail.com>
	<76fd5acf0510161037v477874b0w5595e3edffe71511@mail.gmail.com>
Message-ID: <20051019110251.381A.JCARLSON@uci.edu>


Calvin Spealman <ironfroggy at gmail.com> wrote:
> On 10/16/05, Josiah Carlson <jcarlson at uci.edu> wrote:
[snip]
> > What I'm saying is that whether or not you can modify the contents of
> > stack frames via tricks, you shouldn't.  Why?  Because as I said, if the
> > writer wanted you to be hacking around with a namespace, they should
> > have passed you a shared namespace.
> >
> > From what I understand, there are very few (good) reasons why a user
> > should muck with stack frames, among them because it is quite convenient
> > to write custom traceback printers (like web CGI, etc.), and if one is
> > tricky, limit the callers of a function/method to those "allowable".
> > There may be other good reasons, but until you offer a use-case that is
> > compelling for reasons why it should be easier to access and/or modify
> > the contents of stack frames, I'm going to remain at -1000.
> 
> I think I was wording this badly. I meant to suggest this as a way to
> define nested functions (or classes?) and probably access names from
> various levels of scope. In this way, a nested function would be able
> to say "bind the name 'a' in the namespace in which I am defined to
> this object", thus offering a more fine-grained approach than the
> current global keyword. I know there has been talk of this issue
> before, but I don't know if it works with or against anything said for
> this previously.

And as I have said, if you want people to modify a namespace, you should
be creating a namespace and passing it around.  If you want people to
have access to some embedded definition, then you expose it.  If some
writer of some module/class/whatever decides that they want to embed
some thing that you think should have been exposed to the outside world,
then complain to the writer that they have designed it poorly.
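
As a minimal sketch of "create a namespace and pass it around" (Python 3;
types.SimpleNamespace is used here as one convenient container, and
`make_counter` is a made-up example):

```python
from types import SimpleNamespace


def make_counter(shared):
    # The nested function rebinds a name in the namespace it was handed,
    # instead of reaching into anybody's stack frame.
    def bump():
        shared.count += 1
        return shared.count
    return bump


state = SimpleNamespace(count=0)
bump = make_counter(state)
bump()
bump()
print(state.count)   # 2
```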

Take a walk through the standard library.  You will likely note the
rarity of embedded function/class definitions.  In those cases where
they are used, it is generally for a good reason.

You will also note the general rarity of stack frame access.  Prior to
the cycle-removing garbage collector, this was because accessing stack
frames could result in memory leaks of stack frames.  You may also note
the rarity of modification of stack frame contents (I'm not sure there
are any), which can be quite dangerous.  Right now it is difficult to go
and access the value of a local 'x' three callers above you in the stack
frame.  I think this is great, working as intended in fact.  Being able
to read and/or modify arbitrary stack frame contents, and/or being able
to pass those stack frames around: foo(frame[3]), is quite dangerous.

I'm still -1000.

 - Josiah


From skip at pobox.com  Wed Oct 19 21:22:06 2005
From: skip at pobox.com (skip@pobox.com)
Date: Wed, 19 Oct 2005 14:22:06 -0500
Subject: [Python-Dev] Definining properties - a use case for class
 decorators?
In-Reply-To: <5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com>
References: <d11dcfba0510190838g3629f183r616485df4f8ea252@mail.gmail.com>
	<43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>
	<ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
	<5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com>
Message-ID: <17238.40158.735826.504410@montanaro.dyndns.org>

>>>>> "Phillip" == Phillip J Eby <pje at telecommunity.com> writes:

    Phillip> At 11:43 AM 10/19/2005 -0500, skip at pobox.com wrote:
    >> >> <callable> <name> <tuple>:
    >> >>     <definitions>
    >> ...
    >> 
    Steve> Wow, that's really neat.  And you save a keyword! ;-)
    >> 
    >> Two if you add a builtin called "function" (get rid of "def").

    Phillip> Not unless the tuple is passed in as an abstract syntax tree or
    Phillip> something.

Hmmm...  Maybe I misread something then.  I saw (I think) that

    type Foo (base):
        def __init__(self):
            pass

would be equivalent to

    class Foo (base):
        def __init__(self):
            pass

and thought that

    function myfunc(arg1, arg2):
        pass

would be equivalent to

    def myfunc(arg1, arg2):
        pass

where "function" is a builtin that, when called, returns a new function.

Skip

From jcarlson at uci.edu  Wed Oct 19 21:46:12 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 19 Oct 2005 12:46:12 -0700
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <17238.40158.735826.504410@montanaro.dyndns.org>
References: <5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com>
	<17238.40158.735826.504410@montanaro.dyndns.org>
Message-ID: <20051019124508.3836.JCARLSON@uci.edu>


skip at pobox.com wrote:
> 
> >>>>> "Phillip" == Phillip J Eby <pje at telecommunity.com> writes:
> 
>     Phillip> At 11:43 AM 10/19/2005 -0500, skip at pobox.com wrote:
>     >> >> <callable> <name> <tuple>:
>     >> >>     <definitions>
>     >> ...
>     >> 
>     Steve> Wow, that's really neat.  And you save a keyword! ;-)
>     >> 
>     >> Two if you add a builtin called "function" (get rid of "def").
> 
>     Phillip> Not unless the tuple is passed in as an abstract syntax tree or
>     Phillip> something.
> 
> Hmmm...  Maybe I misread something then.  I saw (I think) that
> 
>     type Foo (base):
>         def __init__(self):
>             pass
> 
> would be equivalent to
> 
>     class Foo (base):
>         def __init__(self):
>             pass
> 
> and thought that
> 
>     function myfunc(arg1, arg2):
>         pass
> 
> would be equivalent to
> 
>     def myfunc(arg1, arg2):
>         pass
> 
> where "function" is a builtin that, when called, returns a new function.

For it to work in classes, it would need to execute the body of the
class, which is precisely why it can't work with functions.

 - Josiah


From jcarlson at uci.edu  Wed Oct 19 22:10:30 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 19 Oct 2005 13:10:30 -0700
Subject: [Python-Dev] Defining properties - a use case for class
	decorators?
In-Reply-To: <fb6fbf560510190644u67bed4f8w796b0abec495914d@mail.gmail.com>
References: <fb6fbf560510190644u67bed4f8w796b0abec495914d@mail.gmail.com>
Message-ID: <20051019102954.380B.JCARLSON@uci.edu>


Jim Jewett <jimjjewett at gmail.com> wrote:
> (In http://mail.python.org/pipermail/python-dev/2005-October/057409.html,)
> Nick Coghlan suggested allowing attribute references as binding targets.
> 
> >    x = property("Property x (must be less than 5)")
> 
> >    def x.get(instance):  ...
> 
> Josiah shivered and said it was hard to tell what was even intended, and
> (in http://mail.python.org/pipermail/python-dev/2005-October/057437.html)
> Nick agreed that it was worse than
> 
> >    x.get = f given:
> >        def f(): ...
> 
> Could someone explain to me why it is worse?

    def x.get(...): ...

Seems to imply that one is defining a method on x.  This is not the case.
It is also confused by the x.get(instance) terminology that I doubt has
ever seen the light of day in production code.  Instance of what?  Instance
of x?  The class?  ...

I'm personally averse to the 'given:' syntax, if only because under
certain situations, it can be reasonably emulated.


> I understand not wanting to modify object x outside of its definition.
> 
> I understand that there is some trickiness about instancemethods
> and bound variables.
> 
> But these objections seem equally strong for both forms, as well
> as for the current "equivalent" of
> 
>     def f(): ...
>     x.get = f
> 
> The first form (def x.get) at least avoids repeating (or even creating)
> the temporary function name.
> 
> The justification for decorators was to solve this very problem within
> a module or class.  How is this different?  Is it just that attributes
> shouldn't be functions, and this might encourage the practice?

Many will agree that there is a problem with how properties are defined.
There are many proposed solutions, some of which use decorators, custom
subclasses, metaclasses, etc.

I have a problem with it because from the description, you could use...

    def x.y.z.a.b.c.foobarbaz(...):
        ...

...and it would be unclear to the reader or writer what the hell
x.y.z.a.b.c is (class, instance, module), which can come up if the
definition/import of x is far enough away from the definition of
x.<blah> .  Again, ick.

 - Josiah


From pje at telecommunity.com  Wed Oct 19 22:15:33 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 19 Oct 2005 16:15:33 -0400
Subject: [Python-Dev] Definining properties - a use case for class
 decorators?
In-Reply-To: <20051019124508.3836.JCARLSON@uci.edu>
References: <17238.40158.735826.504410@montanaro.dyndns.org>
	<5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com>
	<17238.40158.735826.504410@montanaro.dyndns.org>
Message-ID: <5.1.1.6.0.20051019161405.01fb3218@mail.telecommunity.com>

At 12:46 PM 10/19/2005 -0700, Josiah Carlson wrote:
>skip at pobox.com wrote:
> > >>>>> "Phillip" == Phillip J Eby <pje at telecommunity.com> writes:
> >
> >     Phillip> Not unless the tuple is passed in as an abstract syntax tree or
> >     Phillip> something.
> >
> > Hmmm...  Maybe I misread something then.  I saw (I think) that
> >
> >     type Foo (base):
> >         def __init__(self):
> >             pass
> >
> > would be equivalent to
> >
> >     class Foo (base):
> >         def __init__(self):
> >             pass
> >
> > and thought that
> >
> >     function myfunc(arg1, arg2):
> >         pass
> >
> > would be equivalent to
> >
> >     def myfunc(arg1, arg2):
> >         pass
> >
> > where "function" is a builtin that, when called, returns a new function.
>
>For it to work in classes, it would need to execute the body of the
>class, which is precisely why it can't work with functions.

Not only that, but the '(arg1, arg2)' for classes is a tuple of *values*, 
but for functions it's just a function signature, not an expression!  Which 
is why this would effectively have to be a macro facility.
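
The difference in when the two kinds of body run can be seen in a short
sketch (the `events` list is just a recording device for the example):

```python
events = []


class Foo(object):
    events.append("class body ran")       # runs while the class is created

    def __init__(self):
        pass


def myfunc(arg1, arg2):
    events.append("function body ran")    # runs only when myfunc is called


print(events)        # ['class body ran']
myfunc(1, 2)
print(events)        # ['class body ran', 'function body ran']
```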


From fredrik at pythonware.com  Wed Oct 19 22:23:35 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 19 Oct 2005 22:23:35 +0200
Subject: [Python-Dev] Definining properties - a use case for
	classdecorators?
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
Message-ID: <dj6a09$31j$1@sea.gmane.org>

Guido van Rossum wrote:

> OK, so how's this for a radical proposal.
>
> Let's change the property built-in so that its arguments can be either
> functions or strings (or None). If they are functions or None, it
> behaves exactly like it always has.
>
> If an argument is a string, it should be a method name, and the method
> is looked up by that name each time the property is used. Because this
> is late binding, it can be put before the method definitions, and a
> subclass can override the methods. Example:
>
> class C:
>
>     foo = property('getFoo', 'setFoo', None, 'the foo property')
>
>     def getFoo(self):
>         return self._foo
>
>     def setFoo(self, foo):
>         self._foo = foo
>
> What do you think?

+1 from here.
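
For experimenting with the idea without touching the builtin, something
along these lines might do. It is only a sketch: `late_property` is a
hypothetical descriptor, not the real property type, and the subclass `D`
is added to show the override case:

```python
class late_property(object):
    """Hypothetical property variant that accepts method names as strings."""

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget, self.fset, self.fdel = fget, fset, fdel
        self.__doc__ = doc

    def _call(self, obj, accessor, *args):
        if accessor is None:
            raise AttributeError("operation not supported")
        if isinstance(accessor, str):
            # Late binding: look the method up on the instance at use time.
            return getattr(obj, accessor)(*args)
        return accessor(obj, *args)

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return self._call(obj, self.fget)

    def __set__(self, obj, value):
        self._call(obj, self.fset, value)

    def __delete__(self, obj):
        self._call(obj, self.fdel)


class C(object):
    foo = late_property('getFoo', 'setFoo', None, 'the foo property')

    def getFoo(self):
        return self._foo

    def setFoo(self, foo):
        self._foo = foo


class D(C):
    def getFoo(self):               # override is picked up automatically
        return self._foo * 2


d = D()
d.foo = 21
print(d.foo)   # 42
```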

> If you can think of a solution that looks better than mine, you're a genius.

letting "class" inject a slightly magic "self" variable into the class namespace ?

    class C:

        foo = property(self.getFoo, self.setFoo, None, 'the foo property')

        def getFoo(self):
            return self._foo

        def setFoo(self, foo):
            self._foo = foo

(figuring out exactly what "self" should be is left as an exercise etc)

</F>




From guido at python.org  Wed Oct 19 22:53:04 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 19 Oct 2005 13:53:04 -0700
Subject: [Python-Dev] Definining properties - a use case for
	classdecorators?
In-Reply-To: <dj6a09$31j$1@sea.gmane.org>
References: <43525BFA.9090309@iinet.net.au>
	<ca471dc20510160823i1ab5ff15kc42b138a3ced596b@mail.gmail.com>
	<4353016A.1010707@canterbury.ac.nz>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<dj6a09$31j$1@sea.gmane.org>
Message-ID: <ca471dc20510191353u27b73475td05f3ef412e3b1eb@mail.gmail.com>

On 10/19/05, Fredrik Lundh <fredrik at pythonware.com> wrote:
> letting "class" inject a slightly magic "self" variable into the class namespace ?
>
>     class C:
>
>         foo = property(self.getFoo, self.setFoo, None, 'the foo property')
>
>         def getFoo(self):
>             return self._foo
>
>         def setFoo(self, foo):
>             self._foo = foo
>
> (figuring out exactly what "self" should be is left as an exercise etc)

It's magical enough to deserve to be called __self__. But even so:

I've seen proposals like this a few times in other contexts. I may
even have endorsed the idea at one time. The goal is always the same:
forcing delayed evaluation of a getattr operation without using either
a string literal or a lambda. But I find it quite a bit too magical,
for all values of xyzzy, that xyzzy.foo would return a function of one
argument that, when called with an argument x, returns x.foo. Even if
it's easy enough to write the solution (*), that sentence describing
it gives me goosebumps. And the logical consequence, xyzzy.foo(x),
which is an obfuscated way to write x.foo, makes me nervous.

(*) Here's the solution:

class XYZZY(object):
    def __getattr__(self, name):
        return lambda arg: getattr(arg, name)
xyzzy = XYZZY()
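
A short usage sketch of the class above, to make the "function of one
argument" behaviour concrete.  The Point class and the get_x name are
made up for illustration only.

```python
class XYZZY(object):
    def __getattr__(self, name):
        # __getattr__ runs only for attributes not found normally, so
        # every ordinary name yields a deferred one-argument lookup.
        return lambda arg: getattr(arg, name)

xyzzy = XYZZY()

class Point(object):
    # Hypothetical class, used only for this demonstration.
    def __init__(self, x):
        self.x = x

get_x = xyzzy.x              # no Point exists yet; nothing is looked up
p = Point(3)
print(get_x(p))              # the getattr on p happens here
print(xyzzy.x(p) == p.x)     # the "obfuscated way to write p.x"
```

The standard library's operator.attrgetter('x') builds the same kind of
deferred lookup from a string.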

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From blais at furius.ca  Wed Oct 19 22:54:16 2005
From: blais at furius.ca (Martin Blais)
Date: Wed, 19 Oct 2005 16:54:16 -0400
Subject: [Python-Dev] enumerate with a start index
Message-ID: <8393fff0510191354s3676682dk58a4db65edee1fee@mail.gmail.com>

Hi

Just wondering, would anyone think of it as a good idea if the
enumerate() builtin could accept a "start" argument?  I've run across
a few cases where this would have been useful.  It seems generic
enough too.

From michel at cignex.com  Thu Oct 20 00:14:31 2005
From: michel at cignex.com (Michel Pelletier)
Date: Wed, 19 Oct 2005 15:14:31 -0700
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <004801c5d3ec$e29b5360$6402a8c0@arkdesktop>
References: <1129643229.12510.37.camel@localhost>
	<004801c5d3ec$e29b5360$6402a8c0@arkdesktop>
Message-ID: <4356C547.8020402@cignex.com>

Andrew Koenig wrote:
>>  Sure, that would work.  Or even this, if the scheduler would
>>automatically recognize generator objects being yielded and so would run
>>the nested coroutine until it finishes:
> 
> 
> This idea has been discussed before.  I think the problem with recognizing
> generators as the subject of "yield" statements is that then you can't yield
> a generator even if you want to.
> 
> The best syntax I can think of without adding a new keyword looks like this:
> 
> 	yield from x
> 
> which would be equivalent to
> 
> 	for i in x:
> 	    yield i

My eyes really like the syntax, but I wonder about its usefulness.  In
rdflib, particularly here:

http://svn.rdflib.net/trunk/rdflib/backends/IOMemory.py

We yield values from inside for loops all over the place, but the
yielded value is very rarely just the index value (only 1 of 14 yields).
It is usually something calculated from the index value, so the new
syntax would not be useful unless it provided access to the index item
as a variable, like:

yield foo(i) for i in x

which barely saves you anything (a colon, a newline, and an indent). 
(hey wait, isn't that a generator comprehension?  Haven't really 
encountered those yet). Of course rdflib could be the minority case and 
most folks who yield in loops are yielding only the index value directly.

off to read the generator comprehension docs...

-Michel


From michel at cignex.com  Thu Oct 20 01:03:52 2005
From: michel at cignex.com (Michel Pelletier)
Date: Wed, 19 Oct 2005 16:03:52 -0700
Subject: [Python-Dev] enumerate with a start index
In-Reply-To: <8393fff0510191354s3676682dk58a4db65edee1fee@mail.gmail.com>
References: <8393fff0510191354s3676682dk58a4db65edee1fee@mail.gmail.com>
Message-ID: <4356D0D8.9010209@cignex.com>

Martin Blais wrote:
> Hi
> 
> Just wondering, would anyone think of it as a good idea if the
> enumerate() builtin could accept a "start" argument?  I've run across
> a few cases where this would have been useful.  It seems generic
> enough too.

+1, but something more useful might be a cross between enumerate and
zip, where you pass N iterables and it yields N-tuples.  Then you could
do something like:

zipyield(range(10, 20), mygenerator())

and it would be like you wanted for enumerate, but starting from 10 in 
this case.

-Michel




From ark at acm.org  Thu Oct 20 01:17:05 2005
From: ark at acm.org (Andrew Koenig)
Date: Wed, 19 Oct 2005 19:17:05 -0400
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <4356C547.8020402@cignex.com>
Message-ID: <00ad01c5d503$3c529370$6402a8c0@arkdesktop>

> We yield values from inside for loops all over the place, but the
> yielded value is very rarely just the index value (only 1 of 14 yields)
> , but something calculated from the index value, so the new syntax would
> not be useful, unless it was something that provided access to the index
> item as a variable, like:
> 
> yield foo(i) for i in x
> 
> which barely saves you anything (a colon, a newline, and an indent).
> (hey wait, isn't that a generator comprehension? 

Here's a use case:

	def preorder(tree):
		if tree:
			yield tree
			yield from preorder(tree.left)
			yield from preorder(tree.right)
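
For comparison, here is a runnable sketch of the same traversal with
the explicit loops that "yield from" would replace.  The Node class
is a made-up minimal tree node.  The second generator assumes an
interpreter that accepts the proposed syntax; it was only a proposal
in this thread and was later added in Python 3.3.

```python
class Node(object):
    # Hypothetical minimal binary tree node, just for this illustration.
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def preorder(tree):
    # Explicit-loop spelling: each level re-yields whatever the
    # level below it produced.
    if tree:
        yield tree
        for node in preorder(tree.left):
            yield node
        for node in preorder(tree.right):
            yield node

def preorder_delegating(tree):
    # Same traversal, written with the delegation syntax.
    if tree:
        yield tree
        yield from preorder_delegating(tree.left)
        yield from preorder_delegating(tree.right)

tree = Node(1, Node(2, Node(3)), Node(4))
values = [n.value for n in preorder(tree)]
delegated = [n.value for n in preorder_delegating(tree)]
```

Both generators visit the nodes in the same order; the difference is
only in how much re-yielding code has to be written at each level.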




From jcarlson at uci.edu  Thu Oct 20 01:28:29 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 19 Oct 2005 16:28:29 -0700
Subject: [Python-Dev] enumerate with a start index
In-Reply-To: <4356D0D8.9010209@cignex.com>
References: <8393fff0510191354s3676682dk58a4db65edee1fee@mail.gmail.com>
	<4356D0D8.9010209@cignex.com>
Message-ID: <20051019162606.385D.JCARLSON@uci.edu>


Michel Pelletier <michel at cignex.com> wrote:
> 
> Martin Blais wrote:
> > Hi
> > 
> > Just wondering, would anyone think of it as a good idea if the
> > enumerate() builtin could accept a "start" argument?  I've run across
> > a few cases where this would have been useful.  It seems generic
> > enough too.
> 
> +1, but something more useful might be a cross between enumerate and
> zip, where you pass N iterables and it yields N-tuples.  Then you could
> do something like:
> 
> zipyield(range(10, 20), mygenerator())
> 
> and it would be like you wanted for enumerate, but starting from 10 in 
> this case.

All of this already exists.

    from itertools import izip, count

    for i,j in izip(count(start), iterable):
        ...

Read your standard library.
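
A runnable sketch of the same idea for current Python 3, where the
builtin zip() is already lazy and takes the place of izip.  The
enumerate_from name is made up for illustration.  Since Python 2.6,
enumerate() itself also accepts a start argument.

```python
from itertools import count

def enumerate_from(iterable, start=0):
    # Hypothetical helper: pair each item with a counter that
    # begins at `start` instead of 0.
    return zip(count(start), iterable)

pairs = list(enumerate_from("abc", 10))
builtin_pairs = list(enumerate("abc", 10))
print(pairs == builtin_pairs)
```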

 - Josiah


From pje at telecommunity.com  Thu Oct 20 03:40:35 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 19 Oct 2005 21:40:35 -0400
Subject: [Python-Dev] Pre-PEP: Task-local variables
Message-ID: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>

This is still rather rough, but I figured it's easier to let everybody fill 
in the remaining gaps by arguments than it is for me to pick a position I 
like and try to convince everybody else that it's right.  :)  Your feedback 
is requested and welcome.


PEP: XXX
Title: Task-local Variables
Author: Phillip J. Eby <pje at telecommunity.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 19-Oct-2005
Python-Version: 2.5
Post-History: 19-Oct-2005


Abstract
========

Many Python modules provide some kind of global or thread-local state,
which is relatively easy to implement.  With the acceptance of PEP
342, however, co-routines will become more common, and it will be
desirable in many cases to treat each as its own logical thread of
execution. So, many kinds of state that might now be kept as a
thread-specific variable (such as the "current transaction" in ZODB or
the "current database connection" in SQLObject) will not work with
coroutines.

This PEP proposes a simple mechanism akin to thread-local variables,
but which will make it easy and efficient for co-routine schedulers to
switch state between tasks.  The mechanism is proposed for the standard
library because its usefulness is dependent on its adoption by
standard library modules, such as the ``decimal`` module. The proposed
features can be implemented as pure Python code, and as such are
suitable for use by other Python implementations (including older
versions of Python, if desired).


Motivation
==========

PEP 343's new "with" statement makes it very attractive to temporarily
alter some aspect of system state, and then restore it, using a
context manager.  Many of PEP 343's examples are of this nature,
whether they are temporarily redirecting ``sys.stdout``, or
temporarily altering decimal precision.

But when this attractive feature is combined with PEP 342-style
co-routines, a new challenge emerges.  Consider this code, which may
misbehave if run as a co-routine::

         with opening(filename, "w") as f:
             with redirecting_stdout(f):
                 print "Hello world"
                 yield pause(5)
                 print "Goodbye world"

Problems can arise from this code in two ways.  First, the redirection
of output "leaks out" to other coroutines during the pause.  Second,
when this coroutine is finished, it resets stdout to whatever it was
at the beginning of the coroutine, regardless of what another
co-routine might have been using.
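
The first problem can be reproduced with plain generators.  The sketch
below is an illustration only: a dictionary entry stands in for
``sys.stdout``, and the ``redirecting``, ``task_a`` and ``task_b``
names are made up.

```python
from contextlib import contextmanager

state = {"out": "console"}     # stand-in for a global such as sys.stdout
log = []

@contextmanager
def redirecting(target):
    old = state["out"]
    state["out"] = target
    try:
        yield
    finally:
        state["out"] = old

def task_a():
    with redirecting("file-a"):
        log.append(("a", state["out"]))
        yield                  # pause inside the with block
        log.append(("a", state["out"]))

def task_b():
    log.append(("b", state["out"]))
    yield

a, b = task_a(), task_b()
next(a)                        # a runs up to its pause
next(b)                        # b runs while a's redirection is still active
leaked = log[-1]               # what b observed
for _ in a:                    # let a finish so its finally clause runs
    pass
```

Here ``task_b`` records ``"file-a"`` even though it never asked for a
redirection, because the global was still altered while ``task_a``
was paused.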

Similar issues can be demonstrated using the decimal context,
transactions, database connections, etc., which are all likely to be
popular contexts for the "with" statement.  However, if these new
context managers are written to use global or thread-local state,
coroutines will be locked out of the market, so to speak.

Therefore, this PEP proposes to provide and promote a standard way of
managing per-execution-context state, such that coroutine schedulers
can keep each coroutine's state distinct.  If this mechanism is then
used by library modules (such as ``decimal``) to maintain their
current state, then they will be transparently compatible with
co-routines as well as threaded and threadless code.

(Note that for Python 2.x versions, backward compatibility requires
that we continue to allow direct reassignment to e.g. ``sys.stdout``.
So, it will still of course be possible to write code that will
interoperate poorly with co-routines.  But for Python 3.x it seems
worth considering making some of the ``sys`` module's contents into
task-local variables rather than assignment targets.)


Specification
=============

This PEP proposes to offer a standard library module called
``context``, with the following core contents:

Variable
     A class that allows creation of a context variable (see below).

snapshot()
     Returns a snapshot of the current execution context.

swap(ctx)
     Set the current context to `ctx`, returning a snapshot of the
     current context.

The basic idea here is that a co-routine scheduler can switch between
tasks by doing something like::

     last_coroutine.state = context.swap(next_coroutine.state)

Or perhaps more like::

     # ... execute coroutine iteration
     last_coroutine.state = context.snapshot()
     # ... figure out what routine to run next
     context.swap(next_coroutine.state)

Each ``context.Variable`` stores and retrieves its state using the
current execution context, which is thread-specific.  (Thus, each
thread may execute any number of concurrent tasks, although most
practical systems today have only one thread that executes coroutines,
the other threads being reserved for operations that would otherwise
block co-routine execution.  Nonetheless, such other threads will often
still require context variables of their own.)
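
A minimal single-threaded sketch of such a scheduler follows.  It is
not the proposed module: ``swap`` here is a simplified stand-in that
exchanges plain dictionaries, and ``Task``, ``run`` and ``worker`` are
hypothetical names.

```python
_current = {}                  # stand-in for the thread's execution context

def swap(ctx):
    # Install `ctx` and hand back the context that was active before.
    global _current
    old, _current = _current, ctx
    return old

class Task(object):
    def __init__(self, gen):
        self.gen = gen
        self.state = {}        # each task starts with an empty context

def run(tasks):
    # Round-robin until every generator is exhausted.
    tasks = list(tasks)
    while tasks:
        for task in list(tasks):
            saved = swap(task.state)       # enter the task's context
            try:
                next(task.gen)
            except StopIteration:
                tasks.remove(task)
            finally:
                task.state = swap(saved)   # leave it, keeping its changes

seen = []

def worker(name):
    _current["who"] = name     # written into this task's own context
    yield                      # other tasks run here
    seen.append((name, _current.get("who")))

run([Task(worker("a")), Task(worker("b"))])
```

Each worker still sees its own value after the pause, because the
scheduler swaps the context on every switch.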


Context Variable Objects
------------------------

A context variable object provides the following methods:

get(default=None)
     Return the value of the variable in the current execution context,
     or `default` if not set.

set(value)
     Set the value of the variable for the current execution context.

unset()
     Delete the value of the variable for the current execution context.

__call__(*value)
     If called with an argument, return a context manager that sets the
     variable to the specified value, then restores the old value upon
     ``__exit__``.  If called without an argument, return the value of
     the variable for the current execution context, or raise an error
     if no value is set.  Thus::

         with some_variable(value):
              foo()

     would be roughly equivalent to::

         old = some_variable()
         some_variable.set(value)
         try:
             foo()
         finally:
             some_variable.set(old)


Implementation Details
----------------------

The simplest possible implementation is for ``Variable`` objects to
use themselves as unique keys into an execution context dictionary.
The context dictionary would be stored in another dictionary, keyed by
``thread.get_ident()``.  This approach would work with almost any
version or implementation of Python.
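
A rough pure-Python sketch of that simplest implementation is shown
below.  It is an illustration under assumptions, not the author's
prototype: it uses ``threading.get_ident()`` from current Python 3 to
obtain the thread identifier, copies the dictionary on every snapshot,
and leaves out ``__call__``.

```python
import threading

_contexts = {}                 # thread ident -> {Variable: value}

def _current():
    # Fetch (or lazily create) the calling thread's context dictionary.
    return _contexts.setdefault(threading.get_ident(), {})

class Variable(object):
    # Instances hash by identity, so each one is its own unique key.
    def get(self, default=None):
        return _current().get(self, default)

    def set(self, value):
        _current()[self] = value

    def unset(self):
        _current().pop(self, None)

def snapshot():
    # Naive version: copy the whole dictionary every time.
    return dict(_current())

def swap(ctx):
    old = snapshot()
    _contexts[threading.get_ident()] = dict(ctx)
    return old

v = Variable()
v.set(1)
saved = swap({})               # switch to an empty context
inside = v.get()               # v has no value in the new context
v.set(2)
other = swap(saved)            # switch back
restored = v.get()
```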

For efficiency's sake, however, CPython could simply store the
execution context dictionary in its "thread state" structure, creating
an empty dictionary at thread initialization time.  This would make it
somewhat easier to offer a C API for access to context variables,
especially where efficiency of access is desirable.  But the proposal
does not depend on this.

In the PEP author's experiments, a simple copy-on-write optimization
to the ``set()`` and ``unset()`` methods allows for high
performance task switching.  By placing a "frozen" flag in the context
dictionary when a snapshot is taken, and then checking for the flag
before making changes, a single snapshot can be shared by multiple
callers, and thus a ``swap()`` operation is little more than two
dictionary writes and a read.  This leads to higher performance in the
typical case, because context variables are more likely to be set in
outer loops, but task switches are more likely to occur in inner
loops.  A copy-on-write approach thus prevents copying from occurring
during most task switches.
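
The copy-on-write idea can be sketched as follows.  This is a
single-threaded illustration with hypothetical names (``Context``,
``set_value``, ``get_value``), and it keeps the "frozen" flag as an
attribute of a wrapper object rather than inside the dictionary.

```python
class Context(object):
    # Hypothetical holder for one execution context.
    def __init__(self, data=None):
        self.data = {} if data is None else data
        self.frozen = False

current = Context()

def get_value(key, default=None):
    return current.data.get(key, default)

def set_value(key, value):
    global current
    if current.frozen:
        # First write after a snapshot: copy once, then mutate the copy.
        current = Context(dict(current.data))
    current.data[key] = value

def snapshot():
    current.frozen = True      # share the object instead of copying it
    return current

def swap(ctx):
    global current
    old = snapshot()
    ctx.frozen = True          # the incoming context may be shared too
    current = ctx
    return old

set_value("prec", 10)
s1 = snapshot()
set_value("prec", 20)          # triggers the one-time copy
s2 = swap(s1)                  # no copying at all on the switch
```

Only writes that follow a snapshot pay for a copy; a switch itself
just rebinds a reference and sets a flag.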


Possible Enhancements
---------------------

The core of this proposal is extremely minimalist, as it should be
possible to do almost anything desired using combinations of
``Variable`` objects or by simply using variables whose values are
mutable objects.  There are, however, a variety of options for
enhancement:

``manager`` decorator
     The ``context`` module could perhaps be the home of the PEP 343
     ``contextmanager`` decorator, effectively renamed to
     ``context.manager``.  This could be a natural fit, in that it would
     remind the creators of new context managers that they should
     consider tracking any associated state in a ``context.Variable``.

Proxy class
     Sometimes it's useful to have an object that looks like a module
     global (e.g. ``sys.stdout``) but which actually delegates its
     behavior to a context-specific instance.  Thus, you could have one
     ``sys.stdout``, but its actual output would be directed based on
     the current execution context. The simplest form of such a proxy
     class might look something like::

         class Proxy(object):
             def __init__(self, initial_value):
                 self.var = context.Variable()
                 self.var.set(initial_value)

             def __call__(self,*value):
                 return object.__getattribute__(self,'var')(*value)

             def __getattribute__(self, attr):
                 var = object.__getattribute__(self,'var')
                 return getattr(var, attr)

         sys.stdout = Proxy(sys.stdout)   # make sys.stdout selectable

         with sys.stdout(somefile):  # temporary redirect in current context
             print "hey!"

     The main open issues in implementing this sort of proxy are in the
     precise set of special methods (e.g. ``__getitem__``,
     ``__setattr__``, etc.) that should be supported, and what API
     should be supplied for changing the value, setting a default value
     for new threads, etc.

Low-level API
     Currently, this PEP does not specify an API for accessing and
     modifying the current execution context, nor a C API for such
     access. It currently assumes that ``snapshot()``, ``swap()`` and
     ``Variable`` are the only public means of accessing context
     information.  It may be desirable to offer finer-grained APIs for
     more advanced uses (such as creating an API for management
     of proxies).  And it may be desirable to have a C API for use by
     Python extensions that wish convenient access to context
     variables.


Rationale
=========

Different libraries have different uses for maintaining a "current"
state, be it global or local to a specific thread or task.  There is
currently no way for task-management code to find and switch all of
these "current" states.  And even if it could, task switching
performance would degrade linearly as new libraries were added.

One possible alternative to this proposal would be for
explicit task objects to exist, and to provide a way to give them
identities, so that libraries could instead store their own state
as a property of the task, rather than storing their state in a
task-specific mapping.  This offers similar potential performance
to a copy-on-write strategy, but would use more memory than this
proposal when only one task is involved.  (Because each variable
would have a dictionary mapping from task to the variable's value, but
in this proposal there is simply a single dictionary for the task.)

Some languages offer "dynamically scoped" variables that are somewhat
similar in behavior to the context variables proposed by this PEP.
The principal differences are that:

1. Context variables are objects used to obtain or save a value,
    rather than being a syntactic construct of the language.

2. PEP 343 allows for *controlled* manipulation of context variables,
    helping to prevent "duelling libraries" from changing state on each
    other.  Also, a library can potentially ``snapshot()`` a desired
    state at startup, and use ``swap()`` to restore that state on
    re-entry.  (And could even define a simple decorator to wrap its
    entry points to ensure this.)

3. The PEP author is not aware of any language that explicitly offers
    coroutine-scoped variables, but presumes that they can be modelled
    with monads or continuations in functional languages like Haskell.
    (And I only mention this to forestall the otherwise-inevitable
    response from fans of such techniques, pointing out that it's
    possible.)


Reference Implementation
========================

The author has prototyped an implementation with somewhat fancier
features than shown here, but prefers not to publish it until the
basic features and choices of optional functionality have been
discussed on Python-Dev.


Copyright
=========

This document has been placed in the public domain.



..
    Local Variables:
    mode: indented-text
    indent-tabs-mode: nil
    sentence-end-double-space: t
    fill-column: 70
    End:


From jcarlson at uci.edu  Thu Oct 20 04:30:54 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 19 Oct 2005 19:30:54 -0700
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
Message-ID: <20051019191804.386C.JCARLSON@uci.edu>


"Phillip J. Eby" <pje at telecommunity.com> wrote:
> For efficiency's sake, however, CPython could simply store the
> execution context dictionary in its "thread state" structure, creating
> an empty dictionary at thread initialization time.  This would make it
> somewhat easier to offer a C API for access to context variables,
> especially where efficiency of access is desirable.  But the proposal
> does not depend on this.

What about a situation in which coroutines are handled by multiple
threads?  Any time a coroutine passed from one thread to another, it
would lose its state.

While I agree with the obvious "don't do that" response, I don't believe
that the proposal will actually go very far in preventing real problems
when using context managers and generators or coroutines.  Why?  How much
task state is going to be monitored/saved?  Just sys?  Perhaps sys and
the module in which a coroutine was defined?  Eventually you will have
someone who says, "I need Python to be saving and restoring the state of
the entire interpreter so that I can have a per-user execution
environment that cannot be corrupted by another user."  But how much
farther out is that?

 - Josiah


From pje at telecommunity.com  Thu Oct 20 06:13:29 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 20 Oct 2005 00:13:29 -0400
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <20051019191804.386C.JCARLSON@uci.edu>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051020000801.01fafe18@mail.telecommunity.com>

At 07:30 PM 10/19/2005 -0700, Josiah Carlson wrote:
>What about a situation in which coroutines are handled by multiple
>threads?  Any time a coroutine passed from one thread to another, it
>would lose its state.

It's the responsibility of a coroutine scheduler to take a snapshot() when 
a task is suspended, and to swap() it in when resumed.  So it doesn't 
matter that you've changed what thread you're running in, as long as you 
keep the context with the coroutine that "owns" it.


>While I agree with the obvious "don't do that" response, I don't believe
>that the proposal will actually go very far in preventing real problems
>when using context managers and generators or coroutines.  Why?  How much
>task state is going to be monitored/saved?  Just sys?  Perhaps sys and
>the module in which a coroutine was defined?

As I mentioned in the PEP, I don't think that we would bother having 
Python-defined variables be context-specific until Python 3.0.  This is 
mainly intended for the kinds of things described in the proposal: ZODB 
current transaction, current database connection, decimal context, 
etc.  Basically, anything that you'd have a thread-local for now, and 
indeed most anything that you'd use a global variable and 'with:' for.


>   Eventually you will have
>someone who says, "I need Python to be saving and restoring the state of
>the entire interpreter so that I can have a per-user execution
>environment that cannot be corrupted by another user."  But how much
>farther out is that?

I don't see how that's even related.  This is simply a replacement for 
thread-local variables that allows you to also be compatible with 
"lightweight" (coroutine-based) threads.


From jcarlson at uci.edu  Thu Oct 20 07:29:18 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 19 Oct 2005 22:29:18 -0700
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <5.1.1.6.0.20051020000801.01fafe18@mail.telecommunity.com>
References: <20051019191804.386C.JCARLSON@uci.edu>
	<5.1.1.6.0.20051020000801.01fafe18@mail.telecommunity.com>
Message-ID: <20051019221804.3880.JCARLSON@uci.edu>


"Phillip J. Eby" <pje at telecommunity.com> wrote:
> It's the responsibility of a coroutine scheduler to take a snapshot() when 
> a task is suspended, and to swap() it in when resumed.  So it doesn't 
> matter that you've changed what thread you're running in, as long as you 
> keep the context with the coroutine that "owns" it.
> 
> As I mentioned in the PEP, I don't think that we would bother having 
> Python-defined variables be context-specific until Python 3.0.  This is 
> mainly intended for the kinds of things described in the proposal: ZODB 
> current transaction, current database connection, decimal context, 
> etc.  Basically, anything that you'd have a thread-local for now, and 
> indeed most anything that you'd use a global variable and 'with:' for.
> 
> I don't see how that's even related.  This is simply a replacement for 
> thread-local variables that allows you to also be compatible with 
> "lightweight" (coroutine-based) threads.

I just re-read the proposal with your clarifications in mind.  Looks
good.  +1

 - Josiah


From michele.simionato at gmail.com  Thu Oct 20 09:35:17 2005
From: michele.simionato at gmail.com (Michele Simionato)
Date: Thu, 20 Oct 2005 07:35:17 +0000
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <17238.40158.735826.504410@montanaro.dyndns.org>
References: <d11dcfba0510190838g3629f183r616485df4f8ea252@mail.gmail.com>
	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>
	<43544CC1.5050204@canterbury.ac.nz>
	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>
	<ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
	<5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com>
	<17238.40158.735826.504410@montanaro.dyndns.org>
Message-ID: <4edc17eb0510200035u370b57f9ub1d66b4e99d1be62@mail.gmail.com>

As others explained, the syntax would not work for functions (and it is
not intended to).
A possible use case I had in mind is to define inlined modules to be
used as bunches of attributes. For instance, I could define a module as

module m():
    a = 1
    b = 2

where 'module' would be the following function:

import types

def module(name, args, dic):
    # Build a module object and copy the collected names into it.
    mod = types.ModuleType(name, dic.get('__doc__'))
    for k in dic:
        setattr(mod, k, dic[k])
    return mod
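
Since the "module m():" statement is only a proposed syntax, the
function can be tried today by calling it directly.  The sketch below
assumes the statement would pass the name, an (empty) argument tuple,
and the dictionary of names defined in the body.

```python
import types

def module(name, args, dic):
    # Build a module object and copy the collected names into it.
    mod = types.ModuleType(name, dic.get('__doc__'))
    for k in dic:
        setattr(mod, k, dic[k])
    return mod

# What "module m(): a = 1; b = 2" is assumed to boil down to.
m = module('m', (), {'a': 1, 'b': 2})
print(m.__name__, m.a, m.b)
```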

From ncoghlan at gmail.com  Thu Oct 20 14:40:07 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 20 Oct 2005 22:40:07 +1000
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
Message-ID: <43579027.6040007@gmail.com>

Phillip J. Eby wrote:
> This is still rather rough, but I figured it's easier to let everybody fill 
> in the remaining gaps by arguments than it is for me to pick a position I 
> like and try to convince everybody else that it's right.  :)  Your feedback 
> is requested and welcome.

I think you're actually highlighting a bigger issue with the behaviour of 
"yield" inside a "with" block, and working around it rather than fixing the 
fundamental problem.

The issue with "yield" causing changes to leak to outer scopes isn't limited 
to coroutine style usage - it can happen with generator-iterators, too.

What's missing is a general way of saying "suspend this context temporarily, 
and resume it when done".

An example use-case not involving 'yield' at all is the "asynchronise" 
functionality. A generator-iterator that works in a high precision 
decimal.Context(), but wants to return values from inside a loop using normal 
precision is another example not involving coroutines.

The basic idea would be to provide syntax that allows a with statement to be 
"suspended", along the lines of:

   with EXPR as VAR:
       for VAR2 in EXPR2:
           without:
               BLOCK

To mean:

   abc = (EXPR).__with__()
   exc = (None, None, None)
   VAR = abc.__enter__()
   try:
       for VAR2 in EXPR2:
           try:
               abc.__suspend__()
               try:
                   BLOCK
               finally:
                   abc.__resume__()
           except:
               exc = sys.exc_info()
               raise
   finally:
       abc.__exit__(*exc)


To keep things simple, just as 'break' and 'continue' work only on the 
innermost loop, 'without' would only apply to the innermost 'with' statement.

Locks, for example, could support this via:

   class Lock(object):
     def __with__(self):
         return self

     def __enter__(self):
         self.acquire()
         return self

     def __resume__(self):
         self.acquire()

     def __suspend__(self):
         self.release()

     def __exit__(self, *exc_info):
         self.release()


(Note that there's a potential problem if the call to acquire() in __resume__ 
fails, but that's no different than if this same dance is done manually).
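
For reference, the manual version of that dance can be sketched with
an ordinary context manager.  The "released" helper is a made-up
name; it plays the role the "without" block would play for a lock.

```python
import threading
from contextlib import contextmanager

@contextmanager
def released(lock):
    # Hypothetical helper: temporarily give up a lock the caller holds.
    lock.release()
    try:
        yield
    finally:
        lock.acquire()         # may block, just like __resume__ would

lock = threading.Lock()
with lock:
    with released(lock):
        held_inside = lock.locked()    # lock is free in here
    held_after = lock.locked()         # and held again afterwards
held_at_end = lock.locked()
```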

Cheers,
Nick.

P.S. Here's a different generator wrapper that could be used to create a 
generator-based "suspendable context" that can be invoked multiple times 
through use of the "without" keyword. If applied to the PEP 343 
decimal.Context() __with__ method example, it would automatically restore the 
original context for the duration of the "without" block:

   class SuspendableGeneratorContext(object):

      def __init__(self, func, args, kwds):
          self.gen = None
          self.func = func
          self.args = args
          self.kwds = kwds

      def __with__(self):
          return self

      def __enter__(self):
          if self.gen is not None:
              raise RuntimeError("context already in use")
          gen = self.func(*self.args, **self.kwds)
          try:
              result = gen.next()
          except StopIteration:
              raise RuntimeError("generator didn't yield")
          self.gen = gen
          return result


      def __resume__(self):
          if self.gen is None:
              raise RuntimeError("context not suspended")
          gen = self.func(*self.args, **self.kwds)
          try:
              gen.next()
          except StopIteration:
              raise RuntimeError("generator didn't yield")
          self.gen = gen

      def __suspend__(self):
          try:
              self.gen.next()
          except StopIteration:
              return
          else:
              raise RuntimeError("generator didn't stop")

      def __exit__(self, type, value, traceback):
          gen = self.gen
          self.gen = None
          if type is None:
              try:
                  gen.next()
              except StopIteration:
                  return
              else:
                  raise RuntimeError("generator didn't stop")
          else:
              try:
                  gen.throw(type, value, traceback)
              except (type, StopIteration):
                  return
              else:
                  raise RuntimeError("generator caught exception")

   def suspendable_context(func):
      def helper(*args, **kwds):
          return SuspendableGeneratorContext(func, args, kwds)
      return helper

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Thu Oct 20 15:25:48 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 20 Oct 2005 23:25:48 +1000
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <43579027.6040007@gmail.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<43579027.6040007@gmail.com>
Message-ID: <43579ADC.80006@gmail.com>

Nick Coghlan wrote:
> P.S. Here's a different generator wrapper that could be used to create a 
> generator-based "suspendable context" that can be invoked multiple times 
> through use of the "without" keyword. If applied to the PEP 343 
> decimal.Context() __with__ method example, it would automatically restore the 
> original context for the duration of the "without" block.

I realised this isn't actually true for the version I posted, and the __with__ 
method example in the PEP - changes made to the decimal context in the 
"without" block would be visible after the "with" block.

Consider the following:

   def iter_sin(iterable):
       # Point A
       with decimal.getcontext() as ctx:
           ctx.prec += 10
           for r in iterable:
               y = sin(r) # Very high precision during calculation
               without:
                   yield +y # Interim results have normal precision
       # Point B

What I posted would essentially work for this example, but there isn't a 
guarantee that the context at Point A is the same as the context at Point B - 
the reason is that the thread-local context may be changed within the without 
block (i.e., external to this iterator), and that changed context would get 
saved when the decimal.Context context manager was resumed.
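
Without a "without" block, a similar effect can be approximated by
keeping the yield outside the with statement altogether.  The sketch
below is a restructuring, not the mechanism proposed here: it assumes
decimal.localcontext() (available since Python 2.5), uses sqrt() in
place of sin() because Decimal has no sin method, and the iter_sqrt
name is made up.

```python
import decimal

def iter_sqrt(values, extra_digits=10):
    for v in values:
        # Raise the precision only while computing; the with block
        # is closed again before control leaves the generator.
        with decimal.localcontext() as ctx:
            ctx.prec += extra_digits
            y = decimal.Decimal(v).sqrt()
        yield +y    # unary plus rounds to the precision active here

results = list(iter_sqrt([2, 3]))
```

Because the temporary context is entered and exited once per item,
whatever context the consumer installs between items is left alone.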

To fix that, the arguments to StopIteration in __suspend__ would need to be 
used as arguments when the generator is recreated in __resume__.

That is, the context manager would look like:

   @suspendable_context
   def __with__(self, oldctx=None): # Accept argument in __resume__
       newctx = self.copy()
       if oldctx is None:
           oldctx = decimal.getcontext()
       decimal.setcontext(newctx)
       try:
           yield newctx
       finally:
           decimal.setcontext(oldctx)
       raise StopIteration(oldctx) # Return result in __suspend__

(This might look cleaner if "return arg" in a generator was equivalent to 
"raise StopIteration(arg)" as previously discussed)

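For what it's worth, later Python versions (3.3 onwards, via PEP 380) adopted
that equivalence: a "return arg" inside a generator surfaces as the value
attribute of the resulting StopIteration. A minimal illustration, with a
made-up generator:

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1
    return "liftoff"  # carried out of the generator as StopIteration.value

gen = countdown(2)
seen = [next(gen), next(gen)]
try:
    next(gen)
except StopIteration as ex:
    result = ex.value
```
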
And (including reversion to 'one-use-only' status) the wrapper class would 
look like:

    class SuspendableGeneratorContext(object):

       def __init__(self, func, args, kwds):
           self.gen = func(*args, **kwds)
           self.func = func
           self.args = None

       def __with__(self):
           return self

       def __enter__(self):
           try:
               return self.gen.next()
           except StopIteration:
               raise RuntimeError("generator didn't yield")

       def __suspend__(self):
           try:
               self.gen.next()
           except StopIteration, ex:
               # Use the return value as the arguments for resumption
               self.args = ex.args
               return
           else:
               raise RuntimeError("generator didn't stop")

       def __resume__(self):
           if self.args is None:
               raise RuntimeError("context not suspended")
           self.gen = self.func(*self.args)
           try:
               self.gen.next()
           except StopIteration:
               raise RuntimeError("generator didn't yield")

       def __exit__(self, type, value, traceback):
           if type is None:
               try:
                   self.gen.next()
               except StopIteration:
                   return
               else:
                   raise RuntimeError("generator didn't stop")
           else:
               try:
                   self.gen.throw(type, value, traceback)
               except (type, StopIteration):
                   return
               else:
                   raise RuntimeError("generator caught exception")

    def suspendable_context(func):
       def helper(*args, **kwds):
           return SuspendableGeneratorContext(func, args, kwds)
       return helper

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From jimjjewett at gmail.com  Thu Oct 20 15:48:06 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 20 Oct 2005 09:48:06 -0400
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
Message-ID: <fb6fbf560510200648k3be516dembd9373889a2a7ae3@mail.gmail.com>

I'll try to be more explicit; if Josiah and I are talking past each
other, then the explanation was clearly not yet mature.

(In http://mail.python.org/pipermail/python-dev/2005-October/057251.html)
Eyal Lotem suggested:

> Name: Attribute access for all namespaces ...

>       global x ; x = 1
> Replaced by:
>       module.x = 1

I responded:
> Attribute access as an option would be nice, but might be slower.

> Also note that one common use for a __dict__ is that you don't
> know what keys are available; meeting this use case with
> attribute access would require some extra machinery, such as
> an iterator over attributes.

Josiah Carlson responded
(http://mail.python.org/pipermail/python-dev/2005-October/057451.html)

> This particular use case is easily handled.  Put the following
> once at the top of the module...

> module = __import__(__name__)

> Then one can access (though perhaps not quickly) the module-level
> variables for that module.  To access attributes, it is a quick scan
> through module.__dict__, dir(), or vars().

My understanding of the request was that all namespaces --
including those returned by globals() and locals() -- should
be used with attribute access *instead* of __dict__ access.

module.x is certainly nicer than module.__dict__['x']

Even with globals() and locals(), I usually *wish* I could
use attribute access, to avoid creating a string when what
I really want is a name.
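
To make the two spellings concrete, here is a small sketch showing that
attribute access and __dict__ access on a module are two views of the same
namespace. The module object "demo" is built with types.ModuleType purely for
the example; it is not part of the proposal.

```python
import types

mod = types.ModuleType("demo")   # hypothetical module, created just for this sketch
mod.x = 1                        # attribute access
same_value = mod.__dict__["x"]   # dictionary access to the same binding
vars(mod)["y"] = 2               # writes through the dict show up as attributes
names = sorted(n for n in vars(mod) if not n.startswith("__"))
```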

The catch is that sometimes I don't know the names in
advance, and have to iterate over the dict -- as you
suggested.  That works fine today; my question is what
to do instead if __dict__ is unavailable.

Note that vars(obj) itself conceptually returns a NameSpace
rather than a dict, so that isn't the answer.

My inclination is to add an __iterattr__ that returns
(attribute name, attribute value) pairs, and to make this the
default iterator for NameSpace objects.

Whether the good of
  (1)  not needing to mess with __dict__, and
  (2)  not needing to pretend that strings are names
is enough to justify an extra magic method ... I'm not as sure.

-jJ

From guido at python.org  Thu Oct 20 17:57:49 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 20 Oct 2005 08:57:49 -0700
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <43579ADC.80006@gmail.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
Message-ID: <ca471dc20510200857y6c4ce263ob98f9f18e5d9eece@mail.gmail.com>

Whoa, folks! Can I ask the gentlemen to curb their enthusiasm?

PEP 343 is still (back) on the drawing table, PEP 342 has barely been
implemented (did it survive the AST-branch merge?), and already you
are talking about adding more stuff. Please put on the brakes!

If there's anything this discussion shows me, it's that implicit
contexts are a dangerous concept, and should be treated with much
skepticism.

I would recommend that if you find yourself needing context data while
programming an asynchronous application using generator trampolines
simulating coroutines, you ought to refactor the app so that the
context is explicitly passed along rather than grabbed implicitly.
Zope doesn't *require* you to get the context from a thread-local, and
I presume that SQLObject also has a way to explicitly use a specific
connection (I'm assuming cursors and similar data structures have an
explicit reference to the connection used to create them). Heck, even
Decimal allows you to invoke every operation as a method on a
decimal.Context object!

I'd rather not tie implicit contexts to the with statement,
conceptually. Most uses of the with-statement are purely local (e.g.
"with open(fn) as f"), or don't apply to coroutines (e.g. "with
my_lock"). I'd say that "with redirect_stdout(f)" also doesn't apply
-- we already know it doesn't work in threaded applications, and that
restriction is easily and logically extended to coroutines.

If you're writing a trampoline for an app that needs to modify decimal
contexts, the decimal module already provides the APIs for explicitly
saving and restoring contexts.
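
As a small illustration of the explicit style (a sketch only; the variable
names are arbitrary), an operation can be invoked as a method on a
decimal.Context object without touching the thread's implicit context:

```python
from decimal import Context, Decimal, getcontext

ctx = Context(prec=5)                        # an explicit context object
third = ctx.divide(Decimal(1), Decimal(3))   # computed under ctx only
ambient = Decimal(1) / Decimal(3)            # uses the implicit current context
```

Here third is rounded to five digits, while ambient is rounded to whatever
precision the current thread's context carries.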

I know that somewhere in the proto-PEP Phillip argues that the context
API needs to be made a part of the standard library so that his
trampoline can efficiently swap implicit contexts required by
arbitrary standard and third-party library code. My response to that
is that library code (whether standard or third-party) should not
depend on implicit context unless it can assume complete
control over the application. (That rules out pretty much everything
except Zope, which is fine with me. :-)

Also, Nick wants the name 'context' for PEP-343 style context
managers. I think it's overloading too much to use the same word for
per-thread or per-coroutine context.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jcarlson at uci.edu  Thu Oct 20 18:48:43 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 20 Oct 2005 09:48:43 -0700
Subject: [Python-Dev] Early PEP draft (For Python 3000?)
In-Reply-To: <fb6fbf560510200648k3be516dembd9373889a2a7ae3@mail.gmail.com>
References: <fb6fbf560510200648k3be516dembd9373889a2a7ae3@mail.gmail.com>
Message-ID: <20051020093601.3889.JCARLSON@uci.edu>


Jim Jewett <jimjjewett at gmail.com> wrote:
> I'll try to be more explicit; if Josiah and I are talking past each
> other, then the explanation was clearly not yet mature.
> 
> (In http://mail.python.org/pipermail/python-dev/2005-October/057251.html)
> Eyal Lotem suggested:
> 
> > Name: Attribute access for all namespaces ...
> 
> >       global x ; x = 1
> > Replaced by:
> >       module.x = 1
> 
> I responded:
> > Attribute access as an option would be nice, but might be slower.
> 
> > Also note that one common use for a __dict__ is that you don't
> > know what keys are available; meeting this use case with
> > attribute access would require some extra machinery, such as
> > an iterator over attributes.
> 
> Josiah Carlson responded
> (http://mail.python.org/pipermail/python-dev/2005-October/057451.html)
> 
> > This particular use case is easily handled.  Put the following
> > once at the top of the module...
> 
> > module = __import__(__name__)
> 
> > Then one can access (though perhaps not quickly) the module-level
> > variables for that module.  To access attributes, it is a quick scan
> > through module.__dict__, dir(), or vars().
> 
> My understanding of the request was that all namespaces --
> including those returned by globals() and locals() -- should
> be used with attribute access *instead* of __dict__ access.

Yeah, I missed the transition from arbitrary stack frame access to
strictly global and local scope attribute access.


> module.x is certainly nicer than module.__dict__['x']
> 
> Even with globals() and locals(), I usually *wish* I could
> use attribute access, to avoid creating a string when what
> I really want is a name.

Indeed.

> The catch is that sometimes I don't know the names in
> advance, and have to iterate over the dict -- as you
> suggested.  That works fine today; my question is what
> to do instead if __dict__ is unavailable.
> 
> Note that vars(obj) itself conceptually returns a NameSpace
> rather than a dict, so that isn't the answer.

>>> help(vars)
vars(...)
    vars([object]) -> dictionary

    Without arguments, equivalent to locals().
    With an argument, equivalent to object.__dict__.

When an object lacks a dictionary, dir() works just fine.

>>> help(dir)
Help on built-in function dir:

dir(...)
    dir([object]) -> list of strings

    Return an alphabetized list of names comprising (some of) the attributes
    of the given object, and of attributes reachable from it:

    No argument:  the names in the current scope.
    Module object:  the module attributes.
    Type or class object:  its attributes, and recursively the attributes of
        its bases.
    Otherwise:  its attributes, its class's attributes, and recursively the
        attributes of its class's base classes.


> My inclination is to add an __iterattr__ that returns
> (attribute name, attribute value) pairs, and to make this the
> default iterator for NameSpace objects.

def __iterattr__(obj):
    for i in dir(obj):
        yield i, getattr(obj, i)


> Whether the good of
>   (1)  not needing to mess with __dict__, and
>   (2)  not needing to pretend that strings are names
> is enough to justify an extra magic method ... I'm not as sure.

I don't know, but leaning towards no; dir() works pretty well.  Yeah, you
have to use getattr(), but there are worse things.
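
A quick sketch of that point, using a class with __slots__ so that its
instances have no __dict__. The Point class and the iterattr helper are
invented for the example, and the helper skips double-underscore names, which
the three-line version above does not do.

```python
class Point(object):
    __slots__ = ("x", "y")  # instances get no __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y

def iterattr(obj):
    # Same idea as __iterattr__ above, minus the double-underscore names.
    for name in dir(obj):
        if not name.startswith("__"):
            yield name, getattr(obj, name)

p = Point(1, 2)
try:
    vars(p)
    has_dict = True
except TypeError:  # vars() needs an object with a __dict__
    has_dict = False
pairs = dict(iterattr(p))
```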

 - Josiah


From dalcinl at gmail.com  Thu Oct 20 19:04:03 2005
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Thu, 20 Oct 2005 14:04:03 -0300
Subject: [Python-Dev] enumerate with a start index
In-Reply-To: <8393fff0510191354s3676682dk58a4db65edee1fee@mail.gmail.com>
References: <8393fff0510191354s3676682dk58a4db65edee1fee@mail.gmail.com>
Message-ID: <e7ba66e40510201004o63cf64e2s52c08cd475292a5b@mail.gmail.com>

On 10/19/05, Martin Blais <blais at furius.ca> wrote:
> Just wondering, would anyone think of it as a good idea if the
> enumerate() builtin could accept a "start" argument?

And why not an additional "step" argument? Anyway, perhaps all this
can be done with a 'xrange' object...
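
For reference, enumerate() did later gain a start argument (Python 2.6). A
step can be had by pairing the iterable with a counter; the sketch below uses
itertools.count rather than xrange, and the helper name enumerate_step is
made up for the example.

```python
from itertools import count

def enumerate_step(iterable, start=0, step=1):
    # Pair each item with start, start + step, start + 2*step, ...
    return zip(count(start, step), iterable)

plain = list(enumerate("abc", 1))
stepped = list(enumerate_step("abc", 10, 5))
```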


--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From pje at telecommunity.com  Thu Oct 20 19:14:08 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 20 Oct 2005 13:14:08 -0400
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <43579027.6040007@gmail.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051020131024.02033d58@mail.telecommunity.com>

At 10:40 PM 10/20/2005 +1000, Nick Coghlan wrote:
>Phillip J. Eby wrote:
> > This is still rather rough, but I figured it's easier to let everybody 
> fill
> > in the remaining gaps by arguments than it is for me to pick a position I
> > like and try to convince everybody else that it's right.  :)  Your 
> feedback
> > is requested and welcome.
>
>I think you're actually highlighting a bigger issue with the behaviour of
>"yield" inside a "with" block, and working around it rather than fixing the
>fundamental problem.
>
>The issue with "yield" causing changes to leak to outer scopes isn't limited
>to coroutine style usage - it can happen with generator-iterators, too.
>
>What's missing is a general way of saying "suspend this context temporarily,
>and resume it when done".

Actually, it's fairly simple to write a generator decorator using 
context.swap() that saves and restores the current execution state around 
next()/send()/throw() calls, if you prefer it to be the generator's 
responsibility to maintain such context.
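
A rough sketch of what such a wrapper could look like in current Python. This
is an assumption-laden illustration: context.swap() from the pre-PEP is not
available, so the decimal context stands in for "the current execution state",
and the names _IsolatedGenerator, isolate_decimal_context and roots are
invented here.

```python
import decimal
import functools

class _IsolatedGenerator:
    """Wraps a generator so that it keeps a private decimal context."""

    def __init__(self, gen):
        self._gen = gen
        self._ctx = decimal.getcontext().copy()

    def __iter__(self):
        return self

    def _resume(self, method, *args):
        outer = decimal.getcontext()
        decimal.setcontext(self._ctx)         # swap the generator's state in
        try:
            return method(*args)
        finally:
            self._ctx = decimal.getcontext()  # remember what the generator left
            decimal.setcontext(outer)         # swap the caller's state back

    def __next__(self):
        return self._resume(self._gen.__next__)

    def send(self, value):
        return self._resume(self._gen.send, value)

    def throw(self, exc):
        return self._resume(self._gen.throw, exc)

    def close(self):
        return self._resume(self._gen.close)

def isolate_decimal_context(func):
    @functools.wraps(func)
    def wrapper(*args, **kwds):
        return _IsolatedGenerator(func(*args, **kwds))
    return wrapper

@isolate_decimal_context
def roots(values):
    with decimal.localcontext() as ctx:
        ctx.prec += 10
        for v in values:
            yield decimal.Decimal(v).sqrt()

base_prec = decimal.getcontext().prec
gen = roots([2])
value = next(gen)                            # computed at the raised precision
prec_while_suspended = decimal.getcontext().prec
gen.close()
prec_after_close = decimal.getcontext().prec
```

The caller's precision is untouched while the generator is suspended, at the
cost of one context swap per next()/send()/throw()/close() call.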


From michel at cignex.com  Thu Oct 20 00:14:31 2005
From: michel at cignex.com (Michel Pelletier)
Date: Wed, 19 Oct 2005 15:14:31 -0700
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <004801c5d3ec$e29b5360$6402a8c0@arkdesktop>
References: <1129643229.12510.37.camel@localhost>
	<004801c5d3ec$e29b5360$6402a8c0@arkdesktop>
Message-ID: <4356C547.8020402@cignex.com>

Andrew Koenig wrote:
>>  Sure, that would work.  Or even this, if the scheduler would
>>automatically recognize generator objects being yielded and so would run
>>the nested coroutine until finish:
> 
> 
> This idea has been discussed before.  I think the problem with recognizing
> generators as the subject of "yield" statements is that then you can't yield
> a generator even if you want to.
> 
> The best syntax I can think of without adding a new keyword looks like this:
> 
> 	yield from x
> 
> which would be equivalent to
> 
> 	for i in x:
> 	    yield i

My eyes really like the syntax, but I wonder about its usefulness.  In 
rdflib, particularly here:

http://svn.rdflib.net/trunk/rdflib/backends/IOMemory.py

We yield values from inside for loops all over the place, but the 
yielded value is very rarely just the index value (only 1 of 14 yields); 
it is usually something calculated from the index value.  So the new 
syntax would not be useful unless it provided access to the index item 
as a variable, like:

yield foo(i) for i in x

which barely saves you anything (a colon, a newline, and an indent). 
(hey wait, isn't that a generator comprehension?  Haven't really 
encountered those yet). Of course rdflib could be the minority case and 
most folks who yield in loops are yielding only the index value directly.
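
For readers coming to this later: "yield from" was eventually added in
Python 3.3 (PEP 380), and it combines with a generator expression to cover
the calculated case as well. A small sketch, where foo is a stand-in for
"something calculated from the index value":

```python
def foo(i):
    return i * i  # placeholder calculation, not taken from rdflib

def plain(x):
    yield from x                      # same as: for i in x: yield i

def calculated(x):
    yield from (foo(i) for i in x)    # same as: for i in x: yield foo(i)

plain_items = list(plain([1, 2, 3]))
calculated_items = list(calculated([1, 2, 3]))
```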

off to read the generator comprehension docs...

-Michel

From jeremy at alum.mit.edu  Thu Oct 20 22:04:27 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 20 Oct 2005 16:04:27 -0400
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <ca471dc20510200857y6c4ce263ob98f9f18e5d9eece@mail.gmail.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
	<ca471dc20510200857y6c4ce263ob98f9f18e5d9eece@mail.gmail.com>
Message-ID: <e8bf7a530510201304g29beec3x385815a03ac70690@mail.gmail.com>

On 10/20/05, Guido van Rossum <guido at python.org> wrote:
> Whoa, folks! Can I ask the gentlemen to curb their enthusiasm?
>
> PEP 343 is still (back) on the drawing table, PEP 342 has barely been
> implemented (did it survive the AST-branch merge?), and already you
> are talking about adding more stuff. Please put on the brakes!

Yes.  PEP 342 survived the merge of the AST branch.  I wonder, though,
if the Grammar for it can be simplified at all.  I haven't read the
PEP closely, but I found the changes a little hard to follow.  That
is, why was the grammar changed the way it was -- or how would you
describe the intent of the changes?  It was hard when doing the
transformation in ast.c to be sure that the intent of the changes was
honored.  On the other hand, it seemed to have extensive tests and
they all pass.

Jeremy

From pje at telecommunity.com  Thu Oct 20 22:29:12 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 20 Oct 2005 16:29:12 -0400
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <e8bf7a530510201304g29beec3x385815a03ac70690@mail.gmail.com>
References: <ca471dc20510200857y6c4ce263ob98f9f18e5d9eece@mail.gmail.com>
	<5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
	<ca471dc20510200857y6c4ce263ob98f9f18e5d9eece@mail.gmail.com>
Message-ID: <5.1.1.6.0.20051020162549.01faedf0@mail.telecommunity.com>

At 04:04 PM 10/20/2005 -0400, Jeremy Hylton wrote:
>On 10/20/05, Guido van Rossum <guido at python.org> wrote:
> > Whoa, folks! Can I ask the gentlemen to curb their enthusiasm?
> >
> > PEP 343 is still (back) on the drawing table, PEP 342 has barely been
> > implemented (did it survive the AST-branch merge?), and already you
> > are talking about adding more stuff. Please put on the brakes!
>
>Yes.  PEP 342 survived the merge of the AST branch.  I wonder, though,
>if the Grammar for it can be simplified at all.  I haven't read the
>PEP closely, but I found the changes a little hard to follow.  That
>is, why was the grammar changed the way it was -- or how would you
>describe the intent of the changes?

The intent was to make it so that '(yield optional_expr)' always works, and 
also that   [lvalue =] yield optional_expr works.  If you can find another 
way to hack the grammar so that both of 'em work, it's certainly okay by 
me.  The changes I made were just the simplest things I could figure out to do.

I seem to recall that the hard part was the need for 'yield expr,expr' to 
be interpreted as '(yield expr,expr)', not '(yield expr),expr', for 
backward compatibility reasons.
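
A short sketch of both forms as they behave in current Python (the generator
name pairs is made up): the unparenthesized comma builds a tuple that is
yielded as a whole, and the assignment form receives whatever is passed to
send().

```python
def pairs():
    received = yield 1, 2   # parsed as: received = (yield (1, 2))
    yield received

gen = pairs()
first = next(gen)           # the tuple built by "1, 2"
echoed = gen.send("hello")  # the sent value, handed back by the second yield
```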


From tzot at mediconsa.com  Thu Oct 20 23:08:12 2005
From: tzot at mediconsa.com (Christos Georgiou)
Date: Fri, 21 Oct 2005 00:08:12 +0300
Subject: [Python-Dev] list splicing
References: <quack.20050918T1850.lth7jddkeqx@roar.cs.berkeley.edu>
	<432E4BC9.1020100@canterbury.ac.nz>
Message-ID: <dj9100$pmq$1@sea.gmane.org>

"Greg Ewing" <greg.ewing at canterbury.ac.nz> wrote in message 
news:432E4BC9.1020100 at canterbury.ac.nz...
> Karl Chen wrote:
>> Hi, has anybody considered adding something like this:
>>     a = [1, 2]
>>     [ 'x', *a, 'y']
>>
>> as syntactic sugar for
>>     a = [1, 2]
>>     [ 'x' ] + a + [ 'y' ].
>
> You can write that as
>   a = [1, 2]
>   a[1:1] = a

I'm sure you meant to write:

a = [1, 2]
b = ['x', 'y']
b[1:1] = a

Occasional absence of mind makes other people feel useful!


PS actually one *can* write

a = [1, 2]
['x', 'y'][1:1] = a

since this is not actually an assignment but rather syntactic sugar for a 
function call, but I don't know how one would use the modified list, since

b = ['x','y'][1:1] = a

doesn't quite fulfill the initial requirement ;) 
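
For completeness, a sketch comparing the slice-assignment spelling with the
starred form, which later became valid syntax in Python 3.5 (PEP 448):

```python
a = [1, 2]

b = ['x', 'y']
b[1:1] = a                 # splice a into b in place

spliced = ['x', *a, 'y']   # builds a new list in one expression
```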



From pje at telecommunity.com  Thu Oct 20 23:35:31 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 20 Oct 2005 17:35:31 -0400
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <ca471dc20510200857y6c4ce263ob98f9f18e5d9eece@mail.gmail.com>
References: <43579ADC.80006@gmail.com>
	<5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
Message-ID: <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com>

At 08:57 AM 10/20/2005 -0700, Guido van Rossum wrote:
>Whoa, folks! Can I ask the gentlemen to curb their enthusiasm?
>
>PEP 343 is still (back) on the drawing table, PEP 342 has barely been
>implemented (did it survive the AST-branch merge?), and already you
>are talking about adding more stuff. Please put on the brakes!

Sorry.  I thought that 343 was just getting a minor tune-up.  In the months 
since the discussion and approval (and implementation; Michael Hudson 
actually had a PEP 343 patch out there), I've been doing a lot of thinking 
about how they will be used in applications, and thought that it would be a 
good idea to promote people using task-specific variables in place of 
globals or thread-locals.

The conventional wisdom is that global variables are bad, but the truth is 
that they're very attractive because they allow you to have one less thing 
to pass around and think about in every line of code.  Without globals, you 
would sooner or later end up with every function taking twenty arguments to 
pass through states down to other code, or else trying to cram all this 
data into some kind of "context" object, which then won't work with code 
that doesn't know about *your* definition of what a context is.

Globals are thus extremely attractive for practical software 
development.  If they weren't so useful, it wouldn't be necessary to warn 
people not to use them, after all.  :)

The problem with globals, however, is that sometimes they need to be 
changed in a particular context.  PEP 343 makes it safer to use globals 
because you can always offer a context manager that changes them 
temporarily, without having to hand-write a try-finally block.  This will 
make it even *more* attractive to use globals, which is not a problem as 
long as the code has no multitasking of any sort.

Of course, the multithreading scenario is usually fixed by using 
thread-locals.  All I'm proposing is that we replace thread locals with 
task locals, and promote the use of task-local variables for managed 
contexts (such as the decimal context) *that would otherwise be a global or 
a thread-local variable*.  This doesn't seem to me like a very big deal; 
just an encouragement for people to make their stuff easy to use with PEP 
342 and 343.
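
Much later, Python 3.7 added the contextvars module (PEP 567), which is close
in spirit to the task-local variables proposed here. The sketch below uses it
as a stand-in; it is not the API from the pre-PEP, and the names precision,
use_precision and task are invented for the example.

```python
import contextvars
from contextlib import contextmanager

precision = contextvars.ContextVar("precision", default=28)

@contextmanager
def use_precision(value):
    # Change the variable temporarily, restoring it on the way out.
    token = precision.set(value)
    try:
        yield
    finally:
        precision.reset(token)

def task(value):
    with use_precision(value):
        return precision.get()

# Each task runs in its own copy of the context, so changes do not leak.
results = [contextvars.copy_context().run(task, p) for p in (10, 50)]
outside = precision.get()
```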

By the way, I don't know if you do much with Java these days, but a big 
part of the whole J2EE fiasco and the rise of the so-called "lightweight 
containers" in Java has all been about how to manage implicit context so 
that you don't get stuck with either the inflexibility of globals or the 
deadweight of passing tons of parameters around.  One of the big selling 
points of AspectJ is that it lets you implicitly funnel parameters from 
point A to point B without having to modify all the call signatures in 
between.  In other words, its use is promoted for precisely the sort of 
thing that 'with' plus a task variable would be ideal for.  As far as I can 
tell, 'with' plus a task variable is *much* easier to explain, use, and 
understand than an aspect-oriented programming tool is!  (Especially from 
the "if the implementation is easy to explain, it may be a good idea" 
perspective.)


>I know that somewhere in the proto-PEP Phillip argues that the context
>API needs to be made a part of the standard library so that his
>trampoline can efficiently swap implicit contexts required by
>arbitrary standard and third-party library code. My response to that
>is that library code (whether standard or third-party) should not
>depend on implicit context unless it can assume complete
>control over the application.

I think maybe there's some confusion here, at least on my part.  :)  I see 
two ways to read your statement, one of which seems to be saying that we 
should get rid of the decimal context (because it doesn't have complete 
control over the application), and the other way of reading it doesn't seem 
connected to what I proposed.

Anything that's a global variable is an "implicit context".  Because of 
that, I spent considerable time and effort in PEAK trying to utterly stamp 
out global variables.  *Everything* in PEAK has an explicit context.  But 
that then becomes more of a pain to *use*, because you are now stuck with 
managing it, even if you cram it into a Zope-style acquisition tree so 
there's only one "context" to deal with.  Plus, it assumes that everything 
the developer wants to do can be supplied by *one* framework, be it PEAK, 
Zope, or whatever, which is rarely the case but still forces framework 
developers to duplicate everybody else's stuff.

In other words, I've come to realize that the path the major Python 
application frameworks have taken is not really Pythonic.  A Pythonic 
framework shouldn't load you down with new management burdens and keep you 
from using other frameworks.  It should make life easier, and make your code 
*more* interoperable, not less.  Indeed, I've pretty much come to agreement 
with the part of the Python developer community that says Frameworks Are 
Evil.  A primary source of this evil in the big three frameworks (PEAK, 
Twisted, and Zope) stems from their various approaches to dealing with this 
issue of context, which lack the simplicity of global (or task-local) 
variables.

So, the lesson I've taken from my attempt to make everything explicit is 
that what developers *really* want is to have global variables, just 
without the downsides of uncontrolled modifications, and inter-thread or 
inter-task pollution.  Explicit isn't always better than implicit, because 
oftentimes the practicality of having implicit things is much more 
important than the purity of making them all explicit.  Simple is better 
than complex, and task-local variables are *much* simpler than trying to 
make everything explicit.


>Also, Nick wants the name 'context' for PEP-343 style context
>managers. I think it's overloading too much to use the same word for
>per-thread or per-coroutine context.

Actually, I was the one who originally proposed the term "context manager", 
and it doesn't seem like a conflict to me.  Indeed, I suggested in the 
pre-PEP that "@context.manager" might be where we could put the 
decorator.  The overload was intentional, to suggest that when creating a 
new context manager, it's worth considering whether the state should be 
kept in a context variable, rather than a global variable.  The naming 
choice was for propaganda purposes, in other words.  :)

Anyway, I'll withdraw the proposal for now.  We can always leave it out of 
2.5, I can release an independent implementation, and then submit it for 
consideration again in the 2.6 timeframe.  I just thought it would be a 
no-brainer to use task locals where thread locals are currently being used, 
and that's really all I was proposing we do as far as stdlib changes 
anyway.  I was also hoping to get good input from Python-dev regarding some 
of the open issues, to try and build a consensus on them from the beginning.


From tzot at mediconsa.com  Thu Oct 20 23:51:10 2005
From: tzot at mediconsa.com (Christos Georgiou)
Date: Fri, 21 Oct 2005 00:51:10 +0300
Subject: [Python-Dev] bool(iter([])) changed between 2.3 and 2.4
References: <ca471dc20509201449652f11d@mail.gmail.com><001c01c5be3c$53130dc0$6522c797@oemcomputer>
	<ca471dc205092017071f2eb1e8@mail.gmail.com>
Message-ID: <dj93gh$1rj$1@sea.gmane.org>


"Guido van Rossum" <guido at python.org> wrote in message 
news:ca471dc205092017071f2eb1e8 at mail.gmail.com...
>> [Fred]
>> > think iterators shouldn't have length at all:
>> > they're *not* containers and shouldn't act that way.
>>
>> Some iterators can usefully report their length with the invariant:
>>    len(it) == len(list(it)).
>
>I still consider this an erroneous hypergeneralization of the concept
>of iterators. Iterators should be pure iterators and not also act as
>containers. Which other object type implements __len__ but not
>__getitem__?

Too late, and probably irrelevant by now; the answer though is

set([1,2,3]) 



From guido at python.org  Fri Oct 21 02:23:04 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 20 Oct 2005 17:23:04 -0700
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <5.1.1.6.0.20051020162549.01faedf0@mail.telecommunity.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
	<ca471dc20510200857y6c4ce263ob98f9f18e5d9eece@mail.gmail.com>
	<5.1.1.6.0.20051020162549.01faedf0@mail.telecommunity.com>
Message-ID: <ca471dc20510201723hda50db5k1f5e58ad7bb84427@mail.gmail.com>

On 10/20/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 04:04 PM 10/20/2005 -0400, Jeremy Hylton wrote:
> >On 10/20/05, Guido van Rossum <guido at python.org> wrote:
> > > Whoa, folks! Can I ask the gentlemen to curb their enthusiasm?
> > >
> > > PEP 343 is still (back) on the drawing table, PEP 342 has barely been
> > > implemented (did it survive the AST-branch merge?), and already you
> > > are talking about adding more stuff. Please put on the brakes!
> >
> >Yes.  PEP 342 survived the merge of the AST branch.  I wonder, though,
> >if the Grammar for it can be simplified at all.  I haven't read the
> >PEP closely, but I found the changes a little hard to follow.  That
> >is, why was the grammar changed the way it was -- or how would you
> >describe the intent of the changes?
>
> The intent was to make it so that '(yield optional_expr)' always works, and
> also that   [lvalue =] yield optional_expr works.  If you can find another
> way to hack the grammar so that both of 'em work, it's certainly okay by
> me.  The changes I made were just the simplest things I could figure out to do.

Right.

> I seem to recall that the hard part was the need for 'yield expr,expr' to
> be interpreted as '(yield expr,expr)', not '(yield expr),expr', for
> backward compatibility reasons.

But only at the statement level.

These should be errors IMO:

  foo(yield expr, expr)
  foo(expr, yield expr)
  foo(1 + yield expr)
  x = yield expr, expr
  x = expr, yield expr
  x = 1 + yield expr

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From anthony at interlink.com.au  Fri Oct 21 04:02:11 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Fri, 21 Oct 2005 12:02:11 +1000
Subject: [Python-Dev] AST branch is in?
Message-ID: <200510211202.12015.anthony@interlink.com.au>

So it looks like the AST branch has landed. Wooo! Well done to all who 
were involved - it seems like it's been a huge amount of work. 

Could someone involved give a short email laying out what concrete (no 
pun intended) advantages this new compiler gives us? Does it just 
allow us to do new and interesting manipulations of the code during 
compilation? Cleaner, easier to maintain, or the like?

Anthony
-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From nnorwitz at gmail.com  Fri Oct 21 04:32:56 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 20 Oct 2005 19:32:56 -0700
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <200510211202.12015.anthony@interlink.com.au>
References: <200510211202.12015.anthony@interlink.com.au>
Message-ID: <ee2a432c0510201932g147e36belf3b85e39be8e817e@mail.gmail.com>

On 10/20/05, Anthony Baxter <anthony at interlink.com.au> wrote:
>
> Could someone involved give a short email laying out what concrete (no
> pun intended) advantages this new compiler gives us? Does it just
> allow us to do new and interesting manipulations of the code during
> compilation? Cleaner, easier to maintain, or the like?

The Grammar is (was at one point at least) shared between CPython and
Jython and would allow more tools to be able to share infrastructure.  The idea
is to eventually be able to have [JP]ython output the same AST to
tools.  There is quite a bit of generated code based on the Grammar. 
So some stuff should be easier.  Other stuff is just moved.  You still
need to convert from the AST to the byte code.

Hopefully it will be easier to do various sorts of optimization and
general manipulation of an AST rather than what existed before.
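
As an illustration of the kind of manipulation meant here, the sketch
below folds constant additions in a syntax tree. It assumes the
Python-level 'ast' module that later releases expose (ast.unparse needs
Python 3.9 or newer); nothing like it existed on the branch at the time,
and the class name FoldAdd is invented for the example.

```python
import ast


class FoldAdd(ast.NodeTransformer):
    """Replace `<number> + <number>` with the computed constant."""

    def visit_BinOp(self, node):
        # Fold the children first so nested sums collapse bottom-up.
        self.generic_visit(node)
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, (int, float))
                and isinstance(node.right.value, (int, float))):
            total = node.left.value + node.right.value
            return ast.copy_location(ast.Constant(total), node)
        return node


tree = ast.parse("x = 1 + 2 + y", mode="exec")
tree = ast.fix_missing_locations(FoldAdd().visit(tree))
folded = ast.unparse(tree)

# The transformed tree still compiles to ordinary byte code.
namespace = {"y": 4}
exec(compile(tree, "<ast>", "exec"), namespace)
```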

Only time will tell if we can achieve many of the benefits, so it
would be good if people could review the code and see if things look
more complex/complicated and suggest improvements.  I'm not all that
familiar with the structure, I'm more of a hopeful consumer of it.

HTH,
n

From simon.belak at hruska.si  Fri Oct 21 04:34:02 2005
From: simon.belak at hruska.si (Simon Belak)
Date: Fri, 21 Oct 2005 04:34:02 +0200
Subject: [Python-Dev] A solution to the evils of static typing and
	interfaces?
Message-ID: <4358539A.7050901@hruska.si>

Hi,

I was thinking why not have a separate file for all the proposed 
optional meta-information (in particular interfaces, static types)? 
Something along the lines of IDLs in CORBA (with pythonic syntax, of 
course). This way most of the benefits are retained without 
"contaminating" the actual syntax (dare I be so pretentious to even hope 
making both sides happy?).

For the sole purpose of illustration, let meta-files have extension .pym 
and linking to source-files be name based:

parrot.py
parrot.pym
(parrot.pyc)

With some utilities like a prototype generator (to and from meta-files) 
and a synchronization tool, time penalty on development for having two 
separate files could be kept within reason.

We could even go as far as introducing a syntax allowing custom 
meta-information to be added.

For example something akin to decorators.

parrot.pym:

@sharedinstance
class Parrot:
	
	# Methods
	# note these are only prototypes, so no colon or suite is needed
	
	@cache
	def playDead(a : int, b : int) -> None
	
	# Attributes
	
	@const
	name : str

where sharedinstance, cache and const are custom meta-information.
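
The .pym syntax above is hypothetical and does not parse as Python. A
rough approximation in ordinary Python, assuming function and variable
annotations as found in Python 3 and a made-up meta() helper that merely
records a tag, might look like the sketch below. The per-attribute
@const tag has no direct spelling here, so it is left as a comment.

```python
def meta(tag):
    """Return a decorator that records `tag` on the decorated object."""
    def decorate(obj):
        obj.__meta__ = getattr(obj, "__meta__", ()) + (tag,)
        return obj
    return decorate


# Hypothetical custom meta-information, mirroring the names in the example.
sharedinstance = meta("sharedinstance")
cache = meta("cache")


@sharedinstance
class Parrot:
    # Attribute prototype; '@const' cannot decorate an attribute in Python.
    name: str

    @cache
    def playDead(self, a: int, b: int) -> None:
        ...  # prototype only, no real body
```

A tool could then read Parrot.__meta__, Parrot.__annotations__ and
Parrot.playDead.__annotations__ without changing how the class runs.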

This opens up countless possibilities for third-party interpreter 
enhancements and/or optimisations by providing fully portable (as all 
meta-information is optional) language extensions.


P.S. my sincerest apologies if I am reopening a can of worms here

From guido at python.org  Fri Oct 21 04:57:16 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 20 Oct 2005 19:57:16 -0700
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
	<5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com>
Message-ID: <ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com>

On 10/20/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 08:57 AM 10/20/2005 -0700, Guido van Rossum wrote:
> >Whoa, folks! Can I ask the gentlemen to curb their enthusiasm?
> >
> >PEP 343 is still (back) on the drawing table, PEP 342 has barely been
> >implemented (did it survive the AST-branch merge?), and already you
> >are talking about adding more stuff. Please put on the brakes!
>
> Sorry.  I thought that 343 was just getting a minor tune-up.

Maybe, but the issues on the table are naming issues -- is __with__
the right name, or should it be __context__? Should the decorator be
applied implicitly? Should the decorator be called @context or
@contextmanager?

> In the months
> since the discussion and approval (and implementation; Michael Hudson
> actually had a PEP 343 patch out there),

Which he described previously as "a hack" and apparently didn't feel
comfortable checking in. At least some of it will have to be redone,
(a) for the AST code, and (b) for the revised PEP.

> I've been doing a lot of thinking
> about how they will be used in applications, and thought that it would be a
> good idea to promote people using task-specific variables in place of
> globals or thread-locals.

That's clear, yes. :-)

I still find it unlikely that a lot of people will be using trampoline
frameworks. You and Twisted, that's all I expect.

> The conventional wisdom is that global variables are bad, but the truth is
> that they're very attractive because they allow you to have one less thing
> to pass around and think about in every line of code.

Which doesn't make them less bad -- they're still there and perhaps
more likely to trip you up when you least expect it. I think there's a
lot of truth in that conventional wisdom.

> Without globals, you
> would sooner or later end up with every function taking twenty arguments to
> pass through states down to other code, or else trying to cram all this
> data into some kind of "context" object, which then won't work with code
> that doesn't know about *your* definition of what a context is.

Methinks you are exaggerating for effect.

> Globals are thus extremely attractive for practical software
> development.  If they weren't so useful, it wouldn't be necessary to warn
> people not to use them, after all.  :)
>
> The problem with globals, however, is that sometimes they need to be
> changed in a particular context.  PEP 343 makes it safer to use globals
> because you can always offer a context manager that changes them
> temporarily, without having to hand-write a try-finally block.  This will
> make it even *more* attractive to use globals, which is not a problem as
> long as the code has no multitasking of any sort.

Hm. There are different kinds of globals. Most globals don't need to
be context-managed at all, because they can safely be shared between
threads, tasks or coroutines. Caches usually fall in this category
(e.g. the compiled regex cache). A little locking is all it takes.

The globals that need to be context-managed are the pernicious kind of
which you can never have too few. :-)

They aren't just accumulating global state, they are implicit
parameters, thereby truly invoking the reasons why globals are frowned
upon.
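
A minimal sketch of the pattern being discussed, a context manager that
changes a global temporarily and restores it without a hand-written
try-finally at each call site. It assumes the contextlib.contextmanager
decorator as it later shipped; SETTINGS, override() and report() are
made-up names. Note how report() silently depends on the global, which
is exactly the "implicit parameter" problem.

```python
from contextlib import contextmanager

# Hypothetical module-level state that other code reads implicitly.
SETTINGS = {"verbose": False}


@contextmanager
def override(key, value):
    """Temporarily set SETTINGS[key], restoring the old value on exit."""
    saved = SETTINGS[key]
    SETTINGS[key] = value
    try:
        yield
    finally:
        SETTINGS[key] = saved


def report():
    # Takes no arguments, yet its result depends on the global setting.
    return "verbose" if SETTINGS["verbose"] else "quiet"


with override("verbose", True):
    inside = report()
after = report()
```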

> Of course, the multithreading scenario is usually fixed by using
> thread-locals.  All I'm proposing is that we replace thread locals with
> task locals, and promote the use of task-local variables for managed
> contexts (such as the decimal context) *that would otherwise be a global or
> a thread-local variable*.  This doesn't seem to me like a very big deal;
> just an encouragement for people to make their stuff easy to use with PEP
> 342 and 343.

I'm all for encouraging people to make their stuff easy to use with
these PEPs, and with multi-threading use.

But IMO the best way to accomplish those goals is to refrain from
global (or thread-local or task-local) context as much as possible,
for example by passing along explicit context.

The mere existence of a standard library module to make handling
task-specific contexts easier sends the wrong signal; it suggests that
it's a good pattern to use, which it isn't -- it's a last-resort
pattern, when all other solutions fail.
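
For readers unfamiliar with the idea, a task-local variable can be
sketched with the 'contextvars' module, which entered the standard
library much later (Python 3.7). This is not the API of the pre-PEP
under discussion, and the names request_id and handler are invented for
the example. Each asyncio task gets its own copy of the context, so a
value set in one task is invisible to the others.

```python
import asyncio
import contextvars

# A variable whose value is private to each task.
request_id = contextvars.ContextVar("request_id", default=None)


async def handler(value, log):
    request_id.set(value)
    await asyncio.sleep(0)  # yield control so the other task runs
    # Still sees its own value, even though another task set a different one.
    log.append((value, request_id.get()))


async def main():
    log = []
    await asyncio.gather(handler("a", log), handler("b", log))
    # The parent task's value is untouched by what the children set.
    return sorted(log), request_id.get()


seen, outer = asyncio.run(main())
```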

If it weren't for Python's operator overloading, the decimal module
would have used explicit contexts (like the Java version); but since
it would be really strange to have such a fundamental numeric type
without the ability to use the conventional operator notation, we
resorted to per-thread context. Even that doesn't always do the right
thing -- handling decimal contexts is surprisingly subtle (as Nick can
testify based on his experiences attempting to write a decimal context
manager for the with-statement!).
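
The two styles can be put side by side. The sketch below assumes the
decimal module as it exists in later releases: Context objects offer
explicit methods such as divide(), and decimal.localcontext() (added
after this thread) changes the per-thread context for a block so that
the ordinary operators pick it up.

```python
import decimal
from decimal import Decimal

# Explicit context: the precision travels with the call.
explicit = decimal.Context(prec=5)
q_explicit = explicit.divide(Decimal(1), Decimal(3))

# Implicit context: the operator consults the current per-thread context.
with decimal.localcontext() as ctx:
    ctx.prec = 5
    q_implicit = Decimal(1) / Decimal(3)

# Outside the block the surrounding context applies again.
q_outside = Decimal(1) / Decimal(3)
```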

Yes, coroutines make it even subtler.

But I haven't seen the use case yet for mixing coroutines with changes
to decimal context settings; somehow it doesn't strike me as a likely
use case (not that you can't construct one, so don't bother -- I can
imagine it too, I just think YAGNI).

> By the way, I don't know if you do much with Java these days, but a big
> part of the whole J2EE fiasco and the rise of the so-called "lightweight
> containers" in Java has all been about how to manage implicit context so
> that you don't get stuck with either the inflexibility of globals or the
> deadweight of passing tons of parameters around.

I have to trust your word on that; I'm using Tomcat and not liking it
but overly long parameter lists or context management aren't on my
list of gripes. I have no idea what a "lightweight container" is. It
sounds (especially since you put it in scare quotes :-) like a typical
Java understatement.

> One of the big selling
> points of AspectJ is that it lets you implicitly funnel parameters from
> point A to point B without having to modify all the call signatures in
> between.

Again, I'll have to trust you on this. I've never tried AspectJ or any
other aspect-oriented system. But frankly I believe the idea is
overhyped -- there are a few example cases that everyone uses to show
it off (persistence, thread-safety) but I'm not sure these warrant the
weight of the solution.

> In other words, its use is promoted for precisely the sort of
> thing that 'with' plus a task variable would be ideal for.

This I simply don't follow (except that you seem to agree with me that
AspectJ is overkill :-). The with-statement is primarily useful for
mandatory cleanup (or release) and for restoring temporary changes.
Even if decimal contexts were always passed around explicitly, a
with-statement around a block with temporarily increased precision or
changed error handling would make sense.

> As far as I can
> tell, 'with' plus a task variable is *much* easier to explain, use, and
> understand than an aspect-oriented programming tool is!  (Especially from
> the "if the implementation is easy to explain, it may be a good idea"
> perspective.)

And this may be a very good thing. But I still expect that the number
of people who need these is a lot smaller than you think (since
clearly *you* need it :-).

> >I know that somewhere in the proto-PEP Phillip argues that the context
> >API needs to be made a part of the standard library so that his
> >trampoline can efficiently swap implicit contexts required by
> >arbitrary standard and third-party library code. My response to that
> >is that library code (whether standard or third-party) should not
> >depend on implicit context unless it can assume complete
> >control over the application.
>
> I think maybe there's some confusion here, at least on my part.  :)  I see
> two ways to read your statement, one of which seems to be saying that we
> should get rid of the decimal context (because it doesn't have complete
> control over the application), and the other way of reading it doesn't seem
> connected to what I proposed.

I simply see decimal as the exception that proves the rule.

> Anything that's a global variable is an "implicit context".

See above for my quibbles with that.

> Because of
> that, I spent considerable time and effort in PEAK trying to utterly stamp
> out global variables.  *Everything* in PEAK has an explicit context.  But
> that then becomes more of a pain to *use*, because you are now stuck with
> managing it, even if you cram it into a Zope-style acquisition tree so
> there's only one "context" to deal with.  Plus, it assumes that everything
> the developer wants to do can be supplied by *one* framework, be it PEAK,
> Zope, or whatever, which is rarely the case but still forces framework
> developers to duplicate everybody else's stuff.

Well, face it. Frameworks want to control the world. Multiple
frameworks rarely cooperate until they somehow agree on a common
ground. That usually doesn't happen until both frameworks are already
mature, and then it's painful of course. But I don't see a solution --
that's just the nature of frameworks.

> In other words, I've come to realize that the path the major Python
> application frameworks is not really Pythonic.

(Is there a missing word "take" after "frameworks"?)

> A Pythonic framework shouldn't load you down with new
> management burdens and keep you from using
> other frameworks.  It should make life easier, and make your code *more*
> interoperable, not less.  Indeed, I've pretty much come to agreement with
> the part of the Python developer community that says Frameworks Are
> Evil.

I would agree, yes. :-)

> A primary source of this evil in the big three frameworks (PEAK,
> Twisted, and Zope) stem from their various approaches to dealing with this
> issue of context, which lack the simplicity of global (or task-local)
> variables.

I think that's rather an exaggeration (again for effect?).

They're frameworks, they want you to do everything in a way that
reflects the framework's philosophy.

Python, in its design philosophy, tries hard *not* to be a framework.
(This sets it apart from Java, which is hostile to non-Java code.)
Python tries to be helpful when you want to solve part of your problem
using a different tool. It tries to work well even if Python is only a
small part of your total solution. It tries to be agnostic of
platform-specific frameworks, optionally working with them (e.g. fork
and pipes on Unix) but not depending or relying on them. Even threads
are quite optional to Python.

> So, the lesson I've taken from my attempt to make everything explicit is
> that what developers *really* want is to have global variables, just
> without the downsides of uncontrolled modifications, and inter-thread or
> inter-task pollution.  Explicit isn't always better than implicit, because
> oftentimes the practicality of having implicit things is much more
> important than the purity of making them all explicit.  Simple is better
> than complex, and task-local variables are *much* simpler than trying to
> make everything explicit.

Maybe. But this may just be a case where you simply can't have your
cake and eat it too. I expect that having 100 task-local variables
would prove to be just as big a pain as 100 other forms of context,
implicit or explicit.

> >Also, Nick wants the name 'context' for PEP-343 style context
> >managers. I think it's overloading too much to use the same word for
> >per-thread or per-coroutine context.
>
> Actually, I was the one who originally proposed the term "context manager",
> and it doesn't seem like a conflict to me.  Indeed, I suggested in the
> pre-PEP that "@context.manager" might be where we could put the
> decorator.  The overload was intentional, to suggest that when creating a
> new context manager, it's worth considering whether the state should be
> kept in a context variable, rather than a global variable.  The naming
> choice was for propaganda purposes, in other words.  :)

That may be, but I think it's confusing, since most of the popular
uses of the with-statement will have nothing to do with task-locals.

> Anyway, I'll withdraw the proposal for now.

Thanks.

> We can always leave it out of
> 2.5, I can release an independent implementation, and then submit it for
> consideration again in the 2.6 timeframe.

That sounds like a much better plan than rushing into it now.

> I just thought it would be a
> no-brainer to use task locals where thread locals are currently being used,
> and that's really all I was proposing we do as far as stdlib changes
> anyway.  I was also hoping to get good input from Python-dev regarding
> some of the open issues, to try and build a consensus on them from
> the beginning.

If you look at the code in decimal.py, it already has three different
ways to handle contexts, depending on the Python version and whether
it has threads. Adding task-locals would just complicate matters.

(Sorry for the long post -- there just wasn't anything you said that I
felt could be left unquoted. :-)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Fri Oct 21 04:59:42 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 20 Oct 2005 19:59:42 -0700
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <200510211202.12015.anthony@interlink.com.au>
References: <200510211202.12015.anthony@interlink.com.au>
Message-ID: <ca471dc20510201959h3c9ddc39o5b8860584c261b52@mail.gmail.com>

On 10/20/05, Anthony Baxter <anthony at interlink.com.au> wrote:
> So it looks like the AST branch has landed. Wooo! Well done to all who
> were involved - it seems like it's been a huge amount of work.

Hear, hear. Great news! Thanks to Jeremy, Neil and all the others. I
can't wait to check it out!

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ark at acm.org  Fri Oct 21 06:58:28 2005
From: ark at acm.org (Andrew Koenig)
Date: Fri, 21 Oct 2005 00:58:28 -0400
Subject: [Python-Dev] Coroutines, generators, function calling
In-Reply-To: <4356C547.8020402@cignex.com>
Message-ID: <006101c5d5fc$16ddb2b0$6402a8c0@arkdesktop>

> so the new syntax would
> not be useful, unless it was something that provided access to the index
> item as a variable, like:
> 
> yield foo(i) for i in x
> 
> which barely saves you anything (a colon, a newline, and an indent).

Not even that, because you can omit the newline and indent:

	for i in x: yield foo(i)

There's a bigger difference between

	for i in x: yield i

and

	yield from x

Moreover, I can imagine optimization opportunities for "yield from" that
would not make sense in the context of comprehensions.
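
The 'yield from' spelling was only a proposal at this point; it was
later adopted as PEP 380 in Python 3.3. Under that assumption the two
forms compared above can be written as the following sketch, which
covers plain iteration only (the function names are made up).

```python
def explicit(x):
    # The loop form: re-yield every item by hand.
    for i in x:
        yield i


def delegating(x):
    # The delegation form: hand the whole iterable to the caller.
    yield from x


loop_items = list(explicit(range(3)))
delegated_items = list(delegating(range(3)))
```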




From nnorwitz at gmail.com  Fri Oct 21 07:00:16 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 20 Oct 2005 22:00:16 -0700
Subject: [Python-Dev] Questionable AST wibbles
Message-ID: <ee2a432c0510202200u394cedb7hf5b96641bda4ca1a@mail.gmail.com>

Jeremy,

There are a bunch of mods from the AST branch that got integrated into
head.  Hopefully, by doing this on python-dev more people will get
involved.  I'll describe  high level things first, but there will be a
ton of details later on.  If people don't want to see this crap on
python-dev, I can take this offline.

High-level overview of code size (rough decrease of 300 C source lines):
 * Python/compile.c -2733 (was 6822 now 4089)
 * Python/Python-ast.c +2281 (new file)
 * Python/asdl.c +92 (new file)
 * plus other minor mods

symtable.h has lots of changes to structs and APIs.  Not sure what
needs to be doc'ed.

I was very glad to see that ./python compileall.py Lib took virtually
the same time before and after AST.  Yeah!  Unfortunately, I can't say
the same for memory usage for running compileall:

Before AST: [10120 refs]
After AST:  [916096 refs]

I believe there aren't that many true memory leaks from running
valgrind.  Though there are likely some ref leaks.  Most of this is
probably stuff that we are just hanging on to that is not required.  I
will continue to run valgrind to find more problems.

A bunch of APIs changed and there is some additional name pollution. 
Since these are pretty internal APIs, I'm not sure that part is a big
deal.  I will try to find more name pollution and eliminate it by
prefixing with Py.

One API change which I think was a mistake was _Py_Mangle() losing 2
parameters (I think this was how it was a long time ago).  See
typeobject.c, Python.h, compile.c.

pythonrun.h has a bunch of changes.  I think a lot of the APIs
changed, but there might be backwards compatible macros.  I'm not
sure.  I need to review closely.

symtable.h has lots of changes to structs and APIs.  Not sure what
needs to be doc'ed.  Some #defines are history (I think they are in
the enum now):  TYPE_*.

code.h was added, but it mostly contains stuff from compile.h.  Should
we remove code.h and just put everything in compile.h?  This will
remove lots of little changes.
code.h & compile.h are tightly coupled.  If we keep them separate, I
would like to see some other changes.

This probably is not a big deal, but I was surprised by this change:

+++ test_repr.py        20 Oct 2005 19:59:24 -0000      1.20
@@ -123,7 +123,7 @@

    def test_lambda(self):
        self.failUnless(repr(lambda x: x).startswith(
-            "<function <lambda"))
+            "<function lambda"))

This one may be only marginally worse (names w/parameter unpacking):

test_grammar.py

-    verify(f4.func_code.co_varnames == ('two', '.2', 'compound',
-                                        'argument',  'list'))
+    vereq(f4.func_code.co_varnames,
+          ('two', '.1', 'compound', 'argument',  'list'))

There are still more things I need to review.  These were the biggest
issues I found.  I don't think most are that big of a deal, just
wanted to point stuff out.

n

From pje at telecommunity.com  Fri Oct 21 07:34:02 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 21 Oct 2005 01:34:02 -0400
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com>
References: <5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com>
	<5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
	<5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com>
Message-ID: <5.1.1.6.0.20051021012649.02a37070@mail.telecommunity.com>

At 07:57 PM 10/20/2005 -0700, Guido van Rossum wrote:
>(Sorry for the long post -- there just wasn't anything you said that I
>felt could be left unquoted. :-)

Wow.  You've brought up an awful lot of stuff I want to respond to, about 
the nature of frameworks, AOP, Chandler, PEP 342, software deployment, 
etc.  But I know you're busy, and the draft I was working on in reply to 
this has gotten simply huge and still unfinished, so I think I should just 
turn it all into a blog article on "Why Frameworks Are Evil And What We Can 
Do To Stop Them".  :)

I don't think I've exaggerated anything, though.  I think maybe you're 
perceiving more vehemence than I actually have on the issue.  Context 
variables are a very small thing and I've not been arguing that they're a 
big one.  In the scope of the coming Global War On Frameworks, they are 
pretty small potatoes.  :)


From nnorwitz at gmail.com  Fri Oct 21 08:35:57 2005
From: nnorwitz at gmail.com (Neal Norwitz)
Date: Thu, 20 Oct 2005 23:35:57 -0700
Subject: [Python-Dev] problem with genexp
In-Reply-To: <ee2a432c0510162221g31fcc32dwb0ac3cbf16fcc89d@mail.gmail.com>
References: <ee2a432c0510102115m19581b97h79cc3df6e1dadd27@mail.gmail.com>
	<ee2a432c0510162221g31fcc32dwb0ac3cbf16fcc89d@mail.gmail.com>
Message-ID: <ee2a432c0510202335r3aed1f5bwfca4d0c962097c83@mail.gmail.com>

On 10/16/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> On 10/10/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> > There's a problem with genexp's that I think really needs to get
> > fixed.  See http://python.org/sf/1167751 the details are below.  This
> > code:
> >
> > >>> foo(a = i for i in range(10))
> >
> > I agree with the bug report that the code should either raise a
> > SyntaxError or do the right thing.
>
> The change to Grammar/Grammar below seems to fix the problem and all
> the tests pass.  Can anyone comment on whether this fix is
> correct/appropriate?  Is there a better way to fix the problem?

Since no one responded other than Jiwon, I checked in this change.  I
did *not* backport it since what was syntactically correct in 2.4.2
would raise an error in 2.4.3.  I'm not sure which is worse.  I'll
leave it up to Anthony whether this should be backported.

BTW, the change was the same regardless of old code vs. new AST code.
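
A quick way to see the behaviour after such a fix is to compile both
spellings. This is a sketch with a made-up compiles() helper; the
outcome shown in the comments is what current CPython 3 releases do,
and an interpreter predating the Grammar change would differ.

```python
def compiles(source):
    """Return True if `source` compiles, False on SyntaxError."""
    try:
        compile(source, "<sketch>", "eval")
    except SyntaxError:
        return False
    return True


# Bare generator expression used as a keyword argument value: rejected.
bare = compiles("foo(a = i for i in range(10))")

# Parenthesized generator expression as the keyword value: accepted.
parenthesized = compiles("foo(a=(i for i in range(10)))")
```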

n

From mwh at python.net  Fri Oct 21 14:41:47 2005
From: mwh at python.net (Michael Hudson)
Date: Fri, 21 Oct 2005 13:41:47 +0100
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com>
	(Guido van Rossum's message of "Thu, 20 Oct 2005 19:57:16 -0700")
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<43579027.6040007@gmail.com> <43579ADC.80006@gmail.com>
	<5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com>
	<ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com>
Message-ID: <2mr7af6nzo.fsf@starship.python.net>

Guido van Rossum <guido at python.org> writes:

> On 10/20/05, Phillip J. Eby <pje at telecommunity.com> wrote:
>> At 08:57 AM 10/20/2005 -0700, Guido van Rossum wrote:
>> >Whoa, folks! Can I ask the gentlemen to curb their enthusiasm?
>> >
>> >PEP 343 is still (back) on the drawing table, PEP 342 has barely been
>> >implemented (did it survive the AST-branch merge?), and already you
>> >are talking about adding more stuff. Please put on the brakes!
>>
>> Sorry.  I thought that 343 was just getting a minor tune-up.
>
> Maybe, but the issues on the table are naming issues -- is __with__
> the right name, or should it be __context__? Should the decorator be
> applied implicitly? Should the decorator be called @context or
> @contextmanager?
>
>> In the months
>> since the discussion and approval (and implementation; Michael Hudson
>> actually had a PEP 343 patch out there),
>
> Which he described previously as "a hack"

Err, that was the code I used for my talk at EuroPython.  That really
*was* a hack.  The code on SF is much better.

> and apparently didn't feel comfortable checking in.

Well, I was kind of hoping for a review, or positive comment on the
tracker, or *something* (Phillip posted half a review here a couple of
weeks ago, but I've been stupidly stupidly busy since then).

> At least some of it will have to be redone, (a) for the AST code,

Indeed.  Not much, I hope, the compiler changes were fairly simple.

> and (b) for the revised PEP.

Which I still haven't digested :-/

Cheers,
mwh

-- 
  I'm about to search Google for contract assassins to go to Iomega
  and HP's programming groups and kill everyone there with some kind
  of electrically charged rusty barbed thing.
                -- http://bofhcam.org/journal/journal.html, 2002-01-08

From ncoghlan at gmail.com  Fri Oct 21 15:34:14 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 21 Oct 2005 23:34:14 +1000
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <5.1.1.6.0.20051020131024.02033d58@mail.telecommunity.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>
	<5.1.1.6.0.20051020131024.02033d58@mail.telecommunity.com>
Message-ID: <4358EE56.4010903@gmail.com>

Phillip J. Eby wrote:
> Actually, it's fairly simple to write a generator decorator using 
> context.swap() that saves and restores the current execution state 
> around next()/send()/throw() calls, if you prefer it to be the 
> generator's responsibility to maintain such context.

Yeah, I also realised there's a fairly obvious solution to my decimal.Context 
"problem" too:

     def iter_sin(iterable):
        orig_ctx = decimal.getcontext()
        with orig_ctx as ctx:
            ctx.prec += 10
            for r in iterable:
                y = sin(r) # Very high precision during calculation
                with orig_ctx:
                    yield +y # Interim results have normal precision
                # We get "ctx" back here
        # We get "orig_ctx" back here

That is, if you want to be able to restore the original context just *save* 
the damn thing. . .
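
The same idea can be written against the decimal API of later releases.
This is a sketch under several assumptions: 'with orig_ctx as ctx' was
draft PEP 343 spelling, so decimal.localcontext() stands in for it, and
Decimal.sqrt() replaces sin(), which the decimal module does not
provide. The name iter_sqrt and the 'extra' parameter are invented.

```python
import decimal
from decimal import Decimal


def iter_sqrt(iterable, extra=10):
    # Save the caller's context so it can be reinstated around each yield.
    orig_ctx = decimal.getcontext().copy()
    with decimal.localcontext() as ctx:
        ctx.prec += extra
        for r in iterable:
            y = Decimal(r).sqrt()  # extra precision during calculation
            with decimal.localcontext(orig_ctx):
                # The consumer runs here and sees the normal precision;
                # unary plus rounds the result to that precision.
                yield +y
            # The high-precision context is back in force here.
    # The caller's own context is back in force here.


base = decimal.getcontext().prec
seen = []
for value in iter_sqrt([2, 3]):
    # Record the precision the consumer sees and the digits it receives.
    seen.append((decimal.getcontext().prec, len(value.as_tuple().digits)))
```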

Ah well, chalk the __suspend__/__resume__ idea up as another case of me 
getting overly enthusiastic about a complex idea without looking for simpler 
solutions first. It's not like it would be the first time ;)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Fri Oct 21 16:30:26 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Oct 2005 00:30:26 +1000
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <200510211202.12015.anthony@interlink.com.au>
References: <200510211202.12015.anthony@interlink.com.au>
Message-ID: <4358FB82.9040109@gmail.com>

Anthony Baxter wrote:
> So it looks like the AST branch has landed. Wooo! Well done to all who 
> were involved - it seems like it's been a huge amount of work.

Congratulations from this quarter, too.

I really liked the structure of the new compiler in the limited time I spent 
working with it on the AST branch, and am glad it has made its way onto the 
HEAD for Python 2.5.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Fri Oct 21 17:08:43 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Oct 2005 01:08:43 +1000
Subject: [Python-Dev] Pre-PEP: Task-local variables
In-Reply-To: <ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com>
References: <5.1.1.6.0.20051019213825.01fafa00@mail.telecommunity.com>	<43579027.6040007@gmail.com>
	<43579ADC.80006@gmail.com>	<5.1.1.6.0.20051020163313.01faf660@mail.telecommunity.com>
	<ca471dc20510201957m7823c49ama127de972eef4028@mail.gmail.com>
Message-ID: <4359047B.6020203@gmail.com>

Guido van Rossum wrote:
> If it weren't for Python's operator overloading, the decimal module
> would have used explicit contexts (like the Java version); but since
> it would be really strange to have such a fundamental numeric type
> without the ability to use the conventional operator notation, we
> resorted to per-thread context. Even that doesn't always do the right
> thing -- handling decimal contexts is surprisingly subtle (as Nick can
> testify based on his experiences attempting to write a decimal context
> manager for the with-statement!).

Indeed. Fortunately it isn't as complicated as I feared last night (it turned 
out to be a problem with me trying to hit a small nail with the new 
sledgehammer I was playing with, forgetting entirely about the trusty old 
normal hammer still in the toolkit).

> But I haven't seen the use case yet for mixing coroutines with changes
> to decimal context settings; somehow it doesn't strike me as a likely
> use case (not that you can't construct one, so don't bother -- I can
> imagine it too, I just think YAGNI).

For Python 2.5, I think the approach of generators explicitly reverting 
altered contexts around yield expressions is a reasonable way to go.

This concept is workable for generators, because they *know* when they're 
going to lose control (i.e., by invoking yield), whereas it's impossible for 
threads to know when the eval loop is going to drop them in favour of a 
different thread.

I think the parallel between __iter__ and __with__ continues to work here, too 
- alternate context managers to handle reversion of the context (e.g., 
Lock.released()) can be provided as separate methods, just as alternative 
iterators are provided (e.g., dict.iteritems(), dict.itervalues()).
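
Lock.released() is hypothetical; no such method exists on
threading.Lock. A stand-alone sketch of the same idea, written as a
helper function with contextlib.contextmanager as it later shipped,
could look like this.

```python
import threading
from contextlib import contextmanager


@contextmanager
def released(lock):
    """Temporarily release a held lock, re-acquiring it on exit."""
    lock.release()
    try:
        yield
    finally:
        lock.acquire()


lock = threading.Lock()
states = []
with lock:
    states.append(lock.locked())      # held
    with released(lock):
        states.append(lock.locked())  # temporarily released
    states.append(lock.locked())      # held again
states.append(lock.locked())          # released by the outer with
```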

Also, just as we eventually added "itertools" to support specific ways of 
working with iterators, I expect to eventually see "contexttools" to support 
specific ways of working with contexts (e.g. duck-typed contexts like 
"closing", or a 'nested' context that allowed multiple resources to be 
properly managed by a single with statement).
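
Both ideas did end up in the standard library, though under the name
'contextlib' rather than 'contexttools'. The sketch below assumes
contextlib.closing and contextlib.ExitStack (the latter arrived in
Python 3.3 and covers the 'nested' use case); the Resource class is
made up for the example.

```python
from contextlib import ExitStack, closing


class Resource:
    """Hypothetical object that has close() but is not a context manager."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def close(self):
        self.log.append(self.name)


log = []
with ExitStack() as stack:
    # closing() adapts anything with a close() method (duck typing).
    a = stack.enter_context(closing(Resource("a", log)))
    b = stack.enter_context(closing(Resource("b", log)))
# On exit the resources are closed in reverse order of entry.
```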

contexttools would also be the place for ideas like suspending and resuming a 
context - rather than requiring specific syntax, it could be implemented as a 
context manager:

   ctx = suspendable_context(EXPR)
   with ctx as VAR:
     # VAR would still be the result of (EXPR).__with__().__enter__()
     # It's just that suspendable_context would be taking care of
     # making that happen, rather than it happening the usual way
     with ctx.suspended():
       # Context is suspended here
     # Context is resumed here
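
A rough sketch of how such a wrapper might look. Everything here is 
hypothetical: the class name is invented, it calls __enter__/__exit__ on 
the wrapped object directly instead of going through a __with__ method, 
and it assumes the wrapped manager can be entered again after it has been 
exited (true of a lock, not of a file):

```python
import threading
from contextlib import contextmanager


class SuspendableContext:
    """Hypothetical wrapper; not an existing library class."""

    def __init__(self, manager):
        self._manager = manager

    def __enter__(self):
        return self._manager.__enter__()

    def __exit__(self, exc_type, exc_value, traceback):
        return self._manager.__exit__(exc_type, exc_value, traceback)

    @contextmanager
    def suspended(self):
        # Leave the wrapped context, run the nested block, then re-enter.
        self._manager.__exit__(None, None, None)
        try:
            yield
        finally:
            self._manager.__enter__()


lock = threading.Lock()
ctx = SuspendableContext(lock)
with ctx:
    held_inside = lock.locked()
    with ctx.suspended():
        held_while_suspended = lock.locked()
    held_after_resume = lock.locked()
```

With a lock as the managed resource, this behaves much like the 
Lock.released() idea mentioned earlier: the lock is dropped for the inner 
block and reacquired afterwards.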

I do *not* think we should add contexttools in Python 2.5, because there's far 
too much chance of YAGNI. We need experience with the 'with' statement before 
we can really identify the tools that are appropriate.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From jeremy at alum.mit.edu  Fri Oct 21 17:13:47 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 21 Oct 2005 11:13:47 -0400
Subject: [Python-Dev] Questionable AST wibbles
In-Reply-To: <ee2a432c0510202200u394cedb7hf5b96641bda4ca1a@mail.gmail.com>
References: <ee2a432c0510202200u394cedb7hf5b96641bda4ca1a@mail.gmail.com>
Message-ID: <e8bf7a530510210813n7734e80g2af18ca1dc0236f@mail.gmail.com>

On 10/21/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> There are a bunch of mods from the AST branch that got integrated into
> head.  Hopefully, by doing this on python-dev more people will get
> involved.  I'll describe  high level things first, but there will be a
> ton of details later on.  If people don't want to see this crap on
> python-dev, I can take this offline.

Thanks for all the notes and questions, Neal.  There were a lot of
changes made over a long time, and it's good to discuss some of them.

> Highlevel overview of code size (rough decrease of 300 C source lines):
>  * Python/compile.c -2733 (was 6822 now 4089)
>  * Python/Python-ast.c +2281 (new file)
>  * Python/asdl.c +92 (new file)
>  * plus other minor mods
>
> symtable.h has lots of changes to structs and APIs.  Not sure what
> needs to be doc'ed.

The old symtable wasn't well documented and the API it exposed to
Python programmers was lousy.  We need to figure out a good Python API
and document it.

> I was very glad to see that ./python compileall.py Lib took virtually
> the same time before and after AST.  Yeah!  Unfortunately, I can't say
> the same for memory usage for running compileall:
>
> Before AST: [10120 refs]
> After AST:  [916096 refs]

That's great news!  That is, I expected it to be a lot slower to
compile and didn't have any particularly good ideas about how to speed
it up.  I expected there to be a lot of memory bloat and think we can
fix that without undue effort :-).

> A bunch of APIs changed and there is some additional name pollution.
> Since these are pretty internal APIs, I'm not sure that part is a big
> deal.  I will try to find more name pollution and eliminate it by
> prefixing with Py.

Right.  The code isn't binary compatible with Python 2.4 right now,
but given the APIs that changed I wasn't too concerned about that. 
I'm not sure who should make the final decision there.

> One API change which I think was a mistake was _Py_Mangle() losing 2
> parameters (I think this was how it was a long time ago).  See
> typeobject.c, Python.h, compile.c.

I don't mind this one since it's an _Py function.  I don't think code
outside the core should use it.

> pythonrun.h has a bunch of changes.  I think a lot of the APIs
> changed, but there might be backwards compatible macros.  I'm not
> sure.  I need to review closely.

We should double-check.  I tried to get rid of the nest of different
functions that call each other by replacing the old ones with macros
that call the newest ones (the functions that take the most
arguments).  It's not really a related change, except that it seemed
like cleanup of compiler-related code.  Also, a bunch of functions
started taking const char* instead of char*.  I think that's a net
win, too.

> code.h was added, but it mostly contains stuff from compile.h.  Should
> we remove code.h and just put everything in compile.h?  This will
> remove lots of little changes.
> code.h & compile.h are tightly coupled.  If we keep them separate, I
> would like to see some other changes.

I would like to keep them separate.  The compiler produces code
objects, but consumers of code objects don't need to know anything
about the compiler.  You did remind me that I intended to remove the
#include "compile.h" lines from a bunch of files that merely consume
code objects.  What other changes would you like to see?

> This probably is not a big deal, but I was surprised by this change:
>
> +++ test_repr.py        20 Oct 2005 19:59:24 -0000      1.20
> @@ -123,7 +123,7 @@
>
>     def test_lambda(self):
>         self.failUnless(repr(lambda x: x).startswith(
> -            "<function <lambda"))
> +            "<function lambda"))
>
> This one may be only marginally worse (names w/parameter unpacking):
>
> test_grammar.py
>
> -    verify(f4.func_code.co_varnames == ('two', '.2', 'compound',
> -                                        'argument',  'list'))
> +    vereq(f4.func_code.co_varnames,
> +          ('two', '.1', 'compound', 'argument',  'list'))
>
> There are still more things I need to review.  These were the biggest
> issues I found.  I don't think most are that big of a deal, just
> wanted to point stuff out.

I don't have a strong sense for how important these changes are.  I
don't think the old behavior was documented, but I can imagine some
code depending on these implementation details.

Jeremy

From guido at python.org  Fri Oct 21 17:26:35 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Oct 2005 08:26:35 -0700
Subject: [Python-Dev] Questionable AST wibbles
In-Reply-To: <e8bf7a530510210813n7734e80g2af18ca1dc0236f@mail.gmail.com>
References: <ee2a432c0510202200u394cedb7hf5b96641bda4ca1a@mail.gmail.com>
	<e8bf7a530510210813n7734e80g2af18ca1dc0236f@mail.gmail.com>
Message-ID: <ca471dc20510210826i124c68d0nb383fbe302b51ff2@mail.gmail.com>

On 10/21/05, Jeremy Hylton <jeremy at alum.mit.edu> wrote:
> On 10/21/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> > This probably is not a big deal, but I was surprised by this change:
> >
> > +++ test_repr.py        20 Oct 2005 19:59:24 -0000      1.20
> > @@ -123,7 +123,7 @@
> >
> >     def test_lambda(self):
> >         self.failUnless(repr(lambda x: x).startswith(
> > -            "<function <lambda"))
> > +            "<function lambda"))

if this means that the __name__ attribute of a lambda now says
"lambda" instead of "<lambda>", please change it back. The angle
brackets make it stand out more, and I imagine people might be
checking for this to handle it specially.

> > This one may be only marginally worse (names w/parameter unpacking):
> >
> > test_grammar.py
> >
> > -    verify(f4.func_code.co_varnames == ('two', '.2', 'compound',
> > -                                        'argument',  'list'))
> > +    vereq(f4.func_code.co_varnames,
> > +          ('two', '.1', 'compound', 'argument',  'list'))

This doesn't bother me.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Fri Oct 21 17:42:44 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 21 Oct 2005 11:42:44 -0400
Subject: [Python-Dev] Questionable AST wibbles
In-Reply-To: <e8bf7a530510210813n7734e80g2af18ca1dc0236f@mail.gmail.com>
References: <ee2a432c0510202200u394cedb7hf5b96641bda4ca1a@mail.gmail.com>
	<ee2a432c0510202200u394cedb7hf5b96641bda4ca1a@mail.gmail.com>
Message-ID: <5.1.1.6.0.20051021114101.01faf080@mail.telecommunity.com>

At 11:13 AM 10/21/2005 -0400, Jeremy Hylton wrote:
>I don't have a strong sense for how important these changes are.  I
>don't think the old behavior was documented, but I can imagine some
>code depending on these implementation details.

I'm pretty sure I've seen code in the field (e.g. recipes in the online 
Python cookbook) that checked for a function's name being 
'<lambda>'.  That's also a thing that's likely to show up in people's doctests.
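
The kind of check in question usually looks something like the sketch 
below. The helper name is illustrative and not taken from any particular 
recipe:

```python
def is_lambda(func):
    """Return True if `func` was created by a lambda expression."""
    # Relies on the compiler naming lambdas "<lambda>", angle brackets included.
    return getattr(func, "__name__", None) == "<lambda>"


def named(x):
    return x
```

Code like this would silently stop recognising lambdas if the name became 
"lambda" without the brackets.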


From jeremy at alum.mit.edu  Fri Oct 21 18:03:54 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 21 Oct 2005 12:03:54 -0400
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <ee2a432c0510201932g147e36belf3b85e39be8e817e@mail.gmail.com>
References: <200510211202.12015.anthony@interlink.com.au>
	<ee2a432c0510201932g147e36belf3b85e39be8e817e@mail.gmail.com>
Message-ID: <e8bf7a530510210903w200704f7n120ad8abd1a11ff9@mail.gmail.com>

On 10/20/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
> On 10/20/05, Anthony Baxter <anthony at interlink.com.au> wrote:
> >
> > Could someone involved give a short email laying out what concrete (no
> > pun intended) advantages this new compiler gives us? Does it just
> > allow us to do new and interesting manipulations of the code during
> > compilation? Cleaner, easier to maintain, or the like?
>

I just wanted to clarify that Neal meant the abstract syntax, not the
grammar.  It should allow people to write tools to analyze Python
source code without having to worry about the often irrelevant details
of the exact tokens or the way they are parsed.  We should be able to
get to a state where tools using the AST work with Python and Jython
(and maybe IronPython, who knows).  The tokenize and parser modules
still exist for tools for which those details aren't irrelevant.

We should also think about how to migrate the compiler module from its
current AST to the new AST, although the backwards compatibility
issues there are a bit tricky.

> The Grammar is (was at one point at least) shared between Jython and
> would allow more tools to be able to share infrastructure.  The idea
> is to eventually be able to have [JP]ython output the same AST to
> tools.  There is quite a bit of generated code based on the Grammar.
> So some stuff should be easier.  Other stuff is just moved.  You still
> need to convert from the AST to the byte code.
>
> Hopefully it will be easier to do various sorts of optimization and
> general manipulation of an AST rather than what existed before.

I think it should be a lot easier to write tools for the C Python
compiler that do extra analysis or optimization.  The existing
peephole optimizer could be improved by integrating it with the
bytecode assembler (for example, eliminating all NOP bytecodes).

Jeremy

From jeremy at alum.mit.edu  Fri Oct 21 18:06:42 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 21 Oct 2005 12:06:42 -0400
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <ca471dc20510201959h3c9ddc39o5b8860584c261b52@mail.gmail.com>
References: <200510211202.12015.anthony@interlink.com.au>
	<ca471dc20510201959h3c9ddc39o5b8860584c261b52@mail.gmail.com>
Message-ID: <e8bf7a530510210906p27911c6dg3c6b960edc9c81a@mail.gmail.com>

On 10/20/05, Guido van Rossum <guido at python.org> wrote:
> On 10/20/05, Anthony Baxter <anthony at interlink.com.au> wrote:
> > So it looks like the AST branch has landed. Wooo! Well done to all who
> > were involved - it seems like it's been a huge amount of work.
>
> Hear, hear. Great news! Thanks to Jeremy, Neil and all the others. I
> can't wait to check it out!

I want to thank all the people who made it possible by writing code
and debugging.  I hope this is a complete list:

Armin Rigo
Brett Cannon
Grant Edwards
John Ehresman
Kurt Kaiser
Neal Norwitz
Neil Schemenauer
Nick Coghlan
Tim Peters

And thanks to the PSF and PyCon organizers for hosting the formerly
annual ast-branch sprints!

Jeremy

From nas at arctrix.com  Fri Oct 21 20:32:22 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 21 Oct 2005 18:32:22 +0000 (UTC)
Subject: [Python-Dev] AST branch is in?
References: <200510211202.12015.anthony@interlink.com.au>
Message-ID: <djbc7m$ptk$1@sea.gmane.org>

Anthony Baxter <anthony at interlink.com.au> wrote:
> Could someone involved give a short email laying out what concrete (no 
> pun intended) advantages this new compiler gives us?

One advantage is that it decreases the coupling between the parser
and the backend of the compiler. For example, it should be possible
to replace the parser without modifying a lot of the compiler.
Also, the concrete syntax tree (CST) generated by Python's parser is
not a convenient data structure to deal with. Anyone who's used the
'parser' module probably experienced the pain:

    >>> parser.ast2list(parser.suite('a = 1'))
    [257, [266, [267, [268, [269, [320, [298, [299, [300, [301,
    [303, [304, [305, [306, [307, [308, [309, [310, [311, [1,
    'a']]]]]]]]]]]]]]], [22, '='], [320, [298, [299, [300, [301, [303,
    [304, [305, [306, [307, [308, [309, [310, [311, [2,
    '1']]]]]]]]]]]]]]]]], [4, '']]], [0, '']]
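
For contrast, a sketch of what a tool can look like when it works on an 
abstract syntax tree instead. This uses the ast module from later Python 
releases, which did not exist when this message was written, and the 
helper name is an illustrative assumption:

```python
import ast


def assigned_names(source):
    """Return the simple names bound by assignment statements in `source`."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                # Only plain names; tuple targets and attributes are skipped.
                if isinstance(target, ast.Name):
                    names.append(target.id)
    return names
```

The tool only has to know about Assign and Name nodes, not about the chain 
of grammar productions that wraps them in the concrete tree.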

> Does it just allow us to do new and interesting manipulations of
> the code during compilation?

Well, that's a pretty big deal, IMHO. For example, adding
pychecker-like functionality should be straightforward now. I also
hope some of the namespace optimizations get explored (e.g. PEP
267).

> Cleaner, easier to maintain, or the like?

At this point, the old and new compiler are pretty similar in terms
of complexity. However, the new compiler is a much better base to
build upon.

  Neil


From guido at python.org  Fri Oct 21 21:13:36 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Oct 2005 12:13:36 -0700
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <djbc7m$ptk$1@sea.gmane.org>
References: <200510211202.12015.anthony@interlink.com.au>
	<djbc7m$ptk$1@sea.gmane.org>
Message-ID: <ca471dc20510211213we63e30fsd3639fa7243c7fe6@mail.gmail.com>

On 10/21/05, Neil Schemenauer <nas at arctrix.com> wrote:
> Also, the concrete syntax tree (CST) generated by Python's parser is
> not a convenient data structure to deal with. Anyone who's used the
> 'parser' module probably experienced the pain:
>
>     >>> parser.ast2list(parser.suite('a = 1'))
>     [257, [266, [267, [268, [269, [320, [298, [299, [300, [301,
>     [303, [304, [305, [306, [307, [308, [309, [310, [311, [1,
>     'a']]]]]]]]]]]]]]], [22, '='], [320, [298, [299, [300, [301, [303,
>     [304, [305, [306, [307, [308, [309, [310, [311, [2,
>     '1']]]]]]]]]]]]]]]]], [4, '']]], [0, '']]

That's the fault of the 'parser' extension module though, and this
affects tools using the parser module, not the bytecode compiler
itself. The CST exposed to C programmers is slightly higher level.
(But the new AST is higher level still, of course.)

BTW, Elemental is letting me open-source a reimplementation of pgen in
Python. This also includes a nifty way to generate ASTs. This should
become available within a few weeks.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at gmail.com  Fri Oct 21 23:43:01 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Oct 2005 07:43:01 +1000
Subject: [Python-Dev] Questionable AST wibbles
In-Reply-To: <e8bf7a530510210813n7734e80g2af18ca1dc0236f@mail.gmail.com>
References: <ee2a432c0510202200u394cedb7hf5b96641bda4ca1a@mail.gmail.com>
	<e8bf7a530510210813n7734e80g2af18ca1dc0236f@mail.gmail.com>
Message-ID: <435960E5.7090502@gmail.com>

Jeremy Hylton wrote:
> I would like to keep them separate.  The compiler produces code
> objects, but consumers of code objects don't need to know anything
> about the compiler.

Please do keep these separate - the only reason I've ever had to muck with 
code objects is to check if a function is a generator or not, and including 
the entire compiler header just for that seemed like overkill.

It's not a huge issue for me, but the separate header files do give better 
'separation of concerns'.
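
At the Python level, the same check only needs the code object and one 
flag. A sketch, using the modern __code__ attribute spelling and the 
CO_GENERATOR constant that the inspect module exposes (the function names 
are illustrative):

```python
import inspect


def is_generator_function(func):
    """Check the CO_GENERATOR flag on the function's code object."""
    return bool(func.__code__.co_flags & inspect.CO_GENERATOR)


def plain():
    return 1


def counting():
    yield 1
```

Nothing about the compiler is involved, which is the point of keeping the 
code object definitions in their own header.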

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From mal at egenix.com  Sat Oct 22 00:01:06 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 22 Oct 2005 00:01:06 +0200
Subject: [Python-Dev] Questionable AST wibbles
In-Reply-To: <ee2a432c0510202200u394cedb7hf5b96641bda4ca1a@mail.gmail.com>
References: <ee2a432c0510202200u394cedb7hf5b96641bda4ca1a@mail.gmail.com>
Message-ID: <43596522.7090209@egenix.com>

Neal Norwitz wrote:
> Jeremy,
> 
> There are a bunch of mods from the AST branch that got integrated into
> head.  Hopefully, by doing this on python-dev more people will get
> involved.  I'll describe  high level things first, but there will be a
> ton of details later on.  If people don't want to see this crap on
> python-dev, I can take this offline.
> 
> Highlevel overview of code size (rough decrease of 300 C source lines):
>  * Python/compile.c -2733 (was 6822 now 4089)
>  * Python/Python-ast.c +2281 (new file)
>  * Python/asdl.c +92 (new file)
>  * plus other minor mods

FYI, I'm getting these warnings:

Python/Python-ast.c: In function `marshal_write_expr_context':
Python/Python-ast.c:1995: warning: unused variable `i'
Python/Python-ast.c: In function `marshal_write_boolop':
Python/Python-ast.c:2070: warning: unused variable `i'
Python/Python-ast.c: In function `marshal_write_operator':
Python/Python-ast.c:2085: warning: unused variable `i'
Python/Python-ast.c: In function `marshal_write_unaryop':
Python/Python-ast.c:2130: warning: unused variable `i'
Python/Python-ast.c: In function `marshal_write_cmpop':
Python/Python-ast.c:2151: warning: unused variable `i'
Python/Python-ast.c: In function `marshal_write_keyword':
Python/Python-ast.c:2261: warning: unused variable `i'
Python/Python-ast.c: In function `marshal_write_alias':
Python/Python-ast.c:2270: warning: unused variable `i'

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 21 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From mal at egenix.com  Sat Oct 22 00:04:20 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Sat, 22 Oct 2005 00:04:20 +0200
Subject: [Python-Dev] New codecs checked in
Message-ID: <435965E4.5050207@egenix.com>

I've checked in a whole bunch of newly generated codecs
which now make use of the faster charmap decoding variant added
by Walter a short while ago.

Please let me know if you find any problems.

Some codecs (esp. the Mac OS X ones) have minor changes.
These originate from updated mapping files on ftp.unicode.org.

I also added an alias iso8859_1 -> latin_1, so that applications
using the iso8859_1 encoding name can benefit from the faster
native implementation of the latin_1 codec.
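
A quick way to see the alias at work, sketched against a current Python 
(the variable names are illustrative):

```python
import codecs

alias_info = codecs.lookup("iso8859_1")
native_info = codecs.lookup("latin_1")

# Both lookups are expected to report the same codec name.
same_codec = alias_info.name == native_info.name

# Either spelling produces the same bytes for a Latin-1 character.
encoded = "caf\u00e9".encode("iso8859_1")
```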

Regards,
-- 
Marc-Andre Lemburg
eGenix.com

From jimjjewett at gmail.com  Sat Oct 22 00:25:47 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 21 Oct 2005 18:25:47 -0400
Subject: [Python-Dev] PEP 267 -- is the semantics change OK?
Message-ID: <fb6fbf560510211525r7732b6aaw3f4ff774b5ce45c0@mail.gmail.com>

(In http://mail.python.org/pipermail/python-dev/2005-October/057501.html)
Neil Schemenauer suggested PEP 267 as an example of something that
might be easier with the AST compiler.

As written, PEP 267 does propose a slight semantics change -- but it
might be an improvement, if it is acceptable.

Today, after

    from othermod import val1
    import othermod
    val2 = othermod.val2
    othermod.val3  # Just making sure it was referenced early

    othermod.val1 = "new1"
    othermod.val2 = "new2"
    othermod.val3 = "new3"

    print val1, val2, othermod.val3

The print statement will see the updated val3, but will still have
the original values for val1 and val2.
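
The current behaviour just described can be checked with a throwaway 
module. This is a sketch: othermod is built in memory and registered in 
sys.modules (an assumption made only so the example is self-contained), 
and the function name is illustrative:

```python
import sys
import types


def current_semantics():
    """Return (val1, val2, othermod.val3) after rebinding all three."""
    mod = types.ModuleType("othermod")
    mod.val1, mod.val2, mod.val3 = "old1", "old2", "old3"
    sys.modules["othermod"] = mod
    try:
        from othermod import val1   # copies the binding into a local name
        import othermod
        val2 = othermod.val2        # likewise a separate local binding

        othermod.val1 = "new1"
        othermod.val2 = "new2"
        othermod.val3 = "new3"

        # Only the attribute lookup sees the rebinding.
        return val1, val2, othermod.val3
    finally:
        del sys.modules["othermod"]
```

Calling current_semantics() returns ("old1", "old2", "new3").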

Under PEP267, all three variables would be compiled to a slot
access in othermod, and would see the updated objects.

In many cases, this would be a *good* thing.  It might allow
reload to be rewritten to do what people expect.  On the other
hand, it would be a change.  Would it be an acceptable change?

-jJ

From jeremy at alum.mit.edu  Sat Oct 22 00:48:34 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 21 Oct 2005 18:48:34 -0400
Subject: [Python-Dev] PEP 267 -- is the semantics change OK?
In-Reply-To: <fb6fbf560510211525r7732b6aaw3f4ff774b5ce45c0@mail.gmail.com>
References: <fb6fbf560510211525r7732b6aaw3f4ff774b5ce45c0@mail.gmail.com>
Message-ID: <e8bf7a530510211548x5f1539d7j443ba116b3893fe3@mail.gmail.com>

On 10/21/05, Jim Jewett <jimjjewett at gmail.com> wrote:
> (In http://mail.python.org/pipermail/python-dev/2005-October/057501.html)
> Neil Schemenauer suggested PEP 267 as an example of something that
> might be easier with the AST compiler.
>
> As written, PEP 267 does propose a slight semantics change -- but it
> might be an improvement, if it is acceptable.

No, it does not.  PEP 267 suggests a way to preserve the existing
semantics.  You could probably come up with a much simpler approach if
you agreed to change semantics.

Jeremy

> Today, after
>
>     from othermod import val1
>     import othermod
>     val2 = othermod.val2
>     othermod.val3  # Just making sure it was referenced early
>
>     othermod.val1 = "new1"
>     othermod.val2 = "new2"
>     othermod.val3 = "new3"
>
>     print val1, val2, othermod.val3
>
> The print statement will see the updated val3, but will still have
> the original values for val1 and val2.
>
> Under PEP267, all three variables would be compiled to a slot
> access in othermod, and would see the updated objects.
>
> In many cases, this would be a *good* thing.  It might allow
> reload to be rewritten to do what people expect.  On the other
> hand, it would be a change.  Would it be an acceptable change?
>
> -jJ

From t-meyer at ihug.co.nz  Sat Oct 22 02:05:11 2005
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Sat, 22 Oct 2005 13:05:11 +1300
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-09-01 through
	2005-09-16
Message-ID: <3A5D2942-E0C3-4768-94D5-E43B914BB80B@ihug.co.nz>

This is over a month late, sorry, but here it is (Steve did his  
threads ages ago; I've fallen really behind).  Summaries for the  
second half of September and the first half of October will soon  
follow.  As always, if anyone is able to give this a quick look that  
would be great.  Feedback to me or Steve (steven.bethard at gmail.com).   
Thanks!

=============
Announcements
=============

-----------------------------
QOTF: Quotes of the Fortnight
-----------------------------

In the thread on the print statement, Charles Cazabon provided some  
nice imagery for Guido's Python 3.0 strategy.  Our first QOTF is his  
comment about the print statement:

     It's an anomaly.  It stands out in the language as a sore thumb
     waiting for Guido's hammer.

We also learned something important about the evolution of Python  
thanks to Paul Moore.  In the thread on the Python 3.0 executable  
name, Greg Ewing worried that if the Python 3.0 executable is named  
"py":

     Python 4.0 is going to just be called "p", and by the time we
     get to Python 5.0, the name will have vanished altogether!

Fortunately, as Paul Moore explains in our second QOTF, these naming  
conventions are exactly as we should expect them:

     That's OK, by the time Python 5.0 comes out, it will have taken
     over the world and be the default language for everything. So
     omitting the name is exactly right :-)

[SJB]

Contributing threads:

- `Replacement for print in Python 3.0 <http://mail.python.org/pipermail/python-dev/2005-September/055995.html>`__
- `Python 3 executable name <http://mail.python.org/pipermail/python-dev/2005-September/056377.html>`__

--------------------------------------------------
The "Swiss Army Knife (...Not)" API design pattern
--------------------------------------------------

This fortnight saw a number of different discussions on what Guido's  
guiding principles are in making design decisions about Python. Guido  
introduced the "Swiss Army Knife (...Not)" API design pattern, which  
has been lauded by some as `the long-lost 20th principle from the Zen  
of Python`_. A direct quote from Guido:

     [I]nstead of a single "swiss-army-knife" function with various
     options that choose different behavior variants, it's better to have
     different dedicated functions for each of the major functionality types.

This principle is the basis for pairs like str.split() and str.rsplit()
or str.find() and str.rfind().  The goal is to keep cognitive
overhead down by associating with each use case a single function
with a minimal number of parameters.
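
The pairs named above can be seen side by side in a short sketch (the 
sample string and variable names are illustrative):

```python
dotted = "package.module.attr"

# One dedicated method per direction, rather than a direction flag.
first_split = dotted.split(".", 1)    # ['package', 'module.attr']
last_split = dotted.rsplit(".", 1)    # ['package.module', 'attr']

first_dot = dotted.find(".")          # index of the leftmost dot
last_dot = dotted.rfind(".")          # index of the rightmost dot
```

Each call reads as a single intention, with no option argument to decode.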

[SJB]

.. _the long-lost 20th principle from the Zen of Python: http://mail.python.org/pipermail/python-dev/2005-September/056228.html

Contributing threads:

- `Replacement for print in Python 3.0 <http://mail.python.org/pipermail/python-dev/2005-September/056202.html>`__
- `Replacement for print in Python 3.0 <http://mail.python.org/pipermail/python-dev/2005-September/056228.html>`__

------------------------
A Python-to-C++ compiler
------------------------

Mark Dufour announced `Shed Skin`_, an experimental Python-to-C++  
compiler, which can convert many Python programs into optimized C++  
code, using static type inference techniques.  It works best for  
Python programs written in a relatively static C++-style; much work  
remains to be done, and Mark would like anyone interested in getting  
involved to contact him.  Shed Skin was one of the recent `Google`_  
`Summer of Code`_ projects.

.. _Shed Skin: http://shedskin.sourceforge.net
.. _Google: http://www.google.com
.. _Summer of Code: http://code.google.com/summerofcode.html

[TAM]

Contributing thread:

- `First release of Shed Skin, a Python-to-C++ compiler. <http://mail.python.org/pipermail/python-dev/2005-September/056356.html>`__

--------------------------------------------------------------
python-checkins followups now stay on the python-checkins list
--------------------------------------------------------------

In a follow-up to a `thread in early July`_, the python-checkins  
mailing list Reply-To header munging has been turned off.   
Previously, follow-ups to python-checkins would be addressed to  
python-dev; now, follow-ups will stay on the python-checkins list by  
default.

.. _thread in early July: http://www.python.org/dev/summary/2005-07-01_2005-07-15.html#behavior-of-sourceforge-when-replying-to-python-checkins

[TAM]

Contributing thread:

- `python-checkins reply-to <http://mail.python.org/pipermail/python-dev/2005-September/056428.html>`__


=========
Summaries
=========

--------------------------------------------
Converting print to a function in Python 3.0
--------------------------------------------

In Python 3.0, Guido wants to change print from a statement to a  
function.  Some of his motivation for this change:

* Converting code that uses the print statement to instead use the  
logging package, a UI package, etc. is complicated because of the  
syntax.  Parentheses, commas and >>s all behave differently in the  
print statement than they would in a function call.
* Having print as a statement makes the language harder to evolve.
For example, if it's determined that Python should gain printf
behavior of some sort, adding this is harder: as a statement, it
would require the introduction of new syntax; as a function, it would
feel like a second-class citizen compared to print.
* Since the print statement always inserts spaces, code that doesn't
want these spaces will often have to use a completely different style
of formatting (e.g. using sys.stdout.write and/or string formatting).
* Changing the behavior of statements is hard, while builtin  
functions can simply be replaced by setting an attribute of __builtin__.

Guido's initial proposal suggested three methods to be adopted by all  
stream (file-like) objects::

     stream.write(a1, a2, ...) equivalent to:
         map(stream.write, map(str, [a1, a2, ...]))
     stream.writeln(a1, a2, ...) equivalent to:
         stream.write(a1, a2, ..., "\n")
     stream.writef(fmt, a1, a2, ...) equivalent to:
         stream.write(fmt % (a1, a2, ...))

Additionally, three new builtins would appear, write(), writeln() and  
writef() which called the corresponding methods on sys.stdout.
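
As a rough sketch of those three methods, here is a wrapper around an 
ordinary stream. The class name and the wrapper approach are assumptions 
made for illustration; the proposal was for stream objects to grow these 
methods themselves:

```python
import io


class ProposedStream:
    """Hypothetical wrapper adding the three proposed methods to a stream."""

    def __init__(self, raw):
        self._raw = raw

    def write(self, *args):
        # Each argument is converted with str() and written with no separator.
        for arg in args:
            self._raw.write(str(arg))

    def writeln(self, *args):
        self.write(*args, "\n")

    def writef(self, fmt, *args):
        self.write(fmt % args)


buffer = io.StringIO()
out = ProposedStream(buffer)
out.writeln("x", "=", 1)
out.writef("%s has %d items", "cart", 3)
```

After these calls the buffer holds "x=1" on the first line, which shows 
that nothing is inserted between arguments.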

People had a number of problems with this initial proposal:

* People make heavy use of the space-insertion behavior of the  
current print statement. With Guido's initial proposal, inserting  
spaces would require manually adding space characters, e.g.
``write(foo, " ", bar, " ", baz)``.
* People want to keep the stream API simple.  With Guido's initial  
proposal, all file-like objects would probably need to support these  
three new methods.  (But see also `Deriving file-like object methods  
from read() and write()`_.)
* People primarily (about 85% of the time) use the print statement to  
print complete lines.  With Guido's initial proposal, the function to  
do this, writeln(), has a longer name than the less frequently
needed write().

There were a variety of proposals following Guido's that attempted to  
address the issues above, most of which were posted to the wiki_.   
They generally all proposed a function something like::

     def print(*args):
         sys.stdout.write(' '.join(str(arg) for arg in args))
         sys.stdout.write('\n')

with support for a file= keyword parameter to specify a stream other  
than sys.stdout, and a sep= keyword parameter to specify a separator  
other than ' '.  There was some discussion about how the final  
newline could be suppressed, including a nl= keyword parameter and  
the usage of the Ellipsis object (e.g. so that ``print(foo, bar, ...)``
would not print the final newline).  There was also substantial
support for a formatting variant like::

     def printf(fmt, *args):
         print(fmt % args)

In the end, Guido seemed to be leaning towards supporting three
printing variants:

* print(...) would be much like the proposals above, calling str() on
each argument and then printing them with spaces in between and a
following newline
* printraw(...) or printbare(...) would also call str() on each
argument and print them, but with no intervening spaces and no final
newline
* printf(fmt, ...) would string-substitute the arguments into the
format string and then write the result

Each of these functions would also accept a keyword parameter for  
specifying a stream other than sys.stdout.  Because ``print`` is a  
keyword, ``from __future__ import printing`` would be required to use  
the new print() function.
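
The three variants can be sketched as plain functions. Several details 
here are assumptions: the first function is spelled print_ to avoid 
clashing with the existing print, the keyword is named file (the thread 
did not settle on a name), and _target is a helper invented for the 
example:

```python
import io
import sys


def _target(file):
    # Resolve sys.stdout at call time so later redirection is respected.
    return sys.stdout if file is None else file


def print_(*args, file=None):
    """str() each argument, join with spaces, end with a newline."""
    _target(file).write(" ".join(str(arg) for arg in args) + "\n")


def printraw(*args, file=None):
    """str() each argument, with no separators and no final newline."""
    _target(file).write("".join(str(arg) for arg in args))


def printf(fmt, *args, file=None):
    """Substitute the arguments into fmt and write the result."""
    _target(file).write(fmt % args)


buffer = io.StringIO()
print_("a", 1, file=buffer)
printraw("b", 2, file=buffer)
printf(" %s=%d\n", "c", 3, file=buffer)
```

Each function covers one use case with a minimal signature, in line with 
the "Swiss Army Knife (...Not)" principle described earlier in this 
summary.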

At this point, the thread trailed off, and no final decisions were made.

[SJB]

.. _wiki: http://wiki.python.org/moin/PrintAsFunction

Contributing threads:

- `Python 3 design principles <http://mail.python.org/pipermail/python-dev/2005-September/055944.html>`__
- `Replacement for print in Python 3.0 <http://mail.python.org/pipermail/python-dev/2005-September/055968.html>`__
- `New Wiki page - PrintAsFunction <http://mail.python.org/pipermail/python-dev/2005-September/056081.html>`__
- `Hacking print (was: Replacement for print in Python 3.0) <http://mail.python.org/pipermail/python-dev/2005-September/056125.html>`__
- `Pascaloid print substitute (Replacement for print in Python 3.0) <http://mail.python.org/pipermail/python-dev/2005-September/056171.html>`__

----------------------------------
Making C code easier in Python 3.0
----------------------------------

Nick Jacobson asked whether reference counting would be replaced in  
Python 3.0.  Guido pointed out that the (CPython) implementation  
would have to be completely changed, and that isn't planned; many  
people also pointed out that reference counting is an implementation  
detail, not part of the language specification, and that there are  
other options that can be explored (e.g. `PyPy`_, `Jython`_,  
`IronPython`_).

Arising from this question was a suggestion from Greg Ewing to build  
something akin to `Pyrex`_ (which takes care of reference count/ 
garbage collection issues automatically) into the standard Python  
distribution.  This suggestion was met with general enthusiasm; some  
general discussion about which cases were most appropriate for Pyrex  
use (e.g. extension modules, wrapping C libraries, modules  
implemented in C for performance reasons) also followed.

.. _PyPy: http://codespeak.net/pypy/
.. _Jython: http://www.jython.org/
.. _IronPython: http://www.ironpython.com/
.. _Pyrex: http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/

[TAM]

Contributing thread:

- `reference counting in Py3K <http://mail.python.org/pipermail/ 
python-dev/2005-September/056215.html>`__

--------------------------
Multiple views of a string
--------------------------

Tim Delaney suggested that str.partition could return 'views' of a  
string, rather than new string objects, as the substrings, to avoid  
the time needed to create strings that are potentially unused.   
Raymond Hettinger pointed out that the practical cost is unlikely to  
be significant, as the strings are likely to be empty, small, or  
used, and that all the pre-Python 2.5 methods would still be  
available for those times when they would be more appropriate.

However, using string 'views' (objects that reference the 'parent'  
string, rather than copying the data) caught the imagination of  
several Python-dev'ers.  Discussion ensued about how this object  
could work (Skip Montanaro threw together a sample implementation);  
towards the end it was pointed out that buffer() objects, with some  
additional string methods, could provide this slice-like instance  
with low memory requirements.  Guido also mentioned `NSString`_, the  
NextStep string type used by `ObjC`_, which is fairly similar.

.. _ObjC: http://en.wikipedia.org/wiki/Objc
.. _NSString: http://developer.apple.com/documentation/Cocoa/ 
Reference/Foundation/ObjC_classic/Classes/NSString.html

[TAM]

Contributing threads:

- `Proof of the pudding: str.partition() <http://mail.python.org/ 
pipermail/python-dev/2005-September/055924.html>`__
- `String views (was: Re: Proof of the pudding: str.partition())  
<http://mail.python.org/pipermail/python-dev/2005-September/ 
055933.html>`__
- `String views <http://mail.python.org/pipermail/python-dev/2005- 
September/055947.html>`__

-------------------------------
String-formatting in Python 3.0
-------------------------------

Currently, using ``%`` for string formatting has a number of  
inconvenient consequences:

* precedence issues: ``"a%sa" % "b"*4`` produces ``'abaabaabaaba'``,  
not ``'abbbba'``
* special-cased tuples: ``"%s" % x`` produces a string representation  
of x *unless* x is a tuple (in which case it unpacks it, raising a  
TypeError if ``len(x) != 1``).
* keyword formatting issues: a number of people have complained that  
``%(myvar)s`` is much more complicated than it needs to be (hence  
string.Template's ``$myvar``).
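
The first two issues can be reproduced in a few lines.  The helper names ``describe`` and ``describe_safely`` are made up for this illustration.

```python
# Precedence: % and * have equal precedence and group left to right,
# so the formatting happens before the repetition.
assert "a%sa" % "b" * 4 == "abaabaabaaba"
assert "a%sa" % ("b" * 4) == "abbbba"


def describe(x):
    """Works for most values, but a tuple is unpacked as the argument list."""
    return "%s" % x


def describe_safely(x):
    """Wrapping x in a one-element tuple formats any value as a single item."""
    return "%s" % (x,)
```

Here ``describe((1, 2))`` raises a TypeError, whereas ``describe_safely((1, 2))`` returns the string form of the tuple.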

To address the first two issues, Raymond Hettinger proposed that  
string formatting become a builtin function, and others proposed that  
formatting become a method of str/unicode objects.  Guido definitely  
agreed with the move from ``%`` to a callable, but it was unclear as  
to his preference for function or method.

Nick Coghlan tried to address the ``%(myvar)s`` issue by exploring a  
few extensions to string.Template formatting.  He produced a format()  
function where arguments could be specified either by position (e.g.  
$1, $2, etc.) or with keywords (e.g. $item, $quantity), and where the  
usual C-style format specifiers were still supported::

     format("$item: $[0.2f]quantity", quantity=0.5, item='Bees')
     format("$1: $[0.2f]2", 'Bees', 0.5)
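
To make the scheme concrete, here is a small sketch that handles the two calls above.  It is not Nick's implementation: the name ``dollar_format``, the regular expression, and the decision to pass the bracketed text straight to ``%`` are all assumptions made for this example.

```python
import re

# "$name", "$1", or either form preceded by a bracketed C-style specifier.
_FIELD = re.compile(r"\$(?:\[(?P<spec>[^\]]*)\])?(?P<name>\w+)")


def dollar_format(template, *args, **kwargs):
    """Fill $-fields from positional (1-based) or keyword arguments."""
    def substitute(match):
        name = match.group("name")
        if name.isdigit():
            value = args[int(name) - 1]   # $1 is the first positional argument
        else:
            value = kwargs[name]
        spec = match.group("spec")
        if spec:
            return ("%" + spec) % (value,)
        return str(value)
    return _FIELD.sub(substitute, template)
```

With this sketch, both ``dollar_format("$item: $[0.2f]quantity", quantity=0.5, item='Bees')`` and ``dollar_format("$1: $[0.2f]2", 'Bees', 0.5)`` give ``'Bees: 0.50'``.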

Nick also briefly explored format specifiers for expanding iterables,  
but Guido disliked the idea, explaining that adding or removing a  
print from a program should not drastically change the program's  
behavior (as it might if a print accidentally consumed an iterable  
that you weren't done with).

There was also some high-level discussion about internationalization  
concerns, where format strings need to be easy for translators to  
read and reorganize.  Since word orders may change, having either  
keyword parameters or positional parameters (as in Nick's scheme) is  
crucial.

Unfortunately, this discussion seemed to get lost in the massive  
`Converting print to a function in Python 3.0`_ discussion, and no  
decisions were made about either a formatting function or Nick's  
format specifier extensions.

[SJB]

Contributing threads:

- `Replacement for print in Python 3.0 <http://mail.python.org/ 
pipermail/python-dev/2005-September/055979.html>`__
- `string formatting options and removing basestring.__mod__ (WAS:  
Replacement for print in Python 3.0) <http://mail.python.org/ 
pipermail/python-dev/2005-September/056163.html>`__
- `string formatting and i18n <http://mail.python.org/pipermail/ 
python-dev/2005-September/056191.html>`__
- `string.Template format enhancements (Re: Replacement for print in  
Python 3.0) <http://mail.python.org/pipermail/python-dev/2005- 
September/056231.html>`__


---------------------------------------------------------
Deriving file-like object methods from read() and write()
---------------------------------------------------------

A variety of methods on the file object, including __iter__(), next(),
readline(), readlines() and writelines(), are all derivable from  
the read() and write() methods. At least twice this fortnight, the  
issue was raised about making it easier for file-like objects to add  
the derivable methods if they've defined read() and write().

One suggestion was to provide a FileMixin class (like the DictMixin  
of UserDict) that other types could inherit from. This has the  
problem that the creator of the file-like object must determine at  
the time that the class is defined that it should support the  
additional methods. It is also more difficult to use mixin classes in  
C code (because multiple inheritance requires dealing with the type's  
metaclass).
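
A minimal sketch of what such a mixin might look like is given below.  The class body is an assumption (no implementation was posted in the summary), it handles text only, and it spells the iterator method ``__next__`` as later Python versions do rather than ``next``.  ``StringSource`` is a toy class invented to exercise it.

```python
class FileMixin:
    """Derive line-oriented methods from read(size) and write(data)."""

    def readline(self):
        chars = []
        while True:
            ch = self.read(1)
            chars.append(ch)
            if not ch or ch == "\n":   # end of data or end of line
                break
        return "".join(chars)

    def __iter__(self):
        return self

    def __next__(self):
        line = self.readline()
        if not line:
            raise StopIteration
        return line

    def readlines(self):
        return list(self)

    def writelines(self, lines):
        for line in lines:
            self.write(line)


class StringSource(FileMixin):
    """Toy file-like object that only defines read() and write()."""

    def __init__(self, text=""):
        self._text = text
        self._pos = 0
        self.written = []

    def read(self, size):
        chunk = self._text[self._pos:self._pos + size]
        self._pos += len(chunk)
        return chunk

    def write(self, data):
        self.written.append(data)
```

Reading one character at a time is slow; the point is only that the derived methods need nothing beyond read() and write().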

Fredrik Lundh suggested that something along the lines of `PEP  
246`_'s object adaptation might be appropriate, but there was still  
some disagreement on the issue.

[SJB]

.. _PEP 246: http://www.python.org/peps/pep-0246.html

Contributing threads:

- `Mixin classes in the standard library <http://mail.python.org/ 
pipermail/python-dev/2005-September/056135.html>`__
- `Simplify the file-like-object interface (Replacement for print in  
Python 3.0) <http://mail.python.org/pipermail/python-dev/2005- 
September/056218.html>`__
- `Simplify the file-like-object interface <http://mail.python.org/ 
pipermail/python-dev/2005-September/056229.html>`__

------------------------------------
Making new-style classes the default
------------------------------------

Lisandro Dalcin proposed that something like::

     from __future__ import new_style_classes

be introduced to have newly defined classes implicitly derive from  
object. It was pointed out that this functionality is already  
available through the module-level statement::

     __metaclass__ = type

The argument was made that the __future__ version would be easier for  
non-experts to understand and to Google for, but Guido declared that  
the current syntax is fine -- there are much more important issues to  
be dealt with right now.

[SJB]

Contributing thread:

- `PEP 3000 and new style classes <http://mail.python.org/pipermail/ 
python-dev/2005-September/056305.html>`__

--------------------------------------------------
Using __future__ to have builtins return iterators
--------------------------------------------------

Lisandro Dalcin requested a __future__ import of some sort that would

* make range() and zip() return iterators
* remove xrange()
* make the dict.keys(), dict.values(), dict.items() etc. methods  
return iterators

Guido indicated that an alternate builtins module could be provided  
so that the first point could be covered with something like::

     from future_builtins import zip, range

However, there wasn't really a good way to change the dict methods.   
Simply importing a new dict object from "future_builtins" wouldn't  
solve the problem because using anyone's module that used the old  
dict object would mean a mix of the two types in your module.  And  
since __future__ imports are intended to affect only the module which  
includes them, changing the builtin dict object globally would be  
inappropriate (as it would let an import in one module break code in  
another).

[SJB]

Contributing thread:

- `PEP 3000 and iterators <http://mail.python.org/pipermail/python- 
dev/2005-September/056340.html>`__

-------------------------------------------------------------
Using compiled re methods vs. using module-level re functions
-------------------------------------------------------------

After Michael Chermside commented that users should be encouraged to  
use the methods on compiled re objects instead of the re functions  
available at the module level (and after Stephen J. Turnbull promised  
to look into supplying such a documentation patch), there was a brief  
discussion about how much of a difference using the compiled re  
objects really makes. As it turns out, in the CPython implementation,  
the module-level functions cache the first 100 patterns, so in many  
cases, the only additional cost of using the module-level functions  
is a dictionary lookup.
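
The two styles being compared look like this; the function names are invented for the example.  Both return the same result, and the difference lies only in where the compiled pattern comes from.

```python
import re

NUMBER = re.compile(r"\d+")   # compiled once, at import time


def numbers_compiled(text):
    """Use the method on the precompiled pattern object."""
    return NUMBER.findall(text)


def numbers_module_level(text):
    """Use the module-level function, which looks the pattern up in a cache."""
    return re.findall(r"\d+", text)
```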

[SJB]

Contributing thread:

- `Revising RE docs <http://mail.python.org/pipermail/python-dev/2005- 
September/055938.html>`__

--------------------------------------
urlparse and urls with too many '../'s
--------------------------------------

Fabien Schwob pointed out that urlparse.urljoin() doesn't strip out any  
extraneous '..' directories (e.g. http://example.com/../index.html).   
While Guido indicated that he found the current behaviour acceptable,  
Jeff Epler pointed out that `RFC 2396`_ states that invalid URIs like  
this may be handled by removing the ".." segments from the resolved  
path (although this is an implementation detail).  Armin Rigo  
indicated that, even if this is theoretically not a bug, a proposed  
patch with this motivation would be welcome.
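
As an illustration of the handling the RFC permits, the sketch below drops ".." segments that would climb above the root of an absolute URL.  The helper name ``drop_excess_dotdot`` is hypothetical, it is not the proposed patch, and it uses the ``urllib.parse`` spelling of the module found in later Python versions.  It assumes the URL path starts with "/".

```python
from urllib.parse import urlsplit, urlunsplit


def drop_excess_dotdot(url):
    """Resolve '..' path segments, discarding any that point above the root."""
    parts = urlsplit(url)
    kept = []
    for segment in parts.path.split("/"):
        if segment == "..":
            # kept[0] is the empty string before the leading "/"; keep it.
            if len(kept) > 1:
                kept.pop()
        else:
            kept.append(segment)
    return urlunsplit(parts._replace(path="/".join(kept)))
```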

.. _RFC 2396: http://www.faqs.org/rfcs/rfc2396.html

[TAM]

Contributing thread:

- `bug in urlparse <http://mail.python.org/pipermail/python-dev/2005- 
September/056144.html>`__

-----------------------------------------------
Using an iterator instead of a tuple for \*args
-----------------------------------------------

Nick Coghlan suggested that in Python 3.0, the \*args extended  
function call syntax should produce an iterator instead of a tuple as  
it currently does.  That would mean that code like::

     output(*some_long_iterator)

would not load the entire iterator into memory before processing  
it.  I pointed him to a previous discussion Raymond Hettinger and I  
had about the subject that indicated that for \*args, sequences were  
preferable to iterators in a number of situations.  Guido agreed,  
indicating that \*args will continue to be sequences in Python 3.0.
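
The current behaviour is easy to observe: whatever iterable is unpacked at the call site, the callee receives a fully built tuple.  The names below are made up for the example.

```python
def output(*args):
    """Report what the callee actually receives for *args."""
    return type(args), len(args)


def numbers():
    """A generator, i.e. an iterator with no length of its own."""
    for n in range(3):
        yield n
```

Calling ``output(*numbers())`` consumes the whole generator before ``output`` starts running, which is the memory cost Nick was hoping to avoid.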

[SJB]

Contributing thread:

- `iterators and extended function call syntax (WAS: Replacement for  
print in Python 3.0) <http://mail.python.org/pipermail/python-dev/ 
2005-September/056109.html>`__

----------------------------------------
Constructing traceback objects in Python
----------------------------------------

Contributing thread:

- `Asynchronous use of Traceback objects <http://mail.python.org/ 
pipermail/python-dev/2005-September/056091.html>`__

-------------------------------
No dedent() methods for strings
-------------------------------

Contributing thread:

- `str.dedent <http://mail.python.org/pipermail/python-dev/2005- 
September/056412.html>`__

------------------------
Arguments vs. parameters
------------------------

Contributing thread:

- `Term unification <http://mail.python.org/pipermail/python-dev/2005- 
September/056409.html>`__

-----------------------------------------------------------------------
Removing sequence support from the return value of stat() in Python 3.0
-----------------------------------------------------------------------

Terry J. Reedy proposed that, in Python 3.0, instead of os.stat()  
returning a sequence (where the order of the items is only of  
historical significance), a proper stat object be returned.  This was  
met with general support, and so seems likely to occur.  Skip  
Montanaro also proposed that the st_ prefixes in the attribute names  
be removed, since there are no namespace issues to be concerned with,  
which met with some approval, but concern from Guido that the forms  
with the prefixes would be more familiar to users, and make Googling  
or grepping simpler.
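
For reference, the two access styles under discussion are shown below; the function name is invented for the example.  The index constants come from the ``stat`` module, and the attribute form is the one that would remain if sequence support were dropped.

```python
import os
import stat


def size_both_ways(path):
    """Return the file size via attribute access and via sequence indexing."""
    info = os.stat(path)
    by_attribute = info.st_size
    by_index = info[stat.ST_SIZE]   # positional access, order is historical
    return by_attribute, by_index
```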

[TAM]

Contributing thread:

- `stat() return value (was: Re: Proof of the pudding: str.partition 
()) <http://mail.python.org/pipermail/python-dev/2005-September/ 
055931.html>`__

--------------------------------------------------
Making code in the Tools directory more accessible
--------------------------------------------------

Installation of Python typically doesn't include the Tools directory;  
combined with the lack of mention of these scripts in the  
documentation, this means that knowledge of these generally useful  
scripts is fairly limited.  Tim Peters noted that historically a  
Tools directory was only added to the Windows installer if it was  
specifically requested; as such, the audiopy, bgen, compiler, faqwiz,  
framer, modulator, msi, unicode, and world Tools directories are not  
currently included in the Windows installer.  Nick Coghlan added that  
Tools/README.txt isn't included in the Windows installer, so Windows  
users don't get a synopsis of the tools that are included; he also  
suggested that adding this readme to the "undocumented modules"  
section of the standard library would be a simple improvement.
Non-Windows users typically don't get the Tools directory at all with an  
install.

Remaining questions included how the directory should be documented  
(e.g. man pages for the scripts, a documentation page for them),  
where to install them on non-Windows installations (e.g. /usr/share/ 
python, /usr/lib/pythonX.Y/Tools), and whether the Windows installer  
should include all of the directories.

[TAM]

Contributing thread:

- `Tools directory (Was RE: Replacement for print in Python 3.0)  
<http://mail.python.org/pipermail/python-dev/2005-September/ 
056318.html>`__

----------------------------------
Responsiveness of IDLE development
----------------------------------

Noam Raphael posted a request for help getting a large patch to IDLE  
committed to CVS.  He was concerned that there hasn't been any IDLE  
development recently, and that patches are not being considered.  He  
indicated that his group was considering offering a fork of IDLE with  
the improvements, but that they would much prefer integrating the  
improvements into the core distribution.

It was pointed out that a fork might be the best solution, for  
various reasons (e.g. the improvements may not be of general  
interest, the release time would be much quicker), and that this was  
how the current version of IDLE was developed.  The discussion died  
out, so it seems likely that a fork will be the resulting solution.

[TAM]

Contributing thread:

- `IDLE development <http://mail.python.org/pipermail/python-dev/2005- 
September/056357.html>`__

-----------------------------
Speeding up list append calls
-----------------------------

A `comp.lang.python message from Tim Peters`_ prompted Neal Norwitz  
to investigate how the code that Tim posted could be sped up.  He  
hacked the code to replace var.append() with the LIST_APPEND opcode,  
and achieved a roughly 200% speed increase.  Although this doesn't  
work in general, Neal wondered if it could be used as a peephole  
optimization when a variable is known to be a list.  Martin v. Löwis  
suggested that the code could simply check whether it was a list;  
Phillip J. Eby and Fredrik Lundh pointed out that this is similar to  
what various math operators do (e.g. speeding up int + int calls).

.. _comp.lang.python message from Tim Peters: http:// 
groups.google.com/group/comp.lang.python/msg/9075a3bc59c334c9

[TAM]

Contributing thread:

- `speeding up list append calls <http://mail.python.org/pipermail/ 
python-dev/2005-September/056396.html>`__

------------------------------------
Allowing str.strip to remove "words"
------------------------------------

Jonny Reichwald proposed an enhancement to str.strip(): in addition  
to its current form, where it takes a string of characters to strip,  
it would accept any iterable containing either character lists or string  
lists, so that it is possible to remove entire words from the  
stripped string.  For example::

    #A char list gives the same result as the standard strip
    >>> my_strip("abcdeed", "de")
    'abc'

    #A list of strings instead
    >>> my_strip("abcdeed", ("ed",))
    'abcde'

    #The char order in the strings to be stripped is of importance
    >>> my_strip("abcdeed", ("ad", "eb"))
    'abcdeed'

Raymond Hettinger queried whether there was actual demand for such a  
change, and whether such demand was sufficient to justify the added  
complexity; Josiah Carlson also pointed out that implementing this  
only requires a four-line function.  Judging from the lack of  
responses, it seems likely that there isn't enough demand.
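
For the curious, one way to get the behaviour shown in the examples above is sketched here.  This is not Josiah's function (which was not reproduced in the summary) and it is longer than four lines; it simply removes any of the given strings from either end until none of them matches.

```python
def my_strip(text, words):
    """Strip any of the given strings from both ends of text, repeatedly."""
    words = [word for word in words if word]   # ignore empty strings
    changed = True
    while changed:
        changed = False
        for word in words:
            if text.startswith(word):
                text = text[len(word):]
                changed = True
            if text.endswith(word):
                text = text[:-len(word)]
                changed = True
    return text
```

Because a plain string is an iterable of one-character strings, passing ``"de"`` behaves like the standard strip.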

Contributing thread:

- `str.strip() enhancement <http://mail.python.org/pipermail/python- 
dev/2005-September/056119.html>`__

================
Deferred Threads
================

- `C coding experiment <http://mail.python.org/pipermail/python-dev/ 
2005-September/055965.html>`__
- `os.path.diff(path1, path2) <http://mail.python.org/pipermail/ 
python-dev/2005-September/056391.html>`__


===============
Skipped Threads
===============

- `import exceptions <http://mail.python.org/pipermail/python-dev/ 
2005-September/055926.html>`__
- `[Python-checkins] python/dist/src/Lib/test test_re.py, 1.45.6.3,  
1.45.6.4 <http://mail.python.org/pipermail/python-dev/2005-September/ 
055927.html>`__
- `setdefault's second argument <http://mail.python.org/pipermail/ 
python-dev/2005-September/055929.html>`__
- `Alternative imports (Re: Python 3 design principles) <http:// 
mail.python.org/pipermail/python-dev/2005-September/055945.html>`__
- `python/dist/src/Lib/test test_re.py, 1.45.6.3, 1.45.6.4 <http:// 
mail.python.org/pipermail/python-dev/2005-September/055963.html>`__
- `Status of PEP 328 <http://mail.python.org/pipermail/python-dev/ 
2005-September/055969.html>`__
- `Weekly Python Patch/Bug Summary <http://mail.python.org/pipermail/ 
python-dev/2005-September/055978.html>`__
- `itertools.chain should take an iterable ? <http://mail.python.org/ 
pipermail/python-dev/2005-September/055981.html>`__
- `partition() (was: Remove str.find in 3.0?) <http://mail.python.org/ 
pipermail/python-dev/2005-September/055983.html>`__
- `gdbinit problem <http://mail.python.org/pipermail/python-dev/2005- 
September/056178.html>`__
- `Exception Reorg PEP checked in <http://mail.python.org/pipermail/ 
python-dev/2005-September/056296.html>`__
- `international python <http://mail.python.org/pipermail/python-dev/ 
2005-September/056326.html>`__
- `SIGPIPE => SIG_IGN? <http://mail.python.org/pipermail/python- 
dev/2005-September/056341.html>`__
- `[draft] python-dev Summary for 2005-08-16 through 2005-08-31  
<http://mail.python.org/pipermail/python-dev/2005-September/ 
056348.html>`__
- `[Python-checkins] python/dist/src/Lib urllib.py, 1.169, 1.170  
<http://mail.python.org/pipermail/python-dev/2005-September/ 
056349.html>`__
- `Wanting to learn <http://mail.python.org/pipermail/python-dev/2005- 
September/056350.html>`__
- `Python code.interact() and UTF-8 locale <http://mail.python.org/ 
pipermail/python-dev/2005-September/056361.html>`__
- `pygettext() without newlines (Was: Re: Replacement for print in  
Python 3.0) <http://mail.python.org/pipermail/python-dev/2005- 
September/056368.html>`__
- `Python 3 executable name (was: Re: PEP 3000 and iterators) <http:// 
mail.python.org/pipermail/python-dev/2005-September/056369.html>`__
- `Python 3 executable name <http://mail.python.org/pipermail/python- 
dev/2005-September/056371.html>`__
- `Skiping searching throw dictionaries of mro() members. <http:// 
mail.python.org/pipermail/python-dev/2005-September/056403.html>`__
- `Fwd: [Python-checkins] python/dist/src/Misc NEWS, 1.1193.2.94,  
1.1193.2.95 <http://mail.python.org/pipermail/python-dev/2005- 
September/056405.html>`__
- `[Python-checkins] python/dist/src/Lib/test regrtest.py, 1.171,  
1.172 test_ioctl.py, 1.2, 1.3 <http://mail.python.org/pipermail/ 
python-dev/2005-September/056406.html>`__
- `python/dist/src/Lib urllib.py, 1.165.2.1, 1.165.2.2 <http:// 
mail.python.org/pipermail/python-dev/2005-September/056419.html>`__
- `Variant of removing GIL. <http://mail.python.org/pipermail/python- 
dev/2005-September/056423.html>`__
- `Compatibility between Python 2.3.x and Python 2.4.x <http:// 
mail.python.org/pipermail/python-dev/2005-September/056431.html>`__
- `Example for "property" violates "Python is not a one pass  
compiler" <http://mail.python.org/pipermail/python-dev/2005-September/ 
056190.html>`__
- `python optimization <http://mail.python.org/pipermail/python-dev/ 
2005-September/056425.html>`__


From guido at python.org  Sat Oct 22 03:39:47 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Oct 2005 18:39:47 -0700
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-09-01 through
	2005-09-16
In-Reply-To: <3A5D2942-E0C3-4768-94D5-E43B914BB80B@ihug.co.nz>
References: <3A5D2942-E0C3-4768-94D5-E43B914BB80B@ihug.co.nz>
Message-ID: <ca471dc20510211839r569db387x9259e5204a21debd@mail.gmail.com>

On 10/21/05, Tony Meyer <t-meyer at ihug.co.nz> wrote:
> This is over a month late, sorry, but here it is (Steve did his
> threads ages ago; I've fallen really behind).

Better late than never! These summaries are awesome.

Just one nit:
> ----------------------------------
> Responsiveness of IDLE development
> ----------------------------------
>
> Noam Raphael posted a request for help getting a large patch to IDLE
> committed to CVS.  He was concerned that there hasn't been any IDLE
> development recently, and that patches are not being considered.  He
> indicated that his group was considering offering a fork of IDLE with
> the improvements, but that they would much prefer integrating the
> improvements into the core distribution.
>
> It was pointed out that a fork might be the best solution, for
> various reasons (e.g. the improvements may not be of general
> interest, the release time would be much quicker), and that this was
> how the current version of IDLE was developed.  The discussion died
> out, so it seems likely that a fork will be the resulting solution.

Later, it turned out that Kurt Kaiser had missed this message on
python-dev (which he only reads occasionally); he redirected the
thread to idle-dev where it seems that his issues with the
contribution are being resolved and a fork is averted. Whew!

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tim.peters at gmail.com  Sat Oct 22 04:52:36 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 21 Oct 2005 22:52:36 -0400
Subject: [Python-Dev] int(string) (was: DRAFT: python-dev Summary for
	2005-09-01 through 2005-09-16)
Message-ID: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>

...
> -----------------------------
> Speeding up list append calls
> -----------------------------
>
> A `comp.lang.python message from Tim Peters`_ prompted Neal Norwitz
> to investigate how the code that Tim posted could be sped up.  He
> hacked the code to replace var.append() with the LIST_APPEND opcode,
> ....

Someone want a finite project that would _really_ help their Uncle
Timmy in his slow-motion crusade to get Python on the list of "solved
it!" languages for each problem on that magnificent site?

    http://spoj.sphere.pl

It turns out that many of the problems there have input encoded as
vast quantities of integers (stdin is a mass of decimal integers on
one or more lines).  Most infamous for Python is this tutorial (you
don't get points for solving it) problem, which is _trying_ to test
whether your language of choice can read from stdin "fast enough":

    http://spoj.sphere.pl/problems/INTEST/

"""
The input begins with two positive integers n k (n, k<=10**7). The
next n lines of input contain one positive integer t_i, not greater
than 10**9, each.

Output
Write a single integer to output, denoting how many integers t_i are
divisable by k.

Example

Input:
7 3
1
51
966369
7
9
999996
11

Output:
4
"""

There's an 8-second time limit, and I believe stdin is about 8MB
(you're never allowed to see the actual input they use).  They have a
slower machine than you use ;-), so it's harder than it sounds.  To
date, 975 people have submitted a program that passed, but only a few
managed to do it using Python.  I did, and it required every trick in
the book, including using psyco.
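
For reference, the obvious solution looks something like the sketch below (the function name is made up, and this is the plain version, not the tuned one that passed).  It spends nearly all of its time in the int() calls.

```python
def count_divisible(data):
    """Count how many of the n integers after the header are divisible by k."""
    tokens = data.split()
    n, k = int(tokens[0]), int(tokens[1])
    return sum(1 for token in tokens[2:2 + n] if int(token) % k == 0)
```

A submission would call it as ``print(count_divisible(sys.stdin.read()))``.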

Turns out it's _not_ input speed that's the problem here, and not even
mainly the speed of integer mod:  the bulk of the time is spent in
int(string) (and, yes, that's also far more important to the problem
Neal was looking at than list.append time). If you can even track all
the levels of C function calls that ends up invoking <wink>, you find
yourself in PyOS_strtoul(), which is a nifty all-purpose routine that
accepts inputs in bases 2 thru 36, can auto-detect base, and does
platform-independent overflow checking at the cost of a division per
digit.  All those are features, but it makes for sloooow conversion.

I assume it's the overflow-checking that's the major time sink, and
it's not correct anyway:  it does the check slightly differently for
base 10 than for any other base, explained only in the checkin comment
for rev 2.13, 8 years ago:

    For base 10, cast unsigned long to long before testing overflow.
    This prevents 4294967296 from being an acceptable way to spell zero!

So what are the odds that base 10 was the _only_ base that had a "bad
input" case for the overflow-check method used?  If you thought
"slim", you were right ;-)  Here are other bad cases, under all Python
versions to date (on a 32-bit box; if sizeof(long) == 8, there are
different bad cases):

int('102002022201221111211', 3) = 0
int('32244002423141', 5) = 0
int('1550104015504', 6) = 0
int('211301422354', 7) = 0
int('12068657454', 9) = 0
int('1904440554', 11) = 0
int('9ba461594', 12) = 0
int('535a79889', 13) = 0
int('2ca5b7464', 14) = 0
int('1a20dcd81', 15) = 0
int('a7ffda91', 17) = 0
int('704he7g4', 18) = 0
int('4f5aff66', 19) = 0
int('3723ai4g', 20) = 0
int('281d55i4', 21) = 0
int('1fj8b184', 22) = 0
int('1606k7ic', 23) = 0
int('mb994ag', 24) = 0
int('hek2mgl', 25) = 0
int('dnchbnm', 26) = 0
int('b28jpdm', 27) = 0
int('8pfgih4', 28) = 0
int('76beigg', 29) = 0
int('5qmcpqg', 30) = 0
int('4q0jto4', 31) = 0
int('3aokq94', 33) = 0
int('2qhxjli', 34) = 0
int('2br45qb', 35) = 0
int('1z141z4', 36) = 0

IOW, the only bases that _aren't_ "bad" are powers of 2, and 10
because it's special-cased (BTW, I'm not sure that base 10 doesn't
have a different bad case now, but don't care enough to prove it one
way or the other).

Now fixing that is easy:  the problem comes from being too clever,
doing both a multiply and an addition before checking for overflow. 
Check each operation on its own and it would be bulletproof, without
special-casing.  But that might be even slower (it would remove the
branch special-casing 10, but add a cheap integer addition overflow
check with its own branch).
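
To show the difference, here is a Python simulation of the two strategies on a pretend 32-bit unsigned long.  It is a sketch of the kind of check described above, not a transcription of the C source, and the function names are invented.

```python
MASK = 2**32 - 1  # pretend unsigned long is 32 bits wide


def parse_combined_check(text, base):
    """Multiply and add first, then look for overflow by undoing both."""
    result = 0
    for ch in text:
        digit = int(ch, 36)  # value of one digit character, 0..35
        previous = result
        result = (previous * base + digit) & MASK  # wraps like C unsigned
        if ((result - digit) & MASK) // base != previous:
            raise OverflowError(text)
    return result


def parse_separate_checks(text, base):
    """Check the multiplication and the addition on their own."""
    limit = MASK // base  # computed once, not once per digit
    result = 0
    for ch in text:
        digit = int(ch, 36)
        if result > limit:
            raise OverflowError(text)  # result * base would not fit
        result *= base
        if result > MASK - digit:
            raise OverflowError(text)  # result + digit would not fit
        result += digit
    return result
```

The base-3 string from the list is 2**32 written in base 3.  In the first function the multiply lands exactly on 2**32-1, the add wraps to 0, and undoing both steps gives back the previous value, so nothing is flagged.  The second function catches it at the addition.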

The challenge (should you decide to accept it <wink>) is to replace
the overflow-checking with something both correct _and_ much faster
than doing n integer divisions for an n-character input.  For example,
36**6 < 2**32-1, so whenever the input has no more than 6 digits
overflow is impossible regardless of base and regardless of platform. 
That's simple and exploitable.  For extra credit, make int(string) go
faster than preparing your taxes ;-)
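
The same reasoning gives a per-base digit count below which no check is needed at all.  A sketch, with a made-up function name, assuming a 32-bit unsigned long unless told otherwise:

```python
def safe_digit_count(base, max_value=2**32 - 1):
    """Largest n such that every n-digit number in this base fits in max_value."""
    count = 0
    # The biggest (count + 1)-digit number is base**(count + 1) - 1.
    while base ** (count + 1) - 1 <= max_value:
        count += 1
    return count
```

For base 36 this gives 6, matching the figure above; bases 10, 16 and 2 get 9, 8 and 32 digits respectively.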


BTW, Python as-is can be used to solve many (I'd bet most) of these
problems in the time limit imposed, although it may take some effort,
and it may not be possible without using psyco.  A Python triumph I'm
particularly fond of:

     http://spoj.sphere.pl/problems/FAMILY/

The legend at the bottom:

    Warning: large Input/Output data, be careful with certain languages

seems to be a euphemism for "don't even think about using Python" <0.9 wink>.

But there's a big difference in this one:  it's a _hard_ problem,
requiring graph analysis, delicate computation, greater than
double-precision precision (in the end), and can hugely benefit from
preprocessing a batch of queries to plan out and minimize the number
of operations needed.  Five people have solved it to date (click on
"Best Solutions"), and you'll see that my Python entry is the
second-fastest so far, beating 3 C++ entries by 3 excellent C++
programmers.  I don't know what they did, but I suspect I was far more
willing to code up an effective but tedious "plan out and minimize"
phase _because_ I was using Python.  I sure didn't beat them on
reading the mass quantities of integers from stdin <wink>.

From hyeshik at gmail.com  Sat Oct 22 07:55:05 2005
From: hyeshik at gmail.com (Hye-Shik Chang)
Date: Sat, 22 Oct 2005 14:55:05 +0900
Subject: [Python-Dev] LXR site for Python CVS
Message-ID: <4f0b69dc0510212255t61185aa5x4e4b8e253e0c2573@mail.gmail.com>

Hi,

I just set up a LXR instance for Python CVS for my personal use:
  http://pxr.openlook.org/pxr/

If you find it useful, feel free to use the site. :) The source files will
be updated twice a day.

Hye-Shik

From mwh at python.net  Sat Oct 22 10:38:46 2005
From: mwh at python.net (Michael Hudson)
Date: Sat, 22 Oct 2005 09:38:46 +0100
Subject: [Python-Dev] int(string)
In-Reply-To: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com> (Tim
	Peters's message of "Fri, 21 Oct 2005 22:52:36 -0400")
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>
Message-ID: <2mfyqu6j55.fsf@starship.python.net>

Tim Peters <tim.peters at gmail.com> writes:

> Turns out it's _not_ input speed that's the problem here, and not even
> mainly the speed of integer mod:  the bulk of the time is spent in
> int(string) (and, yes, that's also far more important to the problem
> Neal was looking at than list.append time). If you can even track all
> the levels of C function calls that ends up invoking <wink>, you find
> yourself in PyOS_strtoul(), which is a nifty all-purpose routine that
> accepts inputs in bases 2 thru 36, can auto-detect base, and does
> platform-independent overflow checking at the cost of a division per
> digit.  All those are features, but it makes for sloooow conversion.

> I assume it's the overflow-checking that's the major time sink, and
> it's not correct anyway:  it does the check slightly differently for
> base 10 than for any other base, explained only in the checkin comment
> for rev 2.13, 8 years ago:
>
>     For base 10, cast unsigned long to long before testing overflow.
>     This prevents 4294967296 from being an acceptable way to spell zero!
>
> So what are the odds that base 10 was the _only_ base that had a "bad
> input" case for the overflow-check method used?  If you thought
> "slim", you were right ;-)  Here are other bad cases, under all Python
> versions to date (on a 32-bit box; if sizeof(long) == 8, there are
> different bad cases):
>
> int('102002022201221111211', 3) = 0
[...]

Eek!

> Now fixing that is easy:  the problem comes from being too clever,

Surprise!

> doing both a multiply and an addition before checking for overflow. 
> Check each operation on its own and it would be bulletproof, without
> special-casing.  But that might be even slower (it would remove the
> branch special-casing 10, but add a cheap integer addition overflow
> check with its own branch).
>
> The challenge (should you decide to accept it <wink>) is to replace
> the overflow-checking with something both correct _and_ much faster
> than doing n integer divisions for an n-character input.  For example,
> 36**6 < 2**32-1, so whenever the input has no more than 6 digits
> overflow is impossible regardless of base and regardless of platform. 
> That's simple and exploitable.  For extra credit, make int(string) go
> faster than preparing your taxes ;-)

So, you're suggesting dividing the input up into known non-overflowing
chunks and using the normal Python operations to combine those chunks,
relying on them overflowing to longs as needed?  All of the examples
you posted should have returned longs anyway, right?

I guess the change to automatically overflowing to longs has led to
some code that shows its history more than one would like.
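
To make the chunking idea concrete, here is a minimal sketch in Python
rather than C. The name chunked_int and the fixed chunk size of 6 are
assumptions for illustration only (6 comes from the 36**6 < 2**32-1
observation quoted above); signs, whitespace and base auto-detection are
deliberately ignored.

```python
def chunked_int(s, base=10, chunk=6):
    """Convert a digit string by combining pieces of at most `chunk` digits.

    Each piece is short enough that converting it cannot overflow a
    32-bit unsigned value in any base up to 36; the pieces are then
    combined with ordinary Python integer arithmetic, which never
    overflows.
    """
    result = 0
    for start in range(0, len(s), chunk):
        piece = s[start:start + chunk]
        # Shift the running total left by len(piece) digits, then add the piece.
        result = result * base ** len(piece) + int(piece, base)
    return result


print(chunked_int('102002022201221111211', 3) == 2**32)
```

Whether this is a speed win depends entirely on how cheaply the pieces
can be combined, which the sketch says nothing about.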

Cheers,
mwh

-- 
  I think if we have the choice, I'd rather we didn't explicitly put
  flaws in the reST syntax for the sole purpose of not insulting the
  almighty.                                    -- /will on the doc-sig

From ncoghlan at gmail.com  Sat Oct 22 10:54:13 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Oct 2005 18:54:13 +1000
Subject: [Python-Dev] DRAFT: python-dev Summary for 2005-09-01 through
 2005-09-16
In-Reply-To: <ca471dc20510211839r569db387x9259e5204a21debd@mail.gmail.com>
References: <3A5D2942-E0C3-4768-94D5-E43B914BB80B@ihug.co.nz>
	<ca471dc20510211839r569db387x9259e5204a21debd@mail.gmail.com>
Message-ID: <4359FE35.9010405@gmail.com>

Guido van Rossum wrote:
> On 10/21/05, Tony Meyer <t-meyer at ihug.co.nz> wrote:
>> This is over a month late, sorry, but here it is (Steve did his
>> threads ages ago; I've fallen really behind).
> 
> Better late than never! These summaries are awesome.

I certainly find them to be a very useful reminder of list threads that got 
overwhelmed by other discussions.

I'm still trying to close out the naming issues for PEP 343, but I hope to get 
back to the "Template.format" method idea eventually (along with an idea 
inspired by the discussion of the module level functions in the 're' module - 
how about providing similar module level functions in the string module that 
correspond to the methods of Template objects?).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From skip at pobox.com  Sat Oct 22 13:48:17 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sat, 22 Oct 2005 06:48:17 -0500
Subject: [Python-Dev] Comparing date+time w/ just time
Message-ID: <17242.9985.574217.23379@montanaro.dyndns.org>

With significant input from Fred I made some changes to xmlrpclib a couple
months ago to better integrate datetime objects into xmlrpclib.  That raised
some problems because I neglected to add support for comparing datetime
objects with xmlrpclib.DateTime objects.  (The problem showed up in
MoinMoin.)  I've been working on that recently (adding rich comparison
methods to DateTime while retaining __cmp__ for backward compatibility), and
have second thoughts about one of the original changes.

I tried to support datetime, date and time objects.  My problems are with
support for time objects.  Marshalling datetimes as xmlrpclib.DateTime
objects is no problem (though you lose fractions of a second).  Marshalling
dates is reasonable if you treat the time as 00:00:00.  I decided to marshal
datetime.time objects by fixing the day portion of the xmlrpclib.DateTime
object as today's date.  That's the suspect part.

When I went back recently to add better comparison support, I decided to
compare xmlrpclib.DateTime objects with time objects by simply comparing the
HH:MM:SS part of the DateTime with the time object.  That's making me a bit
queazy now.  datetime.time(hour=23) would compare equal to any DateTime with
its time equal to 11PM.  Under the rule, "in the face of ambiguity, refuse
the temptation to guess", I'm inclined to dump support for marshalling and
comparison of time objects altogether.  Do others agree that was a bad idea?
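
A small sketch of the ambiguity, using only the datetime module. The
helper hms_equal is hypothetical and merely stands in for the HH:MM:SS
rule described above; it is not the xmlrpclib code.

```python
import datetime


def hms_equal(stamp, t):
    """Hypothetical rule: compare only the HH:MM:SS part of a datetime."""
    return (stamp.hour, stamp.minute, stamp.second) == (t.hour, t.minute, t.second)


eleven_pm = datetime.time(hour=23)
friday = datetime.datetime(2005, 10, 21, 23, 0, 0)
saturday = datetime.datetime(2005, 10, 22, 23, 0, 0)

# Two different instants both "equal" the same time object under this rule.
both_match = hms_equal(friday, eleven_pm) and hms_equal(saturday, eleven_pm)
distinct = friday != saturday
print(both_match, distinct)
```

Under that rule equality stops being transitive in any useful sense,
which is the guess the Zen line warns about.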

Thx,

Skip

From guido at python.org  Sat Oct 22 15:58:16 2005
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Oct 2005 06:58:16 -0700
Subject: [Python-Dev] Comparing date+time w/ just time
In-Reply-To: <17242.9985.574217.23379@montanaro.dyndns.org>
References: <17242.9985.574217.23379@montanaro.dyndns.org>
Message-ID: <ca471dc20510220658k611580b3p2a514309215a471e@mail.gmail.com>

On 10/22/05, skip at pobox.com <skip at pobox.com> wrote:
> With significant input from Fred I made some changes to xmlrpclib a couple
> months ago to better integrate datetime objects into xmlrpclib.  That raised
> some problems because I neglected to add support for comparing datetime
> objects with xmlrpclib.DateTime objects.  (The problem showed up in
> MoinMoin.)  I've been working on that recently (adding rich comparison
> methods to DateTime while retaining __cmp__ for backward compatibility), and
> have second thoughts about one of the original changes.
>
> I tried to support datetime, date and time objects.  My problems are with
> support for time objects.  Marshalling datetimes as xmlrpclib.DateTime
> objects is no problem (though you lose fractions of a second).  Marshalling
> dates is reasonable if you treat the time as 00:00:00.  I decided to marshal
> datetime.time objects by fixing the day portion of the xmlrpclib.DateTime
> object as today's date.  That's the suspect part.
>
> When I went back recently to add better comparison support, I decided to
> compare xmlrpclib.DateTime objects with time objects by simply comparing the
> HH:MM:SS part of the DateTime with the time object.  That's making me a bit
> queazy now.  datetime.time(hour=23) would compare equal to any DateTime with
> its time equal to 11PM.  Under the rule, "in the face of ambiguity, refuse
> the temptation to guess", I'm inclined to dump support for marshalling and
> comparison of time objects altogether.  Do others agree that was a bad idea?

Agreed.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at iinet.net.au  Sat Oct 22 15:58:48 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Sat, 22 Oct 2005 23:58:48 +1000
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
Message-ID: <435A4598.3060403@iinet.net.au>

I'm still looking for more feedback on the issues raised in the last update of 
PEP 343. There hasn't been much direct feedback so far, but I've rephrased and 
suggested resolutions for the outstanding issues based on what feedback I have 
received, and my own thoughts over the last week or so.

For those simply skimming, my proposed issue resolutions are:

   1. Use the slot name "__context__" instead of "__with__"
   2. Reserve the builtin name "context" for future use as described below
   3a. Give generator-iterators a native context that invokes self.close()
   3b. Use "contextmanager" as a builtin decorator to get generator-contexts
   4. Special case the __context__ slot to avoid the need to decorate it

For those that disagree with the proposed resolutions above, or simply would 
like more details, here's the reasoning:

1. Should the slot be named "__with__" or "__context__"?

     Guido raised this as a side comment during the discussion of
   PJE's task variables pre-PEP, and it's a fair question.
     The closest analogous slot method ("__iter__") is named after the protocol
   it relates to, rather than the associated statement/expression keyword (that
   is, the method isn't called "__for__").
     The next closest analogous slot is one that doesn't actually exist yet -
   the proposed "boolean" protocol. This again uses the name of a protocol
   rather than the associated keyword (that is, the name "__bool__" was
   suggested rather than "__if__").
     At the moment, PEP 343 makes the opposite choice - it uses the keyword,
   rather than the protocol name (that is, it uses "__with__" instead of using
   "__context__").
     That inconsistency would be a bad thing, in my opinion, and I propose that
   the slot should instead be named "__context__".

2. If the slot is called "__context__" what should a "context" builtin do?

     Again, considering existing slot names, a slot with a given name is
   generally invoked by the builtin type or function with the same name.
     This is true of the builtin types, and also true of iter, cmp and pow.
   getattr, setattr and delattr get in on the act as well.
     So, to be consistent, one would expect a "context" builtin to be able to
   be used such that "context(x)" invoked "x.__context__()".
     Such a method might special-case certain types, or have a two-argument
   form that accepted an "enter" function and an "exit" function, but using it
   to mark a generator that is to be used as a context manager (as currently
   suggested in PEP 343) would not be appropriate.
     I don't mind either way whether or not a "context" builtin is actually
   included for Python 2.5. However, even if it isn't included, the name should
   be reserved for that purpose (that is, we shouldn't be using it to name a
   generator decorator or a module).

3. How should generators behave when used as contexts?

     With PEP 342 accepted, generators pose a problem, because they have two
   possible uses as contexts. The first is for a generator that is intended to
   be used as an actual iterator. This case is a case of resource management -
   ensuring the close method is invoked on the generator-iterator when done
   with it (i.e., similar to the proposed native context for files).
     PEP 343 proposes a second use case for generators - to write custom
   context managers. In this case, the __enter__ method steps the generator
   once, and the __exit__ method finishes the job.
     I propose that we give generator-iterator objects resource management
   behaviour by default (i.e., __context__ and __enter__ methods that just
   "return self", and an __exit__ method that invokes "self.close()").
     The "contextmanager" builtin decorator from previous drafts of the PEP
   (called simply "context" in the current draft) can then be used to get the
   custom context manager behaviour.
     I previously thought giving generators a native context caused problems
   with getting silent failures when the "contextmanager" decorator was
   inadvertently omitted. This is still technically true - the "with" statement
   itself won't raise a TypeError because the generator is a legal context.
     However, with this bug, the context manager won't be getting entered *at
   all* (it gets closed without its next() method ever being called). Even the
   most cursory testing of the generator-context function should be able to
   tell whether the generator-context is being entered or not.
     The main alternative (having yet-another-decorator to give generators
   "auto-close" functionality) would be possible, but the additional builtin
   clutter would be getting to the point where it started to concern me. And
   given that "yield" inside "try/finally" is now always legal, I consider it
   reasonable that using a generator in a "with" statement would also always be
   legal.
     Further, if type_new special cases __context__ as suggested below, then
   the context behaviour of generators used to define "__iter__" and
   "__context__" slots will always be appropriate.

4. Should the __context__ slot be special-cased in type_new?

     Currently, type_new special cases the "__new__" slot and automatically
   applies the staticmethod decorator when it finds a function occupying that
   slot in the class attribute dictionary.
     I propose that type_new also special case the situation where the
   "__context__" slot is occupied by a generator function, and automatically
   apply the "contextmanager" decorator.
     This looks much nicer when using a generator to write a __context__
   function, and also avoids the situation where the decorator is omitted, and
   the object becomes legal to use directly in with statements but doesn't
   actually do the right thing.
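
For readers who want to experiment with the two generator uses from
point 3, here is a rough sketch using the contextlib module from
present-day Python. It is an approximation under stated assumptions:
contextlib.closing stands in for the proposed native generator context
(one that just calls self.close()), and no "__context__" slot is
involved at all. The function names are made up for the example.

```python
from contextlib import closing, contextmanager

events = []


def countdown():
    """A plain generator-iterator with cleanup in a finally block."""
    try:
        yield 2
        yield 1
    finally:
        events.append("countdown closed")


# Use 1, resource management: close() is called on the way out.
with closing(countdown()) as gen:
    first = next(gen)


@contextmanager
def tagged(name):
    """Use 2, a custom context manager written as a generator."""
    events.append("enter " + name)
    try:
        yield name
    finally:
        events.append("exit " + name)


with tagged("x") as value:
    events.append("body " + value)


def undecorated(name):
    """Same body as tagged(), but with the decorator left off."""
    events.append("enter " + name)
    try:
        yield name
    finally:
        events.append("exit " + name)


# The failure mode from point 3: the generator is closed without ever
# being stepped, so neither "enter y" nor "exit y" is recorded.
before = len(events)
with closing(undecorated("y")):
    events.append("body y")
silent = events[before:]
print(events)
```

The last block shows why even cursory testing catches the omission: the
body runs, but the generator's own code never does.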

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From jim at zope.com  Sat Oct 22 16:21:28 2005
From: jim at zope.com (Jim Fulton)
Date: Sat, 22 Oct 2005 10:21:28 -0400
Subject: [Python-Dev] Comparing date+time w/ just time
In-Reply-To: <17242.9985.574217.23379@montanaro.dyndns.org>
References: <17242.9985.574217.23379@montanaro.dyndns.org>
Message-ID: <435A4AE8.6010708@zope.com>

skip at pobox.com wrote:
> With significant input from Fred I made some changes to xmlrpclib a couple
> months ago to better integrate datetime objects into xmlrpclib.  That raised
> some problems because I neglected to add support for comparing datetime
> objects with xmlrpclib.DateTime objects.  (The problem showed up in
> MoinMoin.)  I've been working on that recently (adding rich comparison
> methods to DateTime while retaining __cmp__ for backward compatibility), and
> have second thoughts about one of the original changes.
> 
> I tried to support datetime, date and time objects.  My problems are with
> support for time objects.  Marshalling datetimes as xmlrpclib.DateTime
> objects is no problem (though you lose fractions of a second).  Marshalling
> dates is reasonable if you treat the time as 00:00:00. 

I don't think that is reasonable at all.  I would normally expect
a date to represent the whole day, not a particular, unspecified time.
Other people may have other expectations, but xmlrpclib should not
assume a particular interpretation.

> I decided to marshal
> datetime.time objects by fixing the day portion of the xmlrpclib.DateTime
> object as today's date.  That's the suspect part.

Very very suspect. :)

> When I went back recently to add better comparison support, I decided to
> compare xmlrpclib.DateTime objects with time objects by simply comparing the
> HH:MM:SS part of the DateTime with the time object.  That's making me a bit
> queazy now.  datetime.time(hour=23) would compare equal to any DateTime with
> its time equal to 11PM.  Under the rule, "in the face of ambiguity, refuse
> the temptation to guess", I'm inclined to dump support for marshalling and
> comparison of time objects altogether.  Do others agree that was a bad idea?

I agree that it was a bad idea and that you should not try to marshal
time objects or compare time objects with DateTime objects.
Similarly, I strongly recommend that you also stop trying to marshal date
objects or compare date objects to DateTime objects.  After all,
if the datetime module doesn't allow comparison of date and datetime,
why should you try to compare date and DateTime?

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From tim.peters at gmail.com  Sat Oct 22 18:13:53 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 22 Oct 2005 12:13:53 -0400
Subject: [Python-Dev] int(string)
In-Reply-To: <2mfyqu6j55.fsf@starship.python.net>
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>
	<2mfyqu6j55.fsf@starship.python.net>
Message-ID: <1f7befae0510220913te4f0975k95932d0515b6ce59@mail.gmail.com>

[Tim]
...
>> int('102002022201221111211', 3) = 0

I should have added that all those examples simply used 2**32 as
input, expressed as a string in the input base.  They're not the only
failing cases; e.g., this is also obviously wrong:

>>> int('102002022201221111212', 3)
1

...

>> The challenge (should you decide to accept it <wink>) is to replace
>> the overflow-checking with something both correct _and_ much faster
>> than doing n integer divisions for an n-character input.  For example,
>> 36**6 < 2**32-1, so whenever the input has no more than 6 digits
>> overflow is impossible regardless of base and regardless of platform.
>> That's simple and exploitable.  For extra credit, make int(string) go
>> faster than preparing your taxes ;-)

[Michael Hudson]
> So, you're suggesting dividing the input up into known non-overflowing
> chunks and using the normal Python operations to combine those chunks,
> relying on them overflowing to longs as needed?

Possibly.  I want int(str), for the comparatively short decimal
strings most apps convert most of the time, to be much faster too. 
The _simplest_ thing one could do with the observation is add a
number-of-digits counter to PyOS_strtoul's loop, skip the overflow
check entirely for the first six digits converted, and for every digit
(if any) after the sixth do "obviously correct" overflow checking.

That would save min(len(s), 6) integer divisions per call, and would
probably be a real speed win for most apps that do a lot of
int(string).  Slightly more ambitious would be to use a different
constant per base; e.g., for base 10 overflow is impossible if there
are no more than 9 digits, and exploiting that would buy that
int(decimal_str) would almost never need to do an integer division in
most apps.
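
A minimal sketch of that per-base idea, written in Python for
readability rather than as the C that would go into PyOS_strtoul. The
names safe_digits and checked_strtoul are invented for the example, a
32-bit unsigned long is assumed, and sign, whitespace and base
auto-detection are left out.

```python
ULONG_MAX = 2**32 - 1  # assumption: sizeof(long) == 4


def safe_digits(base, limit=ULONG_MAX):
    """Largest n such that every n-digit string in `base` is <= limit."""
    n = 0
    power = 1
    while power * base - 1 <= limit:
        power *= base
        n += 1
    return n


def checked_strtoul(s, base, limit=ULONG_MAX):
    """Convert s, raising OverflowError if the value exceeds limit.

    The first safe_digits(base) digits need no overflow check at all.
    Later digits check the multiply and the add separately, so there is
    no special case for base 10.
    """
    unchecked = safe_digits(base, limit)
    result = 0
    for i, ch in enumerate(s):
        digit = int(ch, 36)
        if digit >= base:
            raise ValueError("invalid digit %r for base %d" % (ch, base))
        if i < unchecked:
            result = result * base + digit
            continue
        if result > limit // base:   # the multiply would overflow
            raise OverflowError(s)
        result *= base
        if digit > limit - result:   # the add would overflow
            raise OverflowError(s)
        result += digit
    return result


print(safe_digits(10), safe_digits(36))
```

In C the limit // base value would come from a small static table
indexed by base, so no division is left at run time.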

The strategy you suggest could, if implemented carefully, speed all
int(string) and long(string) operations, except for long(string, base)
where base is a power of 2 (the latter case is highly optimized
already, in longobject.c's long_from_binary_base).

Speeding long(string) for non-power-of-2 bases is tricky.  It benefits
already from the internal muladd1() routine, which does the "multiply
by the base and add in the next digit" step in one gulp, mutating the
C representation of a long directly.  That's a very efficient loop in
part because it _knows_ the base fits in a single "Python long digit".

Combining larger chunks _could_ be faster, but the multiplication
problem gets harder if base**chunk_size exceeds a single Python long
digit.

So there are a world of possible complications here.  I'd be delighted
to see "just" correct overflow checking plus a major speed boost for
int(decimal_string) where the result does fit in a 32-bit unsigned int
(which I'm sure accounts for the vast bulk of dynamic real-life
int(string) invocations).

> All of the examples you posted should have returned longs anyway, right?

On a 32-bit box, yes.  Regardless of box, all of the original examples
should return 2**32.  The one at the top of this message should return
2**32+1.
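
The expected values are easy to confirm on any Python with unbounded
ints. The helper to_base below is written just for this check and is
not a library function.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base(n, base):
    """Render a non-negative integer as a digit string in base 2..36."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, base)
        out.append(DIGITS[r])
    return "".join(reversed(out))


# 2**32 spelled in base 3 is the string from the original report.
spelled = to_base(2**32, 3)
print(spelled, int(spelled, 3) == 2**32)
```

Looping to_base(2**32, b) over b in 2..36 regenerates one such input per
base.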

> I guess the change to automatically overflowing to longs has led to
> some code that shows its history more than one would like.

Well, these particular cases were always broken -- they always
returned 0.  The difference is that in modern Pythons they should
return the right answer, while in older Pythons they should have
raised OverflowError.

From rhamph at gmail.com  Sat Oct 22 18:49:38 2005
From: rhamph at gmail.com (Adam Olsen)
Date: Sat, 22 Oct 2005 10:49:38 -0600
Subject: [Python-Dev] int(string)
In-Reply-To: <2mfyqu6j55.fsf@starship.python.net>
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>
	<2mfyqu6j55.fsf@starship.python.net>
Message-ID: <aac2c7cb0510220949g5c60dcfdp89ca2d94bdd041d4@mail.gmail.com>

> Tim Peters <tim.peters at gmail.com> writes:
>
> > Turns out it's _not_ input speed that's the problem here, and not even
> > mainly the speed of integer mod:  the bulk of the time is spent in
> > int(string) (and, yes, that's also far more important to the problem
> > Neal was looking at than list.append time). If you can even track all
> > the levels of C function calls that ends up invoking <wink>, you find
> > yourself in PyOS_strtoul(), which is a nifty all-purpose routine that
> > accepts inputs in bases 2 thru 36, can auto-detect base, and does
> > platform-independent overflow checking at the cost of a division per
> > digit.  All those are features, but it makes for sloooow conversion.
>
> > I assume it's the overflow-checking that's the major time sink,

Are you sure?

https://sourceforge.net/tracker/index.php?func=detail&aid=1334979&group_id=5470&atid=305470

That patch removes the division from the loop (and fixes the bugs),
but gives only a small increase in speed.

--
Adam Olsen, aka Rhamphoryncus

From guido at python.org  Sat Oct 22 19:22:56 2005
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Oct 2005 10:22:56 -0700
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <435A4598.3060403@iinet.net.au>
References: <435A4598.3060403@iinet.net.au>
Message-ID: <ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>

On 10/22/05, Nick Coghlan <ncoghlan at iinet.net.au> wrote:
> I'm still looking for more feedback on the issues raised in the last update of
> PEP 343. There hasn't been much direct feedback so far, but I've rephrased and
> suggested resolutions for the outstanding issues based on what feedback I have
> received, and my own thoughts over the last week or so.

Thanks for bringing this up again. It's been at the back of my mind,
but hasn't had much of a chance to come to the front lately...

> For those simply skimming, my proposed issue resolutions are:
>
>    1. Use the slot name "__context__" instead of "__with__"

+1

>    2. Reserve the builtin name "context" for future use as described below

+0.5. I don't think we'll need that built-in, but I do think that the
term "context" is too overloaded to start using it for anything in
particular.

>    3a. Give generator-iterators a native context that invokes self.close()

I'll have to think about this one more, and I don't have time for that
right now.

>    3b. Use "contextmanager" as a builtin decorator to get generator-contexts

+1

>    4. Special case the __context__ slot to avoid the need to decorate it

-1. I expect that we'll also see generator *functions* (not methods)
as context managers. The functions need the decorator. For consistency
the methods should also be decorated explicitly.

For example, while I'm now okay (at the +0.5 level) with having files
automatically behave like context managers, one could still write an
explicit context manager 'opening':

@contextmanager
def opening(filename):
    f = open(filename)
    try:
        yield f
    finally:
        f.close()

Compare to

class FileLike:

    def __init__(self, ...): ...

    def close(self): ...

    @contextmanager
    def __context__(self):
        try:
            yield self
        finally:
            self.close()
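
For anyone who wants to run the first example as it stands, here is a
self-contained sketch. It assumes the decorator is the one that
present-day Python provides as contextlib.contextmanager; the temporary
file is only scaffolding for the demonstration.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def opening(filename):
    f = open(filename)
    try:
        yield f
    finally:
        f.close()


# Scaffolding: create a small file to open.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as out:
    out.write("hello\n")

with opening(path) as f:
    data = f.read()
closed_after = f.closed  # the finally clause has run by this point

os.remove(path)
print(repr(data), closed_after)
```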

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fdrake at acm.org  Sat Oct 22 19:26:35 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 22 Oct 2005 13:26:35 -0400
Subject: [Python-Dev] Comparing date+time w/ just time
In-Reply-To: <17242.9985.574217.23379@montanaro.dyndns.org>
References: <17242.9985.574217.23379@montanaro.dyndns.org>
Message-ID: <200510221326.36225.fdrake@acm.org>

On Saturday 22 October 2005 07:48, skip at pobox.com wrote:
 > ..., I'm inclined to dump support
 > for marshalling and comparison of time objects altogether.  Do others
 > agree that was a bad idea?

Very much.  As Jim notes, supporting date objects is more than a little 
questionable as well.  Dates and times, separate from a date-time, are 
completely unsupported by the bare XML-RPC protocol.  Applications must 
determine what they mean and how to encode them in XML-RPC separately if they 
need to do so.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From tim.peters at gmail.com  Sat Oct 22 19:38:11 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 22 Oct 2005 13:38:11 -0400
Subject: [Python-Dev] int(string)
In-Reply-To: <aac2c7cb0510220949g5c60dcfdp89ca2d94bdd041d4@mail.gmail.com>
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>
	<2mfyqu6j55.fsf@starship.python.net>
	<aac2c7cb0510220949g5c60dcfdp89ca2d94bdd041d4@mail.gmail.com>
Message-ID: <1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com>

[Tim]
>> I assume it's the overflow-checking that's the major time sink,

[Adam Olsen]
> Are you sure?

No -- that's what "assume" means <0.7 wink>.  For example, there's a
long chain of function calls involved in int(string) too.

> <https://sourceforge.net/tracker/index.php?func=detail&aid=1334979&group_id=5470&atid=305470>
>
> That patch removes the division from the loop (and fixes the bugs),
> but gives only a small increase in speed.

As measured how?  Platform, compiler, input, etc?  Is the "ULONG_MAX /
base" part compiled to inline code or to a call to a library routine
(e.g., if the latter, it could be that a dividend with "the sign bit
set" is extraordinarily expensive for unsigned division -- depends on
the <compiler, HW> pair in use)?  If so, a small static table could
avoid all runtime division.  If not, note that the number of divisions
hasn't actually changed for 1-character input.  Etc.

In any case, I agree it _should_ fix the bugs (although it also needs
new tests to verify that).

From rhamph at gmail.com  Sat Oct 22 20:03:45 2005
From: rhamph at gmail.com (Adam Olsen)
Date: Sat, 22 Oct 2005 12:03:45 -0600
Subject: [Python-Dev] int(string)
In-Reply-To: <1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com>
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>
	<2mfyqu6j55.fsf@starship.python.net>
	<aac2c7cb0510220949g5c60dcfdp89ca2d94bdd041d4@mail.gmail.com>
	<1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com>
Message-ID: <aac2c7cb0510221103w57b883b1r1a22b407097a2aca@mail.gmail.com>

On 10/22/05, Tim Peters <tim.peters at gmail.com> wrote:
> [Tim]
> >> I assume it's the overflow-checking that's the major time sink,
>
> [Adam Olsen]
> > Are you sure?
>
> No -- that's what "assume" means <0.7 wink>.  For example, there's a
> long chain of function calls involved in int(string) too.
>
> > <https://sourceforge.net/tracker/index.php?func=detail&aid=1334979&group_id=5470&atid=305470>
> >
> > That patch removes the division from the loop (and fixes the bugs),
> > but gives only a small increase in speed.
>
> As measured how?  Platform, compiler, input, etc?  Is the "ULONG_MAX /
> base" part compiled to inline code or to a call to a library routine
> (e.g., if the latter, it could be that a dividend with "the sign bit
> set" is extraordinarily expensive for unsigned division -- depends on
> the <compiler, HW> pair in use)?  If so, a small static table could
> avoid all runtime division.  If not, note that the number of divisions
> hasn't actually changed for 1-character input.  Etc.

AMD Athlon 2500+, Linux 2.6.13, GCC 4.0.2

rhamph at factor:~/src/Python-2.4.1$ python2.4 -m timeit 'int("999999999")'
1000000 loops, best of 3: 0.834 usec per loop
rhamph at factor:~/src/Python-2.4.1$ ./python -m timeit 'int("999999999")'
1000000 loops, best of 3: 0.801 usec per loop
rhamph at factor:~/src/Python-2.4.1$ python2.4 -m timeit 'int("9")'
1000000 loops, best of 3: 0.709 usec per loop
rhamph at factor:~/src/Python-2.4.1$ ./python -m timeit 'int("9")'
1000000 loops, best of 3: 0.717 usec per loop

Originally I just tried the longer string so I hadn't noticed that the
smaller string was slightly slower.  Oh well, caveat emptor.
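
The same comparison can be scripted with the timeit module so that both
inputs are always measured together. This is only a sketch: best_usec
is a made-up helper, the loop count is kept small, and the figures it
prints depend on the machine and the interpreter build.

```python
import timeit


def best_usec(stmt, number=100000, repeat=3):
    """Best-of-`repeat` time per loop for stmt, in microseconds."""
    times = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(times) / number * 1e6


statements = ('int("999999999")', 'int("9")')
results = {stmt: best_usec(stmt) for stmt in statements}
for stmt in statements:
    print("%-20s %.3f usec per loop" % (stmt, results[stmt]))
```

Run it once under each interpreter being compared.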

--
Adam Olsen, aka Rhamphoryncus

From reinhold-birkenfeld-nospam at wolke7.net  Sat Oct 22 22:51:24 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Sat, 22 Oct 2005 22:51:24 +0200
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <4edc17eb0510200035u370b57f9ub1d66b4e99d1be62@mail.gmail.com>
References: <d11dcfba0510190838g3629f183r616485df4f8ea252@mail.gmail.com>	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>	<43544CC1.5050204@canterbury.ac.nz>	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>	<1129601328.9405.13.camel@geddy.wooz.org>	<d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>	<ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>	<5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com>	<17238.40158.735826.504410@montanaro.dyndns.org>
	<4edc17eb0510200035u370b57f9ub1d66b4e99d1be62@mail.gmail.com>
Message-ID: <dje8oj$mml$1@sea.gmane.org>

Michele Simionato wrote:
> As others explained, the syntax would not work for functions (and it is
> not intended to).
> A possible use case I had in mind is to define inlined modules to be
> used as bunches
> of attributes. For instance, I could define a module as
> 
> module m():
>     a = 1
>     b = 2
> 
> where 'module' would be the following function:
> 
> def module(name, args, dic):
>     mod = types.ModuleType(name, dic.get('__doc__'))
>     for k in dic: setattr(mod, k, dic[k])
>     return mod

Wow. This looks like an almighty tool. We can have modules, interfaces,
classes and properties all the like with this.

Guess a PEP would be nice.
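
The "module m():" block syntax is only a proposal, but the quoted
function can already be tried by calling it by hand. A minimal sketch,
assuming the statement would pass the name, an argument tuple and the
namespace dict in that order:

```python
import types


def module(name, args, dic):
    mod = types.ModuleType(name, dic.get('__doc__'))
    for k in dic:
        setattr(mod, k, dic[k])
    return mod


# What the proposed "module m(): a = 1; b = 2" would presumably boil down to.
m = module('m', (), {'a': 1, 'b': 2})
print(m.__name__, m.a, m.b)
```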

Reinhold


From guido at python.org  Sat Oct 22 23:14:34 2005
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Oct 2005 14:14:34 -0700
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>
References: <435A4598.3060403@iinet.net.au>
	<ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>
Message-ID: <ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>

Here's another argument against automatically decorating __context__.

What if I want to have a class with a __context__ method that returns
a custom context manager that *doesn't* involve applying
@contextmanager to a generator?

While technically this is possible with your proposal (since such a
method wouldn't be a generator), it's exceedingly subtle for the human
reader. I'd much rather see the @contextmanager decorator to emphasize
the difference.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at pobox.com  Sat Oct 22 23:28:00 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sat, 22 Oct 2005 16:28:00 -0500
Subject: [Python-Dev] Comparing date+time w/ just time
In-Reply-To: <435A4AE8.6010708@zope.com>
References: <17242.9985.574217.23379@montanaro.dyndns.org>
	<435A4AE8.6010708@zope.com>
Message-ID: <17242.44768.957645.357325@montanaro.dyndns.org>

Based on feedback from Jim and Fred, I took out date and time object
marshalling and comparison.  (Actually, you can still compare an
xmlrpclib.DateTime object with a datetime.date object, because DateTime
objects can be compared with anything that has a timetuple method.)  There's
a patch at

    http://python.org/sf/1330538

I went ahead and assigned it to Fred since he's worked with that code fairly
recently.

Skip

From bokr at oz.net  Sun Oct 23 01:49:53 2005
From: bokr at oz.net (Bengt Richter)
Date: Sat, 22 Oct 2005 16:49:53 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
 conversions).
Message-ID: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>

Please bear with me for a few paragraphs ;-)

One aspect of str-type strings is the efficiency afforded when all the encoding really
is ascii. If the internal encoding were e.g. fixed utf-16le for strings, maybe with today's
computers it would still be efficient enough for most actual string purposes (excluding
the current use of str-strings as byte sequences).

I.e., you'd still have to identify what was "strings" (of characters) and what was really
byte sequences with no implied or explicit encoding or character semantics.

Ok, let's make that distinction explicit: Call one kind of string a byte sequence and the
other a character sequence (representation being a separate issue).

A unicode object is of course the prime _general_ representation of a character sequence
in Python, but all the names in python source code (that become NAME tokens) are UIAM
also character sequences, and representable by a byte sequence interpreted according to
ascii encoding.

For the sake of discussion, suppose we had another _character_ sequence type that was
the moral equivalent of unicode except for internal representation, namely a str
subclass with an encoding attribute specifying the encoding that you _could_ use
to decode the str bytes part to get unicode (which you wouldn't do except when necessary).
We could call it class charstr(str): ... and have charstr().bytes be the str part and
charstr().encoding specify the encoding part.

In all the contexts where we have obvious encoding information, we can then generate
a charstr instead of a str. E.g., if the source of module_a has

    # -*- coding: latin1 -*-
    cs = 'über-cool'
then
    type(cs)  # => <type 'charstr'>
    cs.bytes  # => '\xfcber-cool'
    cs.encoding # => 'latin-1'

and print cs would act like print cs.bytes.decode(cs.encoding) -- or I guess
    sys.stdout.write(cs.bytes.decode(cs.encoding).encode(sys.stdout.encoding))
followed by
    sys.stdout.write('\n'.decode('ascii').encode(sys.stdout.encoding))
for the newline of the print.

Now if module_b has

    # -*- coding: utf8 -*-
    cs = 'über-cool'

and we interactively
    import module_a, module_b
and then
    print module_a.cs + ' =?= ' + module_b.cs

what could happen ideally vs. what we have currently?
UIAM, currently we would just get the three str byte sequences
concatenated to make
    '\xfcber-cool =?= \xc3\xbcber-cool'
and that would be printed as whatever that comes out as
without conversion when seen by the output according to
sys.stdout.encoding.

But if those cs instances had been charstr instances, the coding cookie
encoding information would have been preserved, and the interactive print could
have evaluated the string expression -- given cs.decode() as sugar for
    (cs.bytes.decode(cs.encoding or globals().get('__encoding__') or
         __import__('sys').getdefaultencoding()))
-- as

    module_a.cs.decode() + ' =?= '.decode() + module_b.cs.decode()

if pairwise terms differ in encoding as they might all here. If the interactive
session source were e.g. latin-1, like module_a, then
    module_a.cs + ' =?= '
would not require an encoding change, because the ' =?= ' would be a charstr instance
with encoding == 'latin-1', and so the result would still be latin-1 that far.
But with module_b.cs being utf8, the next addition would cause the .decode() promotions
to unicode. In a console window, the ' =?= '.encoding might be 'cp437' or such, and
the first addition would then cause promotion (since module_a.cs.encoding != 'cp437').
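
The promotion rule described above can be sketched roughly as follows. This is
only an illustration in present-day Python (where bytes and str are already
separate types), not the proposed implementation: the charstr class, its
text() method and the _canonical() helper are hypothetical names, and the
sketch subclasses bytes rather than the 2.x str.

```python
import codecs


def _canonical(encoding):
    # Normalise aliases such as 'latin1' and 'latin-1' to one codec name.
    return codecs.lookup(encoding).name


class charstr(bytes):
    """Hypothetical type: raw bytes tagged with the encoding that decodes them."""

    def __new__(cls, data, encoding="ascii"):
        self = super().__new__(cls, data)
        self.encoding = _canonical(encoding)
        return self

    def text(self):
        # Explicit promotion to Unicode text.
        return bytes.decode(self, self.encoding)

    def __add__(self, other):
        if not isinstance(other, charstr):
            return NotImplemented
        if other.encoding == self.encoding:
            # Same encoding: stay in the byte representation, no decoding.
            return charstr(bytes.__add__(self, other), self.encoding)
        # Different encodings: promote both operands to Unicode.
        return self.text() + other.text()


a_cs = charstr(b"\xfcber-cool", "latin1")      # as in module_a
sep = charstr(b" =?= ", "latin-1")             # literal from a latin-1 session
b_cs = charstr(b"\xc3\xbcber-cool", "utf8")    # as in module_b

same = a_cs + sep      # encodings match, result is still a charstr
mixed = same + b_cs    # encodings differ, result is Unicode text
```

With a latin-1 session the first addition stays a charstr and only the second
one decodes, which is the behaviour the paragraph above argues for.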

I have sneaked in run-time access to individual modules' encodings by assuming that
the encoding cookie could be compiled in as an explicit global __encoding__ variable
for any given module (what to have as __encoding__ for built-in modules could vary for
various purposes).

ISTM this could have use in situations where an encoding assumption is necessary and
currently 'ascii' is not as good a guess as one could make, though I suspect if string
literals became charstr strings instead of str strings, many if not most of those situations
would disappear (I'm saying this because ATM I can't think of an 'ascii'-guess situation that
wouldn't go away ;-) If there were a charchr() version of chr() that would result in
a charstr instead of a str, IWT one would want an easy-sugar default encoding assumption,
probably based on the same as one would assume for '%c' % num in a given module source
-- which presumably would be '%c'.encoding, where '%c' assumes the encoding of the module
source, normally recorded in __encoding__. So charchr(n) would act like
    chr(n).decode().encode(''.encoding)
-- or more reasonably charstr(chr(n)), which would be short for
    charstr(chr(n), globals().get('__encoding__') or __import__('sys').getdefaultencoding())
Or some efficient equivalent ;-)

Using strings in dicts requires hashing to find key comparison candidates and comparison to
check for key equivalence. This would seem to point to some kind of normalized hashing, but
not necessarily normalized key representation. Some is apparently happening, since
 >>> hash('a') == hash(unicode('a'))
 True

I don't know what would be worth the trouble to optimize string key usage where strings are
really all of one encoding vs totally general use vs a heavily biased mix. Or even if it could
be done without unreasonable complexity. Maybe a dict could be given an option to hash all
its keys as unicode vs whatever it does now. But having a charstr subtype of str would improve
the "implicit" conversions to unicode IMO.

Anyway, I wanted to throw in my .02USD re the implicit conversions, taking the view that
much of the implicitness could be based on reliable inferences from source encodings of
string literals or from their effects as format strings.

Regards,
Bengt Richter
[not a normal subscriber to python-dev, so I'll have to google for any responses]


From raymond.hettinger at verizon.net  Sun Oct 23 07:30:17 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sun, 23 Oct 2005 01:30:17 -0400
Subject: [Python-Dev] AST reverts PEP 342 implementation and IDLE starts
	working again
Message-ID: <000201c5d792$daa57240$79fbcc97@oemcomputer>

FWIW, a few months ago, I reported that File New or File Open in IDLE
would crash Python as a result of the check-in implementing PEP 342.
Now, with AST checked-in, IDLE has started working again.  Given the
reconfirmation, I recommend that the 342 patch be regarded as suspect
and not be restored until the fault is found and repaired.


Raymond


From pje at telecommunity.com  Sun Oct 23 07:53:59 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 23 Oct 2005 01:53:59 -0400
Subject: [Python-Dev] AST reverts PEP 342 implementation and IDLE starts
 working again
In-Reply-To: <000201c5d792$daa57240$79fbcc97@oemcomputer>
Message-ID: <5.1.1.6.0.20051023014942.01f9f8b8@mail.telecommunity.com>

At 01:30 AM 10/23/2005 -0400, Raymond Hettinger wrote:
>FWIW, a few months ago, I reported that File New or File Open in IDLE
>would crash Python as a result of the check-in implementing PEP 342.
>Now, with AST checked-in, IDLE has started working again.  Given the
>reconfirmation, I recommend that the 342 patch be regarded as suspect
>and not be restored until the fault is found and repaired.

PEP 342 is actually implemented in the HEAD.  See:

http://mail.python.org/pipermail/python-dev/2005-October/057477.html

So, your observation actually means that the bug, if any, was somewhere 
else, or was inadvertently fixed or hidden by the AST branch merge.


From ncoghlan at gmail.com  Sun Oct 23 11:35:56 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 23 Oct 2005 19:35:56 +1000
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>
References: <435A4598.3060403@iinet.net.au>	
	<ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>
Message-ID: <435B597C.6040300@gmail.com>

Guido van Rossum wrote:
> Here's another argument against automatically decorating __context__.
> 
> What if I want to have a class with a __context__ method that returns
> a custom context manager that *doesn't* involve applying
> @contextmanager to a generator?
> 
> While technically this is possible with your proposal (since such a
> method wouldn't be a generator), it's exceedingly subtle for the human
> reader. I'd much rather see the @contextmanager decorator to emphasize
> the difference.

Being able to easily pull a native context manager out and turn it into an 
independent context manager just by changing its name is also a big plus. For 
that matter, consider a class that had a "normal" context manager (its context 
slot), and an alternative context manager (defined as a separate method). The 
fact that one had the contextmanager decorator and the other one didn't would 
be rather confusing.

So you've convinced me that auto-decoration is not the right thing to do. 
Those that really don't like decorating a slot can always write it as:

   class UndecoratedSlot(object):

       @contextmanager
       def native_context(self):
           print "Entering native context"
           yield
           print "Exiting native context cleanly"

       __context__ = native_context

Or:

   class UndecoratedSlot(object):

       def __context__(self):
           return self.native_context()

       @contextmanager
       def native_context(self):
           print "Entering native context"
           yield
           print "Exiting native context cleanly"

However, I'm still concerned about the fact that the following class has a 
context manager that doesn't actually work:

   class Broken(object):
     def __context__(self):
         print "This never gets executed"
         yield
         print "Neither does this"

So how about if type_new simply raises a TypeError if it finds a 
generator-iterator function in the __context__ slot?
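
The reason the body "never gets executed" is that calling a generator function
only builds a generator-iterator; nothing runs until something calls next() on
it. A minimal demonstration in present-day Python, where broken_context and
events are illustrative names rather than anything from the PEP:

```python
events = []


def broken_context():
    events.append("entered")
    yield
    events.append("exited")


gen = broken_context()            # builds the generator; the body has not started
ran_on_call = list(events)        # still empty at this point

has_enter = hasattr(gen, "__enter__")   # plain generators are not context managers

next(gen)                         # only now does the body run up to the yield
ran_after_next = list(events)
```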

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Sun Oct 23 11:52:46 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 23 Oct 2005 19:52:46 +1000
Subject: [Python-Dev] Definining properties - a use case for
	class	decorators?
In-Reply-To: <dje8oj$mml$1@sea.gmane.org>
References: <d11dcfba0510190838g3629f183r616485df4f8ea252@mail.gmail.com>	<ca471dc20510162006j43e2172rf81015f71e0a6018@mail.gmail.com>	<43544CC1.5050204@canterbury.ac.nz>	<ca471dc20510171855w357591c4hae2a9ddb40694d23@mail.gmail.com>	<1129601328.9405.13.camel@geddy.wooz.org>	<d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>	<ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>	<5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com>	<17238.40158.735826.504410@montanaro.dyndns.org>	<4edc17eb0510200035u370b57f9ub1d66b4e99d1be62@mail.gmail.com>
	<dje8oj$mml$1@sea.gmane.org>
Message-ID: <435B5D6E.80101@gmail.com>

Reinhold Birkenfeld wrote:
> Michele Simionato wrote:
>> As others explained, the syntax would not work for functions (and it is
>> not intended to).
>> A possible use case I had in mind is to define inlined modules to be
>> used as bunches
>> of attributes. For instance, I could define a module as
>>
>> module m():
>>     a = 1
>>     b = 2
>>
>> where 'module' would be the following function:
>>
>> def module(name, args, dic):
>>     mod = types.ModuleType(name, dic.get('__doc__'))
>>     for k in dic: setattr(mod, k, dic[k])
>>     return mod
> 
> Wow. This looks like an almighty tool. We can have modules, interfaces,
> classes and properties all the like with this.
> 
> Guess a PEP would be nice.

Very nice indeed. I'd be more supportive if it was defined as a new statement 
such as "create" with the syntax:

   create TYPE NAME(ARGS):
     BLOCK

The result would be roughly equivalent to:

   kwds = {}
   exec BLOCK in kwds
   NAME = TYPE(NAME, ARGS, kwds)

Such that the existing 'class' statement is equivalent to:

   create __metaclass__ NAME(ARGS):
     BLOCK
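
As a rough sketch of that equivalence, the expansion can be emulated with an
ordinary function in present-day Python. The create() helper below is
hypothetical and takes BLOCK as a source string, since the statement itself
does not exist; module() is Michele's function from the quoted message.

```python
import types


def create(type_, name, args, block):
    # Emulates: create TYPE NAME(ARGS): BLOCK
    kwds = {}
    exec(block, kwds)
    kwds.pop("__builtins__", None)   # exec() adds this key; drop it again
    return type_(name, args, kwds)


def module(name, args, dic):
    mod = types.ModuleType(name, dic.get("__doc__"))
    for k in dic:
        setattr(mod, k, dic[k])
    return mod


# "create module m(): a = 1; b = 2"
m = create(module, "m", (), "a = 1\nb = 2")

# Using the builtin metaclass as TYPE gives an ordinary class statement.
Point = create(type, "Point", (object,), "x = 3\ndef double(self): return 2 * self.x")
```

Here m.a and m.b are module attributes, and Point behaves like a class written
with the usual class statement.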

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From raymond.hettinger at verizon.net  Sun Oct 23 13:27:31 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Sun, 23 Oct 2005 07:27:31 -0400
Subject: [Python-Dev] AST reverts PEP 342 implementation and IDLE starts
 working again
In-Reply-To: <5.1.1.6.0.20051023014942.01f9f8b8@mail.telecommunity.com>
Message-ID: <000601c5d7c4$c2ace600$79fbcc97@oemcomputer>

[Phillip J. Eby]
> your observation actually means that the bug, if any, was somewhere
> else, or was inadvertently fixed or hidden by the AST branch merge.

What a nice side benefit :-)


Raymond


From martin at v.loewis.de  Sun Oct 23 15:53:04 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 23 Oct 2005 15:53:04 +0200
Subject: [Python-Dev] New codecs checked in
In-Reply-To: <435965E4.5050207@egenix.com>
References: <435965E4.5050207@egenix.com>
Message-ID: <435B95C0.9060005@v.loewis.de>

M.-A. Lemburg wrote:
> I've checked in a whole bunch of newly generated codecs
> which now make use of the faster charmap decoding variant added
> by Walter a short while ago.
> 
> Please let me know if you find any problems.

I think we should work on eliminating the decoding_map variables.
There are some codecs which rely on them being present in other codecs
(e.g. koi8_u.py is based on koi8_r.py); however, this could be updated
to use, say

decoding_table = codecs.update_decoding_map(koi8_r.decoding_table, {
         0x00a4: 0x0454, #       CYRILLIC SMALL LETTER UKRAINIAN IE
         0x00a6: 0x0456, #       CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
         0x00a7: 0x0457, #       CYRILLIC SMALL LETTER YI (UKRAINIAN)
         0x00ad: 0x0491, #       CYRILLIC SMALL LETTER UKRAINIAN GHE WITH UPTURN
         0x00b4: 0x0404, #       CYRILLIC CAPITAL LETTER UKRAINIAN IE
         0x00b6: 0x0406, #       CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
         0x00b7: 0x0407, #       CYRILLIC CAPITAL LETTER YI (UKRAINIAN)
         0x00bd: 0x0490, #       CYRILLIC CAPITAL LETTER UKRAINIAN GHE WITH UPTURN
})

With all these cross-references gone, the decoding_maps could also go.
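
Note that codecs.update_decoding_map is only a suggested name; no such function
is in the codecs module. A minimal sketch of what a helper like it might do,
written in present-day Python under the assumption that a decoding table is a
256-character string indexed by byte value (update_decoding_table and
base_table are illustrative names):

```python
import codecs


def update_decoding_table(base_table, overrides):
    # Return a copy of base_table with selected byte values remapped.
    chars = list(base_table)
    for byte, codepoint in overrides.items():
        chars[byte] = chr(codepoint)
    return "".join(chars)


# Stand-in base table (identity mapping); a real codec would pass its own table.
base_table = "".join(chr(i) for i in range(256))

derived = update_decoding_table(base_table, {
    0x00a4: 0x0454,  # CYRILLIC SMALL LETTER UKRAINIAN IE
    0x00b4: 0x0404,  # CYRILLIC CAPITAL LETTER UKRAINIAN IE
})

decoded, consumed = codecs.charmap_decode(b"\xa4\xb4A", "strict", derived)
```

The base table is left untouched, so the derived codec no longer needs the
other codec's decoding_map to exist.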

Regards,
Martin

From giovanniangeli at iquattrocastelli.it  Sun Oct 23 18:12:42 2005
From: giovanniangeli at iquattrocastelli.it (giovanniangeli@iquattrocastelli.it)
Date: Sun, 23 Oct 2005 18:12:42 +0200 (CEST)
Subject: [Python-Dev] cross compiling python for embedded systems
Message-ID: <32987.82.58.25.23.1130083962.squirrel@www.iquattrocastelli.it>


is this the right place to ask:
How could I build the python interpreter for an embedded linux target system
(arm9 based), cross-compiling on a linux PC host?

thanks, Giovanni Angeli.



From guido at python.org  Sun Oct 23 18:19:40 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 23 Oct 2005 09:19:40 -0700
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <435B597C.6040300@gmail.com>
References: <435A4598.3060403@iinet.net.au>
	<ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>
	<435B597C.6040300@gmail.com>
Message-ID: <ca471dc20510230919p7218c195oe97389935c3814ff@mail.gmail.com>

On 10/23/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> However, I'm still concerned about the fact that the following class has a
> context manager that doesn't actually work:
>
>    class Broken(object):
>      def __context__(self):
>          print "This never gets executed"
>          yield
>          print "Neither does this"

That's only because of your proposal to endow generators with a
default __context__ manager. Drop that idea and you're golden.

(As long as nobody snuck the proposal back in to let the
with-statement silently ignore objects that don't have a __context__
method -- that was rejected long ago.)

In my previous mail I said I had to think about that one more -- well,
I have, and I'm now -1 on it. Very few generators (that aren't used as
context managers) will need the immediate explicit close() call, and
it will happen eventually when they are GC'ed anyway. Too much magic
is bad for your health.

> So how about if type_new simply raises a TypeError if it finds a
> generator-iterator function in the __context__ slot?

No. type should not bother with understanding what the class is trying
to do. __new__ is only special because it is part of the machinery
that type itself invokes in order to create a new class.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com  Sun Oct 23 18:51:27 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 23 Oct 2005 12:51:27 -0400
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <ca471dc20510230919p7218c195oe97389935c3814ff@mail.gmail.co
 m>
References: <435B597C.6040300@gmail.com> <435A4598.3060403@iinet.net.au>
	<ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>
	<435B597C.6040300@gmail.com>
Message-ID: <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>

At 09:19 AM 10/23/2005 -0700, Guido van Rossum wrote:
>On 10/23/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > However, I'm still concerned about the fact that the following class has a
> > context manager that doesn't actually work:
> >
> >    class Broken(object):
> >      def __context__(self):
> >          print "This never gets executed"
> >          yield
> >          print "Neither does this"
>
>That's only because of your proposal to endow generators with a
>default __context__ manager. Drop that idea and you're golden.
>
>(As long as nobody snuck the proposal back in to let the
>with-statement silently ignore objects that don't have a __context__
>method -- that was rejected long ago.)

Actually, you've just pointed out a new complication introduced by having 
__context__.  The return value of __context__ is supposed to have an 
__enter__ and an __exit__.  Is it a type error if it doesn't?  How do we 
handle that, exactly?

That is, assuming generators don't have enter/exit/context methods, then 
the above code is broken because its __context__ returns an object without 
enter/exit, sort of like an __iter__ that returns something without a 'next()'.


From martin at v.loewis.de  Sun Oct 23 19:03:56 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 23 Oct 2005 19:03:56 +0200
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
Message-ID: <435BC27C.1010503@v.loewis.de>

I'd like to start the subversion switchover this coming Wednesday,
with a total commit freeze at 16:00 GMT. If you have larger changes
to commit that you would like to commit before the switchover, but
after that date, please let me know.

At that point, I will set the repository to read-only (through a
commitinfo hook), and request that SF rolls a tarfile. I will then
notify you when the Subversion repository is online.

If you have sandboxes with modifications, it might be good to
cvs diff -u them now. I plan to keep the CVS up for a short while
after the switchover (about a month); after that point, you will
need to get the CVS tarball and retarget your sandbox to perform
diffs.

I'm not aware of a procedure to convert a CVS sandbox into an SVN
one, so you will have to recheckout all your sandboxes after the
switch.

Regards,
Martin

From jepler at unpythonic.net  Sun Oct 23 19:22:32 2005
From: jepler at unpythonic.net (jepler@unpythonic.net)
Date: Sun, 23 Oct 2005 12:22:32 -0500
Subject: [Python-Dev] cross compiling python for embedded systems
In-Reply-To: <32987.82.58.25.23.1130083962.squirrel@www.iquattrocastelli.it>
References: <32987.82.58.25.23.1130083962.squirrel@www.iquattrocastelli.it>
Message-ID: <20051023172232.GA11117@unpythonic.net>

There's a patch on sourceforge for cross compiling.  I haven't used it personally.

http://sourceforge.net/tracker/index.php?func=detail&aid=1006238&group_id=5470&atid=305470

Jeff

From martin at v.loewis.de  Sun Oct 23 19:22:37 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 23 Oct 2005 19:22:37 +0200
Subject: [Python-Dev] cross compiling python for embedded systems
In-Reply-To: <32987.82.58.25.23.1130083962.squirrel@www.iquattrocastelli.it>
References: <32987.82.58.25.23.1130083962.squirrel@www.iquattrocastelli.it>
Message-ID: <435BC6DD.3030900@v.loewis.de>

giovanniangeli at iquattrocastelli.it wrote:
> How could I build the python interpreter for an embedded linux target system
> (arm9 based), cross-compiling on a linux PC host?

No. news:comp.lang.python (aka: mailto:python-list at python.org) would be 
the right list.

This would be the right list for the question "I made this and that 
modification to get it cross-compile, can somebody please review them?"

Regards,
Martin

From p.f.moore at gmail.com  Sun Oct 23 21:15:15 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 23 Oct 2005 20:15:15 +0100
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
References: <435A4598.3060403@iinet.net.au>
	<ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>
	<435B597C.6040300@gmail.com>
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
Message-ID: <79990c6b0510231215tb93f1c0nb58821a36f033a5f@mail.gmail.com>

On 10/23/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> Actually, you've just pointed out a new complication introduced by having
> __context__.  The return value of __context__ is supposed to have an
> __enter__ and an __exit__.  Is it a type error if it doesn't?  How do we
> handle that, exactly?
>
> That is, assuming generators don't have enter/exit/context methods, then
> the above code is broken because its __context__ returns an object without
> enter/exit, sort of like an __iter__ that returns something without a 'next()'.

I would have thought that the parallel with __iter__ would be the
right way to go:

>>> class C:
...     def __iter__(self):
...         return 12
...
>>> c = C()
>>> iter(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: __iter__ returned non-iterator of type 'int'
>>>

So, when you try calling __context__ in a with statement (or I guess
in a context() builtin if one were to be added), raise a TypeError if
the resulting object doesn't have __enter__ and __exit__ methods. (Or
maybe just if it has neither - I can't recall if the methods are
optional, but certainly having neither is wrong).
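
A sketch of that check, modelled on what iter() does, might look like the
following. The context() helper is hypothetical, __context__ is the slot under
discussion in this thread rather than an existing protocol, and the code is
written in present-day Python.

```python
from contextlib import contextmanager


def context(obj):
    # Hypothetical builtin: fetch the context manager and validate it.
    ctx = obj.__context__()
    if not (hasattr(ctx, "__enter__") and hasattr(ctx, "__exit__")):
        raise TypeError("__context__ returned non-context-manager of type %r"
                        % type(ctx).__name__)
    return ctx


class Broken(object):
    def __context__(self):      # undecorated generator: returns a bare generator
        yield


class Working(object):
    @contextmanager
    def __context__(self):      # decorated: returns a real context manager
        yield


try:
    context(Broken())
    broken_error = None
except TypeError as exc:
    broken_error = str(exc)

with context(Working()):
    entered = True
```

Without the explicit hasattr() test the failure would surface later, as an
AttributeError when __enter__ is looked up.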

Paul.

From mwh at python.net  Sun Oct 23 22:08:30 2005
From: mwh at python.net (Michael Hudson)
Date: Sun, 23 Oct 2005 21:08:30 +0100
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <435BC27C.1010503@v.loewis.de> (
	=?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Sun,
	23 Oct 2005 19:03:56 +0200")
References: <435BC27C.1010503@v.loewis.de>
Message-ID: <2mbr1g6loh.fsf@starship.python.net>

"Martin v. Löwis" <martin at v.loewis.de> writes:

> I'd like to start the subversion switchover this coming Wednesday,
> with a total commit freeze at 16:00 GMT.

Yay!  Thanks again for doing this.

Cheers,
mwh

-- 
  [Perl] combines all the worst aspects of C and Lisp: a billion
  different sublanguages in one monolithic executable.  It combines
  the power of C with the readability of PostScript. -- Jamie Zawinski

From jason.orendorff at gmail.com  Mon Oct 24 00:10:28 2005
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Sun, 23 Oct 2005 18:10:28 -0400
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
Message-ID: <bb8868b90510231510i3f247b29nc590d9c45c9bdb9f@mail.gmail.com>

-1 on keeping the source encoding of string literals.  Python should
definitely decode them at compile time.

-1 on decoding implicitly "as needed".  This causes decoding to happen
late, in unpredictable places.  Decodes can fail; they should happen
as early and as close to the data source as possible.

-j

From barry at python.org  Mon Oct 24 00:43:50 2005
From: barry at python.org (Barry Warsaw)
Date: Sun, 23 Oct 2005 18:43:50 -0400
Subject: [Python-Dev] PEP 351, the freeze protocol
Message-ID: <1130107429.11268.40.camel@geddy.wooz.org>

I've had this PEP laying around for quite a few months.  It was inspired
by some code we'd written which wanted to be able to get immutable
versions of arbitrary objects.  I've finally finished the PEP, uploaded
a sample patch (albeit a bit incomplete), and I'm posting it here to see
if there is any interest.

http://www.python.org/peps/pep-0351.html

Cheers,
-Barry


From guido at python.org  Mon Oct 24 01:58:48 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 23 Oct 2005 16:58:48 -0700
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
References: <435A4598.3060403@iinet.net.au>
	<ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>
	<435B597C.6040300@gmail.com>
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
Message-ID: <ca471dc20510231658qb8e14ddkb7956960cfa507d7@mail.gmail.com>

On 10/23/05, Phillip J. Eby <pje at telecommunity.com> wrote:
> Actually, you've just pointed out a new complication introduced by having
> __context__.  The return value of __context__ is supposed to have an
> __enter__ and an __exit__.  Is it a type error if it doesn't?  How do we
> handle that, exactly?

Of course it's an error! The translation in the PEP should make that
quite clear (there's no testing for whether __context__, __enter__
and/or __exit__ exist before they are called). It would be an
AttributeError.

> That is, assuming generators don't have enter/exit/context methods, then
> the above code is broken because its __context__ returns an object without
> enter/exit, sort of like an __iter__ that returns something without a 'next()'.

Right. That was my point. Nick's worried about undecorated __context__
because he wants to endow generators with a different default
__context__. I say no to both proposals and the worries cancel each
other out. EIBTI.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rhamph at gmail.com  Mon Oct 24 02:01:44 2005
From: rhamph at gmail.com (Adam Olsen)
Date: Sun, 23 Oct 2005 18:01:44 -0600
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org>
References: <1130107429.11268.40.camel@geddy.wooz.org>
Message-ID: <aac2c7cb0510231701n47e99a6re0756fd070fcbaaf@mail.gmail.com>

On 10/23/05, Barry Warsaw <barry at python.org> wrote:
> I've had this PEP laying around for quite a few months.  It was inspired
> by some code we'd written which wanted to be able to get immutable
> versions of arbitrary objects.  I've finally finished the PEP, uploaded
> a sample patch (albeit a bit incomplete), and I'm posting it here to see
> if there is any interest.
>
> http://www.python.org/peps/pep-0351.html

My sandboxes need freezing for some stuff and ultimately freezable
user classes will be desirable, but for performance reasons I prefer
freezing inplace.  Not much overlap with PEP 351 really.

--
Adam Olsen, aka Rhamphoryncus

From bob at redivi.com  Mon Oct 24 02:24:05 2005
From: bob at redivi.com (Bob Ippolito)
Date: Sun, 23 Oct 2005 17:24:05 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <bb8868b90510231510i3f247b29nc590d9c45c9bdb9f@mail.gmail.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<bb8868b90510231510i3f247b29nc590d9c45c9bdb9f@mail.gmail.com>
Message-ID: <7E14C004-99C3-4382-BA76-FE2F731B4CE5@redivi.com>


On Oct 23, 2005, at 3:10 PM, Jason Orendorff wrote:

> -1 on decoding implicitly "as needed".  This causes decoding to happen
> late, in unpredictable places.  Decodes can fail; they should happen
> as early and as close to the data source as possible.

That's not necessarily true... Some codecs can't fail, like latin1.   
I think the main use case for this is to speed up usage of text in  
these sorts of formats anyway.

-bob


From srichter at cosmos.phy.tufts.edu  Mon Oct 24 02:52:27 2005
From: srichter at cosmos.phy.tufts.edu (Stephan Richter)
Date: Sun, 23 Oct 2005 20:52:27 -0400
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <bb8868b90510231510i3f247b29nc590d9c45c9bdb9f@mail.gmail.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<bb8868b90510231510i3f247b29nc590d9c45c9bdb9f@mail.gmail.com>
Message-ID: <200510232052.27775.srichter@cosmos.phy.tufts.edu>

On Sunday 23 October 2005 18:10, Jason Orendorff wrote:
> -1 on keeping the source encoding of string literals.  Python should
> definitely decode them at compile time.
>
> -1 on decoding implicitly "as needed".  This causes decoding to happen
> late, in unpredictable places.  Decodes can fail; they should happen
> as early and as close to the data source as possible.

+1. We have followed this last practice throughout Zope 3 successfully. In our 
case, the publisher framework (in other words the output-protocol-specific 
layer) is responsible for the decoding and encoding of input and output 
streams, respectively. We have been pretty much free of any encoding/decoding 
troubles since. Having our application only use unicode internally was one of 
the best decisions we have made.
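
The general shape of that practice, as a minimal sketch in present-day Python
rather than anything taken from the Zope 3 publisher (handle_request and its
charset parameter are illustrative names):

```python
def handle_request(raw_body, charset="utf-8"):
    # Decode once, at the input boundary; bad bytes fail here, early.
    text = raw_body.decode(charset)
    # Application logic only ever sees Unicode text.
    reply = text.upper()
    # Encode once, at the output boundary.
    return reply.encode(charset)


response = handle_request(b"h\xc3\xa9llo")

try:
    handle_request(b"\xff")
    failed_early = False
except UnicodeDecodeError:
    failed_early = True
```

A malformed request is rejected at the boundary instead of failing somewhere
deep inside the application.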

Regards,
Stephan
-- 
Stephan Richter
CBU Physics & Chemistry (B.S.) / Tufts Physics (Ph.D. student)
Web2k - Web Software Design, Development and Training

From guido at python.org  Mon Oct 24 03:06:00 2005
From: guido at python.org (Guido van Rossum)
Date: Sun, 23 Oct 2005 18:06:00 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
Message-ID: <ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>

Folks, please focus on what Python 3000 should do.

I'm thinking about making all character strings Unicode (possibly with
different internal representations a la NSString in Apple's Objective
C) and introduce a separate mutable bytes array data type. But I could
use some validation or feedback on this idea from actual
practitioners.

I don't want to see proposals to mess with the str/unicode semantics
in Python 2.x. Let's leave the Python 2.x str/unicode semantics alone
until Python 3000 -- we don't need multiple transitions. (Although we
could add the mutable bytes array type sooner.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bob at redivi.com  Mon Oct 24 03:31:12 2005
From: bob at redivi.com (Bob Ippolito)
Date: Sun, 23 Oct 2005 18:31:12 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
Message-ID: <9BAB15CE-D7BE-4080-9A21-3ED4EE01B0AE@redivi.com>

On Oct 23, 2005, at 6:06 PM, Guido van Rossum wrote:

> Folks, please focus on what Python 3000 should do.
>
> I'm thinking about making all character strings Unicode (possibly with
> different internal representations a la NSString in Apple's Objective
> C) and introduce a separate mutable bytes array data type. But I could
> use some validation or feedback on this idea from actual
> practitioners.
>
> I don't want to see proposals to mess with the str/unicode semantics
> in Python 2.x. Let's leave the Python 2.x str/unicode semantics alone
> until Python 3000 -- we don't need multiple transitions. (Although we
> could add the mutable bytes array type sooner.)

+1, this is precisely what I'd like to see.

-bob


From pje at telecommunity.com  Mon Oct 24 04:23:40 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun, 23 Oct 2005 22:23:40 -0400
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
 conversions).
In-Reply-To: <ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.co
 m>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
Message-ID: <5.1.1.6.0.20051023221433.02ab8200@mail.telecommunity.com>

At 06:06 PM 10/23/2005 -0700, Guido van Rossum wrote:
>Folks, please focus on what Python 3000 should do.
>
>I'm thinking about making all character strings Unicode (possibly with
>different internal representations a la NSString in Apple's Objective
>C) and introduce a separate mutable bytes array data type. But I could
>use some validation or feedback on this idea from actual
>practitioners.

+1.  Chandler has been going through quite an upheaval to get its unicode 
handling together.  Having a bytes type would be great, as long as there 
was support for files and sockets to produce bytes instead of strings 
(unless an encoding was specified).
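
A rough illustration of the file behaviour asked for here, written against the
built-in open() as it later shipped in Python 3 (an assumption for the sake of
the example; nothing like this existed in 2.x at the time of writing):

```python
import os
import tempfile

# Write a small file containing non-ASCII text encoded as UTF-8.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write("na\u00efve\n".encode("utf-8"))

# A binary open hands back bytes.
with open(path, "rb") as f:
    raw = f.read()

# A text open with an explicit encoding hands back decoded text.
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

os.remove(path)
```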

I'm tempted to say it would be even better if there was a command line 
option that could be used to force all binary opens to result in bytes, and 
require all text opens to specify an encoding.  The Chandler i18n project 
lead would jump for joy if we had a way to keep "legacy" strings out of the 
system, apart from ASCII string constants found in code.

It would then be okay not to drop support for the implicit conversions; if 
you can't get strings on input, then conversion's not really an issue.

Anyway, I think all of the things I'd like to see can be done without 
breakage in 2.5.  For Chandler at least, we'd be willing to go with a 
command-line option that's more strict, in order to be able to ensure that 
plugin developers can't accidentally put 8-bit strings in somewhere, just 
by opening a file.


From jcarlson at uci.edu  Mon Oct 24 04:29:11 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 23 Oct 2005 19:29:11 -0700
Subject: [Python-Dev] Definining properties - a use case for
	class	decorators?
In-Reply-To: <435B5D6E.80101@gmail.com>
References: <dje8oj$mml$1@sea.gmane.org> <435B5D6E.80101@gmail.com>
Message-ID: <20051023121124.38D4.JCARLSON@uci.edu>


Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> Reinhold Birkenfeld wrote:
> > Michele Simionato wrote:
> >> As other explained, the syntax would not work for functions (and it is
> >> not intended to).
> >> A possible use case I had in mind is to define inlined modules to be
> >> used as bunches
> >> of attributes. For instance, I could define a module as
> >>
> >> module m():
> >>     a = 1
> >>     b = 2
> >>
> >> where 'module' would be the following function:
> >>
> >> def module(name, args, dic):
> >>     mod = types.ModuleType(name, dic.get('__doc__'))
> >>     for k in dic: setattr(mod, k, dic[k])
> >>     return mod
> > 
> > Wow. This looks like an almighty tool. We can have modules, interfaces,
> > classes and properties all the like with this.
> > 
> > Guess a PEP would be nice.
> 
> Very nice indeed. I'd be more supportive if it was defined as a new statement 
> such as "create" with the syntax:
> 
>    create TYPE NAME(ARGS):
>      BLOCK
> 
> The result would be roughly equivalent to:
> 
>    kwds = {}
>    exec BLOCK in kwds
>    NAME = TYPE(NAME, ARGS, kwds)

And is equivalent to the class/metaclass abuse...

    # support code
    def BlockMetaclassFactory(constructor):
        class BlockMetaclass(type):
            def __new__(cls, name, bases, dct):
                return constructor(name, bases, dct)
        return BlockMetaclass

    #non-syntax syntax
    class NAME(ARGS):
        __metaclass__ = BlockMetaclassFactory(TYPE)
        BLOCK

Or even...

    def BlockClassFactory(constructor):
        class BlockClass:
            __metaclass__ = BlockMetaclassFactory(constructor)
        return BlockClass

    class NAME(BlockClassFactory(TYPE), ARGS):
        BLOCK

To be used with properties, one could use a wrapper and class definition...

    def _Property(names, bases, dct):
        return property(**dct)

    Property = BlockClassFactory(_Property)

    class foo(object):
        class x(Property):
            ...

With minor work, it would be easy to define a subclassable Property
which could handle some basic styles: write once, default value, etc.

I am unconvinced that a block syntax is necessary or desirable for this
case.  With the proper support classes, you can get modules, classes,
metaclasses, properties, the previous 'given:' syntax, etc.
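
For reference, a self-contained sketch of the Property idea above, ported to
Python 3 metaclass syntax (an assumption; the code above uses the 2.x
__metaclass__ hook). The names block_class_factory and _property are
illustrative, and the constructor only picks the fget/fset/fdel entries out of
the class body instead of passing the whole dict to property():

```python
def block_class_factory(constructor):
    """Return a base class whose subclasses are built by `constructor`."""
    class BlockMeta(type):
        def __new__(mcls, name, bases, dct):
            if not bases:
                # The marker base class itself has to be a real class.
                return super().__new__(mcls, name, bases, dct)
            # Any "subclass" is replaced by whatever the constructor returns.
            return constructor(name, bases, dct)

    class Block(metaclass=BlockMeta):
        pass

    return Block


def _property(name, bases, dct):
    # Ignore bookkeeping entries such as __module__ and __qualname__.
    return property(dct.get("fget"), dct.get("fset"),
                    dct.get("fdel"), dct.get("__doc__"))


Property = block_class_factory(_property)


class Foo(object):
    def __init__(self):
        self._x = 1

    class x(Property):
        def fget(self):
            return self._x

        def fset(self, value):
            self._x = value
```

After this, Foo.x is an ordinary property object rather than a nested class.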


 - Josiah


From jcarlson at uci.edu  Mon Oct 24 04:50:47 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 23 Oct 2005 19:50:47 -0700
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org>
References: <1130107429.11268.40.camel@geddy.wooz.org>
Message-ID: <20051023194708.38D7.JCARLSON@uci.edu>


Barry Warsaw <barry at python.org> wrote:
> I've had this PEP laying around for quite a few months.  It was inspired
> by some code we'd written which wanted to be able to get immutable
> versions of arbitrary objects.  I've finally finished the PEP, uploaded
> a sample patch (albeit a bit incomplete), and I'm posting it here to see
> if there is any interest.
> 
> http://www.python.org/peps/pep-0351.html



    class xlist(list):
        def __freeze__(self):
            return tuple(self)

Shouldn't that be:

    class xlist(list):
        def __freeze__(self):
            return tuple(map(freeze, self))


"Should dicts and sets automatically freeze their mutable keys?"

Dictionaries don't have mutable keys, but it is my opinion that a
container which is frozen should have its contents frozen as well.
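
A sketch of what such a deep freeze might look like. The freeze() function
here is a hypothetical stand-in written for illustration (assuming Python 3);
it is not the implementation from the PEP's sample patch, and it does not try
to handle dicts:

```python
def freeze(obj):
    """Return an immutable, hashable counterpart of obj, contents included."""
    # Let objects that know how to freeze themselves do so.
    hook = getattr(obj, "__freeze__", None)
    if hook is not None:
        return hook()
    if isinstance(obj, (list, tuple)):
        return tuple(freeze(item) for item in obj)
    if isinstance(obj, (set, frozenset)):
        return frozenset(freeze(item) for item in obj)
    # Anything else must already be hashable; hash() raises TypeError if not.
    hash(obj)
    return obj


class xlist(list):
    def __freeze__(self):
        # Freeze the contents as well as the container.
        return tuple(map(freeze, self))
```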

 - Josiah


From nyamatongwe at gmail.com  Mon Oct 24 05:41:50 2005
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Mon, 24 Oct 2005 13:41:50 +1000
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
Message-ID: <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>

Guido van Rossum:

> Folks, please focus on what Python 3000 should do.
>
> I'm thinking about making all character strings Unicode (possibly with
> different internal representations a la NSString in Apple's Objective
> C) and introduce a separate mutable bytes array data type. But I could
> use some validation or feedback on this idea from actual
> practitioners.

   I'd like to more tightly define Unicode strings for Python 3000.
Currently, Unicode strings may be implemented with either 2 byte
(UCS-2) or 4 byte (UTF-32) elements. Python should allow strings to
contain any Unicode character and should be indexable yielding
characters rather than half characters. Therefore Python strings
should appear to be UTF-32. There could still be multiple
implementations (using UTF-16 or UTF-8) to preserve space but all
implementations should appear to be the same apart from speed and
memory use.
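
To make "characters rather than half characters" concrete, here is a small
check written for Python 3.3 or later (an assumption; on a 2.x narrow build
the same string has length 4 and indexing yields surrogate halves):

```python
# One character outside the Basic Multilingual Plane, between two ASCII ones.
ch = "\U00010400"  # DESERET CAPITAL LETTER LONG I
text = "a" + ch + "b"

# Number of code units needed by each encoding form.
utf16_units = len(text.encode("utf-16-le")) // 2
utf32_units = len(text.encode("utf-32-le")) // 4

# Indexing by character: position 1 is the whole non-BMP character.
middle = text[1]
```

UTF-16 needs one more code unit than there are characters, which is exactly
the mismatch a UTF-32-like view would hide.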

   Neil

From alan.mcintyre at esrgtech.com  Mon Oct 24 07:43:26 2005
From: alan.mcintyre at esrgtech.com (Alan McIntyre)
Date: Mon, 24 Oct 2005 01:43:26 -0400
Subject: [Python-Dev] int(string)
In-Reply-To: <1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com>
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>	<2mfyqu6j55.fsf@starship.python.net>	<aac2c7cb0510220949g5c60dcfdp89ca2d94bdd041d4@mail.gmail.com>
	<1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com>
Message-ID: <435C747E.9060005@esrgtech.com>

Tim Peters wrote:

>[Adam Olsen]
>
>>https://sourceforge.net/tracker/index.php?func=detail&aid=1334979&group_id=5470&atid=305470
>>
>>That patch removes the division from the loop (and fixes the bugs),
>>but gives only a small increase in speed.
>>
>In any case, I agree it _should_ fix the bugs (although it also needs
>new tests to verify that).
>
I started with Adam's patch and did some additional work on
PyOS_strtoul.  I ended up with a patch that seems to correctly evaluate
the tests that Tim listed in bug #1334662, includes new tests (in
test_builtin), passes (almost) all of "make test," and it seems to be
somewhat faster (~20%) for a "spoj.sphere.pl"-like test on a ~8MB input
file.  All the ugly details are here (along with my ugly code):

http://sourceforge.net/tracker/index.php?func=detail&aid=1335972&group_id=5470&atid=305470

When running "make test" I get some errors in test_array and
test_compile that did not occur in the build from CVS.  Given the inputs
to long() have '.' characters in them, I assume that these tests really
should be failing as implemented, but I haven't dug into them to see
what's going on:

======================================================================
ERROR: test_repr (__main__.FloatTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_array.py", line 187, in test_repr
    self.assertEqual(a, eval(repr(a), {"array": array.array}))
ValueError: invalid literal for long(): 10000000000.0
 
======================================================================
ERROR: test_repr (__main__.DoubleTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_array.py", line 187, in test_repr
    self.assertEqual(a, eval(repr(a), {"array": array.array}))
ValueError: invalid literal for long(): 10000000000.0
 
----------------------------------------------------------------------

test test_compile crashed -- exceptions.ValueError: invalid literal for long(): 90000000000000.


From martin at v.loewis.de  Mon Oct 24 08:28:14 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 24 Oct 2005 08:28:14 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicit	conversions).
In-Reply-To: <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>
Message-ID: <435C7EFE.6060504@v.loewis.de>

Neil Hodgson wrote:
>    I'd like to more tightly define Unicode strings for Python 3000.
> Currently, Unicode strings may be implemented with either 2 byte
> (UCS-2) or 4 byte (UTF-32) elements. Python should allow strings to
> contain any Unicode character and should be indexable yielding
> characters rather than half characters. Therefore Python strings
> should appear to be UTF-32. There could still be multiple
> implementations (using UTF-16 or UTF-8) to preserve space but all
> implementations should appear to be the same apart from speed and
> memory use.

That's very tricky. If you have multiple implementations, you make
usage at the C API difficult. If you make it either UTF-8 or UTF-32,
you make PythonWin difficult. If you make it UTF-16, you make indexing
difficult.

Regards,
Martin

From martin at v.loewis.de  Mon Oct 24 08:30:52 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 24 Oct 2005 08:30:52 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <5.1.1.6.0.20051023221433.02ab8200@mail.telecommunity.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>	<5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<5.1.1.6.0.20051023221433.02ab8200@mail.telecommunity.com>
Message-ID: <435C7F9C.1050602@v.loewis.de>

Phillip J. Eby wrote:
> I'm tempted to say it would be even better if there was a command line 
> option that could be used to force all binary opens to result in bytes, and 
> require all text opens to specify an encoding.

For Python 3000? -1. There shouldn't be command line switches that have
that much importance.

For Python 2.x? Well, we are not supposed to discuss this.

Regards,
Martin

From nyamatongwe at gmail.com  Mon Oct 24 09:24:23 2005
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Mon, 24 Oct 2005 17:24:23 +1000
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <435C7EFE.6060504@v.loewis.de>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>
	<435C7EFE.6060504@v.loewis.de>
Message-ID: <50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com>

Martin v. Löwis:

> That's very tricky. If you have multiple implementations, you make
> usage at the C API difficult. If you make it either UTF-8 or UTF-32,
> you make PythonWin difficult. If you make it UTF-16, you make indexing
> difficult.

   For Windows, the code will get a little uglier, needing to perform
an allocation/encoding and deallocation more often than at present but
I don't think there will be a speed degradation as Windows is
currently performing a conversion from 8 bit to UTF-16 inside many
system calls. To minimize the cost of allocation, Python could copy
Windows in keeping a small number of commonly sized preallocated
buffers handy.

   For indexing UTF-16, a flag could be set to show if the string is
all in the base plane and if not, an index could be constructed when
and if needed. It'd be good to get some feel for what proportion of
string operations performed require indexing. Many, such as
startswith, split, and concatenation don't require indexing. The
proportion of operations that use indexing to scan strings would also
be interesting as adding a (currentIndex, currentOffset) cursor to
string objects would be another approach.
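
A toy model of the flag-plus-lazy-index idea, in Python 3 for brevity. The
Utf16String class and its attribute names are invented for this sketch; a real
implementation would live in C inside the string object:

```python
import struct


class Utf16String(object):
    """Stores UTF-16 code units but indexes by character."""

    def __init__(self, text):
        data = text.encode("utf-16-le")
        self._units = struct.unpack("<%dH" % (len(data) // 2), data)
        # Flag: True when no surrogates occur, so unit index == char index.
        self._all_bmp = not any(0xD800 <= u <= 0xDFFF for u in self._units)
        self._index = None  # built only when and if needed

    def _char_starts(self):
        if self._index is None:
            starts, i = [], 0
            while i < len(self._units):
                starts.append(i)
                # A high surrogate means the character spans two code units.
                i += 2 if 0xD800 <= self._units[i] <= 0xDBFF else 1
            self._index = starts
        return self._index

    def __len__(self):
        if self._all_bmp:
            return len(self._units)
        return len(self._char_starts())

    def __getitem__(self, n):
        if self._all_bmp:
            return chr(self._units[n])  # fast path, no index required
        start = self._char_starts()[n]
        unit = self._units[start]
        if 0xD800 <= unit <= 0xDBFF:
            low = self._units[start + 1]
            return chr(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
        return chr(unit)
```

Strings that stay in the base plane never pay for the index; others build it
once, on the first operation that needs character positions.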

   Neil

From michele.simionato at gmail.com  Mon Oct 24 09:30:02 2005
From: michele.simionato at gmail.com (Michele Simionato)
Date: Mon, 24 Oct 2005 07:30:02 +0000
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <435B5D6E.80101@gmail.com>
References: <d11dcfba0510190838g3629f183r616485df4f8ea252@mail.gmail.com>
	<1129601328.9405.13.camel@geddy.wooz.org>
	<d11dcfba0510172146p5275727bpb5bd6c604a45a9c5@mail.gmail.com>
	<ca471dc20510172159m7eb94e85vf4b704dd2e6f24b9@mail.gmail.com>
	<4edc17eb0510190151v6a078f54n2135b8d8386c796a@mail.gmail.com>
	<5.1.1.6.0.20051019124840.01fa73b8@mail.telecommunity.com>
	<17238.40158.735826.504410@montanaro.dyndns.org>
	<4edc17eb0510200035u370b57f9ub1d66b4e99d1be62@mail.gmail.com>
	<dje8oj$mml$1@sea.gmane.org> <435B5D6E.80101@gmail.com>
Message-ID: <4edc17eb0510240030x74523a8fl1a9389b5054c061b@mail.gmail.com>

On 10/23/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Very nice indeed. I'd be more supportive if it was defined as a new statement
> such as "create" with the syntax:
>
>    create TYPE NAME(ARGS):
>      BLOCK

I like it, but it would require a new keyword. Alternatively, one
could abuse 'def':

def  TYPE NAME(ARGS):
      BLOCK

but then people would likely be confused as Skip was, earlier in this thread,
so I guess 'def' is not an option.

IMHO a new keyword could be justified for such a powerful feature,
but only Guido's opinion counts on these matters ;)

Anyway I expected people to criticize the proposal as too powerful and
dangerously close to Lisp macros.

         Michele Simionato

From fredrik at pythonware.com  Mon Oct 24 09:41:39 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 24 Oct 2005 09:41:39 +0200
Subject: [Python-Dev] int(string)
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>	<2mfyqu6j55.fsf@starship.python.net>	<aac2c7cb0510220949g5c60dcfdp89ca2d94bdd041d4@mail.gmail.com><1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com>
	<435C747E.9060005@esrgtech.com>
Message-ID: <dji37o$fl3$1@sea.gmane.org>

Alan McIntyre wrote:

> When running "make test" I get some errors in test_array and
> test_compile that did not occur in the build from CVS.  Given the inputs
> to long() have '.' characters in them, I assume that these tests really
> should be failing as implemented, but I haven't dug into them to see
> what's going on:
>
> ======================================================================
> ERROR: test_repr (__main__.FloatTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "Lib/test/test_array.py", line 187, in test_repr
>     self.assertEqual(a, eval(repr(a), {"array": array.array}))
> ValueError: invalid literal for long(): 10000000000.0
>
> ======================================================================
> ERROR: test_repr (__main__.DoubleTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "Lib/test/test_array.py", line 187, in test_repr
>     self.assertEqual(a, eval(repr(a), {"array": array.array}))
> ValueError: invalid literal for long(): 10000000000.0

I don't have the latest cvs, but in my copy of test_array, the input to those
two eval calls are

 array('f', [-42.0, 0.0, 42.0, 100000.0, -10000000000.0, -42.0, 0.0, 42.0,
        100000.0, -10000000000.0])

and

 array('d', [-42.0, 0.0, 42.0, 100000.0, -10000000000.0, -42.0, 0.0, 42.0,
        100000.0, -10000000000.0])

respectively.  if either of those gives "invalid literal for long", something's
seriously broken.

does a plain

    a = -10000000000.0

still work on your machine?
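
A quick way to run the same round trip outside the test suite. This sketch is
written for Python 3, where long() no longer exists, so it only demonstrates
the expected behaviour and will not reproduce the reported failure:

```python
import array

# The same values the test uses, as doubles.
a = array.array("d", [-42.0, 0.0, 42.0, 100000.0, -10000000000.0])

# repr() must produce something eval() can turn back into an equal array.
b = eval(repr(a), {"array": array.array})

# The float conversion that the failing literals ultimately depend on.
x = float("-10000000000.0")
```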

</F>




From jcarlson at uci.edu  Mon Oct 24 10:19:23 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 24 Oct 2005 01:19:23 -0700
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <4edc17eb0510240030x74523a8fl1a9389b5054c061b@mail.gmail.com>
References: <435B5D6E.80101@gmail.com>
	<4edc17eb0510240030x74523a8fl1a9389b5054c061b@mail.gmail.com>
Message-ID: <20051024011400.38DA.JCARLSON@uci.edu>


Michele Simionato <michele.simionato at gmail.com> wrote:
> 
> On 10/23/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > Very nice indeed. I'd be more supportive if it was defined as a new statement
> > such as "create" with the syntax:
> >
> >    create TYPE NAME(ARGS):
> >      BLOCK
> 
> I like it, but it would require a new keyword. Alternatively, one
> could abuse 'def':
> 
> def  TYPE NAME(ARGS):
>       BLOCK
> 
> but then people would likely be confused as Skip was, earlier in this thread,
> so I guess 'def' is not an option.
> 
> IMHO a new keyword could be justified for such a powerful feature,
> but only Guido's opinion counts on these matters ;)
> 
> Anyway I expected people to criticize the proposal as too powerful and
> dangerously close to Lisp macros.

I would criticise it for being dangerously close to worthless.  With the
minor support code that I (and others) have offered, no new syntax is
necessary.

You can get the same semantics with...

class NAME(_(TYPE), ARGS):
    BLOCK

And a suitably defined _.  Remember, not every X line function should be
made a builtin or syntax.

 - Josiah


From michele.simionato at gmail.com  Mon Oct 24 10:33:18 2005
From: michele.simionato at gmail.com (Michele Simionato)
Date: Mon, 24 Oct 2005 08:33:18 +0000
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <20051024011400.38DA.JCARLSON@uci.edu>
References: <435B5D6E.80101@gmail.com>
	<4edc17eb0510240030x74523a8fl1a9389b5054c061b@mail.gmail.com>
	<20051024011400.38DA.JCARLSON@uci.edu>
Message-ID: <4edc17eb0510240133v26cc056em47dee26b901460b9@mail.gmail.com>

On 10/24/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> I would criticise it for being dangerously close to worthless.  With the
> minor support code that I (and others) have offered, no new syntax is
> necessary.
>
> You can get the same semantics with...
>
> class NAME(_(TYPE), ARGS):
>     BLOCK
>
> And a suitably defined _.  Remember, not every X line function should be
> made a builtin or syntax.
>
>  - Josiah

Could you re-read my original message, please? Sugar is *everything*
in this case. If the functionality is to be implemented via a __metaclass__
hook, then it should be considered a hack that nobody in his right mind
should use. OTOH, if there is a specific syntax for it, then it means
the usage has the benediction of the BDFL. This would be a HUGE change.
For instance, I would never abuse metaclasses for that, whereas I
would freely use a 'create' statement.

       Michele Simionato

From mal at egenix.com  Mon Oct 24 10:40:28 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 24 Oct 2005 10:40:28 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicit	conversions).
In-Reply-To: <50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>
Message-ID: <435C9DFC.8020501@egenix.com>

Neil Hodgson wrote:
> Guido van Rossum:
> 
> 
>>Folks, please focus on what Python 3000 should do.
>>
>>I'm thinking about making all character strings Unicode (possibly with
>>different internal representations a la NSString in Apple's Objective
>>C) and introduce a separate mutable bytes array data type. But I could
>>use some validation or feedback on this idea from actual
>>practitioners.
> 
> 
>    I'd like to more tightly define Unicode strings for Python 3000.
> Currently, Unicode strings may be implemented with either 2 byte
> (UCS-2) or 4 byte (UTF-32) elements. Python should allow strings to
> contain any Unicode character and should be indexable yielding
> characters rather than half characters. Therefore Python strings
> should appear to be UTF-32. There could still be multiple
> implementations (using UTF-16 or UTF-8) to preserve space but all
> implementations should appear to be the same apart from speed and
> memory use.

There seems to be a general misunderstanding here: even if you
have UCS4 storage, it is still possible to slice a Unicode
string in a way which makes rendering it correctly impossible.

Unicode has the concept of combining code points, e.g. you can
store an "é" (e with an accent) as "e" + "'". Now if you slice
off the accent, you'll break the character that you encoded
using combining code points.

Note that combining code points are rather common in encodings
of Asian scripts, so this is not an artificial example.
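
A short demonstration using the standard unicodedata module (Python 3 string
literals assumed), with U+0301 COMBINING ACUTE ACCENT as the combining code
point:

```python
import unicodedata

# "e" followed by a combining acute accent: two code points, one character.
decomposed = "e\u0301"

# Slicing by code point cuts the accent off the base letter.
sliced = decomposed[:1]

# NFC normalization composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# A non-zero combining class marks a code point that attaches to a base.
accent_class = unicodedata.combining("\u0301")
```

Normalizing first helps for pairs that have a precomposed form, but many
combinations have none, so it does not make code point slicing safe.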

Some time ago I proposed a new module called unicodeindex
to help with indexing. It would solve most of the indexing
issues you run into when dealing with Unicode. I've attached
it to this email for reference.

More on the used terms:

http://www.egenix.com/files/python/EuroPython2002-Python-and-Unicode.pdf
http://www.egenix.com/files/python/LSM2005-Developing-Unicode-aware-applications-in-Python.pdf

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 24 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pep-unicodeindex.txt
Url: http://mail.python.org/pipermail/python-dev/attachments/20051024/dacea951/pep-unicodeindex.txt

From walter at livinglogic.de  Mon Oct 24 11:00:42 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Mon, 24 Oct 2005 11:00:42 +0200
Subject: [Python-Dev] New codecs checked in
In-Reply-To: <435B95C0.9060005@v.loewis.de>
References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de>
Message-ID: <435CA2BA.7050900@livinglogic.de>

Martin v. Löwis wrote:

> M.-A. Lemburg wrote:
> 
>>I've checked in a whole bunch of newly generated codecs
>>which now make use of the faster charmap decoding variant added
>>by Walter a short while ago.
>>
>>Please let me know if you find any problems.
> 
> I think we should work on eliminating the decoding_map variables.
> There are some codecs which rely on them being present in other codecs
> (e.g. koi8_u.py is based on koi8_r.py); however, this could be updated
> to use, say
> 
> decoding_table = codecs.update_decoding_map(koi8_r.decoding_table, {
>          0x00a4: 0x0454, #       CYRILLIC SMALL LETTER UKRAINIAN IE
>          0x00a6: 0x0456, #       CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
>          0x00a7: 0x0457, #       CYRILLIC SMALL LETTER YI (UKRAINIAN)
>          0x00ad: 0x0491, #       CYRILLIC SMALL LETTER UKRAINIAN GHE WITH UPTURN
>          0x00b4: 0x0404, #       CYRILLIC CAPITAL LETTER UKRAINIAN IE
>          0x00b6: 0x0406, #       CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
>          0x00b7: 0x0407, #       CYRILLIC CAPITAL LETTER YI (UKRAINIAN)
>          0x00bd: 0x0490, #       CYRILLIC CAPITAL LETTER UKRAINIAN GHE WITH UPTURN
> })
> 
> With all these cross-references gone, the decoding_maps could also go.
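
For what it's worth, such a helper would be short. The sketch below is written
for Python 3 and uses a made-up name, update_decoding_table; no function like
it exists in the codecs module, and it assumes encodings.koi8_r exposes a
256-character decoding_table string:

```python
import codecs
import encodings.koi8_r as koi8_r


def update_decoding_table(base_table, overrides):
    """Return a copy of base_table with byte -> code point overrides applied."""
    chars = list(base_table)
    for byte, codepoint in overrides.items():
        chars[byte] = chr(codepoint)
    return "".join(chars)


decoding_table = update_decoding_table(koi8_r.decoding_table, {
    0xa4: 0x0454,  # CYRILLIC SMALL LETTER UKRAINIAN IE
    0xa6: 0x0456,  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    0xa7: 0x0457,  # CYRILLIC SMALL LETTER YI (UKRAINIAN)
    0xad: 0x0491,  # CYRILLIC SMALL LETTER UKRAINIAN GHE WITH UPTURN
    0xb4: 0x0404,  # CYRILLIC CAPITAL LETTER UKRAINIAN IE
    0xb6: 0x0406,  # CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
    0xb7: 0x0407,  # CYRILLIC CAPITAL LETTER YI (UKRAINIAN)
    0xbd: 0x0490,  # CYRILLIC CAPITAL LETTER UKRAINIAN GHE WITH UPTURN
})

# Decode two bytes: one overridden entry and one inherited from KOI8-R.
decoded, consumed = codecs.charmap_decode(b"\xa4\xc1", "strict", decoding_table)
```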

Why should koi8_u.py be defined in terms of koi8_r.py anyway? Why not put 
a complete decoding_table into koi8_u.py?

I'd like to suggest a small cosmetic change: gencodec.py should output 
byte values with two hexdigits instead of four. This makes it easier to 
see what is a byte value and what is a codepoint. And it would make 
grepping for stuff simpler.

I.e. change:

decoding_map.update({
     0x0080: 0x0402, #  CYRILLIC CAPITAL LETTER DJE

to

decoding_map.update({
     0x80: 0x0402, #  CYRILLIC CAPITAL LETTER DJE

and

decoding_table = (
     u'\x00' #  0x0000 -> NULL

to

decoding_table = (
     u'\x00' # 0x00 -> U+0000 NULL

and

encoding_map = {
     0x0000: 0x0000, #  NULL

to

encoding_map = {
     0x0000: 0x00, #  NULL
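
In format-string terms the change amounts to something like the following.
These two helpers are hypothetical and written only to show the widths; they
are not taken from gencodec.py:

```python
def format_decoding_entry(byte, codepoint, name):
    # Two hex digits for the byte value, four for the code point.
    return "0x%02x: 0x%04x, #  %s" % (byte, codepoint, name)


def format_encoding_entry(codepoint, byte, name):
    # Encoding maps go the other way: code point first, byte second.
    return "0x%04x: 0x%02x, #  %s" % (codepoint, byte, name)
```

With these, a byte is always recognizable by its width, which is what makes
grepping easier.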

From mal at egenix.com  Mon Oct 24 11:25:27 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 24 Oct 2005 11:25:27 +0200
Subject: [Python-Dev] New codecs checked in
In-Reply-To: <435CA2BA.7050900@livinglogic.de>
References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de>
	<435CA2BA.7050900@livinglogic.de>
Message-ID: <435CA887.3020900@egenix.com>

Walter Dörwald wrote:
> Martin v. Löwis wrote:
> 
>> M.-A. Lemburg wrote:
>>
>>> I've checked in a whole bunch of newly generated codecs
>>> which now make use of the faster charmap decoding variant added
>>> by Walter a short while ago.
>>>
>>> Please let me know if you find any problems.
>>
>>
>> I think we should work on eliminating the decoding_map variables.
>> There are some codecs which rely on them being present in other codecs
>> (e.g. koi8_u.py is based on koi8_r.py); however, this could be updated
>> to use, say
>>
>> decoding_table = codecs.update_decoding_map(koi8_r.decoding_table, {
>>          0x00a4: 0x0454, #       CYRILLIC SMALL LETTER UKRAINIAN IE
>>          0x00a6: 0x0456, #       CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
>>          0x00a7: 0x0457, #       CYRILLIC SMALL LETTER YI (UKRAINIAN)
>>          0x00ad: 0x0491, #       CYRILLIC SMALL LETTER UKRAINIAN GHE WITH UPTURN
>>          0x00b4: 0x0404, #       CYRILLIC CAPITAL LETTER UKRAINIAN IE
>>          0x00b6: 0x0406, #       CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
>>          0x00b7: 0x0407, #       CYRILLIC CAPITAL LETTER YI (UKRAINIAN)
>>          0x00bd: 0x0490, #       CYRILLIC CAPITAL LETTER UKRAINIAN GHE WITH UPTURN
>> })
>>
>> With all these cross-references gone, the decoding_maps could also go.

I just left them in because I thought they wouldn't do any harm
and might be useful in some applications.

Removing them where not directly needed by the codec would not
be a problem.

> Why should koi8_u.py be defined in terms of koi8_r.py anyway? Why not put
> a complete decoding_table into koi8_u.py?

KOI8-U is not available as mapping on ftp.unicode.org and
I only recreated codecs from the mapping files available
there.

> I'd like to suggest a small cosmetic change: gencodec.py should output
> byte values with two hexdigits instead of four. This makes it easier to
> see what is a byte value and what is a codepoint. And it would make
> grepping for stuff simpler.

True.

I'll rerun the creation with the above changes sometime this
week.

-- 
Marc-Andre Lemburg
eGenix.com


From ncoghlan at gmail.com  Mon Oct 24 11:35:19 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 24 Oct 2005 19:35:19 +1000
Subject: [Python-Dev] Definining properties - a use case for
	class	decorators?
In-Reply-To: <20051024011400.38DA.JCARLSON@uci.edu>
References: <435B5D6E.80101@gmail.com>	<4edc17eb0510240030x74523a8fl1a9389b5054c061b@mail.gmail.com>
	<20051024011400.38DA.JCARLSON@uci.edu>
Message-ID: <435CAAD7.7010909@gmail.com>

Josiah Carlson wrote:
> You can get the same semantics with...
> 
> class NAME(_(TYPE), ARGS):
>     BLOCK
> 
> And a suitably defined _.  Remember, not every X line function should be
> made a builtin or syntax.

And this would be an extremely fragile hack that is entirely dependent on the 
murky rules regarding how Python chooses the metaclass for the newly created 
class. Ensuring that the metaclass of the class returned by "_" was always the 
one chosen would be tricky at best and impossible at worst.

Even if it *could* be done, I'd never want to see a hack like that in 
production code I had anything to do with.

And while writing it with "__metaclass__" has precisely the correct semantics, 
that simply isn't as readable as a new block statement would be, nor is it as 
readable as the current major alternatives (e.g., defining and invoking a 
factory function).

An alternative to a completely new function would be to simply allow the 
metaclass to be defined up front, rather than inside the body of the class 
statement:

   class @TYPE NAME(ARGS):
       BLOCK

For example:

   class @Property x():
       def get(self):
           return self._x
       def set(self, value):
           self._x = value
       def delete(self):
           del self._x

(I put the metaclass after the keyword, because, unlike a function decorator, 
the metaclass is invoked *before* the class is created, and because you're 
only allowed one explicit metaclass)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From mal at egenix.com  Mon Oct 24 11:43:09 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 24 Oct 2005 11:43:09 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
Message-ID: <435CACAD.9070106@egenix.com>

Bengt Richter wrote:
> Please bear with me for a few paragraphs ;-)

Please note that source code encoding doesn't really have
anything to do with the way the interpreter executes the
program - it's merely a way to tell the parser how to
convert string literals (currently only the Unicode ones)
into constant Unicode objects within the program text.
It's also a nice way to let other people know what kind of
encoding you used to write your comments ;-)

Nothing more.

Once a module is compiled, there's no distinction between
a module using the latin-1 source code encoding or one using
the utf-8 encoding.
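
A small sketch of that point, assuming Python 3 (where every str literal is
Unicode, unlike the 2005 interpreter discussed here): the same one-line
module is encoded as latin-1 and as utf-8, compiled from the raw bytes, and
the resulting constants are compared. The helper name "run" is made up for
the example.

```python
# The same one-line module, as it would be stored in two different encodings.
TEMPLATE = '# -*- coding: {enc} -*-\ns = "caf\u00e9"\n'


def run(encoding):
    """Compile the source from raw bytes and return the value bound to s."""
    raw = TEMPLATE.format(enc=encoding).encode(encoding)
    namespace = {}
    # compile() honours the coding declaration when given bytes.
    exec(compile(raw, "<%s>" % encoding, "exec"), namespace)
    return namespace["s"]


latin1_value = run("latin-1")
utf8_value = run("utf-8")
```

The bytes on disk differ, but both modules end up with the same string
object contents.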

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 24 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From ncoghlan at gmail.com  Mon Oct 24 11:14:32 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 24 Oct 2005 19:14:32 +1000
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org>
References: <1130107429.11268.40.camel@geddy.wooz.org>
Message-ID: <435CA5F8.7010607@gmail.com>

Barry Warsaw wrote:
> I've had this PEP laying around for quite a few months.  It was inspired
> by some code we'd written which wanted to be able to get immutable
> versions of arbitrary objects.  I've finally finished the PEP, uploaded
> a sample patch (albeit a bit incomplete), and I'm posting it here to see
> if there is any interest.
> 
> http://www.python.org/peps/pep-0351.html

I think it's definitely worth considering. It may also reduce the need for "x" 
and "frozenx" builtin pairs. We already have "set" and "frozenset", and the 
various "bytes" ideas that have been kicked around have generally considered 
the need for a "frozenbytes" as well.

If freeze was available, then "freeze(x(*args))" might serve as a replacement 
for any builtin "frozen" variants.

I think having dicts and sets automatically invoke freeze would be a mistake, 
because at least one of the following two cases would behave unexpectedly:

   d = {}
   l = []
   d[l] = "Oops!"
   d[l] # Raises KeyError if freeze() isn't also invoked in __getitem__

   d = {}
   l = []
   d[l] = "Oops!"
   l.append(1)
   d[l] # Raises KeyError regardless

Oh, and the PEP's xdict example is even more broken than the PEP implies, 
because two imdicts which compare equal (same contents) may not hash equal 
(different id's).
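
The second case can be reproduced with a toy model. This is only a sketch in
Python 3: "freeze" here is a hypothetical stand-in that turns lists into
tuples, and "AutoFreezeDict" is an illustrative class, not anything from the
PEP's patch.

```python
def freeze(obj):
    """Hypothetical stand-in for PEP 351's freeze(): lists become tuples."""
    if isinstance(obj, list):
        return tuple(freeze(item) for item in obj)
    return obj


class AutoFreezeDict(dict):
    """Illustrative dict that freezes keys on both store and lookup."""

    def __setitem__(self, key, value):
        super().__setitem__(freeze(key), value)

    def __getitem__(self, key):
        return super().__getitem__(freeze(key))


d = AutoFreezeDict()
l = []
d[l] = "Oops!"
found_before = d[l]      # works: the key is stored and looked up as ()

l.append(1)
try:
    d[l]                 # looks up (1,), but the stored key is still ()
    found_after = True
except KeyError:
    found_after = False
```

Even with freeze() applied on lookup, mutating the original list silently
changes which key is being asked for.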

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From radeex at gmail.com  Mon Oct 24 11:54:20 2005
From: radeex at gmail.com (Christopher Armstrong)
Date: Mon, 24 Oct 2005 20:54:20 +1100
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <20051023194708.38D7.JCARLSON@uci.edu>
References: <1130107429.11268.40.camel@geddy.wooz.org>
	<20051023194708.38D7.JCARLSON@uci.edu>
Message-ID: <60ed19d40510240254h7e077a74hf719abcf6e5a4ad@mail.gmail.com>

On 10/24/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> "Should dicts and sets automatically freeze their mutable keys?"
>
> Dictionaries don't have mutable keys,

Since when?

class Foo:
    def __init__(self):
        self.x = 1

f = Foo()
d = {f: 1}
f.x = 2

Maybe you meant something else? I can't think of any way in which
"dictionaries don't have mutable keys" is true. The only rule about
dictionary keys that I know of is that they need to be hashable and
need to be comparable with the equality operator.

--
  Twisted   |  Christopher Armstrong: International Man of Twistery
   Radix    |    -- http://radix.twistedmatrix.com
            |  Release Manager, Twisted Project
  \\\V///   |    -- http://twistedmatrix.com
   |o O|    |
w----v----w-+

From jcarlson at uci.edu  Mon Oct 24 12:09:07 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 24 Oct 2005 03:09:07 -0700
Subject: [Python-Dev] Definining properties - a use case for
	class	decorators?
In-Reply-To: <435CAAD7.7010909@gmail.com>
References: <20051024011400.38DA.JCARLSON@uci.edu> <435CAAD7.7010909@gmail.com>
Message-ID: <20051024025358.38E5.JCARLSON@uci.edu>


Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> Josiah Carlson wrote:
> > You can get the same semantics with...
> > 
> > class NAME(_(TYPE), ARGS):
> >     BLOCK
> > 
> > And a suitably defined _.  Remember, not every X line function should be
> > made a builtin or syntax.
> 
> And this would be an extremely fragile hack that is entirely dependent on the 
> murky rules regarding how Python chooses the metaclass for the newly created 
> class. Ensuring that the metaclass of the class returned by "_" was always the 
> one chosen would be tricky at best and impossible at worst.

The rules for which metaclass is used are listed in the metaclass
documentation.  I personally never claimed it was perfect, and neither
is this one...

class NAME(_(TYPE, ARGS)):
    BLOCK

But it does solve the problem without needing syntax (and fixes any
possible metaclass order choices).
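
One possible "suitably defined _" is sketched below, assuming Python 3 and
assuming that TYPE is a callable taking the class body's namespace plus any
extra arguments. The names "make_property" and "Point" are made up for the
example.

```python
def _(factory, *args):
    """Hypothetical helper: return a base class whose 'subclasses' are
    replaced by factory(namespace, *args)."""

    class NamespacePasser(type):
        def __new__(mcls, name, bases, namespace):
            if not bases:
                # Building the placeholder base class itself.
                return super().__new__(mcls, name, bases, namespace)
            # A class statement using the placeholder as its only base:
            # hand the body's namespace to the factory instead.
            return factory(dict(namespace), *args)

    return NamespacePasser("_Base", (), {})


def make_property(namespace):
    """Illustrative TYPE: build a property from get/set/delete."""
    return property(
        namespace.get("get"),
        namespace.get("set"),
        namespace.get("delete"),
    )


class Point(object):
    def __init__(self):
        self._x = 0

    class x(_(make_property)):
        def get(self):
            return self._x

        def set(self, value):
            self._x = value
```

Because the placeholder is the only base, its metaclass is the one used,
which sidesteps the ordering question raised above.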


> Even if it *could* be done, I'd never want to see a hack like that in 
> production code I had anything to do with.

That's perfectly reasonable.


> (I put the metaclass after the keyword, because, unlike a function decorator, 
> the metaclass is invoked *before* the class is created, and because you're 
> only allowed one explicit metaclass)

Perhaps, but because the metaclass can return anything (in this case, it
returns a property), being able to modify the object that is created may
be desirable...at which point, we may as well get class decorators for
the built-in chaining.

 - Josiah


From jcarlson at uci.edu  Mon Oct 24 12:15:17 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 24 Oct 2005 03:15:17 -0700
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <60ed19d40510240254h7e077a74hf719abcf6e5a4ad@mail.gmail.com>
References: <20051023194708.38D7.JCARLSON@uci.edu>
	<60ed19d40510240254h7e077a74hf719abcf6e5a4ad@mail.gmail.com>
Message-ID: <20051024030952.38E8.JCARLSON@uci.edu>


Christopher Armstrong <radeex at gmail.com> wrote:
> 
> On 10/24/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> > "Should dicts and sets automatically freeze their mutable keys?"
> >
> > Dictionaries don't have mutable keys,
> 
> Since when?
> 
> Maybe you meant something else? I can't think of any way in which
> "dictionaries don't have mutable keys" is true. The only rule about
> dictionary keys that I know of is that they need to be hashable and
> need to be comparable with the equality operator.

Good point, I forgot about user-defined classes (I rarely use them as
keys myself, it's all too easy to make a mutable whose hash is dependent
on mutable contents, as having an object which you can only find if you
have the exact object is not quite as useful as I generally need).  I will,
however, stand by, "a container which is frozen should have its contents
frozen as well."

 - Josiah


From walter at livinglogic.de  Mon Oct 24 12:17:31 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Mon, 24 Oct 2005 12:17:31 +0200
Subject: [Python-Dev] New codecs checked in
In-Reply-To: <435CA887.3020900@egenix.com>
References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de>
	<435CA2BA.7050900@livinglogic.de> <435CA887.3020900@egenix.com>
Message-ID: <435CB4BB.6070009@livinglogic.de>

M.-A. Lemburg wrote:

> Walter Dörwald wrote:
> 
>>Martin v. Löwis wrote:
>>
>>>M.-A. Lemburg wrote:
>>>
>>>>I've checked in a whole bunch of newly generated codecs
>>>>which now make use of the faster charmap decoding variant added
>>>>by Walter a short while ago.
>>>>
>>>>Please let me know if you find any problems.
>>>
>>>I think we should work on eliminating the decoding_map variables.
>>>There are some codecs which rely on them being present in other codecs
>>>(e.g. koi8_u.py is based on koi8_r.py); however, this could be updated
>>>to use, say
>>>
>>>decoding_table = codecs.update_decoding_map(koi8_r.decoding_table, {
>>>         0x00a4: 0x0454, #       CYRILLIC SMALL LETTER UKRAINIAN IE
>>>         0x00a6: 0x0456, #       CYRILLIC SMALL LETTER
>>>BYELORUSSIAN-UKRAINIAN I
>>>         0x00a7: 0x0457, #       CYRILLIC SMALL LETTER YI (UKRAINIAN)
>>>         0x00ad: 0x0491, #       CYRILLIC SMALL LETTER UKRAINIAN GHE
>>>WITH UPTURN
>>>         0x00b4: 0x0404, #       CYRILLIC CAPITAL LETTER UKRAINIAN IE
>>>         0x00b6: 0x0406, #       CYRILLIC CAPITAL LETTER
>>>BYELORUSSIAN-UKRAINIAN I
>>>         0x00b7: 0x0407, #       CYRILLIC CAPITAL LETTER YI (UKRAINIAN)
>>>         0x00bd: 0x0490, #       CYRILLIC CAPITAL LETTER UKRAINIAN GHE
>>>WITH UPTURN
>>>})
>>>
>>>With all these cross-references gone, the decoding_maps could also go.
> 
> I just left them in because I thought they wouldn't do any harm
> and might be useful in some applications.
 >
> Removing them where not directly needed by the codec would not
> be a problem.

Recreating them is quite simple via dict(enumerate(decoding_table)) so I 
think we should remove them.

>>Why should koi_u.py be defined in terms of koi8_r.py anyway? Why not put
>>a complete decoding_table into koi8_u.py?
> 
> KOI8-U is not available as mapping on ftp.unicode.org and
> I only recreated codecs from the mapping files available
> there.

OK, so we'd need something that creates a new decoding table from an old 
one + changes, i.e. something like:

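def update_decoding_table(table, new):
    # table is a unicode string indexed by byte value; new maps
    # byte values to replacement code points
    table = list(table)
    for (key, value) in new.iteritems():
        table[key] = unichr(value)
    return u"".join(table)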
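
For readers on Python 3, a rough port of the same idea is sketched below,
applied to the KOI8-R table that ships in the standard library. Only two of
the KOI8-U differences quoted earlier in the thread are filled in, so the
result is deliberately called "koi8_u_like" rather than a complete KOI8-U
table.

```python
import codecs
import encodings.koi8_r as koi8_r


def update_decoding_table(table, changes):
    """Return a copy of a decoding table (a str indexed by byte value)
    with some byte -> code point entries replaced."""
    chars = list(table)
    for byte, codepoint in changes.items():
        chars[byte] = chr(codepoint)
    return "".join(chars)


# Two of the KOI8-U differences from the list quoted above.
koi8_u_like = update_decoding_table(koi8_r.decoding_table, {
    0xA4: 0x0454,  # CYRILLIC SMALL LETTER UKRAINIAN IE
    0xB4: 0x0404,  # CYRILLIC CAPITAL LETTER UKRAINIAN IE
})

# An old-style decoding map can be rebuilt from the table on demand.
decoding_map = dict(enumerate(koi8_u_like))

decoded, consumed = codecs.charmap_decode(b"\xa4\xb4", "strict", koi8_u_like)
```

The rebuilt map holds characters rather than integer code points; wrap the
values in ord() if integers are wanted.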

>>I'd like to suggest a small cosmetic change: gencodec.py should output
>>byte values with two hexdigits instead of four. This makes it easier to
>>see what is a byte values and what is a codepoint. And it would make
>>grepping for stuff simpler.
> 
> True.
> 
> I'll rerun the creation with the above changes sometime this
> week.

Great, thanks!

Bye,
    Walter Dörwald

From mwh at python.net  Mon Oct 24 12:24:34 2005
From: mwh at python.net (Michael Hudson)
Date: Mon, 24 Oct 2005 11:24:34 +0100
Subject: [Python-Dev] Definining properties - a use case for
	class	decorators?
In-Reply-To: <435CAAD7.7010909@gmail.com> (Nick Coghlan's message of "Mon,
	24 Oct 2005 19:35:19 +1000")
References: <435B5D6E.80101@gmail.com>
	<4edc17eb0510240030x74523a8fl1a9389b5054c061b@mail.gmail.com>
	<20051024011400.38DA.JCARLSON@uci.edu> <435CAAD7.7010909@gmail.com>
Message-ID: <2m4q776wm5.fsf@starship.python.net>

Nick Coghlan <ncoghlan at gmail.com> writes:

> Josiah Carlson wrote:
>> You can get the same semantics with...
>> 
>> class NAME(_(TYPE), ARGS):
>>     BLOCK
>> 
>> And a suitably defined _.  Remember, not every X line function should be
>> made a builtin or syntax.
>
> And this would be an extremely fragile hack that is entirely
> dependent on the murky rules regarding how Python chooses the
> metaclass for the newly created class.

Uh, not really.  In the presence of base classes it's always "the type
of the first base".  The reason it might not seem this simple is that
most metaclasses end up calling type.__new__ at some point and this
function does more complicated things (such as checking for metaclass
conflict and deferring to the most specific metaclass).  
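
The "most specific metaclass" and "metaclass conflict" behaviour can be seen
in a few lines. The sketch uses Python 3 syntax, so it illustrates today's
rules rather than the exact 2.4 code path described above; all class names
are made up.

```python
class MetaA(type):
    pass


class MetaB(MetaA):      # more specific than MetaA
    pass


class MetaX(type):       # unrelated to MetaA and MetaB
    pass


class A(metaclass=MetaA):
    pass


class B(metaclass=MetaB):
    pass


class X(metaclass=MetaX):
    pass


class C(A, B):           # first base has MetaA, but MetaB is more specific
    pass


try:
    class Bad(A, X):     # MetaA and MetaX cannot be reconciled
        pass
    conflict = None
except TypeError as exc:
    conflict = exc
```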

Not sure what the context is here, but I have to butt in when I see
people complicating things which aren't actually that complicated...

Cheers,
mwh

-- 
  There's an aura of unholy black magic about CLISP.  It works, but
  I have no idea how it does it.  I suspect there's a goat involved
  somewhere.                     -- Johann Hibschman, comp.lang.scheme

From jcarlson at uci.edu  Mon Oct 24 12:54:04 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 24 Oct 2005 03:54:04 -0700
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <435CA5F8.7010607@gmail.com>
References: <1130107429.11268.40.camel@geddy.wooz.org>
	<435CA5F8.7010607@gmail.com>
Message-ID: <20051024034957.38EB.JCARLSON@uci.edu>


Nick Coghlan <ncoghlan at gmail.com> wrote:
> I think having dicts and sets automatically invoke freeze would be a mistake, 
> because at least one of the following two cases would behave unexpectedly:

I'm pretty sure that the PEP was only asking if one would freeze the
contents of dicts IF the dict was being frozen.

That is, which of the following should be the case:
    freeze({1:[2,3,4]}) -> {1:[2,3,4]}
    freeze({1:[2,3,4]}) -> xdict(1=(2,3,4))
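
The practical difference between the two answers shows up as soon as the
result is hashed. A minimal sketch in Python 3, with "deep_freeze" and
"shallow_freeze" as hypothetical helpers (neither is the PEP's
implementation, and a frozenset of items stands in for a frozen dict):

```python
def deep_freeze(obj):
    """Hypothetical recursive freeze: contents are frozen as well."""
    if isinstance(obj, dict):
        return frozenset((key, deep_freeze(value)) for key, value in obj.items())
    if isinstance(obj, (list, tuple)):
        return tuple(deep_freeze(item) for item in obj)
    if isinstance(obj, (set, frozenset)):
        return frozenset(deep_freeze(item) for item in obj)
    return obj


def shallow_freeze(obj):
    """Freeze only the outer container, leaving the contents alone."""
    if isinstance(obj, dict):
        return tuple(obj.items())
    return obj


deep = deep_freeze({1: [2, 3, 4]})
shallow = shallow_freeze({1: [2, 3, 4]})

try:
    hash(shallow)
    shallow_hashable = True
except TypeError:
    # The list inside is still mutable and therefore unhashable.
    shallow_hashable = False
```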

 - Josiah


From jcarlson at uci.edu  Mon Oct 24 12:54:55 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 24 Oct 2005 03:54:55 -0700
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <4edc17eb0510240133v26cc056em47dee26b901460b9@mail.gmail.com>
References: <20051024011400.38DA.JCARLSON@uci.edu>
	<4edc17eb0510240133v26cc056em47dee26b901460b9@mail.gmail.com>
Message-ID: <20051024020601.38DF.JCARLSON@uci.edu>


Michele Simionato <michele.simionato at gmail.com> wrote:
> 
> On 10/24/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> > I would criticise it for being dangerously close to worthless.  With the
> > minor support code that I (and others) have offered, no new syntax is
> > necessary.
> >
> > You can get the same semantics with...
> >
> > class NAME(_(TYPE), ARGS):
> >     BLOCK
> >
> > And a suitably defined _.  Remember, not every X line function should be
> > made a builtin or syntax.
> >
> >  - Josiah
> 
> Could you re-read my original message, please? Sugar is *everything*
> in this case. If the functionality is to be implemented via a __metaclass__
> hook, then it should be considered a hack that nobody in his right mind
> should use. OTOH, if there is a specific syntax for it, then it means
> this the usage
> has the benediction of the BDFL. This would be a HUGE change.
> For instance, I would never abuse metaclasses for that, whereas I
> would freely use a 'create' statement.

Metaclass abuse?  Oh, I'm sorry, I thought that the point of metaclasses
was to offer a way to make "magic" happen in a somewhat pragmatic
manner, you know, through metaprogramming.  I would call this particular
use a practical application of standard Python semantics.

Pardon me while I attempt to re-parse your above statement...
"If there is a specific syntax for [passing a temporary namespace to a
callable, created by some sort of block mechanism], then [using it for
property creation] has the benediction of the BDFL".

What I'm trying to say is that it already has a no-syntax syntax.  It
uses the "magic" of metaclasses, but one can make that "magic" as
explicit as necessary.

class NAME(PassNamespaceFromClassBlock(fcn=TYPE, args=ARGS)):
    BLOCK


Personally, I've not seen the desire to pass temporary namespaces to
functions until recently, so whether or not people will use it for
property creation, or any other way that people would find interesting
and/or useful, is at least a bit of prediction.  Maybe people will
prefer to use property('get_foo', 'set_foo', 'del_foo'), who knows?  But
you know what?  Regardless of what people want, they can use metaclasses
right now to create properties, where they would have to wait until
Python 2.5 comes out before they could use this proposed 'create'
statement.


 - Josiah


From ronaldoussoren at mac.com  Mon Oct 24 13:13:45 2005
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Mon, 24 Oct 2005 13:13:45 +0200
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <20051024020601.38DF.JCARLSON@uci.edu>
References: <20051024011400.38DA.JCARLSON@uci.edu>
	<4edc17eb0510240133v26cc056em47dee26b901460b9@mail.gmail.com>
	<20051024020601.38DF.JCARLSON@uci.edu>
Message-ID: <85DC3622-F572-4E00-92A0-3781B7AC7EB0@mac.com>


On 24-okt-2005, at 12:54, Josiah Carlson wrote:


>>
>
> Metaclass abuse?  Oh, I'm sorry, I thought that the point of  
> metaclasses
> were to offer a way to make "magic" happen in a somewhat pragmatic
> manner, you know, through metaprogramming.  I would call this  
> particular
> use a practical application of standard Python semantics.


I'd say using a class statement to define a property is metaclass
abuse, as would anything that wouldn't define something class-like.
The same is true for other constructs: using a decorator to define
something that is not a callable would IMHO also be abuse.

That said, I don't really have an opinion on the 'create' statement
proposal yet. It does seem to have a very limited field of use. I'm
quite happy with using property as it is; property('get_foo', 'set_foo')
would take away most if not all of the remaining problems.
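
The builtin property takes function objects, not names, so the string-based
form has to be emulated. One plausible reading, sketched in Python 3 under
the assumption that the point is late lookup (so subclasses can override the
accessors); "named_property" is a made-up name:

```python
def named_property(get_name, set_name=None, del_name=None, doc=None):
    """Hypothetical property variant that looks accessors up by name."""

    def fget(self):
        return getattr(self, get_name)()

    def fset(self, value):
        getattr(self, set_name)(value)

    def fdel(self):
        getattr(self, del_name)()

    return property(
        fget,
        fset if set_name else None,
        fdel if del_name else None,
        doc,
    )


class Base(object):
    def __init__(self):
        self._foo = 1

    def get_foo(self):
        return self._foo

    def set_foo(self, value):
        self._foo = value

    foo = named_property("get_foo", "set_foo")


class Doubling(Base):
    # Overriding the accessor is enough; foo itself is not redefined.
    def get_foo(self):
        return self._foo * 2
```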

Ronald


From mal at egenix.com  Mon Oct 24 13:23:14 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 24 Oct 2005 13:23:14 +0200
Subject: [Python-Dev] KOI8_U (New codecs checked in)
In-Reply-To: <435CB4BB.6070009@livinglogic.de>
References: <435965E4.5050207@egenix.com>
	<435B95C0.9060005@v.loewis.de>	<435CA2BA.7050900@livinglogic.de>
	<435CA887.3020900@egenix.com> <435CB4BB.6070009@livinglogic.de>
Message-ID: <435CC422.8070600@egenix.com>

Walter Dörwald wrote:
>>> Why should koi_u.py be defined in terms of koi8_r.py anyway? Why not put
>>> a complete decoding_table into koi8_u.py?
>>
>>
>> KOI8-U is not available as mapping on ftp.unicode.org and
>> I only recreated codecs from the mapping files available
>> there.
> 
> 
> OK, so we'd need something that creates a new decoding table from an old
> one + changes, i.e. something like:
> 
> def update_decoding_table(table, new):
>    table = list[table]
>    for (key, value) in new.iteritems():
>       table[key] = unichr(value)
>    return u"".join(table)

Actually, I'd rather have some official mapping files
for these.

Perhaps we could get someone to upload a mapping file
for KOI8_U to the Unicode site ?!

The mapping is defined in RFC2319:

    http://www.faqs.org/rfcs/rfc2319.html

I've put Alexander Yeremenko, the coordinator of
the KOI8-U group on CC.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com


From mal at egenix.com  Mon Oct 24 13:37:05 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 24 Oct 2005 13:37:05 +0200
Subject: [Python-Dev] KOI8_U (New codecs checked in)
In-Reply-To: <435CC422.8070600@egenix.com>
References: <435965E4.5050207@egenix.com>	<435B95C0.9060005@v.loewis.de>	<435CA2BA.7050900@livinglogic.de>	<435CA887.3020900@egenix.com>
	<435CB4BB.6070009@livinglogic.de> <435CC422.8070600@egenix.com>
Message-ID: <435CC761.8000006@egenix.com>



M.-A. Lemburg wrote:
> Walter Dörwald wrote:
> 
>>>>Why should koi_u.py be defined in terms of koi8_r.py anyway? Why not put
>>>>a complete decoding_table into koi8_u.py?
>>>
>>>
>>>KOI8-U is not available as mapping on ftp.unicode.org and
>>>I only recreated codecs from the mapping files available
>>>there.
>>
>>
>>OK, so we'd need something that creates a new decoding table from an old
>>one + changes, i.e. something like:
>>
>>def update_decoding_table(table, new):
>>   table = list[table]
>>   for (key, value) in new.iteritems():
>>      table[key] = unichr(value)
>>   return u"".join(table)
> 
> 
> Actually, I'd rather have some official mapping files
> for these.
> 
> Perhaps we could get someone to upload a mapping file
> for KOI8_U to the Unicode site ?!
> 
> The mapping is defined in RFC2319:
> 
>     http://www.faqs.org/rfcs/rfc2319.html
> 
> I've put Alexander Yeremenko, the coordinator of
> the KOI8-U group on CC.

Hmm, that email address bounces. I've now put Maxim
on CC: Maxim Dzumanenko <mvd at mylinux.com.ua>

Here's a mapping file for KOI8-U - please check whether
it's correct.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: KOI8-U.TXT
Url: http://mail.python.org/pipermail/python-dev/attachments/20051024/362a6093/KOI8-U-0001.asc

From michele.simionato at gmail.com  Mon Oct 24 13:41:18 2005
From: michele.simionato at gmail.com (Michele Simionato)
Date: Mon, 24 Oct 2005 11:41:18 +0000
Subject: [Python-Dev] Definining properties - a use case for class
	decorators?
In-Reply-To: <85DC3622-F572-4E00-92A0-3781B7AC7EB0@mac.com>
References: <20051024011400.38DA.JCARLSON@uci.edu>
	<4edc17eb0510240133v26cc056em47dee26b901460b9@mail.gmail.com>
	<20051024020601.38DF.JCARLSON@uci.edu>
	<85DC3622-F572-4E00-92A0-3781B7AC7EB0@mac.com>
Message-ID: <4edc17eb0510240441n7f02cfbcvdc8bdd171f0b4cf6@mail.gmail.com>

On 10/24/05, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
> I'd say using a class statement to define a property is metaclass
> abuse, as would
> anything that wouldn't define something class-like. The same is true
> for other
> constructs, using an decorator to define something that is not a
> callable would IMHO
> also be abuse.

+1

> That said, I really have an opinion on the 'create' statement
> proposal yet. It
> does seem to have a very limited field of use.

This is definitely not true. The 'create' statement would have lots of
applications. Off the top of my head I can think of 'create' applied to:

- bunches;
- modules;
- interfaces;
- properties;
- usage in frameworks, for instance providing sugar for
  Object-Relational mappers, or for making templates
  (e.g. a create HTMLPage);
- building custom minilanguages;
- ...

This is why I see a 'create' statement as a frighteningly powerful
addition to the language.

     Michele Simionato

From mal at egenix.com  Mon Oct 24 13:43:52 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 24 Oct 2005 13:43:52 +0200
Subject: [Python-Dev] KOI8_U (New codecs checked in)
In-Reply-To: <435CC761.8000006@egenix.com>
References: <435965E4.5050207@egenix.com>	<435B95C0.9060005@v.loewis.de>	<435CA2BA.7050900@livinglogic.de>	<435CA887.3020900@egenix.com>	<435CB4BB.6070009@livinglogic.de>
	<435CC422.8070600@egenix.com> <435CC761.8000006@egenix.com>
Message-ID: <435CC8F8.1070102@egenix.com>

M.-A. Lemburg wrote:
> Here's a mapping file for KOI9-U - please check whether
> it's correct.

I missed one code point change: 0xB4.

Here's the updated version which matches the codec we currently
have in Python 2.3 and 2.4.

-- 
Marc-Andre Lemburg
eGenix.com

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: KOI8-U.TXT
Url: http://mail.python.org/pipermail/python-dev/attachments/20051024/4b51ccf8/KOI8-U.pot

From ncoghlan at gmail.com  Mon Oct 24 13:53:00 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 24 Oct 2005 21:53:00 +1000
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <ca471dc20510231658qb8e14ddkb7956960cfa507d7@mail.gmail.com>
References: <435A4598.3060403@iinet.net.au>	
	<ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>	
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>	
	<435B597C.6040300@gmail.com>	
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
	<ca471dc20510231658qb8e14ddkb7956960cfa507d7@mail.gmail.com>
Message-ID: <435CCB1C.4030108@gmail.com>

Guido van Rossum wrote:
> Right. That was my point. Nick's worried about undecorated __context__
> because he wants to endow generators with a different default
> __context__. I say no to both proposals and the worries cancel each
> other out. EIBTI.

Works for me.

That makes the resolutions for the posted issues:

    1. The slot name "__context__" will be used instead of "__with__"
    2. The builtin name "context" is currently off limits due to its ambiguity
    3a. generator-iterators do NOT have a native context
    3b. Use "contextmanager" as a builtin decorator to get generator-contexts
    4. The __context__ slot will NOT be special cased

I'll add those into the PEP and reference this thread after Martin is done 
with the SVN migration.

However, those resolutions bring up the following issues:

   5 a. What exception is raised when EXPR does not have a __context__ method?
     b.  What about when the returned object is missing __enter__ or __exit__?
    I suggest raising TypeError in both cases, for symmetry with for loops.
    The slot check is made in C code, so I don't see any difficulty in raising
    TypeError instead of AttributeError if the relevant slots aren't filled.

   6 a. Should a generic "closing" context manager be provided?
     b. If yes, should it be a builtin or in a "contexttools" module?
    I'm not too worried about this one for the moment, and it could easily be
    left out of the PEP itself. Of the sample managers, it seems the most
    universally useful, though.
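
For reference, a generator-based "closing" manager along the lines of the
PEP's sample can be written in a handful of lines. The sketch uses the
contextmanager decorator from contextlib, which is where it lives in
released Python rather than in the builtins; "Resource" is a made-up class
for the demonstration.

```python
from contextlib import contextmanager


@contextmanager
def closing(thing):
    """Yield thing and call thing.close() when the block is left."""
    try:
        yield thing
    finally:
        thing.close()


class Resource(object):
    """Illustrative object with a close() method."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


with closing(Resource()) as res:
    was_open_inside = not res.closed
```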

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From mal at egenix.com  Mon Oct 24 14:09:10 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 24 Oct 2005 14:09:10 +0200
Subject: [Python-Dev] New codecs checked in
In-Reply-To: <435CB4BB.6070009@livinglogic.de>
References: <435965E4.5050207@egenix.com>
	<435B95C0.9060005@v.loewis.de>	<435CA2BA.7050900@livinglogic.de>
	<435CA887.3020900@egenix.com> <435CB4BB.6070009@livinglogic.de>
Message-ID: <435CCEE6.6020005@egenix.com>

Walter Dörwald wrote:
>>>I'd like to suggest a small cosmetic change: gencodec.py should output
>>>byte values with two hexdigits instead of four. This makes it easier to
>>>see what is a byte values and what is a codepoint. And it would make
>>>grepping for stuff simpler.
>>
>>True.
>>
>>I'll rerun the creation with the above changes sometime this
>>week.
> 
> 
> Great, thanks!

Done.

I had to create three custom mapping files for cp1140, koi8-u
and tis-620.

If you want more non-standard charmap codecs converted, please
send me the mapping files in the Unicode standard format for
these files.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com


From paolo_veronelli at libero.it  Mon Oct 24 17:09:53 2005
From: paolo_veronelli at libero.it (Paolino)
Date: Mon, 24 Oct 2005 17:09:53 +0200
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org>
References: <1130107429.11268.40.camel@geddy.wooz.org>
Message-ID: <435CF941.6070104@libero.it>

I'm not sure I completely understood the idea, but deriving the freeze 
function from hash gives hash a wider importance.
Is __hash__=id inside a class enough to use an instance of a set 
(sets.Set before 2.4) derived class as a key to a mapping?
I have surely missed the point.
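
A quick experiment (Python 3 syntax, "IdSet" is a made-up name) suggests the
answer is "yes, but only for lookups with the very same object":

```python
class IdSet(set):
    """Illustrative set subclass that hashes by identity."""

    def __hash__(self):
        return id(self)


a = IdSet([1, 2])
b = IdSet([1, 2])
mapping = {a: "value"}

same_object_found = a in mapping
# a == b because the contents match, but the hashes differ, so the
# equal-looking key is never compared and is not found.
equal_object_found = b in mapping
```

The instance stays mutable as well, so nothing has actually been frozen.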


Regards Paolino


From guido at python.org  Mon Oct 24 16:47:47 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 24 Oct 2005 07:47:47 -0700
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <435CCB1C.4030108@gmail.com>
References: <435A4598.3060403@iinet.net.au>
	<ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>
	<435B597C.6040300@gmail.com>
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
	<ca471dc20510231658qb8e14ddkb7956960cfa507d7@mail.gmail.com>
	<435CCB1C.4030108@gmail.com>
Message-ID: <ca471dc20510240747v33cfd354qc104129d6f9f90a1@mail.gmail.com>

On 10/24/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> That makes the resolutions for the posted issues:
>
>     1. The slot name "__context__" will be used instead of "__with__"
>     2. The builtin name "context" is currently offlimits due to its ambiguity
>     3a. generator-iterators do NOT have a native context
>     3b. Use "contextmanager" as a builtin decorator to get generator-contexts
>     4. The __context__ slot will NOT be special cased

+1

> I'll add those into the PEP and reference this thread after Martin is done
> with the SVN migration.
>
> However, those resolutions bring up the following issues:
>
>    5 a. What exception is raised when EXPR does not have a __context__ method?
>      b.  What about when the returned object is missing __enter__ or __exit__?
>     I suggest raising TypeError in both cases, for symmetry with for loops.
>     The slot check is made in C code, so I don't see any difficulty in raising
>     TypeError instead of AttributeError if the relevant slots aren't filled.

Why are you so keen on TypeError? I find AttributeError totally
appropriate. I don't see symmetry with for-loops as a valuable
property here. AttributeError and TypeError are often interchangeable
anyway.
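
Which exception a given interpreter raises is easy to check. The sketch
below deliberately accepts either type, since released Python versions have
not all made the same choice; "NotAManager" is a made-up class.

```python
class NotAManager(object):
    """Has neither __enter__ nor __exit__."""


try:
    with NotAManager():
        pass
    raised = None
except (TypeError, AttributeError) as exc:
    # Record which of the two interchangeable-looking exceptions we got.
    raised = type(exc)
```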

>    6 a. Should a generic "closing" context manager be provided?

No. Let's provide the minimal mechanisms FIRST.

>      b. If yes, should it be a builtin or in a "contexttools" module?
>     I'm not too worried about this one for the moment, and it could easily be
>     left out of the PEP itself. Of the sample managers, it seems the most
>     universally useful, though.

Let's leave some examples just be examples.

I think I'm leaning towards adding __context__ to locks (all types
defined in thread or threading, including condition variables), files,
and decimal.Context, and leave it at that.
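
As a sketch of what that looks like from the user's side, here are the three
kinds of object in with statements, written against the API of released
Python 3. Note the assumption for decimal: the released interface goes
through decimal.localcontext() rather than using a Context object directly.

```python
import decimal
import os
import tempfile
import threading

# Locks: acquired on entry, released on exit.
lock = threading.Lock()
with lock:
    held_inside = lock.locked()

# Files: closed on exit.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as handle:
    handle.write("hello")
os.remove(path)

# Decimal contexts: changes are undone on exit.
outer_prec = decimal.getcontext().prec
with decimal.localcontext() as ctx:
    ctx.prec = 5
    inner_prec = decimal.getcontext().prec
```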

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gary at modernsongs.com  Mon Oct 24 17:04:52 2005
From: gary at modernsongs.com (Gary Poster)
Date: Mon, 24 Oct 2005 11:04:52 -0400
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org>
References: <1130107429.11268.40.camel@geddy.wooz.org>
Message-ID: <7F3FCEC2-CD0D-4EAC-9C2B-77ECB4A0B73B@modernsongs.com>


On Oct 23, 2005, at 6:43 PM, Barry Warsaw wrote:

> I've had this PEP laying around for quite a few months.  It was inspired
> by some code we'd written which wanted to be able to get immutable
> versions of arbitrary objects.  I've finally finished the PEP, uploaded
> a sample patch (albeit a bit incomplete), and I'm posting it here to see
> if there is any interest.
>
> http://www.python.org/peps/pep-0351.html

I like this.  I'd like it better if it integrated with the adapter  
PEP, so that the freezing mechanism for a given type could be  
pluggable, and could be provided even if the original object did not  
contemplate it.  I don't know where the adapter PEP stands: skimming  
through the (most recent?) thread in January didn't give me a clear  
idea.

As another poster mentioned, in-place freezing is also of interest to  
me (and why I read the PEP initially), but as also mentioned,  
that's probably unrelated to your PEP.

Gary

From Alan.McIntyre at esrgtech.com  Mon Oct 24 17:27:08 2005
From: Alan.McIntyre at esrgtech.com (Alan McIntyre)
Date: Mon, 24 Oct 2005 11:27:08 -0400
Subject: [Python-Dev] int(string)
In-Reply-To: <dji37o$fl3$1@sea.gmane.org>
References: <1f7befae0510211952x5eb2000bicdf3c1a80a3f5749@mail.gmail.com>	<2mfyqu6j55.fsf@starship.python.net>	<aac2c7cb0510220949g5c60dcfdp89ca2d94bdd041d4@mail.gmail.com><1f7befae0510221038u4bd75ca5l67dc930ae16d24b7@mail.gmail.com>	<435C747E.9060005@esrgtech.com>
	<dji37o$fl3$1@sea.gmane.org>
Message-ID: <435CFD4C.1090200@esrgtech.com>

Fredrik Lundh wrote:

>does a plain
>
>    a = -10000000000.0
>
>still work on your machine?
>
D'oh - I seriously broke something, then, because it didn't. 
funny_falcon commented on the patch in SF and suggested a change that
took care of that.  I've uploaded the corrected version of the patch,
which now passes all the tests.

Alan

From raymond.hettinger at verizon.net  Mon Oct 24 17:49:35 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Mon, 24 Oct 2005 11:49:35 -0400
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org>
Message-ID: <003201c5d8b2$88c64280$e135c797@oemcomputer>

[Barry Warsaw]
> I've had this PEP laying around for quite a few months.  It was inspired
> by some code we'd written which wanted to be able to get immutable
> versions of arbitrary objects.


* FWIW, the _as_immutable() protocol was dropped from sets.py for a
reason.  User reports indicated that it was never helpful in practice.
It added complexity and confusion without producing offsetting benefits.

* AFAICT, there are no use cases for freezing arbitrary objects when the
object types are restricted to just lists and sets but not dicts,
arrays, or other containers.  Even if the range of supported types were
expanded, what applications could use this?  Most apps cannot support
generic substitution of lists and sets -- they have too few methods in
common -- they are almost never interchangeable.

* I'm concerned that generic freezing leads to poor design and
hard-to-find bugs.  One class of bugs results from conflating ordered
and unordered collections as lookup keys.  It is difficult to assess
program correctness when the ordered/unordered distinction has been
abstracted away.  A second class of errors can arise when the original
object mutates and gets out-of-sync with its frozen counterpart.

* For a rare app needing mutable lookup keys, a simple recipe would
suffice:

    freeze_pairs = [(list, tuple), (set, frozenset)]
    
    def freeze(obj):
        try:
            hash(obj)
        except TypeError:
            for sourcetype, desttype in freeze_pairs:
                if isinstance(obj, sourcetype):
                    return desttype(obj)
            raise
        else:
            return obj

Unlike the PEP, the recipe works with older pythons and is trivially
easy to extend to include other containers.
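
As an illustration of how the recipe might be used and extended (a
minimal sketch; the bytearray-to-bytes pair and the sample values are
assumptions added here, not part of the original recipe):

```python
# The recipe from above, repeated so the example is self-contained.
freeze_pairs = [(list, tuple), (set, frozenset)]

def freeze(obj):
    try:
        hash(obj)
    except TypeError:
        for sourcetype, desttype in freeze_pairs:
            if isinstance(obj, sourcetype):
                return desttype(obj)
        raise  # no known conversion: re-raise the original TypeError
    else:
        return obj  # already hashable, returned unchanged

# Hypothetical extension: one more (mutable, immutable) pair.
freeze_pairs.append((bytearray, bytes))

# Lists and sets become usable as dictionary keys after conversion.
lookup = {freeze([1, 2]): "a list key", freeze({3, 4}): "a set key"}
```

Note that the conversion is shallow: a list containing another list is
turned into a tuple that is still unhashable.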

* The name "freeze" is problematic because it suggests an in-place
change.  Instead, the proposed mechanism creates a new object.  In
contrast, explicit conversions like tuple(l) or frozenset(s) are obvious
about their running time, space consumed, and new object identity.  

Overall, I'm -1 on the PEP.  Like a bad C macro, the proposed
abstraction hides too much.  We lose critical distinctions of ordered vs
unordered, mutable vs immutable, new objects vs in-place change, etc.
Without compelling use cases, the mechanism smells like a
hyper-generalization.


Raymond


From janssen at parc.com  Mon Oct 24 19:06:27 2005
From: janssen at parc.com (Bill Janssen)
Date: Mon, 24 Oct 2005 10:06:27 PDT
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: Your message of "Sun, 23 Oct 2005 19:23:40 PDT."
	<5.1.1.6.0.20051023221433.02ab8200@mail.telecommunity.com> 
Message-ID: <05Oct24.100629pdt."58617"@synergy1.parc.xerox.com>

> >I'm thinking about making all character strings Unicode (possibly with
> >different internal representations a la NSString in Apple's Objective
> >C) and introduce a separate mutable bytes array data type. But I could
> >use some validation or feedback on this idea from actual
> >practitioners.

+1 from me, too.

> I'm tempted to say it would be even better if there was a command line 
> option that could be used to force all binary opens to result in bytes, and 
> require all text opens to specify an encoding.

I like this idea, too.  Presumably plain "open(FILENAME, MODE)" would
then result in a binary open (no encoding specified), which I've
wanted for a long time (and which makes sense).  But it is a change.

Bill

From janssen at parc.com  Mon Oct 24 19:07:40 2005
From: janssen at parc.com (Bill Janssen)
Date: Mon, 24 Oct 2005 10:07:40 PDT
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: Your message of "Sun, 23 Oct 2005 20:41:50 PDT."
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com> 
Message-ID: <05Oct24.100744pdt."58617"@synergy1.parc.xerox.com>

> Python should allow strings to
> contain any Unicode character and should be indexable yielding
> characters rather than half characters. Therefore Python strings
> should appear to be UTF-32.

+1.

Bill

From phil at riverbankcomputing.co.uk  Mon Oct 24 19:18:41 2005
From: phil at riverbankcomputing.co.uk (Phil Thompson)
Date: Mon, 24 Oct 2005 18:18:41 +0100
Subject: [Python-Dev] Inconsistent Use of Buffer Interface in stringobject.c
Message-ID: <200510241818.41145.phil@riverbankcomputing.co.uk>

I'm implementing a string-like object in an extension module and trying to 
make it as interoperable with the standard string object as possible. To do 
this I'm implementing the relevant slots and the buffer interface. For most 
things this is fine, but there are a small number of methods in 
stringobject.c that don't use the buffer interface - and I don't understand 
why.

Specifically...

string_contains() doesn't, which means that...

    MyString("foo") in "foobar"

...doesn't work.

s.join(sequence) only allows sequence to contain string or unicode objects.

s.strip([chars]) only allows chars to be a string or unicode object. Same for 
lstrip() and rstrip().

s.ljust(width[, fillchar]) only allows fillchar to be a string object (not 
even a unicode object). Same for rjust() and center().

Other methods happily allow types that support the buffer interface as well as 
string and unicode objects.

I'm happy to submit a patch - I just wanted to make sure that this behaviour 
wasn't intentional for some reason.

Thanks,
Phil

From guido at python.org  Mon Oct 24 20:06:43 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 24 Oct 2005 11:06:43 -0700
Subject: [Python-Dev] Inconsistent Use of Buffer Interface in
	stringobject.c
In-Reply-To: <200510241818.41145.phil@riverbankcomputing.co.uk>
References: <200510241818.41145.phil@riverbankcomputing.co.uk>
Message-ID: <ca471dc20510241106m18b35219w280cb63919e20f1c@mail.gmail.com>

On 10/24/05, Phil Thompson <phil at riverbankcomputing.co.uk> wrote:
> I'm implementing a string-like object in an extension module and trying to
> make it as interoperable with the standard string object as possible. To do
> this I'm implementing the relevant slots and the buffer interface. For most
> things this is fine, but there are a small number of methods in
> stringobject.c that don't use the buffer interface - and I don't understand
> why.
>
> Specifically...
>
> string_contains() doesn't, which means that...
>
>     MyString("foo") in "foobar"
>
> ...doesn't work.
>
> s.join(sequence) only allows sequence to contain string or unicode objects.
>
> s.strip([chars]) only allows chars to be a string or unicode object. Same for
> lstrip() and rstrip().
>
> s.ljust(width[, fillchar]) only allows fillchar to be a string object (not
> even a unicode object). Same for rjust() and center().
>
> Other methods happily allow types that support the buffer interface as well as
> string and unicode objects.
>
> I'm happy to submit a patch - I just wanted to make sure that this behaviour
> wasn't intentional for some reason.

A concern I'd have with fixing this is that Unicode objects also
support the buffer API. In any situation where either str or unicode
is accepted I'd be reluctant to guess whether a buffer object was
meant to be str-like or Unicode-like. I think this covers all the
cases you mention here.

We need to support this better in Python 3000; but I'm not sure you
can do much better in Python 2.x; subclassing from str is unlikely to
work for you because then too many places are going to assume the
internal representation is also the same as for str.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From fredrik at pythonware.com  Mon Oct 24 20:19:05 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 24 Oct 2005 20:19:05 +0200
Subject: [Python-Dev] Inconsistent Use of Buffer Interface
	instringobject.c
References: <200510241818.41145.phil@riverbankcomputing.co.uk>
	<ca471dc20510241106m18b35219w280cb63919e20f1c@mail.gmail.com>
Message-ID: <djj8ir$h6q$1@sea.gmane.org>

Guido van Rossum wrote:

> A concern I'd have with fixing this is that Unicode objects also
> support the buffer API. In any situation where either str or unicode
> is accepted I'd be reluctant to guess whether a buffer object was
> meant to be str-like or Unicode-like. I think this covers all the
> cases you mention here.

iirc, SRE solves that by comparing the length of the sequence with the
number of bytes in the buffer.  if length == bytes, it's an 8-bit string; if
length*sizeof(Py_UNICODE) == bytes, it's a Unicode string.
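
The check described above could be sketched roughly as follows. This is
only an illustration in Python, not SRE's actual C code; the function
name, its parameters, and the default unit size are assumptions.

```python
def guess_buffer_kind(length, nbytes, unicode_unit_size=2):
    """Guess whether a buffer holds 8-bit or Unicode data.

    length            -- sequence length reported by the object
    nbytes            -- number of bytes exposed through its buffer
    unicode_unit_size -- stand-in for sizeof(Py_UNICODE); 2 or 4
                         depending on the build (assumed value)
    """
    if length == nbytes:
        return "8-bit"
    if length * unicode_unit_size == nbytes:
        return "unicode"
    raise TypeError("buffer size does not match sequence length")
```

An empty buffer satisfies both tests, so in this sketch it is reported
as 8-bit simply because that test comes first.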

</F>




From mal at egenix.com  Mon Oct 24 20:32:22 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 24 Oct 2005 20:32:22 +0200
Subject: [Python-Dev] Inconsistent Use of Buffer Interface
	in	stringobject.c
In-Reply-To: <ca471dc20510241106m18b35219w280cb63919e20f1c@mail.gmail.com>
References: <200510241818.41145.phil@riverbankcomputing.co.uk>
	<ca471dc20510241106m18b35219w280cb63919e20f1c@mail.gmail.com>
Message-ID: <435D28B6.9010806@egenix.com>

Guido van Rossum wrote:
> On 10/24/05, Phil Thompson <phil at riverbankcomputing.co.uk> wrote:
> 
>>I'm implementing a string-like object in an extension module and trying to
>>make it as interoperable with the standard string object as possible. To do
>>this I'm implementing the relevant slots and the buffer interface. For most
>>things this is fine, but there are a small number of methods in
>>stringobject.c that don't use the buffer interface - and I don't understand
>>why.
>>
>>Specifically...
>>
>>string_contains() doesn't, which means that...
>>
>>    MyString("foo") in "foobar"
>>
>>...doesn't work.
>>
>>s.join(sequence) only allows sequence to contain string or unicode objects.
>>
>>s.strip([chars]) only allows chars to be a string or unicode object. Same for
>>lstrip() and rstrip().
>>
>>s.ljust(width[, fillchar]) only allows fillchar to be a string object (not
>>even a unicode object). Same for rjust() and center().
>>
>>Other methods happily allow types that support the buffer interface as well as
>>string and unicode objects.
>>
>>I'm happy to submit a patch - I just wanted to make sure that this behaviour
>>wasn't intentional for some reason.
> 
> 
> A concern I'd have with fixing this is that Unicode objects also
> support the buffer API. In any situation where either str or unicode
> is accepted I'd be reluctant to guess whether a buffer object was
> meant to be str-like or Unicode-like. I think this covers all the
> cases you mention here.

This situation is a little better than that: the buffer
interface has a slot called getcharbuffer which is what
the string methods use in case they find that a string
argument is not of type str or unicode.

A few don't, but I guess we could fix this.

str.split(), .[lr]strip() all support the getcharbuffer
interface. str.join() currently doesn't. The Unicode object also
leaves out a few cases, among those the ones you mentioned.
If it's better for inter-op, I guess we should make an effort
and let all of them support the getcharbuffer interface.

> We need to support this better in Python 3000; but I'm not sure you
> can do much better in Python 2.x; subclassing from str is unlikely to
> work for you because then too many places are going to assume the
> internal representation is also the same as for str.

As a first step, I'd suggest implementing the getcharbuffer
slot. That will already go a long way.

-- 
Marc-Andre Lemburg
eGenix.com


From guido at python.org  Mon Oct 24 20:39:21 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 24 Oct 2005 11:39:21 -0700
Subject: [Python-Dev] Inconsistent Use of Buffer Interface in
	stringobject.c
In-Reply-To: <435D28B6.9010806@egenix.com>
References: <200510241818.41145.phil@riverbankcomputing.co.uk>
	<ca471dc20510241106m18b35219w280cb63919e20f1c@mail.gmail.com>
	<435D28B6.9010806@egenix.com>
Message-ID: <ca471dc20510241139j1e3196fdy443c066739048d1f@mail.gmail.com>

On 10/24/05, M.-A. Lemburg <mal at egenix.com> wrote:
> Guido van Rossum wrote:
> > A concern I'd have with fixing this is that Unicode objects also
> > support the buffer API. In any situation where either str or unicode
> > is accepted I'd be reluctant to guess whether a buffer object was
> > meant to be str-like or Unicode-like. I think this covers all the
> > cases you mention here.
>
> This situation is a little better than that: the buffer
> interface has a slot called getcharbuffer which is what
> the string methods use in case they find that a string
> argument is not of type str or unicode.

I stand corrected!

> As a first step, I'd suggest implementing the getcharbuffer
> slot. That will already go a long way.

Phil, if anything still doesn't work after doing what Marc-Andre says,
those would be good candidates for fixes!

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From phil at riverbankcomputing.co.uk  Mon Oct 24 20:56:26 2005
From: phil at riverbankcomputing.co.uk (Phil Thompson)
Date: Mon, 24 Oct 2005 19:56:26 +0100
Subject: [Python-Dev] Inconsistent Use of Buffer Interface in
	stringobject.c
In-Reply-To: <ca471dc20510241139j1e3196fdy443c066739048d1f@mail.gmail.com>
References: <200510241818.41145.phil@riverbankcomputing.co.uk>
	<435D28B6.9010806@egenix.com>
	<ca471dc20510241139j1e3196fdy443c066739048d1f@mail.gmail.com>
Message-ID: <200510241956.26257.phil@riverbankcomputing.co.uk>

On Monday 24 October 2005 7:39 pm, Guido van Rossum wrote:
> On 10/24/05, M.-A. Lemburg <mal at egenix.com> wrote:
> > Guido van Rossum wrote:
> > > A concern I'd have with fixing this is that Unicode objects also
> > > support the buffer API. In any situation where either str or unicode
> > > is accepted I'd be reluctant to guess whether a buffer object was
> > > meant to be str-like or Unicode-like. I think this covers all the
> > > cases you mention here.
> >
> > This situation is a little better than that: the buffer
> > interface has a slot called getcharbuffer which is what
> > the string methods use in case they find that a string
> > argument is not of type str or unicode.
>
> I stand corrected!
>
> > As a first step, I'd suggest implementing the getcharbuffer
> > slot. That will already go a long way.
>
> Phil, if anything still doesn't work after doing what Marc-Andre says,
> those would be good candidates for fixes!

I have implemented getcharbuffer - I was highlighting those methods where the 
getcharbuffer implementation was ignored.

I'll put a patch together.

Phil

From martin at v.loewis.de  Mon Oct 24 22:44:38 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 24 Oct 2005 22:44:38 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>	
	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>	
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>	
	<435C7EFE.6060504@v.loewis.de>
	<50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com>
Message-ID: <435D47B6.4010703@v.loewis.de>

Neil Hodgson wrote:
>    For Windows, the code will get a little uglier, needing to perform
> an allocation/encoding and deallocation more often than at present but
> I don't think there will be a speed degradation as Windows is
> currently performing a conversion from 8 bit to UTF-16 inside many
> system calls.
[...]
> 
>    For indexing UTF-16, a flag could be set to show if the string is
> all in the base plane and if not, an index could be constructed when
> and if needed.

There are many design alternatives: one option would be to support
*three* internal representations in a single type, generating the
others from the one existing as needed. The default, initial
representation might be UTF-8, with UCS-4 only being generated when
indexing occurs, and UCS-2 only being generated when the API requires
it. On concatenation, always concatenate just one representation: either
one that is already present in both operands, else UTF-8.

> It'd be good to get some feel for what proportion of
> string operations performed require indexing. Many, such as
> startswith, split, and concatenation don't require indexing. The
> proportion of operations that use indexing to scan strings would also
> be interesting as adding a (currentIndex, currentOffset) cursor to
> string objects would be another approach.

Indeed. My guess is that indexing is more common than you think,
especially when iterating over the string. Of course, iteration
could also operate on UTF-8, if you introduced string iterator
objects.

Regards,
Martin

From martin at v.loewis.de  Mon Oct 24 22:49:05 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 24 Oct 2005 22:49:05 +0200
Subject: [Python-Dev] New codecs checked in
In-Reply-To: <435CA2BA.7050900@livinglogic.de>
References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de>
	<435CA2BA.7050900@livinglogic.de>
Message-ID: <435D48C1.9080003@v.loewis.de>

Walter Dörwald wrote:
> Why should koi8_u.py be defined in terms of koi8_r.py anyway? Why not put 
> a complete decoding_table into koi8_u.py?

Not sure. Unfortunately, the tables being used as source are not part of
the Python source, so nobody except MAL can faithfully regenerate them.
If they were part of the Python source, explicitly adding one for
KOI8-U would certainly be feasible.

> I.e. change:
> 
> decoding_map.update({
>     0x0080: 0x0402, #  CYRILLIC CAPITAL LETTER DJE

Hmm. I was suggesting to remove decoding_map completely, in which
case neither the current form nor your suggested cosmetic change
would survive.

> to
> 
> decoding_table = (
>     u'\x00' # 0x00 -> U+0000 NULL

Using U+XXXX in comments to denote the codepoints is a good idea,
anyway.
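
To make the decoding_table form concrete, here is a minimal sketch in
present-day Python 3 syntax (an assumption; the thread predates it).
The table below is a toy, not a real codec: it is the identity mapping
with a single entry overridden, borrowing the 0x80 -> U+0402 pair from
the snippet quoted above.

```python
import codecs

# Toy table: position i holds the character that byte value i decodes to.
entries = [chr(i) for i in range(256)]
entries[0x80] = "\u0402"  # 0x80 -> U+0402 CYRILLIC CAPITAL LETTER DJE
decoding_table = "".join(entries)

# charmap_decode accepts such a string as its mapping argument and
# returns the decoded text together with the number of bytes consumed.
text, consumed = codecs.charmap_decode(b"A\x80", "strict", decoding_table)
```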

Regards,
Martin

From martin at v.loewis.de  Mon Oct 24 22:55:07 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 24 Oct 2005 22:55:07 +0200
Subject: [Python-Dev] New codecs checked in
In-Reply-To: <435CA887.3020900@egenix.com>
References: <435965E4.5050207@egenix.com> <435B95C0.9060005@v.loewis.de>
	<435CA2BA.7050900@livinglogic.de> <435CA887.3020900@egenix.com>
Message-ID: <435D4A2B.1090505@v.loewis.de>

M.-A. Lemburg wrote:
> I just left them in because I thought they wouldn't do any harm
> and might be useful in some applications.
> 
> Removing them where not directly needed by the codec would not
> be a problem.

I think the memory usage they cause is measurable (I estimated 4KiB per
dictionary). More importantly, people apparently currently change
the dictionaries we provide and expect the codecs to automatically
pick up the modified mappings. It would be better if the breakage
is explicit (i.e. they get an AttributeError on the variable) instead
of implicit (their changes to the mapping simply have no effect
anymore).

> KOI8-U is not available as mapping on ftp.unicode.org and
> I only recreated codecs from the mapping files available
> there.

I think we should come up with mapping tables for the additional
codecs as well, and maintain them in the CVS. This also applies
to things like rot13.

> I'll rerun the creation with the above changes sometime this
> week.

I hope I can finish my encoding routine shortly, which again
results in changes to the codecs (replacing the encoding dictionaries
with other lookup tables).

Regards,
Martin

From martin at v.loewis.de  Mon Oct 24 22:59:00 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 24 Oct 2005 22:59:00 +0200
Subject: [Python-Dev] New codecs checked in
In-Reply-To: <435CCEE6.6020005@egenix.com>
References: <435965E4.5050207@egenix.com>
	<435B95C0.9060005@v.loewis.de>	<435CA2BA.7050900@livinglogic.de>
	<435CA887.3020900@egenix.com> <435CB4BB.6070009@livinglogic.de>
	<435CCEE6.6020005@egenix.com>
Message-ID: <435D4B14.7060008@v.loewis.de>

M.-A. Lemburg wrote:

> I had to create three custom mapping files for cp1140, koi8-u
> and tis-620.

Can you please publish the files you have used somewhere? They
would best go into the Python CVS.

Regards,
Martin


From martin at v.loewis.de  Mon Oct 24 23:06:38 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 24 Oct 2005 23:06:38 +0200
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicit	conversions).
In-Reply-To: <435C9DFC.8020501@egenix.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>
	<435C9DFC.8020501@egenix.com>
Message-ID: <435D4CDE.8020907@v.loewis.de>

M.-A. Lemburg wrote:
> There seems to be a general misunderstanding here: even if you
> have UCS4 storage, it is still possible to slice a Unicode
> string in a way which makes rendering it correctly.
                                                       [impossible?]

> Unicode has the concept of combining code points, e.g. you can
> store an "é" (e with an accent) as "e" + "'". Now if you slice
> off the accent, you'll break the character that you encoded
> using combining code points.

While this is all true, I agree with Neil that it should do
whatever it does consistently across implementations, i.e.
len("\U00010000") should always give the same result, and
I think this result should always be 1.

How to best implement this efficiently is an entirely different
question, as is the question whether you can render
arbitrary substrings in a meaningful way.
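
Both points can be illustrated with a short sketch. It assumes a
present-day CPython 3 (3.3 or later), where len() counts code points;
that is not the interpreter this thread was written against.

```python
import unicodedata

# A character outside the base plane counts as one item.
astral = "\U00010000"
astral_length = len(astral)

# "e" followed by U+0301 COMBINING ACUTE ACCENT: two code points that
# render as one accented character.
decomposed = "e\u0301"
decomposed_length = len(decomposed)

# Slicing off the last code point silently drops the accent.
sliced = decomposed[:1]

# NFC normalization folds the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)
```

So a consistent len() settles the surrogate question, while slicing in
the middle of a combining sequence remains possible either way.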

Regards,
Martin

From solipsis at pitrou.net  Mon Oct 24 23:22:23 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 24 Oct 2005 23:22:23 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicit	conversions).
In-Reply-To: <435D47B6.4010703@v.loewis.de>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>
	<435C7EFE.6060504@v.loewis.de>
	<50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com>
	<435D47B6.4010703@v.loewis.de>
Message-ID: <1130188943.8137.13.camel@fsol>


> There are many design alternatives: one option would be to support
> *three* internal representations in a single type, generating the
> others from the one existing as needed. The default, initial
> representation might be UTF-8, with UCS-4 only being generated when
> indexing occurs, and UCS-2 only being generated when the API requires
> it. On concatenation, always concatenate just one representation: either
> one that is already present in both operands, else UTF-8.

Wouldn't it be simpler to use:
- one-byte representation if every character <= 0xFF
- two-byte representation if every character <= 0xFFFF
- four-byte representation otherwise

Then combining several strings means using the larger representation as
a result (*). In practice, most use cases will not involve the four-byte
representation.

(*) a heuristic can be invented so that, when producing a smaller string
(by stripping/slicing/etc.), it will "sometimes" check whether a
narrower representation is possible.
For example: store the length of the string when the last check
occurred, and do a new check when the length falls below half that
value.
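
A rough sketch of the width selection, in Python for readability (the
real thing would live in C; the function names are made up for this
example):

```python
def unit_width(text):
    """Bytes per character needed to store text at a fixed width."""
    highest = max(map(ord, text), default=0)
    if highest <= 0xFF:
        return 1
    if highest <= 0xFFFF:
        return 2
    return 4

def concat_width(a, b):
    """Concatenation uses the larger of the two operand widths."""
    return max(unit_width(a), unit_width(b))
```

Every character in a string has the same width here, so indexing stays
a constant-time operation whichever representation is chosen.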

Regards

Antoine.



From guido at python.org  Mon Oct 24 23:31:18 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 24 Oct 2005 14:31:18 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <435D47B6.4010703@v.loewis.de>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>
	<435C7EFE.6060504@v.loewis.de>
	<50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com>
	<435D47B6.4010703@v.loewis.de>
Message-ID: <ca471dc20510241431q2f1ab3b7md3a3593cbd28903c@mail.gmail.com>

On 10/24/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Indeed. My guess is that indexing is more common than you think,
> especially when iterating over the string. Of course, iteration
> could also operate on UTF-8, if you introduced string iterator
> objects.

Python's slice-and-dice model pretty much ensures that indexing is
common. Almost everything is ultimately represented as indices: regex
search results have the index in the API, find()/index() return
indices, many operations take a start and/or end index. As long as
that's the case, indexing better be fast.

Changing the APIs would be much work, although perhaps not impossible
for Python 3000. For example, Raymond Hettinger's partition() API
doesn't refer to indices at all, and can replace many uses of find()
or index().
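
To illustrate the difference (a small sketch; the sample string and
variable names are invented, and str.partition() is used as it exists
in later Python releases):

```python
line = "key=value"

# Index-based style: find() returns a position, which is then used to
# slice the string.
idx = line.find("=")
if idx >= 0:
    name_by_index, value_by_index = line[:idx], line[idx + 1:]

# partition() yields the same pieces without exposing any index.
name, sep, value = line.partition("=")

# When the separator is missing, the whole string comes back as the
# head and the other two parts are empty.
missing = "novalue".partition("=")
```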

Still, the mere existence of __getitem__ and __getslice__ on strings
makes it necessary to implement them efficiently. How realistic would
it be to drop them? What should replace them? Some kind of abstract
pointers-into-strings perhaps, but that seems much more complex.

The trick seems to be to support both simple programs manipulating
short strings (where indexing is probably the easiest API to
understand, and the additional copying is unlikely to cause
performance problems), as well as programs manipulating very large
buffers containing text and doing sophisticated string processing on
them. Perhaps we could provide a different kind of API to support the
latter, perhaps based on a mutable character buffer data type without
direct indexing?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Tue Oct 25 00:21:06 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 25 Oct 2005 00:21:06 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <ca471dc20510241431q2f1ab3b7md3a3593cbd28903c@mail.gmail.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>	
	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>	
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>	
	<435C7EFE.6060504@v.loewis.de>	
	<50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com>	
	<435D47B6.4010703@v.loewis.de>
	<ca471dc20510241431q2f1ab3b7md3a3593cbd28903c@mail.gmail.com>
Message-ID: <435D5E52.8080003@v.loewis.de>

Guido van Rossum wrote:
> Changing the APIs would be much work, although perhaps not impossible
> for Python 3000. For example, Raymond Hettinger's partition() API
> doesn't refer to indices at all, and can replace many uses of find()
> or index().

I think Neil's proposal is not to make them go away, but to implement
them less efficiently. For example, if the internal representation
is UTF-8, indexing requires linear time, as opposed to constant time.
If the internal representation is UTF-16, and you have a flag to
indicate whether there are any surrogates on the string, indexing
is constant if the flag is false, else linear.
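
The linear cost for UTF-8 is easy to see in a sketch: reaching
character number i means stepping over every character before it. The
helper below is hypothetical, is written in present-day Python 3, and
assumes its input is valid UTF-8.

```python
def utf8_char_at(data, index):
    """Return the character at code point position index in UTF-8 bytes.

    The lead byte of each character gives its length, so finding the
    character takes a scan from the start: O(n), not O(1).
    """
    if index < 0:
        raise IndexError("negative index not supported in this sketch")
    pos = 0
    count = 0
    while pos < len(data):
        lead = data[pos]
        if lead < 0x80:
            width = 1          # 0xxxxxxx: ASCII
        elif lead >> 5 == 0b110:
            width = 2          # 110xxxxx
        elif lead >> 4 == 0b1110:
            width = 3          # 1110xxxx
        elif lead >> 3 == 0b11110:
            width = 4          # 11110xxx
        else:
            raise ValueError("invalid UTF-8 lead byte")
        if count == index:
            return data[pos:pos + width].decode("utf-8")
        pos += width
        count += 1
    raise IndexError("string index out of range")
```

With a fixed-width representation the same lookup is one multiplication
and one memory access.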

> Perhaps we could provide a different kind of API to support the
> latter, perhaps based on a mutable character buffer data type without
> direct indexing?

There are different design goals conflicting here:
- some think: "all my data is ASCII, so I want to only use one
   byte per character".
- others think: "all my data goes to the Windows API, so I want
   to use 2 bytes per character".
- yet others think: "I want all of Unicode, with proper, efficient
   indexing, so I want four bytes per char".

It's not so much a matter of API as a matter of internal
representation. The API doesn't have to change (except for the
very low-level C API that directly exposes Py_UNICODE*, perhaps).

Regards,
Martin

From guido at python.org  Tue Oct 25 00:47:22 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 24 Oct 2005 15:47:22 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <435D5E52.8080003@v.loewis.de>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>
	<435C7EFE.6060504@v.loewis.de>
	<50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com>
	<435D47B6.4010703@v.loewis.de>
	<ca471dc20510241431q2f1ab3b7md3a3593cbd28903c@mail.gmail.com>
	<435D5E52.8080003@v.loewis.de>
Message-ID: <ca471dc20510241547g46a4df56q5d61a810b7316ada@mail.gmail.com>

On 10/24/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Guido van Rossum wrote:
> > Changing the APIs would be much work, although perhaps not impossible
> > for Python 3000. For example, Raymond Hettinger's partition() API
> > doesn't refer to indices at all, and can replace many uses of find()
> > or index().
>
> I think Neil's proposal is not to make them go away, but to implement
> them less efficiently. For example, if the internal representation
> is UTF-8, indexing requires linear time, as opposed to constant time.
> If the internal representation is UTF-16, and you have a flag to
> indicate whether there are any surrogates on the string, indexing
> is constant if the flag is false, else linear.

I understand all that. My point is that it's a bad idea to offer an
indexing operation that isn't O(1).

> > Perhaps we could provide a different kind of API to support the
> > latter, perhaps based on a mutable character buffer data type without
> > direct indexing?
>
> There are different design goals conflicting here:
> - some think: "all my data is ASCII, so I want to only use one
>    byte per character".
> - others think: "all my data goes to the Windows API, so I want
>    to use 2 bytes per character".
> - yet others think: "I want all of Unicode, with proper, efficient
>    indexing, so I want four bytes per char".

I doubt the last one though. Probably they really don't want efficient
indexing, they want to perform higher-level operations that currently
are only possible using efficient indexing or slicing. With the right
API, perhaps they could work just as efficiently with an internal
representation of UTF-8.

> It's not so much a matter of API as a matter of internal
> representation. The API doesn't have to change (except for the
> very low-level C API that directly exposes Py_UNICODE*, perhaps).

I think the API should reflect the representation *to some extent*,
namely it shouldn't claim to have operations that are typically
thought of as O(1) that can only be implemented as O(n). An internal
representation of UTF-8 might make everyone happy except heavy Windows
users; but it requires changes to the API so people won't be writing
Python 2.x-style string slinging code.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Tue Oct 25 00:59:27 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 25 Oct 2005 00:59:27 +0200
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicit	conversions).
In-Reply-To: <1130188943.8137.13.camel@fsol>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>	<435C7EFE.6060504@v.loewis.de>	<50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com>	<435D47B6.4010703@v.loewis.de>
	<1130188943.8137.13.camel@fsol>
Message-ID: <435D674F.3060209@v.loewis.de>

Antoine Pitrou wrote:
>>There are many design alternatives:
> 
> Wouldn't it be simpler to use:
> - one-byte representation if every character <= 0xFF
> - two-byte representation if every character <= 0xFFFF
> - four-byte representation otherwise

As I said: there are many alternatives. This one has the
disadvantage of requiring a copy every time you pass the string
to a Win32 function (which expects UTF-16).

Whether or not this is a significant disadvantage, I don't know.

In any case, a multi-representations implementation has the
disadvantage of making the C API more difficult to use, in
particular for writing codecs. On encoding, it is difficult
to fetch the individual characters which you need for the
lookup table; on decoding, it is difficult to know in advance
what representation to use (unless you know there is an upper
bound on the decoded character ordinals).
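
For reference, the selection rule Antoine proposes might look like the
sketch below. `bytes_per_char` is a hypothetical name, and a Python 3
`str` stands in for the already-decoded characters. It also shows the
decoding problem: the width is only known once every character has been
seen.

```python
def bytes_per_char(text: str) -> int:
    """Pick the narrowest fixed-width representation that can hold text."""
    # The whole string must be inspected before the width can be chosen.
    highest = max(map(ord, text), default=0)
    if highest <= 0xFF:
        return 1   # one-byte representation
    if highest <= 0xFFFF:
        return 2   # two-byte representation
    return 4       # four-byte representation
```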

Regards,
Martin

From nyamatongwe at gmail.com  Tue Oct 25 01:13:51 2005
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Tue, 25 Oct 2005 09:13:51 +1000
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <435C9DFC.8020501@egenix.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>
	<435C9DFC.8020501@egenix.com>
Message-ID: <50862ebd0510241613w2b5da91cqbcaf4f6157ae338e@mail.gmail.com>

M.-A. Lemburg:

> Unicode has the concept of combining code points, e.g. you can
> store an "é" (e with an accent) as "e" + "'". Now if you slice
> off the accent, you'll break the character that you encoded
> using combining code points.
> ...
>     next_<indextype>(u, index) -> integer
>
>         Returns the Unicode object index for the start of the next
>         <indextype> found after u[index] or -1 in case no next element
>         of this type exists.

   Should entity breakage be further discouraged by returning a slice
here rather than an object index?

   Something like:

i = first_grapheme(u)
x = 0
while x < width and u[i] != "\n":
   x, _ = draw(u[i], (x, y))
   i = next_grapheme(u, i)

   Neil

From janssen at parc.com  Tue Oct 25 01:50:58 2005
From: janssen at parc.com (Bill Janssen)
Date: Mon, 24 Oct 2005 16:50:58 PDT
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: Your message of "Mon, 24 Oct 2005 15:47:22 PDT."
	<ca471dc20510241547g46a4df56q5d61a810b7316ada@mail.gmail.com> 
Message-ID: <05Oct24.165105pdt."58617"@synergy1.parc.xerox.com>

> > - yet others think: "I want all of Unicode, with proper, efficient
> >    indexing, so I want four bytes per char".
> 
> I doubt the last one though. Probably they really don't want efficient
> indexing, they want to perform higher-level operations that currently
> are only possible using efficient indexing or slicing. With the right
> API, perhaps they could work just as efficiently with an internal
> representation of UTF-8.

I just got mail this morning from a researcher who wants exactly what
Martin described, and wondered why the default MacPython 2.4.2 didn't
provide it by default. :-)

Bill

From guido at python.org  Tue Oct 25 02:04:35 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 24 Oct 2005 17:04:35 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <3481071664344267578@unknownmsgid>
References: <ca471dc20510241547g46a4df56q5d61a810b7316ada@mail.gmail.com>
	<3481071664344267578@unknownmsgid>
Message-ID: <ca471dc20510241704h4f17486fg3a17a92df45bbf6@mail.gmail.com>

On 10/24/05, Bill Janssen <janssen at parc.com> wrote:
> > > - yet others think: "I want all of Unicode, with proper, efficient
> > >    indexing, so I want four bytes per char".
> >
> > I doubt the last one though. Probably they really don't want efficient
> > indexing, they want to perform higher-level operations that currently
> > are only possible using efficient indexing or slicing. With the right
> > API, perhaps they could work just as efficiently with an internal
> > representation of UTF-8.
>
> I just got mail this morning from a researcher who wants exactly what
> Martin described, and wondered why the default MacPython 2.4.2 didn't
> provide it by default. :-)

Oh, I don't doubt that they want it. But often they don't *need* it,
and the higher-level goal they are trying to accomplish can be dealt
with better in a different way. (Sort of my response to people asking
for static typing in Python as well. :-)

Did they tell you what they were trying to do that MacPython 2.4.2
wouldn't let them, beyond "represent a large Unicode string as an
array of 4-byte integers"?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Tue Oct 25 02:39:58 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 25 Oct 2005 13:39:58 +1300
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
 conversions).
In-Reply-To: <ca471dc20510241547g46a4df56q5d61a810b7316ada@mail.gmail.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>
	<435C7EFE.6060504@v.loewis.de>
	<50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com>
	<435D47B6.4010703@v.loewis.de>
	<ca471dc20510241431q2f1ab3b7md3a3593cbd28903c@mail.gmail.com>
	<435D5E52.8080003@v.loewis.de>
	<ca471dc20510241547g46a4df56q5d61a810b7316ada@mail.gmail.com>
Message-ID: <435D7EDE.7040307@canterbury.ac.nz>

Guido van Rossum wrote:

> I think the API should reflect the representation *to some extent*,
> namely it shouldn't claim to have operations that are typically
> thought of as O(1) that can only be implemented as O(n).

Maybe a compromise could be reached by using a
btree of chunks or something, so indexing is
O(log n). Not as good as O(1) but a lot better
than O(n).
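
A rough sketch of the idea, under simplifying assumptions: instead of a
real btree it keeps a flat sorted list of chunk start offsets and
searches it with `bisect`, which gives the same O(log n) lookup for a
read-only string. `ChunkedText` and `chunk_size` are names invented for
this example.

```python
from bisect import bisect_right


class ChunkedText:
    """Text stored as UTF-8 chunks plus the code point offset of each chunk."""

    def __init__(self, text, chunk_size=4):
        # Each chunk holds at most chunk_size code points, encoded as UTF-8.
        self.chunks = [text[i:i + chunk_size].encode("utf-8")
                       for i in range(0, len(text), chunk_size)]
        # starts[k] is the code point index at which chunk k begins.
        self.starts = list(range(0, len(text), chunk_size))
        self.length = len(text)

    def __getitem__(self, n):
        if not 0 <= n < self.length:
            raise IndexError(n)
        # Binary search for the chunk containing code point n: O(log n).
        k = bisect_right(self.starts, n) - 1
        # Decoding one chunk costs at most chunk_size steps.
        return self.chunks[k].decode("utf-8")[n - self.starts[k]]
```

With equal-sized chunks a division would find the chunk directly; the
search is used here because chunk boundaries would become irregular as
soon as insertions and deletions are supported.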

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Tue Oct 25 02:40:02 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 25 Oct 2005 13:40:02 +1300
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
 conversions).
In-Reply-To: <ca471dc20510241431q2f1ab3b7md3a3593cbd28903c@mail.gmail.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>
	<435C7EFE.6060504@v.loewis.de>
	<50862ebd0510240024o2f339a0bwbe25bee0639dd229@mail.gmail.com>
	<435D47B6.4010703@v.loewis.de>
	<ca471dc20510241431q2f1ab3b7md3a3593cbd28903c@mail.gmail.com>
Message-ID: <435D7EE2.6080907@canterbury.ac.nz>

Guido van Rossum wrote:

> Python's slice-and-dice model pretty much ensures that indexing is
> common. Almost everything is ultimately represented as indices: regex
> search results have the index in the API, find()/index() return
> indices, many operations take a start and/or end index.

Maybe the idea of string views should be reconsidered in
light of this. It's been criticised on the grounds that
its use could keep large strings alive longer than needed,
but if operations that currently return indices instead
returned string views, this wouldn't be any more of a
concern than it is now, especially if there is an easy
way to explicitly materialise the view as an independent
string when wanted.
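
As a very small sketch of what that could look like (all names here,
`StrView` and `find_view`, are hypothetical and not an existing API): a
search returns a view that remembers the base string and a range, and
`str()` materialises an independent string.

```python
class StrView:
    """A window onto part of a base string; no characters are copied."""

    def __init__(self, base, start, stop):
        self.base, self.start, self.stop = base, start, stop

    def __len__(self):
        return self.stop - self.start

    def __str__(self):
        # Explicitly materialise the view as an independent string.
        return self.base[self.start:self.stop]


def find_view(base, needle):
    """Like base.find(needle), but return a StrView (or None) instead of an index."""
    i = base.find(needle)
    if i < 0:
        return None
    return StrView(base, i, i + len(needle))
```

The view keeps `base` alive for as long as the view exists, which is the
lifetime concern mentioned above.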

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From janssen at parc.com  Tue Oct 25 04:39:42 2005
From: janssen at parc.com (Bill Janssen)
Date: Mon, 24 Oct 2005 19:39:42 PDT
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: Your message of "Mon, 24 Oct 2005 17:04:35 PDT."
	<ca471dc20510241704h4f17486fg3a17a92df45bbf6@mail.gmail.com> 
Message-ID: <05Oct24.193946pdt."58617"@synergy1.parc.xerox.com>

Guido writes:
> Oh, I don't doubt that they want it. But often they don't *need* it,
> and the higher-level goal they are trying to accomplish can be dealt
> with better in a different way. (Sort of my response to people asking
> for static typing in Python as well. :-)

I suppose that's true.  But what if they're not smart enough to figure
out that better, different, way?  I doubt you intend Python to be sort
of the Rubik's cube of programming...

And no, he didn't say why he wanted the ability to "represent a
Unicode string as an array of 4-byte integers".  Though I know he's
doing something with the Deseret Alphabet, translating some early work
on American Indian culture that was transcribed in that character set.

Bill

From simon at arrowtheory.com  Tue Oct 25 05:36:26 2005
From: simon at arrowtheory.com (Simon Burton)
Date: Tue, 25 Oct 2005 13:36:26 +1000
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <djbc7m$ptk$1@sea.gmane.org>
References: <200510211202.12015.anthony@interlink.com.au>
	<djbc7m$ptk$1@sea.gmane.org>
Message-ID: <20051025133626.286990fd.simon@arrowtheory.com>

On Fri, 21 Oct 2005 18:32:22 +0000 (UTC)
nas at arctrix.com (Neil Schemenauer) wrote:

> > Does it just allow us to do new and interesting manipulations of
> > the code during compilation?
> 
> Well, that's a pretty big deal, IMHO. For example, adding
> pychecker-like functionality should be straightforward now. I also
> hope some of the namespace optimizations get explored (e.g. PEP
> 267).

Is there a python interface ?

Simon.



-- 
Simon Burton, B.Sc.
Licensed PO Box 8066
ANU Canberra 2601
Australia
Ph. 61 02 6249 6940
http://arrowtheory.com 

From mal at egenix.com  Tue Oct 25 10:38:14 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 25 Oct 2005 10:38:14 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicit	conversions).
In-Reply-To: <50862ebd0510241613w2b5da91cqbcaf4f6157ae338e@mail.gmail.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>	
	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>	
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>	
	<435C9DFC.8020501@egenix.com>
	<50862ebd0510241613w2b5da91cqbcaf4f6157ae338e@mail.gmail.com>
Message-ID: <435DEEF6.5020603@egenix.com>

Neil Hodgson wrote:
> M.-A. Lemburg:
> 
> 
>>Unicode has the concept of combining code points, e.g. you can
>>store an "é" (e with an accent) as "e" + "'". Now if you slice
>>off the accent, you'll break the character that you encoded
>>using combining code points.
>>...
>>    next_<indextype>(u, index) -> integer
>>
>>        Returns the Unicode object index for the start of the next
>>        <indextype> found after u[index] or -1 in case no next element
>>        of this type exists.
> 
> 
>    Should entity breakage be further discouraged by returning a slice
> here rather than an object index?

You mean a slice that slices out the next <indextype> ?

>    Something like:
> 
> i = first_grapheme(u)
> x = 0
> while x < width and u[i] != "\n":
>    x, _ = draw(u[i], (x, y))
>    i = next_grapheme(u, i)

This sounds a lot like you'd want iterators for the various
index types. Should be possible to implement on top of the
proposed APIs, e.g. itergraphemes(u), itercodepoints(u), etc.
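
A minimal sketch of such an iterator, assuming a deliberately simplified
definition of a grapheme: a base code point followed by any combining
code points, as reported by `unicodedata.combining()`. Full grapheme
cluster segmentation has more rules than this.

```python
import unicodedata


def itergraphemes(u):
    """Yield base code points together with their trailing combining marks."""
    cluster = ""
    for ch in u:
        # A non-combining code point starts a new cluster.
        if cluster and unicodedata.combining(ch) == 0:
            yield cluster
            cluster = ch
        else:
            cluster += ch
    if cluster:
        yield cluster
```

Because each item is a complete slice, the caller never handles an index
that could land between a base character and its accent.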

Note that what most people refer to as "character" is a
grapheme in Unicode speak. Given that interpretation,
"breaking" Unicode "characters" is something you will
never avoid simply by using larger code units such
as UCS4-compatible ones.

Furthermore, you should also note that surrogates (two
code units encoding one code point) are part of Unicode
life. While you don't need them when storing Unicode
in UCS4 code units, they can still be part of the
Unicode data and the programmer has to be aware of
these.

Personally, I don't think that slicing Unicode is
such a big issue. If you know what you are doing,
things tend not to break - which is true for pretty
much everything you do in programming ;-)

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 25 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From nas at arctrix.com  Tue Oct 25 10:53:07 2005
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 25 Oct 2005 08:53:07 +0000 (UTC)
Subject: [Python-Dev] AST branch is in?
References: <200510211202.12015.anthony@interlink.com.au>
	<djbc7m$ptk$1@sea.gmane.org>
	<20051025133626.286990fd.simon@arrowtheory.com>
Message-ID: <djkrpj$4cm$1@sea.gmane.org>

Simon Burton <simon at arrowtheory.com> wrote:
> Is there a python interface ?

Not yet.

  Neil


From mal at egenix.com  Tue Oct 25 11:09:52 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 25 Oct 2005 11:09:52 +0200
Subject: [Python-Dev] New codecs checked in
In-Reply-To: <435D4B14.7060008@v.loewis.de>
References: <435965E4.5050207@egenix.com>	<435B95C0.9060005@v.loewis.de>	<435CA2BA.7050900@livinglogic.de>	<435CA887.3020900@egenix.com>
	<435CB4BB.6070009@livinglogic.de>	<435CCEE6.6020005@egenix.com>
	<435D4B14.7060008@v.loewis.de>
Message-ID: <435DF660.1010800@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
> 
>>I had to create three custom mapping files for cp1140, koi8-u
>>and tis-620.
> 
> 
> Can you please publish the files you have used somewhere? They
> best go into the Python CVS.

Sure; I'll check in the whole build machinery I'm using for this.

-- 
Marc-Andre Lemburg
eGenix.com


From mal at egenix.com  Tue Oct 25 11:17:58 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 25 Oct 2005 11:17:58 +0200
Subject: [Python-Dev] New codecs checked in
In-Reply-To: <435D4A2B.1090505@v.loewis.de>
References: <435965E4.5050207@egenix.com>
	<435B95C0.9060005@v.loewis.de>	<435CA2BA.7050900@livinglogic.de>
	<435CA887.3020900@egenix.com> <435D4A2B.1090505@v.loewis.de>
Message-ID: <435DF846.5050602@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>I just left them in because I thought they wouldn't do any harm
>>and might be useful in some applications.
>>
>>Removing them where not directly needed by the codec would not
>>be a problem.
> 
> 
> I think the memory usage caused is measurable (I estimated 4 KiB per
> dictionary). More importantly, people apparently currently change
> the dictionaries we provide and expect the codecs to automatically
> pick up the modified mappings. It would be better if the breakage
> is explicit (i.e. they get an AttributeError on the variable) instead
> of implicit (their changes to the mapping simply have no effect
> anymore).

Agreed. I've already checked in the changes, BTW.

>>KOI8-U is not available as mapping on ftp.unicode.org and
>>I only recreated codecs from the mapping files available
>>there.
> 
> 
> I think we should come up with mapping tables for the additional
> codecs as well, and maintain them in the CVS. This also applies
> to things like rot13.

Agreed.

>>I'll rerun the creation with the above changes sometime this
>>week.
> 
> 
> I hope I can finish my encoding routine shortly, which again
> results in changes to the codecs (replacing the encoding dictionaries
> with other lookup tables).

Having seen the decode tables written as a long Unicode string,
I think that this may indeed also be a good solution for
encoding - the major improvement here is that the parser
and compiler will do the work of creating the table. At
module load time, the .pyc file will only contain a long
string which is very fast to create and load (unlike dictionaries
which are set up dynamically at load time).

In general, it's better to do all the work up-front when
creating the codecs, rather than having run-time code
repeat these tasks over and over again.
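
To illustrate the table-as-string approach, here is a sketch using
`codecs.charmap_decode`, `codecs.charmap_build` and
`codecs.charmap_encode` as they exist in present-day CPython. The table
itself is invented for the example (Latin-1 with byte 0xA4 remapped to
the euro sign) and is not one of the codecs discussed here.

```python
import codecs

# 256-character string: position i holds the character for byte value i.
decoding_table = "".join(chr(i) for i in range(256))
# Invented tweak for illustration: byte 0xA4 decodes to U+20AC (euro sign).
decoding_table = decoding_table[:0xA4] + "\u20ac" + decoding_table[0xA5:]

# The reverse lookup structure is derived from the decoding string.
encoding_table = codecs.charmap_build(decoding_table)


def decode(data):
    """Decode bytes by indexing into the table string."""
    return codecs.charmap_decode(data, "strict", decoding_table)[0]


def encode(text):
    """Encode text using the table built from the decoding string."""
    return codecs.charmap_encode(text, "strict", encoding_table)[0]
```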

-- 
Marc-Andre Lemburg
eGenix.com


From ncoghlan at gmail.com  Tue Oct 25 11:26:58 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 25 Oct 2005 19:26:58 +1000
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <20051024034957.38EB.JCARLSON@uci.edu>
References: <1130107429.11268.40.camel@geddy.wooz.org>
	<435CA5F8.7010607@gmail.com> <20051024034957.38EB.JCARLSON@uci.edu>
Message-ID: <435DFA62.4090802@gmail.com>

Josiah Carlson wrote:
> Nick Coghlan <ncoghlan at gmail.com> wrote:
>> I think having dicts and sets automatically invoke freeze would be a mistake, 
>> because at least one of the following two cases would behave unexpectedly:
> 
> I'm pretty sure that the PEP was only asking if one would freeze the
> contents of dicts IF the dict was being frozen.
> 
> That is, which of the following should be the case:
>     freeze({1:[2,3,4]}) -> {1:[2,3,4]}
>     freeze({1:[2,3,4]}) -> xdict(1=(2,3,4))

I believe the choices you intended are:
      freeze({1:[2,3,4]}) -> imdict(1=[2,3,4])
      freeze({1:[2,3,4]}) -> imdict(1=(2,3,4))

Regardless, that question makes a lot more sense (and looking at the PEP 
again, I realised I simply read it wrong the first time).

For containers where equality depends on the contents of the container (i.e., 
all the builtin ones), I don't see how it is possible to implement a sensible 
hash function without freezing the contents as well - otherwise your immutable 
isn't particularly immutable.

Consider what would happen if list "__freeze__" simply returned a tuple 
version of itself - you have a __freeze__ method which returns a potentially 
unhashable object!
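
A small sketch of the difference, with assumptions stated up front:
`freeze` below is a stand-alone function rather than the PEP's
`__freeze__` protocol, and a frozenset of (key, value) pairs stands in
for the hypothetical `imdict` type.

```python
def freeze(obj):
    """Return a hashable, recursively frozen version of obj."""
    if isinstance(obj, (list, tuple)):
        return tuple(freeze(item) for item in obj)
    if isinstance(obj, (set, frozenset)):
        return frozenset(freeze(item) for item in obj)
    if isinstance(obj, dict):
        # Stand-in for imdict: keys are already hashable, values get frozen.
        return frozenset((key, freeze(value)) for key, value in obj.items())
    hash(obj)  # raises TypeError if a leaf object is not hashable
    return obj


# A shallow conversion leaves the inner list mutable and unhashable.
shallow = tuple([[2, 3, 4]])
# A deep conversion freezes the contents as well.
deep = freeze({1: [2, 3, 4]})
```

Calling `hash(shallow)` raises TypeError, while `hash(deep)` works
because every nested container has been frozen too.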

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From amk at amk.ca  Tue Oct 25 11:58:49 2005
From: amk at amk.ca (A.M. Kuchling)
Date: Tue, 25 Oct 2005 05:58:49 -0400
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <20051025133626.286990fd.simon@arrowtheory.com>
References: <200510211202.12015.anthony@interlink.com.au>
	<djbc7m$ptk$1@sea.gmane.org>
	<20051025133626.286990fd.simon@arrowtheory.com>
Message-ID: <20051025095849.GA4930@rogue.amk.ca>

On Tue, Oct 25, 2005 at 01:36:26PM +1000, Simon Burton wrote:
> Is there a python interface ?

Not yet, as far as I know.

FYI, all: please see the following weblog entry for a description of
the AST branch:  
	http://www.amk.ca/diary/2005/10/the_ast_branch_lands_1

If I got anything wrong, please offer corrections in the comments for
that post.

--amk

From amk at amk.ca  Tue Oct 25 12:13:20 2005
From: amk at amk.ca (A.M. Kuchling)
Date: Tue, 25 Oct 2005 06:13:20 -0400
Subject: [Python-Dev] Reminder: PyCon 2006 submissions due in a week
Message-ID: <20051025101320.GC4930@rogue.amk.ca>

The submission deadline for PyCon 2006 is now a week away.  PyCon 2006
will be in Dallas, Texas, February 24-26 2006.

For 2006, I'd like to see more tutorial-style talks on the program.
This means that your talk doesn't have to be about something entirely
new; you can show how to use a particular language feature, standard
library module, examine some aspect of a Python implementation, or
compare the available libraries in an application domain.

For example, the most popular talk at 2005 was Michelle Levesque's
PyWeboff, which compared various web development tools.  The next most
popular (ignoring a few keynotes and the lightning talks) were Alex
Martelli's talks on iterators & generators, and on OOP.  Partly that's
because it's Alex, of course, but I think attendees want help in
deciding which tools are good/helpful/safe to use.

If you need an idea, http://wiki.python.org/moin/PyCon2005/Feedback
lists some topics that 2005's attendees were interested in.

CFP:
	http://www.python.org/pycon/2006/cfp

Proposal submission site:
	http://submit.python.org/

--amk

From mal at egenix.com  Tue Oct 25 12:18:50 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 25 Oct 2005 12:18:50 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <5.0.2.1.1.20051024133833.02b81cb0@mail.oz.net>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<5.0.2.1.1.20051024133833.02b81cb0@mail.oz.net>
Message-ID: <435E068A.70409@egenix.com>

Bengt Richter wrote:
> At 11:43 2005-10-24 +0200, M.-A. Lemburg wrote:
> 
>>Bengt Richter wrote:
>>
>>>Please bear with me for a few paragraphs ;-)
>>
>>Please note that source code encoding doesn't really have
>>anything to do with the way the interpreter executes the
>>program - it's merely a way to tell the parser how to
>>convert string literals (currently only the Unicode ones)
>>into constant Unicode objects within the program text.
>>It's also a nice way to let other people know what kind of
>>encoding you used to write your comments ;-)
>>
>>Nothing more.
> 
> I think somehow I didn't make things clear, sorry ;-)
> As I tried to show in the example of module_a.cs vs module_b.cs,
> the source encoding currently results in two different str-type
> strings representing the source _character_ sequence, which is the
> _same_ in both cases. 

I don't follow you here. The source code encoding
is only applied to Unicode literals (you are using string
literals in your example). String literals are passed
through as-is.

Whether or not your editor will use the source
code encoding marker is really up to your editor
and not within the scope of Python.

If you open the two module files in Emacs, you'll
see identical renderings of the string literals.
With other editors, you may have to explicitly tell
the editor which encoding to assume. Ditto for shell
printouts.

> To make it more clear, try the following little
> program (untested except on NT4 with
> Python 2.4b1 (#56, Nov  3 2004, 01:47:27)
> [GCC 3.2.3 (mingw special 20030504-1)] on win32 ;-):
> 
> ----< t_srcenc.py >--------------------------------
> import os
> def test():
>     open('module_a.py','wb').write(
>         "# -*- coding: latin-1 -*-" + os.linesep +
>         "cs = '\xfcber-cool'" + os.linesep)
>     open('module_b.py','wb').write(
>         "# -*- coding: utf-8 -*-" + os.linesep +
>         "cs = '\xc3\xbcber-cool'" + os.linesep)
>     # show that we have two modules differing only in encoding:
>     print ''.join(line.decode('latin-1') for line in open('module_a.py'))
>     print ''.join(line.decode('utf-8') for line in open('module_b.py'))
>     # see how results are affected:
>     import module_a, module_b
>     print module_a.cs + ' =?= ' + module_b.cs
>     print module_a.cs.decode('latin-1') + ' =?= ' + module_b.cs.decode('utf-8')
> 
> if __name__ == '__main__':
>     test()
> ---------------------------------------------------
> The result copied from NT4 console to clipboard and pasted into eudora:
> __________________________________________________________
> 
> [17:39] C:\pywk\python-dev>py24 t_srcenc.py
> # -*- coding: latin-1 -*-
> cs = 'über-cool'
> 
> # -*- coding: utf-8 -*-
> cs = 'über-cool'
> 
> nber-cool =?= ++ber-cool
> über-cool =?= über-cool
> __________________________________________________________
> (I'd say NT did the best it could, rendering the copied cp437
> superscript n as the 'n' above, and the '++' coming from the
> cp437 box characters corresponding to the '\xc3\xbc'. Not sure
> how it will show on your screen, but try the program to see ;-)
>
>>Once a module is compiled, there's no distinction between
>>a module using the latin-1 source code encoding or one using
>>the utf-8 encoding.
> 
> ISTM module_a.cs and module_b.cs can readily be distinguished after
> compilation, whereas the sources displayed according to their declared
> encodings as above (or as e.g. different editors using different native
> encoding might) cannot (other than the encoding cookie itself) ;-)
> Perhaps you meant something else?

What your editor displays to you is not within the scope
of Python, e.g. if you open the files in Emacs you'll see
something different than in Notepad.

I guess that's the price you have to pay for being able to write
programs that can include Unicode literals using the complete range
of possible Unicode characters without having to revert to
escapes.

-- 
Marc-Andre Lemburg
eGenix.com


From fredrik at pythonware.com  Tue Oct 25 12:26:28 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 25 Oct 2005 12:26:28 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicitconversions).
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net><5.0.2.1.1.20051022120117.02b58da0@mail.oz.net><5.0.2.1.1.20051024133833.02b81cb0@mail.oz.net>
	<435E068A.70409@egenix.com>
Message-ID: <djl18k$l70$1@sea.gmane.org>

M.-A. Lemburg wrote:

> I don't follow you here. The source code encoding
> is only applied to Unicode literals (you are using string
> literals in your example). String literals are passed
> through as-is.

however, for Python 3000, it would be nice if the source-code encoding applied
to the *entire* file (XML-style), rather than just unicode string literals and
(hopefully) comments and docstrings.

</F> 




From mal at egenix.com  Tue Oct 25 13:31:50 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 25 Oct 2005 13:31:50 +0200
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicitconversions).
In-Reply-To: <djl18k$l70$1@sea.gmane.org>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net><5.0.2.1.1.20051022120117.02b58da0@mail.oz.net><5.0.2.1.1.20051024133833.02b81cb0@mail.oz.net>	<435E068A.70409@egenix.com>
	<djl18k$l70$1@sea.gmane.org>
Message-ID: <435E17A6.2000404@egenix.com>

Fredrik Lundh wrote:
> M.-A. Lemburg wrote:
> 
> 
>>I don't follow you here. The source code encoding
>>is only applied to Unicode literals (you are using string
>>literals in your example). String literals are passed
>>through as-is.
> 
> 
> however, for Python 3000, it would be nice if the source-code encoding applied
> to the *entire* file (XML-style), rather than just unicode string literals and
> (hopefully) comments and docstrings.

Actually, the encoding is applied to the complete source file:
the file is transcoded into UTF-8 and then parsed by the
Python parser.

Unicode literals are then decoded from the UTF-8 into Unicode.
String literals are transcoded back into the source code encoding,
thus making the (rather long due to technical constraints) round-trip
source code encoding -> Unicode -> UTF-8 -> Unicode -> source code encoding.
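
The round-trip can be sketched as follows. This only models the steps
with Python 3 `bytes` and `str`; it is not the parser's actual code, and
`round_trip` is a name made up for the example.

```python
def round_trip(raw_literal: bytes, source_encoding: str) -> bytes:
    """Follow a string literal's bytes through the transcoding steps."""
    # 1. source code encoding -> Unicode
    text = raw_literal.decode(source_encoding)
    # 2. Unicode -> UTF-8, the form handed to the parser
    utf8 = text.encode("utf-8")
    # 3. UTF-8 -> Unicode again when the literal is processed
    text_again = utf8.decode("utf-8")
    # 4. Unicode -> source code encoding: the bytes the string ends up with
    return text_again.encode(source_encoding)


# The same characters written in two differently encoded source files.
latin1_literal = b"\xfcber-cool"
utf8_literal = b"\xc3\xbcber-cool"
```

Each literal comes back as the bytes it started with, so the two
modules end up holding different byte strings for the same characters.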

Python 3k should have a fully Unicode based parser to reduce this
additional transcoding overhead.

Since Py3k will only have Unicode literals, the problems with
string literals will go away all by themselves :-)

-- 
Marc-Andre Lemburg
eGenix.com


From mal at egenix.com  Tue Oct 25 14:11:28 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 25 Oct 2005 14:11:28 +0200
Subject: [Python-Dev] New codecs checked in
In-Reply-To: <435DF660.1010800@egenix.com>
References: <435965E4.5050207@egenix.com>	<435B95C0.9060005@v.loewis.de>	<435CA2BA.7050900@livinglogic.de>	<435CA887.3020900@egenix.com>	<435CB4BB.6070009@livinglogic.de>	<435CCEE6.6020005@egenix.com>	<435D4B14.7060008@v.loewis.de>
	<435DF660.1010800@egenix.com>
Message-ID: <435E20F0.5040905@egenix.com>

M.-A. Lemburg wrote:
> Martin v. Löwis wrote:
> 
>>M.-A. Lemburg wrote:
>>
>>
>>
>>>I had to create three custom mapping files for cp1140, koi8-u
>>>and tis-620.
>>
>>
>>Can you please publish the files you have used somewhere? They
>>best go into the Python CVS.
> 
> 
> Sure; I'll check in the whole build machinery I'm using for this.

Done.

In order to rebuild the codecs, cd Tools/unicode; make
then check the codecs in the created build/ subdir (e.g.
using comparecodecs.py) and copy them over to the
Lib/encodings/ directory.

-- 
Marc-Andre Lemburg
eGenix.com


From ncoghlan at gmail.com  Tue Oct 25 16:27:49 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 26 Oct 2005 00:27:49 +1000
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <ca471dc20510240747v33cfd354qc104129d6f9f90a1@mail.gmail.com>
References: <435A4598.3060403@iinet.net.au>	
	<ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>	
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>	
	<435B597C.6040300@gmail.com>	
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>	
	<ca471dc20510231658qb8e14ddkb7956960cfa507d7@mail.gmail.com>	
	<435CCB1C.4030108@gmail.com>
	<ca471dc20510240747v33cfd354qc104129d6f9f90a1@mail.gmail.com>
Message-ID: <435E40E5.7070703@gmail.com>

Almost there - this is the only issue I have left on my list :)

Guido van Rossum wrote:
> On 10/24/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> However, those resolutions bring up the following issues:
>>
>>    5 a. What exception is raised when EXPR does not have a __context__ method?
>>      b.  What about when the returned object is missing __enter__ or __exit__?
>>     I suggest raising TypeError in both cases, for symmetry with for loops.
>>     The slot check is made in C code, so I don't see any difficulty in raising
>>     TypeError instead of AttributeError if the relevant slots aren't filled.
> 
> Why are you so keen on TypeError? I find AttributeError totally
> appropriate. I don't see symmetry with for-loops as a valuable
> property here. AttributeError and TypeError are often interchangeable
> anyway.

The reason I'm keen on TypeError is because 'abstract.c' uses it consistently
when it fails to find a method to support a requested protocol.

None of the abstract object methods currently raise AttributeError, and this
property is fairly visible at the Python level because the abstract API's are 
used to implement many of the bytecodes and various builtin functions. Both 
for loops and the iter function, for example, get their current exception 
behaviour from PyObject_GetIter and PyIter_Next.
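
The visible difference can be sketched from the Python level. This assumes a current CPython; the class and helper names are made up for illustration.

```python
class NoIteration:
    """A plain object with neither __iter__ nor __getitem__."""

def exception_name(func, *args):
    """Call func(*args) and return the name of the exception raised, if any."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__
    return None

def loop_over(obj):
    for _ in obj:
        pass

# Both the iter() builtin and the for statement go through the abstract API.
iter_error = exception_name(iter, NoIteration())
loop_error = exception_name(loop_over, NoIteration())
# A direct attribute lookup reports the same missing method differently.
attr_error = exception_name(getattr, NoIteration(), "__iter__")
```

Here iter_error and loop_error are both "TypeError", while attr_error is "AttributeError".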

Having had a look at mwh's patch, however, I've realised that going that way 
would only be possible if there were dedicated bytecodes for GET_CONTEXT, 
ENTER_CONTEXT and EXIT_CONTEXT (similar to the dedicated GET_ITER and FOR_ITER).

Leaving the exception as AttributeError means that level of bytecode hacking 
isn't necessary (mwh's patch just emits a fairly normal try/finally statement, 
although it still modifies the bytecode to include LOAD_EXIT_ARGS).

So, the inconsistency with other syntactic protocols still bothers me, but I 
can live with AttributeError if you don't want to add three new bytecodes just 
to support PEP 343.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From dave at boost-consulting.com  Tue Oct 25 17:21:56 2005
From: dave at boost-consulting.com (David Abrahams)
Date: Tue, 25 Oct 2005 11:21:56 -0400
Subject: [Python-Dev] MinGW and libpython24.a
Message-ID: <u3bmp62qz.fsf@boost-consulting.com>


Is the instruction at
http://www.python.org/dev/doc/devel/inst/tweak-flags.html#SECTION000622000000000000000
still relevant?  I am not 100% certain I didn't make one myself, but
it looks to me as though my Windows Python 2.4.1 distro came with a
libpython24.a.  I am asking here because it seems only the person who
prepares the installer would know.  If this is true, in which version
was it introduced?

Thanks,

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From guido at python.org  Tue Oct 25 18:14:29 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 25 Oct 2005 09:14:29 -0700
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <435E40E5.7070703@gmail.com>
References: <435A4598.3060403@iinet.net.au>
	<ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>
	<435B597C.6040300@gmail.com>
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
	<ca471dc20510231658qb8e14ddkb7956960cfa507d7@mail.gmail.com>
	<435CCB1C.4030108@gmail.com>
	<ca471dc20510240747v33cfd354qc104129d6f9f90a1@mail.gmail.com>
	<435E40E5.7070703@gmail.com>
Message-ID: <ca471dc20510250914x3c316fedyeea48f4568ffbb15@mail.gmail.com>

On 10/25/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Almost there - this is the only issue I have left on my list :)
[...]
> > Why are you so keen on TypeError? I find AttributeError totally
> > appropriate. I don't see symmetry with for-loops as a valuable
> > property here. AttributeError and TypeError are often interchangeable
> > anyway.
>
> The reason I'm keen on TypeError is because 'abstract.c' uses it consistently
> when it fails to find a method to support a requested protocol.

Hm. abstract.c well predates the new type system. Slots and methods
weren't really unified back then, so TypeError made obvious sense at
the time.

But with the new unified types/classes, those TypeErrors are really
just delayed (or precomputed? depends on your POV) AttributeErrors.

> None of the abstract object methods currently raise AttributeError, and this
> property is fairly visible at the Python level because the abstract API's are
> used to implement many of the bytecodes and various builtin functions. Both
> for loops and the iter function, for example, get their current exception
> behaviour from PyObject_GetIter and PyIter_Next.
>
> Having had a look at mwh's patch, however, I've realised that going that way
> would only be possible if there were dedicated bytecodes for GET_CONTEXT,
> ENTER_CONTEXT and EXIT_CONTEXT (similar to the dedicated GET_ITER and FOR_ITER).
>
> Leaving the exception as AttributeError means that level of bytecode hacking
> isn't necessary (mwh's patch just emits a fairly normal try/finally statement,
> although it still modifies the bytecode to include LOAD_EXIT_ARGS).

Let's definitely not introduce new bytecodes just so we can raise a
different exception.

> So, the inconsistency with other syntactic protocols still bothers me, but I
> can live with AttributeError if you don't want to add three new bytecodes just
> to support PEP 343.

I think the consistency you are seeking is a mirage. The TypeErrors
stem from the pre-computation of the slot population, not from some
requirements to raise TypeError for failing to implement some required
built-in protocol. I wouldn't hold it against other implementations of
Python if they raised AttributeError in more situations.

It is true though that AttributeError is somewhat special. There are
lots of places (perhaps too many?) where an operation is defined using
something like "if the object has attribute __foo__, use it, otherwise
use some other approach".  Some operations explicitly check for
AttributeError in their attribute check, and let a different exception
bubble up the stack. Presumably this is done so that a bug in
somebody's __getattr__ implementation doesn't get masked by the
"otherwise use some other approach" branch. But this is relatively
rare; most calls to PyObject_GetAttr just clear the error if they have
a different approach available. In any case, I don't see any of this
as supporting the position that TypeError is somehow more appropriate.
An AttributeError complaining about a missing __enter__, __exit__ or
__context__ method sounds just fine. (Oh, and please don't go checking
for the existence of __exit__ before calling __enter__. That kind of
bug is found with even the most cursory testing.)
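
The "use __foo__ if present, otherwise fall back" pattern, with only AttributeError swallowed, might look like this at the Python level. It is a sketch: __describe__ is a hypothetical hook name invented for the example, not a real protocol.

```python
def describe(obj):
    """Use obj.__describe__() if present, otherwise fall back to repr()."""
    try:
        method = obj.__describe__      # hypothetical hook, for illustration only
    except AttributeError:
        return repr(obj)               # the "other approach"
    return method()

class WithHook:
    def __describe__(self):
        return "hooked"

class Buggy:
    def __getattr__(self, name):
        # A bug that is not an AttributeError must not be masked by the fallback.
        raise ValueError("bug in __getattr__")
```

describe(42) falls back to repr, describe(WithHook()) uses the hook, and describe(Buggy()) lets the ValueError bubble up instead of silently taking the fallback branch.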

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Tue Oct 25 20:12:13 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 25 Oct 2005 20:12:13 +0200
Subject: [Python-Dev] New codecs checked in
In-Reply-To: <435E20F0.5040905@egenix.com>
References: <435965E4.5050207@egenix.com>	<435B95C0.9060005@v.loewis.de>	<435CA2BA.7050900@livinglogic.de>	<435CA887.3020900@egenix.com>	<435CB4BB.6070009@livinglogic.de>	<435CCEE6.6020005@egenix.com>	<435D4B14.7060008@v.loewis.de>	<435DF660.1010800@egenix.com>
	<435E20F0.5040905@egenix.com>
Message-ID: <435E757D.4030408@v.loewis.de>

M.-A. Lemburg wrote:

> Done.
> 
> In order to rebuild the codecs, cd Tools/unicode; make
> then check the codecs in the created build/ subdir (e.g.
> using comparecodecs.py) and copy them over to the
> Lib/encodings/ directory.

Thanks!

Martin

From martin at v.loewis.de  Tue Oct 25 20:16:45 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 25 Oct 2005 20:16:45 +0200
Subject: [Python-Dev] MinGW and libpython24.a
In-Reply-To: <u3bmp62qz.fsf@boost-consulting.com>
References: <u3bmp62qz.fsf@boost-consulting.com>
Message-ID: <435E768D.2000401@v.loewis.de>

David Abrahams wrote:
> Is the instruction at
> http://www.python.org/dev/doc/devel/inst/tweak-flags.html#SECTION000622000000000000000
> still relevant?  I am not 100% certain I didn't make one myself, but
> it looks to me as though my Windows Python 2.4.1 distro came with a
> libpython24.a.  I am asking here because it seems only the person who
> prepares the installer would know.

That impression might be incorrect: I can tell you when I started
including libpython24.a, but I have no clue whether the instructions
you refer to are correct - I don't use the file myself at all.

> If this is true, in which version was it introduced?

It was introduced in 1.20/1.16.2.4 of Tools/msi/msi.py in response to
patch #1088716; this in turn was first used to release r241c1.

Regards,
Martin

From martin at v.loewis.de  Tue Oct 25 20:18:24 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 25 Oct 2005 20:18:24 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicit	conversions).
In-Reply-To: <05Oct24.165105pdt."58617"@synergy1.parc.xerox.com>
References: <05Oct24.165105pdt."58617"@synergy1.parc.xerox.com>
Message-ID: <435E76F0.70001@v.loewis.de>

Bill Janssen wrote:
> I just got mail this morning from a researcher who wants exactly what
> Martin described, and wondered why the default MacPython 2.4.2 didn't
> provide it by default. :-)

If all he wants is to represent Deseret, he can do so in a 16-bit
Unicode type, too: Python supports UTF-16.
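
As a sketch of what that representation looks like (written for Python 3, and using the first code point of the Deseret block as an arbitrary sample), the character becomes a surrogate pair of two 16-bit units:

```python
deseret = "\U00010400"                  # first code point of the Deseret block
utf16 = deseret.encode("utf-16-be")     # big-endian, no byte order mark

# Split the encoded bytes into 16-bit code units.
units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]

# Decoding restores the original single character.
restored = utf16.decode("utf-16-be")
```

The two units are 0xD801 and 0xDC00, a high and a low surrogate.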

Regards,
Martin

From martin at v.loewis.de  Tue Oct 25 20:21:08 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 25 Oct 2005 20:21:08 +0200
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicitconversions).
In-Reply-To: <djl18k$l70$1@sea.gmane.org>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net><5.0.2.1.1.20051022120117.02b58da0@mail.oz.net><5.0.2.1.1.20051024133833.02b81cb0@mail.oz.net>	<435E068A.70409@egenix.com>
	<djl18k$l70$1@sea.gmane.org>
Message-ID: <435E7794.2030505@v.loewis.de>

Fredrik Lundh wrote:
> however, for Python 3000, it would be nice if the source-code encoding applied
> to the *entire* file (XML-style), rather than just unicode string literals and
> (hopefully) comments and docstrings.

As MAL explains, the encoding currently does apply to the entire file.
However, because of the Python syntax, you are restricted to ASCII
in many places, such as keywords, number literals, and (unfortunately)
identifiers. Lifting the restriction on identifiers is on my agenda.

Regards,
Martin

From jcarlson at uci.edu  Tue Oct 25 21:04:33 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 25 Oct 2005 12:04:33 -0700
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <435DFA62.4090802@gmail.com>
References: <20051024034957.38EB.JCARLSON@uci.edu> <435DFA62.4090802@gmail.com>
Message-ID: <20051025120225.3924.JCARLSON@uci.edu>


Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
> Josiah Carlson wrote:
> > Nick Coghlan <ncoghlan at gmail.com> wrote:
> >> I think having dicts and sets automatically invoke freeze would be a mistake, 
> >> because at least one of the following two cases would behave unexpectedly:
> > 
> > I'm pretty sure that the PEP was only asking if one would freeze the
> > contents of dicts IF the dict was being frozen.
> > 
> > That is, which of the following should be the case:
> >     freeze({1:[2,3,4]}) -> {1:[2,3,4]}
> >     freeze({1:[2,3,4]}) -> xdict(1=(2,3,4))
> 
> I believe the choices you intended are:
>       freeze({1:[2,3,4]}) -> imdict(1=[2,3,4])
>       freeze({1:[2,3,4]}) -> imdict(1=(2,3,4))
> 
> Regardless, that question makes a lot more sense (and looking at the PEP 
> again, I realised I simply read it wrong the first time).
> 
> For containers where equality depends on the contents of the container (i.e., 
> all the builtin ones), I don't see how it is possible to implement a sensible 
> hash function without freezing the contents as well - otherwise your immutable 
> isn't particularly immutable.
> 
> Consider what would happen if list "__freeze__" simply returned a tuple 
> version of itself - you have a __freeze__ method which returns a potentially 
> unhashable object!

I agree completely, hence my original statement on 10/23: "it is of my
opinion that a container which is frozen should have its contents frozen
as well."
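
The point about shallow versus deep freezing can be sketched as follows. The freeze() function here is hypothetical and is not the API proposed in PEP 351; a frozenset of (key, value) pairs stands in for an immutable dict.

```python
def freeze(obj):
    """Hypothetical recursive freeze, for illustration only."""
    if isinstance(obj, (list, tuple)):
        return tuple(freeze(item) for item in obj)
    if isinstance(obj, (set, frozenset)):
        return frozenset(freeze(item) for item in obj)
    if isinstance(obj, dict):
        # Stand-in for an immutable dict: frozen (key, value) pairs.
        return frozenset((key, freeze(value)) for key, value in obj.items())
    hash(obj)  # raises TypeError for any other unhashable object
    return obj

# Shallow conversion: a tuple that still contains a list is not hashable.
shallow = tuple([1, [2, 3]])
try:
    hash(shallow)
    shallow_hashable = True
except TypeError:
    shallow_hashable = False

# Deep conversion: the contents are frozen too, so the result can be hashed.
deep = freeze({1: [2, 3, 4]})
deep_hash = hash(deep)
```

The shallow tuple fails to hash, while the deeply frozen value can be used as a dict key or set member.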

 - Josiah


From dave at boost-consulting.com  Tue Oct 25 21:06:29 2005
From: dave at boost-consulting.com (David Abrahams)
Date: Tue, 25 Oct 2005 15:06:29 -0400
Subject: [Python-Dev] MinGW and libpython24.a
In-Reply-To: <435E768D.2000401@v.loewis.de> (Martin v. =?iso-8859-1?Q?L=F6?=
	=?iso-8859-1?Q?wis's?= message of
	"Tue, 25 Oct 2005 20:16:45 +0200")
References: <u3bmp62qz.fsf@boost-consulting.com> <435E768D.2000401@v.loewis.de>
Message-ID: <u7jc12z7u.fsf@boost-consulting.com>

"Martin v. Löwis" <martin at v.loewis.de> writes:

> David Abrahams wrote:
>> Is the instruction at
>> http://www.python.org/dev/doc/devel/inst/tweak-flags.html#SECTION000622000000000000000
>> still relevant?  I am not 100% certain I didn't make one myself, but
>> it looks to me as though my Windows Python 2.4.1 distro came with a
>> libpython24.a.  I am asking here because it seems only the person who
>> prepares the installer would know.
>
> That impression might be incorrect: I can tell you when I started
> including libpython24.a, but I have no clue whether the instructions
> you refer to are correct - I don't use the file myself at all.
>
>> If this is true, in which version was it introduced?
>
> It was introduced in 1.20/1.16.2.4 of Tools/msi/msi.py in response to
> patch #1088716; this in turn was first used to release r241c1.

Thanks!

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com

From jcarlson at uci.edu  Tue Oct 25 21:17:10 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 25 Oct 2005 12:17:10 -0700
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicitconversions).
In-Reply-To: <435E7794.2030505@v.loewis.de>
References: <djl18k$l70$1@sea.gmane.org> <435E7794.2030505@v.loewis.de>
Message-ID: <20051025120919.3927.JCARLSON@uci.edu>


"Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
> Fredrik Lundh wrote:
> > however, for Python 3000, it would be nice if the source-code encoding applied
> > to the *entire* file (XML-style), rather than just unicode string literals and
> > (hopefully) comments and docstrings.
> 
> As MAL explains, the encoding currently does apply to the entire file.
> However, because of the Python syntax, you are restricted to ASCII
> in many places, such as keywords, number literals, and (unfortunately)
> identifiers. Lifting the restriction on identifiers is on my agenda.

It seems that removing this restriction may cause serious issues, at
least when Cyrillic characters are used in names.  See the recent
security issues with web addresses in web browsers for the
confusion (and/or name errors) that could result from their use.

While I agree in principle that people should be able to use the
entirety of their own natural language when writing software in
programming languages, I think that it is an ugly can of worms that
perhaps shouldn't be opened.
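
The look-alike concern can be made concrete with a short sketch. It assumes Python 3, which did later permit non-ASCII identifiers; the variable names are chosen only for illustration.

```python
import unicodedata

latin_a = "a"
cyrillic_a = "\u0430"          # renders like "a" in many fonts

# The two characters compare unequal and have different Unicode names.
same_text = latin_a == cyrillic_a
names = (unicodedata.name(latin_a), unicodedata.name(cyrillic_a))

# Used as identifiers, they bind two separate variables.
namespace = {}
exec("a = 1\n\u0430 = 2", namespace)
distinct = (namespace["a"], namespace["\u0430"])
```

Mistyping one for the other produces either a NameError or a silently separate variable, which is the kind of confusion described above.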

 - Josiah


From eric.nieuwland at xs4all.nl  Tue Oct 25 22:05:18 2005
From: eric.nieuwland at xs4all.nl (Eric Nieuwland)
Date: Tue, 25 Oct 2005 22:05:18 +0200
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <ca471dc20510250914x3c316fedyeea48f4568ffbb15@mail.gmail.com>
References: <435A4598.3060403@iinet.net.au>
	<ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>
	<435B597C.6040300@gmail.com>
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
	<ca471dc20510231658qb8e14ddkb7956960cfa507d7@mail.gmail.com>
	<435CCB1C.4030108@gmail.com>
	<ca471dc20510240747v33cfd354qc104129d6f9f90a1@mail.gmail.com>
	<435E40E5.7070703@gmail.com>
	<ca471dc20510250914x3c316fedyeea48f4568ffbb15@mail.gmail.com>
Message-ID: <103348c72cae040229895842cb3c0cdc@xs4all.nl>

Guido van Rossum wrote:
> It is true though that AttributeError is somewhat special. There are
> lots of places (perhaps too many?) where an operation is defined using
> something like "if the object has attribute __foo__, use it, otherwise
> use some other approach".  Some operations explicitly check for
> AttributeError in their attribute check, and let a different exception
> bubble up the stack. Presumably this is done so that a bug in
> somebody's __getattr__ implementation doesn't get masked by the
> "otherwise use some other approach" branch. But this is relatively
> rare; most calls to PyObject_GetAttr just clear the error if they have
> a different approach available. In any case, I don't see any of this
> as supporting the position that TypeError is somehow more appropriate.
> An AttributeError complaining about a missing __enter__, __exit__ or
> __context__ method sounds just fine. (Oh, and please don't go checking
> for the existence of __exit__ before calling __enter__. That kind of
> bug is found with even the most cursory testing.)

Hmmm... Would it be reasonable to introduce a ProtocolError exception?

--eric


From guido at python.org  Tue Oct 25 22:11:23 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 25 Oct 2005 13:11:23 -0700
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <103348c72cae040229895842cb3c0cdc@xs4all.nl>
References: <435A4598.3060403@iinet.net.au>
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>
	<435B597C.6040300@gmail.com>
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
	<ca471dc20510231658qb8e14ddkb7956960cfa507d7@mail.gmail.com>
	<435CCB1C.4030108@gmail.com>
	<ca471dc20510240747v33cfd354qc104129d6f9f90a1@mail.gmail.com>
	<435E40E5.7070703@gmail.com>
	<ca471dc20510250914x3c316fedyeea48f4568ffbb15@mail.gmail.com>
	<103348c72cae040229895842cb3c0cdc@xs4all.nl>
Message-ID: <ca471dc20510251311h60d653d4y374a09a1ae473f18@mail.gmail.com>

On 10/25/05, Eric Nieuwland <eric.nieuwland at xs4all.nl> wrote:
> Hmmm... Would it be reasonable to introduce a ProtocolError exception?

And which perceived problem would that solve? The problem of Nick &
Guido disagreeing in public?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From eric.nieuwland at xs4all.nl  Tue Oct 25 22:22:33 2005
From: eric.nieuwland at xs4all.nl (Eric Nieuwland)
Date: Tue, 25 Oct 2005 22:22:33 +0200
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <ca471dc20510251311h60d653d4y374a09a1ae473f18@mail.gmail.com>
References: <435A4598.3060403@iinet.net.au>
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>
	<435B597C.6040300@gmail.com>
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
	<ca471dc20510231658qb8e14ddkb7956960cfa507d7@mail.gmail.com>
	<435CCB1C.4030108@gmail.com>
	<ca471dc20510240747v33cfd354qc104129d6f9f90a1@mail.gmail.com>
	<435E40E5.7070703@gmail.com>
	<ca471dc20510250914x3c316fedyeea48f4568ffbb15@mail.gmail.com>
	<103348c72cae040229895842cb3c0cdc@xs4all.nl>
	<ca471dc20510251311h60d653d4y374a09a1ae473f18@mail.gmail.com>
Message-ID: <0c1f3f308e9c605ef4687689c860913e@xs4all.nl>

Guido van Rossum wrote:

> On 10/25/05, Eric Nieuwland <eric.nieuwland at xs4all.nl> wrote:
>> Hmmm... Would it be reasonable to introduce a ProtocolError exception?
>
> And which perceived problem would that solve? The problem of Nick &
> Guido disagreeing in public?

;-)

No, that will go on in other fields, I guess.

It was meant to be a bit more informative about what is wrong.

ProtocolError: lacks __enter__ or __exit__

--eric


From guido at python.org  Tue Oct 25 22:35:14 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 25 Oct 2005 13:35:14 -0700
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <0c1f3f308e9c605ef4687689c860913e@xs4all.nl>
References: <435A4598.3060403@iinet.net.au>
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
	<ca471dc20510231658qb8e14ddkb7956960cfa507d7@mail.gmail.com>
	<435CCB1C.4030108@gmail.com>
	<ca471dc20510240747v33cfd354qc104129d6f9f90a1@mail.gmail.com>
	<435E40E5.7070703@gmail.com>
	<ca471dc20510250914x3c316fedyeea48f4568ffbb15@mail.gmail.com>
	<103348c72cae040229895842cb3c0cdc@xs4all.nl>
	<ca471dc20510251311h60d653d4y374a09a1ae473f18@mail.gmail.com>
	<0c1f3f308e9c605ef4687689c860913e@xs4all.nl>
Message-ID: <ca471dc20510251335p75627d61o6a22b22fc4a33c06@mail.gmail.com>

[Eric "are all your pets called Eric?" Nieuwland]
> >> Hmmm... Would it be reasonable to introduce a ProtocolError exception?

[Guido]
> > And which perceived problem would that solve?

[Eric]
> It was meant to be a bit more informative about what is wrong.
>
> ProtocolError: lacks __enter__ or __exit__

That's exactly what I'm trying to avoid. :)

I find "AttributeError: __exit__" just as informative. In either case,
if you know what __exit__ means, you'll know what you did wrong. And
if you don't know what it means, you'll have to look it up anyway. And
searching for ProtocolError doesn't do you any good -- you'll have to
learn about what __exit__ is and where it is required.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From p.f.moore at gmail.com  Tue Oct 25 22:40:30 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 25 Oct 2005 21:40:30 +0100
Subject: [Python-Dev] PEP 343 - multiple context managers in one statement
Message-ID: <79990c6b0510251340s6adb7fbcpac5247886a171c3f@mail.gmail.com>

I have a deep suspicion that this has been done to death already, but
my searching ability isn't up to finding the reference. So I'll simply
ask the question, and not offer a long discussion:

Has the option of letting the with statement admit multiple context
managers been considered (and presumably rejected)?

I'm thinking of

    with expr1, expr2, expr3:
        # whatever

In some ways, this doesn't even need an extension to the PEP - giving
tuples suitable __enter__ and __exit__ methods would do it. Or, I
suppose a user-defined manager which combined a list of others:

    class combining:
        def __init__(self, *mgrs):
            self.mgrs = mgrs
        def __with__(self):
            return self
        def __enter__(self):
            return tuple(mgr.__enter__() for mgr in self.mgrs)
        def __exit__(self, type, value, tb):
            # first in, last out
            for mgr in reversed(self.mgrs):
                mgr.__exit__(type, value, tb)

Would that be worth using as an example in the PEP?
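
For comparison, here is a runnable sketch of the same idea against the with statement as it exists in current Python, where the manager's own __enter__ and __exit__ are called directly and there is no __with__ step. Combining and Tracer are illustrative names, not part of the PEP.

```python
class Combining:
    """Enter several context managers in order, exit them in reverse."""
    def __init__(self, *mgrs):
        self.mgrs = mgrs

    def __enter__(self):
        return tuple(mgr.__enter__() for mgr in self.mgrs)

    def __exit__(self, exc_type, exc_value, tb):
        # first in, last out
        for mgr in reversed(self.mgrs):
            mgr.__exit__(exc_type, exc_value, tb)
        return False  # never suppress exceptions in this sketch

class Tracer:
    """Toy manager that records when it is entered and exited."""
    def __init__(self, name, log):
        self.name, self.log = name, log

    def __enter__(self):
        self.log.append("enter " + self.name)
        return self.name

    def __exit__(self, *exc_info):
        self.log.append("exit " + self.name)
        return False

log = []
with Combining(Tracer("a", log), Tracer("b", log)) as entered:
    log.append("body")
```

This sketch has a known gap: if a later __enter__ raises, the managers already entered are never exited. In current Python, contextlib.ExitStack handles that case.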

Sorry - it got a bit long anyway...

Paul.

PS The signature of __with__ in example 4 in the PEP is wrong - it has
an incorrect "lock" parameter.

From fwierzbicki at gmail.com  Tue Oct 25 22:48:05 2005
From: fwierzbicki at gmail.com (Frank Wierzbicki)
Date: Tue, 25 Oct 2005 16:48:05 -0400
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <ee2a432c0510201932g147e36belf3b85e39be8e817e@mail.gmail.com>
References: <200510211202.12015.anthony@interlink.com.au>
	<ee2a432c0510201932g147e36belf3b85e39be8e817e@mail.gmail.com>
Message-ID: <4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com>

On 10/20/05, Neal Norwitz <nnorwitz at gmail.com> wrote:
>
> The Grammar is (was at one point at least) shared between Jython and
> would allow more tools to be able to share infrastructure. The idea
> is to eventually be able to have [JP]ython output the same AST to
> tools.


Hello Python-dev,

My name is Frank Wierzbicki and I'm working on the Jython project. Does
anyone on this list know more about the history of this Grammar sharing
between the two projects? I've heard about some Grammar sharing between
Jython and Python, and I've noticed that (most of) the jython code in
/org/python/parser/ast is commented "Autogenerated AST node". I would
definitely like to look at (eventually) coordinating with this effort.

I've cross-posted to the Jython-dev list in case someone there has some
insight.

Thanks,
Frank

From guido at python.org  Tue Oct 25 22:57:02 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 25 Oct 2005 13:57:02 -0700
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com>
References: <200510211202.12015.anthony@interlink.com.au>
	<ee2a432c0510201932g147e36belf3b85e39be8e817e@mail.gmail.com>
	<4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com>
Message-ID: <ca471dc20510251357k4cf14121je044756a00af32ed@mail.gmail.com>

On 10/25/05, Frank Wierzbicki <fwierzbicki at gmail.com> wrote:
>  My name is Frank Wierzbicki and I'm working on the Jython project.  Does
> anyone on this list know more about the history of this Grammar sharing
> between the two projects?  I've heard about some Grammar sharing between
> Jython and Python, and I've noticed that (most of) the jython code in
> /org/python/parser/ast is commented "Autogenerated AST node".  I would
> definitely like to look at (eventually) coordinating with this effort.
>
>  I've cross-posted to the Jython-dev list in case someone there has some
> insight.

Your best bet is to track down Jim Hugunin and see if he remembers.
He's jimhug at microsoft.com or jim at hugunin.net.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From janssen at parc.com  Tue Oct 25 23:02:26 2005
From: janssen at parc.com (Bill Janssen)
Date: Tue, 25 Oct 2005 14:02:26 PDT
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: Your message of "Tue, 25 Oct 2005 11:18:24 PDT."
	<435E76F0.70001@v.loewis.de> 
Message-ID: <05Oct25.140234pdt."58617"@synergy1.parc.xerox.com>

I think he was more interested in the invariant Martin proposed, that

 len("\U00010000")

should always be the same and should always be 1.
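
A sketch of that invariant, assuming Python 3.3 or later, where len() counts code points on every build (narrow builds of the 2.x era reported 2 here):

```python
s = "\U00010000"

# Number of code points, independent of the internal representation.
code_points = len(s)

# Number of 16-bit units the same character needs in UTF-16.
utf16_units = len(s.encode("utf-16-le")) // 2
```

code_points is 1 and utf16_units is 2.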

Bill

From guido at python.org  Tue Oct 25 23:04:09 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 25 Oct 2005 14:04:09 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <-6904385769530255077@unknownmsgid>
References: <435E76F0.70001@v.loewis.de> <-6904385769530255077@unknownmsgid>
Message-ID: <ca471dc20510251404m6bed8520wce7796c41a0212f5@mail.gmail.com>

On 10/25/05, Bill Janssen <janssen at parc.com> wrote:
> I think he was more interested in the invariant Martin proposed, that
>
>  len("\U00010000")
>
> should always be the same and should always be 1.

Yes but why? What does this invariant do for him?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pedronis at strakt.com  Tue Oct 25 23:08:43 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Tue, 25 Oct 2005 23:08:43 +0200
Subject: [Python-Dev] [Jython-dev] Re:  AST branch is in?
In-Reply-To: <4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com>
References: <200510211202.12015.anthony@interlink.com.au>	
	<ee2a432c0510201932g147e36belf3b85e39be8e817e@mail.gmail.com>
	<4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com>
Message-ID: <435E9EDB.90308@strakt.com>

Frank Wierzbicki wrote:
> On 10/20/05, *Neal Norwitz* <nnorwitz at gmail.com 
> <mailto:nnorwitz at gmail.com>> wrote:
> 
>     The Grammar is (was at one point at least) shared between Jython and
>     would allow more tools to be able to share infrastructure.  The idea
>     is to eventually be able to have [JP]ython output the same AST to
>     tools.
> 
> 
> Hello Python-dev,
> 
> My name is Frank Wierzbicki and I'm working on the Jython project.  Does 
> anyone on this list know more about the history of this Grammar sharing 
> between the two projects?  I've heard about some Grammar sharing between 
> Jython and Python, and I've noticed that (most of) the jython code in 
> /org/python/parser/ast is commented "Autogenerated AST node".  I would 
> definitely like to look at (eventually) coordinating with this effort.
> 
> I've cross-posted to the Jython-dev list in case someone there has some 
> insight.

As far as I understand, the Python trunk now contains generated C code for
the AST representation, created by the asdl_c.py script from an updated
Python.asdl. These files live in

http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Parser/

A parallel asdl_java.py existed in the Python CVS sandbox (where the
AST effort started). It was last updated when Jython's own AST classes
were generated from the then-current version of Python.asdl (I did this,
if I remember correctly, at some point in the Jython 2.2 evolution, I
think when the PyDev guys wanted a more up-to-date Jython parser to reuse):

http://cvs.sourceforge.net/viewcvs.py/*checkout*/python/python/nondist/sandbox/ast/asdl_java.py?content-type=text%2Fplain&rev=1.7

Basically, the new Python.asdl needs to be used, asdl_java.py possibly
updated, and our compiler changed as necessary.
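
For readers who want to see what ASDL-generated node classes look like, current CPython exposes them through the standard ast module. This is a sketch using today's interpreter, not the 2005 trunk or the Jython classes discussed here.

```python
import ast

# Parse a small statement into a tree of generated node classes.
tree = ast.parse("x = 1 + 2")
assign = tree.body[0]

# Walk the tree and collect the class name of every node.
node_names = [type(node).__name__ for node in ast.walk(tree)]
```

The root is a Module node, the statement is an Assign node, and the expression on its right-hand side is a BinOp node.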

Regards.

From martin at v.loewis.de  Tue Oct 25 23:05:28 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 25 Oct 2005 23:05:28 +0200
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicitconversions).
In-Reply-To: <20051025120919.3927.JCARLSON@uci.edu>
References: <djl18k$l70$1@sea.gmane.org> <435E7794.2030505@v.loewis.de>
	<20051025120919.3927.JCARLSON@uci.edu>
Message-ID: <435E9E18.7090502@v.loewis.de>

Josiah Carlson wrote:
> It seems that removing this restriction may cause serious issues, at
> least when Cyrillic characters are used in names.  See the recent
> security issues with web addresses in web browsers for the
> confusion (and/or name errors) that could result from their use.

That impression is deceiving. We are talking about source code here;
people type in identifiers explicitly rather than receiving them
through linking, and they scope identifiers (by module or object).

If somebody manages to get look-alike identifiers into your Python
libraries, you have bigger problems than these look-alikes: anybody
capable of doing so could just as well replace the real thing in
the first place.

As always in computer security: define your threat model before
reasoning about the risks.

Regards,
Martin

From pedronis at strakt.com  Tue Oct 25 23:11:32 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Tue, 25 Oct 2005 23:11:32 +0200
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <ca471dc20510251357k4cf14121je044756a00af32ed@mail.gmail.com>
References: <200510211202.12015.anthony@interlink.com.au>	<ee2a432c0510201932g147e36belf3b85e39be8e817e@mail.gmail.com>	<4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com>
	<ca471dc20510251357k4cf14121je044756a00af32ed@mail.gmail.com>
Message-ID: <435E9F84.2010205@strakt.com>

Guido van Rossum wrote:
> On 10/25/05, Frank Wierzbicki <fwierzbicki at gmail.com> wrote:
> 
>> My name is Frank Wierzbicki and I'm working on the Jython project.  Does
>>anyone on this list know more about the history of this Grammar sharing
>>between the two projects?  I've heard about some Grammar sharing between
>>Jython and Python, and I've noticed that (most of) the jython code in
>>/org/python/parser/ast is commented "Autogenerated AST node".  I would
>>definitely like to look at (eventually) coordinating with this effort.
>>
>> I've cross-posted to the Jython-dev list in case someone there has some
>>insight.
> 
> 
> Your best bet is to track down Jim Hugunin and see if he remembers.
> He's jimhug at microsoft.com or jim at hugunin.net.
> 

No, this all came after Jim. It is indeed derived from CPython's own
AST effort; we just started using it quite a while ago. Jim was no
longer involved with Jython by then. Finn Bock started this.

From guido at python.org  Tue Oct 25 23:13:38 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 25 Oct 2005 14:13:38 -0700
Subject: [Python-Dev] AST branch is in?
In-Reply-To: <435E9F84.2010205@strakt.com>
References: <200510211202.12015.anthony@interlink.com.au>
	<ee2a432c0510201932g147e36belf3b85e39be8e817e@mail.gmail.com>
	<4dab5f760510251348u7b90f7b4if71624f702a54ea2@mail.gmail.com>
	<ca471dc20510251357k4cf14121je044756a00af32ed@mail.gmail.com>
	<435E9F84.2010205@strakt.com>
Message-ID: <ca471dc20510251413p2e635258k7b16926e7527b88c@mail.gmail.com>

On 10/25/05, Samuele Pedroni <pedronis at strakt.com> wrote:
> > Your best bet is to track down Jim Hugunin and see if he remembers.
> > He's jimhug at microsoft.com or jim at hugunin.net.

> No, this all came after Jim. It is indeed derived from CPython's own
> AST effort; we just started using it quite a while ago. Jim was no
> longer involved with Jython by then. Finn Bock started this.

Oops! Sorry for the misinformation. Shows how much I know. :(

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Tue Oct 25 23:21:43 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 25 Oct 2005 23:21:43 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <ca471dc20510251404m6bed8520wce7796c41a0212f5@mail.gmail.com>
References: <435E76F0.70001@v.loewis.de> <-6904385769530255077@unknownmsgid>
	<ca471dc20510251404m6bed8520wce7796c41a0212f5@mail.gmail.com>
Message-ID: <435EA1E7.9020006@v.loewis.de>

Guido van Rossum wrote:
> Yes but why? What does this invariant do for him?

I don't know about this person, but there are a few things that
don't work properly in UTF-16 mode:

- the Unicode character database fails to look things up.
   u"\U0001D670".isupper() gives false, but should give true
   (since it denotes MATHEMATICAL MONOSPACE CAPITAL A).
   It gives true in UCS-4 mode
- As a result, normalization on these doesn't work, either.
   It should normalize to "LATIN CAPITAL LETTER A" under
   NFKC, but doesn't.
- regular expressions only have limited support. In
   particular, adding non-BMP characters to character classes
   is not possible. [\U0001D670] will match any character
   that is either \uD835 or \uDE70, whereas it only matches
   MATHEMATICAL MONOSPACE CAPITAL A in UCS-4 mode.

There might be more limitations, but those are the ones that
come to mind easily. While I could imagine fixing the first
two with some effort, the third one is really tricky (unless
you would accept a "wide" representation of a character
class even if the Unicode representation is only narrow).
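
The three points above can be checked with a short script. This is only
a sketch in Python 3 syntax, which postdates this thread; it assumes an
interpreter whose strings index by code point (the behaviour described
above for UCS-4 mode). The helper name describe() is made up for
illustration.

```python
import re
import unicodedata

CH = "\U0001D670"  # MATHEMATICAL MONOSPACE CAPITAL A, outside the BMP


def describe(ch):
    """Collect the properties discussed above for a single character."""
    char_class = "[" + ch + "]"
    return {
        # Point 1: character database lookups.
        "name": unicodedata.name(ch),
        "isupper": ch.isupper(),
        # Point 2: compatibility normalization.
        "nfkc": unicodedata.normalize("NFKC", ch),
        # Point 3: a character class holding one non-BMP character
        # should match that character, not one of its UTF-16 halves.
        "class_matches_char": re.fullmatch(char_class, ch) is not None,
        "class_matches_lone_surrogate":
            re.fullmatch(char_class, "\ud835") is not None,
    }


if __name__ == "__main__":
    for key, value in describe(CH).items():
        print(key, ascii(value))
```

On a build that stores strings as UTF-16 code units, the same checks
would be expected to go wrong in the ways listed above.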

Regards,
Martin

From jcarlson at uci.edu  Tue Oct 25 23:40:06 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 25 Oct 2005 14:40:06 -0700
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicitconversions).
In-Reply-To: <435E9E18.7090502@v.loewis.de>
References: <20051025120919.3927.JCARLSON@uci.edu>
	<435E9E18.7090502@v.loewis.de>
Message-ID: <20051025142245.3930.JCARLSON@uci.edu>


"Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
> Josiah Carlson wrote:
> > It seems that removing this restriction may cause serious issues, at
> > least in the case when using cyrillic characters in names.  See recent
> > security issues in regards to web addresses in web browsers for the
> > confusion (and/or name errors) that could result in their use.
> 
> That impression is deceiving. We are talking about source code here;
> people type in identifiers explicitly rather than receiving them
> through linking, and they scope identifiers (by module or object).
> 
> If somebody manages to get look-alike identifiers into your Python
> libraries, you have bigger problems than these look-alikes: anybody
> capable of doing so could just as well replace the real thing in
> the first place.
> 
> As always in computer security: define your threat model before
> reasoning about the risks.

I should have been more explicit.  I did not mean to imply that I was
concerned about the security implications of inserting arbitrary
identifiers in Python (I was mentioning the web browser case for
an example of how such characters have been confusing previously), I am
concerned about confusion involved with using:
    Greek Capital: Alpha, Beta, Epsilon, Zeta, Eta, Iota, Kappa, Mu, Nu,
Omicron, Rho, and Tau.
    Cyrillic Capital: Dze, Je, A, Ve, Ie, Em, En, O, Er, Es, Te, Ha, ...

And how users could say, "name error? But I typed in window.draw(PEN) as
I was told to, and it didn't work!"
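
To illustrate the kind of confusion meant here, the following sketch
(Python 3 syntax, which postdates this thread) builds a look-alike of
"PEN" from a Cyrillic and a Greek capital. The LOOKALIKE spelling and
the namespace dictionary are made up for illustration; they are not
taken from any real API.

```python
import unicodedata

LATIN_PEN = "PEN"
# Hypothetical look-alike: Cyrillic ER, Greek EPSILON, Latin N.
LOOKALIKE = "\u0420\u0395N"


def code_points(text):
    """Return (code point, Unicode name) pairs for each character."""
    return [("U+%04X" % ord(ch), unicodedata.name(ch)) for ch in text]


# A stand-in for a module namespace that defines only the Latin name.
namespace = {LATIN_PEN: "a pen object"}

if __name__ == "__main__":
    print(code_points(LATIN_PEN))
    print(code_points(LOOKALIKE))
    # The two strings may render alike, yet they are different keys.
    print(LATIN_PEN == LOOKALIKE, LOOKALIKE in namespace)
```

The two spellings compare unequal, so a lookup with the look-alike
fails even though the source text on screen may appear identical.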


Identically drawn glyphs are a problem, and pretending that they aren't
a problem, doesn't make it so.  Right now, all possible name glyphs are
visually distinct, which would not be the case if any unicode character
could be used as a name (except for numerals).  Speaking of which, would
we then be offering support for arabic/indic numeric literals, and/or
support it in int()/float()?  Ideally I would like to say yes, but I
could see the confusion if such were allowed.

 - Josiah


From phil at riverbankcomputing.co.uk  Tue Oct 25 23:45:02 2005
From: phil at riverbankcomputing.co.uk (Phil Thompson)
Date: Tue, 25 Oct 2005 22:45:02 +0100
Subject: [Python-Dev] Inconsistent Use of Buffer Interface in
	stringobject.c
In-Reply-To: <ca471dc20510241139j1e3196fdy443c066739048d1f@mail.gmail.com>
References: <200510241818.41145.phil@riverbankcomputing.co.uk>
	<435D28B6.9010806@egenix.com>
	<ca471dc20510241139j1e3196fdy443c066739048d1f@mail.gmail.com>
Message-ID: <200510252245.02871.phil@riverbankcomputing.co.uk>

On Monday 24 October 2005 7:39 pm, Guido van Rossum wrote:
> On 10/24/05, M.-A. Lemburg <mal at egenix.com> wrote:
> > Guido van Rossum wrote:
> > > A concern I'd have with fixing this is that Unicode objects also
> > > support the buffer API. In any situation where either str or unicode
> > > is accepted I'd be reluctant to guess whether a buffer object was
> > > meant to be str-like or Unicode-like. I think this covers all the
> > > cases you mention here.
> >
> > This situation is a little better than that: the buffer
> > interface has a slot called getcharbuffer which is what
> > the string methods use in case they find that a string
> > argument is not of type str or unicode.
>
> I stand corrected!
>
> > As a first step, I'd suggest implementing the getcharbuffer
> > slot. That will already go a long way.
>
> Phil, if anything still doesn't work after doing what Marc-Andre says,
> those would be good candidates for fixes!

The patch is now on SF, #1337876.

Phil

From mal at egenix.com  Tue Oct 25 23:47:23 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 25 Oct 2005 23:47:23 +0200
Subject: [Python-Dev] Divorcing str and unicode
	(no	more	implicitconversions).
In-Reply-To: <20051025120919.3927.JCARLSON@uci.edu>
References: <djl18k$l70$1@sea.gmane.org> <435E7794.2030505@v.loewis.de>
	<20051025120919.3927.JCARLSON@uci.edu>
Message-ID: <435EA7EB.90100@egenix.com>

Josiah Carlson wrote:
> "Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
>>Fredrik Lundh wrote:
>>
>>>however, for Python 3000, it would be nice if the source-code encoding applied
>>>to the *entire* file (XML-style), rather than just unicode string literals and (hope-
>>>fully) comments and docstrings.
>>
>>As MAL explains, the encoding currently does apply to the entire file.
>>However, because of the Python syntax, you are restricted to ASCII
>>in many places, such as keywords, number literals, and (unfortunately)
>>identifiers. Lifting the restriction on identifiers is on my agenda.
> 
> 
> It seems that removing this restriction may cause serious issues, at
> least in the case when using cyrillic characters in names.  See recent
> security issues in regards to web addresses in web browsers for the
> confusion (and/or name errors) that could result in their use.
> 
> While I agree in principle that people should be able to use the
> entirety of one's own natural language in writing software in
> programming languages, I think that it is an ugly can of worms that
> perhaps shouldn't be opened.

I agree with Josiah.

A few years ago we had a discussion about this on python-dev
and agreed to stick with ASCII identifiers for Python. I still
think that's the right way to go.

-- 
Marc-Andre Lemburg
eGenix.com


From guido at python.org  Tue Oct 25 23:47:42 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 25 Oct 2005 14:47:42 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicitconversions).
In-Reply-To: <20051025142245.3930.JCARLSON@uci.edu>
References: <20051025120919.3927.JCARLSON@uci.edu>
	<435E9E18.7090502@v.loewis.de> <20051025142245.3930.JCARLSON@uci.edu>
Message-ID: <ca471dc20510251447g7a2baf36g2f0f75bccfbbba26@mail.gmail.com>

On 10/25/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> Identically drawn glyphs are a problem, and pretending that they aren't
> a problem, doesn't make it so.  Right now, all possible name glyphs are
> visually distinct, which would not be the case if any unicode character
> could be used as a name (except for numerals).  Speaking of which, would
> we then be offering support for arabic/indic numeric literals, and/or
> support it in int()/float()?  Ideally I would like to say yes, but I
> could see the confusion if such were allowed.

This problem isn't new. There are plenty of fonts where 1 and l are
hard to distinguish, or l and I for that matter, or O and 0.

Yes, we need better tools to diagnose this.

No, we shouldn't let this stop us from adding such a feature if it is
otherwise a good feature.

I'm not so sure about this for other reasons -- it hampers code
sharing, and as soon as you add right-to-left character sets to the
mix (or top-to-bottom, for that matter), displaying source code is
going to be near impossible for most tools (since the keywords and
standard library module names will still be in the Latin alphabet).
This actually seems a killer even for allowing Unicode in comments,
which I'd otherwise favor. What do Unicode-aware apps generally do
with right-to-left characters?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Tue Oct 25 23:55:27 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 25 Oct 2005 23:55:27 +0200
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicitconversions).
In-Reply-To: <20051025142245.3930.JCARLSON@uci.edu>
References: <20051025120919.3927.JCARLSON@uci.edu>
	<435E9E18.7090502@v.loewis.de>
	<20051025142245.3930.JCARLSON@uci.edu>
Message-ID: <435EA9CF.6060305@v.loewis.de>

Josiah Carlson wrote:
> And how users could say, "name error? But I typed in window.draw(PEN) as
> I was told to, and it didn't work!"

Ah, so the "serious issues" you are talking about are not security 
issues, but usability issues.

I don't think extending the range of acceptable characters will
cause any additional confusion. Users are already getting "surprising"
NameErrors/AttributeErrors in the following cases:
- they just misspell the identifier, and then, when the error message
   is printed, fail to recognize the difference, as they read over the
   typo just like they read over it when mistyping it in the first place.

- they run into confusions with different things having the same names
   in different contexts. For example, they wonder why they get TypeError
   for passing the wrong number of arguments to a function, when the
   call matches exactly what the source code in front of them tells
   them - only that they were calling a different function which just
   happened to have the same name.

In the light of these common mistakes, your example with an identifier
named PEN, where the "P" might be a cyrillic letter or the E a greek one
is just made up: For window.draw, people will readily understand that
they are supposed to use Latin letters. More generally, they will know
what script to use just from looking at the identifier.

> Identically drawn glyphs are a problem, and pretending that they aren't
> a problem, doesn't make it so.  Right now, all possible name glyphs are
> visually distinct

Not at all: Just compare Fool and Foo1 (and perhaps FooI)


In the font in which I'm typing this, these are slightly different - but
there are fonts in which the difference is really difficult to
recognize.

> Speaking of which, would
> we then be offering support for arabic/indic numeric literals, and/or
> support it in int()/float()?

No. None of the Arabic users have ever requested such a feature, so
it would be stupid to provide it. We provide extended identifiers not
for the fun of it, but because users are requesting them.

Regards,
Martin

From ncoghlan at gmail.com  Tue Oct 25 23:55:15 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 26 Oct 2005 07:55:15 +1000
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <ca471dc20510250914x3c316fedyeea48f4568ffbb15@mail.gmail.com>
References: <435A4598.3060403@iinet.net.au>	
	<ca471dc20510221022s6e75c5a0y591271cd7a950109@mail.gmail.com>	
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>	
	<435B597C.6040300@gmail.com>	
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>	
	<ca471dc20510231658qb8e14ddkb7956960cfa507d7@mail.gmail.com>	
	<435CCB1C.4030108@gmail.com>	
	<ca471dc20510240747v33cfd354qc104129d6f9f90a1@mail.gmail.com>	
	<435E40E5.7070703@gmail.com>
	<ca471dc20510250914x3c316fedyeea48f4568ffbb15@mail.gmail.com>
Message-ID: <435EA9C3.9030600@gmail.com>

Guido van Rossum wrote:
> On 10/25/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> Almost there - this is the only issue I have left on my list :)
> [...]
>>> Why are you so keen on TypeError? I find AttributeError totally
>>> appropriate. I don't see symmetry with for-loops as a valuable
>>> property here. AttributeError and TypeError are often interchangeable
>>> anyway.
>> The reason I'm keen on TypeError is because 'abstract.c' uses it consistently
>> when it fails to find a method to support a requested protocol.
> 
> Hm. abstract.c well predates the new type system. Slots and methods
> weren't really unified back then, so TypeError made obvious sense at
> the time.

Ah, I hadn't considered that, because I never made significant use of any 
Python versions before 2.2.

Maybe there's a design principle in there somewhere:

   Failed duck-typing -> AttributeError (or TypeError for complex checks)
   Failed instance or subtype check -> TypeError

Most of the functions in abstract.c handle complex protocols, so a simple 
attribute error wouldn't convey the necessary meaning. The context protocol, 
on the other hand, is fairly simple, and an AttributeError tells you 
everything you really need to know.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From martin at v.loewis.de  Tue Oct 25 23:56:06 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 25 Oct 2005 23:56:06 +0200
Subject: [Python-Dev] Divorcing str and unicode
	(no	more	implicitconversions).
In-Reply-To: <435EA7EB.90100@egenix.com>
References: <djl18k$l70$1@sea.gmane.org> <435E7794.2030505@v.loewis.de>
	<20051025120919.3927.JCARLSON@uci.edu> <435EA7EB.90100@egenix.com>
Message-ID: <435EA9F6.3030702@v.loewis.de>

M.-A. Lemburg wrote:
> A few years ago we had a discussion about this on python-dev
> and agreed to stick with ASCII identifiers for Python. I still
> think that's the right way to go.

I don't think there ever was such an agreement.

Regards,
Martin


From guido at python.org  Wed Oct 26 00:05:25 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 25 Oct 2005 15:05:25 -0700
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <435EA9C3.9030600@gmail.com>
References: <435A4598.3060403@iinet.net.au>
	<ca471dc20510221414m4df50f3ft1fceb3eaa024a7b8@mail.gmail.com>
	<435B597C.6040300@gmail.com>
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
	<ca471dc20510231658qb8e14ddkb7956960cfa507d7@mail.gmail.com>
	<435CCB1C.4030108@gmail.com>
	<ca471dc20510240747v33cfd354qc104129d6f9f90a1@mail.gmail.com>
	<435E40E5.7070703@gmail.com>
	<ca471dc20510250914x3c316fedyeea48f4568ffbb15@mail.gmail.com>
	<435EA9C3.9030600@gmail.com>
Message-ID: <ca471dc20510251505w69d07cd3q2875de35f6a62fa0@mail.gmail.com>

On 10/25/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Maybe there's a design principle in there somewhere:
>
>    Failed duck-typing -> AttributeError (or TypeError for complex checks)
>    Failed instance or subtype check -> TypeError

Doesn't convince me.

If there are principles at work here (and not just coincidences), they
are (a) don't lightly replace an exception by another, and (b) don't
raise AttributeError; the getattr operation raises it for you. (a) says
that we should let the AttributeError bubble up in the case of the
with-statement; (b) explains why you see TypeError when a slot isn't
filled.
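
Principles (a) and (b) can be sketched in a few lines. The code below is
Python 3 syntax (which postdates this thread) and run_with() is a
hypothetical, simplified stand-in for a with-statement, not the actual
interpreter logic: it looks the methods up on the instance and makes no
claim about what CPython really does.

```python
def run_with(manager, body):
    """Call body(value) inside manager, roughly like a with-statement."""
    # (b) Plain attribute lookups: if a method is missing, getattr
    # raises AttributeError by itself.
    # (a) We do not catch and replace it; it simply bubbles up.
    enter = getattr(manager, "__enter__")
    exit_ = getattr(manager, "__exit__")

    value = enter()
    try:
        result = body(value)
    except BaseException as exc:
        # A true return value from __exit__ swallows the exception.
        if not exit_(type(exc), exc, exc.__traceback__):
            raise
        return None
    exit_(None, None, None)
    return result


class Answer:
    """Tiny example manager used only to exercise run_with()."""

    def __enter__(self):
        return 41

    def __exit__(self, exc_type, exc, tb):
        return False


if __name__ == "__main__":
    print(run_with(Answer(), lambda v: v + 1))
    try:
        run_with(object(), lambda v: v)
    except AttributeError as err:
        print("AttributeError:", err)
```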

> Most of the functions in abstract.c handle complex protocols, so a simple
> attribute error wouldn't convey the necessary meaning. The context protocol,
> on the other hand, is fairly simple, and an AttributeError tells you
> everything you really need to know.

That's what I've been saying all the time. :-)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Wed Oct 26 00:11:36 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 26 Oct 2005 00:11:36 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicitconversions).
In-Reply-To: <ca471dc20510251447g7a2baf36g2f0f75bccfbbba26@mail.gmail.com>
References: <20051025120919.3927.JCARLSON@uci.edu>	
	<435E9E18.7090502@v.loewis.de>
	<20051025142245.3930.JCARLSON@uci.edu>
	<ca471dc20510251447g7a2baf36g2f0f75bccfbbba26@mail.gmail.com>
Message-ID: <435EAD98.6000401@v.loewis.de>

Guido van Rossum wrote:
> This actually seems a killer even for allowing Unicode in comments,
> which I'd otherwise favor. What do Unicode-aware apps generally do
> with right-to-left characters?

The Unicode standard has an elaborate definition of what should happen.
There are many rules to it, but essentially, there is the notion of a
"primary" direction, which then is toggled based on the directionality
of each character (unicodedata.bidirectional). There are also formatting
characters which toggle the direction.
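
The per-character directionality mentioned above can be inspected
directly. A small sketch in Python 3 syntax (which postdates this
thread); the SAMPLES table and the bidi_classes() helper are made up
for illustration, only unicodedata.bidirectional comes from the
standard library.

```python
import unicodedata

# A few characters with different bidirectional behaviour.
SAMPLES = {
    "Latin a": "a",
    "Hebrew alef": "\u05d0",
    "Arabic alef": "\u0627",
    "digit 1": "1",
    "space": " ",
    "right-to-left override": "\u202e",  # a formatting character
}


def bidi_classes(samples):
    """Map each label to the bidirectional class of its character."""
    return {label: unicodedata.bidirectional(ch)
            for label, ch in samples.items()}


if __name__ == "__main__":
    for label, bidi in bidi_classes(SAMPLES).items():
        print("%-24s %s" % (label, bidi))
```

A renderer implementing the Unicode rules combines these classes with
the primary direction of the paragraph to decide the visual order.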

This aspect of rendering is often not implemented, though. Web browsers
do it correctly, see

http://he.wikipedia.org/wiki/Python

where all text should come out right-adjusted, yet the Latin fragments
are still left to right (such as "Guido van Rossum")

Integrating it into this text looks like this: פייתון (Python).

GUI frameworks sometimes do it correctly, sometimes don't; most
notably, Tk has no good support for RTL text.

Regards,
Martin


From ncoghlan at gmail.com  Wed Oct 26 00:20:50 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 26 Oct 2005 08:20:50 +1000
Subject: [Python-Dev] PEP 343 - multiple context managers in one
	statement
In-Reply-To: <79990c6b0510251340s6adb7fbcpac5247886a171c3f@mail.gmail.com>
References: <79990c6b0510251340s6adb7fbcpac5247886a171c3f@mail.gmail.com>
Message-ID: <435EAFC2.6020305@gmail.com>

Paul Moore wrote:
> I have a deep suspicion that this has been done to death already, but
> my searching ability isn't up to finding the reference. So I'll simply
> ask the question, and not offer a long discussion:
> 
> Has the option of letting the with statement admit multiple context
> managers been considered (and presumably rejected)?
> 
> I'm thinking of
> 
>     with expr1, expr2, expr3:
>         # whatever

Not rejected - deliberately left as a future option (this is the reason why 
the RHS of an as clause has to be parenthesised if you want tuple unpacking).

> In some ways, this doesn't even need an extension to the PEP - giving
> tuples suitable __enter__ and __exit__ methods would do it. Or, I
> suppose a user-defined manager which combined a list of others:
> 
>     class combining:
>         def __init__(self, *mgrs):
>             self.mgrs = mgrs
>         def __with__(self):
>             return self
>         def __enter__(self):
>             return tuple(mgr.__enter__() for mgr in self.mgrs)
>         def __exit__(self, type, value, tb):
>             # first in, last out
>             for mgr in reversed(self.mgrs):
>                 mgr.__exit__(type, value, tb)
> 
> Would that be worth using as an example in the PEP?

The issue with that implementation is that the semantics are wrong - it 
doesn't actually mirror *nested* with statements. If one of the later 
__enter__ methods, or one of the first-executed __exit__ methods throws an 
exception, there are a lot of __exit__ methods that get skipped.
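
The skipped __exit__ calls are easy to demonstrate. The sketch below is
Python 3 syntax (which postdates this thread and uses the with-statement
as it eventually shipped); Tracker and Combining are hypothetical names,
with Combining mirroring the quoted class minus __with__.

```python
class Tracker:
    """Context manager that records what happens to it in a shared log."""

    def __init__(self, name, log, fail_on_enter=False):
        self.name = name
        self.log = log
        self.fail_on_enter = fail_on_enter

    def __enter__(self):
        if self.fail_on_enter:
            raise RuntimeError("cannot enter " + self.name)
        self.log.append("enter " + self.name)
        return self.name

    def __exit__(self, exc_type, exc, tb):
        self.log.append("exit " + self.name)
        return False


class Combining:
    """Naive combination, as in the quoted example."""

    def __init__(self, *mgrs):
        self.mgrs = mgrs

    def __enter__(self):
        return tuple(mgr.__enter__() for mgr in self.mgrs)

    def __exit__(self, exc_type, exc, tb):
        for mgr in reversed(self.mgrs):
            mgr.__exit__(exc_type, exc, tb)


def run_combined():
    log = []
    try:
        with Combining(Tracker("a", log),
                       Tracker("b", log, fail_on_enter=True)):
            log.append("body")
    except RuntimeError:
        log.append("error")
    return log


def run_nested():
    log = []
    try:
        with Tracker("a", log):
            with Tracker("b", log, fail_on_enter=True):
                log.append("body")
    except RuntimeError:
        log.append("error")
    return log


if __name__ == "__main__":
    print(run_combined())  # "a" is entered but never exited
    print(run_nested())    # "a" is exited before the error propagates
```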

Getting it right is more complicated (and this probably still has mistakes):

      import sys
      from collections import deque

      class nested(object):
          def __init__(self, *mgrs):
              self.mgrs = mgrs
              self.entered = None

          def __context__(self):
              return self

          def __enter__(self):
              self.entered = deque()
              vars = []
              try:
                  for mgr in self.mgrs:
                      var = mgr.__enter__()
                      # Most recently entered manager goes to the front
                      self.entered.appendleft(mgr)
                      vars.append(var)
              except:
                  # Unwind the managers that were already entered
                  self.__exit__(*sys.exc_info())
                  raise
              return vars

          def __exit__(self, *exc_info):
              # first in, last out
              # Behave like nested with statements
              ex = exc_info
              for mgr in self.entered:
                  try:
                      mgr.__exit__(*ex)
                  except:
                      # A failing __exit__ replaces the exception in flight
                      ex = sys.exc_info()
              if ex is not exc_info:
                  raise ex[0], ex[1], ex[2]
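
For comparison, here is a sketch of the same "behave like nested with
statements" idea in Python 3 syntax, using contextlib.ExitStack. Both
postdate this thread, so this is not what was proposed at the time;
tracked() and enter_all() are hypothetical names used only for the
demonstration.

```python
from contextlib import ExitStack, contextmanager


@contextmanager
def tracked(name, log, fail_on_enter=False):
    """Record enter/exit events for a named resource in a shared log."""
    if fail_on_enter:
        raise RuntimeError("cannot enter " + name)
    log.append("enter " + name)
    try:
        yield name
    finally:
        log.append("exit " + name)


def enter_all(managers, log):
    """Enter every manager in order; unwind the entered ones on failure."""
    try:
        with ExitStack() as stack:
            names = [stack.enter_context(mgr) for mgr in managers]
            log.append("body " + ",".join(names))
    except RuntimeError:
        log.append("error")
    return log


if __name__ == "__main__":
    log = []
    print(enter_all([tracked("a", log), tracked("b", log)], log))
    log = []
    print(enter_all([tracked("a", log),
                     tracked("b", log, fail_on_enter=True)], log))
```

When the second manager fails on entry, the first one is still exited
before the exception propagates, which is the property the naive
combination lacks.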

> PS The signature of __with__ in example 4 in the PEP is wrong - it has
> an incorrect "lock" parameter.

Thanks - I'll fix that when I incorporate the resolutions of the open issues 
(which will be post the SVN migration).

Cheers,
Nick.


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From jcarlson at uci.edu  Wed Oct 26 01:59:51 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 25 Oct 2005 16:59:51 -0700
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicitconversions).
In-Reply-To: <435EA9CF.6060305@v.loewis.de>
References: <20051025142245.3930.JCARLSON@uci.edu>
	<435EA9CF.6060305@v.loewis.de>
Message-ID: <20051025164015.3942.JCARLSON@uci.edu>


"Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
> Josiah Carlson wrote:
> > And how users could say, "name error? But I typed in window.draw(PEN) as
> > I was told to, and it didn't work!"
> 
> Ah, so the "serious issues" you are talking about are not security 
> issues, but usability issues.

Indeed, it was a misunderstanding, as the email stated:
    I did not mean to imply that I was concerned about the security
    implications of inserting arbitrary identifiers in Python (I was
    mentioning the web browser case for an example of how such
    characters have been confusing previously), I am concerned about
    confusion involved with using: [glyphs which are identical]


> I don't think extending the range of acceptable characters will
> cause any additional confusion. Users are already getting "surprising"
> NameErrors/AttributeErrors in the following cases:
> - they just misspell the identifier, and then, when the error message
>    is printed, fail to recognize the difference, as they read over the
>    typo just like they read over it when mistyping it in the first place.

In this case it's not just a misreading, the characters look identical! 
When is an 'E' not an 'E'?  When it is an Epsilon or Ie.  Saying what
characters will or will not be used as identifiers, when those
characters are keys on a keyboard of a specific type, is pretty
presumptuous.


> - they run into confusions with different things having the same names
>    in different contexts. For example, they wonder why they get TypeError
>    for passing the wrong number of arguments to a function, when the
>    call matches exactly what the source code in front of them tells
>    them - only that they were calling a different function which just
>    happened to have the same name.

Right, and users should be reading the documentation for the functions
and methods they are calling.


> In the light of these common mistakes, your example with an identifier
> named PEN, where the "P" might be a cyrillic letter or the E a greek one
> is just made up: For window.draw, people will readily understand that
> they are supposed to use Latin letters. More generally, they will know
> what script to use just from looking at the identifier.

Sure, that example was made up, but there are words which have been
stolen from various languages by english, and you are discounting the
case of single-letter temporary variables.  Saying what will and won't
happen over the course of using unicode identifiers is quite the
prediction.


> > Identically drawn glyphs are a problem, and pretending that they aren't
> > a problem, doesn't make it so.  Right now, all possible name glyphs are
> > visually distinct
> 
> Not at all: Just compare Fool and Foo1 (and perhaps FooI)
> 
> In the font in which I'm typing this, these are slightly different - but
> there are fonts in which the difference is really difficult to
> recognize.

Indeed, they are similar, but _different_ in my font as well.  The trick
is that the glyphs are not different in the case of certain greek or
cyrillic letters.  They don't just /look/ similar they /are identical/.

 - Josiah


From guido at python.org  Wed Oct 26 02:18:37 2005
From: guido at python.org (Guido van Rossum)
Date: Tue, 25 Oct 2005 17:18:37 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicitconversions).
In-Reply-To: <20051025164015.3942.JCARLSON@uci.edu>
References: <20051025142245.3930.JCARLSON@uci.edu>
	<435EA9CF.6060305@v.loewis.de> <20051025164015.3942.JCARLSON@uci.edu>
Message-ID: <ca471dc20510251718x3057456cva4f8e0fa113e65dc@mail.gmail.com>

On 10/25/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> Indeed, they are similar, but _different_ in my font as well.  The trick
> is that the glyphs are not different in the case of certain greek or
> cyrillic letters.  They don't just /look/ similar they /are identical/.

Well, in the font I'm using to read this email, I and l are /identical/.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jcarlson at uci.edu  Wed Oct 26 02:35:37 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 25 Oct 2005 17:35:37 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicitconversions).
In-Reply-To: <ca471dc20510251718x3057456cva4f8e0fa113e65dc@mail.gmail.com>
References: <20051025164015.3942.JCARLSON@uci.edu>
	<ca471dc20510251718x3057456cva4f8e0fa113e65dc@mail.gmail.com>
Message-ID: <20051025173215.3951.JCARLSON@uci.edu>


Guido van Rossum <guido at python.org> wrote:
> 
> On 10/25/05, Josiah Carlson <jcarlson at uci.edu> wrote:
> > Indeed, they are similar, but _different_ in my font as well.  The trick
> > is that the glyphs are not different in the case of certain greek or
> > cyrillic letters.  They don't just /look/ similar they /are identical/.
> 
> Well, in the font I'm using to read this email, I and l are /identical/.

In all fonts I've seen, E/Epsilon/Ie are /always identical/.

 - Josiah


From nyamatongwe at gmail.com  Wed Oct 26 02:49:53 2005
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Wed, 26 Oct 2005 10:49:53 +1000
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicitconversions).
In-Reply-To: <435EAD98.6000401@v.loewis.de>
References: <20051025120919.3927.JCARLSON@uci.edu>
	<435E9E18.7090502@v.loewis.de> <20051025142245.3930.JCARLSON@uci.edu>
	<ca471dc20510251447g7a2baf36g2f0f75bccfbbba26@mail.gmail.com>
	<435EAD98.6000401@v.loewis.de>
Message-ID: <50862ebd0510251749y2b8137cy672145f04994e854@mail.gmail.com>

Martin v. Löwis:

> This aspect of rendering is often not implemented, though. Web browsers
> do it correctly, see
> ...
> GUI frameworks sometimes do it correctly, sometimes don't; most
> notably, Tk has no good support for RTL text.

   Scintilla does a rough job with this. RTL text is displayed
correctly as the underlying platform libraries (Windows or GTK+/Pango)
handle this aspect when called to draw text. However editing is not
performed correctly with the caret not being placed correctly within
RTL text and other visual glitches. There is interest in the area and
even a funding proposal this week.

   Neil

From greg.ewing at canterbury.ac.nz  Wed Oct 26 02:51:40 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 26 Oct 2005 13:51:40 +1300
Subject: [Python-Dev] Divorcing str and unicode (no	more
 implicitconversions).
In-Reply-To: <435EA9CF.6060305@v.loewis.de>
References: <20051025120919.3927.JCARLSON@uci.edu>
	<435E9E18.7090502@v.loewis.de> <20051025142245.3930.JCARLSON@uci.edu>
	<435EA9CF.6060305@v.loewis.de>
Message-ID: <435ED31C.3010800@canterbury.ac.nz>

Martin v. Löwis wrote:

> For window.draw, people will readily understand that
> they are supposed to use Latin letters. More generally, they will know
> what script to use just from looking at the identifier.

Would it help if an identifier were required to be
made up of letters from the same alphabet, e.g. all
Latin or all Greek or all Cyrillic, but not a mixture?
Then you'd get an immediate error if you accidentally
slipped in a letter from the wrong alphabet.
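
A rough sketch of such a check, in Python 3 syntax (which postdates
this thread). The standard library exposes no script property, so this
assumes the first word of each letter's Unicode name is a good enough
stand-in for its alphabet; scripts_used() and check_single_script() are
hypothetical names.

```python
import unicodedata


def scripts_used(identifier):
    """Guess the alphabet of each letter from its Unicode name.

    Assumption: the first word of the name ("LATIN", "GREEK",
    "CYRILLIC", ...) identifies the alphabet. Digits and underscores
    are ignored.
    """
    scripts = set()
    for ch in identifier:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def check_single_script(identifier):
    """Return the identifier, or raise if it mixes alphabets."""
    scripts = scripts_used(identifier)
    if len(scripts) > 1:
        raise SyntaxError("identifier %s mixes alphabets: %s"
                          % (ascii(identifier), ", ".join(sorted(scripts))))
    return identifier


if __name__ == "__main__":
    print(check_single_script("PEN"))
    try:
        # Cyrillic ER, Greek EPSILON, Latin N
        check_single_script("\u0420\u0395N")
    except SyntaxError as err:
        print("rejected:", err)
```

Such a check would catch a stray letter inside a longer name, but a
single-letter name written entirely in another alphabet would still
pass.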

Greg


From anthony at interlink.com.au  Wed Oct 26 05:25:19 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Wed, 26 Oct 2005 13:25:19 +1000
Subject: [Python-Dev] make testall hanging on HEAD?
Message-ID: <200510261325.19363.anthony@interlink.com.au>

At the moment, I see make testall hanging in test_timeout. In 
addition, test_curses is leaving the tty in a hosed state:

test_crypt
test_csv
test_curses
test_datetime
             test_dbm
                     test_decimal
                                 test_decorators
                                                test_deque
                                                          test_descr

This is on Ubuntu Breezy, 
[GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu9)] on linux2

Anyone else see this?

-- 
Anthony Baxter     <anthony at interlink.com.au>
It's never too late to have a happy childhood.

From jepler at unpythonic.net  Wed Oct 26 06:47:13 2005
From: jepler at unpythonic.net (jepler@unpythonic.net)
Date: Tue, 25 Oct 2005 23:47:13 -0500
Subject: [Python-Dev] make testall hanging on HEAD?
In-Reply-To: <200510261325.19363.anthony@interlink.com.au>
References: <200510261325.19363.anthony@interlink.com.au>
Message-ID: <20051026044713.GA1460@unpythonic.net>

ditto on the "curses" problem, but test_timeout completed just fine, at least
the first time around.

fedora core 4, x86_64
[GCC 4.0.1 20050727 (Red Hat 4.0.1-5)] on linux2

Jeff

From nyamatongwe at gmail.com  Wed Oct 26 07:49:39 2005
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Wed, 26 Oct 2005 15:49:39 +1000
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <435DEEF6.5020603@egenix.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<ca471dc20510231806k2ba088c7g2d4afd460e023ae1@mail.gmail.com>
	<50862ebd0510232041j19b6b3achd0aecb072f9db148@mail.gmail.com>
	<435C9DFC.8020501@egenix.com>
	<50862ebd0510241613w2b5da91cqbcaf4f6157ae338e@mail.gmail.com>
	<435DEEF6.5020603@egenix.com>
Message-ID: <50862ebd0510252249r24a73350l2c692a0e4c45d2fc@mail.gmail.com>

M.-A. Lemburg:

> You mean a slice that slices out the next <indextype> ?

   Yes.

> This sounds a lot like you'd want iterators for the various
> index types. Should be possible to implement on top of the
> proposed APIs, e.g. itergraphemes(u), itercodepoints(u), etc.

   Iterators may be helpful, but can also be too restrictive when the
processing is not completely iterative, such as peeking ahead or
looking behind to wrap at a word boundary in the display example.
There should be

   My point was more that there may be less scope for error if there were a
move away from indexes to slices. The PEP provides ways to specify
what you want to examine or modify, but it looks to me like returning
indexes will lead to code repetition or additional variables, with an
increase in fragility.

> Note that what most people refer to as "character" is a
> grapheme in Unicode speak.

   A grapheme-oriented string type may be worthwhile although you'd
probably have to choose a particular normalisation form to ease
processing.

> Given that interpretation,
> "breaking" Unicode "characters" is something you won't
> ever work around with by using larger code units such
> as UCS4 compatible ones.

   I still think we can reduce the scope for errors.

> Furthermore, you should also note that surrogates (two
> code units encoding one code point) are part of Unicode
> life. While you don't need them when storing Unicode
> in UCS4 code units, they can still be part of the
> Unicode data and the programmer has to be aware of
> these.

   Many programmers can and will ignore surrogates. One day that may
bite them but we can't close off text processing to those who have no
idea of what surrogates are, or directional marks, or that sorting is
locale dependent, or have no understanding of the difference between
NFC and NFKD normalization forms.

   Neil

From martin at v.loewis.de  Wed Oct 26 08:09:33 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 26 Oct 2005 08:09:33 +0200
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicitconversions).
In-Reply-To: <20051025164015.3942.JCARLSON@uci.edu>
References: <20051025142245.3930.JCARLSON@uci.edu>
	<435EA9CF.6060305@v.loewis.de>
	<20051025164015.3942.JCARLSON@uci.edu>
Message-ID: <435F1D9D.8060001@v.loewis.de>

Josiah Carlson wrote:
> In this case it's not just a misreading, the characters look identical! 
> When is an 'E' not an 'E'?  When it is an Epsilon or Ie.  Saying what
> characters will or will not be used as identifiers, when those
> characters are keys on a keyboard of a specific type, is pretty
> presumptuous.

Why is that rude and disrespectful? I'm certainly respecting developers
who want to use their scripts for identifiers, or else I would not have
suggested that they could do so.

However, from the experience with my own language, and the three or so
foreign languages I know, I can tell you that people normally
don't mix identifiers of different scripts.

> Sure, that example was made up, but there are words which have been
> stolen from various languages by english, and you are discounting the
> case of single-letter temporary variables.  Saying what will and won't
> happen over the course of using unicode identifiers is quite the
> prediction.

Sure, people can make mistakes. They get an error, and then will
need to find the cause of the problem. Sometimes, this will be easy,
and sometimes, it will not.

> Indeed, they are similar, but_ different_ in my font as well.  The trick
> is that the glyphs are not different in the case of certain greek or
> cyrillic letters.  They don't just /look/ similar they /are identical/.

This string: "EΕ" is the LATIN CAPITAL LETTER E, followed by the GREEK
CAPITAL LETTER EPSILON. In the font my email composer uses, the E is
slightly larger than the Epsilon - so there /is/ a visual difference.

But even if there isn't: if this was a frequent problem, the name
error could include an alternative representation (say, with Unicode
ordinals for non-ASCII characters) which would give an easy visual
clue.
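
A minimal sketch of what such an alternative representation could look
like, in present-day Python 3. The helper names and the message format
are hypothetical; only the "backslashreplace" error handler is a real
codec feature.

```python
def ascii_repr(name):
    """Show non-ASCII characters as \\uXXXX escapes, leave ASCII untouched."""
    return name.encode("ascii", "backslashreplace").decode("ascii")


def describe_missing_name(name):
    """Build a NameError-style message with an escaped spelling if needed."""
    escaped = ascii_repr(name)
    if escaped == name:
        return "name '%s' is not defined" % name
    return "name '%s' (%s) is not defined" % (name, escaped)
```

For the two-letter string above, the escaped form keeps the Latin E
and spells out the Epsilon as \u0395, so the two can no longer be
confused.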

I still doubt that this is a frequent problem, and I don't see any
better grounds for claiming that it is than for claiming that it
is not.

Regards,
Martin

From eric.nieuwland at xs4all.nl  Wed Oct 26 08:17:01 2005
From: eric.nieuwland at xs4all.nl (Eric Nieuwland)
Date: Wed, 26 Oct 2005 08:17:01 +0200
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <ca471dc20510251335p75627d61o6a22b22fc4a33c06@mail.gmail.com>
References: <435A4598.3060403@iinet.net.au>
	<5.1.1.6.0.20051023124842.01af9078@mail.telecommunity.com>
	<ca471dc20510231658qb8e14ddkb7956960cfa507d7@mail.gmail.com>
	<435CCB1C.4030108@gmail.com>
	<ca471dc20510240747v33cfd354qc104129d6f9f90a1@mail.gmail.com>
	<435E40E5.7070703@gmail.com>
	<ca471dc20510250914x3c316fedyeea48f4568ffbb15@mail.gmail.com>
	<103348c72cae040229895842cb3c0cdc@xs4all.nl>
	<ca471dc20510251311h60d653d4y374a09a1ae473f18@mail.gmail.com>
	<0c1f3f308e9c605ef4687689c860913e@xs4all.nl>
	<ca471dc20510251335p75627d61o6a22b22fc4a33c06@mail.gmail.com>
Message-ID: <d1c13142713773e4cc9b712dbf7eb309@xs4all.nl>

Guido van Rossum wrote:

> [Eric "are all your pets called Eric?" Nieuwland]
>>>> Hmmm... Would it be reasonable to introduce a ProtocolError 
>>>> exception?
>
> [Guido]
>>> And which perceived problem would that solve?
>
> [Eric]
>> It was meant to be a bit more informative about what is wrong.
>>
>> ProtocolError: lacks __enter__ or __exit__
>
> That's exactly what I'm trying to avoid. :)
>
> I find "AttributeError: __exit__" just as informative. In either case,
> if you know what __exit__ means, you'll know what you did wrong. And
> if you don't know what it means, you'll have to look it up anyway. And
> searching for ProtocolError doesn't do you any good -- you'll have to
> learn about what __exit__ is and where it is required.

I see. Then why don't we unify *Error into Error?
Just read the message and know what it means.
And we could then drop the burden of exception classes and only use the 
message.
A sense of deja-vu comes over me somehow ;-)


From martin at v.loewis.de  Wed Oct 26 08:22:41 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 26 Oct 2005 08:22:41 +0200
Subject: [Python-Dev] Divorcing str and unicode (no	more
	implicitconversions).
In-Reply-To: <435ED31C.3010800@canterbury.ac.nz>
References: <20051025120919.3927.JCARLSON@uci.edu>	<435E9E18.7090502@v.loewis.de>
	<20051025142245.3930.JCARLSON@uci.edu>	<435EA9CF.6060305@v.loewis.de>
	<435ED31C.3010800@canterbury.ac.nz>
Message-ID: <435F20B1.8080803@v.loewis.de>

Greg Ewing wrote:
> Would it help if an identifier were required to be
> made up of letters from the same alphabet, e.g. all
> Latin or all Greek or all Cyrillic, but not a mixture.
> Then you'd get an immediate error if you accidentally
> slipped in a letter from the wrong alphabet.

Not in the literal sense: you certainly want to allow
"latin" digits in, say, a cyrillic identifier. See

http://www.unicode.org/reports/tr31/

for what the Unicode consortium recommends to do.
In addition to the strict specification, they envision
usage guidelines. This seems Pythonic: just because
you could potentially shoot yourself in the foot doesn't
mean it should be banned from the language.

IOW, whether it would help largely depends on whether
the problem is real in the first place. Just because
you *can* come up with look-alike identifiers doesn't
mean that people will use them, or that they will mistake
the scripts (except for deliberately doing so, of
course).

Regards,
Martin

From stephen at xemacs.org  Wed Oct 26 08:40:55 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 26 Oct 2005 15:40:55 +0900
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicitconversions).
In-Reply-To: <20051025164015.3942.JCARLSON@uci.edu> (Josiah Carlson's
	message of "Tue, 25 Oct 2005 16:59:51 -0700")
References: <20051025142245.3930.JCARLSON@uci.edu>
	<435EA9CF.6060305@v.loewis.de> <20051025164015.3942.JCARLSON@uci.edu>
Message-ID: <871x28vkzs.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Josiah" == Josiah Carlson <jcarlson at uci.edu> writes:

    Josiah> Indeed, they are similar, but_ different_ in my font as
    Josiah> well.  The trick is that the glyphs are not different in
    Josiah> the case of certain greek or cyrillic letters.  They don't
    Josiah> just /look/ similar they /are identical/.

But these problems are going to arise in _any_ multilingual context;
it's not at all specific to identifiers.  It's just that computers
lexing identifiers are kinda picky about those things compared to
humans.  I think you can reasonably classify it as a new breed of
typo, and develop UIs to deal with it in that way.

To handle cases where glyphs are (nearly) identical, UIs that visually
flag "foreign" characters, at least in contexts where cross-block
punning is unacceptable, will be developed, and users will learn to
pay attention to those flags.


-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From walter at livinglogic.de  Wed Oct 26 09:31:47 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Wed, 26 Oct 2005 09:31:47 +0200
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicitconversions).
In-Reply-To: <20051025142245.3930.JCARLSON@uci.edu>
References: <20051025120919.3927.JCARLSON@uci.edu>
	<435E9E18.7090502@v.loewis.de>
	<20051025142245.3930.JCARLSON@uci.edu>
Message-ID: <AF512595-07C7-40FC-9CAA-4ED87BE50A1E@livinglogic.de>

On 25.10.2005 at 23:40, Josiah Carlson wrote:

> [...]
> Identically drawn glyphs are a problem, and pretending that they aren't
> a problem, doesn't make it so.  Right now, all possible name glyphs are
> visually distinct, which would not be the case if any unicode character
> could be used as a name (except for numerals).  Speaking of which, would
> we then be offering support for arabic/indic numeric literals, and/or
> support it in int()/float()?

It's already supported in int() and float()

 >>> int(u"\u136c\u2082")
42
 >>> float(u"\u0664\u09e8")
42.0

But not as literals:

# -*- coding: unicode-escape -*-

print \u136c\u2082

This gives (on the Mac):

   File "encoding.py", line 3
     print ??
           ^
SyntaxError: invalid syntax
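
For comparison, a small sketch of the same distinction on a
present-day Python 3 interpreter (an assumption; the session above is
Python 2.4). It uses the two digits from the float() example, and
is_valid_literal is a hypothetical helper name.

```python
# ARABIC-INDIC DIGIT FOUR followed by BENGALI DIGIT TWO
text = "\u0664\u09e8"

# The conversion functions accept Unicode decimal digits...
as_int = int(text)
as_float = float(text)


def is_valid_literal(source):
    """Return True if source compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# ...but the same characters are not accepted as a numeric literal.
literal_ok = is_valid_literal(text)
ascii_literal_ok = is_valid_literal("42")
```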

> [...]

Bye,
    Walter Dörwald


From mal at egenix.com  Wed Oct 26 11:50:01 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 26 Oct 2005 11:50:01 +0200
Subject: [Python-Dev] Divorcing str and
	unicode	(no	more	implicitconversions).
In-Reply-To: <435EA9F6.3030702@v.loewis.de>
References: <djl18k$l70$1@sea.gmane.org>
	<435E7794.2030505@v.loewis.de>	<20051025120919.3927.JCARLSON@uci.edu>
	<435EA7EB.90100@egenix.com> <435EA9F6.3030702@v.loewis.de>
Message-ID: <435F5149.4060804@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>A few years ago we had a discussion about this on python-dev
>>and agreed to stick with ASCII identifiers for Python. I still
>>think that's the right way to go.
> 
> I don't think there ever was such an agreement.

You even argued against having non-ASCII identifiers:

http://mail.python.org/pipermail/python-list/2002-May/102936.html

and I agree with you on most of the points you make in that
posting:

* Unicode identifiers are going to introduce massive
code breakage - just think of all the tools people use
to manipulate Python code today; I'm quite sure that
most of it will fail in one way or another if you present
it Unicode literals such as in "zähler += 1".

* People don't seem very interested in using Unicode
identifiers, e.g.

  http://mail.python.org/pipermail/i18n-sig/2001-February/000828.html

most of the few who did comment, said they'd rather have
ASCII identifiers, e.g.

  http://mail.python.org/pipermail/python-list/2002-May/104050.html


Do you really think that it will help with code readability
if programmers are allowed to use native scripts for their
identifiers ?

I think this goes beyond just visual aspects of being able
to distinguish graphemes:

If you are told to debug a program
written by say a Japanese programmer using Japanese identifiers
you are going to have a really hard time. Integrating such
code into other applications will be even harder, since you'd
be forced to use his Japanese class names in your application.
This doesn't only introduce problems with being able to enter
the Japanese identifiers, it will also cause your application
to suddenly contain identifiers in Japanese even though that's
not your native script.

I think source code encodings provide an ideal way to
have comments written in native scripts - and people
use that a lot. However, keeping the program code itself
in plain ASCII makes it far more readable and reusable
across locales. Something that's important in this
globalized world.

-- 
Marc-Andre Lemburg
eGenix.com


From paolo.veronelli at gmail.com  Mon Oct 24 17:30:36 2005
From: paolo.veronelli at gmail.com (Paolino)
Date: Mon, 24 Oct 2005 17:30:36 +0200
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <435CF941.6070104@libero.it>
References: <1130107429.11268.40.camel@geddy.wooz.org>
	<435CF941.6070104@libero.it>
Message-ID: <435CFE1C.9050809@gmail.com>

Paolino wrote:

> Is __hash__=id inside a class enough to use a set (sets.Set before 2.5) 
> derived class instance as a key to a mapping?
It is __hash__=lambda self:id(self) that is terribly slow. It needs a
faster way to state that, to let them be useful as keys to a mapping, as all
set operations will pipe into the mechanism. In my application that
function is eating time like hell, and will keep on doing it even with
the PEP as proposed. Probably off-topic.
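
One possible way to state this without a Python-level lambda, sketched
for present-day Python 3 (the class name IdentitySet is hypothetical):
reuse the identity hash that object already provides. This should
avoid the extra function call, though no timing is claimed here. Note
the assumption that keys are only ever looked up by the very same
object, because two sets with equal contents still compare equal while
hashing differently.

```python
class IdentitySet(set):
    """A set subclass usable as a mapping key, hashed by identity."""

    # set disables hashing; restore object's built-in identity-based hash
    # instead of wrapping id() in a lambda.
    __hash__ = object.__hash__


first = IdentitySet([1, 2])
second = IdentitySet([1, 2])
registry = {first: "first", second: "second"}

# Mutating a key does not change its hash, so lookups keep working.
first.add(3)
still_found = registry[first]
```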

Regards Paolino



From bokr at oz.net  Tue Oct 25 02:56:40 2005
From: bokr at oz.net (Bengt Richter)
Date: Mon, 24 Oct 2005 17:56:40 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
 conversions).
In-Reply-To: <435CACAD.9070106@egenix.com>
References: <5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
	<5.0.2.1.1.20051022120117.02b58da0@mail.oz.net>
Message-ID: <5.0.2.1.1.20051024133833.02b81cb0@mail.oz.net>

At 11:43 2005-10-24 +0200, M.-A. Lemburg wrote:
>Bengt Richter wrote:
>> Please bear with me for a few paragraphs ;-)
>
>Please note that source code encoding doesn't really have
>anything to do with the way the interpreter executes the
>program - it's merely a way to tell the parser how to
>convert string literals (currently only the Unicode ones)
>into constant Unicode objects within the program text.
>It's also a nice way to let other people know what kind of
>encoding you used to write your comments ;-)
>
>Nothing more.
I think somehow I didn't make things clear, sorry ;-)
As I tried to show in the example of module_a.cs vs module_b.cs,
the source encoding currently results in two different str-type
strings representing the source _character_ sequence, which is the
_same_ in both cases. To make it more clear, try the following little
program (untested except on NT4 with
Python 2.4b1 (#56, Nov  3 2004, 01:47:27)
[GCC 3.2.3 (mingw special 20030504-1)] on win32 ;-):

----< t_srcenc.py >--------------------------------
import os
def test():
    open('module_a.py','wb').write(
        "# -*- coding: latin-1 -*-" + os.linesep +
        "cs = '\xfcber-cool'" + os.linesep)
    open('module_b.py','wb').write(
        "# -*- coding: utf-8 -*-" + os.linesep +
        "cs = '\xc3\xbcber-cool'" + os.linesep)
    # show that we have two modules differing only in encoding:
    print ''.join(line.decode('latin-1') for line in open('module_a.py'))
    print ''.join(line.decode('utf-8') for line in open('module_b.py'))
    # see how results are affected:
    import module_a, module_b
    print module_a.cs + ' =?= ' + module_b.cs
    print module_a.cs.decode('latin-1') + ' =?= ' + module_b.cs.decode('utf-8')

if __name__ == '__main__':
    test()
---------------------------------------------------
The result copied from NT4 console to clipboard and pasted into eudora:
__________________________________________________________

[17:39] C:\pywk\python-dev>py24 t_srcenc.py
# -*- coding: latin-1 -*-
cs = 'über-cool'

# -*- coding: utf-8 -*-
cs = 'über-cool'

nber-cool =?= ++ber-cool
über-cool =?= über-cool
__________________________________________________________
(I'd say NT did the best it could, rendering the copied cp437
superscript n as the 'n' above, and the '++' coming from the
cp437 box characters corresponding to the '\xc3\xbc'. Not sure
how it will show on your screen, but try the program to see ;-)
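
The same point can be sketched without writing any module files, here
in present-day Python 3 (an assumption; the program above is Python
2.4): one character sequence, two different byte sequences.

```python
text = "\xfcber-cool"  # the character sequence 'über-cool'

as_latin1 = text.encode("latin-1")  # one byte for the u-umlaut
as_utf8 = text.encode("utf-8")      # two bytes for the u-umlaut

# The stored bytes differ...
same_bytes = as_latin1 == as_utf8

# ...but decoding each with its declared encoding gives the same characters.
same_characters = as_latin1.decode("latin-1") == as_utf8.decode("utf-8")
```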

>Once a module is compiled, there's no distinction between
>a module using the latin-1 source code encoding or one using
>the utf-8 encoding.
ISTM module_a.cs and module_b.cs can readily be distinguished after
compilation, whereas the sources displayed according to their declared
encodings as above (or as e.g. different editors using different native
encoding might) cannot (other than the encoding cookie itself) ;-)
Perhaps you meant something else?

>Thanks,
You're welcome.

Regards,
Bengt Richter


From lucky1010_studies at yahoo.co.in  Tue Oct 25 15:09:10 2005
From: lucky1010_studies at yahoo.co.in (Lucky Wankhede)
Date: Tue, 25 Oct 2005 14:09:10 +0100 (BST)
Subject: [Python-Dev] "? operator in python"
Message-ID: <20051025130910.65658.qmail@web8602.mail.in.yahoo.com>



 Dear sir,

         I am a student of the Computer Science Dept.,
 University of Pune (M.S.) (India).

          Sir, I have found that Python is about
to have the feature of a "?" operator, the same as in the C language.

          Sir, not only I but our whole Dept. is
waiting for it.

          Kindly provide me with the information as to
which version of Python will have that
feature and when it is about to be released.
           Requesting your best sympathetic
consideration. Hoping for an early response.



                 Thank You.
           
                
                                 Mr. Lucky R. Wankhede
                                 M.C,A, Ist,
   
             


		

From lucky1010_studies at yahoo.co.in  Tue Oct 25 15:12:26 2005
From: lucky1010_studies at yahoo.co.in (Lucky Wankhede)
Date: Tue, 25 Oct 2005 14:12:26 +0100 (BST)
Subject: [Python-Dev] "? operator in python"
Message-ID: <20051025131226.42537.qmail@web8603.mail.in.yahoo.com>



 Dear sir,

         I am a student of the Computer Science Dept.,
 University of Pune (M.S.) (India). We are learning
Python as a course for our semester. We found it is not
only useful but a heart-touching language.

          Sir, I have found that Python is going
to have a new feature, a "?" operator, the same as in the C
language.


          Kindly provide me with the information as to
which version of Python will have that
feature and when it is about to be released.
           Requesting your best sympathetic
consideration. Hoping for an early response.



                 Thank You.
           
                
                                Mr. Lucky R. Wankhede
                                M.C,A, Ist,
                                Dept. of Comp. Science,
                                University of Pune,
                                India.   
                                   


		

From p.f.moore at gmail.com  Wed Oct 26 14:59:34 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 26 Oct 2005 13:59:34 +0100
Subject: [Python-Dev] PEP 343 - multiple context managers in one
	statement
In-Reply-To: <435EAFC2.6020305@gmail.com>
References: <79990c6b0510251340s6adb7fbcpac5247886a171c3f@mail.gmail.com>
	<435EAFC2.6020305@gmail.com>
Message-ID: <79990c6b0510260559k1aaf7b40o3598001a7d79fcaa@mail.gmail.com>

On 10/25/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Paul Moore wrote:
[...]
> > Has the option of letting the with statement admit multiple context
> > managers been considered (and presumably rejected)?
[...]
> Not rejected - deliberately left as a future option (this is the reason why
> the RHS of an as clause has to be parenthesised if you want tuple unpacking).

Thanks. I now see that note in the PEP - apologies for missing it in
the first instance.

[...]
> The issue with that implementation is that the semantics are wrong - it
> doesn't actually mirror *nested* with statements. If one of the later
> __enter__ methods, or one of the first-executed __exit__ methods throws an
> exception, there are a lot of __exit__ methods that get skipped.
>
> Getting it right is more complicated (and this probably still has mistakes):

Bah. You're right, of course (about it being more complicated - I
can't see any mistakes :-))

I'd argue that precisely because a naive approach gets it wrong,
having your version as an example in the PEP (and possibly the
documentation, much like the itertools module has a recipes section)
is that much more useful.
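
As an illustration of the nested semantics being discussed, here is a
sketch using contextlib.ExitStack, which exists in present-day Python
3 but not in the Python of this thread. The managed and run helpers
are hypothetical names. If a later manager fails to enter, the
managers already entered are still exited, which is exactly what the
naive approach misses.

```python
from contextlib import ExitStack, contextmanager

events = []


@contextmanager
def managed(name, fail_on_enter=False):
    """A toy context manager that records when it is entered and exited."""
    if fail_on_enter:
        raise RuntimeError(name + " failed to enter")
    events.append("enter " + name)
    try:
        yield name
    finally:
        events.append("exit " + name)


def run(specs):
    """Enter every (name, fail_on_enter) manager in order, like nested withs."""
    try:
        with ExitStack() as stack:
            for name, fail in specs:
                stack.enter_context(managed(name, fail))
            events.append("body")
    except RuntimeError as exc:
        events.append("error: " + str(exc))


# The second manager fails to enter; the first one is still cleaned up.
run([("a", False), ("b", True)])
```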

Anyway, thanks for the help.
Paul.

From mcherm at mcherm.com  Wed Oct 26 17:32:46 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed, 26 Oct 2005 08:32:46 -0700
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
Message-ID: <20051026083246.6tg0gu2gafms8gkc@login.werra.lunarpages.com>

Guido writes:
> I find "AttributeError: __exit__" just as informative.

Eric Nieuwland responds:
> I see. Then why don't we unify *Error into Error?
> Just read the message and know what it means.
> And we could then drop the burden of exception classes and only use the
> message.
> A sense of deja-vu comes over me somehow ;-)

The answer (and there _IS_ an answer) is that using different exception
types allows the user some flexibility in CATCHING the exceptions. The
discussion you have been following obscures that point somewhat because
there's little meaningful difference between TypeError and
AttributeError (at least in well-written code that doesn't have
unnecessary typechecks in it).

If there were a significant difference between TypeError and
AttributeError then Nick and Guido would have immediately chosen the
appropriate error type based on functionality rather than style, and
there wouldn't have been any need for discussion.

Oh yeah, and you can also put extra info into an exception object
besides just the error message. (We don't do that as often as we
should... it's a powerful technique.)
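
A small sketch of both points, in present-day Python 3. The exception
class and the helper are hypothetical names chosen for illustration,
not anything proposed for the PEP: the subclass lets callers catch
selectively, and the extra attributes carry structured information
beyond the message.

```python
class ContextProtocolError(TypeError):
    """Hypothetical exception carrying which protocol methods are missing."""

    def __init__(self, obj, missing):
        message = "%s lacks %s" % (type(obj).__name__, ", ".join(missing))
        super().__init__(message)
        self.obj = obj          # the offending object
        self.missing = missing  # list of missing method names


def require_context_manager(obj):
    """Return obj unchanged, or raise if it lacks __enter__ or __exit__."""
    names = ("__enter__", "__exit__")
    missing = [name for name in names if not hasattr(type(obj), name)]
    if missing:
        raise ContextProtocolError(obj, missing)
    return obj


try:
    require_context_manager(42)
except TypeError as exc:   # callers can catch the general type...
    caught = exc           # ...and still inspect the extra attributes
```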

-- Michael Chermside


From jimjjewett at gmail.com  Wed Oct 26 18:16:14 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 26 Oct 2005 12:16:14 -0400
Subject: [Python-Dev] Divorcing str and unicode (no more
	implicitconversions).
Message-ID: <fb6fbf560510260916g305eb637mad870e43dbec4891@mail.gmail.com>

Greg Ewing asked:

> Would it help if an identifier were required to be
> made up of letters from the same alphabet, e.g. all
> Latin or all Greek or all Cyrillic, but not a mixture.

Probably, yes, though there could still be problems
mixing within a program.

FWIW, the Opera web browser is already using
a similar solution.  Domain names are limited to
Latin-1 *unless* the top-level registrar has a
policy to prevent spoofing.

-jJ

From guido at python.org  Wed Oct 26 18:21:55 2005
From: guido at python.org (Guido van Rossum)
Date: Wed, 26 Oct 2005 09:21:55 -0700
Subject: [Python-Dev] "? operator in python"
In-Reply-To: <20051025131226.42537.qmail@web8603.mail.in.yahoo.com>
References: <20051025131226.42537.qmail@web8603.mail.in.yahoo.com>
Message-ID: <ca471dc20510260921w442b7439g20634dcbae639a3c@mail.gmail.com>

Dear Lucky,

You are correct. Python 2.5 will have a conditional operator. The
syntax will be different than C; it will look like this:

  (EXPR1 if TEST else EXPR2)

(which is the equivalent of TEST?EXPR1:EXPR2 in C). For more
information, see PEP 308 (http://www.python.org/peps/pep-0308.html).
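
A short usage sketch of that syntax (the function names are
hypothetical). Only the selected branch is evaluated, so the division
below is never attempted when the divisor is zero.

```python
def parity(n):
    """Return 'even' or 'odd' using a conditional expression."""
    return "even" if n % 2 == 0 else "odd"


def safe_ratio(numerator, divisor):
    """Return numerator / divisor, or 0.0 when divisor is zero."""
    return (numerator / divisor if divisor else 0.0)
```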

Python 2.5 will be released some time next year; we hope to have
alphas available in the 2nd quarter. That's about as firm as we can
currently be about the release date.

Enjoy,

--Guido van Rossum

On 10/25/05, Lucky Wankhede <lucky1010_studies at yahoo.co.in> wrote:
>
>
>  Dear sir,
>
>          I am a student of the Computer Science Dept.,
>  University of Pune (M.S.) (India). We are learning
> Python as a course for our semester. We found it is not
> only useful but a heart-touching language.
>
>           Sir, I have found that Python is going
> to have a new feature, a "?" operator, the same as in the C
> language.
>
>
>           Kindly provide me with the information as to
> which version of Python will have that
> feature and when it is about to be released.
>            Requesting your best sympathetic
> consideration. Hoping for an early response.
>
>
>
>                  Thank You.
>
>
>                                 Mr. Lucky R. Wankhede
>                                 M.C,A, Ist,
>                                 Dept. of Comp. Science,
>                                 University of Pune,
>                                 India.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de  Wed Oct 26 19:02:22 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 26 Oct 2005 19:02:22 +0200
Subject: [Python-Dev] CVS is read-only
Message-ID: <435FB69E.8090501@v.loewis.de>

I just switched the repository to read-only mode,
and removed the test subversion installation. I'll let
you know when the conversion is complete.

Regards,
Martin

From martin at v.loewis.de  Wed Oct 26 19:32:52 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 26 Oct 2005 19:32:52 +0200
Subject: [Python-Dev] Divorcing str and
	unicode	(no	more	implicitconversions).
In-Reply-To: <435F5149.4060804@egenix.com>
References: <djl18k$l70$1@sea.gmane.org>
	<435E7794.2030505@v.loewis.de>	<20051025120919.3927.JCARLSON@uci.edu>
	<435EA7EB.90100@egenix.com> <435EA9F6.3030702@v.loewis.de>
	<435F5149.4060804@egenix.com>
Message-ID: <435FBDC4.8030300@v.loewis.de>

M.-A. Lemburg wrote:
> You even argued against having non-ASCII identifiers:
> 
> http://mail.python.org/pipermail/python-list/2002-May/102936.html

I see :-) It seems I have changed my mind since then (which
apparently predates PEP 263).

One issue I apparently was worried about was the plan to use
native-encoding byte strings for the identifiers; this I didn't
like at all.

> * Unicode identifiers are going to introduce massive
> code breakage - just think of all the tools people use
> to manipulate Python code today; I'm quite sure that
> most of it will fail in one way or another if you present
> it Unicode literals such as in "zähler += 1".

True. Today, I think I would be willing to accept the
code breakage: these tools had quite some time to update
themselves to PEP 263 (even though not all of them have
done so yet); also, usage of the feature would only spread
gradually. A failure to support the feature in Python
proper would be treated as a bug by us; how tool providers
deal with the feature would be their choice.

> * People don't seem very interested in using Unicode
> identifiers, e.g.
> 
>   http://mail.python.org/pipermail/i18n-sig/2001-February/000828.html

True. However, I also suspect that lack of tool support
contributes to that. For the specific case of Java,
there is no notion of source encoding, which makes Unicode
identifiers really tedious to use.

If it were really easy to use, I assume people would actually
use it - at least in some of the contexts, like teaching,
where Python is also widely used.

> Do you really think that it will help with code readability
> if programmers are allowed to use native scripts for their
> identifiers ?

Yes, I do - for some groups of users. Of course, code sharing
would be more difficult, and there certainly should be a policy
to use only ASCII in the standard library. But within local
groups, users would find understanding code easier if they
knew what the identifiers actually meant.

> If you are told to debug a program
> written by say a Japanese programmer using Japanese identifiers
> you are going to have a really hard time. Integrating such
> code into other applications will be even harder, since you'd
> be forced to use his Japanese class names in your application.

Certainly, yes. There is a trade-off: you can make it easier
for some people to read and write code if they can use their
native script; at the same time, it would be harder for others
to read and modify it.

It's a policy decision whether you use English identifiers or
not - it shouldn't be a technical decision (as it currently
is).

> I think source code encodings provide an ideal way to
> have comments written in native scripts - and people
> use that a lot. However, keeping the program code itself
> in plain ASCII makes it far more readable and reusable
> across locales. Something that's important in this
> globalized world.

Certainly. However, some programs don't need to live in
a globalized world - e.g. if they are homework in a school.
Within a locale, using native scripts would make the program
more readable.

Regards,
Martin

From jcarlson at uci.edu  Wed Oct 26 20:33:14 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 26 Oct 2005 11:33:14 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <435FBDC4.8030300@v.loewis.de>
References: <435F5149.4060804@egenix.com> <435FBDC4.8030300@v.loewis.de>
Message-ID: <20051026105934.3977.JCARLSON@uci.edu>


"Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
> M.-A. Lemburg wrote:
> > You even argued against having non-ASCII identifiers:
> > 
> > http://mail.python.org/pipermail/python-list/2002-May/102936.html
> > 
> > Do you really think that it will help with code readability
> > if programmers are allowed to use native scripts for their
> > identifiers ?
> 
> Yes, I do - for some groups of users. Of course, code sharing
> would be more difficult, and there certainly should be a policy
> to use only ASCII in the standard library. But within local
> groups, users would find understanding code easier if they
> knew what the identifiers actually meant.

According to wikipedia (http://en.wikipedia.org/wiki/Latin_alphabet),
various languages have adopted a transliteration of their language
and/or former alphabets into latin.  They don't purport to know all of
the reasons why, and I'm not going to speculate.

Whether or not more languages start using the latin alphabet is a good
question.  Basing judgement on history and likely globalization, it is
only a matter of time before basically all languages have a
transcription into the latin alphabet that is taught to all (unless
China takes over the world).

 - Josiah


From jcarlson at uci.edu  Wed Oct 26 20:40:14 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 26 Oct 2005 11:40:14 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <435F1D9D.8060001@v.loewis.de>
References: <20051025164015.3942.JCARLSON@uci.edu>
	<435F1D9D.8060001@v.loewis.de>
Message-ID: <20051026002535.395E.JCARLSON@uci.edu>


"Martin v. Löwis" <martin at v.loewis.de> wrote:
> 
> Josiah Carlson wrote:
> > In this case it's not just a misreading, the characters look identical! 
> > When is an 'E' not an 'E'?  When it is an Epsilon or Ie.  Saying what
> > characters will or will not be used as identifiers, when those
> > characters are keys on a keyboard of a specific type, is pretty
> > presumptuous.
> 
> Why is that rude and disrespectful? I'm certainly respecting developers
> who want to use their scripts for identifiers, or else I would not have
> suggested that they could do so.

I never said rude, I said presumptuous.  "Going beyond what is right or
proper; excessively forward." (according to dictionary.com, the OED has
a similar definition).  I was trying to say that in stating that users
wouldn't be using keys on their keyboard in their natural language when
also using english characters, that you were assuming a bit about their
usage patterns that you perhaps shouldn't.  I certainly could also be
presumptuous in stating that users may very well mix certain languages,
but it seems to be more likely given keywords and the standard library
using the latin alphabet.


> > Indeed, they are similar, but _different_ in my font as well.  The trick
> > is that the glyphs are not different in the case of certain greek or
> > cyrillic letters.  They don't just /look/ similar they /are identical/.
> 
> This string: "EΕ" is the LATIN CAPITAL LETTER E, followed by the GREEK
> CAPITAL LETTER EPSILON. In the font my email composer uses, the E is
> slightly larger than the Epsilon - so there /is/ a visual difference.

My email client doesn't handle unicode, but a quick check by swapping
fonts in a word processor provides that at least on my platform, all
three are the same glyph (same size, shape, ...) for all fixed-width
fonts. If a platform distinguishes all three, then one should consider
one's platform lucky.  Not all platforms and/or preferred fonts of users
are.
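
To make the look-alike point concrete, here is a small sketch in
present-day Python 3 (not the 2.x interpreters under discussion). It
only assumes the standard unicodedata module and prints the code point
and official name of the three capitals mentioned in this thread:

```python
import unicodedata

# Three capital letters that many fonts draw with the same glyph.
lookalikes = ["E", "\u0395", "\u0415"]

for ch in lookalikes:
    # Show the code point and the official Unicode character name.
    print("U+%04X %s" % (ord(ch), unicodedata.name(ch)))

# Identical on screen or not, they are three distinct characters,
# so they would make three distinct identifiers.
assert len(set(lookalikes)) == 3
```

Whether the glyphs differ visibly still depends on the font in use;
the code points never do.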

> But even if there isn't: if this was a frequent problem, the name
> error could include an alternative representation (say, with Unicode
> ordinals for non-ASCII characters) which would give an easy visual
> clue.

It would offer a great cue, but I'm not sure if it is possible.  I think
that it sounds like an ugly discussion of stdout/err encodings and
exception handling machinery that I don't want to be a part of.
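
For what it is worth, the "alternative representation" itself is easy
to sketch in present-day Python 3. The helper name ascii_repr is
hypothetical, and this says nothing about how the interpreter's
NameError machinery would have to be changed to call it:

```python
def ascii_repr(identifier):
    """Return the identifier with each non-ASCII character replaced
    by a backslash escape of its Unicode ordinal."""
    return identifier.encode("ascii", "backslashreplace").decode("ascii")


latin = "E"          # LATIN CAPITAL LETTER E
greek = "\u0395"     # GREEK CAPITAL LETTER EPSILON

# The two names may look alike on screen but print differently here.
print(ascii_repr(latin))
print(ascii_repr(greek))
```

Because the result is pure ASCII, it sidesteps the stdout/stderr
encoding question for the escaped form, at the cost of readability
for users who really do write in that script.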

> I still doubt that this is a frequent problem, and I don't see any
> better grounds for claiming that it is than for claiming that it
> is not.

Whether or not it is frequent will depend on the prevalence of desire to
use those characters.  While I don't think that such uses will be as
common as using 'klass' when passing a class, I do think that it will
result in more than a few sf bug reports.  I also share Marc-Andre
Lemburg's concerns about the understandability of code written in Kanji,
Hebrew, Arabic, etc., at least for those who have not memorized the
entirety of those alphabets.

 - Josiah


From martin at v.loewis.de  Wed Oct 26 21:39:31 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 26 Oct 2005 21:39:31 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <20051026105934.3977.JCARLSON@uci.edu>
References: <435F5149.4060804@egenix.com> <435FBDC4.8030300@v.loewis.de>
	<20051026105934.3977.JCARLSON@uci.edu>
Message-ID: <435FDB73.5080703@v.loewis.de>

Josiah Carlson wrote:
> According to wikipedia (http://en.wikipedia.org/wiki/Latin_alphabet),
> various languages have adopted a transliteration of their language
> and/or former alphabets into latin.  They don't purport to know all of
> the reasons why, and I'm not going to speculate.
> 
> Whether or not more languages start using the latin alphabet is a good
> question.  Basing judgement on history and likely globalization, it is
> only a matter of time before basically all languages have a
> transcription into the latin alphabet that is taught to all (unless
> China takes over the world).

That is a very U.S. centric view. I don't share it, but I think it is
pointless to argue against it.

Regards,
Martin

From ejones at uwaterloo.ca  Thu Oct 27 02:02:54 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Wed, 26 Oct 2005 20:02:54 -0400
Subject: [Python-Dev] Parser and Runtime: Divorced!
Message-ID: <03b7f74aebe5c6249a8bb00ac17d1952@uwaterloo.ca>

After a few hours of tedious and frustrating hacking I've managed to 
separate the Python abstract syntax tree parser from the rest of Python 
itself. This could be useful for people who may wish to build Python 
tools without Python, or tools in C/C++.

In the process of doing this, I came across a comment mentioning that 
it would be desirable to separate the parser. Is there any interest in 
doing this? I now have a vague idea about how to do this. Of course, 
there is no point in making changes like this unless there is some 
tangible benefit.

I will make my ugly hack available once I have polished it a little bit 
more. It involved hacking header files to provide a "re-implementation" 
of the pieces of Python that the parser needs (PyObject, PyString, and 
PyInt). It likely is a bit buggy, and it doesn't support all the types 
(notably, it is missing support for Unicode, Longs, Floats, and 
Complex), but it works well enough to get the AST from a few simple 
strings, which is what I wanted.
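
To illustrate what "get the AST from a few simple strings" means, here
is a sketch using the ast module of present-day Python. This is an
assumption made for illustration only: it is not the standalone C
hack described above, and it needs a full Python runtime, which is
exactly the dependency the hack removes.

```python
import ast

# Parse a simple statement into an abstract syntax tree.
tree = ast.parse("x = 1 + 2")

# Dump the whole tree in a readable textual form.
print(ast.dump(tree))

# The module body holds one assignment whose value is a binary operation.
assign = tree.body[0]
print(type(assign).__name__, type(assign.value).__name__)
```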

Evan Jones

--
Evan Jones
http://evanjones.ca/


From greg.ewing at canterbury.ac.nz  Thu Oct 27 04:14:04 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 27 Oct 2005 15:14:04 +1300
Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <435F5149.4060804@egenix.com>
References: <djl18k$l70$1@sea.gmane.org> <435E7794.2030505@v.loewis.de>
	<20051025120919.3927.JCARLSON@uci.edu> <435EA7EB.90100@egenix.com>
	<435EA9F6.3030702@v.loewis.de> <435F5149.4060804@egenix.com>
Message-ID: <436037EC.7050308@canterbury.ac.nz>

M.-A. Lemburg wrote:

> If you are told to debug a program
> written by say a Japanese programmer using Japanese identifiers
> you are going to have a really hard time.

Or you could look upon it as an opportunity to
broaden your mental horizons by learning some
Japanese. :-)

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From greg.ewing at canterbury.ac.nz  Thu Oct 27 04:14:13 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 27 Oct 2005 15:14:13 +1300
Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <435F20B1.8080803@v.loewis.de>
References: <20051025120919.3927.JCARLSON@uci.edu>
	<435E9E18.7090502@v.loewis.de> <20051025142245.3930.JCARLSON@uci.edu>
	<435EA9CF.6060305@v.loewis.de> <435ED31C.3010800@canterbury.ac.nz>
	<435F20B1.8080803@v.loewis.de>
Message-ID: <436037F5.8050501@canterbury.ac.nz>

Martin v. Löwis wrote:

> Not in the literal sense: you certainly want to allow
> "latin" digits in, say, a cyrillic identifier.

Yes, by "alphabet" I really only meant the letters,
although you might want to apply the same idea to
clusters of digits within an identifier, depending
on how potentially confusable the various sets of
digits are -- I'm not familiar enough with alternative
digit sets to know whether that would be a problem.
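
A rough sketch of the "one alphabet per identifier, digits exempt"
idea, in present-day Python 3. The function name letter_scripts is
hypothetical, and taking the first word of the Unicode character name
as the script is only a heuristic assumed for this example, not a real
script property:

```python
import unicodedata


def letter_scripts(identifier):
    """Return the set of scripts used by the letters of an identifier.

    Digits and underscores are ignored, so "latin" digits are allowed
    in, say, a Cyrillic identifier.
    """
    return {unicodedata.name(ch).split()[0]
            for ch in identifier if ch.isalpha()}


print(letter_scripts("count1"))
# Four Cyrillic letters followed by a Latin digit.
print(letter_scripts("\u0441\u0447\u0435\u04421"))
# CYRILLIC CAPITAL LETTER IE followed by Latin letters: mixed scripts.
print(letter_scripts("\u0415rror"))
```

An identifier whose result has more than one element would be the
kind of mixture that a checker could flag.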

> Just because
> you *can* come up with look-alike identifiers doesn't
> mean that people will use them, or that they will mistake
> the scripts (except for deliberately doing so, of
> course).

I still think this is a much worse potential problem
than that of "l" vs "1", etc. It's reasonable to
adopt the practice of never using "l" as a single
letter identifier, for example. But it would be
unreasonable to ban the use of "E" as an identifier
on the grounds that someone somewhere might confuse
it with a capital epsilon.

An alternative would be to identify such confusable
letters in the various alphabets and define them
to be equivalent.
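
As a sketch of that alternative in present-day Python 3: the table
below is hypothetical and deliberately tiny (a real one would be far
larger and would need careful curation), and fold_confusables is a
name made up for this example.

```python
# Map look-alike capitals from other alphabets onto their Latin twin.
CONFUSABLE_TO_LATIN = {
    "\u0395": "E",  # GREEK CAPITAL LETTER EPSILON
    "\u0415": "E",  # CYRILLIC CAPITAL LETTER IE
    "\u0391": "A",  # GREEK CAPITAL LETTER ALPHA
    "\u0410": "A",  # CYRILLIC CAPITAL LETTER A
}


def fold_confusables(identifier):
    """Replace each confusable letter by its chosen representative."""
    return "".join(CONFUSABLE_TO_LATIN.get(ch, ch) for ch in identifier)


# After folding, the three look-alike names compare equal.
print(fold_confusables("\u0395") == fold_confusables("E"))
print(fold_confusables("\u0415") == fold_confusables("E"))
```

Folding would have to be applied both when a name is bound and when
it is looked up for the equivalence to hold.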

And beyond the issue of alphabets there's also the
question of whether accented characters should be
considered distinct. I can see quite a few holy
flame wars erupting over that...
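
The accent question has two layers, which this present-day Python 3
sketch tries to separate: the same accented letter can be spelled
with different code point sequences, and one may or may not want to
ignore the accent altogether. The helper strip_accents is hypothetical.

```python
import unicodedata


def strip_accents(identifier):
    """Decompose the text and drop all combining marks."""
    decomposed = unicodedata.normalize("NFD", identifier)
    return "".join(ch for ch in decomposed
                   if not unicodedata.combining(ch))


precomposed = "z\u00e4hler"   # a-umlaut as a single code point
combining = "za\u0308hler"    # 'a' followed by COMBINING DIAERESIS

# Same appearance, different strings until they are normalized.
print(precomposed == combining)
print(unicodedata.normalize("NFC", combining) == precomposed)

# Treating accented and unaccented letters as the same identifier.
print(strip_accents(precomposed))
```

Normalizing to one spelling is uncontroversial; dropping the accents
is the part likely to start the flame wars.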

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg.ewing at canterbury.ac.nz	   +--------------------------------------+

From jcarlson at uci.edu  Thu Oct 27 08:23:39 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed, 26 Oct 2005 23:23:39 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <435FDB73.5080703@v.loewis.de>
References: <20051026105934.3977.JCARLSON@uci.edu>
	<435FDB73.5080703@v.loewis.de>
Message-ID: <20051026232149.399A.JCARLSON@uci.edu>


"Martin v. Löwis" <martin at v.loewis.de> wrote:
> Josiah Carlson wrote:
> > According to wikipedia (http://en.wikipedia.org/wiki/Latin_alphabet),
> > various languages have adopted a transliteration of their language
> > and/or former alphabets into latin.  They don't purport to know all of
> > the reasons why, and I'm not going to speculate.
> > 
> > Whether or not more languages start using the latin alphabet is a good
> > question.  Basing judgement on history and likely globalization, it is
> > only a matter of time before basically all languages have a
> > transcription into the latin alphabet that is taught to all (unless
> > China takes over the world).
> 
> That is a very U.S. centric view. I don't share it, but I think it is
> pointless to argue against it.

I should have included a ;).  Whether or not in the future all languages
use the latin alphabet should have little to do with whether Python
chooses to support non-latin identifiers in the forthcoming 2.5 or later
releases.

 - Josiah


From mal at egenix.com  Thu Oct 27 11:09:04 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 27 Oct 2005 11:09:04 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <435FBDC4.8030300@v.loewis.de>
References: <djl18k$l70$1@sea.gmane.org>	<435E7794.2030505@v.loewis.de>	<20051025120919.3927.JCARLSON@uci.edu>	<435EA7EB.90100@egenix.com>
	<435EA9F6.3030702@v.loewis.de>	<435F5149.4060804@egenix.com>
	<435FBDC4.8030300@v.loewis.de>
Message-ID: <43609930.5030907@egenix.com>

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>You even argued against having non-ASCII identifiers:
>>
>>http://mail.python.org/pipermail/python-list/2002-May/102936.html
> 
> 
> I see :-) It seems I have changed my mind since then (which
> apparently predates PEP 263).
> 
> One issue I apparently was worried about was the plan to use
> native-encoding byte strings for the identifiers; this I didn't
> like at all.
> 
> 
>>* Unicode identifiers are going to introduce massive
>>code breakage - just think of all the tools people use
>>to manipulate Python code today; I'm quite sure that
>>most of it will fail in one way or another if you present
>>it Unicode literals such as in "zähler += 1".
> 
> 
> True. Today, I think I would be willing to accept the
> code breakage: these tools had quite some time to update
> themselves to PEP 263 (even though not all of them have
> done so yet); also, usage of the feature would only spread
> gradually. A failure to support the feature in the Python
> proper would be treated as a bug by us; how tool providers
> deal with the feature would be their choice.

I was thinking of introspection and debugging tools.
These would then see Unicode objects in the namespace
dictionaries and this will likely break a lot of code -
much for the same reason you see code breakage now
if you let Unicode object enter the Python standard lib
without warning :-)
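
To show what such a tool would see, here is a sketch run under
Python 3, which later did adopt non-ASCII identifiers (PEP 3131). The
variable names are made up for the example; the point is only that
the key stored in the namespace dictionary is no longer pure ASCII.

```python
namespace = {}

# Execute code that binds a non-ASCII identifier in a fresh namespace.
exec("z\u00e4hler = 0\nz\u00e4hler += 1", namespace)

# exec() also inserts __builtins__, so filter the dunder names out.
names = [name for name in namespace if not name.startswith("__")]
print(names)
print(namespace["z\u00e4hler"])

# A tool that assumes ASCII-only names would trip over this key.
print(all(ord(ch) < 128 for name in names for ch in name))
```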

>>* People don't seem very interested in using Unicode
>>identifiers, e.g.
>>
>>  http://mail.python.org/pipermail/i18n-sig/2001-February/000828.html
> 
> 
> True. However, I also suspect that lack of tool support
> contributes to that. For the specific case of Java,
> there is no notion of source encoding, which makes Unicode
> identifiers really tedious to use.
> 
> If it were really easy to use, I assume people would actually
> use it - at least in some of the contexts, like teaching,
> where Python is also widely used.

Well, that has two sides: Of course, you'll always find
some people that will like a certain feature. The question
is what effects does it have on the rest of us.

Python has always put some constraints on programmers
to raise code readability, e.g. white space awareness.
Giving them Unicode identifiers sounds like a step
backwards in this context.

Note that I'm not talking about comments, string literal
contents, etc. - only the programming logic, ie. keywords
and identifiers.

>>Do you really think that it will help with code readability
>>if programmers are allowed to use native scripts for their
>>identifiers ?
> 
> 
> Yes, I do - for some groups of users. Of course, code sharing
> would be more difficult, and there certainly should be a policy
> to use only ASCII in the standard library. But within local
> groups, users would find understanding code easier if they
> knew what the identifiers actually meant.

Hmm, but why do you think they wouldn't understand the meaning of
ASCII versions of the identifiers ?

Note that using ASCII doesn't necessarily mean that you
have to use English as basis for the naming schemes of
identifiers.

>>If you are told to debug a program
>>written by say a Japanese programmer using Japanese identifiers
>>you are going to have a really hard time. Integrating such
>>code into other applications will be even harder, since you'd
>>be forced to use his Japanese class names in your application.
> 
> 
> Certainly, yes. There is a trade-off: you can make it easier
> for some people to read and write code if they can use their
> native script; at the same time, it would be harder for others
> to read and modify it.
> 
> It's a policy decision whether you use English identifiers or
> not - it shouldn't be a technical decision (as it currently
> is).

See above: ASCII != English. Most scripts have a transliteration
into ASCII - simply because that's the global standard for
scripts.

>>I think source code encodings provide an ideal way to
>>have comments written in native scripts - and people
>>use that a lot. However, keeping the program code itself
>>in plain ASCII makes it far more readable and reusable
>>across locales. Something that's important in this
>>globalized world.
> 
> 
> Certainly. However, some programs don't need to live in
> a globalized world - e.g. if they are homework in a school.
> Within a locale, using native scripts would make the program
> more readable.

True.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Oct 27 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From mal at egenix.com  Thu Oct 27 11:25:22 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 27 Oct 2005 11:25:22 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <436037EC.7050308@canterbury.ac.nz>
References: <djl18k$l70$1@sea.gmane.org>
	<435E7794.2030505@v.loewis.de>	<20051025120919.3927.JCARLSON@uci.edu>
	<435EA7EB.90100@egenix.com>	<435EA9F6.3030702@v.loewis.de>
	<435F5149.4060804@egenix.com> <436037EC.7050308@canterbury.ac.nz>
Message-ID: <43609D02.8080209@egenix.com>

Greg Ewing wrote:
> M.-A. Lemburg wrote:
> 
> 
>>If you are told to debug a program
>>written by say a Japanese programmer using Japanese identifiers
>>you are going to have a really hard time.
> 
> 
> Or you could look upon it as an opportunity to
> broaden your mental horizons by learning some
> Japanese. :-)

I just took Japanese as an example of a language and script
that I don't know anything about. I would actually love
to learn some Japanese, but simply don't have the time
for learning it.

Anyway, I could just as well have chosen Tibetan, Thai or Limbu
scripts (which all look very nice, BTW):

	http://www.unicode.org/charts/

Perhaps this is not as bad after all - I just don't think that
it will help code readability in the long run.

-- 
Marc-Andre Lemburg
eGenix.com


From martin at v.loewis.de  Thu Oct 27 12:35:13 2005
From: martin at v.loewis.de (martin@v.loewis.de)
Date: Thu, 27 Oct 2005 12:35:13 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
Message-ID: <1130409313.4360ad6139518@www.domainfactory-webmail.de>

The Python source code repository is now converted to subversion;
please feel free to start checking out new sandboxes. For a few
days, this installation probably still needs to be considered in
testing. If there are no serious problems found by next Monday,
I would consider conversion of the data complete. The CVS repository
will be kept available read-only for a while longer, so you can
easily forward any patches you may have.

Most of you are probably interested in checking out one of these
folders:

svn+ssh://pythondev@svn.python.org/python/trunk
svn+ssh://pythondev@svn.python.org/python/branches/release24-maint
svn+ssh://pythondev@svn.python.org/peps

The anonymous read-only equivalents of these are

http://svn.python.org/projects/python/trunk
http://svn.python.org/projects/python/branches/release24-maint
http://svn.python.org/projects/peps

As mentioned before, in addition to "plain" http/WebDAV,
viewcvs is available at

http://svn.python.org/view/

There are some more things left to be done, such as updating
the developer documentation. I'll start working on that soon,
but contributions are welcome.

Regards,
Martin



From skip at pobox.com  Thu Oct 27 12:49:35 2005
From: skip at pobox.com (skip@pobox.com)
Date: Thu, 27 Oct 2005 05:49:35 -0500
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
Message-ID: <17248.45247.676631.388117@montanaro.dyndns.org>

>>>>> "martin" == martin  <martin at v.loewis.de> writes:

    martin> The Python source code repository is now converted to
    martin> subversion; please feel free to start checking out new
    martin> sandboxes. 

Excellent...  Thanks for all the effort.

Skip

From jeremy at alum.mit.edu  Thu Oct 27 14:23:45 2005
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 27 Oct 2005 08:23:45 -0400
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <2mbr1g6loh.fsf@starship.python.net>
References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net>
Message-ID: <e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>

Can anyone point an old CVS/Perforce-Luddite at instructions for how
to use the new SVN repository?

Jeremy

On 10/23/05, Michael Hudson <mwh at python.net> wrote:
> "Martin v. Löwis" <martin at v.loewis.de> writes:
>
> > I'd like to start the subversion switchover this coming Wednesday,
> > with a total commit freeze at 16:00 GMT.
>
> Yay!  Thanks again for doing this.
>
> Cheers,
> mwh
>
> --
>   [Perl] combines all the worst aspects of C and Lisp: a billion
>   different sublanguages in one monolithic executable.  It combines
>   the power of C with the readability of PostScript. -- Jamie Zawinski

From dave at boost-consulting.com  Thu Oct 27 14:26:36 2005
From: dave at boost-consulting.com (David Abrahams)
Date: Thu, 27 Oct 2005 08:26:36 -0400
Subject: [Python-Dev] [Docs] MinGW and libpython24.a
References: <u3bmp62qz.fsf@boost-consulting.com>
	<435E768D.2000401@v.loewis.de> <u7jc12z7u.fsf@boost-consulting.com>
Message-ID: <u3bmnrvr7.fsf_-_@boost-consulting.com>

David Abrahams <dave at boost-consulting.com> writes:

> "Martin v. Löwis" <martin at v.loewis.de> writes:
>
>> David Abrahams wrote:
>>> Is the instruction at
>>> http://www.python.org/dev/doc/devel/inst/tweak-flags.html#SECTION000622000000000000000
>>> still relevant?  I am not 100% certain I didn't make one myself, but
>>> it looks to me as though my Windows Python 2.4.1 distro came with a
>>> libpython24.a.  I am asking here because it seems only the person who
>>> prepares the installer would know.
>>
>> That impression might be incorrect: I can tell you when I started
>> including libpython24.a, but I have no clue whether the instructions
>> you refer to are correct - I don't use the file myself at all.
>>
>>> If this is true, in which version was it introduced?
>>
>> It was introduced in 1.20/1.16.2.4 of Tools/msi/msi.py in response to
>> patch #1088716; this in turn was first used to release r241c1.
>
> Thanks!

As it turns out, MinGW also implemented, in version 3.0.0 (with
binutils-2.13.90-20030111-1), features which make the creation of
libpython24.a unnecessary.  So whoever maintains this doc might want
to note that you only need that step if you are using a version of
Python prior to 2.4.1 with a MinGW prior to 3.0.0 (with
binutils-2.13.90-20030111-1).

Regards
-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From walter at livinglogic.de  Thu Oct 27 14:41:07 2005
From: walter at livinglogic.de (Walter Dörwald)
Date: Thu, 27 Oct 2005 14:41:07 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
Message-ID: <4360CAE3.4090408@livinglogic.de>

martin at v.loewis.de wrote:

> The Python source code repository is now converted to subversion;
> [...]

Thanks for doing this.

BTW, will there be daily tarballs, like the one available from:
http://cvs.perl.org/snapshots/python/python/python-latest.tar.gz

Bye,
    Walter Dörwald

From eric.nieuwland at xs4all.nl  Thu Oct 27 14:45:54 2005
From: eric.nieuwland at xs4all.nl (Eric Nieuwland)
Date: Thu, 27 Oct 2005 14:45:54 +0200
Subject: [Python-Dev] Proposed resolutions for open PEP 343 issues
In-Reply-To: <20051026083246.6tg0gu2gafms8gkc@login.werra.lunarpages.com>
References: <20051026083246.6tg0gu2gafms8gkc@login.werra.lunarpages.com>
Message-ID: <0a6f9d0d448835d658af6e4be85bd954@xs4all.nl>

Michael Chermside wrote:

> Guido writes:
>> I find "AttributeError: __exit__" just as informative.
>
> Eric Nieuwland responds:
>> I see. Then why don't we unify *Error into Error?
>> Just read the message and know what it means.
>> And we could then drop the burden of exception classes and only use 
>> the
>> message.
>> A sense of deja-vu comes over me somehow ;-)
>
> The answer (and there _IS_ an answer) is that using different exception
> types allows the user some flexibility in CATCHING the exceptions. The
> discussion you have been following obscures that point somewhat because
> there's little meaningful difference between TypeError and
> AttributeError (at least in well-written code that doesn't have
> unnecessary typechecks in it).

Yep. I too would like to have 'SOME flexibility in catching the 
exceptions' meaning I'd like to be able to catch TypeErrors and 
AttributeErrors while not catching what I call ProtocolErrors. The 
simple reason is that in most of my apps TypeErrors and AttributeErrors 
will depend on the runtime situation, while ProtocolErrors will mostly 
be static. So I'll debug for ProtocolErrors and I'll handle runtime 
stuff.
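
A sketch of what I mean, written in present-day Python 3. ProtocolError
is hypothetical (there is no such built-in), as are the Good and
BadData classes and the use/safe_use helpers:

```python
class ProtocolError(Exception):
    """Hypothetical: the object does not implement the protocol."""


class Good:
    value = "ok"

    def __exit__(self, *exc_info):
        return False


class BadData(Good):
    value = None  # wrong data at runtime, protocol still intact


def use(resource):
    if not hasattr(resource, "__exit__"):
        raise ProtocolError("%r lacks __exit__" % (resource,))
    return resource.value.upper()  # may raise AttributeError/TypeError


def safe_use(resource):
    # Handle the runtime-dependent errors; let ProtocolError propagate
    # so that the static mistake shows up while debugging.
    try:
        return use(resource)
    except (TypeError, AttributeError) as exc:
        return "runtime problem: " + type(exc).__name__


print(safe_use(Good()))
print(safe_use(BadData()))
try:
    safe_use(object())
except ProtocolError as exc:
    print("static problem:", exc)
```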

> If there were a significant difference between TypeError and
> AttributeError then Nick and Guido would have immediately chosen the
> appropriate error type based on functionality rather than style, and
> there wouldn't have been any need for discussion.

I got that already. To me it means one of them may be a candidate for 
removal/redefinition.

> Oh yeah, and you can also put extra info into an exception object
> besides just the error message. (We don't do that as often as we
> should... it's a powerful technique.)

Perhaps that calls for some propaganda then. I won't dare to suggest 
syntactic sugar ;-)
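
For the record, the technique Michael mentions might look like this
in present-day Python 3. MissingMethodError and its attributes are
made up for the example:

```python
class MissingMethodError(TypeError):
    """Hypothetical exception that records what was missing."""

    def __init__(self, obj, method_name):
        super().__init__("%s object has no %s method"
                         % (type(obj).__name__, method_name))
        # Extra information, available to handlers without parsing
        # the message text.
        self.obj = obj
        self.method_name = method_name


try:
    raise MissingMethodError(42, "__exit__")
except MissingMethodError as exc:
    caught = exc  # keep a reference; 'exc' is unbound after the block

print(caught)
print(caught.method_name)
```

The handler can branch on caught.method_name instead of matching on
the wording of the message.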

--eric


From skip at pobox.com  Thu Oct 27 14:54:59 2005
From: skip at pobox.com (skip@pobox.com)
Date: Thu, 27 Oct 2005 07:54:59 -0500
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
Message-ID: <17248.52771.225830.484931@montanaro.dyndns.org>


    Jeremy> Can anyone point an old CVS/Perforce-Luddite at instructions for
    Jeremy> how to use the new SVN repository?

Jeremy,

I'd never used Subversion until Barry grabbed the python.org web maintainers
by our collective ears and dragged us to the table with the kool aid.  As it
turns out, the svn flavored kool aid tastes about the same as the cvs flavor
(svn {commit,up,diff} == cvs {commit,up,diff}), though there are some slight
aftertastes you have to get used to (e.g., revision numbers are for the
entire repository, not just a single file).

That said, the best place to start is probably the Subversion book,
available in both online and dead tree versions:

    http://svnbook.red-bean.com/

Appendix A of that book is "Subversion for CVS Users".  Probably worth a
quick skim and a browser bookmark.

Though there's no svn/cvs cheatsheet there, you may also find isolated
tidbits in the Subversion FAQ:

    http://subversion.tigris.org/faq.html

Just grep around for "cvs".

Skip

From wl at flexis.de  Thu Oct 27 15:15:54 2005
From: wl at flexis.de (Wolfgang Langner)
Date: Thu, 27 Oct 2005 15:15:54 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <17248.45247.676631.388117@montanaro.dyndns.org>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<17248.45247.676631.388117@montanaro.dyndns.org>
Message-ID: <djqjua$rjh$1@sea.gmane.org>

Hello,

skip at pobox.com wrote:

>     martin> The Python source code repository is now converted to
>     martin> subversion; please feel free to start checking out new
>     martin> sandboxes. 
> 
> Excellent...  Thanks for all the effort.
Good work. I checked the http and viewcvs access and all worked.

But why is an old subversion used ?
(Powered by Subversion version 1.1.4)

bye by Wolfgang


From mwh at python.net  Thu Oct 27 15:57:19 2005
From: mwh at python.net (Michael Hudson)
Date: Thu, 27 Oct 2005 14:57:19 +0100
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	(martin@v.loewis.de's
	message of "Thu, 27 Oct 2005 12:35:13 +0200")
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
Message-ID: <2m64rj5agw.fsf@starship.python.net>

martin at v.loewis.de writes:

> The Python source code repository is now converted to subversion;
> please feel free to start checking out new sandboxes. For a few
> days, this installation probably still needs to be considered in
> testing. If there are no serious problems found by next Monday,
> I would consider conversion of the data complete. The CVS repository
> will be kept available read-only for a while longer, so you can
> easily forward any patches you may have.

Woo!

Do checkins to svn.python.org go to the python-checkins list already?

Cheers,
mwh

-- 
  <skreech> How do I keep people from reading my Perl code? Oh wait.
            Ha ha!                              -- from Twisted.Quotes

From jim at zope.com  Thu Oct 27 16:03:08 2005
From: jim at zope.com (Jim Fulton)
Date: Thu, 27 Oct 2005 10:03:08 -0400
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
Message-ID: <4360DE1C.3010602@zope.com>

Jeremy Hylton wrote:
> Can anyone point an old CVS/Perforce-Luddite at instructions for how
> to use the new SVN repository?

And can you remind us where to send our public keys? :)

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From guido at python.org  Thu Oct 27 17:32:04 2005
From: guido at python.org (Guido van Rossum)
Date: Thu, 27 Oct 2005 08:32:04 -0700
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
Message-ID: <ca471dc20510270832n75cf9b4ic588ffb368975c23@mail.gmail.com>

On 10/27/05, martin at v.loewis.de <martin at v.loewis.de> wrote:
> The Python source code repository is now converted to subversion;
> please feel free to start checking out new sandboxes.

Woo hoo! Thanks for all the hard work and good thinking, Martin.

> Most of you are probably interested in checking out one of these
> folders:
>
> svn+ssh://pythondev@svn.python.org/python/trunk
> svn+ssh://pythondev@svn.python.org/python/branches/release24-maint
> svn+ssh://pythondev@svn.python.org/peps

This doesn't work for me. I'm sure the problem is on my end, but my
svn skills are too rusty to figure it out. I get this:

$ svn checkout svn+ssh://pythondev at svn.python.org/peps
Permission denied (publickey,keyboard-interactive).

svn: Connection closed unexpectedly
$ svn --version
svn, version 1.2.0 (r14790)
   compiled Jun 13 2005, 18:51:32

Copyright (C) 2000-2005 CollabNet.
Subversion is open source software, see http://subversion.tigris.org/
This product includes software developed by CollabNet (http://www.Collab.Net/).

The following repository access (RA) modules are available:

* ra_dav : Module for accessing a repository via WebDAV (DeltaV) protocol.
  - handles 'http' scheme
  - handles 'https' scheme
* ra_svn : Module for accessing a repository using the svn network protocol.
  - handles 'svn' scheme
* ra_local : Module for accessing a repository on local disk.
  - handles 'file' scheme

$

I can ssh to svn.python.org just fine, with no password (it says it's
dinsdale). I can checkout the read-only versions just fine. I can work
with the pydotorg svn repository just fine (checked something in last
week).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at pobox.com  Thu Oct 27 18:07:16 2005
From: skip at pobox.com (skip@pobox.com)
Date: Thu, 27 Oct 2005 11:07:16 -0500
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <4360DE1C.3010602@zope.com>
References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
	<4360DE1C.3010602@zope.com>
Message-ID: <17248.64308.936680.578655@montanaro.dyndns.org>


    Jim> And can you remind us where to send our public keys? :)

Jim,

Send your keys to pydotorg at python.org.  Unless you specify otherwise, your
login will probably be "jim.fulton".

Skip



From martin at v.loewis.de  Thu Oct 27 19:18:01 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 27 Oct 2005 19:18:01 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <4360CAE3.4090408@livinglogic.de>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<4360CAE3.4090408@livinglogic.de>
Message-ID: <43610BC9.1040508@v.loewis.de>

Walter Dörwald wrote:
> Thanks for doing this.
> 
> BTW, will there be daily tarballs, like the one available from:
> http://cvs.perl.org/snapshots/python/python/python-latest.tar.gz

Will be, yes (I'm saddened that you refer to this location, and not
http://www.dcl.hpi.uni-potsdam.de/home/loewis/python.tgz :-)

I'm planning to provide them at http://svn.python.org/snapshots.

Regards,
Martin

From martin at v.loewis.de  Thu Oct 27 19:19:50 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 27 Oct 2005 19:19:50 +0200
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <17248.52771.225830.484931@montanaro.dyndns.org>
References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
	<17248.52771.225830.484931@montanaro.dyndns.org>
Message-ID: <43610C36.2030500@v.loewis.de>

skip at pobox.com wrote:
> Though there's no svn/cvs cheatsheet there, you may also find isolated
> tidbits in the Subversion FAQ:
> 
>     http://subversion.tigris.org/faq.html
> 
> Just grep around for "cvs".

In addition, you might want to read

http://www.python.org/dev/svn.html

Regards,
Martin

From martin at v.loewis.de  Thu Oct 27 19:20:53 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 27 Oct 2005 19:20:53 +0200
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <4360DE1C.3010602@zope.com>
References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
	<4360DE1C.3010602@zope.com>
Message-ID: <43610C75.1020908@v.loewis.de>

Jim Fulton wrote:
>> Can anyone point an old CVS/Perforce-Luddite at instructions for how
>> to use the new SVN repository?
> 
> 
> And can you remind us where to send our public keys? :)

pydotorg at python.org should work; you will get a confirmation when they
are installed.

Regards,
Martin

From ndbecker2 at gmail.com  Thu Oct 27 19:27:53 2005
From: ndbecker2 at gmail.com (Neal Becker)
Date: Thu, 27 Oct 2005 13:27:53 -0400
Subject: [Python-Dev] Help with inotify
Message-ID: <djr2mo$g9u$1@sea.gmane.org>

I'm trying to make a module to support inotify (linux).  I put together a
module using boost::python.  Problem is, inotify uses a file descriptor. 
If I call python os.fdopen on it, I get an error:
Python 2.4.1 (#1, May 16 2005, 15:15:14)
[GCC 4.0.0 20050512 (Red Hat 4.0.0-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from inotify import *
>>> import os
>>> i=inotify()
>>> i.fileno()
4
>>> os.fdopen (i.fileno())
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IOError: [Errno 21] Is a directory

Any ideas?  I'd rather not have to trace through python if I could avoid it
(I don't even have source installed here).


From kbk at shore.net  Thu Oct 27 19:35:30 2005
From: kbk at shore.net (Kurt B. Kaiser)
Date: Thu, 27 Oct 2005 13:35:30 -0400 (EDT)
Subject: [Python-Dev] Weekly Python Patch/Bug Summary
Message-ID: <200510271735.j9RHZUHG005080@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  360 open (+16) /  2956 closed ( +1) /  3316 total (+17)
Bugs    :  893 open (+10) /  5353 closed (+12) /  6246 total (+22)
RFE     :  199 open ( -2) /   189 closed ( +2) /   388 total ( +0)

New / Reopened Patches
______________________

Patch for (Doc) #1255218  (2005-10-17)
       http://python.org/sf/1328526  opened by  Peter van Kampen

Patch for (Doc) #1261659  (2005-10-17)
       http://python.org/sf/1328566  opened by  Peter van Kampen

pclose raises spurious exception on win32  (2005-10-17)
       http://python.org/sf/1328851  opened by  Guido van Rossum

datetime/xmlrpclib.DateTime comparison  (2005-10-18)
       http://python.org/sf/1330538  opened by  Skip Montanaro

tarfile.py: fix for 1330039  (2005-10-19)
CLOSED http://python.org/sf/1331635  opened by  Lars Gustäbel

Allow use of non-latin1 chars in interactive shell  (2005-10-21)
       http://python.org/sf/1333679  opened by  Noam Raphael

Fix for int(string, base) wrong answers  (2005-10-22)
       http://python.org/sf/1334979  opened by  Adam Olsen

Patch to implement PEP 351  (2005-10-23)
       http://python.org/sf/1335812  opened by  Barry A. Warsaw

Fix for int(string, base) wrong answers (take 2)  (2005-10-24)
       http://python.org/sf/1335972  opened by  Alan McIntyre

remove 4 ints from PyFrameObject  (2005-10-24)
       http://python.org/sf/1337051  opened by  Neal Norwitz

Elemental Security contribution - parsexml.py  (2005-10-25)
       http://python.org/sf/1337648  opened by  Guido van Rossum

Elemental Security contribution - pgen2 package  (2005-10-25)
       http://python.org/sf/1337696  opened by  Guido van Rossum

fileinput patch for bug #1336582  (2005-10-25)
       http://python.org/sf/1337756  opened by  A. Murat EREN

Inconsistent use of buffer interface in string and unicode  (2005-10-25)
       http://python.org/sf/1337876  opened by  Phil Thompson

tarfile.py: fix for bug #1336623  (2005-10-26)
       http://python.org/sf/1338314  opened by  Lars Gustäbel

cross compile and mingw support  (2005-10-27)
       http://python.org/sf/1339673  opened by  Jan Nieuwenhuizen

Patches Closed
______________

tarfile.py: fix for 1330039  (2005-10-19)
       http://python.org/sf/1331635  closed by  nnorwitz

New / Reopened Bugs
___________________

HTTPResponse instance has no attribute 'fileno'  (2005-10-16)
       http://python.org/sf/1327971  opened by  Kevin Dwyer

__getslice__ taking priority over __getitem__  (2005-10-17)
       http://python.org/sf/1328278  opened by  Josh Marshall

os-process.html  (2005-10-17)
CLOSED http://python.org/sf/1328915  opened by  Noah Spurrier

Empty Generator doesn't evaluate as False  (2005-10-17)
CLOSED http://python.org/sf/1328959  opened by  Christian Höltje

tarfile.add() produces hard links instead of normal files  (2005-10-18)
CLOSED http://python.org/sf/1330039  opened by  Martin Pitt

utf 7 codec broken  (2005-10-19)
CLOSED http://python.org/sf/1331062  opened by  Ralf Schmitt

string_subscript doesn't check for failed PyMem_Malloc  (2005-10-19)
CLOSED http://python.org/sf/1331563  opened by  Adam Olsen

Incorrect use of -L/usr/lib/termcap  (2005-10-19)
       http://python.org/sf/1332732  opened by  Robert M. Zigweid

Inaccurate footnote 1 in Lib ref, sect 2.3.6.4  (2005-10-20)
CLOSED http://python.org/sf/1332780  opened by  Andy

BSD DB test failures for BSD DB 3.2  (2005-10-19)
       http://python.org/sf/1332852  opened by  Neal Norwitz

Fatal Python error: Interpreter not initialized  (2005-10-20)
       http://python.org/sf/1332869  opened by  Andrew Mitchell

BSD DB test failures for BSD DB 4.1  (2005-10-19)
       http://python.org/sf/1332873  opened by  Neal Norwitz

Bugs of the new AST compiler  (2005-10-21)
       http://python.org/sf/1333982  opened by  Armin Rigo

int(string, base) wrong answers  (2005-10-22)
       http://python.org/sf/1334662  opened by  Tim Peters

Python 2.4.2 doesn't build with "--without-threads"  (2005-10-22)
       http://python.org/sf/1335054  opened by  Gunter Ohrner

fileinput device or resource busy error  (2005-10-24)
       http://python.org/sf/1336582  opened by  A. Murat EREN

tarfile can't extract some tar archives..  (2005-10-24)
       http://python.org/sf/1336623  opened by  A. Murat EREN

Python.h should include system headers properly [POSIX]  (2005-10-25)
       http://python.org/sf/1337400  opened by  Dimitri Papadopoulos

IDLE, F5 ? wrong external file content on error.  (2005-10-26)
       http://python.org/sf/1337987  opened by  MvGulik

doctest mishandles exceptions raised within generators  (2005-10-26)
       http://python.org/sf/1337990  opened by  Tim Wegener

Memory keeping  (2005-10-26)
       http://python.org/sf/1338264  opened by  sin

CVS webbrowser.py (1.40) bugs  (2005-10-26)
       http://python.org/sf/1338995  opened by  Greg Couch

shelve.Shelf.__del__ throws exceptions  (2005-10-26)
       http://python.org/sf/1339007  opened by  Geoffrey T. Dairiki

Threading misbehavior with lambdas  (2005-10-27)
CLOSED http://python.org/sf/1339045  opened by  Maciek Fijalkowski

Bugs Closed
___________

wrong TypeError traceback in generator expressions  (2005-10-14)
       http://python.org/sf/1327110  closed by  mwh

os-process.html  (2005-10-17)
       http://python.org/sf/1328915  closed by  nnorwitz

Empty Generator doesn't evaluate as False  (2005-10-17)
       http://python.org/sf/1328959  closed by  rhettinger

tarfile.add() produces hard links instead of normal files  (2005-10-18)
       http://python.org/sf/1330039  closed by  nnorwitz

utf 7 codec broken  (2005-10-19)
       http://python.org/sf/1331062  closed by  lemburg

string_subscript doesn't check for failed PyMem_Malloc  (2005-10-19)
       http://python.org/sf/1331563  closed by  nnorwitz

Inaccurate footnote 1 in Lib ref, sect 2.3.6.4  (2005-10-20)
       http://python.org/sf/1332780  closed by  birkenfeld

Argument genexp corner case  (2005-03-21)
       http://python.org/sf/1167751  closed by  nnorwitz

Encodings iso8859_1 and latin_1 are redundant  (2005-08-12)
       http://python.org/sf/1257525  closed by  lemburg

ISO8859-9 broken  (2005-10-11)
       http://python.org/sf/1324237  closed by  lemburg

mac_roman codec missing "apple" codepoint  (2005-10-04)
       http://python.org/sf/1313051  closed by  lemburg

line numbers off by 1 in dis  (2005-07-28)
       http://python.org/sf/1246473  closed by  nascheme

Threading misbehavior with lambdas  (2005-10-27)
       http://python.org/sf/1339045  deleted by  fijal

RFE Closed
__________

python scratchpad (IDLE)  (2005-10-14)
       http://python.org/sf/1326830  closed by  kbk

datetime.replace could take a dict  (2005-09-20)
       http://python.org/sf/1296581  closed by  birkenfeld


From martin at v.loewis.de  Thu Oct 27 19:38:32 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 27 Oct 2005 19:38:32 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <ca471dc20510270832n75cf9b4ic588ffb368975c23@mail.gmail.com>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<ca471dc20510270832n75cf9b4ic588ffb368975c23@mail.gmail.com>
Message-ID: <43611098.3000401@v.loewis.de>

Guido van Rossum wrote:
> Woo hoo! Thanks for all the hard work and good thinking, Martin.

My pleasure!

>>svn+ssh://pythondev at svn.python.org/python/trunk
>>svn+ssh://pythondev at svn.python.org/python/branches/release24-maint
>>svn+ssh://pythondev at svn.python.org/peps
> 
> 
> This doesn't work for me. I'm sure the problem is on my end, but my
> svn skills are too rusty to figure it out.

It's actually not: you missed the pythondev@ part. To access the
repository, your SSH key must be added to pythondev's authorized_keys
file; it previously wasn't.

I have now added your key <something>.comcast.net to the file;
I did not add guido at eric, as SSH1 is not supported. Please try
again.

The list of committers is (now) at

http://www.python.org/dev/committers

Anybody not on the list who wishes to be (and had access to the CVS),
please send your key; if you have access to dinsdale, just let us
know and we will copy your key.

Regards,
Martin

From fdrake at acm.org  Thu Oct 27 19:40:04 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 27 Oct 2005 13:40:04 -0400
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <17248.64308.936680.578655@montanaro.dyndns.org>
References: <435BC27C.1010503@v.loewis.de> <4360DE1C.3010602@zope.com>
	<17248.64308.936680.578655@montanaro.dyndns.org>
Message-ID: <200510271340.04668.fdrake@acm.org>

On Thursday 27 October 2005 12:07, skip at pobox.com wrote:
 > Send your keys to pydotorg at python.org.  Unless you specify otherwise, your
 > login will probably be "jim.fulton".

Mail to pydotorg doesn't allow posting from non-members; I watch for 
notifications for owner on that list and try to approve as quickly as 
possible, but it's a manual process just to get the mail through.

We should probably have a dedicated address for this, or tell people to send 
them to webmaster.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From martin at v.loewis.de  Thu Oct 27 19:56:34 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 27 Oct 2005 19:56:34 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <djqjua$rjh$1@sea.gmane.org>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>	<17248.45247.676631.388117@montanaro.dyndns.org>
	<djqjua$rjh$1@sea.gmane.org>
Message-ID: <436114D2.4090803@v.loewis.de>

Wolfgang Langner wrote:
> But why is an old subversion used ?
> (Powered by Subversion version 1.1.4)

That's the one Debian provides. We don't build our own, but use
Debian packages for everything.

Also, subversion 1.1 is not old: it was released on Oct 4, 2004;
1.1.4 is less than a year old.

Regards,
Martin

From martin at v.loewis.de  Thu Oct 27 19:57:27 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 27 Oct 2005 19:57:27 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <2m64rj5agw.fsf@starship.python.net>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<2m64rj5agw.fsf@starship.python.net>
Message-ID: <43611507.8090606@v.loewis.de>

Michael Hudson wrote:
> Do checkins to svn.python.org go to the python-checkins list already?

They do indeed - you should have received one commit message by now
(me testing whether committing works, on PEP 347).

Regards,
Martin

From martin at v.loewis.de  Thu Oct 27 19:59:06 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 27 Oct 2005 19:59:06 +0200
Subject: [Python-Dev] [Docs] MinGW and libpython24.a
In-Reply-To: <u3bmnrvr7.fsf_-_@boost-consulting.com>
References: <u3bmp62qz.fsf@boost-consulting.com>	<435E768D.2000401@v.loewis.de>
	<u7jc12z7u.fsf@boost-consulting.com>
	<u3bmnrvr7.fsf_-_@boost-consulting.com>
Message-ID: <4361156A.7090101@v.loewis.de>

David Abrahams wrote:
> As it turns out, MinGW also implemented, in version 3.0.0 (with
> binutils-2.13.90-20030111-1), features which make the creation of
> libpython24.a unnecessary.  So whoever maintains this doc might want
> to note that you only need that step if you are using a version of
> Python prior to 2.4.1 with a MinGW prior to 3.0.0 (with
> binutils-2.13.90-20030111-1).

Can you please provide a patch to the documentation? None of the
regular documentation maintainers would know what exactly to write;
this is all user-contributed.

Regards,
Martin

From martin at v.loewis.de  Thu Oct 27 20:01:40 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 27 Oct 2005 20:01:40 +0200
Subject: [Python-Dev] Help with inotify
In-Reply-To: <djr2mo$g9u$1@sea.gmane.org>
References: <djr2mo$g9u$1@sea.gmane.org>
Message-ID: <43611604.4080404@v.loewis.de>

Neal Becker wrote:
> Any ideas?  I'd rather not have to trace through python if I could avoid it
> (I don't even have source installed here).

Use strace, then. Find out what precise system call gives you this
error. If this is not enough clue, post the relevant fragment of the
trace output. Usage would be

strace -o muell python test_notify.py
(look into the file muell afterwards)

Regards,
Martin

From martin at v.loewis.de  Thu Oct 27 20:06:43 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 27 Oct 2005 20:06:43 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <436037F5.8050501@canterbury.ac.nz>
References: <20051025120919.3927.JCARLSON@uci.edu>	<435E9E18.7090502@v.loewis.de>
	<20051025142245.3930.JCARLSON@uci.edu>	<435EA9CF.6060305@v.loewis.de>
	<435ED31C.3010800@canterbury.ac.nz>	<435F20B1.8080803@v.loewis.de>
	<436037F5.8050501@canterbury.ac.nz>
Message-ID: <43611733.3060606@v.loewis.de>

Greg Ewing wrote:
> I still think this is a much worse potential problem
> than that of "l" vs "1", etc. It's reasonable to
> adopt the practice of never using "l" as a single
> letter identifier, for example. But it would be
> unreasonable to ban the use of "E" as an identifier
> on the grounds that someone somewhere might confuse
> it with a capital epsilon.

As a style guide, people should use single-letter
identifiers only for local variables. If they follow
the guideline, it should be easy to tell whether
such an identifier is Latin or Greek (if everything
else in the function is Latin, the E likely is as
well).

> An alternative would be to identify such confusable
> letters in the various alphabets and define them
> to be equivalent.

pylint could check for such things (although I very
much doubt it would have any hits in the next 10
years).

> And beyond the issue of alphabets there's also the
> question of whether accented characters should be
> considered distinct. I can see quite a few holy
> flame wars erupting over that...

For that, there is the Unicode TR that precisely
defines how this should be done. People should then
have their wars with the Unicode consortium.

Regards,
Martin
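
The two cases discussed above (look-alike letters from different
alphabets, and accented characters) can be told apart with Python's
unicodedata module. This is only a minimal sketch: it assumes a modern
Python 3 interpreter rather than the 2.4 series current in this thread,
it assumes the TR meant is the Unicode normalization annex, and the
variable names are purely illustrative.

```python
import unicodedata

# Latin capital E and Greek capital epsilon look alike but are distinct code points.
latin_e = "E"
greek_epsilon = "\u0395"
names = (unicodedata.name(latin_e), unicodedata.name(greek_epsilon))

# Normalization does not merge them: epsilon has no decomposition to Latin E.
still_distinct = unicodedata.normalize("NFKC", greek_epsilon) != latin_e

# Accented characters are a different matter: a precomposed "e with acute"
# and "e" followed by a combining acute accent normalize to the same string.
composed = "\u00e9"
decomposed = "e\u0301"
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
same_after_nfd = unicodedata.normalize("NFD", composed) == decomposed

print(names)
print(still_distinct, same_after_nfc, same_after_nfd)
```

So normalization settles the accent question mechanically, while the
"E" versus capital epsilon confusion would still need a separate check
of the kind a lint tool could perform.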

From martin at v.loewis.de  Thu Oct 27 20:16:19 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 27 Oct 2005 20:16:19 +0200
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <200510271340.04668.fdrake@acm.org>
References: <435BC27C.1010503@v.loewis.de>
	<4360DE1C.3010602@zope.com>	<17248.64308.936680.578655@montanaro.dyndns.org>
	<200510271340.04668.fdrake@acm.org>
Message-ID: <43611973.9030100@v.loewis.de>

Fred L. Drake, Jr. wrote:
> Mail to pydotorg doesn't allow posting from non-members; I watch for 
> notifications for owner on that list and try to approve as quickly as 
> possible, but it's a manual process just to get the mail through.

Ah, didn't know this.

> We should probably have a dedicated address for this, or tell people to send 
> them to webmaster.

I think I would request a separate address; I don't think I want to get
all webmaster email. That address should probably include webmaster,
though.

Regards,
Martin

From ndbecker2 at gmail.com  Thu Oct 27 20:17:03 2005
From: ndbecker2 at gmail.com (Neal Becker)
Date: Thu, 27 Oct 2005 14:17:03 -0400
Subject: [Python-Dev] Help with inotify
References: <djr2mo$g9u$1@sea.gmane.org> <43611604.4080404@v.loewis.de>
Message-ID: <djr5iu$tou$1@sea.gmane.org>

"Martin v. Löwis" wrote:

> Neal Becker wrote:
>> Any ideas?  I'd rather not have to trace through python if I could avoid
>> it (I don't even have source installed here).
> 
> Use strace, then. Find out what precise system call gives you this
> error. If this is not enough clue, post the relevant fragment of the
> trace output. Usage would be
> 
> strace -o muell python test_notify.py
> (look into the file muell afterwards)
> 

Yes, tried that- learned nothing.

I suspect what's happening is that python's fdopen is using some stat call
to determine whether the file descriptor refers to a directory, and is
getting an answer that the inotify fd does.  Don't know what to do about
it.  Can I build a python file object in "C" from the fd?

Here's strace.  The write of '4' is where my code writes the value of
fileno() to stdout, which is '4', which is correct - notice that
open("test-inotify.py") returned '3':
...
open("test-inotify.py", O_RDONLY)       = 3
write(2, "  File \"test-inotify.py\", line 6"..., 39) = 39
fstat(3, {st_mode=S_IFREG|0664, st_size=87, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0x2aaaadc13000
read(3, "from inotify import *\nimport os\n"..., 4096) = 87
write(2, "    ", 4)                     = 4
write(2, "os.fdopen (i.fileno())\n", 23) = 23
close(3)                                = 0
munmap(0x2aaaadc13000, 4096)            = 0
write(2, "IOError", 7)                  = 7
write(2, ": ", 2)                       = 2
write(2, "[Errno 21] Is a directory", 25) = 25



From fdrake at acm.org  Thu Oct 27 20:23:55 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 27 Oct 2005 14:23:55 -0400
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <43611973.9030100@v.loewis.de>
References: <435BC27C.1010503@v.loewis.de> <200510271340.04668.fdrake@acm.org>
	<43611973.9030100@v.loewis.de>
Message-ID: <200510271423.55919.fdrake@acm.org>

On Thursday 27 October 2005 14:16, Martin v. Löwis wrote:
 > I think I would request a separate address; I don't think I want to get
 > all webmaster email.

I like the idea of a separate address as well.

 > That address should probably include webmaster, though.

Are you suggesting that the key-deposit address be routed to the webmaster 
crew?  Most of the webmasters don't have the access needed to deposit keys.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From dave at boost-consulting.com  Thu Oct 27 20:37:23 2005
From: dave at boost-consulting.com (David Abrahams)
Date: Thu, 27 Oct 2005 14:37:23 -0400
Subject: [Python-Dev] [Docs] MinGW and libpython24.a
In-Reply-To: <4361156A.7090101@v.loewis.de> (Martin v. =?iso-8859-1?Q?L=F6?=
	=?iso-8859-1?Q?wis's?= message of
	"Thu, 27 Oct 2005 19:59:06 +0200")
References: <u3bmp62qz.fsf@boost-consulting.com>
	<435E768D.2000401@v.loewis.de> <u7jc12z7u.fsf@boost-consulting.com>
	<u3bmnrvr7.fsf_-_@boost-consulting.com> <4361156A.7090101@v.loewis.de>
Message-ID: <upspqn6vw.fsf@boost-consulting.com>

"Martin v. Löwis" <martin at v.loewis.de> writes:

> David Abrahams wrote:
>> As it turns out, MinGW also implemented, in version 3.0.0 (with
>> binutils-2.13.90-20030111-1), features which make the creation of
>> libpython24.a unnecessary.  So whoever maintains this doc might want
>> to note that you only need that step if you are using a version of
>> Python prior to 2.4.1 with a MinGW prior to 3.0.0 (with
>> binutils-2.13.90-20030111-1).
>
> Can you please provide a patch to the documentation? None of the
> regular documentation maintainers would know what exactly to write;
> this is all user-contributed.

This isn't rocket science.  Or maybe it is; if adding

  These instructions only apply if you're using a version of Python
  prior to 2.4.1 with a MinGW prior to 3.0.0 (with
  binutils-2.13.90-20030111-1)

is not acceptable then no patch I could submit would be acceptable,
because I don't know how to do better either.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com

From martin at v.loewis.de  Thu Oct 27 20:59:13 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 27 Oct 2005 20:59:13 +0200
Subject: [Python-Dev] [Docs] MinGW and libpython24.a
In-Reply-To: <upspqn6vw.fsf@boost-consulting.com>
References: <u3bmp62qz.fsf@boost-consulting.com>	<435E768D.2000401@v.loewis.de>
	<u7jc12z7u.fsf@boost-consulting.com>	<u3bmnrvr7.fsf_-_@boost-consulting.com>
	<4361156A.7090101@v.loewis.de> <upspqn6vw.fsf@boost-consulting.com>
Message-ID: <43612381.3070300@v.loewis.de>

David Abrahams wrote:
> This isn't rocket science.  Or maybe it is; if adding
> 
>   These instructions only apply if you're using a version of Python
>   prior to 2.4.1 with a MinGW prior to 3.0.0 (with
>   binutils-2.13.90-20030111-1)
> 
> is not acceptable then no patch I could submit would be acceptable,
> because I don't know how to do better either.

Thanks, committed as revision 41338:

http://svn.python.org/projects/python/trunk/Doc/inst/inst.tex

I wasn't sure whether to place this text at the beginning or
the end (i.e. whether all instructions of this section are incorrect
or only part of it); I put it at the beginning.

Regards,
Martin

From walter at livinglogic.de  Thu Oct 27 21:02:21 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Thu, 27 Oct 2005 21:02:21 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <43610BC9.1040508@v.loewis.de>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<4360CAE3.4090408@livinglogic.de> <43610BC9.1040508@v.loewis.de>
Message-ID: <606B4D81-32DB-4125-B449-C023A1A61014@livinglogic.de>

On 27.10.2005 at 19:18, Martin v. Löwis wrote:

> Walter Dörwald wrote:
>
>> Thanks for doing this.
>> BTW, will there be daily tarballs, like the one available from:
>> http://cvs.perl.org/snapshots/python/python/python-latest.tar.gz
>>
>
> Will be, yes (I'm saddened that you refer to this location, and not
> http://www.dcl.hpi.uni-potsdam.de/home/loewis/python.tgz :-)

I didn't know that, although I probably should, the links are on the  
official page at http://www.python.org/dev/. ;)

BTW, http://www.dcl.hpi.uni-potsdam.de/home/loewis/python.tgz is just  
45 bytes.

> I'm planning to provide them at http://svn.python.org/snapshots.

Great!

BTW, ViewCVS seems to be missing the stylesheet.
http://svn.python.org/view/*docroot*/styles.css gives an exception
complaining about "No such file or directory:
'/etc/viewcvs/doc/styles.css'"

Bye,
    Walter Dörwald


From martin at v.loewis.de  Thu Oct 27 21:05:36 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 27 Oct 2005 21:05:36 +0200
Subject: [Python-Dev] Help with inotify
In-Reply-To: <djr5iu$tou$1@sea.gmane.org>
References: <djr2mo$g9u$1@sea.gmane.org> <43611604.4080404@v.loewis.de>
	<djr5iu$tou$1@sea.gmane.org>
Message-ID: <43612500.4040403@v.loewis.de>

Neal Becker wrote:
> Yes, tried that- learned nothing.

Please go back further in the trace file. There must be a return
value of -1 (EISDIR) somewhere in the file, try to locate that.

> Here's strace.  The write of '4' is where my code writes the value of
> fileno() to stdout, which is '4', which is correct - notice that
> open("test-inotify.py") returned '3':

The fragment you quote only refers to the part where it tries to
format the traceback. The value '4' is never written, instead,
it writes 4 spaces (the second argument is the bytes, the third
is the number of bytes).

Regards,
Martin
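
The reading of the strace line above can be checked from Python itself.
This is a small sketch, assuming a modern Python 3 interpreter; the pipe
is only there so the bytes go somewhere harmless, and the variable names
are illustrative.

```python
import os

# Create a pipe so the write has somewhere harmless to go.
read_fd, write_fd = os.pipe()
try:
    # Same payload as the strace line write(2, "    ", 4): four space characters.
    written = os.write(write_fd, b"    ")
    data = os.read(read_fd, 16)
finally:
    os.close(read_fd)
    os.close(write_fd)

print(written, repr(data))
```

The value returned by the write is the number of bytes written, which
is what strace shows after the "=" sign; the text "4" itself never
appears in the payload.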

From ndbecker2 at gmail.com  Thu Oct 27 21:17:29 2005
From: ndbecker2 at gmail.com (Neal Becker)
Date: Thu, 27 Oct 2005 15:17:29 -0400
Subject: [Python-Dev] Help with inotify
References: <djr2mo$g9u$1@sea.gmane.org> <43611604.4080404@v.loewis.de>
	<djr5iu$tou$1@sea.gmane.org> <43612500.4040403@v.loewis.de>
Message-ID: <djr948$b7u$1@sea.gmane.org>

"Martin v. Löwis" wrote:

> Neal Becker wrote:
>> Yes, tried that- learned nothing.
> 
> Please go back further in the trace file. There must be a return
> value of -1 (EISDIR) somewhere in the file, try to locate that.
> 
>> Here's strace.  The write of '4' is where my code writes the value of
>> fileno() to stdout, which is '4', which is correct - notice that
>> open("test-inotify.py") returned '3':
> 
> The fragment you quote only refers to the part where it tries to
> format the traceback. The value '4' is never written, instead,
> it writes 4 spaces (the second argument is the bytes, the third
> is the number of bytes).
> 
This 1st line is the syscall for inotify:

SYS_253(0, 0x7fffff88f0f0, 0x2aaaadda3f00, 0x2aaaaab4611b, 0x7) = 4
close(3)                                = 0
futex(0x502530, FUTEX_WAKE, 1)          = 0
futex(0x502530, FUTEX_WAKE, 1)          = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0x2aaaadc12000
write(1, "4\n", 2)                      = 2
fcntl(4, F_GETFL)                       = 0 (flags O_RDONLY)
fstat(4, {st_mode=S_IFDIR|0600, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0x2aaaadc13000
lseek(4, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
fstat(4, {st_mode=S_IFDIR|0600, st_size=0, ...}) = 0
close(4)                                = 0
munmap(0x2aaaadc13000, 4096)            = 0
write(2, "Traceback (most recent call last"..., 35) = 35
open("test-inotify.py", O_RDONLY)       = 3
write(2, "  File \"test-inotify.py\", line 6"..., 39) = 39
...


From skip at pobox.com  Thu Oct 27 23:01:43 2005
From: skip at pobox.com (skip@pobox.com)
Date: Thu, 27 Oct 2005 16:01:43 -0500
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <200510271423.55919.fdrake@acm.org>
References: <435BC27C.1010503@v.loewis.de> <200510271340.04668.fdrake@acm.org>
	<43611973.9030100@v.loewis.de> <200510271423.55919.fdrake@acm.org>
Message-ID: <17249.16439.870339.133847@montanaro.dyndns.org>


    Fred> Are you suggesting that the key-deposit address be routed to the
    Fred> webmaster crew?  Most of the webmasters don't have the access
    Fred> needed to deposit keys.

In fact, many of us on the pydotorg list don't have ssh access either.  I
suspect the number of useful recipients is no more than five (Martin, Barry,
Anthony, Sean, maybe one or two others).

Skip

From martin at v.loewis.de  Thu Oct 27 23:02:40 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Thu, 27 Oct 2005 23:02:40 +0200
Subject: [Python-Dev] Help with inotify
In-Reply-To: <djr948$b7u$1@sea.gmane.org>
References: <djr2mo$g9u$1@sea.gmane.org>
	<43611604.4080404@v.loewis.de>	<djr5iu$tou$1@sea.gmane.org>
	<43612500.4040403@v.loewis.de> <djr948$b7u$1@sea.gmane.org>
Message-ID: <43614070.2030007@v.loewis.de>

Neal Becker wrote:
> SYS_253(0, 0x7fffff88f0f0, 0x2aaaadda3f00, 0x2aaaaab4611b, 0x7) = 4
> close(3)                                = 0
> futex(0x502530, FUTEX_WAKE, 1)          = 0
> futex(0x502530, FUTEX_WAKE, 1)          = 0
> fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaadc12000
> write(1, "4\n", 2)                      = 2
> fcntl(4, F_GETFL)                       = 0 (flags O_RDONLY)
> fstat(4, {st_mode=S_IFDIR|0600, st_size=0, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaadc13000
> lseek(4, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
> fstat(4, {st_mode=S_IFDIR|0600, st_size=0, ...}) = 0
> close(4)                                = 0
> munmap(0x2aaaadc13000, 4096)            = 0
> write(2, "Traceback (most recent call last"..., 35) = 35

I see. Python is making up the EISDIR, looking at the stat result.
In Objects/fileobject.c:dircheck generates the EISDIR error, which
apparently comes from posix_fdopen, PyFile_FromFile,
fill_file_fields.

Python simply does not support file objects which stat(2) as directories.
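
The same check can still be observed from Python code. The following is a
minimal sketch, assuming a modern Python 3 on a POSIX system rather than the
2005 interpreter discussed in this thread; the helper name
fdopen_directory_error is made up for illustration.

```python
import os
import tempfile


def fdopen_directory_error():
    """Wrap a directory's descriptor in a file object; return the error raised, if any."""
    with tempfile.TemporaryDirectory() as path:
        # The OS itself allows opening a directory read-only.
        fd = os.open(path, os.O_RDONLY)
        try:
            f = os.fdopen(fd)
        except OSError as exc:
            # The wrapper was never created, so release the descriptor ourselves
            # (tolerating the case where it is already closed).
            try:
                os.close(fd)
            except OSError:
                pass
            return exc
        f.close()
        return None


error = fdopen_directory_error()
print(type(error).__name__, error)
```

On such a system the descriptor itself is valid; the refusal comes from the
file-object layer, which matches the stat-based check described above.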

Regards,
Martin

From martin at v.loewis.de  Fri Oct 28 00:12:30 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 28 Oct 2005 00:12:30 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <606B4D81-32DB-4125-B449-C023A1A61014@livinglogic.de>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>	<4360CAE3.4090408@livinglogic.de>
	<43610BC9.1040508@v.loewis.de>
	<606B4D81-32DB-4125-B449-C023A1A61014@livinglogic.de>
Message-ID: <436150CE.8000305@v.loewis.de>

Walter Dörwald wrote:
> BTW, ViewCVS seems to be missing the stylesheet.
> http://svn.python.org/view/*docroot*/styles.css gives an exception
> complaining about "No such file or directory:
> '/etc/viewcvs/doc/styles.css'"

Thanks, fixed. I already wondered why I was supposed to
create a /viewcvs Alias in the apache configuration...

Regards,
Martin

From ndbecker2 at gmail.com  Fri Oct 28 01:32:23 2005
From: ndbecker2 at gmail.com (Neal Becker)
Date: Thu, 27 Oct 2005 19:32:23 -0400
Subject: [Python-Dev] Help with inotify
References: <djr2mo$g9u$1@sea.gmane.org>
	<43611604.4080404@v.loewis.de>	<djr5iu$tou$1@sea.gmane.org>
	<43612500.4040403@v.loewis.de> <djr948$b7u$1@sea.gmane.org>
	<43614070.2030007@v.loewis.de>
Message-ID: <djro25$t1s$1@sea.gmane.org>

"Martin v. Löwis" wrote:


> I see. Python is making up the EISDIR, looking at the stat result.
> In Objects/fileobject.c:dircheck generates the EISDIR error, which
> apparently comes from posix_fdopen, PyFile_FromFile,
> fill_file_fields.
> 
> Python simply does not support file objects which stat(2) as directories.
> 

OK, does python have a C API that would allow me to create a python file
object from my C (C++) code?  Then instead of using python's fdopen I could
just do it myself.


From bcannon at gmail.com  Fri Oct 28 01:46:58 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 27 Oct 2005 16:46:58 -0700
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
Message-ID: <bbaeab100510271646v5eada245q43d85de5746a2e40@mail.gmail.com>

On 10/27/05, martin at v.loewis.de <martin at v.loewis.de> wrote:
[SNIP]
> Most of you are probably interested in checking out one of these
> folders:
>
> svn+ssh://pythondev at svn.python.org/python/trunk
> svn+ssh://pythondev at svn.python.org/python/branches/release24-maint
> svn+ssh://pythondev at svn.python.org/peps
>

Why the entire 'peps' directory and not just the trunk like with
'python'?  It looks like no tags or branches have ever been created
for the PEPs and thus are not really needed.

I am also curious as to what you would have me check out for the
sandbox; whole directory or just the trunk?

-Brett

From bob at redivi.com  Fri Oct 28 01:49:02 2005
From: bob at redivi.com (Bob Ippolito)
Date: Thu, 27 Oct 2005 16:49:02 -0700
Subject: [Python-Dev] Help with inotify
In-Reply-To: <djro25$t1s$1@sea.gmane.org>
References: <djr2mo$g9u$1@sea.gmane.org>
	<43611604.4080404@v.loewis.de>	<djr5iu$tou$1@sea.gmane.org>
	<43612500.4040403@v.loewis.de> <djr948$b7u$1@sea.gmane.org>
	<43614070.2030007@v.loewis.de> <djro25$t1s$1@sea.gmane.org>
Message-ID: <70FDA54D-3D84-4502-BE83-061BF4DBC101@redivi.com>


On Oct 27, 2005, at 4:32 PM, Neal Becker wrote:

> "Martin v. Löwis" wrote:
>
>> I see. Python is making up the EISDIR, looking at the stat result.
>> In Objects/fileobject.c:dircheck generates the EISDIR error, which
>> apparently comes from posix_fdopen, PyFile_FromFile,
>> fill_file_fields.
>>
>> Python simply does not support file objects which stat(2) as  
>> directories.
>>
>>
>
> OK, does python have a C API that would allow me to create a python  
> file
> object from my C (C++) code?  Then instead of using python's fdopen  
> I could
> just do it myself.

Why do you need a file object for something that is not a file  
anyway?  select.select doesn't require file objects for example, just  
objects that have a fileno() method.
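
A minimal sketch of that point, assuming a POSIX system: the Notifier class
is hypothetical and stands in for any non-file object, and a pipe supplies
the descriptor so the example runs without inotify.

```python
import os
import select


class Notifier:
    """Hypothetical wrapper: not a file object, it only exposes fileno()."""

    def __init__(self, fd):
        self._fd = fd

    def fileno(self):
        return self._fd


def wait_readable(obj, timeout):
    """Return True if obj's descriptor becomes readable within timeout seconds."""
    readable, _, _ = select.select([obj], [], [], timeout)
    return obj in readable


read_fd, write_fd = os.pipe()
notifier = Notifier(read_fd)

before = wait_readable(notifier, 0)    # nothing has been written yet
os.write(write_fd, b"x")
after = wait_readable(notifier, 1.0)   # one byte is now waiting

os.close(read_fd)
os.close(write_fd)
print(before, after)
```

select.select hands back the same objects it was given, so the caller never
needs a real file object.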

-bob


From ndbecker2 at gmail.com  Fri Oct 28 01:58:05 2005
From: ndbecker2 at gmail.com (Neal Becker)
Date: Thu, 27 Oct 2005 19:58:05 -0400
Subject: [Python-Dev] Help with inotify
References: <djr2mo$g9u$1@sea.gmane.org>
	<43611604.4080404@v.loewis.de>	<djr5iu$tou$1@sea.gmane.org>
	<43612500.4040403@v.loewis.de> <djr948$b7u$1@sea.gmane.org>
	<43614070.2030007@v.loewis.de> <djro25$t1s$1@sea.gmane.org>
	<70FDA54D-3D84-4502-BE83-061BF4DBC101@redivi.com>
Message-ID: <djrpic$1v2$1@sea.gmane.org>

Bob Ippolito wrote:

> 
> On Oct 27, 2005, at 4:32 PM, Neal Becker wrote:
> 
>> "Martin v. Löwis" wrote:
>>
>>> I see. Python is making up the EISDIR, looking at the stat result.
>>> In Objects/fileobject.c:dircheck generates the EISDIR error, which
>>> apparently comes from posix_fdopen, PyFile_FromFile,
>>> fill_file_fields.
>>>
>>> Python simply does not support file objects which stat(2) as
>>> directories.
>>>
>>>
>>
>> OK, does python have a C API that would allow me to create a python
>> file
>> object from my C (C++) code?  Then instead of using python's fdopen
>> I could
>> just do it myself.
> 
> Why do you need a file object for something that is not a file
> anyway?  select.select doesn't require file objects for example, just
> objects that have a fileno() method.
> 
Yes, that's a good point - the reason is I didn't want to restrict the
interface to only work with select.  Maybe I should rethink the interface.


From bob at redivi.com  Fri Oct 28 02:07:35 2005
From: bob at redivi.com (Bob Ippolito)
Date: Thu, 27 Oct 2005 17:07:35 -0700
Subject: [Python-Dev] Help with inotify
In-Reply-To: <djrpic$1v2$1@sea.gmane.org>
References: <djr2mo$g9u$1@sea.gmane.org>
	<43611604.4080404@v.loewis.de>	<djr5iu$tou$1@sea.gmane.org>
	<43612500.4040403@v.loewis.de> <djr948$b7u$1@sea.gmane.org>
	<43614070.2030007@v.loewis.de> <djro25$t1s$1@sea.gmane.org>
	<70FDA54D-3D84-4502-BE83-061BF4DBC101@redivi.com>
	<djrpic$1v2$1@sea.gmane.org>
Message-ID: <2B6C4ED6-4E9E-4154-AFEC-982E7CDEC182@redivi.com>


On Oct 27, 2005, at 4:58 PM, Neal Becker wrote:

> Bob Ippolito wrote:
>
>
>>
>> On Oct 27, 2005, at 4:32 PM, Neal Becker wrote:
>>
>>
>>> "Martin v. Löwis" wrote:
>>>
>>>
>>>> I see. Python is making up the EISDIR, looking at the stat result.
>>>> In Objects/fileobject.c:dircheck generates the EISDIR error, which
>>>> apparently comes from posix_fdopen, PyFile_FromFile,
>>>> fill_file_fields.
>>>>
>>>> Python simply does not support file objects which stat(2) as
>>>> directories.
>>>>
>>>>
>>>>
>>>
>>> OK, does python have a C API that would allow me to create a python
>>> file
>>> object from my C (C++) code?  Then instead of using python's fdopen
>>> I could
>>> just do it myself.
>>>
>>
>> Why do you need a file object for something that is not a file
>> anyway?  select.select doesn't require file objects for example, just
>> objects that have a fileno() method.
>>
>>
> Yes, that's a good point - the reason is I didn't want to restrict the
> interface to only work with select.  Maybe I should rethink the  
> interface.

Well, what would the interface do if you had a file object?  Are you
supposed to be able to read/write/seek/tell/etc.?  I don't understand
why you're trying to do what you're doing.  select.select was just an
example; select.poll's register/unregister also take any object with a
fileno() method.

Note that a socket isn't a file, and it has a fileno() too.  Since what
you have isn't a file, chances are that returning a file object is a bug,
not a feature.
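
As a rough illustration, assuming a platform where select.poll is available
(it is not on Windows), a socket can be registered with a poll object
directly, with no file object involved:

```python
import select
import socket

# A connected pair of sockets stands in for any fileno()-bearing object.
a, b = socket.socketpair()
a_fd = a.fileno()

poller = select.poll()
poller.register(a, select.POLLIN)   # accepts the socket itself, not just an int

b.sendall(b"ping")
events = poller.poll(1000)          # timeout is in milliseconds
ready_fds = [fd for fd, mask in events if mask & select.POLLIN]
data = a.recv(4)

poller.unregister(a)
a.close()
b.close()
print(ready_fds == [a_fd], data)
```

poll() reports plain descriptor numbers, so the caller maps them back to
whatever object it registered.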

-bob


From nyamatongwe at gmail.com  Fri Oct 28 02:21:16 2005
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Fri, 28 Oct 2005 10:21:16 +1000
Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <20051026105934.3977.JCARLSON@uci.edu>
References: <435F5149.4060804@egenix.com> <435FBDC4.8030300@v.loewis.de>
	<20051026105934.3977.JCARLSON@uci.edu>
Message-ID: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>

Josiah Carlson:

> According to wikipedia (http://en.wikipedia.org/wiki/Latin_alphabet),
> various languages have adopted a transliteration of their language
> and/or former alphabets into latin.  They don't purport to know all of
> the reasons why, and I'm not going to speculate.

   I used to work on software written by Japanese and English speakers
at Fujitsu with most developers being Japanese. The rules were that
comments could be in Japanese but identifiers were only allowed to
contain ASCII characters. Most variable names were poorly chosen with
s, p, q, fla (boolean=flag) and flafla being popular. When I asked
some Japanese coders why they didn't use Japanese words expressed in
ASCII (Romaji), their response was that it was a really weird idea.

   This is anecdotal but it appears to me that transliterations are
not commonly used apart from learning languages and some minimal help
for foreigners such as including transliterated names on railway
station name boards.

   Neil

From ndbecker2 at gmail.com  Fri Oct 28 02:33:12 2005
From: ndbecker2 at gmail.com (Neal Becker)
Date: Thu, 27 Oct 2005 20:33:12 -0400
Subject: [Python-Dev] Help with inotify
References: <djr2mo$g9u$1@sea.gmane.org>
	<43611604.4080404@v.loewis.de>	<djr5iu$tou$1@sea.gmane.org>
	<43612500.4040403@v.loewis.de> <djr948$b7u$1@sea.gmane.org>
	<43614070.2030007@v.loewis.de> <djro25$t1s$1@sea.gmane.org>
	<70FDA54D-3D84-4502-BE83-061BF4DBC101@redivi.com>
	<djrpic$1v2$1@sea.gmane.org>
	<2B6C4ED6-4E9E-4154-AFEC-982E7CDEC182@redivi.com>
Message-ID: <djrrk7$6i0$1@sea.gmane.org>

Bob Ippolito wrote:

> 
> On Oct 27, 2005, at 4:58 PM, Neal Becker wrote:
> 
>> Bob Ippolito wrote:
>>
>>
>>>
>>> On Oct 27, 2005, at 4:32 PM, Neal Becker wrote:
>>>
>>>
>>>> "Martin v. Löwis" wrote:
>>>>
>>>>
>>>>> I see. Python is making up the EISDIR, looking at the stat result.
>>>>> In Objects/fileobject.c:dircheck generates the EISDIR error, which
>>>>> apparently comes from posix_fdopen, PyFile_FromFile,
>>>>> fill_file_fields.
>>>>>
>>>>> Python simply does not support file objects which stat(2) as
>>>>> directories.
>>>>>
>>>>>
>>>>>
>>>>
>>>> OK, does python have a C API that would allow me to create a python
>>>> file
>>>> object from my C (C++) code?  Then instead of using python's fdopen
>>>> I could
>>>> just do it myself.
>>>>
>>>
>>> Why do you need a file object for something that is not a file
>>> anyway?  select.select doesn't require file objects for example, just
>>> objects that have a fileno() method.
>>>
>>>
>> Yes, that's a good point - the reason is I didn't want to restrict the
>> interface to only work with select.  Maybe I should rethink the
>> interface.
> 
> Well, what would the interface do if you had a file object?  Are you
> supposed to be able to read/write/seek/tell/etc.?  I don't understand
> why you're trying to do what you're doing.  select.select was just an
> example; select.poll's register/unregister also take any object with a
> fileno() method.
>

Yes, you are supposed to be able to read and get information.  However, I
have implemented fileno for it, so you can use select.select on it if you
just want to wait for something to happen - which is probably all that's
really needed.  I also implemented select as a method of my inotify object,
in case you prefer that.

Here's an excerpt from Documentation/filesystems/inotify.txt:
-----------------
Events are provided in the form of an inotify_event structure that is
read(2) from a given inotify instance.  The filename is of dynamic length
and follows the struct. It is of size len.  The filename is padded with
null bytes to ensure proper alignment.  This padding is reflected in len.

You can slurp multiple events by passing a large buffer, for example

        size_t len = read (fd, buf, BUF_LEN);

Where "buf" is a pointer to an array of "inotify_event" structures at least
BUF_LEN bytes in size.  The above example will return as many events as are
available and fit in BUF_LEN.

Each inotify instance fd is also select()- and poll()-able.
-----------------
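
For illustration only, here is a sketch of how such a buffer could be split
into events from Python. It assumes the struct inotify_event layout from
<linux/inotify.h> (int wd, then three 32-bit unsigned fields mask, cookie and
len, then the padded name). The helper names parse_events and make_event are
made up, and the buffer is built by hand instead of being read from a real
inotify descriptor; the 16-byte padding in make_event is likewise an
assumption chosen for the example.

```python
import struct

# Fixed-size header of struct inotify_event: wd, mask, cookie, len (native order).
EVENT_HEADER = struct.Struct("iIII")

IN_CREATE = 0x00000100
IN_DELETE = 0x00000200


def parse_events(buf):
    """Split a buffer returned by read(2) into (wd, mask, cookie, name) tuples."""
    events = []
    offset = 0
    while offset + EVENT_HEADER.size <= len(buf):
        wd, mask, cookie, name_len = EVENT_HEADER.unpack_from(buf, offset)
        offset += EVENT_HEADER.size
        raw_name = buf[offset:offset + name_len]
        offset += name_len                       # len already includes the padding
        name = raw_name.split(b"\0", 1)[0].decode()
        events.append((wd, mask, cookie, name))
    return events


def make_event(wd, mask, cookie, name, pad_to=16):
    """Build one synthetic event so the parser can be tried without inotify."""
    raw = name.encode() + b"\0"
    raw += b"\0" * (-len(raw) % pad_to)          # pad the name with null bytes
    return EVENT_HEADER.pack(wd, mask, cookie, len(raw)) + raw


buf = make_event(1, IN_CREATE, 0, "a.txt") + make_event(1, IN_DELETE, 0, "b.txt")
events = parse_events(buf)
print(events)
```

Because len covers the padding, the parser can step from one event to the
next without knowing the alignment rule itself.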


From bcannon at gmail.com  Fri Oct 28 02:42:42 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 27 Oct 2005 17:42:42 -0700
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
Message-ID: <bbaeab100510271742w2ebe93b5r91a5e90d5fc0fe8e@mail.gmail.com>

I have started a svn section in the dev FAQ
(http://www.python.org/dev/devfaq.html) pertaining to checking out a
project from the repository and other stuff discussed so far.  If
something is not clear or people feel a step is missing, let me know.

I will remove the CVS section once Martin has tossed the CVS repository on SF.

-Brett

On 10/27/05, martin at v.loewis.de <martin at v.loewis.de> wrote:
> The Python source code repository is now converted to subversion;
> please feel free to start checking out new sandboxes. For a few
> days, this installation probably still needs to be considered in
> testing. If there are no serious problems found by next Monday,
> I would consider conversion of the data complete. The CVS repository
> will be kept available read-only for a while longer, so you can
> easily forward any patches you may have.
>
> Most of you are probably interested in checking out one of these
> folders:
>
> svn+ssh://pythondev at svn.python.org/python/trunk
> svn+ssh://pythondev at svn.python.org/python/branches/release24-maint
> svn+ssh://pythondev at svn.python.org/peps
>
> The anonymous read-only equivalents of these are
>
> http://svn.python.org/projects/python/trunk
> http://svn.python.org/projects/python/branches/release24-maint
> http://svn.python.org/projects/peps
>
> As mentioned before, in addition to "plain" http/WebDAV,
> viewcvs is available at
>
> http://svn.python.org/view/
>
> There are some more things left to be done, such as updating
> the developer documentation. I'll start working on that soon,
> but contributions are welcome.
>
> Regards,
> Martin

From tim.peters at gmail.com  Fri Oct 28 03:27:18 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 27 Oct 2005 21:27:18 -0400
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <bbaeab100510271742w2ebe93b5r91a5e90d5fc0fe8e@mail.gmail.com>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<bbaeab100510271742w2ebe93b5r91a5e90d5fc0fe8e@mail.gmail.com>
Message-ID: <1f7befae0510271827t5bef2009l5af731679e38acfd@mail.gmail.com>

[Brett Cannon]
> I have started a svn section in the dev FAQ
> (http://www.python.org/dev/devfaq.html) pertaining to checking out a
> project from the repository and other stuff discussed so far.  If
> something is not clear or people feel a step is missing, let me know.

Thanks, Brett!  I'm just starting this trek, in slow motion, and that
was a real help

From skip at pobox.com  Fri Oct 28 04:53:26 2005
From: skip at pobox.com (skip@pobox.com)
Date: Thu, 27 Oct 2005 21:53:26 -0500
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <bbaeab100510271742w2ebe93b5r91a5e90d5fc0fe8e@mail.gmail.com>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<bbaeab100510271742w2ebe93b5r91a5e90d5fc0fe8e@mail.gmail.com>
Message-ID: <17249.37542.578327.785926@montanaro.dyndns.org>


    Brett> I have started a svn section in the dev FAQ
    Brett> (http://www.python.org/dev/devfaq.html) pertaining to checking
    Brett> out a project from the repository and other stuff discussed so
    Brett> far.  If something is not clear or people feel a step is missing,
    Brett> let me know.

We're starting to look at how much information we can push over to the Wiki.
Any pages where multiple people might contribute, especially if they are not
the typical website maintainers, seem like good Wiki candidates to me.  That
goes double for anything FAQ-ish.

Skip

From bcannon at gmail.com  Fri Oct 28 05:03:54 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 27 Oct 2005 20:03:54 -0700
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <17249.37542.578327.785926@montanaro.dyndns.org>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<bbaeab100510271742w2ebe93b5r91a5e90d5fc0fe8e@mail.gmail.com>
	<17249.37542.578327.785926@montanaro.dyndns.org>
Message-ID: <bbaeab100510272003x3ba1e187u103cf6efaca48710@mail.gmail.com>

On 10/27/05, skip at pobox.com <skip at pobox.com> wrote:
>
>     Brett> I have started a svn section in the dev FAQ
>     Brett> (http://www.python.org/dev/devfaq.html) pertaining to checking
>     Brett> out a project from the repository and other stuff discussed so
>     Brett> far.  If something is not clear or people feel a step is missing,
>     Brett> let me know.
>
> We're starting to look at how much information we can push over to the Wiki.
> Any pages where multiple people might contribute, especially if they are not
> the typical website maintainers, seem like good Wiki candidates to me.  That
> goes double for anything FAQ-ish.
>

I guess, but I just don't like wikis personally so I have no
inclination to make the conversion.  If someone wants to make the
conversion over to the wiki and keep it up that's fine, but I have no
problem keeping the dev FAQ updated like I have for CVS in the past.

-Brett

From bcannon at gmail.com  Fri Oct 28 05:15:38 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Thu, 27 Oct 2005 20:15:38 -0700
Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions
Message-ID: <bbaeab100510272015k62f8d836p8c7878c05b007ac2@mail.gmail.com>

Well, I am at it again, but this time Guido is a co-conspirator.  We
wrote a PEP that introduces BaseException and moves KeyboardInterrupt
and SystemExit.  Even if you followed the discussion for PEP 348 you
should read the PEP since I am sure there will be something that
someone doesn't like, such as the transition plan or how I didn't use
British English throughout.  =)

Anyway, as soon as the cron job posts the PEP to the web site (already
checked into the new svn repository) have a read and start expounding
about how wonderful it is and that there are no qualms with it
whatsoever.  =)

-Brett

From fdrake at acm.org  Fri Oct 28 05:53:26 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 27 Oct 2005 23:53:26 -0400
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <bbaeab100510272003x3ba1e187u103cf6efaca48710@mail.gmail.com>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<17249.37542.578327.785926@montanaro.dyndns.org>
	<bbaeab100510272003x3ba1e187u103cf6efaca48710@mail.gmail.com>
Message-ID: <200510272353.27064.fdrake@acm.org>

On Thursday 27 October 2005 23:03, Brett Cannon wrote:
 > I guess, but I just don't like wikis personally so I have no
 > inclination to make the conversion.  If someone wants to make the
 > conversion over to the wiki and keep it up that's fine, but I have no
 > problem keeping the dev FAQ updated like I have for CVS in the past.

And I'm sure we all appreciate your efforts!  I certainly do.

Regarding using the wiki... I have mixed feelings.  Wikis are really, really 
good for some things.  Anything that's "how-to" based on technology (how to 
use SVN, CVS, etc.) seems like a reasonable candidate, because we get the 
advantages of peer review.

For things that describe policy, I don't think that's so great.  For policy 
(how to use SVN for Python development, because we have certain rules), I 
think we want to maintain strict editorial control.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From martin at v.loewis.de  Fri Oct 28 07:15:36 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 28 Oct 2005 07:15:36 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <bbaeab100510271646v5eada245q43d85de5746a2e40@mail.gmail.com>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<bbaeab100510271646v5eada245q43d85de5746a2e40@mail.gmail.com>
Message-ID: <4361B3F8.4000304@v.loewis.de>

Brett Cannon wrote:
> Why the entire 'peps' directory and not just the trunk like with
> 'python'?  It looks like no tags or branches have ever been created
> for the PEPs and thus are not really needed.

Right.

> I am also curious as to what you would have me check out for the
> sandbox; whole directory or just the trunk?

You would usually only check out the trunk (unless you want to work
on a branch, of course).

Regards,
Martin

From stephen at xemacs.org  Fri Oct 28 09:44:42 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 28 Oct 2005 16:44:42 +0900
Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
	(Neil Hodgson's message of "Fri, 28 Oct 2005 10:21:16 +1000")
References: <435F5149.4060804@egenix.com> <435FBDC4.8030300@v.loewis.de>
	<20051026105934.3977.JCARLSON@uci.edu>
	<50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
Message-ID: <87irvikrv9.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Neil" == Neil Hodgson <nyamatongwe at gmail.com> writes:

    Neil> Most variable names were poorly chosen with s, p, q, fla
    Neil> (boolean=flag) and flafla being popular. When I asked some
    Neil> Japanese coders why they didn't use Japanese words expressed
    Neil> in ASCII (Romaji), their response was that it was a really
    Neil> weird idea.

That may be due to the fact that two-ideograph words will often have a
dozen homonyms, and sometimes several dozen.  I sometimes use kanji in
not-for-general-distribution Emacs LISP code when 2 kanji will give as
expressive an identifier as 10 or 15 ASCII characters.

    Neil> This is anecdotal but it appears to me that transliterations
    Neil> are not commonly used apart from learning languages

In everyday usage, they're used a lot for identifier-like purposes
like corporate logos.

The only large corpuses of Japanese-oriented Japanese-authored code
I'm familiar with are the input methods Wnn, Canna, and SKK, and these
invariably use transliterated Japanese grammatical terms for parser
components[1], although there are perfectly good equivalents in English,
at least (I think they may actually be standardized by the Ministry of
Education).

There's also an Emacs library, edict.el, that uses _mixed_
ASCII-hiragana-kanji identifiers.  (ISTR that was done just to prove a
point---the person who wrote it was an American, I
believe---definitely not Japanese.)


Footnotes: 
[1]  Japanese does not require word delimiters, so input methods must
have grammatical knowledge to choose among large numbers of homonyms.

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.

From theller at python.net  Fri Oct 28 09:53:31 2005
From: theller at python.net (Thomas Heller)
Date: Fri, 28 Oct 2005 09:53:31 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
Message-ID: <8xwe6ps4.fsf@python.net>

martin at v.loewis.de writes:

> The Python source code repository is now converted to subversion;
> please feel free to start checking out new sandboxes. For a few
> days, this installation probably still needs to be considered in
> testing. If there are no serious problems found by next Monday,
> I would consider conversion of the data complete. The CVS repository
> will be kept available read-only for a while longer, so you can
> easily forward any patches you may have.
>
> Most of you are probably interested in checking out one of these
> folders:
>
> svn+ssh://pythondev at svn.python.org/python/trunk
> svn+ssh://pythondev at svn.python.org/python/branches/release24-maint
> svn+ssh://pythondev at svn.python.org/peps
>

Works out of the box for me, thanks, Martin (but we have debugged this
before).

Can anyone recommend an XEmacs svn plugin to use?  I've tried psvn.el
from http://www.xsteve.at/prg/emacs/psvn.el, which seems to work.

Thomas


From mwh at python.net  Fri Oct 28 10:34:30 2005
From: mwh at python.net (Michael Hudson)
Date: Fri, 28 Oct 2005 09:34:30 +0100
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <8xwe6ps4.fsf@python.net> (Thomas Heller's message of "Fri, 28
	Oct 2005 09:53:31 +0200")
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<8xwe6ps4.fsf@python.net>
Message-ID: <2mbr1a3uqx.fsf@starship.python.net>

Thomas Heller <theller at python.net> writes:

> Can anyone recommend an XEmacs svn plugin to use?  I've tried psvn.el
> from http://www.xsteve.at/prg/emacs/psvn.el, which seems to work.

I've heard http://www.xsteve.at/prg/emacs/psvn.el works :)

I also have vc-svn.el installed (I think it's from the subversion
source, but it might be part of newer emacs distributions).

Cheers,
mwh

-- 
  <exarkun> INEFFICIENT CAPITALIST YOUR OPULENT 
            TOILET WILL BE YOUR UNDOING         -- from Twisted.Quotes

From theller at python.net  Fri Oct 28 11:54:08 2005
From: theller at python.net (Thomas Heller)
Date: Fri, 28 Oct 2005 11:54:08 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<8xwe6ps4.fsf@python.net> <2mbr1a3uqx.fsf@starship.python.net>
Message-ID: <3bmm6k73.fsf@python.net>

Michael Hudson <mwh at python.net> writes:

> Thomas Heller <theller at python.net> writes:
>
>> Can anyone recommend an XEmacs svn plugin to use?  I've tried psvn.el
>> from http://www.xsteve.at/prg/emacs/psvn.el, which seems to work.
>
> I've heard http://www.xsteve.at/prg/emacs/psvn.el works :)
>
> I also have vc-svn.el installed (I think it's from the subversion
> source, but it might be part of newer emacs distributions).

I've heard that vc-svn.el does NOT work with XEmacs (note the X), but
haven't tried it myself.

Thomas


From orent at hishome.net  Fri Oct 28 12:20:38 2005
From: orent at hishome.net (Oren Tirosh)
Date: Fri, 28 Oct 2005 12:20:38 +0200
Subject: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
In-Reply-To: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
References: <435F5149.4060804@egenix.com> <435FBDC4.8030300@v.loewis.de>
	<20051026105934.3977.JCARLSON@uci.edu>
	<50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
Message-ID: <7168d65a0510280320j75a51f1btd71a109cd5d604f5@mail.gmail.com>

On 10/28/05, Neil Hodgson <nyamatongwe at gmail.com> wrote:
>    I used to work on software written by Japanese and English speakers
> at Fujitsu with most developers being Japanese. The rules were that
> comments could be in Japanese but identifiers were only allowed to
> contain ASCII characters. Most variable names were poorly chosen with
> s, p, q, fla (boolean=flag) and flafla being popular. When I asked
> some Japanese coders why they didn't use Japanese words expressed in
> ASCII (Romaji), their response was that it was a really weird idea.
>
>    This is anecdotal but it appears to me that transliterations are
> not commonly used apart from learning languages and some minimal help
> for foreigners such as including transliterated names on railway
> station name boards.

Israeli programmers generally use English identifiers but
transliterations are common for local business terminology: types of
financial instruments, tax and insurance terminology, employee benefit
plans etc. Yes, it looks weird, but it would be rather pointless to
try to translate them. Even native English speakers would find it
difficult to recognize the translations because they are used to using
them as loan words. Only transliteration (or possibly the use of
non-ASCII identifiers) would make sense in this situation and I do not
think it is unique to Israel.

BTW, I heard about a Cobol shop that had an explicit policy of using
only transliterated identifiers. This resulted in a much smaller
chance of hitting one of Cobol's numerous reserved words. Thankfully,
this is not an issue in Python...

  Oren

From ncoghlan at gmail.com  Fri Oct 28 13:12:42 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 28 Oct 2005 21:12:42 +1000
Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions
In-Reply-To: <bbaeab100510272015k62f8d836p8c7878c05b007ac2@mail.gmail.com>
References: <bbaeab100510272015k62f8d836p8c7878c05b007ac2@mail.gmail.com>
Message-ID: <436207AA.20506@gmail.com>

Brett Cannon wrote:
> Anyway, as soon as the cron job posts the PEP to the web site (already
> checked into the new svn repository) have a read and start expounding
> about how wonderful it is and that there are no qualms with it
> whatsoever.  =)

You mean aside from the implementation of __getitem__ being broken in 
BaseException*? ;)

Aside from that, I actually do have one real problem and one observation.

The problem: The value of ex.args

   The PEP as written significantly changes the semantics of ex.args: rather
than being an empty tuple when no arguments are provided, it becomes a
singleton tuple containing the empty string.

   A backwards compatible definition of BaseException.__init__ would be:

     def __init__(self, *args):
         self.args = args
         self.message = '' if not args else args[0]
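
To make the intended behaviour concrete, here is a small sketch using a
stand-in class. SketchBaseException is a made-up name, not the real built-in
or the PEP's reference implementation; it only carries the __init__ shown
above.

```python
class SketchBaseException:
    """Stand-in for the proposed base class; only __init__ is modelled."""

    def __init__(self, *args):
        self.args = args
        self.message = '' if not args else args[0]


no_args = SketchBaseException()
one_arg = SketchBaseException("boom")
two_args = SketchBaseException("boom", 42)

# args stays an empty tuple when nothing is passed; message falls back to ''.
print(no_args.args, repr(no_args.message))
print(one_arg.args, repr(one_arg.message))
print(two_args.args, repr(two_args.message))
```

With this definition, code that tests "if ex.args:" behaves as it did
before, while ex.message is always available.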

The observation: The value of ex.message

   Under PEP 352 the concept of allowing "return x" to be used in a generator 
to mean "raise StopIteration(x)" would actually align quite well. A bare 
"return", however, would need to be changed to translate to "raise 
StopIteration(None)" rather than its current "raise StopIteration" in order to 
get the correct value (None) into ex.message.
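
For comparison, and as an assumption that goes beyond this thread: later
Python 3 releases did adopt "return x" in generators, exposing the value as
StopIteration.value instead of ex.message. The sketch below shows what a bare
"return" produces there; the helper final_exception is hypothetical.

```python
def gen_with_value():
    yield 1
    return "done"


def gen_bare_return():
    yield 1
    return


def final_exception(gen):
    """Exhaust a generator and return the StopIteration it finishes with."""
    it = gen()
    while True:
        try:
            next(it)
        except StopIteration as exc:
            return exc


with_value = final_exception(gen_with_value)
bare = final_exception(gen_bare_return)

print(with_value.args, with_value.value)
print(bare.args, bare.value)
```

In that design a bare "return" still raises an argument-less StopIteration,
and None is reported through the value attribute.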

Cheers,
Nick.

* self.args[0] is self.message due to the way __init__ is written, but
__getitem__ assumes self.message isn't in self.args.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Fri Oct 28 13:28:54 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 28 Oct 2005 21:28:54 +1000
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <200510271423.55919.fdrake@acm.org>
References: <435BC27C.1010503@v.loewis.de>
	<200510271340.04668.fdrake@acm.org>	<43611973.9030100@v.loewis.de>
	<200510271423.55919.fdrake@acm.org>
Message-ID: <43620B76.8060308@gmail.com>

Fred L. Drake, Jr. wrote:
> On Thursday 27 October 2005 14:16, Martin v. Löwis wrote:
>  > I think I would request a separate address; I don't think I want to get
>  > all webmaster email.
> 
> I like the idea of a separate address as well.

Perhaps the radically named svnaccess at python.org?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From barry at python.org  Fri Oct 28 14:11:19 2005
From: barry at python.org (Barry Warsaw)
Date: Fri, 28 Oct 2005 08:11:19 -0400
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <8xwe6ps4.fsf@python.net>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<8xwe6ps4.fsf@python.net>
Message-ID: <1130501479.5145.43.camel@geddy.wooz.org>

On Fri, 2005-10-28 at 03:53, Thomas Heller wrote:

> Can anyone recommend an XEmacs svn plugin to use - I've tried psvn.el
> from http://www.xsteve.at/prg/emacs/psvn.el which seems to work?

Yep, that's the one I use, albeit a few revs back from what's up there
now.  It's had some performance problems in the past, but is generally
pretty good these days.  I've had issues with it bogging down XEmacs
/after/ running stat on a very large tree.  It seems (seemed?) as though
it was still hogging cpu even after the actual back-end svn command was
finished.

My only other nit is that I wish I could svn stat a few directories at a
time.  Say I know that only directory A and B are out of date.  At the
command line I can say "svn stat A B" or "svn commit A B".  Can't really
do that in psvn.el, but I can understand why that's problematic.

I also would love it to be hooked into vc-mode too, for modeline updates
and commits of single files.  I can understand why those things aren't
there yet though.  All in all psvn.el works very well, although it's not
(for me) a complete replacement for the command line.

-Barry

From guido at python.org  Fri Oct 28 17:22:09 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 28 Oct 2005 08:22:09 -0700
Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions
In-Reply-To: <436207AA.20506@gmail.com>
References: <bbaeab100510272015k62f8d836p8c7878c05b007ac2@mail.gmail.com>
	<436207AA.20506@gmail.com>
Message-ID: <ca471dc20510280822w6e3849dbu80b239c1ba42c5f8@mail.gmail.com>

On 10/28/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Brett Cannon wrote:
> > Anyway, as soon as the cron job posts the PEP to the web site (already
> > checked into the new svn repository) have a read and start expounding
> > about how wonderful it is and that there are no qualms with it
> > whatsoever.  =)
>
> You mean aside from the implementation of __getitem__ being broken in
> BaseException*? ;)

Are you clairvoyant?! The cronjob was broken due to the SVN
transition and the file wasn't on the site yet. (Now fixed BTW.) Oh,
and here's the URL just in case:
http://www.python.org/peps/pep-0352.html

> Aside from that, I actually do have one real problem and one observation.
>
> The problem: The value of ex.args
>
>    The PEP as written significantly changes the semantics of ex.args - instead
> of being an empty tuple when no arguments are provided, it is instead a
> singleton tuple containing the empty string.
>
>    A backwards compatible definition of BaseException.__init__ would be:
>
>      def __init__(self, *args):
>          self.args = args
>          self.message = '' if not args else args[0]

But does anyone care? As long as args exists and is a tuple, does it
matter that it doesn't match the argument list when the latter was
empty? IMO the protocol mostly says that ex.args exists and is a tuple
-- the values in there can't be relied upon in pre-2.5-Python.
Exceptions that have specific information should store it in a
different place, not in ex.args.

> The observation: The value of ex.message
>
>    Under PEP 352 the concept of allowing "return x" to be used in a generator
> to mean "raise StopIteration(x)" would actually align quite well. A bare
> "return", however, would need to be changed to translate to "raise
> StopIteration(None)" rather than its current "raise StopIteration" in order to
> get the correct value (None) into ex.message.

Since ex.message is new, how can you say that it should have the value
None? IMO the whole idea is that ex.message should always be a string
going forward (although I'm not going to add a typecheck to enforce
this).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Fri Oct 28 21:12:57 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 28 Oct 2005 12:12:57 -0700
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <200510272353.27064.fdrake@acm.org>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<17249.37542.578327.785926@montanaro.dyndns.org>
	<bbaeab100510272003x3ba1e187u103cf6efaca48710@mail.gmail.com>
	<200510272353.27064.fdrake@acm.org>
Message-ID: <bbaeab100510281212i59b995a5wfd91787519f6e685@mail.gmail.com>

On 10/27/05, Fred L. Drake, Jr. <fdrake at acm.org> wrote:
> On Thursday 27 October 2005 23:03, Brett Cannon wrote:
>  > I guess, but I just don't like wikis personally so I have no
>  > inclination to make the conversion.  If someone wants to make the
>  > conversion over to the wiki and keep it up that's fine, but I have no
>  > problem keeping the dev FAQ updated like I have for CVS in the past.
>
> And I'm sure we all appreciate your efforts!  I certainly do.
>
> Regarding using the wiki... I have mixed feelings.  Wikis are really, really
> good for some things.  Anything that's "how-to" based on technology (how to
> use SVN, CVS, etc.) seems like a reasonable candidate, because we get the
> advantages of peer review.
>
> For things that describe policy, I don't think that's so great.  For policy
> (how to use SVN for Python development, because we have certain rules), I
> think we want to maintain strict editorial control.
>

I like that explanation more than mine.  =)  So I am just going to
keep the FAQ up then.  If there is anything at
http://www.python.org/dev/svn.html people feel should be moved over to
the FAQ that has not occurred yet, let me know.  Please have personal
experience, though, with what you want added so as to make sure the
information is relevant (e.g., Tim suffering through getting an SSH 2
key for Windows and what is exactly needed, complete with screenshots 
=) .

-Brett

From bcannon at gmail.com  Fri Oct 28 21:29:35 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 28 Oct 2005 12:29:35 -0700
Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions
In-Reply-To: <ca471dc20510280822w6e3849dbu80b239c1ba42c5f8@mail.gmail.com>
References: <bbaeab100510272015k62f8d836p8c7878c05b007ac2@mail.gmail.com>
	<436207AA.20506@gmail.com>
	<ca471dc20510280822w6e3849dbu80b239c1ba42c5f8@mail.gmail.com>
Message-ID: <bbaeab100510281229n17de9a13x91a8f081ba32f7e5@mail.gmail.com>

On 10/28/05, Guido van Rossum <guido at python.org> wrote:
> On 10/28/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > Brett Cannon wrote:
> > > Anyway, as soon as the cron job posts the PEP to the web site (already
> > > checked into the new svn repository) have a read and start expounding
> > > about how wonderful it is and that there are no qualms with it
> > > whatsoever.  =)
> >
> > You mean aside from the implementation of __getitem__ being broken in
> > BaseException*? ;)
>
> Are you clairvoyant?! The cronjob was broken due to the SVN
> transition and the file wasn't on the site yet. (Now fixed BTW.) Oh,
> and here's the URL just in case:
> http://www.python.org/peps/pep-0352.html
>

Nick got the python-checkins email and then read the PEP from the
repository (or at least that is what I assume since that is how Neal
managed to catch the PEP literally in under 5 minutes after checkin).

> > Aside from that, I actually do have one real problem and one observation.
> >
> > The problem: The value of ex.args
> >
> >    The PEP as written significantly changes the semantics of ex.args - instead
> > of being an empty tuple when no arguments are provided, it is instead a
> > singleton tuple containing the empty string.
> >
> >    A backwards compatible definition of BaseException.__init__ would be:
> >
> >      def __init__(self, *args):
> >          self.args = args
> >          self.message = '' if not args else args[0]
>
> But does anyone care? As long as args exists and is a tuple, does it
> matter that it doesn't match the argument list when the latter was
> empty? IMO the protocol mostly says that ex.args exists and is a tuple
> -- the values in there can't be relied upon in pre-2.5-Python.
> Exceptions that have specific information should store it in a
> different place, not in ex.args.
>

Looking at http://docs.python.org/lib/module-exceptions.html , it
looks like Guido is right.  All it ever says is that it is a tuple and
that any passed-in arguments go into 'args'; nothing about its default
value if no arguments are passed in.

But I personally have no qualms changing it if people want it, so -0
from me on making it more backwards-compatible.

> > The observation: The value of ex.message
> >
> >    Under PEP 352 the concept of allowing "return x" to be used in a generator
> > to mean "raise StopIteration(x)" would actually align quite well. A bare
> > "return", however, would need to be changed to translate to "raise
> > StopIteration(None)" rather than its current "raise StopIteration" in order to
> > get the correct value (None) into ex.message.
>
> Since ex.message is new, how can you say that it should have the value
> None? IMO the whole idea is that ex.message should always be a string
> going forward (although I'm not going to add a typecheck to enforce
> this).
>

My feeling exactly on 'message'.

-Brett

From raymond.hettinger at verizon.net  Fri Oct 28 22:16:00 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 28 Oct 2005 16:16:00 -0400
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <20051028193558.77BDF1E407C@bag.python.org>
Message-ID: <007c01c5dbfc$6a46e600$b62dc797@oemcomputer>

I don't follow why the PEP deprecates catching a category of exceptions
in a different release than it deprecates raising them.  Why would a
release allow catching something that cannot be raised?  I must be
missing something here.


Raymond


From guido at python.org  Fri Oct 28 22:29:20 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 28 Oct 2005 13:29:20 -0700
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <007c01c5dbfc$6a46e600$b62dc797@oemcomputer>
References: <20051028193558.77BDF1E407C@bag.python.org>
	<007c01c5dbfc$6a46e600$b62dc797@oemcomputer>
Message-ID: <ca471dc20510281329m5312946bjedf100d942c0dc49@mail.gmail.com>

On 10/28/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> I don't follow why the PEP deprecates catching a category of exceptions
> in a different release than it deprecates raising them.  Why would a
> release allow catching something that cannot be raised?  I must be
> missing something here.

So conforming code can catch exceptions raised by not-yet conforming code.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From rhamph at gmail.com  Fri Oct 28 22:32:40 2005
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 28 Oct 2005 14:32:40 -0600
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <007c01c5dbfc$6a46e600$b62dc797@oemcomputer>
References: <20051028193558.77BDF1E407C@bag.python.org>
	<007c01c5dbfc$6a46e600$b62dc797@oemcomputer>
Message-ID: <aac2c7cb0510281332s26504701k945b2be2ea84b185@mail.gmail.com>

On 10/28/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> I don't follow why the PEP deprecates catching a category of exceptions
> in a different release than it deprecates raising them.  Why would a
> release allow catching something that cannot be raised?  I must be
> missing something here.

Presumably because they CAN still be raised; attempting to do so
provokes a warning, not an error.

It also facilitates upgrading from old versions of Python.  You can
work to eliminate cases where the exceptions are raised while still
handling them if they do get raised.
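
A minimal sketch of that transition idea, written for a modern Python. The
names NewBase, NotYetConforming and transitional_raise are hypothetical
stand-ins, not anything from the PEP: raising a non-conforming exception
emits a DeprecationWarning but still raises, so existing handlers keep
working.

```python
import warnings


class NewBase(Exception):
    """Hypothetical stand-in for the required superclass."""


class NotYetConforming(Exception):
    """Hypothetical exception that has not been migrated yet."""


def transitional_raise(exc):
    # Warn, rather than fail, when the exception is not conforming.
    if not isinstance(exc, NewBase):
        warnings.warn(
            "%s does not inherit from NewBase" % type(exc).__name__,
            DeprecationWarning,
            stacklevel=2,
        )
    raise exc


handled = False
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    try:
        transitional_raise(NotYetConforming("old-style failure"))
    except NotYetConforming:
        # The handler still works while the raising side is being fixed.
        handled = True
```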

--
Adam Olsen, aka Rhamphoryncus

From raymond.hettinger at verizon.net  Fri Oct 28 22:44:53 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 28 Oct 2005 16:44:53 -0400
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <ca471dc20510281329m5312946bjedf100d942c0dc49@mail.gmail.com>
Message-ID: <007d01c5dc00$738da2e0$b62dc797@oemcomputer>

> > Why would a
> > release allow catching something that cannot be raised?  I must be
> > missing something here.
> 
> So conforming code can catch exceptions raised by not-yet conforming
> code.

That makes sense.

What was the rationale for pushing the deprecation of __getitem__ and
args back to Py2.8?  Is there a disadvantage to doing it earlier?
On the flip side, is there any reason it has to be done at all prior to
Py3.0?  That change seems orthogonal to the rest of the proposal and has
its own pluses and minuses (simplification on the plus-side and
code-breakage on the minus-side).  

FWIW, the args tuple does have a legitimate use case as one solution to
the problem of exception chaining (keeping the old info intact, but
adding new info as an extra field):

   try:
       raise TypeError('inner detail')
   except TypeError, e:
       args = e.args + ('outer context',)
       raise TypeError(*args)
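
The same pattern as a self-contained sketch using the later "except ... as"
spelling, so it runs on a modern Python. The helper name add_context is
hypothetical and exists only for this illustration.

```python
def add_context(exc, context):
    # Build a new exception of the same type, keeping the old args
    # intact and appending the new info as an extra field.
    return type(exc)(*(exc.args + (context,)))


try:
    try:
        raise TypeError('inner detail')
    except TypeError as e:
        raise add_context(e, 'outer context')
except TypeError as outer:
    chained_args = outer.args
```

Here chained_args ends up holding both the inner detail and the outer
context, in that order.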


Raymond


From martin at v.loewis.de  Sat Oct 29 00:07:23 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Sat, 29 Oct 2005 00:07:23 +0200
Subject: [Python-Dev] Help with inotify
In-Reply-To: <djro25$t1s$1@sea.gmane.org>
References: <djr2mo$g9u$1@sea.gmane.org>	<43611604.4080404@v.loewis.de>	<djr5iu$tou$1@sea.gmane.org>	<43612500.4040403@v.loewis.de>
	<djr948$b7u$1@sea.gmane.org>	<43614070.2030007@v.loewis.de>
	<djro25$t1s$1@sea.gmane.org>
Message-ID: <4362A11B.8090305@v.loewis.de>

Neal Becker wrote:
> OK, does python have a C API that would allow me to create a python file
> object from my C (C++) code?  Then instead of using python's fdopen I could
> just do it myself.

I don't know - you will have to read the python source to find out
(this is actually not a pythondev question anymore).

Regards,
Martin

From martin at v.loewis.de  Sat Oct 29 00:14:53 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 29 Oct 2005 00:14:53 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <bbaeab100510271742w2ebe93b5r91a5e90d5fc0fe8e@mail.gmail.com>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<bbaeab100510271742w2ebe93b5r91a5e90d5fc0fe8e@mail.gmail.com>
Message-ID: <4362A2DD.5090704@v.loewis.de>

Brett Cannon wrote:
> I have started a svn section in the dev FAQ
> (http://www.python.org/dev/devfaq.html) pertaining to checking out a
> project from the repository and other stuff discussed so far.  If
> something is not clear or people feel a step is missing, let me know.

One thing that should be carried over from svn.ht is how to set up
Putty on Windows. The issue is that subversion will look for a ssh
binary in its path, and if there is none, it fails. Saying

[tunnels]
ssh="c:/program files/putty/plink.exe" -T

in subversion's config file does the trick (see svn.html).

If you use a different SSH client, you need to adjust the configuration
accordingly. FYI, -T specifies to not allocate a terminal.

plink has the nice feature of giving GUI feedback if there is no
terminal for interactive feedback (such as whether the remote key
is trusted). This makes it useful for TortoiseSVN.

Regards,
Martin

From martin at v.loewis.de  Sat Oct 29 00:17:48 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 29 Oct 2005 00:17:48 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <4361B3F8.4000304@v.loewis.de>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>	<bbaeab100510271646v5eada245q43d85de5746a2e40@mail.gmail.com>
	<4361B3F8.4000304@v.loewis.de>
Message-ID: <4362A38C.9050400@v.loewis.de>

Martin v. Löwis wrote:
>>I am also curious as to what you would have me check out for the
>>sandbox; whole directory or just the trunk?
> 
> 
> You would usually only check out the trunk (unless you want to work
> on a branch, of course).

Actually, you would probably check out a sandbox subdirectory, such
as

http://svn.python.org/projects/sandbox/trunk/decimal/

(say).

We don't have a policy for making tags or branches for single
directories only; I would suggest that either
"tags/decimal-1.0" or "tags/decimal/1.0" would be acceptable
(depending on how frequently you anticipate making tags, perhaps).

Regards,
Martin

From martin at v.loewis.de  Sat Oct 29 00:21:03 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 29 Oct 2005 00:21:03 +0200
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicitconversions).
In-Reply-To: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
References: <435F5149.4060804@egenix.com>
	<435FBDC4.8030300@v.loewis.de>	<20051026105934.3977.JCARLSON@uci.edu>
	<50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
Message-ID: <4362A44F.9010506@v.loewis.de>

Neil Hodgson wrote:
>    This is anecdotal but it appears to me that transliterations are
> not commonly used apart from learning languages and some minimal help
> for foreigners such as including transliterated names on railway
> station name boards.

That would be my guess also. Transliteration is clearly common for
Latin-based languages (French, German, Spanish, say), but I doubt
non-Latin scripts are that often transliterated (even if conventions
exist).

Regards,
Martin

From bcannon at gmail.com  Sat Oct 29 00:52:48 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 28 Oct 2005 15:52:48 -0700
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <007d01c5dc00$738da2e0$b62dc797@oemcomputer>
References: <ca471dc20510281329m5312946bjedf100d942c0dc49@mail.gmail.com>
	<007d01c5dc00$738da2e0$b62dc797@oemcomputer>
Message-ID: <bbaeab100510281552rfd260afrde3e72eec14dd5df@mail.gmail.com>

On 10/28/05, Raymond Hettinger <raymond.hettinger at verizon.net> wrote:
> > > Why would a
> > > release allow catching something that cannot be raised?  I must be
> > > missing something here.
> >
> > So conforming code can catch exceptions raised by not-yet conforming
> > code.
>
> That makes sense.
>
> What was the rationale for pushing the deprecation of __getitem__ and
> args back to Py2.8?  Is there a disadvantage to doing it earlier?
> On the flip side, is there any reason it has to be done at all prior to
> Py3.0?  That change seems orthogonal to the rest of the proposal and has
> its own pluses and minuses (simplification on the plus-side and
> code-breakage on the minus-side).
>

I thought that there was no exact rush on their removal.  And I
suspect the later versions of the 2.x branch will be used to help ease
transition to Python 3, so I figured pushing it to 2.8 seemed like a
good idea.  I could even push it all the way to 2.9 if people prefer.

> FWIW, the args tuple does have a legitimate use case as one solution to
> the problem of exception chaining (keeping the old info intact, but
> adding new info as an extra field):
>
>    try:
>         raise TypeError('inner detail')
>    except TypeError, e:
>         args = e.args + ('outer context',)
>         raise TypeError(*args)
>

Interesting point, but I think that chaining should have more concrete
support à la PEP 344 or some other mechanism.  I think most people
agree that exception chaining is important enough to have better
support than some implicit way of passing the causing exception
along.  Perhaps something more along the lines of:

  try:
      raise TypeError("inner detail")
  except TypeError, e:
      raise TypeError("outer detail", cause=e)

where BaseException then has a 'cause' attribute that is set to None
by default or some specific object that is passed in as the second
argument to the constructor.
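
A rough sketch of how that could look, written so it runs on a modern
Python. CausedError is a hypothetical class invented for this illustration,
and putting 'cause' on a subclass rather than on BaseException itself is an
assumption made only to keep the example self-contained.

```python
class CausedError(Exception):
    """Hypothetical exception carrying an explicit 'cause' attribute."""

    def __init__(self, message='', cause=None):
        super().__init__(message)
        self.message = message
        # None by default, or the exception that led to this one.
        self.cause = cause


try:
    try:
        raise TypeError("inner detail")
    except TypeError as e:
        raise CausedError("outer detail", cause=e)
except CausedError as outer:
    captured = outer
```

The handler can then inspect captured.cause directly instead of digging
through an args tuple.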

-Brett

From tim.peters at gmail.com  Sat Oct 29 03:29:09 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 28 Oct 2005 21:29:09 -0400
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <43610C36.2030500@v.loewis.de>
References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
	<17248.52771.225830.484931@montanaro.dyndns.org>
	<43610C36.2030500@v.loewis.de>
Message-ID: <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com>

[skip at pobox.com]
>> Though there's no svn/cvs cheatsheet there, you may also find isolated
>> tidbits in the Subversion FAQ:
>>
>>     http://subversion.tigris.org/faq.html
>>
>> Just grep around for "cvs".

[Martin v. Löwis]
> In addition, you might want to read
>
> http://www.python.org/dev/svn.html

Excellent suggestions!  I have a few to pass on:

1. CVS uses "update" for all sorts of things.  SVN has different commands
   for several of the use cases CVS's update conflates:

- Updating to the current server state.  "svn update" does that, and SVN's
  update isn't useful for anything other than that.

- Finding out what's changed in your sandbox.  Use "svn status"
  for that.  Bonus:  in return for creating zillions of admin files,
  "svn status" is a local operation (no network access required).  Do
  "svn status -u" to get, in addition, a listing of files that _would_
  change if you were to do "svn update".

- Merging.  Use "svn merge" for that.  This includes the case of reverting
  a checkin, in which case just reverse the revision numbers:

    svn merge URL -rNEW:OLD

  where NEW is the revision number of the checkin you want to revert, and
  OLD is typically NEW-1.  Very nice:  this reverts _all_ changes made in
  revision NEW, no matter how many files were involved.

2. Every checkin conceptually creates a new version of the entire repository,
   uniquely identified by its revision number.  This is very powerful, but
   subtle, and CVS has nothing like it.  A glimpse of its power was given just
   above, talking about the ease of reverting an entire checkin in one easy
   gulp.

3. You're working on a trunk sandbox and discover it's going to take longer
   than you hoped.  Now you wish you had created a branch.  This is actually
   dead easy:  create a new branch of the trunk.  "svn switch" your sandbox
   to that new branch; this leaves your local change alone, which is key.
   "svn commit" -- you're done!  There's now a branch on the server matching
   your fiddled local state.

4. Making a branch or tag goes very fast under SVN.  Because branches
   and tags are just conventionally-named directories, you can delete them
   (like any other directory) when you're done with them.  These conspire to
   make simple applications of branches much more pleasant than under CVS.

5. CVS uses text mode for files by default.  SVN uses binary mode.  The
   latter is safer, but creates endless low-level snags for x-platform
   development.  I encourage Python developers to include this gibberish in
   their SVN config file:

"""
[auto-props]
# Setting eol-style to native on all files is a trick:  if svn
# believes a new file is binary, it won't honor the eol-style
# auto-prop.  However, svn considers the request to set eol-style
# to be an error then, and if adding multiple files with one
# svn "add" cmd, svn will stop adding files after the first
# such error.  A future release of svn will probably consider
# this to be a warning instead (and continue adding files).
* = svn:eol-style=native
*.c = svn:keywords=Id
*.h = svn:keywords=Id
*.py = svn:keywords=Id
"""

Then SVN will set the necessary svn:eol-style property to "native" on
new text files you commit.  I've never yet seen it tag a file
inappropriately using this trick, but it's guaranteed to screw up
_all_ text files without something like this (unless you have the
patience and discipline to manually set eol-style=native on all new
text files you add).
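
The revert recipe from item 1 can be sketched as a small helper that only
builds the command line and does not run it. The function name
revert_command and the revision number in the usage line are hypothetical;
the "-r NEW:OLD" form of "svn merge" is the standard one.

```python
def revert_command(url, new_rev, old_rev=None):
    """Return the argv for an 'svn merge' that undoes revision new_rev."""
    # OLD is typically NEW-1, so default to that when not given.
    if old_rev is None:
        old_rev = new_rev - 1
    # Reversing the revision numbers (NEW:OLD) is what makes it a revert.
    return ["svn", "merge", "-r", "%d:%d" % (new_rev, old_rev), url]


cmd = revert_command("http://svn.python.org/projects/sandbox/trunk/decimal/", 1234)
```

The resulting list can be handed to subprocess.call() from inside a sandbox;
the merge only changes the working copy, so an "svn commit" is still needed
afterwards.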

From ncoghlan at gmail.com  Sat Oct 29 03:52:04 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 29 Oct 2005 11:52:04 +1000
Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions
In-Reply-To: <bbaeab100510281229n17de9a13x91a8f081ba32f7e5@mail.gmail.com>
References: <bbaeab100510272015k62f8d836p8c7878c05b007ac2@mail.gmail.com>	
	<436207AA.20506@gmail.com>	
	<ca471dc20510280822w6e3849dbu80b239c1ba42c5f8@mail.gmail.com>
	<bbaeab100510281229n17de9a13x91a8f081ba32f7e5@mail.gmail.com>
Message-ID: <4362D5C4.4080206@gmail.com>

Brett Cannon wrote:
> On 10/28/05, Guido van Rossum <guido at python.org> wrote:
> Nick got the python-checkins email and then read the PEP from the
> repository (or at least that is what I assume since that is how Neal
> managed to catch the PEP literally in under 5 minutes after checkin).

Actually, when you first check a PEP in, the diff includes the entire text of 
the PEP - so I just read the python-checkins email :)

>> But does anyone care? As long as args exists and is a tuple, does it
>> matter that it doesn't match the argument list when the latter was
>> empty? IMO the protocol mostly says that ex.args exists and is a tuple
>> -- the values in there can't be relied upon in pre-2.5-Python.
>> Exceptions that have specific information should store it in a
>> different place, not in ex.args.
> 
> Looking at http://docs.python.org/lib/module-exceptions.html , it
> looks like Guido is right.  All it ever says is that it is a tuple and
> that any passed-in arguments go into 'args'; nothing about its default
> value if no arguments are passed in.
> 
> But I personally have no qualms changing it if people want it, so -0
> from me on making it more backwards-compatible.

I agree changing the behaviour is highly unlikely to cause any serious 
problems (mainly because anyone *caring* about the contents of args is rare), 
the current behaviour is relatively undocumented, and the PEP now proposes 
deprecating ex.args immediately, so Guido's well within his rights if he wants 
to change the behaviour.

I was merely commenting from the 'it's an unnecessary change to existing 
behaviour' angle, since the backwards compatible version gives the same 
behaviour of the new ex.message API as the version in the PEP, while leaving 
the now-deprecated ex.args API behaviour identical to that in Python 2.4.

In other words, I'm looking for a *benefit* that comes from the behavioural 
change, rather than a 'but the current behaviour is undocumented anyway' 
response. If there's no actual benefit in breaking it, then why break it? :)

>>> The observation: The value of ex.message
>>>
>>>    Under PEP 352 the concept of allowing "return x" to be used in a generator
>>> to mean "raise StopIteration(x)" would actually align quite well. A bare
>>> "return", however, would need to be changed to translate to "raise
>>> StopIteration(None)" rather than its current "raise StopIteration" in order to
>>> get the correct value (None) into ex.message.
>> Since ex.message is new, how can you say that it should have the value
>> None? IMO the whole idea is that ex.message should always be a string
>> going forward (although I'm not going to add a typecheck to enforce
>> this).
>>
> 
> My feeling exactly on 'message'.

I'm talking about the specific context of the behaviour of 'return' in 
generators, not on the behaviour of ex.message in general. For normal 
exceptions, I agree '' is the correct default.

For that specific case of allowing a return value from generators, and using 
it as the message on the raised StopIteration, *then* it makes sense for 
"return" to translate to "raise StopIteration(None)", so that generators have 
the same 'default return value' as normal functions.

There's a reason I said it was just an observation - it has no effect on PEP 
352 itself, only on a *different* syntax extension that hasn't even been 
officially suggested in a PEP (only mentioned in passing when discussing PEP 342).
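
For illustration only, here is how the idea behaves on Python 3.3 or later,
where "return x" in a generator is allowed and the value travels on the
raised StopIteration. Note that the attribute there is named 'value', not
'message'; the helper final_value and both generator names are hypothetical.

```python
def gen_with_value():
    yield 1
    return "done"


def gen_bare_return():
    yield 1
    return


def final_value(gen):
    # Exhaust the generator and hand back whatever its 'return' supplied.
    try:
        while True:
            next(gen)
    except StopIteration as stop:
        return stop.value


explicit = final_value(gen_with_value())
implicit = final_value(gen_bare_return())
```

A bare "return" yields None here, matching the default return value of a
normal function.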

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ncoghlan at gmail.com  Sat Oct 29 04:23:17 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 29 Oct 2005 12:23:17 +1000
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <bbaeab100510281552rfd260afrde3e72eec14dd5df@mail.gmail.com>
References: <ca471dc20510281329m5312946bjedf100d942c0dc49@mail.gmail.com>	<007d01c5dc00$738da2e0$b62dc797@oemcomputer>
	<bbaeab100510281552rfd260afrde3e72eec14dd5df@mail.gmail.com>
Message-ID: <4362DD15.4080606@gmail.com>

Brett Cannon wrote:
> Interesting point, but I think that chaining should have more concrete
> support ala PEP 344 or some other mechanism.  I think most people
> agree that exception chaining is important enough to have better
> support than some implied way of a causing exception to be passed
> along.  Perhaps something more along the lines of:
> 
>   try:
>       raise TypeError("inner detail")
>   except TypeError, e:
>       raise TypeError("outer detail", cause=e)
> 
> where BaseException then has a 'cause' attribute that is set to None
> by default or some specific object that is passed in as the second
> argument to the constructor.

Another point in PEP 352's favour is that it makes it far more feasible to 
implement something like PEP 344 by providing "__traceback__" and 
"__prev_exc__" attributes on BaseException.

The 'raise' statement could then take care of setting them appropriately if it 
was given an instance of BaseException to raise.

Actually, that brings up another question - PEP 352 says it will require 
objects that "inherit from BaseException". Does that mean that either subtypes 
or instances of BaseException will be acceptable? Or does it just mean 
instances? If the latter, how will that affect the multi-argument forms of 
'raise'?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From ishimoto at gembook.org  Sat Oct 29 04:29:23 2005
From: ishimoto at gembook.org (Atsuo Ishimoto)
Date: Sat, 29 Oct 2005 11:29:23 +0900
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicitconversions).
In-Reply-To: <4362A44F.9010506@v.loewis.de>
References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
	<4362A44F.9010506@v.loewis.de>
Message-ID: <20051029110331.D5AA.ISHIMOTO@gembook.org>

Hello from Japan,

I googled discussions about non-ASCII identifiers in Japanese, but I
found no consensus. Major languages such as Java and VB support non-ASCII
identifiers, so projects that use non-ASCII identifiers in their programs
do exist. Not all Japanese programmers think this is a good idea.
Some people enthusiastically prefer Japanese identifiers, but some feel
they reduce readability and are hard to type, and some worry about tool
breakage or encoding problems, etc. It looks as though smart people don't
like to express a preference for Japanese identifiers, maybe because they
think such a style is not cool, or they are afraid such a confession may
reveal a lack of English ability. ;)

I'm +0.1 for non-ASCII identifiers, although module names should remain
ASCII. ASCII identifiers might be encouraged, but as Martin said, it is
very useful for some groups of users.


On Sat, 29 Oct 2005 00:21:03 +0200
"Martin v. Löwis" <martin at v.loewis.de> wrote:

> Neil Hodgson wrote:
> >    This is anecdotal but it appears to me that transliterations are
> > not commonly used apart from learning languages and some minimal help
> > for foreigners such as including transliterated names on railway
> > station name boards.
> 
> That would be my guess also. Transliteration is clearly common for
> Latin-based languages (French, German, Spanish, say), but I doubt
> non-Latin scripts are that often transliterated (even if conventions
> exist).
> 

Yes, transliterations are rarely used in daily life in Japan. In
programming, I know a lot of projects use transliterated Japanese names,
but I guess they are in the minority.

--------------------------
Atsuo Ishimoto
ishimoto at gembook.org
Homepage:http://www.gembook.jp


From raymond.hettinger at verizon.net  Sat Oct 29 04:55:37 2005
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Fri, 28 Oct 2005 22:55:37 -0400
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <4362DD15.4080606@gmail.com>
Message-ID: <009101c5dc34$3de2a1c0$b62dc797@oemcomputer>

[Nick Coghlan]
> Another point in PEP 352's favour, is that it makes it far more feasible
> to implement something like PEP 344 by providing "__traceback__" and
> "__prev_exc__" attributes on BaseException.
>
> The 'raise' statement could then take care of setting them appropriately
> if it was given an instance of BaseException to raise.

IMO, there is no reason to take e.args out of the Py2.x series.
Take-aways should be left for Py3.0.  The existence of a legitimate use
case means that there may be working code in the field that would be
broken unnecessarily.  Nothing is gained by this breakage.

If 344 gets accepted and implemented, that's great.  Either way, there
is no rationale for chopping this long standing feature before 3.0.
IIRC, that was the whole point of 3.0 -- we could take out old stuff
that had been replaced by new and better things; otherwise, we would
simply deprecate old-style classes and be done with it in Py2.5 or
Py2.6.



Raymond


From bcannon at gmail.com  Sat Oct 29 05:16:02 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 28 Oct 2005 20:16:02 -0700
Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions
In-Reply-To: <4362D5C4.4080206@gmail.com>
References: <bbaeab100510272015k62f8d836p8c7878c05b007ac2@mail.gmail.com>
	<436207AA.20506@gmail.com>
	<ca471dc20510280822w6e3849dbu80b239c1ba42c5f8@mail.gmail.com>
	<bbaeab100510281229n17de9a13x91a8f081ba32f7e5@mail.gmail.com>
	<4362D5C4.4080206@gmail.com>
Message-ID: <bbaeab100510282016v7ab44b3dn224435ed5b21567@mail.gmail.com>

On 10/28/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Brett Cannon wrote:
> > On 10/28/05, Guido van Rossum <guido at python.org> wrote:
> > Nick got the python-checkins email and then read the PEP from the
> > repository (or at least that is what I assume since that is how Neal
> > managed to catch the PEP literally in under 5 minutes after checkin).
>
> Actually, when you first check a PEP in, the diff includes the entire text of
> the PEP - so I just read the python-checkins email :)
>
> >> But does anyone care? As long as args exists and is a tuple, does it
> >> matter that it doesn't match the argument list when the latter was
> >> empty? IMO the protocol mostly says that ex.args exists and is a tuple
> >> -- the values in there can't be relied upon in pre-2.5-Python.
> >> Exceptions that have specific information should store it in a
> >> different place, not in ex.args.
> >
> > Looking at http://docs.python.org/lib/module-exceptions.html , it
> > looks like Guido is right.  All it ever says is that it is a tuple and
> > that any passed-in arguments go into 'args'; nothing about its default
> > value if no arguments are passed in.
> >
> > But I personally have no qualms changing it if people want it, so -0
> > from me on making it more backwards-compatible.
>
> I agree changing the behaviour is highly unlikely to cause any serious
> problems (mainly because anyone *caring* about the contents of args is rare),
> the current behaviour is relatively undocumented, and the PEP now proposes
> deprecating ex.args immediately, so Guido's well within his rights if he wants
> to change the behaviour.
>
> I was merely commenting from the 'its an unnecessary change to existing
> behaviour' angle, since the backwards compatible version gives the same
> behaviour of the new ex.message API as the version in the PEP, while leaving
> the now-deprecated ex.args API behaviour identical to that in Python 2.4.
>
> In other words, I'm looking for a *benefit* that comes from the behavioural
> change, rather than a 'but the current behaviour is undocumented anyway'
> response. If there's no actual benefit in breaking it, then why break it? :)
>

The benefit for me was that the code kept the 'message' argument and
thus, in my mind, made it much more obvious that 'message' and 'args'
are different.  But I think I have a much more reasonable solution
that lets me keep the 'message' argument explicit.  It also lets me use
the conditional operator to simplify the code further.  So I went ahead
and made it the more backwards-compatible version.

> >>> The observation: The value of ex.message
> >>>
> >>>    Under PEP 352 the concept of allowing "return x" to be used in a generator
> >>> to mean "raise StopIteration(x)" would actually align quite well. A bare
> >>> "return", however, would need to be changed to translate to "raise
> >>> StopIteration(None)" rather than its current "raise StopIteration" in order to
> >>> get the correct value (None) into ex.message.
> >> Since ex.message is new, how can you say that it should have the value
> >> None? IMO the whole idea is that ex.message should always be a string
> >> going forward (although I'm not going to add a typecheck to enforce
> >> this).
> >>
> >
> > My feeling exactly on 'message'.
>
> I'm talking about the specific context of the behaviour of 'return' in
> generators, not on the behaviour of ex.message in general. For normal
> exceptions, I agree '' is the correct default.
>
> For that specific case of allowing a return value from generators, and using
> it as the message on the raised StopIteration, *then* it makes sense for
> "return" to translate to "raise StopIteration(None)", so that generators have
> the same 'default return value' as normal functions.
>
> There's a reason I said it was just an observation - it has no effect on PEP
> 352 itself, only on a *different* syntax extension that hasn't even been
> officially suggested in a PEP (only mentioned in passing when discussing PEP 342).
>

Ah, OK.  So you just want to make sure that, at the generator level
(in the bytecode or the ceval loop; I'm not sure where the change would
need to be made), StopIteration is raised with an explicit
'message' argument of None.  That obviously does not directly affect
PEP 352, but should be considered as a possible change.  That makes
sense to me and I have no trouble with it, but that is partially
because I don't have to make that change.  =)
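
For readers who want to try the idea, here is a small sketch of the "default return value" behaviour under discussion. It assumes Python 3.3 or newer, where a generator's return value travels on StopIteration as the "value" attribute (not "message"); the helper name final_value is invented for this example.

```python
def bare_return():
    yield 1
    return            # bare return: no explicit value


def return_with_value():
    yield 1
    return "done"     # carried on the StopIteration instance


def final_value(gen):
    """Exhaust gen and hand back whatever its StopIteration carries."""
    try:
        while True:
            next(gen)
    except StopIteration as stop:
        return stop.value


print(final_value(bare_return()))        # None, like a normal function
print(final_value(return_with_value()))
```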

-Brett

From guido at python.org  Sat Oct 29 05:22:36 2005
From: guido at python.org (Guido van Rossum)
Date: Fri, 28 Oct 2005 20:22:36 -0700
Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions
In-Reply-To: <4362D5C4.4080206@gmail.com>
References: <bbaeab100510272015k62f8d836p8c7878c05b007ac2@mail.gmail.com>
	<436207AA.20506@gmail.com>
	<ca471dc20510280822w6e3849dbu80b239c1ba42c5f8@mail.gmail.com>
	<bbaeab100510281229n17de9a13x91a8f081ba32f7e5@mail.gmail.com>
	<4362D5C4.4080206@gmail.com>
Message-ID: <ca471dc20510282022q2a07afehcca4f6d34e48174c@mail.gmail.com>

[Trying to cut this short... We have too many threads for this topic. :-( ]

On 10/28/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
[on making args b/w compatible]
> I agree changing the behaviour is highly unlikely to cause any serious
> problems (mainly because anyone *caring* about the contents of args is rare),
> the current behaviour is relatively undocumented, and the PEP now proposes
> deprecating ex.args immediately, so Guido's well within his rights if he wants
> to change the behaviour.

I take it back. Since the feature will disappear in Python 3.0 and is
maintained only for b/w compatibility, we should keep it as b/w
compatible as possible. That means it should default to () and always
have as its value exactly the positional arguments that were passed.

OTOH, I want message to default to "", not to None (even though it
will be set to None if you explicitly pass None as the first
argument). So the constructor could be like this (until Python 3000):

def __init__(self, *args):
    self.args = args
    if args:
        self.message = args[0]
    else:
        self.message = ""

I think Nick proposed this before as well, so let's just do this.
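
A quick sketch of how that constructor behaves in the three interesting cases. The class name BackCompatError is hypothetical, and the class derives from Exception only so the snippet runs on a current interpreter; it is not the code that was checked in.

```python
class BackCompatError(Exception):
    """Illustrative stand-in for the proposed BaseException constructor."""

    def __init__(self, *args):
        self.args = args              # always exactly the positional arguments
        if args:
            self.message = args[0]    # first argument, even if it is None
        else:
            self.message = ""         # default is the empty string, not None


no_args = BackCompatError()
explicit_none = BackCompatError(None)
several = BackCompatError("disk full", 28)

print(repr(no_args.args), repr(no_args.message))
print(repr(explicit_none.args), repr(explicit_none.message))
print(repr(several.args), repr(several.message))
```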

> I'm talking about the specific context of the behaviour of 'return' in
> generators, not on the behaviour of ex.message in general. For normal
> exceptions, I agree '' is the correct default.
>
> For that specific case of allowing a return value from generators, and using
> it as the message on the raised StopIteration, *then* it makes sense for
> "return" to translate to "raise StopIteration(None)", so that generators have
> the same 'default return value' as normal functions.

I don't like that (not-even-proposed) feature anyway. I see no use for
it; it only gets proposed by people who are irked by the requirement
that generators can contain 'return' but not 'return value'. I think
that irkedness is unwarranted; 'return' is useful to cause an early
exit, but generators don't have a return value so 'return value' is
meaningless.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bcannon at gmail.com  Sat Oct 29 05:27:27 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 28 Oct 2005 20:27:27 -0700
Subject: [Python-Dev] PEP 352: Required Superclass for Exceptions
In-Reply-To: <ca471dc20510282022q2a07afehcca4f6d34e48174c@mail.gmail.com>
References: <bbaeab100510272015k62f8d836p8c7878c05b007ac2@mail.gmail.com>
	<436207AA.20506@gmail.com>
	<ca471dc20510280822w6e3849dbu80b239c1ba42c5f8@mail.gmail.com>
	<bbaeab100510281229n17de9a13x91a8f081ba32f7e5@mail.gmail.com>
	<4362D5C4.4080206@gmail.com>
	<ca471dc20510282022q2a07afehcca4f6d34e48174c@mail.gmail.com>
Message-ID: <bbaeab100510282027l1bde4deekefca721788745c4@mail.gmail.com>

On 10/28/05, Guido van Rossum <guido at python.org> wrote:
> [Trying to cut this short... We have too many threads for this topic. :-( ]
>
> On 10/28/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> [on making args b/w compatible]
> > I agree changing the behaviour is highly unlikely to cause any serious
> > problems (mainly because anyone *caring* about the contents of args is rare),
> > the current behaviour is relatively undocumented, and the PEP now proposes
> > deprecating ex.args immediately, so Guido's well within his rights if he wants
> > to change the behaviour.
>
> I take it back. Since the feature will disappear in Python 3.0 and is
> maintained only for b/w compatibility, we should keep it as b/w
> compatible as possible. That means it should default to () and always
> have as its value exactly the positional arguments that were passed.
>
> OTOH, I want message to default to "", not to None (even though it
> will be set to None if you explicitly pass None as the first
> argument). So the constructor could be like this (until Python 3000):
>
> def __init__(self, *args):
>     self.args = args
>     if args:
>         self.message = args[0]
>     else:
>         self.message = ""
>
> I think Nick proposed this before as well, so let's just do this.

Yeah, but Nick used the conditional operator and I used that.  All checked in.

-Brett

From bcannon at gmail.com  Sat Oct 29 05:37:29 2005
From: bcannon at gmail.com (Brett Cannon)
Date: Fri, 28 Oct 2005 20:37:29 -0700
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <4362DD15.4080606@gmail.com>
References: <ca471dc20510281329m5312946bjedf100d942c0dc49@mail.gmail.com>
	<007d01c5dc00$738da2e0$b62dc797@oemcomputer>
	<bbaeab100510281552rfd260afrde3e72eec14dd5df@mail.gmail.com>
	<4362DD15.4080606@gmail.com>
Message-ID: <bbaeab100510282037m5bad1f67kb4d5cb7171ac163b@mail.gmail.com>

On 10/28/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Brett Cannon wrote:
> > Interesting point, but I think that chaining should have more concrete
> > support ala PEP 344 or some other mechanism.  I think most people
> > agree that exception chaining is important enough to have better
> > support than some implied way of a causing exception to be passed
> > along.  Perhaps something more along the lines of:
> >
> >   try:
> >       raise TypeError("inner detail")
> >   except TypeError, e:
> >       raise TypeError("outer detail", cause=e)
> >
> > where BaseException then has a 'cause' attribute that is set to None
> > by default or some specific object that is passed in as the second
> > argument to the constructor.
>
> Another point in PEP 352's favour, is that it makes it far more feasible to
> implement something like PEP 344 by providing "__traceback__" and
> "__prev_exc__" attributes on BaseException.
>
> The 'raise' statement could then take care of setting them appropriately if it
> was given an instance of BaseException to raise.
>

Yep.  This is why having a guaranteed API is so handy for exceptions. 
And actually PEP 3000 says that exceptions are supposed to gain a
traceback attribute.  But that can be another PEP if PEP 344 doesn't
make it.

> Actually, that brings up another question - PEP 352 says it will require
> objects that "inherit from BaseException". Does that mean that either subtypes
> or instances of BaseException will be acceptable? Or does it just mean
> instances? If the latter, how will that affect the multi-argument forms of
> 'raise'?
>

I don't see how a multi-argument 'raise' changes the situation any. 
``raise BaseException`` and ``raise BaseException()`` must both be
supported which means isinstance() or issubclass() will be used (unless
Python 3 bans raising a class or something).
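
A minimal sketch of the check described above, written as ordinary Python rather than as the interpreter's actual C code. The function name is_raisable is made up for the example.

```python
def is_raisable(obj):
    """Accept either a BaseException subclass or an instance of one."""
    if isinstance(obj, type):
        # 'raise SomeClass': the class itself must derive from BaseException.
        return issubclass(obj, BaseException)
    # 'raise SomeClass()': the instance must be a BaseException.
    return isinstance(obj, BaseException)


print(is_raisable(BaseException))      # class form
print(is_raisable(BaseException()))    # instance form
print(is_raisable("a string"))         # string exceptions are rejected
print(is_raisable(int))                # unrelated class is rejected
```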

-Brett

From radeex at gmail.com  Sat Oct 29 06:43:25 2005
From: radeex at gmail.com (Christopher Armstrong)
Date: Sat, 29 Oct 2005 15:43:25 +1100
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <4362DD15.4080606@gmail.com>
References: <ca471dc20510281329m5312946bjedf100d942c0dc49@mail.gmail.com>
	<007d01c5dc00$738da2e0$b62dc797@oemcomputer>
	<bbaeab100510281552rfd260afrde3e72eec14dd5df@mail.gmail.com>
	<4362DD15.4080606@gmail.com>
Message-ID: <60ed19d40510282143x466fbdf1x5570c1b05c6cd53c@mail.gmail.com>

On 10/29/05, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Another point in PEP 352's favour, is that it makes it far more feasible to
> implement something like PEP 344 by providing "__traceback__" and
> "__prev_exc__" attributes on BaseException.

Not sure if I'm fully in-context here, but watch out for __traceback__
and garbage collection, since the traceback objects refer to all the
frames. I expect there's a significant amount of code out there that
expects Exception instances to be reasonably persistent. At least
Twisted does, with its encapsulation of Exceptions for the purposes of
asynchrony -- Failure objects. These Failure objects also refer to
tracebacks, but we had to be very careful about deleting them fairly
quickly because of GC issues. After deletion they simply contain an
inert, basically stringified copy of the traceback.
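
To make the concern concrete, here is a sketch that assumes a Python 3 interpreter, where exceptions do carry "__traceback__". It shows a stored exception keeping a frame's locals alive, and a stringify-then-drop step in the spirit of what is described for Failure. The names Payload, fail and detach_traceback are invented; this is not Twisted's code.

```python
import gc
import traceback
import weakref


class Payload:
    """Stand-in for a large object living in a frame's locals."""


def fail():
    big = Payload()
    ref = weakref.ref(big)
    try:
        raise ValueError("boom")
    except ValueError as exc:
        # The traceback on exc refers to this frame, and so to 'big'.
        return exc, ref


def detach_traceback(exc):
    """Keep an inert text copy of the traceback, then drop the real one."""
    text = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__))
    exc.__traceback__ = None
    return text


exc, ref = fail()
alive_before = ref() is not None   # frame locals pinned by the exception
text = detach_traceback(exc)
gc.collect()
alive_after = ref() is not None    # released once the traceback is gone
print(alive_before, alive_after)
```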

On an only semi-related note, at one point I tried making it possible
to have longer-lived Traceback objects that could be reraised, but
found it very hard to do, at least with my self-imposed requirement of
keeping it in an extension module.

http://mail.python.org/pipermail/python-dev/2005-September/056091.html



--
  Twisted   |  Christopher Armstrong: International Man of Twistery
   Radix    |    -- http://radix.twistedmatrix.com
            |  Release Manager, Twisted Project
  \\\V///   |    -- http://twistedmatrix.com
   |o O|    |
w----v----w-+

From ncoghlan at iinet.net.au  Sat Oct 29 08:16:19 2005
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Sat, 29 Oct 2005 16:16:19 +1000
Subject: [Python-Dev] PEP 343 updated with outcome of recent discussions
Message-ID: <436313B3.2030707@iinet.net.au>

Once the cron job works its magic, the updated PEP 343 should be available on 
the website.

As far as I am aware, there aren't any more open issues, so it is once again 
ready for BDFL pronouncement.

I also tinkered with the example naming a bit, and added a new example for the 
"nested" context manager (it turns out there *were* mistakes in the last 
version I posted here - I had the deque method name wrong, and I wasn't 
invoking __context__ correctly on the nested contexts).

Cheers,
Nick.

P.S. My availability will be sketchy for the rest of this weekend, then 
nonexistent until next weekend, so don't be surprised if I don't respond to 
messages before then.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.blogspot.com

From martin at v.loewis.de  Sat Oct 29 10:56:58 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 29 Oct 2005 10:56:58 +0200
Subject: [Python-Dev] Divorcing str and unicode (no
	more	implicitconversions).
In-Reply-To: <20051029110331.D5AA.ISHIMOTO@gembook.org>
References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
	<4362A44F.9010506@v.loewis.de>
	<20051029110331.D5AA.ISHIMOTO@gembook.org>
Message-ID: <4363395A.3040606@v.loewis.de>

Atsuo Ishimoto wrote:
> I'm +0.1 for non-ASCII identifiers, although module names should remain
> ASCII. ASCII identifiers might be encouraged, but as Martin said, it is
> very useful for some groups of users.

Thanks for these data. This mostly reflects my experience with German
and French users: some people would like to use non-ASCII identifiers
if they could, others argue they never would as a matter of principle.
Of course, transliteration is more straight-forward.

Regards,
Martin


From gjc at inescporto.pt  Sat Oct 29 13:09:10 2005
From: gjc at inescporto.pt (Gustavo J. A. M. Carneiro)
Date: Sat, 29 Oct 2005 12:09:10 +0100
Subject: [Python-Dev] Divorcing str and unicode
	(no	more	implicitconversions).
In-Reply-To: <4363395A.3040606@v.loewis.de>
References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
	<4362A44F.9010506@v.loewis.de>
	<20051029110331.D5AA.ISHIMOTO@gembook.org>
	<4363395A.3040606@v.loewis.de>
Message-ID: <1130584150.10206.10.camel@localhost.localdomain>

On Sat, 2005-10-29 at 10:56 +0200, "Martin v. Löwis" wrote:
> Atsuo Ishimoto wrote:
> > I'm +0.1 for non-ASCII identifiers, although module names should remain
> > ASCII. ASCII identifiers might be encouraged, but as Martin said, it is
> > very useful for some groups of users.
> 
> Thanks for these data. This mostly reflects my experience with German
> and French users: some people would like to use non-ASCII identifiers
> if they could, others argue they never would as a matter of principle.
> Of course, transliteration is more straight-forward.

  Not sure if anyone has made this point already, but unicode
identifiers are also useful for math programs.  The ability to directly
type the math letters, like alpha, omega, etc., would actually make the
code more readable, while still understandable by programmers of all
nationalities.  For instance, you could write:

	Δv = x1 - x0
	if Δv < ε:
	    return

Instead of:

	delta_v = x1 - x0
	if delta_v < epsilon:
	    return

But anyone that is supposed to understand the code will be able to read
the delta and epsilon symbols.
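
For what it's worth, a sketch of the same idea as runnable code. It assumes a Python 3 interpreter, which accepts Greek letters in identifiers; the function name converged and the default tolerance are made up for the example.

```python
def converged(x0, x1, ε=1e-9):
    """Return True when the step from x0 to x1 is smaller than ε."""
    Δv = x1 - x0
    return abs(Δv) < ε


print(converged(1.0, 1.0))
print(converged(1.0, 2.0))
```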

  Regards.

-- 
Gustavo J. A. M. Carneiro
<gjc at inescporto.pt> <gustavo at users.sourceforge.net>
The universe is always one step beyond logic


From solipsis at pitrou.net  Sat Oct 29 14:32:22 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 29 Oct 2005 14:32:22 +0200
Subject: [Python-Dev] Divorcing str and unicode
	(no	more	implicitconversions).
In-Reply-To: <4363395A.3040606@v.loewis.de>
References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
	<4362A44F.9010506@v.loewis.de>
	<20051029110331.D5AA.ISHIMOTO@gembook.org>
	<4363395A.3040606@v.loewis.de>
Message-ID: <1130589142.5945.11.camel@fsol>


> Thanks for these data. This mostly reflects my experience with German
> and French users: some people would like to use non-ASCII identifiers
> if they could, others argue they never would as a matter of principle.
> Of course, transliteration is more straight-forward.

FWIW, being French, I don't remember hearing any programmer wish (s)he
could use non-ASCII identifiers, in any programming language. But
arguably transliteration is very straight-forward (although a bit
lossy at times ;-)).

I think typeability and reproducibility should be weighed carefully.
It's nice to have the real letter delta instead of "delta", but how do I
type it again on my non-Greek keyboard if I want to keep consistent
naming in the program?

ASCII is ethnocentric, but it probably can be typed easily with every
device in the world.

Also, as a matter of fact, if I type an identifier with an accented
letter inside, I would like Python to warn me, because it would be a
typing error on my part.

Maybe this should be an option at the beginning of any source file (like
encoding currently). Or is this overkill?



From skink at evhr.net  Sat Oct 29 14:50:36 2005
From: skink at evhr.net (Fabien Schwob)
Date: Sat, 29 Oct 2005 14:50:36 +0200
Subject: [Python-Dev] Divorcing str and
	unicode	(no	more	implicitconversions).
In-Reply-To: <1130589142.5945.11.camel@fsol>
References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>	<4362A44F.9010506@v.loewis.de>	<20051029110331.D5AA.ISHIMOTO@gembook.org>	<4363395A.3040606@v.loewis.de>
	<1130589142.5945.11.camel@fsol>
Message-ID: <4363701C.80904@evhr.net>

> FWIW, being French, I don't remember hearing any programmer wish (s)he
> could use non-ASCII identifiers, in any programming language. But
> arguably transliteration is very straight-forward (although a bit
> lossy at times ;-)).
> 
> I think typeability and reproducibility should be weighed carefully.
> It's nice to have the real letter delta instead of "delta", but how do I
> type it again on my non-Greek keyboard if I want to keep consistent
> naming in the program?
> 
> ASCII is ethnocentric, but it probably can be typed easily with every
> device in the world.
> 
> Also, as a matter of fact, if I type an identifier with an accented
> letter inside, I would like Python to warn me, because it would be a
> typing error on my part.
> 
> Maybe this should be an option at the beginning of any source file (like
> encoding currently). Or is this overkill?

I'm also French and I must say that I agree with you. In my case, the 
most important thing is to be able to handle the _data_ in the right 
encoding.

I'm currently trying to implement a little search engine in Python (mainly 
to improve my skills) and the biggest problem I have to face is how 
to manage encodings. Some web pages are in French, some in German, some in 
English, etc., and it takes me a lot of time to handle this problem correctly.

I think it's more useful to be able to manipulate the _data_ easily than 
to have accents in identifiers.

-- 
Behind every bug there is a developer, a man who made a mistake.
(Well, OK, sometimes several of them pitch in.)


From martin at v.loewis.de  Sat Oct 29 16:48:32 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 29 Oct 2005 16:48:32 +0200
Subject: [Python-Dev] Divorcing str and
	unicode	(no	more	implicitconversions).
In-Reply-To: <1130589142.5945.11.camel@fsol>
References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>	<4362A44F.9010506@v.loewis.de>	<20051029110331.D5AA.ISHIMOTO@gembook.org>	<4363395A.3040606@v.loewis.de>
	<1130589142.5945.11.camel@fsol>
Message-ID: <43638BC0.40108@v.loewis.de>

Antoine Pitrou wrote:
> FWIW, being French, I don't remember hearing any programmer wish (s)he
> could use non-ASCII identifiers, in any programming language. But
> arguably transliteration is very straight-forward (although a bit
> lossy at times ;-)).

My canonical example is François Pinard, who keeps requesting it,
saying that local people were surprised they couldn't use accented
characters in Python.

Perhaps that's because he actually is Quebecois :-)

Regards,
Martin

From phd at mail2.phd.pp.ru  Sat Oct 29 18:13:27 2005
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Sat, 29 Oct 2005 20:13:27 +0400
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com>
References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
	<17248.52771.225830.484931@montanaro.dyndns.org>
	<43610C36.2030500@v.loewis.de>
	<1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com>
Message-ID: <20051029161327.GB7048@phd.pp.ru>

Hello!

On Fri, Oct 28, 2005 at 09:29:09PM -0400, Tim Peters wrote:
> - Finding out what's changed in your sandbox.  Use "svn status"

   svn diff uses locally saved copies of files. This trades disk space
for speed. It also decreases net traffic; that's important
for those who have expensive connections.

> 4. Making a branch or tag goes very fast under SVN.

   Fast and cheap in terms of space; Subversion uses a kind of symlink in
its internal filesystem.

>    make simple applications of branches much more pleasant than under CVS.

   Much more pleasant. I now use more branches than I did with CVS and have
fewer conflicts.

> * = svn:eol-style=native

   I would very much like to recommend that developers set the svn:executable
property on executable scripts and unset it on non-executable files; thus
all those README and NEWS files will be tarred with -rw-r--r-- attributes. :)

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From s.percivall at chello.se  Sat Oct 29 18:35:18 2005
From: s.percivall at chello.se (Simon Percivall)
Date: Sat, 29 Oct 2005 18:35:18 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <43611507.8090606@v.loewis.de>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<2m64rj5agw.fsf@starship.python.net> <43611507.8090606@v.loewis.de>
Message-ID: <D4A27581-7A21-4B41-8DA4-55DF6E832532@chello.se>

On 27 Oct 2005, at 19.57, Martin v. Löwis wrote:
> Michael Hudson wrote:
>
>> Do checkins to svn.python.org go to the python-checkins list already?
>>
>
> They do indeed - you should have received one commit message by now
> (me testing whether committing works, on PEP 347).

Could the subject lines of those messages please be changed to something
more informative? Having which files were changed in the subject seems
better than having only the new rev and the folders the files are in.

//Simon


From martin at v.loewis.de  Sat Oct 29 18:44:50 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 29 Oct 2005 18:44:50 +0200
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <D4A27581-7A21-4B41-8DA4-55DF6E832532@chello.se>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<2m64rj5agw.fsf@starship.python.net> <43611507.8090606@v.loewis.de>
	<D4A27581-7A21-4B41-8DA4-55DF6E832532@chello.se>
Message-ID: <4363A702.1050502@v.loewis.de>

Simon Percivall wrote:
> Could the subject lines of those messages please be changed to something
> more informative? Having which files were changed in the subject seems
> better than having only the new rev and the folders the files are in.

I'm not sure whether that should be done, nor whether it could be
done.

What do others think? I personally found those long subject lines
listing all the changed files very ugly and unreadable.

The other question (whether it could be done) is probably answered
as "yes", but I have to research what magic precisely is necessary.

Regards,
Martin

From barry at python.org  Sat Oct 29 20:34:48 2005
From: barry at python.org (Barry Warsaw)
Date: Sat, 29 Oct 2005 14:34:48 -0400
Subject: [Python-Dev] Conversion to Subversion is complete
In-Reply-To: <4363A702.1050502@v.loewis.de>
References: <1130409313.4360ad6139518@www.domainfactory-webmail.de>
	<2m64rj5agw.fsf@starship.python.net> <43611507.8090606@v.loewis.de>
	<D4A27581-7A21-4B41-8DA4-55DF6E832532@chello.se>
	<4363A702.1050502@v.loewis.de>
Message-ID: <1130610888.11892.6.camel@geddy.wooz.org>

On Sat, 2005-10-29 at 12:44, "Martin v. Löwis" wrote:

> What do others think? I personally found those long subject lines
> listing all the changed files very ugly and unreadable.

Me too.  At work our subject lines contain something like:

Subject: [SVN][reponame] checkin of r12345 - dir/containing/changes

Note that we send a different commit message for every directory the
change happens in, even though it's all one revision.  We like it that
way because some people don't care about certain directories and can
filter based on that.

Inside the body of the email you'll see something like:

Author: person
Date: when
New Revision: r12345

Log:
Log message comes next.  Definitely best to show up before the diff.

diff comes next...

FWIW, this format has worked well for us.
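
A small sketch of the per-directory scheme described above, assuming the subject format shown earlier. The function name commit_subjects and the sample paths are invented; a real hook would get the changed paths from the repository rather than from a hard-coded list.

```python
import posixpath
from collections import defaultdict


def commit_subjects(repo, revision, changed_paths):
    """Build one subject line per directory touched by a revision."""
    by_dir = defaultdict(list)
    for path in changed_paths:
        # Group files under the directory that contains them.
        by_dir[posixpath.dirname(path)].append(path)
    return [
        "[SVN][%s] checkin of r%d - %s" % (repo, revision, directory)
        for directory in sorted(by_dir)
    ]


for subject in commit_subjects(
        "reponame", 12345, ["Lib/os.py", "Lib/re.py", "Doc/lib.tex"]):
    print(subject)
```

Two files in the same directory share one message, so a reader filtering on the directory name sees each revision once per directory they care about.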
-Barry


From fdrake at acm.org  Sat Oct 29 21:04:15 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 29 Oct 2005 15:04:15 -0400
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com>
References: <435BC27C.1010503@v.loewis.de> <43610C36.2030500@v.loewis.de>
	<1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com>
Message-ID: <200510291504.15504.fdrake@acm.org>

On Friday 28 October 2005 21:29, Tim Peters wrote:
 > - Finding out what's changed in your sandbox.  Use "svn status"
 >   for that.  Bonus:  in return for creating zillions of admin files,
 > "svn status"
 >   is a local operation (no network access required).  Do "svn status -u"
 > to get, in addition, a listing of files that _would_ change if you were to
 > do "svn update".

It's worth noting that "svn status -u" does require network access, since it 
has to check with the repository to see what's been updated there.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From fdrake at acm.org  Sat Oct 29 21:50:11 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 29 Oct 2005 15:50:11 -0400
Subject: [Python-Dev] [Python-checkins] commit of r41352 - in
	python/trunk: . Lib Lib/distutils Lib/distutils/command
	Lib/encodings
In-Reply-To: <20051029194022.470D61E40B4@bag.python.org>
References: <20051029194022.470D61E40B4@bag.python.org>
Message-ID: <200510291550.12279.fdrake@acm.org>

On Saturday 29 October 2005 15:40, martin.v.loewis at python.org wrote:
 > Author: martin.v.loewis
 > Date: Sat Oct 29 21:40:21 2005
 > New Revision: 41352
 >
 > Modified:
 >    python/trunk/   (props changed)
 >    python/trunk/.cvsignore
...
 > Add *.pyc to svn:ignore.
 > Add libpython*.a to .cvsignore and svn:ignore.

Shouldn't we simply remove the .cvsignore files?  Subversion doesn't use them, 
so they'll just end up getting out of sync with the svn:ignore properties.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From noamraph at gmail.com  Sat Oct 29 22:42:49 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sat, 29 Oct 2005 22:42:49 +0200
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <1130107429.11268.40.camel@geddy.wooz.org>
References: <1130107429.11268.40.camel@geddy.wooz.org>
Message-ID: <b348a0850510291342y25f0a3d9m5be75afa908c9e55@mail.gmail.com>

Hello,

I have thought about freezing for some time, and I think that it is a
fundamental need - the need to know, sometimes, that objects aren't
going to change.

This is mostly the need of containers. dicts need to know that the
objects that are used as keys aren't going to change, because if they
change, their hash value changes, and you end up with a data structure
in an inconsistent state. This is the need of sets too, and of heaps,
and binary trees, and so on.

I want to give another example: my colleagues and I designed something
which can be described as an "electronic spreadsheet in Python". We
called it a "table". The values in the table are Python objects, and
the functions which relate them are written in Python. Then comes the
problem: the user has, of course, access to the objects stored in the
table. What would happen if he changes them? The answer is that the
table would be in an inconsistent state, since something which should
be the return value of a function is now something else, and there's
no way for the table to know about that.

The solution is to have a "freeze" protocol. It may be called "frozen"
(like frozen(set([1,2,3]))), so that it will be clear that it does not
change the object itself. The definition of a frozen object is that
its value can't change - that is, if you compare it with another
object, you should get the same result as long as the other object
hasn't changed. As a rule, only frozen objects should be hashable.

I want to give another, different, use case for freezing objects. I
once thought about writing a graph package in Python - I mean a graph
with vertices and edges. The most obvious way to store a directed
graph is as a mapping (dict) from a node to the set of nodes that it
points to. Since I want to be able to find also which nodes point to a
specific node, I will store another mapping, from a node to the set of
nodes that point to it. Now, I want a method of the graph which will
return the set of nodes that a given node points to, for example to
let me write "if y in graph.adjacent_nodes(x) then". The question is,
what will the adjacent_nodes method return? If it returns the set
which is a part of the data structure, there is nothing (even no
convention!) that will prevent the user from playing with it. This
will corrupt the data structure, since the change won't be recorded in
the inverse mapping. adjacent_nodes could return a copy of the set, but
that's a waste if you only want to check whether an object is a member
of the set.
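
To make the graph example concrete, here is a minimal sketch of the two
mappings. The names (Graph, add_edge, adjacent_nodes, nodes_pointing_to)
are invented for illustration and do not come from an existing package.
This version takes the safe-but-wasteful route of copying the set on
every call:

```python
class Graph:
    """Directed graph kept as two mappings (hypothetical illustration)."""

    def __init__(self):
        self._outgoing = {}   # node -> set of nodes it points to
        self._incoming = {}   # node -> set of nodes that point to it

    def add_edge(self, src, dst):
        # Both mappings must be updated together to stay consistent.
        self._outgoing.setdefault(src, set()).add(dst)
        self._incoming.setdefault(dst, set()).add(src)
        # Make sure both endpoints exist as keys in both mappings.
        self._outgoing.setdefault(dst, set())
        self._incoming.setdefault(src, set())

    def adjacent_nodes(self, node):
        # Safe, but copies the whole set on every call.
        return frozenset(self._outgoing[node])

    def nodes_pointing_to(self, node):
        return frozenset(self._incoming[node])


g = Graph()
g.add_edge("x", "y")
neighbours = g.adjacent_nodes("x")
```

Returning self._outgoing[node] directly would avoid the copy, but a
caller could then add to the set without the inverse mapping ever
hearing about it.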

I gave this example to say that the "frozen" protocol should (when
possible) return an object which doesn't really contain a copy of the
data, but rather gives an "image" of the original object. If the
original object changes while there are frozen copies of it, the data
will be copied, and all the frozen objects will then reference a
version of the data that will never change again.

This will solve the graph problem nicely - adjacent_nodes would simply
return a frozen copy of the set, and a copy operation would happen
only in the rare cases when the returned set is being modified. This
would also help the container use cases: they may call the frozen()
method on objects that should be inserted into the container, and
usually the data won't be copied. Some objects can't be created in
their final form, but can only be constructed step after step. This
means that they must be non-frozen objects. Sometimes they are
constructed in order to get into a container. Unless the frozen()
method is copy-on-change the way I described, all the data would have
to be copied again, just for the commitment that it won't change.

I don't mean to frighten, but in principle, this may mean that
immutable strings might be introduced, which will allow us to get rid
of all the cStringIO workarounds. Immutable strings would be
constructed whenever they are needed, at a low performance cost
(remember that a frozen copy of a given object has to be constructed
only once - once it has been created, the same object can be returned
on additional frozen() calls.)

Copy-on-change of containers of non-frozen objects brings an additional
complication: frozen objects need a way of setting a callback that
will be called when the original object is changed.
This is because the change makes the container of the original object
change, so it must drop its own frozen copy. This needs to happen only
once per frozen object, since after a change, all the containers drop
their frozen copies. I think this callback is conceptually similar to
the weakref callback.

Just an example that copy-on-change (at least of containers of frozen
objects) is needed: sets. It was decided that you can test whether a
non-frozen set is a member of a set. I understand that it is done by
"temporarily freezing" the set, and that it caused some threading
issues. A copy-on-change mechanism might solve it more elegantly.

What do you think?

Noam

From martin at v.loewis.de  Sun Oct 30 00:53:53 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 30 Oct 2005 00:53:53 +0200
Subject: [Python-Dev] [Python-checkins] commit of r41352 -
 in	python/trunk: . Lib Lib/distutils Lib/distutils/command	Lib/encodings
In-Reply-To: <200510291550.12279.fdrake@acm.org>
References: <20051029194022.470D61E40B4@bag.python.org>
	<200510291550.12279.fdrake@acm.org>
Message-ID: <4363FD81.10403@v.loewis.de>

Fred L. Drake, Jr. wrote:
> Shouldn't we simply remove the .cvsignore files?  Subversion doesn't use them, 
> so they'll just end up getting out of sync with the svn:ignore properties.

That might be reasonable. I just noticed that it is convenient to do

svn propset svn:ignore -F .cvsignore .

Without a file, I wouldn't know how to edit the property, so I would
probably do

svn propget svn:ignore . > ignores
vim ignores
svn propset svn:ignore -F ignores .
rm ignores

Regards,
Martin

From solipsis at pitrou.net  Sun Oct 30 01:25:54 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 30 Oct 2005 01:25:54 +0200
Subject: [Python-Dev] svn:ignore
In-Reply-To: <4363FD81.10403@v.loewis.de>
References: <20051029194022.470D61E40B4@bag.python.org>
	<200510291550.12279.fdrake@acm.org>  <4363FD81.10403@v.loewis.de>
Message-ID: <1130628354.5945.24.camel@fsol>


Hi,

FWIW, I opened a bug report on Subversion some time ago so that patterns
like "*.pyc" and "*.pyo" are ignored by default in Subversion. Feel free
to add comments or vote for the bug:
http://subversion.tigris.org/issues/show_bug.cgi?id=2415

Regards

Antoine.



From noamraph at gmail.com  Sun Oct 30 01:32:41 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Sun, 30 Oct 2005 01:32:41 +0200
Subject: [Python-Dev] [Python-checkins] commit of r41352 - in
	python/trunk: . Lib Lib/distutils Lib/distutils/command
	Lib/encodings
In-Reply-To: <4363FD81.10403@v.loewis.de>
References: <20051029194022.470D61E40B4@bag.python.org>
	<200510291550.12279.fdrake@acm.org> <4363FD81.10403@v.loewis.de>
Message-ID: <b348a0850510291632l48855175y7e17c9c80bba19e6@mail.gmail.com>

> That might be reasonable. I just noticed that it is convenient to do
>
> svn propset svn:ignore -F .cvsignore .
>
> Without a file, I wouldn't know how to edit the property, so I would
> probably do
>
> svn propget svn:ignore . > ignores
> vim ignores
> svn propset svn:ignore -F ignores .
> rm ignores
>

Won't "svn propedit svn:ignore ." do the trick?

Noam

From pinard at iro.umontreal.ca  Sun Oct 30 02:16:11 2005
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Sat, 29 Oct 2005 20:16:11 -0400
Subject: [Python-Dev] [Python-checkins] commit of r41352 -
	in	python/trunk: . Lib Lib/distutils
	Lib/distutils/command	Lib/encodings
In-Reply-To: <4363FD81.10403@v.loewis.de>
References: <20051029194022.470D61E40B4@bag.python.org>
	<200510291550.12279.fdrake@acm.org> <4363FD81.10403@v.loewis.de>
Message-ID: <20051030001611.GA22474@phenix.sram.qc.ca>

[Martin von Löwis]

>Without a file, I wouldn't know how to edit the property, so I would
>probably do

>svn propget svn:ignore . > ignores
>vim ignores
>svn propset svn:ignore -F ignores .
>rm ignores

You can use `svn propedit' (or `svn pe').

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From jcarlson at uci.edu  Sun Oct 30 02:34:12 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 29 Oct 2005 17:34:12 -0700
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <b348a0850510291342y25f0a3d9m5be75afa908c9e55@mail.gmail.com>
References: <1130107429.11268.40.camel@geddy.wooz.org>
	<b348a0850510291342y25f0a3d9m5be75afa908c9e55@mail.gmail.com>
Message-ID: <20051029164637.39E1.JCARLSON@uci.edu>


Noam Raphael <noamraph at gmail.com> wrote:
> 
> Hello,
> 
> I have thought about freezing for some time, and I think that it is a
> fundamental need - the need to know, sometimes, that objects aren't
> going to change.

I agree with this point.

> This is mostly the need of containers. dicts need to know that the
> objects that are used as keys aren't going to change, because if they
> change, their hash value changes, and you end up with a data structure
> in an inconsistent state. This is the need of sets too, and of heaps,
> and binary trees, and so on.

You are exactly mirroring the sentiments of the PEP.

> I want to give another example: my colleagues and I designed something
> which can be described as an "electronic spreadsheet in Python". We
> called it a "table". The values in the table are Python objects, and
> the functions which relate them are written in Python. Then comes the
> problem: the user has, of course, access to the objects stored in the
> table. What would happen if he changes them? The answer is that the
> table would be in an inconsistent state, since something which should
> be the return value of a function is now something else, and there's
> no way for the table to know about that.

I respectfully disagree with this point and the rest of your email. Why?
For two use-cases, you offer 'tables of values' and 'graphs', as well as
a possible solution to the 'problem'; copy on write.

In reading your description of a 'table of values', I can't help but be
reminded of the wxPython (and wxWidget) wx.Grid and its semantics.  It
offers arbitrary tables of values (whose editors and viewers you can
change at will), which offers a mechanism by which you can "listen" to
changes that occur to the contents of a cell.  I can't help but think
that if you offered a protocol by which a user can signal that a cell
has been changed, perhaps by writing the value to the table itself
(table.SetValue(row, col, value)), with every read returning a deepcopy
(or a PEP 351 freeze), etc., both you and the users of your table would
be much happier.

As for the graph issue, you've got a bigger problem than users just
being able to edit edge lists, users can clear the entire dictionary of
vertices (outgoing.clear()).  It seems to me that a more reasonable
method to handle this particular case is to tell your users "don't
modify the dictionaries or the edge lists", and/or store your edge lists
as tuples instead of lists or dictionaries, and/or use an immutable
dictionary (as offered by Barry in the PEP).


There's also this little issue of "copy on write" semantics with Python. 
Anyone who tells you that "copy on write" is easy, is probably hanging
out with the same kind of people who say that "threading is easy".  Of
course both are easy if you limit your uses to some small subset of
interesting interactions, but "copy on write" gets far harder when you
start thinking of dictionaries, lists, StringIOs, arrays, and all the
possible user-defined classes, which may be mutated beyond obj[key] =
value and/or obj.attr = value (some have obj.method() which mutates the
object). As such, offering a callback mechanism similar to weak
references is probably pretty close to impossible with CPython.


One of the reasons why I liked the freeze protocol is that it offered a
simple mechanism by which Python could easily offer support, for both
new and old objects alike.  Want an example?  Here's the implementation
for array freezing: tuple(a).  What about lists?  tuple(map(freeze, lst)) 
Freezing may not ultimately be the right solution for everything, but it
is a simple solution which handles the majority of cases.  Copy on write
in Python, on the other hand, is significantly harder to implement,
support, and is probably not the right solution for many problems.
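
A rough sketch of what such a recursive freeze could look like as a
standalone function. This is an illustration only, not the PEP 351
implementation (the PEP defines a __freeze__ protocol), and the dict
handling in particular is a stand-in assumption:

```python
def freeze(obj):
    """Return an immutable, hashable stand-in for obj (illustrative only)."""
    if isinstance(obj, (list, tuple)):
        # Freeze the contents too, otherwise the result is still mutable inside.
        return tuple(freeze(item) for item in obj)
    if isinstance(obj, (set, frozenset)):
        return frozenset(freeze(item) for item in obj)
    if isinstance(obj, dict):
        # Assumption: with no built-in immutable dict, represent it as a
        # frozenset of (key, frozen value) pairs.
        return frozenset((freeze(k), freeze(v)) for k, v in obj.items())
    hash(obj)  # raises TypeError for anything we do not know how to freeze
    return obj


frozen_list = freeze([1, [2, 3]])
```

The point is only that each container type needs a few lines; no
change tracking or callbacks are involved.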


 - Josiah

P.S. To reiterate to Barry: map freeze to the contents of containers,
otherwise the object can still be modified, and hence is not frozen.


From martin at v.loewis.de  Sun Oct 30 12:06:15 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 30 Oct 2005 12:06:15 +0100
Subject: [Python-Dev] [Python-checkins] commit of r41352 - in
 python/trunk: . Lib Lib/distutils Lib/distutils/command Lib/encodings
In-Reply-To: <b348a0850510291632l48855175y7e17c9c80bba19e6@mail.gmail.com>
References: <20051029194022.470D61E40B4@bag.python.org>	
	<200510291550.12279.fdrake@acm.org> <4363FD81.10403@v.loewis.de>
	<b348a0850510291632l48855175y7e17c9c80bba19e6@mail.gmail.com>
Message-ID: <4364A927.5040209@v.loewis.de>

Noam Raphael wrote:
> Won't "svn propedit svn:ignore ." do the trick?

It certainly would. Thanks for pointing that out.

Regards,
Martin

From skip at pobox.com  Sun Oct 30 14:04:22 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 30 Oct 2005 07:04:22 -0600
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com>
References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
	<17248.52771.225830.484931@montanaro.dyndns.org>
	<43610C36.2030500@v.loewis.de>
	<1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com>
Message-ID: <17252.50390.256221.4882@montanaro.dyndns.org>


    Tim> Excellent suggestions!  I have a few to pass on:

    ...

Tim,

Thanks for the tips.  As a new svn user myself, I find these helpful.

These are precisely the things the Wiki would be good for.  They don't
prescribe policy.  They help people in a general way to migrate from cvs to
svn more easily.  Anyone with cvs and svn experience, but without the
ability to check stuff into the pydotorg repository could contribute.

Skip

From skip at pobox.com  Sun Oct 30 14:29:25 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 30 Oct 2005 07:29:25 -0600
Subject: [Python-Dev] [Python-checkins] commit of r41352 - in
 python/trunk: . Lib Lib/distutils Lib/distutils/command Lib/encodings
In-Reply-To: <200510291550.12279.fdrake@acm.org>
References: <20051029194022.470D61E40B4@bag.python.org>
	<200510291550.12279.fdrake@acm.org>
Message-ID: <17252.51893.314977.306457@montanaro.dyndns.org>


    Fred> Shouldn't we simply remove the .cvsignore files?  Subversion
    Fred> doesn't use them, so they'll just end up getting out of sync with
    Fred> the svn:ignore properties.

Is there some equivalent?  If so, can we convert the .cvsignore files before
deleting them?

Skip

From skip at pobox.com  Sun Oct 30 16:36:43 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 30 Oct 2005 09:36:43 -0600
Subject: [Python-Dev] svn checksum error
Message-ID: <17252.59531.252751.768301@montanaro.dyndns.org>

I tried "svn up" to bring my sandbox up-to-date and got this output:

    % svn up
    U    Include/unicodeobject.h
    subversion/libsvn_wc/update_editor.c:1609: (apr_err=155017)
    svn: Checksum mismatch for 'Objects/.svn/text-base/unicodeobject.c.svn-base'; expected: '8611dc5f592e7cbc6070524a1437db9b', actual: '2d28838f2fec366fc58386728a48568e'

What's that telling me?

Thx,

Skip

From skip at pobox.com  Sun Oct 30 16:38:45 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 30 Oct 2005 09:38:45 -0600
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <17252.50390.256221.4882@montanaro.dyndns.org>
References: <435BC27C.1010503@v.loewis.de> <2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
	<17248.52771.225830.484931@montanaro.dyndns.org>
	<43610C36.2030500@v.loewis.de>
	<1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com>
	<17252.50390.256221.4882@montanaro.dyndns.org>
Message-ID: <17252.59653.792906.582288@montanaro.dyndns.org>

    Tim> Excellent suggestions!  I have a few to pass on:

    skip> These are precisely the things the Wiki would be good for.

I went ahead and used Tim's note as the basis for a page on the wiki:

    http://wiki.python.org/moin/CvsToSvn

It's linked from the PythonDevelopers page (a page of previously dubious
necessity).

Skip

From fredrik at pythonware.com  Sun Oct 30 17:58:01 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun, 30 Oct 2005 17:58:01 +0100
Subject: [Python-Dev] svn checksum error
References: <17252.59531.252751.768301@montanaro.dyndns.org>
Message-ID: <dk2u2r$fca$1@sea.gmane.org>

skip at pobox.com wrote:

> I tried "svn up" to bring my sandbox up-to-date and got this output:
>
>     % svn up
>     U    Include/unicodeobject.h
>     subversion/libsvn_wc/update_editor.c:1609: (apr_err=155017)
>     svn: Checksum mismatch for 'Objects/.svn/text-base/unicodeobject.c.svn-base'; expected: '8611dc5f592e7cbc6070524a1437db9b', actual: '2d28838f2fec366fc58386728a48568e'
>
> What's that telling me?

"welcome to the wonderful world of subversion error messages"

(from what I can tell, the message means that SVN thinks that there might
have been some checksum error somewhere, or some other error at a point
where subversion thinks it's likely that a checksum was involved; to figure
out what's really causing this problem, you probably need a debug build of
subversion).

deleting the offending directory and doing "svn up" is the easiest way to fix
this.

</F>




From martin at v.loewis.de  Sun Oct 30 23:03:54 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 30 Oct 2005 23:03:54 +0100
Subject: [Python-Dev] svn:ignore (Was: [Python-checkins] commit of
 r41352 - in python/trunk: . Lib Lib/distutils Lib/distutils/command
 Lib/encodings)
In-Reply-To: <17252.51893.314977.306457@montanaro.dyndns.org>
References: <20051029194022.470D61E40B4@bag.python.org>
	<200510291550.12279.fdrake@acm.org>
	<17252.51893.314977.306457@montanaro.dyndns.org>
Message-ID: <4365434A.5030808@v.loewis.de>

skip at pobox.com wrote:
>     Fred> Shouldn't we simply remove the .cvsignore files?  Subversion
>     Fred> doesn't use them, so they'll just end up getting out of sync with
>     Fred> the svn:ignore properties.
> 
> Is there some equivalent?  If so, can we convert the .cvsignore files before
> deleting them?

cvs2svn has already converted them automatically - to svn:ignore 
properties; try

svn propget svn:ignore Doc

(assuming . is the current directory).

I have now deleted all .cvsignore files in the trunk
in revision 41357 (yay, giving a single number for
a multi-file delete operation feels good :-)

Regards,
Martin

From martin at v.loewis.de  Sun Oct 30 23:25:28 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 30 Oct 2005 23:25:28 +0100
Subject: [Python-Dev] Freezing the CVS on Oct 26 for SVN switchover
In-Reply-To: <17252.59653.792906.582288@montanaro.dyndns.org>
References: <435BC27C.1010503@v.loewis.de>
	<2mbr1g6loh.fsf@starship.python.net>
	<e8bf7a530510270523g4a3bef5fk1dd5e8e016d9aa1a@mail.gmail.com>
	<17248.52771.225830.484931@montanaro.dyndns.org>
	<43610C36.2030500@v.loewis.de>
	<1f7befae0510281829n20ae2936pbc9f923da807bf6a@mail.gmail.com>
	<17252.50390.256221.4882@montanaro.dyndns.org>
	<17252.59653.792906.582288@montanaro.dyndns.org>
Message-ID: <43654858.9020108@v.loewis.de>

skip at pobox.com wrote:
> I went ahead and used Tim's note as the basis for a page on the wiki:
> 
>     http://wiki.python.org/moin/CvsToSvn
> 
> It's linked from the PythonDevelopers page (a page of previously dubious
> necessity).

I have pretty much the same reservations against Wikis as Brett does;
it seems more productive if people would just use python-dev to ask
questions of the "how do I" kind (and probably of the "do I really need
to" kind as well).

I don't mind somebody collecting this information into whatever place
more permanent and accessible than a mailing list archive; I think
I would normally add them to the developer FAQ instead of to the
Wiki, primarily because I can memorize the location of the FAQ better.

Regards,
Martin

From martin at v.loewis.de  Sun Oct 30 23:43:51 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 30 Oct 2005 23:43:51 +0100
Subject: [Python-Dev] svn checksum error
In-Reply-To: <17252.59531.252751.768301@montanaro.dyndns.org>
References: <17252.59531.252751.768301@montanaro.dyndns.org>
Message-ID: <43654CA7.8030200@v.loewis.de>

skip at pobox.com wrote:
> I tried "svn up" to bring my sandbox up-to-date and got this output:
> 
>     % svn up
>     U    Include/unicodeobject.h
>     subversion/libsvn_wc/update_editor.c:1609: (apr_err=155017)
>     svn: Checksum mismatch for 'Objects/.svn/text-base/unicodeobject.c.svn-base'; expected: '8611dc5f592e7cbc6070524a1437db9b', actual: '2d28838f2fec366fc58386728a48568e'
> 
> What's that telling me?

At the shallow level, the message should be clear: there is an actual
checksum for a file and an expected checksum, and they differ. They
shouldn't differ.

Somewhat deeper, this indicates a bug in Subversion. It's not clear
to me whether this is a client or a server bug. In the version on
svn.python.org, the error message is on line 2846, so I would suspect
it is a client bug.

The natural question then is: what operating system, what subversion
version are you using?

Regards,
Martin

From ejones at uwaterloo.ca  Mon Oct 31 00:19:41 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Sun, 30 Oct 2005 18:19:41 -0500
Subject: [Python-Dev] Parser and Runtime: Divorced!
In-Reply-To: <03b7f74aebe5c6249a8bb00ac17d1952@uwaterloo.ca>
References: <03b7f74aebe5c6249a8bb00ac17d1952@uwaterloo.ca>
Message-ID: <84c355f24dfa73224073d897c38edd44@uwaterloo.ca>

On Oct 26, 2005, at 20:02, Evan Jones wrote:
> In the process of doing this, I came across a comment mentioning that
> it would be desirable to separate the parser. Is there any interest in
> doing this? I now have a vague idea about how to do this. Of course,
> there is no point in making changes like this unless there is some
> tangible benefit.

I am going to assume that, since no one was excited about my post, 
the answer is: no, there is no interest in separating the parser from 
the rest of the Python run time.

At any rate, if anyone is looking for a standalone C Python parser 
library, you can get it at the following URL. It includes a "print the 
tree" example that displays the AST for a specified file. It only 
supports a subset of the parse tree (assignment, functions, print, 
return), but it should be obvious how it could be extended.

http://evanjones.ca/software/pyparser.html

Evan Jones

--
Evan Jones
http://evanjones.ca/


From noamraph at gmail.com  Mon Oct 31 00:35:05 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Mon, 31 Oct 2005 01:35:05 +0200
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <20051029164637.39E1.JCARLSON@uci.edu>
References: <1130107429.11268.40.camel@geddy.wooz.org>
	<b348a0850510291342y25f0a3d9m5be75afa908c9e55@mail.gmail.com>
	<20051029164637.39E1.JCARLSON@uci.edu>
Message-ID: <b348a0850510301535id4588e6g406d2b3cc7673c56@mail.gmail.com>

Hello,

It seems that we both agree that freezing is cool (-; . We disagree on
whether a copy-on-write behaviour is desired. Your arguments against
copy-on-write are:
1. It's not needed.
2. It's complicated to implement.

But first of all, you didn't like my use cases. I want to argue with that.

> In reading your description of a 'table of values', I can't help but be
> reminded of the wxPython (and wxWidget) wx.Grid and its semantics.  It
> offers arbitrary tables of values (whose editors and viewers you can
> change at will), which offers a mechanism by which you can "listen" to
> changes that occur to the contents of a cell.  I can't help but think
> that if you offered a protocol by which a user can signal that a cell
> has been changed, perhaps by writing the value to the table itself
> (table.SetValue(row, col, value)), with every read returning a deepcopy
> (or a PEP 351 freeze), etc., both you and the users of your table would
> be much happier.

Perhaps I didn't make it clear. The difference between wxPython's Grid
and my table is that in the table, most values are *computed*. This
means that there's no point in changing the values themselves. They
are also used frequently as set members (I can describe why, but it's
a bit complicated.)

I want to say that even if sets weren't used, the objects in the table
should have been frozen. The fact that sets (and dicts) only allow
immutable objects as members/keys is just for protecting the user.
They could have declared, "you shouldn't change anything you insert -
as long as you don't, we'll function properly." The only reason why
you can't compute hash values of mutable objects is that you don't
want your user to make mistakes, and make the data structure
inconsistent.

> As for the graph issue, you've got a bigger problem than users just
> being able to edit edge lists, users can clear the entire dictionary of
> vertices (outgoing.clear()).  It seems to me that a more reasonable
> method to handle this particular case is to tell your users "don't
> modify the dictionaries or the edge lists", and/or store your edge lists
> as tuples instead of lists or dictionaries, and/or use an immutable
> dictionary (as offered by Barry in the PEP).

As I wrote before, telling my users "don't modify the edge lists" is
just like making lists hashable, and telling all Python users, "don't
modify lists that are dictionary keys." There's no way to tell the
users that - there's no convention for objects which should not be
changed. You can write it in the documentation, but who'll bother
looking there?

I don't think that your other suggestions will work: the data
structure of the graph itself can't be made of immutable objects,
because of the fact that the graph is a mutable object - you can
change it. It can be made of immutable objects, but this means copying
all the data every time the graph changes.


Now, about copy-on-write:

> There's also this little issue of "copy on write" semantics with Python.
> Anyone who tells you that "copy on write" is easy, is probably hanging
> out with the same kind of people who say that "threading is easy".  Of
> course both are easy if you limit your uses to some small subset of
> interesting interactions, but "copy on write" gets far harder when you
> start thinking of dictionaries, lists, StringIOs, arrays, and all the
> possible user-defined classes, which may be mutated beyond obj[key] =
> value and/or obj.attr = value (some have obj.method() which mutates the
> object). As such, offering a callback mechanism similar to weak
> references is probably pretty close to impossible with CPython.

Let's limit ourselves to copy-on-write of objects which do not contain
nonfrozen objects. Perhaps it's enough - the table, the graph, and
strings, are perfect examples of these. Implementation doesn't seem too
complicated to me - whenever the object is about to change, and there
is a connected frozen copy, you make a shallow copy of the object,
point the frozen copy to it, release the reference to the frozen copy,
and continue as usual. That's all.
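
Here is a small sketch of those steps for a set-like object. CowSet and
FrozenView are hypothetical names made up for this message, and the
sketch assumes the members themselves are already hashable:

```python
class FrozenView:
    """Read-only view that shares data with its source until the source changes."""

    def __init__(self, data):
        self._data = data  # shared with the owner; never mutated through the view

    def __contains__(self, item):
        return item in self._data

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __eq__(self, other):
        return isinstance(other, FrozenView) and self._data == other._data

    def __hash__(self):
        return hash(frozenset(self._data))


class CowSet:
    """Mutable set whose frozen() is O(1) until the next modification."""

    def __init__(self, items=()):
        self._data = set(items)
        self._view = None  # the connected frozen copy, if any

    def frozen(self):
        if self._view is None:
            self._view = FrozenView(self._data)  # no copy yet
        return self._view

    def _detach(self):
        # About to change: give the frozen copy its own shallow copy,
        # then release our reference to it.
        if self._view is not None:
            self._view._data = set(self._data)
            self._view = None

    def add(self, item):
        self._detach()
        self._data.add(item)

    def discard(self, item):
        self._detach()
        self._data.discard(item)

    def __contains__(self, item):
        return item in self._data


s = CowSet([1, 2])
snapshot = s.frozen()   # shares the underlying set
s.add(3)                # the copy happens here, once
```

The copy is paid only by code that modifies the set after handing out a
frozen view; pure readers never trigger it.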

I really think that this kind of copy-on-write is "correct". The
temporary freezing of sets in order to check if they are members of
other sets is a not-very-nice way of implementing it. This kind of
copy-on-write would allow, in principle, for Python strings to become
mutable, with almost no speed penalty. It would allow my table, and
other containers, to automatically freeze the objects that get into
it, without having to trust the user on not changing the objects - and
remember that there's no way of *telling* him not to change the
objects.

Now, the computer scientist in me wants to explain (and think about)
freezing containers of nonfrozen objects. What I actually want is that
as long as an object doesn't change after it's frozen, the cost of
freezing would be nothing - that is, O(1). Think about a mutable
string object, which is used in the same way as the current, immutable
strings. It is constructed once, and then may be used as a key in a
dictionary many times. I want to claim that it's a common pattern -
create an object, it doesn't matter how, and then use it without
changing it. If that is the case, it's obvious that all the frozen()
calls would take O(1) each.

How can we accomplish this (freezing costs O(1) as long as the object
doesn't change) with containers of nonfrozen objects? It seems
impossible - no matter what, the first time the container is
frozen, you would have to call frozen() for every object it contains!
The answer is that in an amortized analysis, it is still an O(1)
operation. The reason is that as long as frozen() takes O(1)
(amortized), all those calls to frozen() can be considered a part of
the object construction, since they are made only once - on the next
call to frozen(), the already-created frozen object would be returned.
This analysis is correct as long as the object doesn't change after
it's frozen.

The problem is that we have to keep the created frozen object as long
as the original object stays alive. So we have to know if it has
changed. This is where those callbacks come in. As long as what is done
with them is correct, there should be no problems. They are used only
to disengage the frozen copies from their original objects. The action
they should trigger is simply that:

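def on_contained_object_change(self):
    # Drop the cached frozen copy; it no longer matches the contents.
    self._frozen_copy = None
    # Notify each enclosing container exactly once, then forget it.
    while self._callbacks:
        self._callbacks.pop()()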

What's also interesting is that this freezing mechanism can be
provided automatically for user-created classes, since those are
simply containers of other objects, which behave exactly like dicts,
for this matter.
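
To show that method in context, here is a sketch of a container of
nonfrozen objects that caches its frozen copy. All the names
(ChangeNotifier, Cell, Row, _watched) are hypothetical, and tuples
stand in for real frozen objects:

```python
class ChangeNotifier:
    """Caches a frozen copy and fires one-shot callbacks when it goes stale."""

    def __init__(self):
        self._frozen_copy = None
        self._callbacks = []

    def _invalidate(self):
        # Same shape as on_contained_object_change above.
        self._frozen_copy = None
        while self._callbacks:
            self._callbacks.pop()()


class Cell(ChangeNotifier):
    """A mutable leaf value."""

    def __init__(self, value):
        super().__init__()
        self.value = value

    def set(self, value):
        self.value = value
        self._invalidate()

    def frozen(self, callback=None):
        if callback is not None:
            self._callbacks.append(callback)
        if self._frozen_copy is None:
            self._frozen_copy = (self.value,)
        return self._frozen_copy


class Row(ChangeNotifier):
    """A container of Cells; its frozen copy is rebuilt only after a change."""

    def __init__(self, cells):
        super().__init__()
        self._cells = list(cells)
        self._watched = set()  # ids of cells that already hold our callback

    def _cell_changed(self, cell):
        self._watched.discard(id(cell))
        self._invalidate()

    def frozen(self, callback=None):
        if callback is not None:
            self._callbacks.append(callback)
        if self._frozen_copy is None:
            parts = []
            for cell in self._cells:
                if id(cell) in self._watched:
                    parts.append(cell.frozen())
                else:
                    # Register once per cell, so callbacks do not pile up.
                    self._watched.add(id(cell))
                    parts.append(cell.frozen(lambda c=cell: self._cell_changed(c)))
            self._frozen_copy = tuple(parts)
        return self._frozen_copy


a, b = Cell(1), Cell(2)
row = Row([a, b])
first = row.frozen()    # built once
a.set(10)               # callback fires, row drops its cached copy
second = row.frozen()   # rebuilt; b's frozen copy is reused
```

Repeated frozen() calls return the cached object until some contained
cell changes, which is the O(1) behaviour argued for above.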

It allows everything in Python to be both mutable and hashable,
without changing the O() complexity! Wow!

Ok, I'm going to sleep now. If you find something wrong with this
idea, please tell me.

Have a good day,
Noam

From gustavo at niemeyer.net  Mon Oct 31 00:37:57 2005
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Sun, 30 Oct 2005 21:37:57 -0200
Subject: [Python-Dev] StreamHandler eating exceptions
Message-ID: <20051030233757.GB8344@localhost.localdomain>

The StreamHandler available under the logging package is currently
catching all exceptions under the emit() method call. In the
Handler.handleError() documentation it's mentioned that it's
implemented like that because users do not care about errors
in the logging system.

I'd like to apply the following patch:

Index: Lib/logging/__init__.py
===================================================================
--- Lib/logging/__init__.py     (revision 41357)
+++ Lib/logging/__init__.py     (working copy)
@@ -738,6 +738,8 @@
                 except UnicodeError:
                     self.stream.write(fs % msg.encode("UTF-8"))
             self.flush()
+        except KeyboardInterrupt:
+            raise
         except:
             self.handleError(record)


Anyone against the change?

-- 
Gustavo Niemeyer
http://niemeyer.net

From solipsis at pitrou.net  Mon Oct 31 00:46:07 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 31 Oct 2005 00:46:07 +0100
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <b348a0850510301535id4588e6g406d2b3cc7673c56@mail.gmail.com>
References: <1130107429.11268.40.camel@geddy.wooz.org>
	<b348a0850510291342y25f0a3d9m5be75afa908c9e55@mail.gmail.com>
	<20051029164637.39E1.JCARLSON@uci.edu>
	<b348a0850510301535id4588e6g406d2b3cc7673c56@mail.gmail.com>
Message-ID: <1130715967.6180.77.camel@fsol>


> It allows everything in Python to be both mutable and hashable,

I don't understand, since it's already the case. Any user-defined object
is at the same time mutable and hashable.
And if you want the hash value to follow the changes in attribute
values, just define an appropriate __hash__ method.

Regards

Antoine.



From skip at pobox.com  Mon Oct 31 02:08:22 2005
From: skip at pobox.com (skip@pobox.com)
Date: Sun, 30 Oct 2005 19:08:22 -0600
Subject: [Python-Dev] svn checksum error
In-Reply-To: <43654CA7.8030200@v.loewis.de>
References: <17252.59531.252751.768301@montanaro.dyndns.org>
	<43654CA7.8030200@v.loewis.de>
Message-ID: <17253.28294.538932.570903@montanaro.dyndns.org>

    Martin> The natural question then is: what operating system, what
    Martin> subversion version are you using?

Sorry, wasn't thinking in terms of svn bugs.  I was anticipating some sort
of obvious pilot error.  I am on Mac OSX 10.3.9, running svn 1.1.3 I built
from source back in the May timeframe.  Should I upgrade to 1.2.3 as a
matter of course?

    Fredrik> "welcome to the wonderful world of subversion error messages"
    ...
    Fredrik> deleting the offending directory and doing "svn up" is the
    Fredrik> easiest way to fix this.

Thanks.  I zapped Objects.  The next svn up complained about Misc.  The next
about Lib.  After that, the next svn up ran to completion.

Skip

From solipsis at pitrou.net  Mon Oct 31 02:17:50 2005
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 31 Oct 2005 02:17:50 +0100
Subject: [Python-Dev] svn checksum error
In-Reply-To: <17253.28294.538932.570903@montanaro.dyndns.org>
References: <17252.59531.252751.768301@montanaro.dyndns.org>
	<43654CA7.8030200@v.loewis.de>
	<17253.28294.538932.570903@montanaro.dyndns.org>
Message-ID: <1130721470.6180.87.camel@fsol>

On Sunday 30 October 2005 at 19:08 -0600, skip at pobox.com wrote:
> Sorry, wasn't thinking in terms of svn bugs.  I was anticipating some sort
> of obvious pilot error.  I am on Mac OSX 10.3.9, running svn 1.1.3 I built
> from source back in the May timeframe.  Should I upgrade to 1.2.3 as a
> matter of course?

IIRC, the version provided by Fink works fine. No need to compile
manually.




From pinard at iro.umontreal.ca  Mon Oct 31 03:25:54 2005
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Sun, 30 Oct 2005 21:25:54 -0500
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <43638BC0.40108@v.loewis.de>
References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
	<4362A44F.9010506@v.loewis.de>
	<20051029110331.D5AA.ISHIMOTO@gembook.org>
	<4363395A.3040606@v.loewis.de> <1130589142.5945.11.camel@fsol>
	<43638BC0.40108@v.loewis.de>
Message-ID: <20051031022554.GA20255@alcyon.progiciels-bpi.ca>

[Martin von Löwis]

> My canonical example is François Pinard, who keeps requesting it, 
> saying that local people were surprised they couldn't use accented 
> characters in Python.  Perhaps that's because he actually is Quebecian 
> :-)

I presume I should comment a bit on this.

People here are not "surprised" they couldn't use accented characters, 
they are rather saddened, and some hoped that Python would offer that
possibility, one of these days.  Bear in mind that here, every production 
program or system has been progressively rewritten in Python, slowly at 
first, and more aggressively while the confidence was building up, to 
the point that not much of the non-Python things remain by now.  So, all our 
hopes are concentrated into a single language.

All development is done in house by French people.  All documentation, 
external or internal, comments, identifier and function names, 
everything is in French.  Some of the developers here have had a long 
programming life, while they only barely read English.  It is surely 
a constant frustration, for some of us, having to mangle identifiers by 
ravelling out their necessary diacritics.  It does not look good, it 
does not smell good, and in many cases, mangling identifiers 
significantly decreases program legibility.

Now, I keep reading strange arguments from people opposing that we use 
national letters in identifiers, disturbed by the fact they would have 
a hard time reading our code or publishing it.  Even worse, some want to 
protect us (and the world) against ourselves, using made up, irrational 
arguments, producing false logic out of their own emotions and feelings.  
They would like us to think, write, and publish in English.  Is it some 
anachronistic colonialism?  Quite possible.  It surely has some success, 
as you may find some French people that will only swear in English! :-)

For one, in my programming life, I surely chose to write a lot of 
English code, and I still think English is a good vehicle to planetary 
communication.  However, I like it to be my choice.  I have always felt 
very open and collaborative with similarly minded people, and for them, 
happily rewrote my things from French to English in view of sharing, 
whenever I saw some mutual advantage to it.

I resent when people want to force me into English when I have no real 
reason to do so.  Let me choose to use my own language, as nicely as 
I can, when working in-shop with people sharing this language with me, 
for programs that will likely never be published outside anyway.  
Internationalisation is already granted in our overall view of today's
programming, as a way for letting people be comfortable with computers, 
each in his/her own language.  This comfort should extend widely to 
naming main programming objects (functions, classes, variables, modules) 
as legibly as possible.  Here, I mean legible in an ideal way for the 
team or the local community, and not necessarily legible to the whole 
planet.  It does not always have to be planetary, you know.

For keywords, the need is less stringent, as syntactical constructs are 
part of a language.  When English is opaque to a programmer, he/she can 
easily learn that small set of words making the syntax, understanding 
their effect, even while not necessarily understanding the real English 
meaning of those keywords.  This is not a real obstacle in practice.

It is true that many Python tools are not prepared to handle 
internationalised identifiers, and it is very unlikely that these tools 
will get ready before Python opens itself to internationalised 
identifiers.  Let's open Python first, tools will undoubtedly follow.
There will be some adaptation period, but after some while, everything 
will fall in place, things will become smooth again and just natural to 
everybody, to the point many of us might remember the current times, and 
wonder what was all that fuss about.  :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca

From rhamph at gmail.com  Mon Oct 31 04:21:29 2005
From: rhamph at gmail.com (Adam Olsen)
Date: Sun, 30 Oct 2005 20:21:29 -0700
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <20051031022554.GA20255@alcyon.progiciels-bpi.ca>
References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>
	<4362A44F.9010506@v.loewis.de>
	<20051029110331.D5AA.ISHIMOTO@gembook.org>
	<4363395A.3040606@v.loewis.de> <1130589142.5945.11.camel@fsol>
	<43638BC0.40108@v.loewis.de>
	<20051031022554.GA20255@alcyon.progiciels-bpi.ca>
Message-ID: <aac2c7cb0510301921t697f2547v9f0a2d38145854ca@mail.gmail.com>

On 10/30/05, François Pinard <pinard at iro.umontreal.ca> wrote:
> All development is done in house by French people.  All documentation,
> external or internal, comments, identifier and function names,
> everything is in French.  Some of the developers here have had a long
> programming life, while they only barely read English.  It is surely
> a constant frustration, for some of us, having to mangle identifiers by
> ravelling out their necessary diacritics.  It does not look good, it
> does not smell good, and in many cases, mangling identifiers
> significantly decreases program legibility.

Hear, hear!  Not all the world uses English, and restricting them to
Latin characters simply means it's not readable in any language.  It
doesn't make it any more readable for those of us who only understand
English.

+1 on internationalized identifiers.

--
Adam Olsen, aka Rhamphoryncus

From fdrake at acm.org  Mon Oct 31 06:16:18 2005
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 31 Oct 2005 01:16:18 -0400
Subject: [Python-Dev] StreamHandler eating exceptions
In-Reply-To: <20051030233757.GB8344@localhost.localdomain>
References: <20051030233757.GB8344@localhost.localdomain>
Message-ID: <200510310016.19467.fdrake@acm.org>

On Sunday 30 October 2005 18:37, Gustavo Niemeyer wrote:
 > I'd like to apply the following patch:

+1

Might want to include SystemExit as well, though I think that's less likely to 
be seen in practice.


  -Fred

-- 
Fred L. Drake, Jr.   <fdrake at acm.org>

From jcarlson at uci.edu  Mon Oct 31 06:09:17 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 30 Oct 2005 22:09:17 -0700
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <b348a0850510301535id4588e6g406d2b3cc7673c56@mail.gmail.com>
References: <20051029164637.39E1.JCARLSON@uci.edu>
	<b348a0850510301535id4588e6g406d2b3cc7673c56@mail.gmail.com>
Message-ID: <20051030202958.39FD.JCARLSON@uci.edu>


Noam Raphael <noamraph at gmail.com> wrote:
> 
> Hello,
> 
> It seems that we both agree that freezing is cool (-; . We disagree on
> whether a copy-on-write behaviour is desired. Your arguments against
> copy-on-write are:
> 1. It's not needed.
> 2. It's complicated to implement.
> 
> But first of all, you didn't like my use cases. I want to argue with that.
> 
> > In reading your description of a 'table of values', I can't help but be
> > reminded of the wxPython (and wxWidget) wx.Grid and its semantics.  It
> > offers arbitrary tables of values (whose editors and viewers you can
> > change at will), which offers a mechanism by which you can "listen" to
> > changes that occur to the contents of a cell.  I can't help but think
> > that if you offered a protocol by which a user can signal that a cell
> > has been changed, perhaps by writing the value to the table itself
> > (table.SetValue(row, col, value)), every read a deepcopy (or a PEP 351
> > freeze), etc., that both you and the users of your table would be much
> > happier.
> 
> Perhaps I didn't make it clear. The difference between wxPython's Grid
> and my table is that in the table, most values are *computed*. This
> means that there's no point in changing the values themselves. They
> are also used frequently as set members (I can describe why, but it's
> a bit complicated.)

Again, user semantics.  Tell your users not to modify entries, and/or
you can make copies of objects you return.  If your users are too daft
to read and/or follow directions, then they deserve to have their
software not work.

Also from the sounds of it, you are storing both source and destination
values in the same table...hrm, that sounds quite a bit like a
spreadsheet.  How does every spreadsheet handle that again?  Oh yeah,
they only ever store immutables (generally strings which are interpreted). 
But I suppose since you are (of course) storing mutable objects, you
need to work a bit harder...so store mutables, and return immutable
copies (which you can cache if you want, and invalidate when your
application updates the results...like a wx.Grid update on changed).


> > As for the graph issue, you've got a bigger problem than users just
> > being able to edit edge lists, users can clear the entire dictionary of
> > vertices (outgoing.clear()).  It seems to me that a more reasonable
> > method to handle this particular case is to tell your users "don't
> > modify the dictionaries or the edge lists", and/or store your edge lists
> > as tuples instead of lists or dictionaries, and/or use an immutable
> > dictionary (as offered by Barry in the PEP).
> 
> As I wrote before, telling my users "don't modify the edge lists" is
> just like making lists hashable, and telling all Python users, "don't
> modify lists that are dictionary keys." There's no way to tell the
> users that - there's no convention for objects which should not be
> changed. You can write it in the documentation, but who'll bother
> looking there?

When someone complains that something doesn't work, I tell them to read
the documentation.  If your users haven't been told to RTFM often enough
to actually make it happen, then you need an RTFM-bat. Want to know how
you make one?  You start wrapping the objects you return in something that
segfaults the process if they change things. When they start asking, tell
them it is documented quite clearly: "do not modify objects returned, or
else". 
Then there's the other option, which I provide below.


> I don't think that your other suggestions will work: the data
> structure of the graph itself can't be made of immutable objects,
> because of the fact that the graph is a mutable object - you can
> change it. It can be made of immutable objects, but this means copying
> all the data every time the graph changes.

So let me get this straight: You've got a graph.  You want to be able to
change the graph, but you don't want your users to accidentally change
the graph. Sounds to me like an API problem, not a freeze()/mutable problem.
Want an API?

class graph:
    ...
    def IterOutgoing(self, node):
        ...
    def IterIncoming(self, node):
        ...
    def IsAdjacent(self, node1, node2):
        ...
    def IterNodes(self):
        ...
    def AddEdge(self, f_node, t_node):
        ...
    def RemEdge(self, node1, node2):
        ...
    def AddNode(self):
        ...

If you are reasonable in your implementation, all of the above
operations can be fast, and you will never have to worry about your
users accidentally mucking about with your internal data structures:
because you aren't exposing them.  If you are really paranoid, you can
take the next step and implement it in Pyrex or C, so that only a
malicious user can muck about with internal structures, at which point
you stop supporting them.
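
A minimal sketch of how such an API could be filled in. The method names
follow the outline above; everything else (integer node ids, a directed
graph, sets as the internal containers, snapshot iterators) is an assumption
made for illustration.

```python
class Graph:
    """Directed graph sketch that never hands out its internal containers."""

    def __init__(self):
        self._outgoing = {}   # node -> set of successor nodes
        self._incoming = {}   # node -> set of predecessor nodes
        self._next_id = 0

    def AddNode(self):
        node = self._next_id
        self._next_id += 1
        self._outgoing[node] = set()
        self._incoming[node] = set()
        return node

    def AddEdge(self, f_node, t_node):
        self._outgoing[f_node].add(t_node)
        self._incoming[t_node].add(f_node)

    def RemEdge(self, node1, node2):
        self._outgoing[node1].discard(node2)
        self._incoming[node2].discard(node1)

    def IsAdjacent(self, node1, node2):
        return node2 in self._outgoing[node1]

    # The iterators walk a snapshot tuple, so callers never get a reference
    # to the underlying dict or sets and cannot mutate them by accident.
    def IterNodes(self):
        return iter(tuple(self._outgoing))

    def IterOutgoing(self, node):
        return iter(tuple(self._outgoing[node]))

    def IterIncoming(self, node):
        return iter(tuple(self._incoming[node]))


g = Graph()
a = g.AddNode()
b = g.AddNode()
g.AddEdge(a, b)
```

Taking a snapshot costs time proportional to the node's degree; iterating the
set directly would be cheaper but would raise an error if the graph changed
during iteration.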


> Now, about copy-on-write:
> 
> > There's also this little issue of "copy on write" semantics with Python.
> > Anyone who tells you that "copy on write" is easy, is probably hanging
> > out with the same kind of people who say that "threading is easy".  Of
> > course both are easy if you limit your uses to some small subset of
> > interesting interactions, but "copy on write" gets far harder when you
> > start thinking of dictionaries, lists, StringIOs, arrays, and all the
> > possible user-defined classes, which may be mutated beyond obj[key] =
> > value and/or obj.attr = value (some have obj.method() which mutates the
> > object). As such, offering a callback mechanism similar to weak
> > references is probably pretty close to impossible with CPython.
> 
> Let's limit ourselves to copy-on-write of objects which do not contain
> nonfrozen objects. Perhaps it's enough - the table, the graph, and
> strings, are perfect examples of these. Implementation doesn't seem too
> complicated to me - whenever the object is about to change, and there
> is a connected frozen copy, you make a shallow copy of the object,
> point the frozen copy to it, release the reference to the frozen copy,
> and continue as usual. That's all.

What you have written here is fairly unintelligible, but thankfully you
clarify yourself...pity it still doesn't work, I explain below.

[snip]


> The problem is that we have to keep the created frozen object as long
> as the original object stays alive. So we have to know if it has
> changed. This is where those callbacks get in. As long as what is done
> with them is correct, there should be no problems. They are used only
> to disengage the frozen copies from their original objects. The action
> they should trigger is simply that:
> 
> def on_contained_object_change(self):
>     self._frozen_copy = None
>     while self._callbacks:
>         self._callbacks.pop()()
> 
> What's also interesting is that this freezing mechanism can be
> provided automatically for user-created classes, since those are
> simply containers of other objects, which behave exactly like dicts,
> for this matter.
> 
> It allows everything in Python to be both mutable and hashable,
> without changing the O() complexity! Wow!
> 
> Ok, I'm going to sleep now. If you find something wrong with this
> idea, please tell me.


Here is where you are wrong.

    x = []
    for i in xrange(k):
        x.append(range(k))

We now have a simple list of lists, k lists, each of length k.  Let's
freeze it.

    y = frozen(x)

Ok, now we have a recursively frozen list of lists y, implemented
however you want, and you've amortized this ONE call to creation time.
We'll give y to someone who does whatever he wants to it, and we'll
continue on.

    z = frozen(x)

Your claim is that due to the cache, the above operation can be done in
constant time after you have already frozen x.  This is wrong, but we'll
get to that.  Let us mutate one of the contained lists, and see if this
can continue...

    x[0][0] = 7

Oh hrm.  This invalidates x[0]'s cached frozen object, which would
suggest that x's cached frozen object was also invalidated, even though
Python objects tend to know nothing about the objects which point to
them.  Well, that's a rub. In order to /validate/ that an object's
cached object is valid, you must validate that the contents of your
cached frozen object points to the cached frozen objects of your
contents.  Or in code (for this two-level example)...

    def frozen(x):
        if x.frozen_cache:
            for i,j in zip(x.contents, x.frozen_cache):
                if i.frozen_cache is not j:
                    # stale entry: drop the cache and rebuild below
                    x.frozen_cache = None
                    break
            else:
                # every entry checked out, so the cache is still valid
                return x.frozen_cache
        x.frozen_cache = tuple(map(frozen, x.contents))
        return x.frozen_cache

Ouch, for the list of lists example with a total size O(k**2), you need
to spend O(k) time.  We'll say that n == k**2, so really, for this
particular object of size O(n), you need to spend O(sqrt(n)) time
verifying.  Not quite constant.

But wait...in order to verify that every cached frozen object is
valid...we should have been checking the contents of x[i] to verify that
they were frozen too!  Wow, that would take us O(n) time just to verify.
And we would need to do that EVERY TIME WE CALLED frozen(x), REGARDLESS
OF WHETHER x WAS MUTATED!

Wait a second...isn't that just the same as just recursively calling
freeze?  Yes.

Are we actually saving any time?  No.

What has this idea resulted in? The incorrect belief that caching frozen
objects will reduce the computation necessary in freeze(x), and a proposed
new attribute on literally every object which points to an immutable
copy of itself, generally doubling the amount of memory used.
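
To make the cost concrete, here is a runnable sketch of the cache-validation
argument above. It assumes a hypothetical `Cell` wrapper (built-in lists
cannot carry a `frozen_cache` attribute) and nested tuples as the frozen
form; it illustrates the argument and is not a proposed implementation.

```python
class Cell:
    """Hypothetical mutable container that remembers its last frozen copy."""

    def __init__(self, contents):
        self.contents = list(contents)
        self.frozen_cache = None

    def set(self, index, value):
        self.contents[index] = value
        self.frozen_cache = None   # only this object's own cache is dropped


def frozen(x):
    """Return a nested-tuple copy of x, reusing caches that are still valid."""
    if not isinstance(x, Cell):
        return x                   # leaves (ints here) are already immutable
    if x.frozen_cache is not None:
        # Validation has to visit every child, and recursively every
        # grandchild, so an unchanged object still costs O(n) to check.
        if all(frozen(child) is cached
               for child, cached in zip(x.contents, x.frozen_cache)):
            return x.frozen_cache
    x.frozen_cache = tuple(frozen(child) for child in x.contents)
    return x.frozen_cache


k = 3
x = Cell(Cell(range(k)) for _ in range(k))
y = frozen(x)
z = frozen(x)             # same object as y, but only after a full walk
x.contents[0].set(0, 7)   # x itself is never told about this change
w = frozen(x)             # the walk notices the stale row and rebuilds
```

The unchanged rows are reused in `w`, but finding out that they are unchanged
is exactly the traversal the cache was supposed to avoid.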


Like I said before, it's not so easy.  In order to actually implement
the system, every object in an object hierarchy would need to know about
its parent, but such cannot work because...

    a = range(10)
    b = [a]
    c = [a]
    bf = frozen(b)
    cf = frozen(c)
    a[0] = 10       #oops!

That last line is the killer.  Every object would necessarily need to
know about all containers which point to it, and would necessarily need
to notify them all that their contents had changed.  I personally don't
see Python doing that any time soon.
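
The failure mode can be reproduced with a deliberately naive cache. In this
sketch the `frozen_cache` dictionary keyed by `id()` and the tuple-based
`frozen()` function are assumptions for illustration; the point is only that
without notification a cached copy silently goes stale.

```python
a = list(range(10))
b = [a]
c = [a]

# Hypothetical cache: frozen copies keyed by the id() of the live container.
frozen_cache = {}

def frozen(obj):
    """Naive cached freeze: nested lists become nested tuples."""
    if not isinstance(obj, list):
        return obj
    key = id(obj)
    if key not in frozen_cache:
        frozen_cache[key] = tuple(frozen(item) for item in obj)
    return frozen_cache[key]

bf = frozen(b)
cf = frozen(c)
a[0] = 10              # neither b nor c is told about this
stale_b = frozen(b)    # served from the cache, still showing the old value
```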

Hope you sleep/slept well,
 - Josiah


From martin at v.loewis.de  Mon Oct 31 08:55:09 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 31 Oct 2005 08:55:09 +0100
Subject: [Python-Dev] svn checksum error
In-Reply-To: <17253.28294.538932.570903@montanaro.dyndns.org>
References: <17252.59531.252751.768301@montanaro.dyndns.org>
	<43654CA7.8030200@v.loewis.de>
	<17253.28294.538932.570903@montanaro.dyndns.org>
Message-ID: <4365CDDD.5060502@v.loewis.de>

skip at pobox.com wrote:
>     Martin> The natural question then is: what operating system, what
>     Martin> subversion version are you using?
> 
> Sorry, wasn't thinking in terms of svn bugs.  I was anticipating some sort
> of obvious pilot error.  I am on Mac OSX 10.3.9, running svn 1.1.3 I built
> from source back in the May timeframe.  Should I upgrade to 1.2.3 as a
> matter of course?

Not sure. The only mention of this specific message was about Linux
and multi-threading, in svnserve. Apparently, due to some race
condition/pthread locking semantics problems, something could get
corrupted. It could be a compiler bug as well, of course.

I would try to get some "official" binaries; 1.2.x works just fine with
svn.python.org as well.

Regards,
Martin

From vinay_sajip at yahoo.co.uk  Mon Oct 31 14:59:53 2005
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Mon, 31 Oct 2005 13:59:53 +0000 (UTC)
Subject: [Python-Dev] StreamHandler eating exceptions
References: <20051030233757.GB8344@localhost.localdomain>
Message-ID: <loom.20051031T145722-614@post.gmane.org>

Gustavo Niemeyer <gustavo <at> niemeyer.net> writes:

> 
> The StreamHandler available under the logging package is currently
> catching all exceptions under the emit() method call. In the
> Handler.handleError() documentation it's mentioned that it's
> implemented like that because users do not care about errors
> in the logging system.
> 
> I'd like to apply the following patch:
[patch snipped]
> Anyone against the change?
> 

Good idea. I've checked into svn a patch for both logging/__init__.py and
logging/handlers.py which re-raises both KeyboardInterrupt and SystemExit
when they are raised during emit().
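
The pattern under discussion can be sketched as follows. This is a minimal
illustration and not the actual logging source: `SketchHandler`,
`handle_error`, `BrokenStream` and `InterruptingStream` are hypothetical
names.

```python
import io


class SketchHandler:
    """Hypothetical stand-in for a logging handler, not the real StreamHandler."""

    def __init__(self, stream):
        self.stream = stream
        self.errors = []

    def handle_error(self, record):
        # Stand-in for Handler.handleError(): note the failure and carry on.
        self.errors.append(record)

    def emit(self, record):
        try:
            self.stream.write("%s\n" % record)
            self.stream.flush()
        except (KeyboardInterrupt, SystemExit):
            # Requests to stop the program must reach the caller.
            raise
        except:
            # Anything else is treated as an error in the logging system.
            self.handle_error(record)


class BrokenStream(io.StringIO):
    """Stream whose write() always fails with an ordinary exception."""
    def write(self, text):
        raise ValueError("stream is broken")


class InterruptingStream(io.StringIO):
    """Stream whose write() simulates Ctrl-C arriving mid-write."""
    def write(self, text):
        raise KeyboardInterrupt


swallowing = SketchHandler(BrokenStream())
swallowing.emit("hello")          # the ValueError is routed to handle_error()

interrupted = SketchHandler(InterruptingStream())
try:
    interrupted.emit("hello")
    propagated = False
except KeyboardInterrupt:
    propagated = True             # the interrupt is not swallowed
```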


From guido at python.org  Mon Oct 31 16:26:26 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 31 Oct 2005 08:26:26 -0700
Subject: [Python-Dev] StreamHandler eating exceptions
In-Reply-To: <loom.20051031T145722-614@post.gmane.org>
References: <20051030233757.GB8344@localhost.localdomain>
	<loom.20051031T145722-614@post.gmane.org>
Message-ID: <ca471dc20510310726g30aa4cf6i49fa1340a3d90c7a@mail.gmail.com>

I wonder if, once PEP 352 is accepted, this shouldn't be changed so
that there is only one except clause instead of two, and it says
"except Exception:". This has roughly the same effect as the proposed
(and already applied) patch, but does it in a Python-3000-compatible
way. ATM it is less robust because it doesn't catch exceptions that
don't derive from Exception -- but in all cases where this particular
try/except has saved my butt (yes it has! multiple times! :-), the
exception thrown was a standard exception.
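
For reference, a minimal sketch of the single-clause form, assuming an
interpreter with the hierarchy PEP 352 proposes, in which KeyboardInterrupt
and SystemExit do not derive from Exception. `emit_once` and the two helper
functions are hypothetical names.

```python
def emit_once(write, on_error):
    """Call write(); route ordinary exceptions to on_error, let the rest through."""
    try:
        write()
    except Exception as exc:
        on_error(exc)


def raise_value_error():
    raise ValueError("bad record")


def raise_interrupt():
    raise KeyboardInterrupt


caught = []
emit_once(raise_value_error, caught.append)   # handled by the one clause

try:
    emit_once(raise_interrupt, caught.append)
    propagated = False
except KeyboardInterrupt:
    propagated = True                         # not caught by "except Exception"
```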

(Did anybody else notice the synchronicity of this request with the
PEP 352 discussion?)

On 10/31/05, Vinay Sajip <vinay_sajip at yahoo.co.uk> wrote:
> Gustavo Niemeyer <gustavo <at> niemeyer.net> writes:
>
> >
> > The StreamHandler available under the logging package is currently
> > catching all exceptions under the emit() method call. In the
> > Handler.handleError() documentation it's mentioned that it's
> > implemented like that because users do not care about errors
> > in the logging system.
> >
> > I'd like to apply the following patch:
> [patch snipped]
> > Anyone against the change?
> >
>
> Good idea. I've checked into svn a patch for both logging/__init__.py and
> logging/handlers.py which raises both KeyboardInterrupt and SystemExit raised
> during emit().


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steve at holdenweb.com  Mon Oct 31 16:51:13 2005
From: steve at holdenweb.com (Steve Holden)
Date: Mon, 31 Oct 2005 15:51:13 +0000
Subject: [Python-Dev] Divorcing str and unicode (no more implicit
	conversions).
In-Reply-To: <aac2c7cb0510301921t697f2547v9f0a2d38145854ca@mail.gmail.com>
References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>	<4362A44F.9010506@v.loewis.de>	<20051029110331.D5AA.ISHIMOTO@gembook.org>	<4363395A.3040606@v.loewis.de>
	<1130589142.5945.11.camel@fsol>	<43638BC0.40108@v.loewis.de>	<20051031022554.GA20255@alcyon.progiciels-bpi.ca>
	<aac2c7cb0510301921t697f2547v9f0a2d38145854ca@mail.gmail.com>
Message-ID: <dk5ehf$m9s$1@sea.gmane.org>

Adam Olsen wrote:
> On 10/30/05, François Pinard <pinard at iro.umontreal.ca> wrote:
> 
>>All development is done in house by French people.  All documentation,
>>external or internal, comments, identifier and function names,
>>everything is in French.  Some of the developers here have had a long
>>programming life, while they only barely read English.  It is surely
>>a constant frustration, for some of us, having to mangle identifiers by
>>ravelling out their necessary diacritics.  It does not look good, it
>>does not smell good, and in many cases, mangling identifiers
>>significantly decreases program legibility.
> 
> 
> Hear, hear!  Not all the world uses English, and restricting them to
> Latin characters simply means it's not readable in any language.  It
> doesn't make it any more readable for those of us who only understand
> English.
> 
> +1 on internationalized identifiers.
> 
While I agree with the sentiments expressed, I think we should not 
underestimate the practical problems that moving away fr

Therefore, if such steps are really going to be considered, I would 
really like to see them introduced in such a way that no breakage occurs 
for existing users, even the parochial ones who feel they (and their 
programs) don't need to understand anything but ASCII.

If this means starting out with the features conditionally compiled, 
despite the added cost of the #ifdefs that would thereby be engendered I 
think that would be a good idea.

We can fix their programs by making Unicode the default string type, but 
it will take much longer to fix their thinking.

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/


From steve at holdenweb.com  Mon Oct 31 16:53:25 2005
From: steve at holdenweb.com (Steve Holden)
Date: Mon, 31 Oct 2005 15:53:25 +0000
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <20051030202958.39FD.JCARLSON@uci.edu>
References: <20051029164637.39E1.JCARLSON@uci.edu>	<b348a0850510301535id4588e6g406d2b3cc7673c56@mail.gmail.com>
	<20051030202958.39FD.JCARLSON@uci.edu>
Message-ID: <dk5elj$m9s$2@sea.gmane.org>

Josiah Carlson wrote:
[...]
>>Perhaps I didn't make it clear. The difference between wxPython's Grid
>>and my table is that in the table, most values are *computed*. This
>>means that there's no point in changing the values themselves. They
>>are also used frequently as set members (I can describe why, but it's
>>a bit complicated.)
> 
> 
> Again, user semantics.  Tell your users not to modify entries, and/or
> you can make copies of objects you return.  If your users are too daft
> to read and/or follow directions, then they deserve to have their
> software not work.
> 
That sounds like a "get out of jail free" card for Microsoft and many 
other software vendors ...

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/


From orent at hishome.net  Mon Oct 31 18:28:26 2005
From: orent at hishome.net (Oren Tirosh)
Date: Mon, 31 Oct 2005 19:28:26 +0200
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <1130715967.6180.77.camel@fsol>
References: <1130107429.11268.40.camel@geddy.wooz.org>
	<b348a0850510291342y25f0a3d9m5be75afa908c9e55@mail.gmail.com>
	<20051029164637.39E1.JCARLSON@uci.edu>
	<b348a0850510301535id4588e6g406d2b3cc7673c56@mail.gmail.com>
	<1130715967.6180.77.camel@fsol>
Message-ID: <7168d65a0510310928y178faddav5b0551c4ed8dac60@mail.gmail.com>

On 10/31/05, Antoine Pitrou <solipsis at pitrou.net> wrote:
>
> > It allows everything in Python to be both mutable and hashable,
>
> I don't understand, since it's already the case. Any user-defined object
> is at the same time mutable and hashable.

By default, user-defined objects are equal iff they are the same
object, regardless of their content. This makes mutability a
non-issue.

If you want to allow different objects to be equal you need to implement
a consistent equality operator (commutative, etc), a consistent hash
function and ensure that any attributes affecting equality or hash
value are immutable. If you fail to meet any of these requirements and
put such objects in dictionaries or sets it will result in undefined
behavior that may change between Python versions and implementations.
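
A small sketch of what goes wrong when the last requirement is not met. The
`Point` class is hypothetical, and since the behaviour is undefined the exact
outcome may differ between implementations; on CPython both lookups below
typically fail even though the object is still stored in the dictionary.

```python
class Point:
    """Hypothetical class whose equality and hash follow a mutable attribute."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, Point) and self.x == other.x

    def __hash__(self):
        return hash(self.x)


p = Point(1)
table = {p: "value"}

p.x = 2   # the hash changes while p is still filed under the old hash

found_after_mutation = p in table          # looks in the slot for the new hash
found_by_old_value = Point(1) in table     # right slot, but p no longer equals it
still_stored = len(table)
```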

  Oren

From martin at v.loewis.de  Mon Oct 31 19:21:17 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 31 Oct 2005 19:21:17 +0100
Subject: [Python-Dev] i18n identifiers (was: Divorcing str and unicode
 (no more implicit conversions).
In-Reply-To: <dk5ehf$m9s$1@sea.gmane.org>
References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>	<4362A44F.9010506@v.loewis.de>	<20051029110331.D5AA.ISHIMOTO@gembook.org>	<4363395A.3040606@v.loewis.de>	<1130589142.5945.11.camel@fsol>	<43638BC0.40108@v.loewis.de>	<20051031022554.GA20255@alcyon.progiciels-bpi.ca>	<aac2c7cb0510301921t697f2547v9f0a2d38145854ca@mail.gmail.com>
	<dk5ehf$m9s$1@sea.gmane.org>
Message-ID: <4366609D.4010205@v.loewis.de>

Steve Holden wrote:
> Therefore, if such steps are really going to be considered, I would 
> really like to see them introduced in such a way that no breakage occurs 
> for existing users, even the parochial ones who feel they (and their 
> programs) don't need to understand anything but ASCII.

It is straight-forward to make this feature completely backwards
compatible. Syntactically, it is a pure extension: existing code
will continue to work unmodified, and will continue to have the
same meaning. With the feature, you will be able to write code
that previously produced SyntaxErrors.

Semantically, the only potential incompatibility is that you
might find Unicode strings in __dict__. If purely-ASCII identifiers
are going to be represented by byte strings (as they are now),
no change in meaning for existing code is anticipated.
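
As a point of reference, later Python 3 releases accept non-ASCII identifiers
(PEP 3131), and such names appear as ordinary strings in namespace
dictionaries. A minimal sketch, assuming a Python 3 interpreter; the
identifier and values are arbitrary illustrative choices.

```python
# Build the source text with escapes so this example itself stays ASCII.
name = "pr\u00e9nom"
namespace = {}
exec(name + " = 'Fran\u00e7ois'", namespace)

# Purely ASCII code keeps working unchanged alongside it.
exec("first_name = 'Martin'", namespace)

# Ignore the "__builtins__" entry that exec() adds to the namespace.
keys = sorted(k for k in namespace if not k.startswith("__"))
```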

So it is not necessary to make the feature conditional to preserve
compatibility.

Regards,
Martin

From mal at egenix.com  Mon Oct 31 19:51:43 2005
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 31 Oct 2005 19:51:43 +0100
Subject: [Python-Dev] i18n identifiers
In-Reply-To: <4366609D.4010205@v.loewis.de>
References: <50862ebd0510271721i77c1ebb4x6bcb39a4756c3a99@mail.gmail.com>	<4362A44F.9010506@v.loewis.de>	<20051029110331.D5AA.ISHIMOTO@gembook.org>	<4363395A.3040606@v.loewis.de>	<1130589142.5945.11.camel@fsol>	<43638BC0.40108@v.loewis.de>	<20051031022554.GA20255@alcyon.progiciels-bpi.ca>	<aac2c7cb0510301921t697f2547v9f0a2d38145854ca@mail.gmail.com>	<dk5ehf$m9s$1@sea.gmane.org>
	<4366609D.4010205@v.loewis.de>
Message-ID: <436667BF.4050903@egenix.com>

Martin v. Löwis wrote:
> Steve Holden wrote:
> 
>>Therefore, if such steps are really going to be considered, I would 
>>really like to see them introduced in such a way that no breakage occurs 
>>for existing users, even the parochial ones who feel they (and their 
>>programs) don't need to understand anything but ASCII.
> 
> 
> It is straight-forward to make this feature completely backwards
> compatible. Syntactically, it is a pure extension: existing code
> will continue to work unmodified, and will continue to have the
> same meaning. With the feature, you will be able to write code
> that previously produced SyntaxErrors.
> 
> Semantically, the only potential incompatibility is that you
> might find Unicode strings in __dict__. If purely-ASCII identifiers
> are going to be represented by byte strings (as they are now),
> no change in meaning for existing code is anticipated.
> 
> So it is not necessary to make the feature conditional to preserve
> compatibility.

If people are really all enthusiastic about such a feature,
then it should happen in Python3k when the parser is rewritten
to work on Unicode natively.

Note that if you start with this now, a single module in
your application using Unicode identifiers could potentially
break the application: simply by the fact that stack frames,
tracebacks and module globals would now contain Unicode.

Any processing done on the identifiers, like e.g. error formatting
would then have to deal with Unicode objects (due to the automatic
conversion).

-- 
Marc-Andre Lemburg
eGenix.com


From blais at furius.ca  Mon Oct 31 20:13:05 2005
From: blais at furius.ca (Martin Blais)
Date: Mon, 31 Oct 2005 14:13:05 -0500
Subject: [Python-Dev] a different kind of reduce...
Message-ID: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com>

Hi

I find myself occasionally doing this:

   ... = dirname(dirname(dirname(p)))

I'm always--literally every time-- looking for a more functional form,
something that would be like this:

   # apply dirname() 3 times on its results, initializing with p
   ... = repapply(dirname, 3, p)

There is a way to hack something like that with reduce, but it's not
pretty--it involves creating a temporary list and a lambda function:

  ... = reduce(lambda x, y: dirname(x), [p] + [None] * 3)

Just wondering, does anybody know how to do this nicely? Is there an
easy form that allows me to do this?

cheers,

From guido at python.org  Mon Oct 31 20:24:02 2005
From: guido at python.org (Guido van Rossum)
Date: Mon, 31 Oct 2005 12:24:02 -0700
Subject: [Python-Dev] PEP 352 Transition Plan
In-Reply-To: <bbaeab100510282037m5bad1f67kb4d5cb7171ac163b@mail.gmail.com>
References: <ca471dc20510281329m5312946bjedf100d942c0dc49@mail.gmail.com>
	<007d01c5dc00$738da2e0$b62dc797@oemcomputer>
	<bbaeab100510281552rfd260afrde3e72eec14dd5df@mail.gmail.com>
	<4362DD15.4080606@gmail.com>
	<bbaeab100510282037m5bad1f67kb4d5cb7171ac163b@mail.gmail.com>
Message-ID: <ca471dc20510311124s1c3aeffeya879056477ea515d@mail.gmail.com>

I've made a final pass over PEP 352, mostly fixing the __str__,
__unicode__ and __repr__ methods to behave more reasonably. I'm all
for accepting it now. Does anybody see any last-minute show-stopping
problems with it?

As always, http://python.org/peps/pep-0352.html

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From noamraph at gmail.com  Mon Oct 31 20:27:30 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Mon, 31 Oct 2005 21:27:30 +0200
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <20051030202958.39FD.JCARLSON@uci.edu>
References: <20051029164637.39E1.JCARLSON@uci.edu>
	<b348a0850510301535id4588e6g406d2b3cc7673c56@mail.gmail.com>
	<20051030202958.39FD.JCARLSON@uci.edu>
Message-ID: <b348a0850510311127k17025338qee19f20ea348c893@mail.gmail.com>

Hello,

I have slept quite well, and talked about it with a few people, and I
still think I'm right.

About the users-changing-my-internal-data issue:

> Again, user semantics.  Tell your users not to modify entries, and/or
> you can make copies of objects you return.  If your users are too daft
> to read and/or follow directions, then they deserve to have their
> software not work.
...
> When someone complains that something doesn't work, I tell them to read
> the documentation.  If your users haven't been told to RTFM often enough
> to actually make it happen, then you need a RTFM-bat. Want to know how
> you make one?  You start wrapping the objects you return which segfaults
> the process if they change things. When they start asking, tell them it
> is documented quite clearly "do not modify objects returned, or else".
> Then there's the other option, which I provide below.

I disagree. I read the manual when I don't know what something does.
If I can guess what it does (and this is where conventions are good),
I don't read the manual. And let's say I ought to read the complete
manual for every method that I use, and that I deserve the death
penalty (or a segmentation fault) if I don't. But let's say that I
wrote some software, without reading the manual, and it worked. I have
gone to work on other things, and suddenly a bug arises. When the poor
guy who needs to fix it goes over the code, everything looks
absolutely correct. Should he also read the complete manuals of every
library that I used, in order to fix that bug? And remember that in
this case, the object could have traveled between several places
(including functions in other libraries), before it was changed, and
the original data structure started behaving weirdly.

You suggest two ways for solving the problem. The first is by copying
my mutable objects to immutable copies:

> Also from the sounds of it, you are storing both source and destination
> values in the same table...hrm, that sounds quite a bit like a
> spreadsheet.  How does every spreadsheet handle that again?  Oh yeah,
> they only ever store immutables (generally strings which are interpreted).
> But I suppose since you are (of course) storing mutable objects, you
> need to work a bit harder...so store mutables, and return immutable
> copies (which you can cache if you want, and invalidate when your
> application updates the results...like a wx.Grid update on changed).

This is basically ok. It's just that in my solution, for many objects
it's not necessary to make a complete copy just to prevent changing
the value: Making frozen copies of objects which can't reference
nonfrozen objects (sets, for example), costs O(1), thanks to the
copy-on-write.
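
A minimal sketch of this frozen-copy-on-write idea for a set, assuming
hypothetical CowSet and FrozenView classes (neither exists in Python or
in PEP 351, and hashing of the frozen view is left out for brevity):

```python
import weakref


class FrozenView(object):
    """Read-only view of a set (hypothetical; hashing omitted for brevity)."""

    def __init__(self, data):
        self._data = data

    def __contains__(self, item):
        return item in self._data

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


class CowSet(object):
    """Toy mutable set whose frozen() copy shares storage until the next write."""

    def __init__(self, items=()):
        self._data = set(items)
        self._view = None  # weak reference to the current frozen view, if any

    def frozen(self):
        view = self._view() if self._view is not None else None
        if view is None:
            view = FrozenView(self._data)  # O(1): no elements are copied
            self._view = weakref.ref(view)
        return view

    def add(self, item):
        if self._view is not None and self._view() is not None:
            # A frozen view is still alive: it keeps the old storage and
            # the mutable set continues on a private copy (the only O(n) step).
            self._data = set(self._data)
        self._view = None
        self._data.add(item)
```

With s = CowSet([1, 2]) and f = s.frozen(), a later s.add(3) leaves f
holding only 1 and 2, while the next s.frozen() call sees all three.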

Now, about the graph:

> So let me get this straight: You've got a graph.  You want to be able to
> change the graph, but you don't want your users to accidentally change
> the graph. Sounds to me like an API problem, not a freeze()/mutable problem.
> Want an API?
>
> class graph:
>    ...
>    def IterOutgoing(self, node):
>        ...
>    def IterIncoming(self, node):
>        ...
>    def IsAdjacent(self, node1, node2):
>        ...
>    def IterNodes(self):
>        ...
>    def AddEdge(self, f_node, t_node):
>        ...
>    def RemEdge(self, node1, node2):
>        ...
>    def AddNode(self):
>        ...
>
> If you are reasonable in your implementation, all of the above
> operations can be fast, and you will never have to worry about your
> users accidentally mucking about with your internal data structures:
> because you aren't exposing them.  If you are really paranoid, you can
> take the next step and implement it in Pyrex or C, so that only a
> malicious user can muck about with internal structures, at which point
> you stop supporting them.

This will work. It's simply... well, not very beautiful. I have to
invent a lot of names, and my users need to remember them all. If I
give them a frozen set, with all the vertices that a vertex points to
(which is an absolutely reasonable API), they will know how to deal
with it without learning a lot of method names, thanks to the fact
that they are already familiar with sets, and that a lot of thought
has gone into the set interface.

> > Now, about copy-on-write:
...
>
> What you have written here is fairly unintelligible, but thankfully you
> clarify yourself...pity it still doesn't work, I explain below.

I'm sorry if I am sometimes hard to understand. English is not my
mother tongue, and it degrades as the hour gets later - and sometimes,
things are hard to explain. If I don't explain myself, please say so
and I'll try again. This is an excellent example - I wrote about
callbacks, and went to sleep. Let me try to explain again how it
*does* work.

The frozen() function, and the __frozen__ protocol, would get another
optional argument - an object to be notified when the *nonfrozen*
object has changed. It may be called at most once - only on the first
change to the object, and only if the object which requested to be
notified is still alive. I now introduce a second protocol, which I
will call __changed__. Objects would be notified by calling their
__changed__ method.

You say that every frozen() call takes O(n), because it needs to
verify that objects hadn't changed since the last call:

> Oh hrm.  This invalidates x[0]'s cached frozen object, which would
> suggest that x's cached frozen object was also invalidated, even though
> Python objects tend to know nothing about the objects which point to
> them.  Well, that's a rub. In order to /validate/ that an object's
> cached object is valid, you must validate that the contents of your
> cached frozen object points to the cached frozen objects of your
> contents.  Or in code (for this two-level example)...
>
>    def frozen(x):
>        if x.frozen_cache:
>            for i,j in zip(x.contents, x.frozen_cache):
>                if i.frozen_cache is not j:
>                    x.frozen_cache = None
>                    x.frozen_cache = frozen(x)
>                    return x.frozen_cache
>        x.frozen_cache = tuple(map(frozen, x.contents))
>        return x.frozen_cache

This is not so. When a list creates its frozen copy, it gives itself
to all those frozen() calls. This means that it will be notified if
one of its members is changed. In that case, it has to do two simple
actions: 1. release the reference to its frozen copy, so that
subsequent freezes of the list would create a new frozen copy, and: 2.
notify about the change any object which froze the list and requested
notification.

This frees us of any validation code. If we haven't been notified
about a change, there was no change, and the frozen copy is valid.

In case you ask, the cost of notification is O(1), amortized. The
reason is that every frozen() call can cause at most one callback in
the future.

> Like I said before, it's not so easy.  In order to actually implement
> the system, every object in an object hierarchy would need to know about
> its parent, but such cannot work because...
>
>    a = range(10)
>    b = [a]
>    c = [a]
>    bf = frozen(b)
>    cf = frozen(c)
>    a[0] = 10       #oops!
>
> That last line is the killer.  Every object would necessarily need to
> know about all containers which point to it, and would necessarily need
> to notify them all that their contents had changed.  I personally don't
> see Python doing that any time soon.
>
This is not the case. Every object has to know only about the objects
which created frozen copies of it, and requested notifications.
Actually, the object itself doesn't have to store anything. I thought
about it, and you can create a module for handling those
change-callbacks. It would store only weak references to objects. It
would have two functions:

def register_reference(x, y):
    '''Register the fact that if object x changes, it means that
object y changes too.'''

def changed(x):
    '''Notify all objects that change with x that they are changed.'''

When an object is changed, this module would call the __changed__
method of all the objects that have a reference to it, and haven't
changed since the connection was created.
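
A minimal sketch of how such a module could look, assuming a toy Node
class in place of real lists (built-in lists cannot be weakly
referenced). Node, frozen_cache and the notify argument are
illustrative names only, not part of PEP 351:

```python
import weakref

# Maps x to weak references to the objects that change whenever x changes.
_dependents = weakref.WeakKeyDictionary()


def register_reference(x, y):
    """Register the fact that if object x changes, object y changes too."""
    _dependents.setdefault(x, []).append(weakref.ref(y))


def changed(x):
    """Notify all live objects that change with x; each registration fires once."""
    for ref in _dependents.pop(x, []):
        y = ref()
        if y is not None:
            y.__changed__()


class Node(object):
    """Toy mutable container standing in for a list (an assumption)."""

    def __init__(self, items):
        self.items = list(items)
        self.frozen_cache = None

    def __changed__(self):
        self.frozen_cache = None  # release the stale frozen copy
        changed(self)             # pass the notification on to whoever froze us

    def append(self, item):
        self.items.append(item)
        self.__changed__()


def frozen(x, notify=None):
    """Return a tuple-based frozen copy of x, reusing the cached copy if valid."""
    if not isinstance(x, Node):
        return x  # everything else is treated as already immutable
    if notify is not None:
        register_reference(x, notify)
    if x.frozen_cache is None:
        x.frozen_cache = tuple(frozen(i, x) for i in x.items)
    return x.frozen_cache
```

Here no validation pass is needed: a second frozen(b) returns the cached
tuple, and appending to a member of b drops b's cache through the
callback.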

I hope I have clarified my idea. Tell me if you still think I'm wrong.

> Hope you sleep/slept well,
>  - Josiah
>

Thanks! Indeed, a good sleep is something very important. Sleep well
too (when the time comes, of course),
Noam

From aahz at pythoncraft.com  Mon Oct 31 20:40:31 2005
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 31 Oct 2005 11:40:31 -0800
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com>
References: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com>
Message-ID: <20051031194031.GA4397@panix.com>

On Mon, Oct 31, 2005, Martin Blais wrote:
>
> There is a way to hack something like that with reduce, but it's not
> pretty--it involves creating a temporary list and a lambda function:
> 
>   ... = reduce(lambda x, y: dirname(x), [p] + [None] * 3)
> 
> Just wondering, does anybody know how to do this nicely? Is there an
> easy form that allows me to do this?

This should go on comp.lang.python.  Thanks.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair

From jcarlson at uci.edu  Mon Oct 31 20:28:20 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 31 Oct 2005 12:28:20 -0700
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <dk5elj$m9s$2@sea.gmane.org>
References: <20051030202958.39FD.JCARLSON@uci.edu> <dk5elj$m9s$2@sea.gmane.org>
Message-ID: <20051031122308.3A0F.JCARLSON@uci.edu>


Steve Holden <steve at holdenweb.com> wrote:
> 
> Josiah Carlson wrote:
> [...]
> >>Perhaps I didn't make it clear. The difference between wxPython's Grid
> >>and my table is that in the table, most values are *computed*. This
> >>means that there's no point in changing the values themselves. They
> >>are also used frequently as set members (I can describe why, but it's
> >>a bit complicated.)
> > 
> > Again, user semantics.  Tell your users not to modify entries, and/or
> > you can make copies of objects you return.  If your users are too daft
> > to read and/or follow directions, then they deserve to have their
> > software not work.
> > 
> That sounds like a "get out of jail free" card for Microsoft and many 
> other software vendors ...

If/when vendors are COMPLETE in their specifications and documentation,
they can have that card, but being that even when they specify such
behaviors they are woefully incomplete, there is not a card to be found,
and I stand by my opinion.

 - Josiah


From jcarlson at uci.edu  Mon Oct 31 21:05:16 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 31 Oct 2005 13:05:16 -0700
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <b348a0850510311127k17025338qee19f20ea348c893@mail.gmail.com>
References: <20051030202958.39FD.JCARLSON@uci.edu>
	<b348a0850510311127k17025338qee19f20ea348c893@mail.gmail.com>
Message-ID: <20051031120205.3A0C.JCARLSON@uci.edu>


Noam Raphael <noamraph at gmail.com> wrote:
> Hello,
> 
> I have slept quite well, and talked about it with a few people, and I
> still think I'm right.

And I'm going to point out why you are wrong.

> About the users-changing-my-internal-data issue:
> 
> > Again, user semantics.  Tell your users not to modify entries, and/or
> > you can make copies of objects you return.  If your users are too daft
> > to read and/or follow directions, then they deserve to have their
> > software not work.
> ...
> > When someone complains that something doesn't work, I tell them to read
> > the documentation.  If your users haven't been told to RTFM often enough
> > to actually make it happen, then you need a RTFM-bat. Want to know how
> > you make one?  You start wrapping the objects you return which segfaults
> > the process if they change things. When they start asking, tell them it
> > is documented quite clearly "do not modify objects returned, or else".
> > Then there's the other option, which I provide below.
> 
> I disagree. I read the manual when I don't know what something does.
> If I can guess what it does (and this is where conventions are good),
> I don't read the manual. And let's say I ought to read the complete
> manual for every method that I use, and that I deserve the death
> penalty (or a segmentation fault) if I don't. But let's say that I
> wrote some software, without reading the manual, and it worked. I have
> gone to work on other things, and suddenly a bug arises. When the poor
> guy who needs to fix it goes over the code, everything looks
> absolutely correct. Should he also read the complete manuals of every
> library that I used, in order to fix that bug? And remember that in
> this case, the object could have traveled between several places
> (including functions in other libraries), before it was changed, and
> the original data structure started behaving weirdly.

You can have a printout before it dies:
"I'm crashing your program because something attempted to modify a data
structure (here's the traceback), and you were told not to."

Then again, you can even raise an exception when people try to change
the object, as imdict does, as tuples do, etc.


> You suggest two ways for solving the problem. The first is by copying
> my mutable objects to immutable copies:

And by caching those results, then invalidating them when they are
updated by your application.  This is the same as what you would like to
do, except that I do not rely on copy-on-write semantics, which aren't
any faster than freeze+cache by your application.

[snip graph API example]

> This will work. It's simply... well, not very beautiful. I have to
> invent a lot of names, and my users need to remember them all. If I
> give them a frozen set, with all the vertices that a vertex points to
> (which is an absolutely reasonable API), they will know how to deal
> with it without learning a lot of method names, thanks to the fact
> that they are already familiar with sets, and that a lot of thought
> has gone into the set interface.

I never claimed it was beautiful, I claimed it would work.  And it does. 
There are 7 methods, which you can reduce if you play the special method
game:

RemEdge -> __delitem__((node, node))
RemNode -> __delitem__(node) #forgot this method before
IterNodes -> __iter__()
IterOutgoing,IterIncoming -> IterAdjacent(node)

In any case, whether you choose to use freeze, or use a different API,
this particular problem is solvable without copy-on-write semantics.

> 
> > > Now, about copy-on-write:
> ...
> >
> > What you have written here is fairly unintelligible, but thankfully you
> > clarify yourself...pity it still doesn't work, I explain below.
> 
> I'm sorry if I am sometimes hard to understand. English is not my
> mother tongue, and it degrades as the hour gets later - and sometimes,
> things are hard to explain. If I don't explain myself, please say so
> and I'll try again. This is an excellent example - I wrote about
> callbacks, and went to sleep. Let me try to explain again how it
> *does* work.

Thank you for the clarification (btw, your English is far better than
any of the foreign languages I've been "taught" over the years).

> This is not so. When a list creates its frozen copy, it gives itself
> to all those frozen() calls. This means that it will be notified if
> one of its members is changed. In that case, it has to do two simple
> actions: 1. release the reference to its frozen copy, so that
> subsequent freezes of the list would create a new frozen copy, and: 2.
> notify about the change any object which froze the list and requested
> notification.
> 
> This frees us of any validation code. If we haven't been notified
> about a change, there was no change, and the frozen copy is valid.
> 
> In case you ask, the cost of notification is O(1), amortized. The
> reason is that every frozen() call can cause at most one callback in
> the future.

Even without validation, there are examples that force a high number of
calls, which are not O(1), amortized or otherwise.


> > Like I said before, it's not so easy.  In order to actually implement
> > the system, every object in an object hierarchy would need to know about
> > its parent, but such cannot work because...
> >
> >    a = range(10)
> >    b = [a]
> >    c = [a]
> >    bf = frozen(b)
> >    cf = frozen(c)
> >    a[0] = 10       #oops!
> >
> > That last line is the killer.  Every object would necessarily need to
> > know about all containers which point to it, and would necessarily need
> > to notify them all that their contents had changed.  I personally don't
> > see Python doing that any time soon.
> >
> This is not the case. Every object has to know only about the objects
> which created frozen copies of it, and requested notifications.
> Actually, the object itself doesn't have to store anything. I thought
> about it, and you can create a module for handling those
> change-callbacks. It would store only weak references to objects. It
> would have two functions:
> 
> def register_reference(x, y):
>     '''Register the fact that if object x changes, it means that
> object y changes too.'''
> 
> def changed(x):
>     '''Notify all objects that change with x that they are changed.'''
> 
> When an object is changed, this module would call the __changed__
> method of all the objects that have a reference to it, and haven't
> changed since the connection was created.

Callbacks work, in that you can notify parents, but they still don't
allow O(1) amortization.


    a = [[] for i in xrange(k)]
    b = [list(a) for i in xrange(k)]
    del a

    c = freeze(b)

    b[0][0].append(1)

That append call requires that b and every list in b be notified.  That
takes O(k) time, because you have to notify the k lists which point to
b[0][0]. Let's freeze it again! Oh wait, that takes O(k**2) time for
that second freeze because you have to recreate the tuple version of b,
as well as the tuple version of everything in b. Cycling through
modifications and freezing ends up taking time equivalent to if you were
to just re-create the entire frozen version to start with every time you
freeze.

Now, the actual time analysis on repeated freezings and such gets ugly. 
There are actually O(k) objects, which take up O(k**2) space.  When you
modify object b[i][j] (which has just been frozen), you get O(k)
callbacks, and when you call freeze(b), it actually results in O(k**2)
time to re-copy the O(k**2) pointers to the O(k) objects.  It should be
obvious that this IS NOT AMORTIZABLE to original object creation time.
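
A rough way to check these counts is to simulate the callback scheme
with a toy Node class and two counters. Everything here (Node, measure,
the counters) is an illustrative assumption, not an implementation of
PEP 351 or of the proposed protocol:

```python
class Node(object):
    """Toy list with a cached frozen copy and change callbacks to its parents."""

    callbacks = 0  # notifications delivered to parent containers
    copied = 0     # element pointers copied while (re)building frozen tuples

    def __init__(self, items=()):
        self.items = list(items)
        self.cache = None
        self.parents = []  # containers that froze us and want one notification

    def freeze(self, parent=None):
        if parent is not None:
            self.parents.append(parent)
        if self.cache is None:
            Node.copied += len(self.items)
            self.cache = tuple(
                i.freeze(self) if isinstance(i, Node) else i
                for i in self.items
            )
        return self.cache

    def append(self, item):
        self.items.append(item)
        self._invalidate()

    def _invalidate(self):
        self.cache = None
        parents, self.parents = self.parents, []  # each registration fires once
        for p in parents:
            Node.callbacks += 1
            p._invalidate()


def measure(k):
    """Build the k-by-k example, freeze, modify one leaf, freeze again."""
    inner = [Node() for _ in range(k)]
    b = Node(Node(inner) for _ in range(k))  # k lists sharing the same k leaves
    b.freeze()
    Node.callbacks = Node.copied = 0         # count only the second round
    b.items[0].items[0].append(1)
    b.freeze()
    return Node.callbacks, Node.copied
```

In this model measure(k) returns (2*k, k*k + k + 1): the single append
triggers O(k) callbacks, and the second freeze copies O(k**2) pointers.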


> I hope I have clarified my idea. Tell me if you still think I'm wrong.

You have clarified it, but it is still wrong.  I stand by 'it is not
easy to get right', and would further claim, "I doubt it is possible to
make it fast."

Good day,
 - Josiah


From noamraph at gmail.com  Mon Oct 31 23:25:53 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Tue, 1 Nov 2005 00:25:53 +0200
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <20051031120205.3A0C.JCARLSON@uci.edu>
References: <20051030202958.39FD.JCARLSON@uci.edu>
	<b348a0850510311127k17025338qee19f20ea348c893@mail.gmail.com>
	<20051031120205.3A0C.JCARLSON@uci.edu>
Message-ID: <b348a0850510311425w493c14few57fc0677ad273d80@mail.gmail.com>

On 10/31/05, Josiah Carlson <jcarlson at uci.edu> wrote:
...
> And I'm going to point out why you are wrong.

I still don't think so. I think that I need to reclarify what I mean.

> > About the users-changing-my-internal-data issue:
...
> You can have a printout before it dies:
> "I'm crashing your program because something attempted to modify a data
> structure (here's the traceback), and you were told not to."
>
> Then again, you can even raise an exception when people try to change
> the object, as imdict does, as tuples do, etc.

Both solutions would solve the problem, but would require me to wrap
the built-in set with something which doesn't allow changes. This is a
lot of work - but it's quite similar to what my solution would
actually do, in a single built-in function.
>
> > You suggest two ways for solving the problem. The first is by copying
> > my mutable objects to immutable copies:
>
> And by caching those results, then invalidating them when they are
> updated by your application.  This is the same as what you would like to
> do, except that I do not rely on copy-on-write semantics, which aren't
> any faster than freeze+cache by your application.

This isn't correct - freezing a set won't require a single copy to be
performed, as long as the frozen copy isn't saved after the original
is changed. Copy+cache always requires one copy.

...
> I never claimed it was beautiful, I claimed it would work.  And it does.
> There are 7 methods, which you can reduce if you play the special method
> game:
>
> RemEdge -> __delitem__((node, node))
> RemNode -> __delitem__(node) #forgot this method before
> IterNodes -> __iter__()
> IterOutgoing,IterIncoming -> IterAdjacent(node)
>
I just wanted to say that this game is of course a lot of fun, but it
doesn't simplify the interface.

> In any case, whether you choose to use freeze, or use a different API,
> this particular problem is solvable without copy-on-write semantics.

Right. But I think that a significant simplification of the API is a
nice bonus for my solution. And about those copy-on-write semantics -
it should be proven how complex they are. Remember that we are talking
about frozen-copy-on-write, which I think would simplify matters
considerably - for example, there are at most two instances sharing
the same data, since the frozen copy can be returned again and again.

> > > > Now, about copy-on-write:
> > ...
> Thank you for the clarification (btw, your English is far better than
> any of the foreign languages I've been "taught" over the years).
Thanks! It seems that if you are forced to use a language from time to
time it improves a little...

...

> Even without validation, there are examples that force a high number of
> calls, which are not O(1), amortized or otherwise.
>
[Snip - a very interesting example]
>
> Now, the actual time analysis on repeated freezings and such gets ugly.
> There are actually O(k) objects, which take up O(k**2) space.  When you
> modify object b[i][j] (which has just been frozen), you get O(k)
> callbacks, and when you call freeze(b), it actually results in O(k**2)
> time to re-copy the O(k**2) pointers to the O(k) objects.  It should be
> obvious that this IS NOT AMORTIZABLE to original object creation time.
>
That's absolutely right. My amortized analysis is correct only if you
limit yourself to cases in which the original object doesn't change
after a frozen() call was made. In that case, it's ok to count the
O(k**2) copy with the O(k**2) object creation, because it's made only
once.

Why is it ok to analyze only that limited case? I am suggesting a
change in Python: that every object you like would be mutable, and
would support the frozen() protocol. When you evaluate my suggestion,
you need to take a program, and measure its performance in the current
Python and in a Python which implements my suggestion. This means that
the program should work also on the current Python. In that case, my
assumption is true - you won't change objects after you have frozen
them, simply because these objects (strings which are used as dict
keys, for example) can't be changed at all in the current Python
implementation!

I will write it in another way: I am proposing a change that will make
Python objects, including strings, mutable, and gives you other
advantages as well. I claim that it won't make existing Python
programs run slower in O() terms. It would allow you to do many things
that you can't do today; some of them would be fast, like editing a
string, and some of them would be less fast - for example, repeatedly
changing an object and freezing it.

I think that the performance penalty may be rather small - remember
that in programs which do not change strings, there would never be a
need to copy the string data at all. And since I think that usually
most of the dict lookups are for method or function names, there would
almost never be a need to construct a new object on dict lookup,
because you search for the same names again and again, and a new
object is created only on the first frozen() call. You might even gain
performance, because s += x would be faster.

...

> You have clarified it, but it is still wrong.  I stand by 'it is not
> easy to get right', and would further claim, "I doubt it is possible to
> make it fast."

It would not be very easy to implement, of course, but I hope that it
won't be very hard either, since the basic idea is quite simple. Do
you still doubt the possibility of making it fast, given my (correct)
definition of fast?

And if it's possible (which I think it is), it would allow us to get
rid of inconvenient immutable objects, and it would let us put
everything into a set. Isn't that nice?

>
> Good day,
>  - Josiah
>
The same to you,
Noam

From p.f.moore at gmail.com  Mon Oct 31 23:29:30 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 31 Oct 2005 22:29:30 +0000
Subject: [Python-Dev] a different kind of reduce...
In-Reply-To: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com>
References: <8393fff0510311113p63bc194ak88580f84a25b1a1a@mail.gmail.com>
Message-ID: <79990c6b0510311429u2ac8f1dcw193f3dd2fd25f3b1@mail.gmail.com>

On 10/31/05, Martin Blais <blais at furius.ca> wrote:

> I'm always--literally every time-- looking for a more functional form,
> something that would be like this:
>
>    # apply dirname() 3 times on its results, initializing with p
>    ... = repapply(dirname, 3, p)
[...]
> Just wondering, does anybody know how to do this nicely? Is there an
> easy form that allows me to do this?

FWIW, something like this works:

>>> def fpow(f, n):
...     def res(*args, **kw):
...         nn = n
...         while nn > 0:
...             args = [f(*args, **kw)]
...             kw = {}
...             nn -= 1
...         return args[0]
...     return res
...

>>> fn = r'a\b\c\d\e\f\g'
>>> d3 = fpow(os.path.dirname, 3)
>>> d3(fn)
'a\\b\\c\\d'

You can vary this a bit - the handling of keyword arguments is an
obvious place where I've picked a very arbitrary approach - but you
get the idea. This *may* be a candidate for addition to the new
"functional" module, but I'd be surprised if it got added without
proving itself "in the wild" first. More likely, it should go in a
local "utilities" module.
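
For the single-argument case in the original question, a plain loop is
probably enough. A minimal sketch, using the hypothetical name repapply
from Martin's message and posixpath so the result does not depend on
the platform:

```python
import posixpath


def repapply(func, n, value):
    """Apply func to value n times, feeding each result back in."""
    for _ in range(n):
        value = func(value)
    return value


print(repapply(posixpath.dirname, 3, "a/b/c/d/e/f/g"))  # prints a/b/c/d
```

Unlike fpow, this applies the function immediately instead of returning
a new function, and it only handles functions of one argument.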

Paul.

From noamraph at gmail.com  Mon Oct 31 23:55:34 2005
From: noamraph at gmail.com (Noam Raphael)
Date: Tue, 1 Nov 2005 00:55:34 +0200
Subject: [Python-Dev] PEP 351, the freeze protocol
In-Reply-To: <b348a0850510311425w493c14few57fc0677ad273d80@mail.gmail.com>
References: <20051030202958.39FD.JCARLSON@uci.edu>
	<b348a0850510311127k17025338qee19f20ea348c893@mail.gmail.com>
	<20051031120205.3A0C.JCARLSON@uci.edu>
	<b348a0850510311425w493c14few57fc0677ad273d80@mail.gmail.com>
Message-ID: <b348a0850510311455kdb504cdq65799fc93969c806@mail.gmail.com>

I thought about something -
>
> I think that the performance penalty may be rather small - remember
> that in programs which do not change strings, there would never be a
> need to copy the string data at all. And since I think that usually
> most of the dict lookups are for method or function names, there would
> almost never be a need to construct a new object on dict lookup,
> because you search for the same names again and again, and a new
> object is created only on the first frozen() call. You might even gain
> performance, because s += x would be faster.
>
Name lookups can take virtually the same time they take now - method
names can be saved from the beginning as frozen strings, so finding
them in a dict will take just another bit test - is the object frozen
- before doing exactly what is done now. Remember, the strings we are
familiar with are simply frozen strings...