From greg at krypto.org  Sun Apr  1 05:50:17 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Sat, 31 Mar 2012 20:50:17 -0700
Subject: [Python-ideas] Thread stopping
In-Reply-To: <CAKCKLWxV1K0LhZdoOOk+5xUOrnqtfQ09jQTJFYne3G44MbEOsQ@mail.gmail.com>
References: <CAL3CFcX3L8ODEEY5s8aLJ1rO1pA4tJdRP6fFBra9pGAHOpaWZA@mail.gmail.com>
	<jl40od$6lj$1@dough.gmane.org>
	<CAKCKLWxV1K0LhZdoOOk+5xUOrnqtfQ09jQTJFYne3G44MbEOsQ@mail.gmail.com>
Message-ID: <CAGE7PN+p1KQ_q=+8kxXuHaGAZ5SDT4uVex_R3T_+ANXidPNrcA@mail.gmail.com>

On Sat, Mar 31, 2012 at 3:33 AM, Michael Foord <fuzzyman at gmail.com> wrote:

> An "uninterruptable context manager" would be nice - but would probably
> need extra vm support and isn't essential.


I'm not so sure an uninterruptable context manager would need much vm
support, at least in CPython 3.2 and 3.3.

Isn't something like this about all it takes, assuming you only mean
uninterruptable within the context of native Python code, rather than
whatever other extension modules or interpreter embedding code may be
running on their own in C/C++/Java/C# thread land:

import sys

class UninterruptableContext:
  def __enter__(self):
    self._orig_switchinterval = sys.getswitchinterval()
    # ~31 years with no Python thread switching
    sys.setswitchinterval(1000000000)
    return self

  def __exit__(self, exc_type, exc_value, traceback):
    sys.setswitchinterval(self._orig_switchinterval)
    return False

The danger with that, of course, is that you could be saving an obsolete
switch-interval value, but I suspect it is rare to change it other than
near process start time. You could document the caveat and suggest that
the switch interval be set to its desired value before using any of these
context managers. Or monkeypatch setswitchinterval out with a dummy when
this library is imported, so that it becomes the sole user and owner of
that api. All of which are pretty evil hacks to expunge from one's memory
and pretend you didn't read.

The _other_ big caveat to the above is that if you do any blocking
operations that release the GIL within such a context manager, I think you
just voluntarily give up your right not to be interrupted.  Plus, it
depends on setswitchinterval(), an API that we could easily discard in the
future with different threading and GIL implementations.

Brainstorming... it's what python-ideas is for.

I have zero use cases for an uninterruptable context manager within
Python.  For tiny sections of C code, sure; within a high-level
language... not so much.  Please use finer-grained locks.  An
uninterruptable context manager is essentially a context manager around
the GIL.

-gps
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120331/060d7e98/attachment.html>

From xorninja at gmail.com  Sun Apr  1 16:34:42 2012
From: xorninja at gmail.com (Itzik Kotler)
Date: Sun, 1 Apr 2012 17:34:42 +0300
Subject: [Python-ideas] Pythonect 0.1.0 Release
Message-ID: <CAD-_V752mgDPUcy61o9+hf060rBZdPXS_nTLauYf_39y3auFow@mail.gmail.com>

Hi All,

I'm pleased to announce the first beta release of Pythonect interpreter.

Pythonect is a new, experimental, general-purpose dataflow programming
language based on Python.

It aims to combine the intuitive feel of shell scripting (and all of its
perks like implicit parallelism) with the flexibility and agility of
Python.

Pythonect interpreter (and reference implementation) is written in Python,
and is available under the BSD license.

Here's a quick tour of Pythonect:

The canonical "Hello, world" example program in Pythonect:

>>> "Hello World" -> print
<MainProcess:Thread-1> : Hello World
Hello World
>>>

'->' and '|' are both Pythonect operators.

The pipe operator (i.e. '|') passes one item at a time, while the other
operator passes all items at once.


Python statements and other None-returning functions act as a
pass-through:

>>> "Hello World" -> print -> print
<MainProcess:Thread-2> : Hello World
<MainProcess:Thread-2> : Hello World
Hello World
>>>

>>> 1 -> import math -> math.log
0.0
>>>


Parallelization in Pythonect:

>>> "Hello World" -> [ print , print ]
<MainProcess:Thread-4> : Hello World
<MainProcess:Thread-5> : Hello World
['Hello World', 'Hello World']

>>> range(0,3) -> import math -> math.sqrt
[0.0, 1.0, 1.4142135623730951]
>>>

In the future, I am planning on adding support for multi-processing, and
even distributed computing.


The '_' identifier allows access to the current item:

>>> "Hello World" -> [ print , print ] -> _ + " and Python"
<MainProcess:Thread-7> : Hello World
<MainProcess:Thread-8> : Hello World
['Hello World and Python', 'Hello World and Python']
>>>

>>> [ 1 , 2 ] -> _**_
[1, 4]
>>>


True/False return values as filters:

>>> "Hello World" -> _ == "Hello World" -> print
<MainProcess:Thread-9> : Hello World
>>>

>>> "Hello World" -> _ == "Hello World1" -> print
False
>>>

>>> range(1,10) -> _ % 2 == 0
[2, 4, 6, 8]
>>>
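
For comparison, here is a rough plain-Python rendering of the mapping and
filtering pipelines above (my sketch of the equivalent semantics, not
Pythonect output):

```python
import math

# roughly: range(0,3) -> import math -> math.sqrt
mapped = [math.sqrt(x) for x in range(0, 3)]
print(mapped)    # [0.0, 1.0, 1.4142135623730951]

# roughly: range(1,10) -> _ % 2 == 0
filtered = [x for x in range(1, 10) if x % 2 == 0]
print(filtered)  # [2, 4, 6, 8]
```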


Last but not least, I have also added extra syntax for making remote
procedure calls easy:

>>> 1 -> inc at xmlrpc://localhost:8000 -> print
<MainProcess:Thread-2> : 2
2
>>>

Download Pythonect v0.1.0 from:
http://github.com/downloads/ikotler/pythonect/Pythonect-0.1.0.tar.gz

More information can be found at: http://www.pythonect.org


I will appreciate any input / feedback that you can give me.

Also, for those interested in working on the project, I'm actively
interested in welcoming and supporting both new developers and new users.
Feel free to contact me.


Regards,
Itzik Kotler | http://www.ikotler.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120401/c6e705d2/attachment.html>

From jkbbwr at gmail.com  Sun Apr  1 17:58:22 2012
From: jkbbwr at gmail.com (Jakob Bowyer)
Date: Sun, 1 Apr 2012 16:58:22 +0100
Subject: [Python-ideas] Pythonect 0.1.0 Release
In-Reply-To: <CAD-_V752mgDPUcy61o9+hf060rBZdPXS_nTLauYf_39y3auFow@mail.gmail.com>
References: <CAD-_V752mgDPUcy61o9+hf060rBZdPXS_nTLauYf_39y3auFow@mail.gmail.com>
Message-ID: <CAA+RL7Fd7L+5K8Z+RqrboVS59y3n9OM4vZK3TTgKf=HV26=Fmw@mail.gmail.com>

You might want to PEP8 your code: move imports to the top and lose some
of the unneeded lines.

On Sun, Apr 1, 2012 at 3:34 PM, Itzik Kotler <xorninja at gmail.com> wrote:
> Hi All,
>
> I'm pleased to announce the first beta release of Pythonect interpreter.
>
> <snip>


From guido at python.org  Sun Apr  1 18:05:22 2012
From: guido at python.org (Guido van Rossum)
Date: Sun, 1 Apr 2012 09:05:22 -0700
Subject: [Python-ideas] Pythonect 0.1.0 Release
In-Reply-To: <CAD-_V752mgDPUcy61o9+hf060rBZdPXS_nTLauYf_39y3auFow@mail.gmail.com>
References: <CAD-_V752mgDPUcy61o9+hf060rBZdPXS_nTLauYf_39y3auFow@mail.gmail.com>
Message-ID: <CAP7+vJJ=2__2DoYN0xfPWON2BLMPGKRyKk7a2eyfp019HFSREA@mail.gmail.com>

April fool, right?

On Sunday, April 1, 2012, Itzik Kotler wrote:

> Hi All,
>
> I'm pleased to announce the first beta release of Pythonect interpreter.
>
> <snip>
>
> Regards,
> Itzik Kotler | http://www.ikotler.org
>


-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120401/8cf917dc/attachment.html>

From xorninja at gmail.com  Sun Apr  1 20:22:58 2012
From: xorninja at gmail.com (Itzik Kotler)
Date: Sun, 1 Apr 2012 21:22:58 +0300
Subject: [Python-ideas] Pythonect 0.1.0 Release
In-Reply-To: <CAP7+vJJ=2__2DoYN0xfPWON2BLMPGKRyKk7a2eyfp019HFSREA@mail.gmail.com>
References: <CAD-_V752mgDPUcy61o9+hf060rBZdPXS_nTLauYf_39y3auFow@mail.gmail.com>
	<CAP7+vJJ=2__2DoYN0xfPWON2BLMPGKRyKk7a2eyfp019HFSREA@mail.gmail.com>
Message-ID: <CAD-_V75OSqhFHjERwp3qbHddkKd6t9AgSm8TC_qdtpj12K7myw@mail.gmail.com>

It might be April Fools, but it's not a fool's concept :-)

Regards,
Itzik Kotler | http://www.ikotler.org

On Sun, Apr 1, 2012 at 7:05 PM, Guido van Rossum <guido at python.org> wrote:

> April fool, right?
>
>
> On Sunday, April 1, 2012, Itzik Kotler wrote:
>
>> Hi All,
>>
>> I'm pleased to announce the first beta release of Pythonect interpreter.
>>
>> <snip>
>
> --
> --Guido van Rossum (python.org/~guido)
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120401/3a75f656/attachment.html>

From ram.rachum at gmail.com  Sun Apr  1 22:25:41 2012
From: ram.rachum at gmail.com (Ram Rachum)
Date: Sun, 1 Apr 2012 13:25:41 -0700 (PDT)
Subject: [Python-ideas] with *context_managers:
Message-ID: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21>

I'd like to be able to do this:

with *context_managers:
    pass # Some suite.


This is useful when you have an unknown number of context managers that you 
want to use. I currently use `contextlib.nested`, but I'd like the *star 
syntax much better.

What do you think?


Ram.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120401/b7e42dfe/attachment.html>

From maxmoroz at gmail.com  Mon Apr  2 10:30:26 2012
From: maxmoroz at gmail.com (Max Moroz)
Date: Mon, 2 Apr 2012 01:30:26 -0700
Subject: [Python-ideas] comparison of operator.itemgetter objects
Message-ID: <CAOVPiMjeBBdwuDyhy7t0zRSNcAS1PdKBXAEQ_UA8-x+BvU_6MA@mail.gmail.com>

Currently, the __eq__() method is not defined on operator.itemgetter, so
non-identical itemgetter objects compare as unequal.

I want to propose defining an __eq__() method that returns the result of
comparing, for equality, the lists of arguments submitted at
initialization. This would make operator.itemgetter('name') compare equal
to operator.itemgetter('name').

The motivation for this is that sorted data structure (such as
blist.sortedset) might want to verify if two arguments (say, lhs and rhs)
of a binary operation (such as union) have the same sort key (a callable
object passed to the constructor of the sorted data structure). Such a
verification is useful because the desirable behavior of such binary
operations is to use the common sort key if the lhs and rhs have the same
sort key; and to raise an exception (or at least use a default value of the
sort key) otherwise.

I think that comparing sort keys for equality works well in many useful
cases:

(a) Named function. These compare as equal only if they are identical. If
lhs and rhs were initialized with distinct named functions, I would argue
that the programmer did not intend them to be compatible for the purpose of
binary operations, even if they happen to be identical in behavior (e.g.,
if both functions return back the argument passed to them). In a
well-designed program, there is no need to duplicate the named function
definition if the two are expected to always have the same behavior.
Therefore, the two distinct functions are intended to be different in
behavior at least in some situations, and therefore the sorted data
structure objects that use them as keys should be considered incompatible.

(b) User-defined callable class. The author of such a class should define
__eq__() in a way that compares as equal callable objects that behave
identically, assuming it's not prohibitively expensive.

Unfortunately, in two cases comparing keys for equality does not work well.

(c) itemgetter. Suppose a programmer passed `itemgetter('name')` as the
sort key argument to the sorted data structure's constructor. The resulting
data structures would seem incompatible for the purposes of binary
operations. This is likely to be confusing and undesirable.

(d) lambda functions. Similarly, suppose a programmer passed `lambda x :
-x` as the sort key argument to the sorted data structure's constructor.
Since two lambda functions are not identical, they would compare as
unequal.

It seems to be very easy to address the undesirable behavior described in
(c): add method __eq__() to operator.itemgetter, which would compare the
list of arguments received at initialization. This would only break code
that relies on an undocumented fact that distinct itemgetter instances
compare as non-equal.

The alternative is for each sorted data structure to handle this comparison
on its own. This is repetitive and error-prone. Furthermore, it is
expensive for an outsider to find out what arguments were given to an
itemgetter at initialization.
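
Until/unless itemgetter grows __eq__, a library could get the effect with
a small wrapper that remembers its constructor arguments. A sketch
(`ComparableItemgetter` is a hypothetical name, not a stdlib class):

```python
import operator

class ComparableItemgetter:
    """Wrap operator.itemgetter, remembering the keys it was built with,
    so that two instances built from the same keys compare equal."""
    def __init__(self, *items):
        self._items = items
        self._getter = operator.itemgetter(*items)

    def __call__(self, obj):
        return self._getter(obj)

    def __eq__(self, other):
        return (isinstance(other, ComparableItemgetter)
                and self._items == other._items)

    def __hash__(self):
        return hash(self._items)

# itemgetter itself falls back to identity comparison:
print(operator.itemgetter('name') == operator.itemgetter('name'))    # False
# the wrapper compares by construction arguments:
print(ComparableItemgetter('name') == ComparableItemgetter('name'))  # True
```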

It is far harder to address the undesirable behavior described in (d). If
it can be addressed at all, it would have to done in the sorted data
structure implementation, since I don't think anyone would want lambda
function comparison behavior to change. So for the purposes of this
discussion, I ignore case (d).

Is this a reasonable idea? Is it useful enough to be considered? Are there
any downsides I didn't think of? Are there any other callables created by
Python's builtin or standard library functions where __eq__ might be useful
to define?

Thanks,

Max
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120402/ca54f51f/attachment.html>

From steve at pearwood.info  Mon Apr  2 11:38:05 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 02 Apr 2012 19:38:05 +1000
Subject: [Python-ideas] comparison of operator.itemgetter objects
In-Reply-To: <CAOVPiMjeBBdwuDyhy7t0zRSNcAS1PdKBXAEQ_UA8-x+BvU_6MA@mail.gmail.com>
References: <CAOVPiMjeBBdwuDyhy7t0zRSNcAS1PdKBXAEQ_UA8-x+BvU_6MA@mail.gmail.com>
Message-ID: <4F79737D.6030508@pearwood.info>

Max Moroz wrote:
> Currently, __eq__() method is not defined in class operator.itemgetter,
> hence non-identical itemgetter objects compare as non-equal.
> 
> I wanted to propose defining __eq__() method that would return the result
> of comparison for equality of the list of arguments submitted at
> initialization. This would make operator.itemgetter('name') compare as
> equal to operator.itemgetter('name').

In general, I think that having equality tests fall back on identity
tests is so rarely what you actually want that sometimes I wonder why we
bother.

In this case I was going to say just write your own subclass, but:

py> from operator import itemgetter
py> class MyItemgetter(itemgetter):
...     pass
...
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: type 'operator.itemgetter' is not an acceptable base type


-- 
Steven



From maxmoroz at gmail.com  Mon Apr  2 12:39:47 2012
From: maxmoroz at gmail.com (Max Moroz)
Date: Mon, 2 Apr 2012 03:39:47 -0700
Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 3
In-Reply-To: <mailman.33.1333360802.21701.python-ideas@python.org>
References: <mailman.33.1333360802.21701.python-ideas@python.org>
Message-ID: <CAOVPiMivSF=SXdvic7o3qK6kAxsSGi=ck5MbhKjw8drcd5XyeQ@mail.gmail.com>

Steven D'Aprano wrote:
> In this case I was going to say just write your own subclass, but:
>
> py> from operator import itemgetter
> py> class MyItemgetter(itemgetter):
> ...     pass
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: type 'operator.itemgetter' is not an acceptable base type

I suspect it's the same reason that bool or generator can't be
subclassed: there is no obvious use case for subclassing it, and an
attempt to do so is more likely to create mistakes than produce
anything useful. I actually agree that itemgetter is a very specific
callable class that is unlikely to be extensible in any meaningful
way. Subclassing to add an __eq__() method seems to be adding what
really belongs in the base class, rather than truly extending the base
class. But that's just my opinion.

Even if it could be done, it's not cheap. I like this recipe on SO (after
a minor fix): http://stackoverflow.com/a/9970405/336527. An alternative
would be to create a dummy class that defines only a __getitem__ method,
and use an instance of that class to collect all the values. Either
approach involves creating a new object, calling the itemgetter,
collecting the values into a set-like data structure, and then comparing
them.
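
The dummy-__getitem__ probe could look something like this (my sketch,
names hypothetical):

```python
import operator

class KeyRecorder:
    """Dummy mapping that records every key it is asked for."""
    def __init__(self):
        self.keys = []

    def __getitem__(self, key):
        self.keys.append(key)
        return None

def itemgetter_args(getter):
    # Call the getter on a recorder and read back which keys it fetched.
    probe = KeyRecorder()
    getter(probe)
    return probe.keys

print(itemgetter_args(operator.itemgetter('a', 'b')))  # ['a', 'b']
```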


From g.rodola at gmail.com  Mon Apr  2 13:40:43 2012
From: g.rodola at gmail.com (Giampaolo Rodolà)
Date: Mon, 2 Apr 2012 13:40:43 +0200
Subject: [Python-ideas] with *context_managers:
In-Reply-To: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21>
References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21>
Message-ID: <CAFYqXL-aLExPa4TayR_t+w3dEFQoXSZp-O1iLy2K9itLWZQh0Q@mail.gmail.com>

On 1 April 2012 at 22:25, Ram Rachum <ram.rachum at gmail.com> wrote:
> I'd like to be able to do this:
>
> with *context_managers:
>     pass # Some suite.
>
>
> This is useful when you have an unknown number of context managers that you
> want to use. I currently use `contextlib.nested`, but I'd like the *star
> syntax much better.
>
> What do you think?
>
>
> Ram.

I believe writing a specialized context manager object which is able
to hold multiple context managers altogether is better than
introducing new syntax for a use case which should be pretty
rare/uncommon.
Also, it's not clear what to expect from "with *context_managers as ctx: ...".
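
Such an object is straightforward to sketch (hypothetical name
`MultiContext`; Python 3.3's contextlib.ExitStack covers the same ground
more robustly):

```python
class MultiContext:
    """Enter several context managers together; exit them in reverse order.

    Minimal sketch: it does not let inner managers suppress exceptions,
    and does not unwind partially-entered managers if an __enter__ fails.
    """
    def __init__(self, *managers):
        self._managers = managers
        self._entered = []

    def __enter__(self):
        for cm in self._managers:
            cm.__enter__()
            self._entered.append(cm)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        while self._entered:
            self._entered.pop().__exit__(exc_type, exc_value, traceback)
        return False
```

Then "with MultiContext(*context_managers): ..." reads much like the
proposed star syntax, without new grammar.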

Regards,

--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/


From p.f.moore at gmail.com  Mon Apr  2 13:44:39 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 2 Apr 2012 12:44:39 +0100
Subject: [Python-ideas] comparison of operator.itemgetter objects
In-Reply-To: <4F79737D.6030508@pearwood.info>
References: <CAOVPiMjeBBdwuDyhy7t0zRSNcAS1PdKBXAEQ_UA8-x+BvU_6MA@mail.gmail.com>
	<4F79737D.6030508@pearwood.info>
Message-ID: <CACac1F-fdn_ciFzKD+VBNcc+ezeAOsJ=mmpp7eCOiZ_NqcAHYw@mail.gmail.com>

On 2 April 2012 10:38, Steven D'Aprano <steve at pearwood.info> wrote:
> TypeError: type 'operator.itemgetter' is not an acceptable base type

Quite apart from the question of whether you might want to subclass
operator.itemgetter, that's a really rubbish error message. Why is it
not acceptable?

Searching the source, it appears that a type can say it can't be
subclassed by leaving the Py_TPFLAGS_BASETYPE flag out of its tp_flags, so
maybe a better error would be "the designer of type '%s' has disallowed
subclassing". Still doesn't say why they did, but at least it gives a hint
as to what's going on...
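
For what it's worth, the flag is visible from Python, so you can check
whether a type is subclassable without trying (sketch; the bit value is
copied from CPython's Include/object.h and is an implementation detail):

```python
import operator

Py_TPFLAGS_BASETYPE = 1 << 10  # from CPython's Include/object.h

def is_subclassable(tp):
    # A type allows subclassing only when Py_TPFLAGS_BASETYPE is set.
    return bool(tp.__flags__ & Py_TPFLAGS_BASETYPE)

print(is_subclassable(dict))                 # True
print(is_subclassable(operator.itemgetter))  # False
```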

Paul.


From fuzzyman at gmail.com  Mon Apr  2 13:53:17 2012
From: fuzzyman at gmail.com (Michael Foord)
Date: Mon, 2 Apr 2012 12:53:17 +0100
Subject: [Python-ideas] Thread stopping
In-Reply-To: <CAF-Rda-MSyxt69NhXiPJpm09Uv3iBkg0CSMLeTUTcrfLpAbH9g@mail.gmail.com>
References: <CAL3CFcX3L8ODEEY5s8aLJ1rO1pA4tJdRP6fFBra9pGAHOpaWZA@mail.gmail.com>
	<CAF-Rda-MSyxt69NhXiPJpm09Uv3iBkg0CSMLeTUTcrfLpAbH9g@mail.gmail.com>
Message-ID: <CAKCKLWz4kDKkK=5i98pRanWciJaFv=KnmuLwSMJ4cpfb7cgqFg@mail.gmail.com>

On 30 March 2012 05:53, Eli Bendersky <eliben at gmail.com> wrote:

> On Thu, Mar 29, 2012 at 21:48, Andrew Svetlov <andrew.svetlov at gmail.com>wrote:
>
>> I propose to add Thread.interrupt() function.
>>
> <snip>
>
> Could you specify some use cases where you believe this would be better
> than explicitly asking the thread to stop?
>


What do you mean by "asking the thread to stop"? What is proposed is
precisely that. The usual suggestion is a flag, and to have the thread
check whether it has been "asked to stop". This is only suitable for
fine-grained tasks (e.g. computationally bound loops) where there is a
suitable place to check. Any coarse-grained task, or code with multiple
loops for example, may not have any place to check - or may need checking
code in *many* places.
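
For contrast, here is the flag pattern in a minimal threading.Event
sketch - fine for a tight loop, useless for arbitrary user code with no
place to poll:

```python
import threading
import time

def worker(stop):
    # Fine-grained loop: each iteration is a natural place to poll the flag.
    while not stop.is_set():
        time.sleep(0.01)  # stand-in for one unit of work

stop = threading.Event()
t = threading.Thread(target=worker, args=(stop,))
t.start()
stop.set()   # "ask the thread to stop"
t.join()
```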

One concrete example - at Resolver Systems we implemented a spreadsheet
application where multiple documents could be calculating simultaneously in
separate threads. (This was in IronPython with no GIL and true free
threading.) As we were executing *user code* there was no way for the code
to check if it had been requested to stop. (Unless we transformed the code
and annotated it with checks everywhere.) With .NET threads we could simply
request the thread to exit (if the user wanted to halt a calculation - for
example because they had updated the code / spreadsheet) and it worked very
well.

Thread interruption is a useful feature.

Michael



>
> Eli
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>


-- 

http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120402/f7e137b9/attachment.html>

From fuzzyman at gmail.com  Mon Apr  2 13:56:05 2012
From: fuzzyman at gmail.com (Michael Foord)
Date: Mon, 2 Apr 2012 12:56:05 +0100
Subject: [Python-ideas] Thread stopping
In-Reply-To: <CAGE7PN+p1KQ_q=+8kxXuHaGAZ5SDT4uVex_R3T_+ANXidPNrcA@mail.gmail.com>
References: <CAL3CFcX3L8ODEEY5s8aLJ1rO1pA4tJdRP6fFBra9pGAHOpaWZA@mail.gmail.com>
	<jl40od$6lj$1@dough.gmane.org>
	<CAKCKLWxV1K0LhZdoOOk+5xUOrnqtfQ09jQTJFYne3G44MbEOsQ@mail.gmail.com>
	<CAGE7PN+p1KQ_q=+8kxXuHaGAZ5SDT4uVex_R3T_+ANXidPNrcA@mail.gmail.com>
Message-ID: <CAKCKLWxxp95fdTgOxaYNP5V0uM+p=S9uUooCN41HSLhqpvpd2w@mail.gmail.com>

On 1 April 2012 04:50, Gregory P. Smith <greg at krypto.org> wrote:

>
> On Sat, Mar 31, 2012 at 3:33 AM, Michael Foord <fuzzyman at gmail.com> wrote:
>
>> An "uninterruptable context manager" would be nice - but would probably
>> need extra vm support and isn't essential.
>
>
> I'm not so sure that would need much vm support for an uninterruptable
> context manager, at least in CPython 3.2 and 3.3:
>
> Isn't something like this about it:
>
> class UninterruptableContext:
>   def __enter__(self, ...):
>     self._orig_switchinterval = sys.getswitchinterval()
>     sys.setswitchinterval(1000000000)   # 31 years with no Python thread switching
>
>   def __exit__(self, ...):
>     sys.setswitchinterval(self._orig_switchinterval)
>
> <snip>
>
>
Hello Gregory,

I think you misunderstand what we mean by uninterruptable. It has nothing
to do with *thread switching*; the interruption we are talking about is
the proposed new feature where threads can be terminated by raising a
ThreadInterrupt exception inside them. An uninterruptable context manager
(which I'm not convinced is needed or easy to implement) simply means that
a ThreadInterrupt won't be raised whilst code inside the context manager
is executing. It *does not* mean that execution can't switch to another
thread via the normal means.

Michael



> -gps
>



-- 

http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120402/a3e654cc/attachment.html>

From fuzzyman at gmail.com  Mon Apr  2 14:00:06 2012
From: fuzzyman at gmail.com (Michael Foord)
Date: Mon, 2 Apr 2012 13:00:06 +0100
Subject: [Python-ideas] with *context_managers:
In-Reply-To: <CAFYqXL-aLExPa4TayR_t+w3dEFQoXSZp-O1iLy2K9itLWZQh0Q@mail.gmail.com>
References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21>
	<CAFYqXL-aLExPa4TayR_t+w3dEFQoXSZp-O1iLy2K9itLWZQh0Q@mail.gmail.com>
Message-ID: <CAKCKLWzeqZj9fyMKu2_AkA+eTHDS1syT2q9kVbyaR2r9vxu5EA@mail.gmail.com>

On 2 April 2012 12:40, Giampaolo Rodolà <g.rodola at gmail.com> wrote:

> On 1 April 2012 22:25, Ram Rachum <ram.rachum at gmail.com> wrote:
> > I'd like to be able to do this:
> >
> > with *context_managers:
> >     pass # Some suite.
> >
> >
> > This is useful when you have an unknown number of context managers that
> you
> > want to use. I currently use `contextlib.nested`, but I'd like the *star
> > syntax much better.
> >
> > What do you think?
> >
> >
> > Ram.
>
> I believe writing a specialized context manager object which is able
> to hold multiple context managers altogether is better than
> introducing new syntax for such a use case, which should be pretty
> rare/uncommon.
>


There's now an example of a need for this in the standard library.
mock.patch collects together an arbitrary number of context managers that
need to be entered sequentially (together). As there is no replacement for
contextlib.nested, it has custom code calling __enter__ and __exit__ on all
the context managers and keeping track of which ones have been entered
successfully (because if there is an exception whilst entering one, only
those that have *already* been entered should have __exit__ called).
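The pattern described above can be sketched like this (not mock's actual
implementation, just a minimal illustration of "enter each manager in turn,
and on failure exit only the ones already entered, in reverse order"):

```python
import sys

def enter_all(context_managers):
    """Enter each manager in order; on failure, call __exit__ only on
    the managers that were already entered, in reverse order."""
    entered = []
    try:
        for cm in context_managers:
            entered.append((cm, cm.__enter__()))
    except BaseException:
        exc_info = sys.exc_info()
        for cm, _ in reversed(entered):
            try:
                cm.__exit__(*exc_info)
            except Exception:
                pass  # keep unwinding even if an __exit__ fails
        raise
    return [result for _, result in entered]
```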


> Also, it's not clear what to expect from "with *context_managers as ctx:
> ...".
>
>
It is clear. It should be a tuple of results (what else *could* it be?).

Michael


> Regards,
>
> --- Giampaolo
> http://code.google.com/p/pyftpdlib/
> http://code.google.com/p/psutil/
> http://code.google.com/p/pysendfile/
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 

http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120402/591d6f75/attachment.html>

From solipsis at pitrou.net  Mon Apr  2 14:05:54 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 2 Apr 2012 14:05:54 +0200
Subject: [Python-ideas] with *context_managers:
References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21>
	<CAFYqXL-aLExPa4TayR_t+w3dEFQoXSZp-O1iLy2K9itLWZQh0Q@mail.gmail.com>
	<CAKCKLWzeqZj9fyMKu2_AkA+eTHDS1syT2q9kVbyaR2r9vxu5EA@mail.gmail.com>
Message-ID: <20120402140554.75f61281@pitrou.net>

On Mon, 2 Apr 2012 13:00:06 +0100
Michael Foord <fuzzyman at gmail.com> wrote:
> 
> > Also, it's not clear what to expect from "with *context_managers as ctx:
> > ...".
> >
> >
> It is clear. It should be a tuple of results (what else *could* it be?).

A timedelta, obviously.

Regards

Antoine.




From eric at trueblade.com  Mon Apr  2 14:24:32 2012
From: eric at trueblade.com (Eric V. Smith)
Date: Mon, 02 Apr 2012 08:24:32 -0400
Subject: [Python-ideas] with *context_managers:
In-Reply-To: <CAFYqXL-aLExPa4TayR_t+w3dEFQoXSZp-O1iLy2K9itLWZQh0Q@mail.gmail.com>
References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21>
	<CAFYqXL-aLExPa4TayR_t+w3dEFQoXSZp-O1iLy2K9itLWZQh0Q@mail.gmail.com>
Message-ID: <4F799A80.6090604@trueblade.com>

On 4/2/2012 7:40 AM, Giampaolo Rodolà wrote:
> On 1 April 2012 22:25, Ram Rachum <ram.rachum at gmail.com> wrote:
>> I'd like to be able to do this:
>>
>> with *context_managers:
>>     pass # Some suite.
>>
>>
>> This is useful when you have an unknown number of context managers that you
>> want to use. I currently use `contextlib.nested`, but I'd like the *star
>> syntax much better.
>>
>> What do you think?
>>
>>
>> Ram.
> 
> I believe writing a specialized context manager object which is able
> to hold multiple context managers altogether is better than
> introducing new syntax for such a use case, which should be pretty
> rare/uncommon.
> Also, it's not clear what to expect from "with *context_managers as ctx: ...".

See http://bugs.python.org/issue13585



From guido at python.org  Mon Apr  2 16:54:22 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 2 Apr 2012 07:54:22 -0700
Subject: [Python-ideas] Thread stopping
In-Reply-To: <CAKCKLWxxp95fdTgOxaYNP5V0uM+p=S9uUooCN41HSLhqpvpd2w@mail.gmail.com>
References: <CAL3CFcX3L8ODEEY5s8aLJ1rO1pA4tJdRP6fFBra9pGAHOpaWZA@mail.gmail.com>
	<jl40od$6lj$1@dough.gmane.org>
	<CAKCKLWxV1K0LhZdoOOk+5xUOrnqtfQ09jQTJFYne3G44MbEOsQ@mail.gmail.com>
	<CAGE7PN+p1KQ_q=+8kxXuHaGAZ5SDT4uVex_R3T_+ANXidPNrcA@mail.gmail.com>
	<CAKCKLWxxp95fdTgOxaYNP5V0uM+p=S9uUooCN41HSLhqpvpd2w@mail.gmail.com>
Message-ID: <CAP7+vJKnHLA1dXSRZv+V=qRzgBCYKLM0hwGt2SX1H8e-sWdOXg@mail.gmail.com>

Perhaps off-topic, but the one thing that isn't easy to do is stopping
a thread that's blocked (perhaps forever) in some blocking operation
-- e.g. acquiring a lock that's been forgotten or a read on a
malfunctioning socket (it happens!). Having to code those operations
consistently with timeouts is a pain, so if there was a way to make
those system calls return an error I'd really like that.

I'm not super worried about skipping finally-clauses, we can figure
out a hack for that.

-- 
--Guido van Rossum (python.org/~guido)


From andrew.svetlov at gmail.com  Mon Apr  2 17:15:08 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Mon, 2 Apr 2012 18:15:08 +0300
Subject: [Python-ideas] Thread stopping
In-Reply-To: <CAP7+vJKnHLA1dXSRZv+V=qRzgBCYKLM0hwGt2SX1H8e-sWdOXg@mail.gmail.com>
References: <CAL3CFcX3L8ODEEY5s8aLJ1rO1pA4tJdRP6fFBra9pGAHOpaWZA@mail.gmail.com>
	<jl40od$6lj$1@dough.gmane.org>
	<CAKCKLWxV1K0LhZdoOOk+5xUOrnqtfQ09jQTJFYne3G44MbEOsQ@mail.gmail.com>
	<CAGE7PN+p1KQ_q=+8kxXuHaGAZ5SDT4uVex_R3T_+ANXidPNrcA@mail.gmail.com>
	<CAKCKLWxxp95fdTgOxaYNP5V0uM+p=S9uUooCN41HSLhqpvpd2w@mail.gmail.com>
	<CAP7+vJKnHLA1dXSRZv+V=qRzgBCYKLM0hwGt2SX1H8e-sWdOXg@mail.gmail.com>
Message-ID: <CAL3CFcWJGPyDnruHM9yy+aGPtEsRbAOTC8SnBjgqCeA5QjrJ8A@mail.gmail.com>

On Mon, Apr 2, 2012 at 5:54 PM, Guido van Rossum <guido at python.org> wrote:
> Perhaps off-topic, but the one thing that isn't easy to do is stopping
> a thread that's blocked (perhaps forever) in some blocking operation
> -- e.g. acquiring a lock that's been forgotten or a read on a
> malfunctioning socket (it happens!). Having to code those operations
> consistently with timeouts is a pain, so if there was a way to make
> those system calls return an error I'd really like that.
>
> I'm not super worried about skipping finally-clauses, we can figure
> out a hack for that.
>

Python already has support for processing EINTR in threading
synchronization objects. It's done to switch the GIL to the main thread if
a signal is received while the GIL is acquired by some background thread.
That mechanism could easily be extended to the thread interruption case, I
think.

Windows is also not a problem.


From sven at marnach.net  Mon Apr  2 17:34:52 2012
From: sven at marnach.net (Sven Marnach)
Date: Mon, 2 Apr 2012 16:34:52 +0100
Subject: [Python-ideas] with *context_managers:
In-Reply-To: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21>
References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21>
Message-ID: <20120402153451.GA2470@bagheera>

Ram Rachum schrieb am Sun, 01. Apr 2012, um 13:25:41 -0700:
> I'd like to be able to do this:
> 
> with *context_managers:
>     pass # Some suite.
> 
> This is useful when you have an unknown number of context managers that you
> want to use. I currently use `contextlib.nested`, but I'd like the *star
> syntax much better.

'contextlib.nested()' is broken and not available in Python 3.x (the
language this list is about).  The only replacement so far is

    with CM1() as cm1, CM2() as cm2:
        ...

which only works for a fixed number of context managers.  And there is
a class 'ContextStack' in Nick Coghlan's 'contextlib2' library [1],
which might be included in Python 3.3.  With this class, you could
write your code as

    with ContextStack() as stack:
        for cm in context_managers:
            stack.enter_context(cm)

This still leaves the question of whether your proposed syntax would be
preferable, also with regard to issue 2292 [2].

[1]: http://readthedocs.org/docs/contextlib2/en/latest/#contextlib2.ContextStack
[2]: http://bugs.python.org/issue2292

Cheers,
    Sven


From paul at colomiets.name  Mon Apr  2 21:43:52 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Mon, 2 Apr 2012 22:43:52 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
Message-ID: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>

Hi,

I'd like to propose a way to protect `finally` clauses from
interruptions (either by KeyboardInterrupt or by timeout, or any other
way).

I think frames may be extended to have an `f_in_finally` attribute (or
pick a better name). Internally it should probably be implemented as a
counter of nested finally clauses, but the interface should probably
expose only a boolean attribute. For the `__exit__` method some flag in
`co_flags` should be introduced, which says that for the whole function
`f_in_finally` should be true.

Having this attribute, you can then inspect the stack and check whether
it's safe to interrupt the thread or not. A coroutine library which
interrupts by timeout can then sleep a bit and try again (probably for a
finite number of retries). For a signal handler there are also several
options for waiting until the thread escapes the finally clause: use
another thread, use an alarm signal, use sys.settrace, or exit only
inside the main loop.

To be clear: I do not propose to change the default SIGINT behavior, only
to implement a frame flag, and let library developers experiment with
the rest.
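The stack inspection this would enable could look roughly like this (a
sketch only: `f_in_finally` is the attribute proposed above and does not
exist yet, so the getattr() default makes the sketch runnable on today's
interpreters, where it always reports the thread as interruptible):

```python
import sys
import threading

def safe_to_interrupt(thread):
    """Return True if no frame on the thread's stack is inside a
    finally clause (per the proposed f_in_finally attribute)."""
    frame = sys._current_frames().get(thread.ident)
    while frame is not None:
        # Fall back to 0 (not in finally) where the attribute is absent.
        if getattr(frame, 'f_in_finally', 0):
            return False
        frame = frame.f_back
    return True

# An interrupting library would sleep and retry while this returns False.
```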

--
Paul


From yselivanov.ml at gmail.com  Mon Apr  2 22:37:35 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Mon, 2 Apr 2012 16:37:35 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
Message-ID: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>

On 2012-04-02, at 3:43 PM, Paul Colomiets wrote:

> Hi,
> 
> I'd like to propose a way to protect `finally` clauses from
> interruptions (either by KeyboardInterrupt or by timeout, or any other
> way).
> 
> I think frames may be extended to have an `f_in_finally` attribute (or
> pick a better name). Internally it should probably be implemented as a
> counter of nested finally clauses, but the interface should probably
> expose only a boolean attribute. For the `__exit__` method some flag in
> `co_flags` should be introduced, which says that for the whole function
> `f_in_finally` should be true.

Paul,

First of all sorry for not replying to your previous email in the thread.  

I've been thinking about a mechanism that would be useful both for thread
interruption and for the new emerging coroutine libraries.  And I think
that we need to draft a PEP.  Your current approach with only an
'f_in_finally' flag is a half measure, as you will have to somehow
monitor frame execution.

I think a better solution would be to:

1. Implement a mechanism to throw exceptions in running threads.  It should
be possible to wake up a thread if it waits on a lock, or any other syscall.

2. Add 'f_in_finally' counter, as you proposed.

3. Either add a special base exception that can be thrown in a currently
executing frame to interrupt it, or add a special method to the frame
object, 'f_interrupt()'. Once an attempt is made to interrupt a frame, it
checks its 'f_in_finally' counter.  If it is 0, throw the exception
immediately; if not, wait till the counter drops back to 0 and then
throw the exception.

This approach would give you enough flexibility to cover the following 
cases:

1. Thread interruption
2. Greenlet-based coroutines (throw exception in your event hub)
3. Generator-based coroutines

Plus, proper execution of 'finally' statements will be guaranteed by the
interpreter.

-
Yury


From paul at colomiets.name  Mon Apr  2 22:49:21 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Mon, 2 Apr 2012 23:49:21 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
Message-ID: <CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>

Hi Yury,

On Mon, Apr 2, 2012 at 11:37 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> 1. Implement a mechanism to throw exceptions in running threads.  It should
> be possible to wake up a thread if it waits on a lock, or any other syscall.
>

It's complex, because if a thread waits on a lock you can't determine
whether it was interrupted after acquiring the lock or before. E.g. it's
common to write:

l.lock()
try:
    ...
finally:
    l.unlock()

which will break if you are interrupted just after the lock is acquired.
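Spelled out with the stdlib threading.Lock API (acquire/release rather than
the lock/unlock pseudocode names above), the unsafe window is the gap
between the acquire and entering the try block:

```python
import threading

lock = threading.Lock()

lock.acquire()
# <-- an asynchronous interruption delivered at this point would leave
#     the lock held forever: the try/finally below was never entered,
#     so the release in the finally clause never runs.
try:
    pass  # critical section
finally:
    lock.release()
```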

> 2. Add 'f_in_finally' counter, as you proposed.
>

Ack

>> 3. Either add a special base exception that can be thrown in a currently
>> executing frame to interrupt it, or add a special method to the frame
>> object, 'f_interrupt()'. Once an attempt is made to interrupt a frame, it
>> checks its 'f_in_finally' counter.  If it is 0, throw the exception
>> immediately; if not, wait till the counter drops back to 0 and then
>> throw the exception.
>

I'm not sure how that is supposed to work. If it's a coroutine it may
yield while in finally, and you want it to be interrupted only when it
exits the finally.

-- 
Paul


From ncoghlan at gmail.com  Mon Apr  2 22:52:01 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 3 Apr 2012 06:52:01 +1000
Subject: [Python-ideas] with *context_managers:
In-Reply-To: <20120402153451.GA2470@bagheera>
References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21>
	<20120402153451.GA2470@bagheera>
Message-ID: <CADiSq7d2i9kuFQq6uuZSjYjwPPR27PLjkpdwQXG07F8cO1Ckew@mail.gmail.com>

On Tue, Apr 3, 2012 at 1:34 AM, Sven Marnach <sven at marnach.net> wrote:
> which only works for a fixed number of context managers.  And there is
> a class 'ContextStack' in Nick Coghlan's 'contextlib2' library [1],
> which might be included in Python 3.3.  With this class, you could
> write your code as
>
>     with ContextStack() as stack:
>         for cm in context_managers:
>             stack.enter_context(cm)
>
> This still leaves the question of whether your proposed syntax would be
> preferable, also with regard to issue 2292 [2].

Both "with *(iterable)" and "for cm in iterable: stack.enter(cm)" are
flawed in exactly the same way that contextlib.nested() is flawed:
they encourage creating the iterable of context managers first, which
means that inner __init__ methods are not covered by outer __exit__
methods.

This breaks as soon as you have resources (such as files) where the
acquire/release resource management pairing is actually
__init__/__exit__ with __enter__ just returning self rather than
acquiring the resource. If the iterable of context managers is created
first, then the outer resources *will be leaked* if any of the inner
constructors fail. The only way to write code that handles an
arbitrary number of arbitrary context managers in a robust fashion is
to ensure the initialisation steps are also covered by the outer
context managers:

    with CallbackStack() as stack:
        for make_cm in cm_factories:
            stack.enter(make_cm())

(Note that I'm not particularly happy with the class and method names
for contextlib2.ContextStack, and plan to redesign it a bit before
adding it to the stdlib module:
https://bitbucket.org/ncoghlan/contextlib2/issue/8/rename-contextstack-to-callbackstack-and)

The only time you can get away with a contextlib.nested() style API
where the iterable of context managers is created first is when you
*know* that all of the context managers involved do their resource
acquisition in __enter__ rather than __init__. In the general case,
though, any such API is broken because it doesn't reliably clean up
files and similar acquired-on-initialisation resources.
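A minimal illustration of this failure mode, using a hypothetical Resource
that acquires in __init__ the way open() does:

```python
class Resource:
    open_count = 0  # resources acquired but not yet released

    def __init__(self, fail=False):
        if fail:
            raise RuntimeError("acquisition failed")
        Resource.open_count += 1  # acquired on construction, like open()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        Resource.open_count -= 1

# Building the iterable of managers first leaks the outer resource:
# nothing has registered an __exit__ for it when the second __init__ raises.
try:
    cms = [Resource(), Resource(fail=True)]
except RuntimeError:
    pass
assert Resource.open_count == 1  # the first Resource leaked

# Deferring construction behind factories lets a stack clean up as it goes:
Resource.open_count = 0
factories = [Resource, lambda: Resource(fail=True)]
entered = []
try:
    for make_cm in factories:
        cm = make_cm()      # construction now happens under our control
        entered.append(cm)
        cm.__enter__()
except RuntimeError:
    for cm in reversed(entered):
        cm.__exit__(None, None, None)
assert Resource.open_count == 0  # nothing leaked
```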

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From yselivanov.ml at gmail.com  Mon Apr  2 23:15:38 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Mon, 2 Apr 2012 17:15:38 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
Message-ID: <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>

On 2012-04-02, at 4:49 PM, Paul Colomiets wrote:
> It's complex, because if thread waits on a lock you can't determine if it's
> interrupted after lock or before. E.g. it's common to write:
> 
> l.lock()
> try:
>    ...
> finally:
>    l.unlock()
> 
> Which will break if you interrupted just after lock is acquired.

Yes, that's a good question.  However, I fail to see how just adding
'f_in_finally' solves the problem.

>> 3. Either add a special base exception that can be thrown in a currently
>> executing frame to interrupt it, or add a special method to the frame
>> object, 'f_interrupt()'. Once an attempt is made to interrupt a frame, it
>> checks its 'f_in_finally' counter.  If it is 0, throw the exception
>> immediately; if not, wait till the counter drops back to 0 and then
>> throw the exception.
>> 
> 
> Not sure how it supposed to work. If it's coroutine it may yield
> while in finally, and you want it be interrupted only when it exits from
> finally.

And what's the problem with that?  It should be able to yield in its 
finally freely.

@coroutine
def read_data(connection):
  try:
    yield connection.recv()
  finally:
    yield connection.close()
  print("this shouldn't be printed if a timeout occurs")

yield read_data().with_timeout(0.1)

In the above example, if 'connection.recv()' takes longer than 0.1s to
execute, the scheduler (trampoline) should interrupt the coroutine,
the 'connection.close()' line will be executed, and once the connection
is closed, it should stop the coroutine immediately.

As of now, if you throw an exception while the generator is in its 'try'
block, everything will work as I explained.  The interpreter will
execute the 'finally' block, and propagate the exception at the end
of it.

However, if you throw an exception while the generator is in its 'finally'
block (!), then your coroutine will be aborted too early.  With your
'f_in_finally' flag, the scheduler simply won't try to interrupt the
coroutine, but then the 'print(...)' line will be executed (!!)
(and it really shouldn't be).  So we need to shift the control of when a
frame is best interrupted to the interpreter, not the user code.
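Today's generator semantics bear this out. A minimal sketch, with a plain
generator standing in for the coroutine:

```python
def coro():
    try:
        yield "in try"
    finally:
        yield "cleaning up"  # a generator may yield inside finally
    print("after finally")   # analogous to the print(...) above

g = coro()
assert next(g) == "in try"  # suspended inside the try block

# Throwing while suspended in 'try' runs the finally block, which yields:
assert g.throw(ValueError()) == "cleaning up"

# Throwing again, while suspended *inside* the finally block, aborts the
# cleanup early: "after finally" is never printed.
try:
    g.throw(ValueError())
except ValueError:
    pass
```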

-
Yury


From yselivanov.ml at gmail.com  Mon Apr  2 23:26:40 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Mon, 2 Apr 2012 17:26:40 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
Message-ID: <E824EAA6-26E7-4415-ABDE-4B0FDDDEF682@gmail.com>

On 2012-04-02, at 4:49 PM, Paul Colomiets wrote:
> l.lock()
> try:
>    ...
> finally:
>    l.unlock()
> 
> Which will break if you interrupted just after lock is acquired.

I guess the best way to solve this puzzle is to track all locks that
the thread acquires and release them in case of forced interruption.

-
Yury


From paul at colomiets.name  Tue Apr  3 00:23:31 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Tue, 3 Apr 2012 01:23:31 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
Message-ID: <CAA0gF6qqpO0Z=DdpfCz699NJhPFPKibn-3uEv5-AqCcwSKfC=A@mail.gmail.com>

Hi Yury,

>On 2012-04-02, at 4:49 PM, Paul Colomiets wrote:
>> l.lock()
>> try:
>>    ...
>> finally:
>>    l.unlock()
>>
>> Which will break if you interrupted just after lock is acquired.
>
>I guess the best way to solve this puzzle, is to track all locks that
>the thread acquires and release them in case of forced interruption.

Same with open files, and with all other kinds of contexts. I'd go
the route of making __enter__ also uninterruptible (and putting the
timeout inside the lock itself).

On Tue, Apr 3, 2012 at 12:15 AM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>>> 3. Either add a special base exception that can be thrown in a currently
>>> executing frame to interrupt it, or add a special method to the frame
>>> object, 'f_interrupt()'. Once an attempt is made to interrupt a frame, it
>>> checks its 'f_in_finally' counter.  If it is 0, throw the exception
>>> immediately; if not, wait till the counter drops back to 0 and then
>>> throw the exception.
>>>
>>
>> Not sure how it supposed to work. If it's coroutine it may yield
>> while in finally, and you want it be interrupted only when it exits from
>> finally.
>
> And what's the problem with that?  It should be able to yield in its
> finally freely.
>
> @coroutine
> def read_data(connection):
>   try:
>     yield connection.recv()
>   finally:
>     yield connection.close()
>   print("this shouldn't be printed if a timeout occurs")
>
> yield read_data().with_timeout(0.1)
>
> In the above example, if 'connection.recv()' takes longer than 0.1s to
> execute, the scheduler (trampoline) should interrupt the coroutine,
> the 'connection.close()' line will be executed, and once the connection
> is closed, it should stop the coroutine immediately.
>
> As of now, if you throw an exception while the generator is in its 'try'
> block, everything will work as I explained.  The interpreter will
> execute the 'finally' block, and propagate the exception at the end
> of it.
>
> However, if you throw an exception while the generator is in its 'finally'
> block (!), then your coroutine will be aborted too early.  With your
> 'f_in_finally' flag, the scheduler simply won't try to interrupt the
> coroutine, but then the 'print(...)' line will be executed (!!)
> (and it really shouldn't be).  So we need to shift the control of when a
> frame is best interrupted to the interpreter, not the user code.

You've probably not explained your proposal well.

If I call frame.f_interrupt(), what should it do? Return anything
yielded from a generator? And how are you supposed to continue
generator iteration in this case? Or are you going to iterate the
result of `f_interrupt()`? What should it do if it's not the topmost
frame?

In all my use cases it doesn't matter whether the "print" is executed,
just like it doesn't matter whether the timeout occurred after 1000 ms
or after 1001 or 1010 ms, or even after 1500 ms as it actually
could. So sleeping a bit and trying again is OK. You need
to make all __exit__ and finally clauses fast, but that's usually
not a problem.
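The sleep-and-retry approach could be sketched like this (a sketch only:
`f_in_finally` is the proposed attribute, so getattr() falls back to 0 on
today's interpreters, and the `throw` callback stands in for whatever
mechanism the trampoline uses to deliver the timeout exception):

```python
import time

def interrupt_when_safe(frame, throw, retries=100, delay=0.01):
    """Deliver a timeout exception only when the frame is not inside a
    finally clause; otherwise sleep a bit and retry, a finite number
    of times."""
    for _ in range(retries):
        if not getattr(frame, 'f_in_finally', 0):
            throw(TimeoutError())
            return True
        time.sleep(delay)  # __exit__/finally clauses are expected to be fast
    return False  # gave up: the frame never left its finally clause
```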

--
Paul


From paul at colomiets.name  Tue Apr  3 00:24:24 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Tue, 3 Apr 2012 01:24:24 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
Message-ID: <CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>

Hi Yury,

On Tue, Apr 3, 2012 at 1:20 AM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> On 2012-04-02, at 6:00 PM, Paul Colomiets wrote:
>
>> Hi Yury,
>>
>>> On 2012-04-02, at 4:49 PM, Paul Colomiets wrote:
>>>> l.lock()
>>>> try:
>>>>   ...
>>>> finally:
>>>>   l.unlock()
>>>>
>>>> Which will break if you interrupted just after lock is acquired.
>>>
>>> I guess the best way to solve this puzzle, is to track all locks that
>>> the thread acquires and release them in case of forced interruption.
>>
>> Same with open files, and with all other kinds of contexts. I'd go
>> he route of making __enter__ also uninterruptable (and make timeout
>> inside a lock itself).
>
>
> I still don't get how exactly do you propose to handle sudden thread
> interruption in your own example:
>
> l.lock()
> # (!) the thread may be interrupted at this point
> try:
>   ...
> finally:
>   l.unlock()
>
> You don't have a 'with' statement here.
>

By wrapping lock into a context manager.

-- 
Paul


From yselivanov.ml at gmail.com  Tue Apr  3 00:28:11 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Mon, 2 Apr 2012 18:28:11 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
Message-ID: <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>

On 2012-04-02, at 6:24 PM, Paul Colomiets wrote:
>> I still don't get how exactly do you propose to handle sudden thread
>> interruption in your own example:
>> 
>> l.lock()
>> # (!) the thread may be interrupted at this point
>> try:
>>   ...
>> finally:
>>   l.unlock()
>> 
>> You don't have a 'with' statement here.
>> 
> 
> By wrapping lock into a context manager.

How's that going to work for tons of existing code?

-
Yury


From paul at colomiets.name  Tue Apr  3 00:33:02 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Tue, 3 Apr 2012 01:33:02 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
Message-ID: <CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>

Hi Yury,

On Tue, Apr 3, 2012 at 1:28 AM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> On 2012-04-02, at 6:24 PM, Paul Colomiets wrote:
>>> I still don't get how exactly do you propose to handle sudden thread
>>> interruption in your own example:
>>>
>>> l.lock()
>>> # (!) the thread may be interrupted at this point
>>> try:
>>>   ...
>>> finally:
>>>   l.unlock()
>>>
>>> You don't have a 'with' statement here.
>>>
>>
>> By wrapping lock into a context manager.
>
> How's that going to work for tons of existing code?
>

It isn't. But it doesn't break code any more than it is
already broken. Your proposal doesn't solve any problems
with existing code either.

But anyway, I don't propose any new ways to interrupt
code; I only propose a way to inform the trampoline when it's
unsafe to interrupt code.

-- 
Paul


From yselivanov.ml at gmail.com  Tue Apr  3 01:04:22 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Mon, 2 Apr 2012 19:04:22 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
Message-ID: <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>

On 2012-04-02, at 6:33 PM, Paul Colomiets wrote:
>> How's that going to work for tons of existing code?
>> 
> 
> It isn't. But it doesn't break code any more than it
> already is. Your proposal doesn't solve any problems
> with existing code too.
> 
> But anyway I don't propose any new ways to interrupt
> code I only propose a way to inform trampoline when it's
> unsafe to interrupt code.

Well, if we're thinking only about interrupting coroutines 
(not threads), then it's going to work, yes.

My initial desire to use a special exception for the purpose, 
was because of:

- it's easier to throw an exception in the thread (the C-API
function already exists, though we'd need to think about the
consequences of using it)

- PyPy disables the JIT when working with frames (if I recall
correctly).  That's why I wanted 'f_in_finally' to be an
implementation detail of CPython, hidden from user code.
Perhaps PyPy could implement the handling of our special
exception in a more efficient way, without the side effect
of disabling the JIT.

What do you think?

-
Yury


From greg at krypto.org  Tue Apr  3 01:10:22 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Mon, 2 Apr 2012 16:10:22 -0700
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
Message-ID: <CAGE7PNLoxN_uXm1t55crb+o7fffW75tRyU-PG4mq=RVvDvz=oA@mail.gmail.com>

On Mon, Apr 2, 2012 at 3:28 PM, Yury Selivanov <yselivanov.ml at gmail.com>wrote:

> On 2012-04-02, at 6:24 PM, Paul Colomiets wrote:
> >> I still don't get how exactly do you propose to handle sudden thread
> >> interruption in your own example:
> >>
> >> l.lock()
> >> # (!) the thread may be interrupted at this point
> >> try:
> >>   ...
> >> finally:
> >>   l.unlock()
> >>
> >> You don't have a 'with' statement here.
> >>
> >
> > By wrapping lock into a context manager.
>
> How's that going to work for tons of existing code?
>

A context manager doesn't solve this interruption "race condition" issue
anyway.

If the __enter__ method is interrupted, it won't have returned a context,
and thus __exit__ will never be called.
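The race is easy to demonstrate; here is a tiny sketch where the
asynchronous interruption is simulated by raising directly inside
__enter__ (the class name is made up for illustration):

```python
class Resource:
    def __enter__(self):
        # simulate an asynchronous interruption arriving mid-__enter__
        raise KeyboardInterrupt

    def __exit__(self, *exc_info):
        Resource.exit_called = True
        return False

Resource.exit_called = False
try:
    with Resource():
        pass
except KeyboardInterrupt:
    print("interrupted before __enter__ returned")

# __exit__ was never invoked, because the with statement never
# received a context object from __enter__
print("__exit__ called:", Resource.exit_called)
```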

-gps


> -
> Yury
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120402/a40c5e99/attachment.html>

From yselivanov.ml at gmail.com  Tue Apr  3 01:13:47 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Mon, 2 Apr 2012 19:13:47 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAGE7PNLoxN_uXm1t55crb+o7fffW75tRyU-PG4mq=RVvDvz=oA@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAGE7PNLoxN_uXm1t55crb+o7fffW75tRyU-PG4mq=RVvDvz=oA@mail.gmail.com>
Message-ID: <15E33FC2-CB5D-4713-9F54-C711BAF73B98@gmail.com>

On 2012-04-02, at 7:10 PM, Gregory P. Smith wrote:
> If the __enter__ method is interrupted it won't have returned a context and thus __exit__ will never be called.

To address that, Paul proposed making __enter__ non-interruptible as well.

-
Yury

From paul at colomiets.name  Tue Apr  3 01:36:42 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Tue, 3 Apr 2012 02:36:42 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
Message-ID: <CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>

Hi Yury,

On Tue, Apr 3, 2012 at 2:04 AM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> On 2012-04-02, at 6:33 PM, Paul Colomiets wrote:
>>> How's that going to work for tons of existing code?
>>>
>>
>> It isn't. But it doesn't break code any more than it
>> already is. Your proposal doesn't solve any problems
>> with existing code too.
>>
>> But anyway I don't propose any new ways to interrupt
>> code I only propose a way to inform trampoline when it's
>> unsafe to interrupt code.
>
> Well, if we're thinking only about interrupting coroutines
> (not threads), then it's going to work, yes.
>

Yes, the threading stuff is more complex. For the main
thread there are a few possible implementations, e.g. using
signals. If per-thread signals were ever implemented in Python,
they could be used too.

The real problem is inspecting a stack from another
thread. But my solution by itself opens a pretty big field
for experimentation. For example, you can wrap every blocking
call and check the stack on EINTR, then either send an
exception or wait a bit, like with coroutines (and you can
emit EINTR with pthread_kill, and implement the waiting either
using another thread or using sys.settrace(), as I don't think
performance really matters here).

> My initial desire to use a special exception for the purpose,
> was because of:
>
> - it's easier to throw exception in the thread (the C-API
> function already exists, and we need to think about
> consequences of using it)
>

It would be nice for Python to have finally protection built in,
but I don't see how it can be implemented in a generic way.
E.g. I usually don't want to break a finally block unless it
executes for too long, or I hit Ctrl+C multiple times. That is
too subjective to be proposed as common behavior. I also
doubt it can work with a stack of yield-based coroutines,
as the interpreter knows nothing about this kind of stack.

> - PyPy disables the JIT when working with frames (if I recall
> correctly).  That's why I wanted 'f_in_finally' to be an
> implementation detail of CPython, hidden from the user code.
> Perhaps PyPy could implement the handling of our special
> exception in a more efficient way, without the side-effect
> of disabling the JIT.
>

Yes, I think PyPy can make an exception. Or maybe, by the
time PyPy implements support for 3.3, some library will end up
with a nice high-level API, which both Python implementations can
include.

-- 
Paul


From yselivanov.ml at gmail.com  Tue Apr  3 02:02:06 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Mon, 2 Apr 2012 20:02:06 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
Message-ID: <E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>

On 2012-04-02, at 7:36 PM, Paul Colomiets wrote:

> It's nice for python to have finally protection built-in,
> but I don't see how it can be implemented in a generic way.

How about adding some sort of 'interruption protocol'?

Say Threads, generators, and Greenlets will have a special
method called '_interrupt'.  While it will be implemented
differently for each of them, it will work on the same
principle:

- check if the underlying code block is in one of its
'finally' statements

- if it is: set a special flag so that the ExecutionInterrupt
exception is raised once the frame's counter of 'finally'
blocks drops back to 0

- if it is not: raise the ExecutionInterrupt exception
right away
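In pure Python the principle might look roughly like this - a sketch
only, since the per-frame 'f_in_finally' counter and the
ExecutionInterrupt exception are hypothetical, and real threads and
greenlets would need their own delivery mechanisms:

```python
class ExecutionInterrupt(Exception):
    """Hypothetical exception delivered by the '_interrupt' protocol."""

pending = set()  # coroutines whose interruption has been deferred

def interrupt_generator(gen):
    # Generator flavour of the proposed '_interrupt' method.
    frame = gen.gi_frame
    # 'f_in_finally' is the proposed (not yet existing) frame counter,
    # so getattr() falls back to 0 on today's CPython
    if frame is not None and getattr(frame, 'f_in_finally', 0):
        # defer until the frame's 'finally' counter drops to zero
        pending.add(gen)
    else:
        gen.throw(ExecutionInterrupt)

def worker():
    yield "step 1"
    yield "step 2"

g = worker()
next(g)  # suspend worker at its first yield
try:
    interrupt_generator(g)
    interrupted = False
except ExecutionInterrupt:
    # worker doesn't catch it, so the exception propagates to us
    interrupted = True
print("interrupted:", interrupted)
```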

-
Yury


From paul at colomiets.name  Tue Apr  3 09:16:22 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Tue, 3 Apr 2012 10:16:22 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
Message-ID: <CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>

Hi Yury,

On Tue, Apr 3, 2012 at 3:02 AM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> On 2012-04-02, at 7:36 PM, Paul Colomiets wrote:
>
>> It's nice for python to have finally protection built-in,
>> but I don't see how it can be implemented in a generic way.
>
> How about adding some sort of 'interruption protocol'?
>
> Say Threads, generators, and Greenlets will have a special
> method called '_interrupt'.  While it will be implemented
> differently for each of them, it will work on the same
> principle:
>

At first glance, it's a nice proposal. On a second look,
we have the following problems:

1. For yield-based coroutines you must inspect the stack
anyway; since the interpreter doesn't have a stack for them,
you build it yourself (although I don't know how `yield from`
changes that)

2. For greenlet-based coroutines it is unclear what
the stack is. For example:

def f1():
    try:
        pass
    finally:
        g2.switch()

def f2():
    sleep(1.0)

g1 = greenlet(f1)
g2 = greenlet(f2)
g1.switch()

Is it safe to interrupt g2 while it's in `sleep`? (If you wonder
how I fix this problem with the f_in_finally stack, it's easy: I
usually switch to a coroutine from the trampoline, so that is
the boundary of the stack which should be checked for
f_in_finally.)

3. For threads it was discussed several times and rejected.
This proposal may make thread interruptions slightly safer,
but I'm not sure it's enough to convince people.

Also, in a first implementation we may overlook some
places where it's unsafe to break. For some objects,
__init__/__exit__ is the safe pair of functions, not
__enter__/__exit__. So we might need the `with` expression
to be uninterruptible, for code like:

with open('something') as f:
    ...

It may break if interrupted inside `open()`. (Although, if
__enter__ is protected, you can fix the problem with a
simple wrapper.)
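The "simple wrapper" mentioned above could look like this - a sketch
assuming only that __enter__ itself is protected from interruption.
Since @contextmanager runs the generator body up to `yield` inside
__enter__, the open() call moves under that protection:

```python
from contextlib import contextmanager

@contextmanager
def opened(path):
    # open() now happens inside __enter__, not inside __init__,
    # so a protected __enter__ covers the acquisition as well
    f = open(path)
    try:
        yield f
    finally:
        f.close()

# usage mirrors the original example:
# with opened('something') as f:
#     ...
```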

So I still propose adding a frame flag, which doesn't break
anything, and lets us experiment with interruptions
without putting experimental code into the core.

-- 
Paul


From alon at horev.net  Tue Apr  3 10:02:25 2012
From: alon at horev.net (Alon Horev)
Date: Tue, 3 Apr 2012 11:02:25 +0300
Subject: [Python-ideas] with *context_managers:
In-Reply-To: <CADiSq7d2i9kuFQq6uuZSjYjwPPR27PLjkpdwQXG07F8cO1Ckew@mail.gmail.com>
References: <29272049.1143.1333311941739.JavaMail.geo-discussion-forums@vbgx21>
	<20120402153451.GA2470@bagheera>
	<CADiSq7d2i9kuFQq6uuZSjYjwPPR27PLjkpdwQXG07F8cO1Ckew@mail.gmail.com>
Message-ID: <CAKZkVDs2nFhb0WX_FyvWoXfAYa_+KK731UjVLLKSjktvNsmBLw@mail.gmail.com>

Another proposal: change nested() to work with generators:

with nested(open(path) for path in files):
    ....

pros:
1. lazy evaluation of context manager creation (which is the main
thing that is bad with today's nested()).
2. shorter than ContextStack.

cons:
1. generators are not always an option; in those cases ContextStack
is the way to go.
2. a little implicit - I can imagine a Python newbie swearing at me
because he didn't know he should use a generator instead of a list.
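For comparison, the ContextStack style from point 1 can be sketched
with contextlib.ExitStack, the name the class eventually got when it
landed in Python 3.3's stdlib; `read_first_lines` is a made-up
example function:

```python
from contextlib import ExitStack

def read_first_lines(paths):
    with ExitStack() as stack:
        # each open() runs inside the with block, so files opened so
        # far are closed even if a later open() raises
        files = [stack.enter_context(open(p)) for p in paths]
        return [f.readline() for f in files]
```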

                          thanks, Alon Horev



On Mon, Apr 2, 2012 at 11:52 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On Tue, Apr 3, 2012 at 1:34 AM, Sven Marnach <sven at marnach.net> wrote:
> > which only works for a fixed number of context managers.  And there is
> > a class 'ContextStack' in Nick Coghlan's 'contextlib2' library [1],
> > which might be included in Python 3.3.  With this class, you could
> > write your code as
> >
> >    with ContextStack() as stack:
> >        for cm in context_managers:
> >            stack.enter_context(cm)
> >
> > This still leaves the question whether your proposed syntax would be
> > preferable, also with regard to issue 2292 [2].
>
> Both "with *(iterable)" and "for cm in iterable: stack.enter(cm)" are
> flawed in exactly the same way that contextlib.nested() is flawed:
> they encourage creating the iterable of context managers first, which
> means that inner __init__ methods are not covered by outer __exit__
> methods.
>
> This breaks as soon as you have resources (such as files) where the
> acquire/release resource management pairing is actually
> __init__/__exit__ with __enter__ just returning self rather than
> acquiring the resource. If the iterable of context managers is created
> first, then the outer resources *will be leaked* if any of the inner
> constructors fail. The only way to write code that handles an
> arbitrary number of arbitrary context managers in a robust fashion is
> to ensure the initialisation steps are also covered by the outer
> context managers:
>
>    with CallbackStack() as stack:
>        for make_cm in cm_factories:
>            stack.enter(make_cm())
>
> (Note that I'm not particularly happy with the class and method names
> for contextlib2.ContextStack, and plan to redesign it a bit before
> adding it to the stdlib module:
>
> https://bitbucket.org/ncoghlan/contextlib2/issue/8/rename-contextstack-to-callbackstack-and
> )
>
> The only time you can get away with a contextlib.nested() style API
> where the iterable of context managers is created first is when you
> *know* that all of the context managers involved do their resource
> acquisition in __enter__ rather than __init__. In the general case,
> though, any such API is broken because it doesn't reliably clean up
> files and similar acquired-on-initialisation resources.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120403/8ac029d8/attachment.html>

From yselivanov.ml at gmail.com  Tue Apr  3 16:09:07 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Tue, 3 Apr 2012 10:09:07 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
Message-ID: <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>

On 2012-04-03, at 3:16 AM, Paul Colomiets wrote:
> 1. For yield-based coroutines you must inspect stack
> anyway, since interpreter doesn't have a stack, you
> build it yourself (although, I don't know how `yield from`
> changes that)
> 
> 2. For greenlet based coroutines it is unclear what
> the stack is. For example:
> 
> def f1():
>    try:
>        pass
>    finally:
>        g1.switch()
> 
> def f2():
>    sleep(1.0)
> 
> g1 = greenlet(f1)
> g2 = greenlet(f2)
> g1.switch()
> 
> Is it safe to interrupt g2 while it's in `sleep`? (If you wonder
> how I fix this problem with f_in_finally stack, it's easy. I
> usually switch to a coroutine from trampoline, so this is
> a boundary of the stack which should be checked for
> f_in_finally).

Wait.  So you're tracing the whole coroutine execution stack to
check if the current coroutine was called in a finally block of
some other coroutine?  For handling timeouts I don't think that
is necessary (maybe there are other use cases?)

In the example below you actually have to interrupt g2:

def g1():
   try:
     ...
   finally:
     g2().with_timeout(0.1)

def g2():
   sleep(2)

You shouldn't guarantee that the *whole* chain of functions/
coroutines/etc will be safe in their finally statements, you just 
need to protect the top coroutines in the timeouts queue.

Hence, in the above example, if you run g1() with a timeout, the
trampoline should ensure that it won't interrupt it while it is
in its finally block.  But it can interrupt g2() in any context
at any point of its execution.  And if g2() gets interrupted,
g1()'s finally statement will be broken, yes.  But that's the
responsibility of the developer to ensure that the code in
'finally' handles exceptions within it correctly.

That's just my approach to handle timeouts, I'm not advocating
it to be the very right one.

Are there any other use-cases where you have to inspect the
execution stack?  Because if there are none, an 'interrupt()'
method is sufficient and implementable, as both generators and
greenlets are well aware of the code frames they are holding.

> 3. For threads it was discussed several times and rejected.
> This proposal may make thread interruptions slightly safer,
> but I'm not sure it's enough to convince people.

That's why I'm advocating for a PEP.  Thread interruption isn't
a safe feature in the .NET CLR either.  You may break things with 
it there too.  And it doesn't protect the chain of functions 
calling each other from their 'finally' statements, it just 
protects the top frame.  The 'abort' and 'interrupt' methods 
aren't advertised to be used in .NET, use them at your own risk.

So I don't think that we can, or should, ensure 100% safety
when interrupting a thread.  And that's why I think it is worth
proposing a mechanism that will work for many concurrency 
primitives.

> So I still propose add a frame flag, which doesn't break
> anything, and gives us experiment with interruptions
> without putting some experimental code into the core.


There are cons and pros in your solution.

Pros
----

- can be used right away in coroutine libraries.

- somewhat simple and small CPython patch.

Cons
----

- you have to work with frames almost throughout the execution
of the program.  In PyPy you will simply have the JIT disabled.
And I'm not sure how frame access works in Jython and IronPython
from a performance point of view.

- no mechanism for interrupting a running thread.  In almost any
coroutine library you will have a thread pool, and sometimes you
need a way to interrupt workers.  So it's not enough even for
coroutines.

-
Yury


From andrew.svetlov at gmail.com  Tue Apr  3 18:11:36 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Tue, 3 Apr 2012 19:11:36 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
Message-ID: <CAL3CFcXBN7kF8V5FWf9fMvrXGXsR8cD5TpAcz98_K-7UoZmR3A@mail.gmail.com>

Instead of looking the flag up from the nested frame, it's possible
to propagate the flag down to called frames.

On Tue, Apr 3, 2012 at 5:09 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> On 2012-04-03, at 3:16 AM, Paul Colomiets wrote:
>> 1. For yield-based coroutines you must inspect stack
>> anyway, since interpreter doesn't have a stack, you
>> build it yourself (although, I don't know how `yield from`
>> changes that)
>>
>> 2. For greenlet based coroutines it is unclear what
>> the stack is. For example:
>>
>> def f1():
>>     try:
>>         pass
>>     finally:
>>         g2.switch()
>>
>> def f2():
>>     sleep(1.0)
>>
>> g1 = greenlet(f1)
>> g2 = greenlet(f2)
>> g1.switch()
>>
>> Is it safe to interrupt g2 while it's in `sleep`? (If you wonder
>> how I fix this problem with f_in_finally stack, it's easy. I
>> usually switch to a coroutine from trampoline, so this is
>> a boundary of the stack which should be checked for
>> f_in_finally).
>
> Wait.  So you're tracing the whole coroutine execution stack to
> check if the current coroutine was called in a finally block of
> some other coroutine?  For handling timeouts I don't think that
> is necessary (maybe there are other use cases?)
>
> In the example below you actually have to interrupt g2:
>
> def g1():
>    try:
>      ...
>    finally:
>      g2().with_timeout(0.1)
>
> def g2():
>    sleep(2)
>
> You shouldn't guarantee that the *whole* chain of functions/
> coroutines/etc will be safe in their finally statements, you just
> need to protect the top coroutines in the timeouts queue.
>
> Hence, in the above example, if you run g1() with a timeout, the
> trampoline should ensure that it won't interrupt it while it is
> in its finally block.  But it can interrupt g2() in any context
> at any point of its execution.  And if g2() gets interrupted,
> g1()'s finally statement will be broken, yes.  But that's the
> responsibility of the developer to ensure that the code in
> 'finally' handles exceptions within it correctly.
>
> That's just my approach to handle timeouts, I'm not advocating
> it to be the very right one.
>
> Are there any other use-cases when you have to inspect the
> execution stack?  Because if there is no, 'interrupt()' method
> is sufficient and implementable, as both generators and
> greenlets are well aware about the code frames they holding.
>
>> 3. For threads it was discussed several times and rejected.
>> This proposal may make thread interruptions slightly safer,
>> but I'm not sure it's enough to convince people.
>
> That's why I'm advocating for a PEP.  Thread interruption isn't
> a safe feature in the .NET CLR either.  You may break things with
> it there too.  And it doesn't protect the chain of functions
> calling each other from their 'finally' statements, it just
> protects the top frame.  The 'abort' and 'interrupt' methods
> aren't advertised to be used in .NET, use them at your own risk.
>
> So I don't think that we can, or should ensure 100% safety when
> interrupting a thread.  And that's why I think it is worth to
> propose a mechanism that will work for many concurrency
> primitives.
>
>> So I still propose add a frame flag, which doesn't break
>> anything, and gives us experiment with interruptions
>> without putting some experimental code into the core.
>
>
> There are cons and pros in your solution.
>
> Pros
> ----
>
> - can be used right away in coroutine libraries.
>
> - somewhat simple and small CPython patch.
>
> Cons
> ----
>
> - you have to work with frames almost throughout the execution
> of the program.  In PyPy you simply will have the JIT disabled.
> And I'm not sure how frame access works in Jython and IronPython
> from the performance point of view.
>
> - no mechanism for interrupting a running thread. ?In almost any
> coroutine library you will have a thread pool, and sometimes you
> need a way to interrupt workers. ?So it's not enough even for
> coroutines.
>
> -
> Yury
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas



-- 
Thanks,
Andrew Svetlov


From paul at colomiets.name  Tue Apr  3 21:22:50 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Tue, 3 Apr 2012 22:22:50 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
Message-ID: <CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>

Hi Yury,

On Tue, Apr 3, 2012 at 5:09 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> You shouldn't guarantee that the *whole* chain of functions/
> coroutines/etc will be safe in their finally statements, you just
> need to protect the top coroutines in the timeouts queue.
>

For yield-based coroutines the common case is the following:

yield lock.acquire()
try:
    # something
finally:
    yield lock.release()

The implementation of lock.release() goes to a distributed
locking manager, so it should not be interrupted.
I can think of some ways of fixing this using tight coupling
of the locks with the trampoline, but having an obvious way to
do it without hacks is much better.

(Although I don't know how `yield from` changes working with
yield-based coroutines; maybe its behavior is quite different.)

For greenlets the situation is a bit different, as Python knows
the stack there, but you still need to traverse it (or, as Andrew
mentioned, you can just propagate the flag).
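The traversal described above could be sketched like this - purely
hypothetical, since it assumes the proposed f_in_finally frame flag
exists (it does not in current CPython) and that the trampoline keeps
the task's generator stack itself:

```python
class Timeout(Exception):
    pass

def try_interrupt(task_stack):
    """Throw Timeout into a yield-based coroutine stack, unless some
    frame between the trampoline boundary and the top is inside a
    finally block.  task_stack is the trampoline's own list of nested
    generators, outermost first."""
    for gen in task_stack:
        frame = gen.gi_frame
        # f_in_finally is the proposed flag, absent in today's CPython,
        # so getattr() falls back to 0 here
        if frame is not None and getattr(frame, 'f_in_finally', 0):
            return False          # unsafe now; the caller retries later
    task_stack[-1].throw(Timeout)  # interrupt the innermost coroutine
    return True
```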

> - no mechanism for interrupting a running thread.

Yes. That was intentional, to have a greater chance of success.
Interruption may be a separate proposal.

> In almost any coroutine library you will have a thread pool,
> and sometimes you need a way to interrupt workers.

Which ones? I know that twisted uses a thread pool to handle:

1. DNS (which is just silly)
2. A few other protocols which don't have asynchronous
libraries (which should be fixed)

The whole intention of using a coroutine library is not to
have a thread pool. Could you describe your use case
in more detail?

> So it's not enough even for coroutines.
>

Very subjective, and doesn't match my expectations.


-- 
Paul


From yselivanov.ml at gmail.com  Wed Apr  4 03:23:34 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Tue, 3 Apr 2012 21:23:34 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
	<CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>
Message-ID: <A2767D04-7D85-46D5-A72F-2CB1839B9133@gmail.com>

On 2012-04-03, at 3:22 PM, Paul Colomiets wrote:
> (Although, I don't know how `yield from` changes working with
> yield-based coroutines, may be it's behavior is quite different)
> 
> For greenlets situation is a bit different, as Python knows the
> stack there, but you still need to traverse it (or as Andrew
> mentioned, you can just propagate flag).

Why traverse?  Why propagate?  As I explained in my previous posts 
here, you need to protect only the top-of-stack coroutines in the 
timeout or trampoline execution queues.  You should illustrate
your logic with a clearer example - say, three or four coroutines
that call each other, plus a glimpse of how your trampoline works.
But I'm not sure that is really necessary.

>> - no mechanism for interrupting a running thread.
> 
> Yes. That was intentionally to have greater chance to success.
> Interruption may be separate proposal.
> 
>> In almost any coroutine library you will have a thread pool,
>> and sometimes you need a way to interrupt workers.
> 
> Which one? I know that twisted uses thread pool to handle:

Besides Twisted?  eventlet; gevent will have them in 1.0, etc.

> 1. DNS (which is just silly)
> 2. Few other protocols which doesn't have asynchronous
> libraries (which should be fixed)
> 
> The whole intention of using coroutine library is to not to
> have thread pool. Could you describe your use case
> with more details?

Well, our company has been using coroutines for about 2.5 years
now (the framework is not yet open-sourced).  And in our practice
a thread pool is really handy, as it allows you to:

- use non-asynchronous libraries, which you don't want to
monkeypatch with greensockets (or are even unable to monkeypatch)

- wrap some functions that are usually very fast, but once in
a while may take some time, when you don't want to
offload them to a separate process

- and yes, do DNS lookups if you don't have a compiled CPython
extension that wraps c-ares or something alike.

Please let's avoid shifting further discussion to proving or 
disproving the necessity of thread pools.  They are being actively 
used, and there is a demand for (more or less) graceful thread 
interruption or abortion.

>> So it's not enough even for coroutines.
>> 
> 
> Very subjective, and doesn't match my expectations.


As I said -- we've been working with coroutines (combined
generators + greenlets) for a few years, and apparently have
different experience, opinions and expectations from what you
have.  And I suppose the developers and users of eventlet, gevent,
twisted (@inlineCallbacks) and other libraries have their own 
opinions and ideas too.  Not to mention, it would be interesting
to hear from the PyPy, Jython and IronPython teams.  It also seems 
that neither of us has enough experience working with 
'yield from' style coroutines.

Please write a PEP and we'll continue discussion from that 
point.  Hopefully, it will get more attention than this thread.

-
Yury


From greg.ewing at canterbury.ac.nz  Wed Apr  4 02:03:29 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 04 Apr 2012 12:03:29 +1200
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
Message-ID: <4F7B8FD1.3040308@canterbury.ac.nz>

Paul Colomiets wrote:

> So I still propose adding a frame flag, which doesn't break
> anything, and lets us experiment with interruptions
> without putting experimental code into the core.

I don't think a frame flag on its own is quite enough.
You don't just want to prevent interruptions while in
a finally block, you want to defer them until the finally
counter gets back to zero. Making the interrupter sleep
and try again in that situation is rather ugly.

So perhaps there could also be a callback that gets
invoked when the counter goes down to zero.
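For concreteness, here's a rough pure-Python sketch of that counter-plus-callback
idea (all names are made up for illustration; the real thing would live in the
interpreter, not a library class):

```python
import threading

class FinallyGuard:
    """Illustrative sketch: count nested finally blocks and defer an
    interruption callback until the count is back to zero."""

    def __init__(self):
        self._depth = 0       # nesting depth of finally blocks
        self._pending = None  # callback deferred until depth == 0
        self._lock = threading.Lock()

    def enter_finally(self):
        with self._lock:
            self._depth += 1

    def exit_finally(self):
        with self._lock:
            self._depth -= 1
            if self._depth == 0 and self._pending is not None:
                cb, self._pending = self._pending, None
                cb()  # fire the deferred interruption

    def interrupt(self, callback):
        with self._lock:
            if self._depth == 0:
                callback()               # safe: interrupt immediately
            else:
                self._pending = callback  # defer until finallys unwind
```

So the interrupter registers its callback once and never has to sleep and retry.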

-- 
Greg



From paul at colomiets.name  Wed Apr  4 10:04:04 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Wed, 4 Apr 2012 11:04:04 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <A2767D04-7D85-46D5-A72F-2CB1839B9133@gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
	<CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>
	<A2767D04-7D85-46D5-A72F-2CB1839B9133@gmail.com>
Message-ID: <CAA0gF6o-8J_fDbD7hnry1=U6dMnHGN-bpYrnecQ44-cZkysJAw@mail.gmail.com>

Hi,

On Wed, Apr 4, 2012 at 4:23 AM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> On 2012-04-03, at 3:22 PM, Paul Colomiets wrote:
>> (Although, I don't know how `yield from` changes working with
>> yield-based coroutines, may be it's behavior is quite different)
>>
>> For greenlets situation is a bit different, as Python knows the
>> stack there, but you still need to traverse it (or as Andrew
>> mentioned, you can just propagate flag).
>
> Why traverse?  Why propagate?  As I explained in my previous posts
> here, you need to protect only the top-stack coroutines in the
> timeouts or trampoline execution queues.  You should illustrate
> your logic with a more clear example - say three or four coroutines
> that call each other + with a glimpse of how your trampoline works.
> But I'm not sure that is really necessary.
>

Here is the previous example in more detail (although, still simplified):

@coroutine
def add_money(user_id, money):
    yield redis_lock(user_id)
    try:
        yield redis_incr('user:'+user_id+':money', money)
    finally:
        yield redis_unlock(user_id)

# this one is crucial to show the point of discussion
# other functions are similar:
@coroutine
def redis_unlock(lock):
    yield redis_socket.wait_write()  # yields back when socket is ready for writing
    cmd = ('DEL user:'+lock+'\n').encode('ascii')
    redis_socket.write(cmd)  # should be a loop here, actually
    yield redis_socket.wait_read()
    result = redis_socket.read(1024)  # a loop here too
    assert result == 'OK\n'

When the trampoline gets a coroutine from a `next()` or `send()` call,
it puts it on top of the stack and doesn't dispatch the original one
until the topmost one has exited.

The point is that if a timeout arrives inside the `redis_unlock` function,
we must wait until the finally clause in `add_money` is finished.
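A minimal sketch of that dispatch rule (purely illustrative, much simpler
than the real framework; the stand-in coroutines below are made up):

```python
def trampoline(root):
    """Run a stack of generator-based coroutines: when the topmost one
    yields another generator, push it and dispatch only it until it
    exits; its return value is sent back to the suspended caller."""
    stack = [root]
    value = None
    while stack:
        try:
            result = stack[-1].send(value)
        except StopIteration as stop:
            stack.pop()                  # topmost coroutine exited
            value = stop.value           # its return value goes to the caller
            continue
        if hasattr(result, 'send'):      # it yielded a sub-coroutine
            stack.append(result)
            value = None                 # prime the new topmost generator
        else:
            value = result               # plain value: just echo it back
    return value

def redis_unlock(lock):
    ack = yield 'DEL user:%s' % lock     # stands in for the socket I/O
    return ack

def add_money(user_id):
    try:
        yield 'INCR'                     # stands in for redis_incr
    finally:
        yield redis_unlock(user_id)      # sub-coroutine pushed on the stack
    return 'done'
```

With this rule, `add_money` stays suspended while `redis_unlock` runs, which
is why an interruption aimed at `add_money` would land inside `redis_unlock`.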

>>
>> The whole intention of using coroutine library is to not to
>> have thread pool. Could you describe your use case
>> with more details?
>
> Well, our company has been using coroutines for like 2.5 years
> now (the framework is not yet opensourced).  And in our practice
> a threadpool is really handy, as it allows you to:
>
> - use non-asynchronous libraries, which you don't want to
> monkeypatch with greensockets (or are even unable to monkeypatch)
>

And we rewrite them in Python instead. That seems more useful to us.

> - wrap some functions that are usually very fast, but once in
> a while may take some time.  And sometimes you don't want to
> offload them to a separate process
>

Ack.

> - and yes, do DNS lookups if you don't have a compiled cpython
> extension that wraps c-ares or something alike.
>

Maybe we should propose an asynchronous DNS library for Python?
We have the same problem, although we do not resolve hosts at
runtime (only at startup), so a synchronous one is good enough
for our purposes.

> Please let's avoid shifting further discussion to proving or
> disproving the necessity of threadpools.

Agreed.

> They are being actively used and there is a demand for
> (more or less) graceful threads interruption or abortion.
>

Given these use cases, what stops you from adding explicit
interruption points?

>
> Please write a PEP and we'll continue discussion from that
> point.  Hopefully, it will get more attention than this thread.
>

I don't see the point in writing a PEP until I have an idea of
what the PEP should propose. If you have one, you can write it. Again,
you want to implement thread interruption, and that's not
my point; there is another thread for that.

On Wed, Apr 4, 2012 at 3:03 AM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>
> I don't think a frame flag on its own is quite enough.
> You don't just want to prevent interruptions while in
> a finally block, you want to defer them until the finally
> counter gets back to zero. Making the interrupter sleep
> and try again in that situation is rather ugly.
>
> So perhaps there could also be a callback that gets
> invoked when the counter goes down to zero.

Do you mean putting a callback in a frame, which gets
executed at the next bytecode just like a signal handler,
except that it waits until the finally clause is executed?

It would work, except it may have a slight performance
impact on each bytecode. But I'm not sure if it will
be noticeable.

-- 
Paul


From victor.varvariuc at gmail.com  Wed Apr  4 10:07:55 2012
From: victor.varvariuc at gmail.com (Victor Varvariuc)
Date: Wed, 4 Apr 2012 11:07:55 +0300
Subject: [Python-ideas] dict.items to accept optional iterable with keys to
	use
Message-ID: <CA+Lge10R7O78ptDO=MS3b-YQrr1TLTwSCVHKW8n=fW=DhugquA@mail.gmail.com>

Sometimes you want a dict which is a subset of another dict. It would be
nice if dict.items accepted an optional list of keys to return. If no keys
are given - use the default behavior - get all items.

class NewDict(dict):

    def items(self, keys=()):
        """Another version of dict.items() which accepts specific keys to use."""
        for key in keys or self.keys():
            yield key, self[key]

a = NewDict({
    1: 'one',
    2: 'two',
    3: 'three',
    4: 'four',
    5: 'five'})
print(dict(a.items()))
print(dict(a.items((1, 3, 5))))

vic at ubuntu:~/Desktop$ python test.py
{1: 'one', 2: 'two', 3: 'three', 4: 'four', 5: 'five'}
{1: 'one', 3: 'three', 5: 'five'}

Thanks for the attention.

--
*Victor Varvariuc*

From pyideas at rebertia.com  Wed Apr  4 10:22:05 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Wed, 4 Apr 2012 01:22:05 -0700
Subject: [Python-ideas] dict.items to accept optional iterable with keys
 to use
In-Reply-To: <CA+Lge10R7O78ptDO=MS3b-YQrr1TLTwSCVHKW8n=fW=DhugquA@mail.gmail.com>
References: <CA+Lge10R7O78ptDO=MS3b-YQrr1TLTwSCVHKW8n=fW=DhugquA@mail.gmail.com>
Message-ID: <CAMZYqRRhraNLJ=mCSSWd45TG5e79AtmT_WbcKuAD_y7L2t+gCg@mail.gmail.com>

On Wed, Apr 4, 2012 at 1:07 AM, Victor Varvariuc
<victor.varvariuc at gmail.com> wrote:
> Sometimes you want a dict which is a subset of another dict. It would be nice
> if dict.items accepted an optional list of keys to return. If no keys are
> given - use the default behavior - get all items.
<snip>
> print(dict(a.items((1, 3, 5))))

In that use case, why not just write a dict comprehension?:
    print({k: a[k] for k in (1, 3, 5)})
Completely explicit, and only a mere few characters longer.

Cheers,
Chris


From steve at pearwood.info  Wed Apr  4 10:23:01 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 4 Apr 2012 18:23:01 +1000
Subject: [Python-ideas] dict.items to accept optional iterable with keys
	to use
In-Reply-To: <CA+Lge10R7O78ptDO=MS3b-YQrr1TLTwSCVHKW8n=fW=DhugquA@mail.gmail.com>
References: <CA+Lge10R7O78ptDO=MS3b-YQrr1TLTwSCVHKW8n=fW=DhugquA@mail.gmail.com>
Message-ID: <20120404082301.GB19862@ando>

On Wed, Apr 04, 2012 at 11:07:55AM +0300, Victor Varvariuc wrote:
> Sometimes you want a dict which is a subset of another dict. It would be nice
> if dict.items accepted an optional list of keys to return. If no keys are
> given - use the default behavior - get all items.

Too trivial to bother with.

# I want a list of keys/values:
items = [(key, mydict[key]) for key in list_of_keys]


# I want them generated on demand:
iterable = ((key, mydict[key]) for key in list_of_keys)


# I want a new dict:
newdict = dict((key, mydict[key]) for key in list_of_keys)


All of those require that list_of_keys contains only keys that actually
exist. If you want to skip missing keys:

[(k,v) for (k,v) in mydict.items() if k in list_of_keys]

To be even more efficient, use a set of keys instead of a list.


-- 
Steven


From techtonik at gmail.com  Wed Apr  4 10:25:57 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 4 Apr 2012 11:25:57 +0300
Subject: [Python-ideas] Python probe: execute code in isolation
	(subinterpreter?) and get results
Message-ID: <CAPkN8x+z8jNBfYsEi1-qSeWFGsUMwu75vJjEz=6CnrWEvDDOZQ@mail.gmail.com>

Hi,

Is there a standard way to execute Python code and inspect the
results without spawning an external Python process? If there is no
such way, I'd like to propose the feature, and there are two user
stories. Both are about application environment probing.


Story #1: Choosing the best library in a safe manner

Probing the environment is required for Python applications to make
component selection logic explicit and less error-prone. I can tell
from my experience with the Spyder IDE that the startup procedure is
the most fragile part of this cross-platform application, which makes
use of components optionally installed on the user's system. The
implicit nature of imports and the inability to revert an import
operation make the situation complicated. Below is an example. Note
that this is not about packaging.

Spyder IDE is a Qt application that optionally embeds an IPython console.
Qt has two bindings - PyQt4 and PySide. The PyQt4 binding has two APIs -
#1 and #2. If PyQt4 is used and the installed IPython version is >= 0.11,
API #2 must be chosen. So the IPython version probing should come
first. A standard way to detect the IPython version is to import IPython
before the rest of the application, but IPython may detect PyQt4
itself and import it too for version probing. And if Spyder uses
PySide, we now have a conflict of loaded Qt libraries. If there were a
way to execute a Python script in a subinterpreter to probe all installed
component versions and return the results, the selection logic would be
much more readable and sane.


Story #2: Get settings from user script

Blender uses SCons to automate builds. The SCons script is written in
Python and it uses execfile(filename, globals, locals) to read a
platform-specific user script with settings. Unfortunately, execfile()
is a hack that doesn't treat Python scripts the same way the
interpreter treats them - for example, globals access will throw
exceptions - http://bugs.python.org/issue14049  More importantly,
users won't be able to troubleshoot the exceptions, because the
standalone script works as expected.

Executing user script code in a subprocess will most likely negatively
affect performance, which is rather critical for a build tool.
Pickling and unpickling objects with global state through a
communication pipe may also be a source of bugs. So, having a cheap
way to communicate with a Python subinterpreter and get a simple dict
as a result would make Python more snappy.
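To make the desired shape concrete, here is roughly what such a probe could
look like, emulated today with a subprocess (the PROBE script is only an
illustration, using IPython from story #1 as the example component):

```python
import json
import subprocess
import sys

# Illustrative probe script: report component versions as JSON on stdout,
# without polluting the parent process's import state.
PROBE = """\
import json, sys
info = {}
try:
    import IPython
    info['IPython'] = IPython.__version__
except ImportError:
    info['IPython'] = None
json.dump(info, sys.stdout)
"""

def probe_environment():
    # A fresh interpreter keeps the parent untouched; the result comes
    # back over a pipe as a plain dict.
    out = subprocess.check_output([sys.executable, '-c', PROBE])
    return json.loads(out.decode('utf-8'))
```

A cheap subinterpreter-based variant would keep this dict-in, dict-out shape
but avoid the process spawn.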
--
anatoly t.


From mal at egenix.com  Wed Apr  4 11:20:20 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 04 Apr 2012 11:20:20 +0200
Subject: [Python-ideas] Python probe: execute code in isolation
 (subinterpreter?) and get results
In-Reply-To: <CAPkN8x+z8jNBfYsEi1-qSeWFGsUMwu75vJjEz=6CnrWEvDDOZQ@mail.gmail.com>
References: <CAPkN8x+z8jNBfYsEi1-qSeWFGsUMwu75vJjEz=6CnrWEvDDOZQ@mail.gmail.com>
Message-ID: <4F7C1254.5070703@egenix.com>

anatoly techtonik wrote:
> Hi,
> 
> Is there a standard way to execute a Python code and inspect the
> results without spawning an external Python process? If there is no
> such way, I'd like to propose the feature, and there are two user
> stories. Both are about application environment probing.
> 
> 
> Story #1: Choosing the best library in a safe manner
> 
> Probing environment is required for Python applications to make
> component selection logic explicit and less error-prone. I can tell
> from my experience with Spyder IDE that startup procedure is the most
> fragile part for this cross-platform application, which makes use of
> optionally installed components on user system. Implicit import nature
> and inability to revert import operation makes situation complicated.
> Below is an example. Take a note that this is not about packaging.
> 
> Spyder IDE is a Qt application that optionally embeds IPython console.
> Qt has two bindings - PyQt4 and PySide. PyQt4 binding has two APIs -
> #1 and #2. If PyQt4 is used and version of installed IPython >= 0.11,
> the API #2 must be chosen. So, the IPython version probing should come
> first. A standard way to detect IPython version is to import IPython
> before the rest of the application, but IPython may detect PyQt4
> itself and import it too for probing version. And if Spyder uses
> PySide we now have a conflict with Qt libraries loaded. If there was a
> way to execute Python script in subinterpreter to probe all installed
> component versions and return results, the selection logic would be
> much more readable and sane.

Given that you are also loading external shared libraries, I don't
see how you could do this within the same process. Unloading shared
libs is possible (even if fragile), but if you don't even know
which libs have been loaded, likely impossible to do in a cross-
platform way.

> Story #2: Get settings from user script
> 
> Blender uses SCons to automate builds. SCons script is written in
> Python and it uses execfile(filename, globals, locals) to read
> platform specific user script with settings. Unfortunately, execfile()
> is a hack that doesn't treat Python scripts the same way the
> interpreter treats them - for example, globals access will throw
> exceptions - http://bugs.python.org/issue14049  More important that
> users won't be able to troubleshoot the exceptions, because standalone
> script works as expected.

You're not using execfile() correctly: if you want a script to be
run in the same way as a module, then the local and global namespace
dictionaries have to be the same. So the second story already works
in vanilla Python :-)
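A small illustration of that point (Python 3 spelling shown; the config
contents here are made up):

```python
import os
import tempfile

# Hypothetical user settings script: the list comprehension needs to
# see 'base', which only works if the script runs like a module.
CONFIG = "base = '/opt'\npaths = [base + '/lib', base + '/bin']\n"

def load_config(path):
    # One dict serves as both globals and locals, so the script runs
    # the way a module would; with two *different* dicts, 'base' would
    # not be visible inside the list comprehension on Python 3.
    ns = {}
    with open(path) as f:
        exec(f.read(), ns)       # execfile(path, ns) on Python 2
    return {k: v for k, v in ns.items() if not k.startswith('__')}
```

So passing a single namespace dict already gives story #2 the behavior it
wants.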

Lots of Python applications read configuration data from user
supplied (Python) config files. It's less secure than e.g. INI
files, but gives you a lot of flexibility in defining what
you need.

> Executing user script code in a subprocess will most likely negatively
> affect performance, which is rather critical for a build tool.
> Pickling and unpickling objects with global state through
> communication pipe may also be the source of bugs. So, having a cheap
> way to communicate with Python subinterpreter and get a simple dict in
> result will make Python more snappy.

I don't see how you could get story #1 working without a subprocess.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Apr 04 2012)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2012-04-03: Python Meeting Duesseldorf                             today

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From yselivanov.ml at gmail.com  Wed Apr  4 18:59:57 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 4 Apr 2012 12:59:57 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6o-8J_fDbD7hnry1=U6dMnHGN-bpYrnecQ44-cZkysJAw@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
	<CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>
	<A2767D04-7D85-46D5-A72F-2CB1839B9133@gmail.com>
	<CAA0gF6o-8J_fDbD7hnry1=U6dMnHGN-bpYrnecQ44-cZkysJAw@mail.gmail.com>
Message-ID: <7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com>

On 2012-04-04, at 4:04 AM, Paul Colomiets wrote:

> Hi,
> 
> On Wed, Apr 4, 2012 at 4:23 AM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>> On 2012-04-03, at 3:22 PM, Paul Colomiets wrote:
>>> (Although, I don't know how `yield from` changes working with
>>> yield-based coroutines, may be it's behavior is quite different)
>>> 
>>> For greenlets situation is a bit different, as Python knows the
>>> stack there, but you still need to traverse it (or as Andrew
>>> mentioned, you can just propagate flag).
>> 
>> Why traverse?  Why propagate?  As I explained in my previous posts
>> here, you need to protect only the top-stack coroutines in the
>> timeouts or trampoline execution queues.  You should illustrate
>> your logic with a more clear example - say three or four coroutines
>> that call each other + with a glimpse of how your trampoline works.
>> But I'm not sure that is really necessary.
>> 
> 
> Here is more detailed previous example (although, still simplified):
> 
> @coroutine
> def add_money(user_id, money):
>    yield redis_lock(user_id)
>    try:
>        yield redis_incr('user:'+user_id+':money', money)
>    finally:
>        yield redis_unlock(user_id)
> 
> # this one is crucial to show the point of discusssion
> # other function are similar:
> @coroutine
> def redis_unlock(lock):
>    yield redis_socket.wait_write()  # yields back when socket is
> ready for writing
>    cmd = ('DEL user:'+lock+'\n').encode('ascii')
>    redis_socket.write(cmd)  # should be loop here, actually
>    yield redis_socket.wait_read()
>    result = redis_socket.read(1024)  # here loop too
>    assert result == 'OK\n'
> 
> The trampoline when gets coroutine from `next()` or `send()` method
> puts it on top of stack and doesn't dispatch original one until topmost
> one is exited.
> 
> The point is that if timeout arrives inside a `redis_unlock` function, we
> must wait until finally from `add_user` is finished

How can it "arrive" inside "redis_unlock"?  Let's assume you called
"add_money" as such:

yield add_money(1, 10).with_timeout(10)

Then it's the 'add_money' coroutine that should be in the timeouts queue/tree!
'add_money' specifically is what should be interrupted when your 10s timeout
expires.  And if 'add_money' is in its 'finally' statement - you simply postpone
its interruption, meaning that 'redis_unlock' will end its execution nicely.

Again, I'm not sure how exactly you manage your timeouts.  The way I do it,
simplified: I have a timeouts heapq with pointers to those coroutines
that were *explicitly* executed with a timeout.  So I'm protecting only
the coroutines in that queue, because only they can be interrupted.  And
the coroutines they call are protected *automatically*.
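Roughly like this (a simplified illustration, not our actual code; the
names are invented):

```python
import heapq
import itertools
import time

class TimeoutRegistry:
    """Only coroutines started with an explicit timeout are registered,
    so only they are candidates for interruption; everything they call
    is protected automatically by never appearing here."""

    def __init__(self):
        self._heap = []
        self._tick = itertools.count()  # tie-breaker for equal deadlines

    def register(self, coro, seconds, now=None):
        now = time.monotonic() if now is None else now
        heapq.heappush(self._heap, (now + seconds, next(self._tick), coro))

    def expired(self, now=None):
        now = time.monotonic() if now is None else now
        out = []
        while self._heap and self._heap[0][0] <= now:
            out.append(heapq.heappop(self._heap)[2])
        return out  # interrupt these, unless they are inside a finally
```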

If you do it differently, can you please elaborate on how your scheduler
is actually designed?

>>> 
>>> The whole intention of using coroutine library is to not to
>>> have thread pool. Could you describe your use case
>>> with more details?
>> 
>> Well, our company has been using coroutines for like 2.5 years
>> now (the framework in not yet opensourced).  And in our practice
>> threadpool is really handy, as it allows you to:
>> 
>> - use non-asyncronous libraries, which you don't want to
>> monkeypatch with greensockets (or even unable to mokeypatch)
>> 
> 
> And we rewrite them in python. It seems to be more useful.

Sometimes you can't afford the luxury ;)

> 
>> - wrap some functions that are usually very fast, but once in
>> a while may take some time.  And sometimes you don't want to
>> offload them to a separate process
>> 
> 
> Ack.
> 
>> - and yes, do DNS lookups if you don't have a compiled cpython
>> extension that wraps c-ares or something alike.
>> 
> 
> Maybe let's propose asynchronous DNS library for python?
> We have same problem, although we do not resolve hosts at
> runtime (only at startup) so synchronous one is well enough
> for our purposes.
> 
>> Please let's avoid shifting further discussion to proving or
>> disproving the necessity of threadpools.
> 
> Agreed.
> 
>> They are being actively used and there is a demand for
>> (more or less) graceful threads interruption or abortion.
>> 
> 
> Given these use cases, what stops you from adding explicit
> interruption points?
> 
>> 
>> Please write a PEP and we'll continue discussion from that
>> point.  Hopefully, it will get more attention than this thread.
>> 
> 
> I don't see the point in writing a PEP until I have an idea
> what PEP should propose. If you have, you can do it. Again

OK, point taken.  Please give me a couple of days to at least
come up with a summary document.  I still don't like your
solution because it works directly with frames.  With the
upcoming PyPy support of Python 3, I don't think I want
to lose the JIT support.

I also want to take a look at the new PyPy continuations.

Ideally, as I proposed earlier, we should introduce some
sort of interruption protocol -- method 'interrupt()', with 
perhaps a callback.

> you want to implement thread interruption, and that's not
> my point, there is another thread for that.

We have two requests: the ability to safely interrupt a Python
function or generator (1); the ability to safely interrupt
Python's threads (2).  Both (1) and (2) share the same
requirement of safe 'finally' statements.  To me, both
features are similar enough to come up with a single
solution, rather than inventing different approaches.

> On Wed, Apr 4, 2012 at 3:03 AM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>> 
>> I don't think a frame flag on its own is quite enough.
>> You don't just want to prevent interruptions while in
>> a finally block, you want to defer them until the finally
>> counter gets back to zero. Making the interrupter sleep
>> and try again in that situation is rather ugly.

That's the second reason I don't like your proposal.

def foo():
   try:
      ..
   finally:
      yield unlock()
   # <--- the ideal point to interrupt foo

   f = open('a', 'w')
   # what if we interrupt it here?
   try:
      ..
   finally:
      f.close()

>> So perhaps there could also be a callback that gets
>> invoked when the counter goes down to zero.
> 
> Do you mean put callback in a frame, which get
> executed at next bytecode just like signal handler,
> except it waits until finally clause is executed?
> 
> I would work, except in may have light performance
> impact on each bytecode. But I'm not sure if it will
> be noticeable.

That's essentially the way we currently do it.  We transform the
coroutine's __code__ object to make it from:

def a():
   try:
      # code1
   finally:
      # code2

to:

def a():
   __self__ = __get_current_coroutine()
   try:
     # code1
   finally:
     __self__.enter_finally()
     try:
       # code2
     finally:
       __self__.exit_finally()
       
'enter_finally' and 'exit_finally' maintain the internal counter
of finally blocks.  If a coroutine needs to be interrupted, we check
that counter.  If it is 0 - throw in a special exception.  If not - 
wait till it becomes 0 and throw the exception in 'exit_finally'.

Works flawlessly, but with the high cost of having to patch
__code__ objects.

-
Yury

From sven at marnach.net  Wed Apr  4 19:01:58 2012
From: sven at marnach.net (Sven Marnach)
Date: Wed, 4 Apr 2012 18:01:58 +0100
Subject: [Python-ideas] dict.items to accept optional iterable with keys
 to use
In-Reply-To: <CA+Lge10R7O78ptDO=MS3b-YQrr1TLTwSCVHKW8n=fW=DhugquA@mail.gmail.com>
References: <CA+Lge10R7O78ptDO=MS3b-YQrr1TLTwSCVHKW8n=fW=DhugquA@mail.gmail.com>
Message-ID: <20120404170158.GB2470@bagheera>

Victor Varvariuc schrieb am Wed, 04. Apr 2012, um 11:07:55 +0300:
> Sometimes you want a dict which is a subset of another dict. It would be nice
> if dict.items accepted an optional list of keys to return. If no keys are
> given - use the default behavior - get all items.

How about using `operator.itemgetter()`?

    from operator import itemgetter
    itemgetter(*keys)(my_dict)

It will return a tuple of the values corresponding to the given keys.
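For example (zipping back with the keys if a sub-dict rather than a tuple
of values is wanted):

```python
from operator import itemgetter

d = {1: 'one', 2: 'two', 3: 'three', 4: 'four', 5: 'five'}
keys = (1, 3, 5)

values = itemgetter(*keys)(d)      # tuple of values for the given keys
subdict = dict(zip(keys, values))  # rebuild a dict subset from them
```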

Cheers,
    Sven


From paul at colomiets.name  Wed Apr  4 20:44:30 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Wed, 4 Apr 2012 21:44:30 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
	<CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>
	<A2767D04-7D85-46D5-A72F-2CB1839B9133@gmail.com>
	<CAA0gF6o-8J_fDbD7hnry1=U6dMnHGN-bpYrnecQ44-cZkysJAw@mail.gmail.com>
	<7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com>
Message-ID: <CAA0gF6r_cFitcnEwj6oCXWxE3JXDtYPe+UhKc0p-KqXnovRphg@mail.gmail.com>

Hi Yury,

On Wed, Apr 4, 2012 at 7:59 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>> Here is the previous example in more detail (although, still simplified):
>>
>> @coroutine
>> def add_money(user_id, money):
>>     yield redis_lock(user_id)
>>     try:
>>         yield redis_incr('user:'+user_id+':money', money)
>>     finally:
>>         yield redis_unlock(user_id)
>>
>> # this one is crucial to show the point of discussion
>> # other functions are similar:
>> @coroutine
>> def redis_unlock(lock):
>>     yield redis_socket.wait_write()  # yields back when socket is
>> ready for writing
>>     cmd = ('DEL user:'+lock+'\n').encode('ascii')
>>     redis_socket.write(cmd)  # should be a loop here, actually
>>     yield redis_socket.wait_read()
>>     result = redis_socket.read(1024)  # a loop here too
>>     assert result == 'OK\n'
>>
>> When the trampoline gets a coroutine from a `next()` or `send()` call,
>> it puts it on top of the stack and doesn't dispatch the original one
>> until the topmost one has exited.
>>
>> The point is that if a timeout arrives inside the `redis_unlock` function, we
>> must wait until the finally clause in `add_money` is finished
>
> How can it "arrive" inside "redis_unlock"?  Let's assume you called
> "add_money" as such:
>
> yield add_money(1, 10).with_timeout(10)
>
> Then it's the 'add_money' coroutine that should be in the timeouts queue/tree!
> 'add_money' specifically is what should be interrupted when your 10s timeout
> expires.  And if 'add_money' is in its 'finally' statement - you simply postpone
> its interruption, meaning that 'redis_unlock' will end its execution nicely.
>
> Again, I'm not sure how exactly you manage your timeouts.  The way I do it,
> simplified: I have a timeouts heapq with pointers to those coroutines
> that were *explicitly* executed with a timeout.  So I'm protecting only
> the coroutines in that queue, because only they can be interrupted.  And
> the coroutines they call are protected *automatically*.
>
> If you do it differently, can you please elaborate on how your scheduler
> is actually designed?
>

I have a global timeout for processing a single request. It's actually
higher up in the chain of generator calls. So the dispatcher looks like:

def dispatcher(self, method, args):
    with timeout(10):
        yield getattr(self, method)(*args)

And all the local timeouts, like the timeout for a single request, are
usually applied at the socket level, where the specific protocol
is implemented:

def redis_unlock(lock):
    yield redis_socket.wait_write(2)  # wait two seconds
    # TimeoutError may have been raised in wait_write()
    cmd = ('DEL user:'+lock+'\n').encode('ascii')
    redis_socket.write(cmd)  # should be a loop here, actually
    yield redis_socket.wait_read(2)  # another two seconds
    result = redis_socket.read(1024)  # a loop here too
    assert result == 'OK\n'

So they are not interruptions. Although we don't use them
much with coroutines; the global timeout for a request is
usually enough.

But anyway, I don't see a reason to protect a single frame,
because even if you have a simple mutex without coroutines
you end up with:

def something():
  lock.acquire()
  try:
    pass
  finally:
    lock.release()

And if the lock's implementation is something along the lines of:

def release(self):
    self._native_lock.release()

how would you be sure that the interruption is not delivered
when the interpreter has resolved `self._native_lock.release` but
has not yet called it?
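You can see that non-atomicity in the bytecode (an illustrative check;
the `_Lock` class is just a stand-in):

```python
import dis
import io

class _Lock:
    def __init__(self, native):
        self._native_lock = native

    def release(self):
        self._native_lock.release()

# The attribute lookups and the call are separate bytecode
# instructions, so an asynchronous interruption can land between
# resolving `self._native_lock.release` and actually calling it.
buf = io.StringIO()
dis.dis(_Lock.release, file=buf)
listing = buf.getvalue()
```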

> OK, point taken.  Please give me a couple of days to at least
> come up with a summary document.

No hurry.

> I still don't like your
> solution because it works directly with frames.  With the
> upcoming PyPy support of Python 3, I don't think I want
> to lose the JIT support.
>

It's also an interesting question. I don't think it's possible to interrupt
JIT'ed code at an arbitrary location.

> Ideally, as I proposed earlier, we should introduce some
> sort of interruption protocol -- method 'interrupt()', with
> perhaps a callback.
>

On which object? Is it sys.interrupt()? Or is it thread.interrupt()?

>> you want to implement thread interruption, and that's not
>> my point, there is another thread for that.
>
> We have two requests: ability to safely interrupt python
> function or generator (1); ability to safely interrupt
> python's threads (2).  Both (1) and (2) share the same
> requirement of safe 'finally' statements.  To me, both
> features are similar enough to come up with a single
> solution, rather than inventing different approaches.
>

Again, I do not propose the described point (1). I propose a
way to *inspect* the stack to see whether it's safe to interrupt.

>> On Wed, Apr 4, 2012 at 3:03 AM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>>>
>>> I don't think a frame flag on its own is quite enough.
>>> You don't just want to prevent interruptions while in
>>> a finally block, you want to defer them until the finally
>>> counter gets back to zero. Making the interrupter sleep
>>> and try again in that situation is rather ugly.
>
> That's the second reason I don't like your proposal.
>
> def foo():
>   try:
>      ..
>   finally:
>      yield unlock()
>   # <--- the ideal point to interrupt foo
>
>   f = open('a', 'w')
>   # what if we interrupt it here?
>   try:
>      ..
>   finally:
>      f.close()
>

And which one fixes this problem? There is no guarantee
that your timeout code hasn't interrupted
at "# what if we interrupt it here?". If it's merely a bit less likely,
that's not a real solution. Please don't present it as such.

>>> So perhaps there could also be a callback that gets
>>> invoked when the counter goes down to zero.
>>
>> Do you mean putting a callback in a frame, which gets
>> executed at the next bytecode just like a signal handler,
>> except it waits until the finally clause is executed?
>>
>> It would work, except it may have a slight performance
>> impact on each bytecode. But I'm not sure if it will
>> be noticeable.
>
> That's essentially the way we currently did it.  We transform the
> coroutine's __code__ object to make it from:
>
> def a():
>   try:
>      # code1
>   finally:
>      # code2
>
> to:
>
> def a():
>   __self__ = __get_current_coroutine()
>   try:
>     # code1
>   finally:
>     __self__.enter_finally()
>     try:
>       # code2
>     finally:
>       __self__.exit_finally()
>
> 'enter_finally' and 'exit_finally' maintain the internal counter
> of finally blocks.  If a coroutine needs to be interrupted, we check
> that counter.  If it is 0 - throw in a special exception.  If not -
> wait till it becomes 0 and throw the exception in 'exit_finally'.
>

The problem is in interrupting another thread. You must
inspect the stack only from C code holding the GIL. The implementation
might be more complex, but yes, it probably can be done
without a noticeable slowdown.
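A minimal sketch of the counter bookkeeping Yury describes above. The names are hypothetical, and a real scheduler would deliver the exception with `gen.throw()` rather than raising it in its own frame; raising directly just keeps the sketch self-contained:

```python
class Coroutine:
    """Defers interruption while any finally block is active."""
    def __init__(self):
        self._finally_depth = 0   # how many finally blocks are on the stack
        self._pending_exc = None  # interruption waiting for a safe point

    def enter_finally(self):
        self._finally_depth += 1

    def exit_finally(self):
        self._finally_depth -= 1
        if self._finally_depth == 0 and self._pending_exc is not None:
            # outermost finally just finished: deliver the deferred exception
            exc, self._pending_exc = self._pending_exc, None
            raise exc

    def interrupt(self, exc):
        if self._finally_depth == 0:
            raise exc             # safe: not inside any finally block
        self._pending_exc = exc   # defer until the outermost finally exits
```

With the code transformation shown above, `enter_finally`/`exit_finally` calls bracket every finally block, so an interruption requested mid-cleanup fires only once the cleanup completes.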

-- 
Paul


From tjreedy at udel.edu  Wed Apr  4 21:05:34 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 04 Apr 2012 15:05:34 -0400
Subject: [Python-ideas] Python probe: execute code in isolation
 (subinterpreter?) and get results
In-Reply-To: <CAPkN8x+z8jNBfYsEi1-qSeWFGsUMwu75vJjEz=6CnrWEvDDOZQ@mail.gmail.com>
References: <CAPkN8x+z8jNBfYsEi1-qSeWFGsUMwu75vJjEz=6CnrWEvDDOZQ@mail.gmail.com>
Message-ID: <jli62e$b99$1@dough.gmane.org>

On 4/4/2012 4:25 AM, anatoly techtonik wrote:

> Story #2: Get settings from user script
>
> Blender uses SCons to automate builds. SCons script is written in
> Python and it uses execfile(filename, globals, locals) to read
> platform specific user script with settings. Unfortunately, execfile()
> is a hack that doesn't treat Python scripts the same way the
> interpreter treats them - for example, globals access will throw
> exceptions - http://bugs.python.org/issue14049

Please stop misrepresenting Python. I clearly explained the issue in 
14049 before closing it. If you did not understand, re-read until you 
do. Trying to get an object via an unbound name (non-existent variable) 
always results in a NameError. There is nothing unique about execfile. 
If Blender's SCons script is using execfile wrong (by passing separate 
globals and locals), tell them to fix *their* bug.

-- 
Terry Jan Reedy



From yselivanov.ml at gmail.com  Wed Apr  4 21:37:16 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 4 Apr 2012 15:37:16 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6r_cFitcnEwj6oCXWxE3JXDtYPe+UhKc0p-KqXnovRphg@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
	<CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>
	<A2767D04-7D85-46D5-A72F-2CB1839B9133@gmail.com>
	<CAA0gF6o-8J_fDbD7hnry1=U6dMnHGN-bpYrnecQ44-cZkysJAw@mail.gmail.com>
	<7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com>
	<CAA0gF6r_cFitcnEwj6oCXWxE3JXDtYPe+UhKc0p-KqXnovRphg@mail.gmail.com>
Message-ID: <61FF0CD2-A658-4E90-A2CF-975D972774D4@gmail.com>

On 2012-04-04, at 2:44 PM, Paul Colomiets wrote:
> I have a global timeout for processing single request. It's actually higher
> in a chain of generator calls. So dispatcher looks like:
> 
> def dispatcher(self, method, args):
>    with timeout(10):
>        yield getattr(self, method)(*args)

How does it work?  To what object are you actually attaching timeout?

I'm just curious now how your 'timeout' context manager works.

And what's the advantage of having some "global" timeout instead
of a timeout specifically bound to some coroutine?

Do you have that code publicly released somewhere?  I just really want
to understand how exactly your architecture works to come up with a 
better proposal (if there is one possible ;).

As an off-topic: it would be interesting to have various coroutine
approaches and architectures listed somewhere, to understand how
Python programmers actually do it.

> And all the local timeouts, like timeout for single request are
> usually applied at a socket level, where specific protocol
> is implemented:
> 
> def redis_unlock(lock):
>    yield redis_socket.wait_write(2)  # wait two seconds
>   # TimeoutError may have been raised in wait_write()
>    cmd = ('DEL user:'+lock+'\n').encode('ascii')
>    redis_socket.write(cmd)  # should be loop here, actually
>    yield redis_socket.wait_read(2)  # another two seconds
>    result = redis_socket.read(1024)  # here loop too
>    assert result == 'OK\n'

So you have explicit timeouts in the 'redis_unlock', but you want
them to be ignored if it was called from some 'finally' block?

> So they are not interruptions. Although, we don't use them
> much with coroutines, global timeout for request is
> usually enough.

Don't really follow you here.

> But anyway I don't see a reason to protect a single frame,
> because even if you have a simple mutex without coroutines
> you end up with:
> 
> def something():
>  lock.acquire()
>  try:
>    pass
>  finally:
>    lock.release()
> 
> And if the lock's implementation is something along the lines of:
> 
> def release(self):
>    self._native_lock.release()
> 
> How would you be sure that interruption is not executed
> when interpreter resolved `self._native_lock.release` but
> not yet called it?

Is this in the context of coroutines or threads?  If the former, then
because you, perhaps, want to interrupt 'something()'?  And it is a
separate frame from the frame where 'release()' is running?

> It's also interesting question. I don't think it's possible to interrupt
> JIT'ed code in arbitrary location.

I guess that should really be asked on the pypy-dev mail-list, once 
we have a proposal.

>> That's the second reason I don't like your proposal.
>> 
>> def foo():
>>   try:
>>      ..
>>   finally:
>>      yield unlock()
>>   # <--- the ideal point to interrupt foo
>> 
>>   f = open('a', 'w')
>>   # what if we interrupt it here?
>>   try:
>>      ..
>>   finally:
>>      f.close()
>> 
> 
> And which one fixes this problem? There is no guarantee
> that your timeout code haven't interrupted
> at " # what if we interrupt it here?". If it's a bit less likely,
> it's not real solution. Please, don't present it as such.

Sorry, I should have explained it in more detail.  Right now we
interrupt code only where we have a 'yield', a greenlet.switch(), 
or at the end of a finally block, not at some arbitrary opcode.

-
Yury


From yselivanov.ml at gmail.com  Wed Apr  4 22:07:19 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 4 Apr 2012 16:07:19 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6r_cFitcnEwj6oCXWxE3JXDtYPe+UhKc0p-KqXnovRphg@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
	<CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>
	<A2767D04-7D85-46D5-A72F-2CB1839B9133@gmail.com>
	<CAA0gF6o-8J_fDbD7hnry1=U6dMnHGN-bpYrnecQ44-cZkysJAw@mail.gmail.com>
	<7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com>
	<CAA0gF6r_cFitcnEwj6oCXWxE3JXDtYPe+UhKc0p-KqXnovRphg@mail.gmail.com>
Message-ID: <7B674147-30DD-4ACD-A62A-3B1220994F31@gmail.com>

On 2012-04-04, at 2:44 PM, Paul Colomiets wrote:

> But anyway I don't see a reason to protect a single frame,
> because even if you have a simple mutex without coroutines
> you end up with:

BTW, for instance, in our framework each coroutine is a special
object that wraps generator/plain function.  It controls everything
that the underlying generator/function yields/returns.  But the actual
execution, propagation of returned values and raised errors is the
scheduler's job.  So when you are yielding a coroutine from another 
coroutine, frames are not even connected to each other, since the 
actual execution of the callee will be performed by the scheduler.  
It's not like a regular python call.

For us, having 'f_in_finally' somehow propagated would be completely 
useless.

I think even if it's decided to implement just your proposal, I feel
that 'f_in_finally' should indicate the state of only its *own* frame.

-
Yury


From paul at colomiets.name  Wed Apr  4 22:43:05 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Wed, 4 Apr 2012 23:43:05 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <61FF0CD2-A658-4E90-A2CF-975D972774D4@gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
	<CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>
	<A2767D04-7D85-46D5-A72F-2CB1839B9133@gmail.com>
	<CAA0gF6o-8J_fDbD7hnry1=U6dMnHGN-bpYrnecQ44-cZkysJAw@mail.gmail.com>
	<7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com>
	<CAA0gF6r_cFitcnEwj6oCXWxE3JXDtYPe+UhKc0p-KqXnovRphg@mail.gmail.com>
	<61FF0CD2-A658-4E90-A2CF-975D972774D4@gmail.com>
Message-ID: <CAA0gF6qALdmG-sag9piGXTE2KTWiw5qMAE6aPFFaRz=aHXJTAg@mail.gmail.com>

Hi Yury,

On Wed, Apr 4, 2012 at 10:37 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> On 2012-04-04, at 2:44 PM, Paul Colomiets wrote:
>> I have a global timeout for processing single request. It's actually higher
>> in a chain of generator calls. So dispatcher looks like:
>>
>> def dispatcher(self, method, args):
>>    with timeout(10):
>>        yield getattr(self, method)(*args)
>
> How does it work?  To what object are you actually attaching timeout?
>

There is basically a "Coroutine" object. It's actually a list of
paused generators, with the top one being the currently running one
(or one stopped for doing IO). It represents the stack, because there
is no built-in call stack for generators.
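For illustration, a toy trampoline along the lines Paul describes: the "Coroutine" is just a list of paused generators, the top entry being the one currently running. This is a sketch, not the actual framework code:

```python
import types

def run(root):
    """Drive a generator-based coroutine, keeping the virtual call stack
    as a plain list (the top entry is the currently running generator)."""
    stack = [root]
    value = None
    while stack:
        try:
            result = stack[-1].send(value)
        except StopIteration as stop:
            stack.pop()               # callee finished: resume its caller
            value = stop.value        # propagate the return value
            continue
        if isinstance(result, types.GeneratorType):
            stack.append(result)      # 'yield subcoroutine()' pushes a frame
            value = None
        else:
            value = result            # toy stand-in for an I/O wait
    return value

def b():
    yield "wait-io"                   # stand-in for waiting on a socket
    return 42

def a():
    result = yield b()                # the trampoline pushes b() on the stack
    return result + 1
```

Here `run(a())` evaluates to 43: `b()`'s return value propagates back into `a()` even though the two generator frames are never linked at the interpreter level, which is exactly why the scheduler has to maintain the stack itself.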

> And what's the advantage of having some "global" timeout instead
> of a timeout specifically bound to some coroutine?
>

We have a guaranteed time for request processing. Or, to
be more precise, a guaranteed time at which a request stops
processing, so we don't have a lot of coroutines hanging
forever. It allows us not to place timeouts all over the code.

Maybe your use case is very different. E.g. this pattern
doesn't work well with batch processing of big data. We
process many tiny user requests per second.

> Do you have that code publicly released somewhere?  I just really want
> to understand how exactly your architecture works to come up with a
> better proposal (if there is one possible ;).
>

This framework does timeout handling in described way:

https://github.com/tailhook/zorro

Although, it's using greenlets. The difference is that we
don't need to keep a stack in our own scheduler
when using greenlets, but everything else applies.

> As an off-topic: would be interesting to have various coroutines
> approaches and architectures listed somewhere, to understand how
> python programmers actually do it.
>

Sure :)

>> And all the local timeouts, like timeout for single request are
>> usually applied at a socket level, where specific protocol
>> is implemented:
>>
>> def redis_unlock(lock):
>>    yield redis_socket.wait_write(2)  # wait two seconds
>>   # TimeoutError may have been raised in wait_write()
>>    cmd = ('DEL user:'+lock+'\n').encode('ascii')
>>    redis_socket.write(cmd)  # should be loop here, actually
>>    yield redis_socket.wait_read(2)  # another two seconds
>>    result = redis_socket.read(1024)  # here loop too
>>    assert result == 'OK\n'
>
> So you have explicit timeouts in the 'redis_unlock', but you want
> them to be ignored if it was called from some 'finally' block?
>

No! I'd just omit them if I wanted that. What I don't want is the
interruption of `add_money`, which calls `redis_unlock` in a finally block.

>> So they are not interruptions. Although, we don't use them
>> much with coroutines, global timeout for request is
>> usually enough.
>
> Don't really follow you here.
>

You may think of it as socket with timeout set.

socket.set_timeout(2)
socket.recv(1024)

It will raise TimeoutError, and this should propagate as
a normal exception. As opposed to being externally
interrupted, e.g. with SIGINT or SIGALRM.
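Made runnable with a local socketpair (a sketch, not Paul's code), the two lines above show the distinction: the timeout surfaces as an ordinary exception raised inside `recv()`, not as an external interruption delivered at an arbitrary point:

```python
import socket

a_sock, b_sock = socket.socketpair()
a_sock.settimeout(0.1)   # the example above uses two seconds; shorter here

try:
    a_sock.recv(1024)    # nothing was ever sent, so this must time out
    timed_out = False
except socket.timeout:   # a normal exception, propagating normally --
    timed_out = True     # not an external interruption like SIGALRM

a_sock.close()
b_sock.close()
assert timed_out
```

Because the exception originates inside the call, any enclosing try/finally runs exactly as it would for any other error.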

>> But anyway I don't see a reason to protect a single frame,
>> because even if you have a simple mutex without coroutines
>> you end up with:
>>
>> def something():
>>   lock.acquire()
>>   try:
>>     pass
>>   finally:
>>     lock.release()
>>
>> And if the lock's implementation is something along the lines of:
>>
>> def release(self):
>>    self._native_lock.release()
>>
>> How would you be sure that the interruption is not executed
>> when the interpreter has resolved `self._native_lock.release` but
>> not yet called it?
>
> Is it in a context of coroutines or threads?

I don't see a difference, except for the code which maintains
the stack. I'd say both are a problem if you neither propagate
f_in_finally nor traverse the stack (although the way of
propagation may be different).

> If the former, then
> because you, perhaps, want to interrupt 'something()'?

I want to interrupt a thread. Or "Coroutine" in definition
described above (having a stack of frames) or in greenlet's
definition.

> And it is a
> separate frame from the frame where 'release()' is running?

Of course (how could it be inlined? :) )

>>> That's the second reason I don't like your proposal.
>>>
>>> def foo():
>>>   try:
>>>      ..
>>>   finally:
>>>      yield unlock()
>>>   # <--- the ideal point to interrupt foo
>>>
>>>   f = open('a', 'w')
>>>   # what if we interrupt it here?
>>>   try:
>>>      ..
>>>   finally:
>>>      f.close()
>>>
>>
>> And which one fixes this problem? There is no guarantee
>> that your timeout code hasn't interrupted
>> at "# what if we interrupt it here?". If it's a bit less likely,
>> it's not a real solution. Please, don't present it as such.
>
> Sorry, I should have explained it in more detail.  Right now we
> interrupt code only where we have a 'yield', a greenlet.switch(),
> or at the end of a finally block, not at some arbitrary opcode.
>

Sure I do similar. But it doesn't work with threads, as
they have no explicit yield or switch.


On Wed, Apr 4, 2012 at 11:07 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> On 2012-04-04, at 2:44 PM, Paul Colomiets wrote:
>
>> But anyway I don't see a reason to protect a single frame,
>> because even if you have a simple mutex without coroutines
>> you end up with:
>
> BTW, for instance, in our framework each coroutine is a special
> object that wraps generator/plain function.  It controls everything
> that the underlying generator/function yields/returns.  But the actual
> execution, propagation of returned values and raised errors is the
> scheduler's job.  So when you are yielding a coroutine from another
> coroutine, frames are not even connected to each other, since the
> actual execution of the callee will be performed by the scheduler.
> It's not like a regular python call.
>

Same applies here. But you propagate the return value/error, right?
So you can't say "frames are not connected". They aren't from
the interpreter's point of view, but they are logically connected.

So for example:

def a():
  yield b()

def b():
  yield

If `a().with_timeout(0.1)` is interrupted while it's waiting for the value
of `b()`, will `b()` continue its execution?

> For us, having 'f_in_finally' somehow propagated would be completely
> useless.
>

I hope I can convince you with this email :)

> I think even if it's decided to implement just your proposal, I feel
> that 'f_in_finally' should indicate the state of only its *own* frame.

That was the original intention. But it requires stack traversal. Andrew
proposed propagating this flag, which is another point of view
on the same thing (not sure which one to pick, though)

-- 
Paul


From yselivanov.ml at gmail.com  Wed Apr  4 23:24:45 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 4 Apr 2012 17:24:45 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6qALdmG-sag9piGXTE2KTWiw5qMAE6aPFFaRz=aHXJTAg@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
	<CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>
	<A2767D04-7D85-46D5-A72F-2CB1839B9133@gmail.com>
	<CAA0gF6o-8J_fDbD7hnry1=U6dMnHGN-bpYrnecQ44-cZkysJAw@mail.gmail.com>
	<7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com>
	<CAA0gF6r_cFitcnEwj6oCXWxE3JXDtYPe+UhKc0p-KqXnovRphg@mail.gmail.com>
	<61FF0CD2-A658-4E90-A2CF-975D972774D4@gmail.com>
	<CAA0gF6qALdmG-sag9piGXTE2KTWiw5qMAE6aPFFaRz=aHXJTAg@mail.gmail.com>
Message-ID: <F2EDF182-609F-4609-A4B0-F982A79F9676@gmail.com>

On 2012-04-04, at 4:43 PM, Paul Colomiets wrote:
>> How does it work?  To what object are you actually attaching timeout?
>> 
> 
> There is basically a "Coroutine" object. It's actually a list with
> paused generators, with top of them being currently running
> (or stopped for doing IO). It represents stack, because there
> is no built-in stack for generators.

Interesting.  I decided to go with simple coroutine objects with
a '_caller' pointer to maintain the stack virtually.
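A sketch of what such a virtual stack might look like, including the check a scheduler could run before interrupting. All names are hypothetical; the `in_finally` attribute stands in for the proposed `f_in_finally` frame flag:

```python
class Coroutine:
    """Virtual call stack: each coroutine points back at its caller."""
    def __init__(self, gen, caller=None):
        self.gen = gen            # the wrapped generator (unused in the sketch)
        self._caller = caller     # link to the coroutine that yielded us
        self.in_finally = False   # would mirror the proposed f_in_finally

    def call_chain(self):
        """Walk from the innermost coroutine back to the root."""
        node = self
        while node is not None:
            yield node
            node = node._caller

def safe_to_interrupt(coro):
    # Interrupt only if no frame in the virtual stack is inside a finally
    return not any(c.in_finally for c in coro.call_chain())
```

This is the "stack traversal" variant: instead of propagating the flag on every enter/exit, the interrupter walks the `_caller` chain at interruption time.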

> This framework does timeout handling in described way:
> 
> https://github.com/tailhook/zorro
> 
> Although, it's using greenlets. The difference is that we
> we don't need to keep a stack in our own scheduler
> when using greenlets, but everything else applies.

Are you using that particular framework (zorro)?  Or some modification
of it that uses generators too?

>> Sorry, I must had it explained in more details.  Right now we
>> interrupt code only where we have a 'yield', a greenlet.switch(),
>> or at the end of finally block, not at some arbitrary opcode.
>> 
> 
> Sure I do similar. But it doesn't work with threads, as
> they have no explicit yield or switch.

OK.

> On Wed, Apr 4, 2012 at 11:07 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>> On 2012-04-04, at 2:44 PM, Paul Colomiets wrote:
>> 
>>> But anyway I don't see a reason to protect a single frame,
>>> because even if you have a simple mutex without coroutines
>>> you end up with:
>> 
>> BTW, for instance, in our framework each coroutine is a special
>> object that wraps generator/plain function.  It controls everything
>> that the underlying generator/function yields/returns.  But the actual
>> execution, propagation of returned values and raised errors is the
>> scheduler's job.  So when you are yielding a coroutine from another
>> coroutine, frames are not even connected to each other, since the
>> actual execution of the callee will be performed by the scheduler.
>> It's not like a regular python call.
>> 
> 
> Same applies here. But you propagate return value/error right?
> So you can't say "frames are not connected". They aren't from
> the interpreter point of view. But they are logically connected.

OK, we're on the same page.  '''"frames are not connected" from the
interpreter point of view''', that essentially means that 'f_in_finally'
will always be a flag related only to its own frame, right?  Any 
'propagation' of this flag is the responsibility of framework developers.

> So for example:
> 
> def a():
>  yield b()
> 
> def b():
>  yield
> 
> If `a().with_timeout(0.1)` is interrupted when it's waiting for value
> of `b()`, will `b()` continue its execution?

Well, in our framework, if a() is getting aborted after it has scheduled
b(), but before it received the result of b(), we interrupt both of them
(and those that b() might have called).

>> I think even if it's decided to implement just your proposal, I feel
>> that 'f_in_finally' should indicate the state of only its *own* frame.
> 
> That was the original intention. But it requires stack traversal. Andrew
> proposed propagating this flag, which is another point of view
> on the same thing (not sure which one to pick, though)

Again, if coroutines' frames aren't connected at the interpreter level
(it's the responsibility of a framework), what exact propagation are you
and Andrew talking about (in the sole context of the patch to CPython)?

-
Yury

From paul at colomiets.name  Wed Apr  4 23:46:13 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Thu, 5 Apr 2012 00:46:13 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <F2EDF182-609F-4609-A4B0-F982A79F9676@gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<7113307B-D8CB-4229-8D06-7242C1BE863E@gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
	<CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>
	<A2767D04-7D85-46D5-A72F-2CB1839B9133@gmail.com>
	<CAA0gF6o-8J_fDbD7hnry1=U6dMnHGN-bpYrnecQ44-cZkysJAw@mail.gmail.com>
	<7007EA8B-09BB-46B8-A5D1-52CAB128DAB0@gmail.com>
	<CAA0gF6r_cFitcnEwj6oCXWxE3JXDtYPe+UhKc0p-KqXnovRphg@mail.gmail.com>
	<61FF0CD2-A658-4E90-A2CF-975D972774D4@gmail.com>
	<CAA0gF6qALdmG-sag9piGXTE2KTWiw5qMAE6aPFFaRz=aHXJTAg@mail.gmail.com>
	<F2EDF182-609F-4609-A4B0-F982A79F9676@gmail.com>
Message-ID: <CAA0gF6oBf02MPk3NSG=RAGgb=skt1uAaHBzdJH9GG4Vf5+c=kQ@mail.gmail.com>

Hi Yury,

On Thu, Apr 5, 2012 at 12:24 AM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> On 2012-04-04, at 4:43 PM, Paul Colomiets wrote:
>>> How does it work?  To what object are you actually attaching timeout?
>>>
>>
>> There is basically a "Coroutine" object. It's actually a list with
>> paused generators, with top of them being currently running
>> (or stopped for doing IO). It represents stack, because there
>> is no built-in stack for generators.
>
> Interesting.  I decided to go with simple coroutine objects with
> a '_caller' pointer to maintain the stack virtually.
>

It doesn't matter. IIRC, that was to draw a tree of calls starting
with roots. But that's irrelevant to the topic of discussion.

>> This framework does timeout handling in described way:
>>
>> https://github.com/tailhook/zorro
>>
>> Although, it's using greenlets. The difference is that we
>> we don't need to keep a stack in our own scheduler
>> when using greenlets, but everything else applies.
>
> Are you using that particular framework (zorro)?  Or some modification
> of it that uses generators too?
>

Currently we're experimenting with greenlets and zorro. My
description of yield-based coroutines is from an earlier project
(unfortunately a non-public one).

> OK, we're on the same page.  '''"frames are not connected" from the
> interpreter point of view''', that essentially means that 'f_in_finally'
> will always be a flag related only to its own frame, right?  Any
> 'propagation' of this flag is the responsibility of framework developers.
>

Yes, that's ok for me.

>> So for example:
>>
>> def a():
>>   yield b()
>>
>> def b():
>>   yield
>>
>> If `a().with_timeout(0.1)` is interrupted when it's waiting for value
>> of `b()`, will `b()` continue its execution?
>
> Well, in our framework, if a() is getting aborted after it's scheduled
> b(), but before it received the result of b(), we interrupt both of them
> (and those that b() might had called).
>

Exactly! And you don't want them to be interrupted in the case
where `a()` is rewritten as:

def a():
  try: pass
  finally: yield b()

Same with threads and greenlets.

>>> I think even if it's decided to implement just your proposal, I feel
>>> that 'f_in_finally' should indicate the state of only its *own* frame.
>>
>> That was the original intention. But it requires stack traversal. Andrew
>> proposed propagating this flag, which is another point of view
>> on the same thing (not sure which one to pick, though)
>
> Again, if coroutines' frames aren't connected on the interpreter level
> (it's the responsibility of a framework), about what exact propagation
> are you and Andrew talking (in the sole context of the patch to cpython)?
>

For threads and greenlets the flag can be implicitly propagated, and
for yield-based coroutines f_in_finally can be made writable, so you
can propagate it in your own scheduler.

Not sure it's a good idea, just describing it.

-- 
Paul


From greg.ewing at canterbury.ac.nz  Thu Apr  5 00:18:46 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 05 Apr 2012 10:18:46 +1200
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6o-8J_fDbD7hnry1=U6dMnHGN-bpYrnecQ44-cZkysJAw@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
	<CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>
	<A2767D04-7D85-46D5-A72F-2CB1839B9133@gmail.com>
	<CAA0gF6o-8J_fDbD7hnry1=U6dMnHGN-bpYrnecQ44-cZkysJAw@mail.gmail.com>
Message-ID: <4F7CC8C6.1030902@canterbury.ac.nz>

Paul Colomiets wrote:

> Do you mean putting a callback in a frame, which gets
> executed at the next bytecode just like a signal handler,
> except it waits until the finally clause is executed?

It wouldn't be in each frame -- probably it would just
be a global hook that gets called whenever a finally-counter
anywhere gets decremented from 1 to 0. It would be passed
the relevant frame so it could decide what to do from
there.

I don't think it would have much performance impact, since
it would only get triggered by exiting a finally block.
Nothing would need to happen per bytecode or anything
like that.
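A pure-Python emulation of the hook Greg describes (the names `set_finally_hook`, `enter_finally`, and `exit_finally` are hypothetical; a real implementation would live inside the VM and trigger on the proposed per-frame finally counter):

```python
import sys

_finally_hook = None
_finally_depth = 0

def set_finally_hook(hook):
    """Hypothetical equivalent of a sys.set_finally_hook()."""
    global _finally_hook
    _finally_hook = hook

def enter_finally():
    global _finally_depth
    _finally_depth += 1

def exit_finally():
    global _finally_depth
    _finally_depth -= 1
    if _finally_depth == 0 and _finally_hook is not None:
        # Greg's proposal: hand the relevant frame to the hook so it
        # can decide whether to deliver a pending interruption there.
        _finally_hook(sys._getframe(1))
```

The hook fires only on the 1-to-0 transition, so nothing happens per bytecode; nested finally blocks just bump the counter up and down.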

-- 
Greg


From paul at colomiets.name  Thu Apr  5 00:45:42 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Thu, 5 Apr 2012 01:45:42 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <4F7CC8C6.1030902@canterbury.ac.nz>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
	<CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>
	<A2767D04-7D85-46D5-A72F-2CB1839B9133@gmail.com>
	<CAA0gF6o-8J_fDbD7hnry1=U6dMnHGN-bpYrnecQ44-cZkysJAw@mail.gmail.com>
	<4F7CC8C6.1030902@canterbury.ac.nz>
Message-ID: <CAA0gF6oUJJJOt9HNqJ8CKQmgEwDm04jMHRXhs+1mVA0wWTuBWA@mail.gmail.com>

Hi Greg,

On Thu, Apr 5, 2012 at 1:18 AM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> Paul Colomiets wrote:
>
>> Do you mean put callback in a frame, which get
>> executed at next bytecode just like signal handler,
>> except it waits until finally clause is executed?
>
>
> It wouldn't be in each frame -- probably it would just
> be a global hook that gets called whenever a finally-counter
> anywhere gets decremented from 1 to 0. It would be passed
> the relevant frame so it could decide what to do from
> there.
>

It's similar to sys.settrace() except it's only executed when the
finally counter is decremented to 0, right?
The `f_in_finally` flag is still there, right?

It solves my use case well.

Yury, is it ok if I start a PEP with this idea, and when
it has some support (or is rejected), you'll come
up with the thread interruption proposal?

-- 
Paul


From yselivanov.ml at gmail.com  Thu Apr  5 01:19:41 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 4 Apr 2012 19:19:41 -0400
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6oUJJJOt9HNqJ8CKQmgEwDm04jMHRXhs+1mVA0wWTuBWA@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<CAA0gF6pW8oujoLVAgabdwReWby3td1Qw8pg3S4ceoVqFwy5mzQ@mail.gmail.com>
	<339D1D91-947C-41A8-BA34-790C0A0ADDAB@gmail.com>
	<CAA0gF6rhwU-QxfQ-K40n7A9H80yXPbVhdofK-nOjbuLOnmwaJg@mail.gmail.com>
	<EF0E28E3-EF74-4033-A907-89253FBD7D4B@gmail.com>
	<CAA0gF6oJuW5PbXMO4hhQdtwtA_YZZD2t7rZk_NGqhFu+NjwYUQ@mail.gmail.com>
	<7FBC933E-A110-495C-BB37-51E5EFC8B94F@gmail.com>
	<CAA0gF6qhFHQdMD-F1Tsx+aWHK-nEPvfGb2AwDVVzUPMCQCpX1A@mail.gmail.com>
	<187CE198-3636-47A8-85E6-9883D0DC97FE@gmail.com>
	<CAA0gF6oWXkZYYM=eLBZx_XPeN4EAMdCFd-z0BgoWqo+nMiBCtQ@mail.gmail.com>
	<E245C90B-2948-4F6C-80C2-5D96B08AF580@gmail.com>
	<CAA0gF6qs_gp-NbmtLHse6LqQs7q3C-BNkvmf1U8HUDwy__Si=A@mail.gmail.com>
	<314BE01B-160C-4C84-8E0D-AD82D14E2E56@gmail.com>
	<CAA0gF6r6RcsdZwCLVUMsvo2VEumWttMbHPjUar5pKg9HirjGjA@mail.gmail.com>
	<A2767D04-7D85-46D5-A72F-2CB1839B9133@gmail.com>
	<CAA0gF6o-8J_fDbD7hnry1=U6dMnHGN-bpYrnecQ44-cZkysJAw@mail.gmail.com>
	<4F7CC8C6.1030902@canterbury.ac.nz>
	<CAA0gF6oUJJJOt9HNqJ8CKQmgEwDm04jMHRXhs+1mVA0wWTuBWA@mail.gmail.com>
Message-ID: <052F12E9-BAD4-4B51-A850-261B46E8A8C8@gmail.com>

On 2012-04-04, at 6:45 PM, Paul Colomiets wrote:

> It's similar to sys.settrace() except it's only executed when the
> finally counter is decremented to 0, right?
> The `f_in_finally` flag is still there, right?

Yes, please keep it.  With your current proposal, it's the only
way to see whether it is safe to interrupt a coroutine right now, or
whether we have to wait until the callback gets called.

> Yury, is it ok if I start a PEP with this idea, and when
> it has some support (or is rejected), you'll come
> up with the thread interruption proposal?

Sure, go ahead.  If I'm lucky enough to come up with a better 
proposal I promise to shout about it loud ;)

-
Yury


From jimjjewett at gmail.com  Thu Apr  5 18:03:34 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 5 Apr 2012 12:03:34 -0400
Subject: [Python-ideas] comparison of operator.itemgetter objects
In-Reply-To: <4F79737D.6030508@pearwood.info>
References: <CAOVPiMjeBBdwuDyhy7t0zRSNcAS1PdKBXAEQ_UA8-x+BvU_6MA@mail.gmail.com>
	<4F79737D.6030508@pearwood.info>
Message-ID: <CA+OGgf61bWeX0MDm9Ln-jsSsCf1KKCNzQN--DDhGm+MbQ3fG5w@mail.gmail.com>

On Mon, Apr 2, 2012 at 5:38 AM, Steven D'Aprano <steve at pearwood.info> wrote:

> In general, I think that having equality tests fall back on identity test is
> so rarely what you actually want that sometimes I wonder why we bother.

Because identity ==> equality.  (There are exceptions, like NaN, but
that behavior is buggy.)  And for objects without a comparison
function, the most commonly made comparison (e.g., as a dict key) is
one where identity is desired.

-jJ


From jimjjewett at gmail.com  Thu Apr  5 18:33:19 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 5 Apr 2012 12:33:19 -0400
Subject: [Python-ideas] comparison of operator.itemgetter objects
In-Reply-To: <CAOVPiMjeBBdwuDyhy7t0zRSNcAS1PdKBXAEQ_UA8-x+BvU_6MA@mail.gmail.com>
References: <CAOVPiMjeBBdwuDyhy7t0zRSNcAS1PdKBXAEQ_UA8-x+BvU_6MA@mail.gmail.com>
Message-ID: <CA+OGgf4Ht3ojk76Hf86sqsc33rxVeZEbehqYgO_jO57O0-hxjQ@mail.gmail.com>

On Mon, Apr 2, 2012 at 4:30 AM, Max Moroz <maxmoroz at gmail.com> wrote:

> I think that comparing sort keys for equality works well in many useful
> cases:

> (a) Named function. These compare as equal only if they are identical. If
> lhs and rhs were initialized with distinct named functions, I would argue
> that the programmer did not intend them to be compatible for the purpose of
> binary operations, even if they happen to be identical in behavior (e.g., if
> both functions return back the argument passed to them). In a well-designed
> program, there is no need to duplicate the named function definition if the
> two are expected to always have the same behavior.

It may be that they were created as inner functions, and the reason to
duplicate was either to avoid creating the function at all unless it
was needed, or to keep the smaller function's logic near where it was
needed.

In a sense, you are already recognizing this by asking that different
but equivalent functions produced by the itemgetter factory compare
equal.

> (c) itemgetter. Suppose a programmer passed `itemgetter('name')` as the sort
> key argument to the sorted data structure's constructor. The resulting data
> structures would seem incompatible for the purposes of binary operations.
> This is likely to be confusing and undesirable.

operator.attrgetter seems similar.

> (d) lambda functions. Similarly, suppose a programmer passed `lambda x : -x`
> as the sort key argument to the sorted data structure's constructor. Since
> two lambda functions are not identical, they would compare as unequal.

> It seems to be very easy to address the undesirable behavior described in
> (c): add method __eq__() to operator.itemgetter, which would compare the
> list of arguments received at initialization.

Agreed.  I think this may just be a case of someone assuming YAGNI,
but if you do need it, and submit a patch, it should be OK.

> It is far harder to address the undesirable behavior described in (d). If it
> can be addressed at all, it would have to done in the sorted data structure
> implementation, since I don't think anyone would want lambda function
> comparison behavior to change. So for the purposes of this discussion, I
> ignore case (d).

Why not?  If you really care about identity for a lambda function,
then you should be using "is", and if you don't, then equivalent
behavior should be enough.

I would support a change to function.__eq__ (which would fall through
to lambda) such that they were equal if they had the same bytecode,
signature, and execution context (defaults, globals, etc).  I would
also support making functions and methods orderable, for more easily
replicated reprs.  I'm not volunteering to write the patch, at least
today.
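A rough sketch of such a comparison is already possible in user code. This is illustrative only, not a proposal for the actual ``function.__eq__``; the helper name ``funcs_equivalent`` is made up for the example:

```python
def funcs_equivalent(f, g):
    """Heuristic equality for plain functions: same bytecode,
    constants, names, defaults, and the same globals namespace."""
    cf, cg = f.__code__, g.__code__
    return (cf.co_code == cg.co_code
            and cf.co_consts == cg.co_consts
            and cf.co_names == cg.co_names
            and cf.co_varnames == cg.co_varnames
            and f.__defaults__ == g.__defaults__
            and f.__globals__ is g.__globals__)

# Two separately created but identical lambdas compare as equivalent;
# behaviorally different lambdas do not.
print(funcs_equivalent(lambda x: -x, lambda x: -x))    # True
print(funcs_equivalent(lambda x: -x, lambda x: x + 1)) # False
```

This captures "same bytecode, signature, and execution context" only approximately; closures and decorated functions would need extra handling.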

> Is this a reasonable idea? Is it useful enough to be considered? Are there
> any downsides I didn't think of?

Caring that two functions are identical is probably even less common
than sticking a function in a dict, and the "nope, these are not
equal" case would get a bit slower.

-jJ


From tjreedy at udel.edu  Thu Apr  5 21:14:28 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 05 Apr 2012 15:14:28 -0400
Subject: [Python-ideas] comparison of operator.itemgetter objects
In-Reply-To: <CA+OGgf4Ht3ojk76Hf86sqsc33rxVeZEbehqYgO_jO57O0-hxjQ@mail.gmail.com>
References: <CAOVPiMjeBBdwuDyhy7t0zRSNcAS1PdKBXAEQ_UA8-x+BvU_6MA@mail.gmail.com>
	<CA+OGgf4Ht3ojk76Hf86sqsc33rxVeZEbehqYgO_jO57O0-hxjQ@mail.gmail.com>
Message-ID: <jlkqv5$g9h$1@dough.gmane.org>

On 4/5/2012 12:33 PM, Jim Jewett wrote:

> Why not?  If you really care about identity for a lambda function,

A 'lambda function' is simply a function whose .__name__ attribute is 
"<lambda>". There is no difference otherwise. Hence cases '(a) function' 
and '(d) lambda function' (in snipped portion) are the same class and

> I would support a change to function.__eq__ (which would fall through
> to lambda)

'falling through' cannot happen as there is nothing other to fall 
through to.

-- 
Terry Jan Reedy



From paul at colomiets.name  Fri Apr  6 23:04:28 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Sat, 7 Apr 2012 00:04:28 +0300
Subject: [Python-ideas] Draft PEP on protecting finally clauses
Message-ID: <CAA0gF6o=Moi8XoHHTN5OAUgUQ6WS5CJh+90+dc=XbLzDO=ANSA@mail.gmail.com>

Hi,

I've finally made a PEP. Any feedback is appreciated.

-- 
Paul


PEP: XXX
Title: Protecting cleanup statements from interruptions
Version: $Revision$
Last-Modified: $Date$
Author: Paul Colomiets <paul at colomiets.name>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 06-Apr-2012
Python-Version: 3.3


Abstract
========

This PEP proposes a way to protect Python code from being interrupted
inside a ``finally`` clause or a context manager.


Rationale
=========

Python has two nice ways to do cleanup. One is the ``finally`` statement
and the other is a context manager (the ``with`` statement). However,
neither of them is protected from ``KeyboardInterrupt`` or
``generator.throw()``. For example::

    lock.acquire()
    try:
        print('starting')
        do_something()
    finally:
        print('finished')
        lock.release()

If ``KeyboardInterrupt`` occurs just after the ``print`` function is
executed, the lock will not be released. Similarly, the following code
using the ``with`` statement is affected::

    from threading import Lock

    class MyLock:

        def __init__(self):
            self._lock_impl = Lock()

        def __enter__(self):
            self._lock_impl.acquire()
            print("LOCKED")

        def __exit__(self, exc_type, exc_value, traceback):
            print("UNLOCKING")
            self._lock_impl.release()

    lock = MyLock()
    with lock:
        do_something()

If ``KeyboardInterrupt`` occurs near either of the ``print`` calls, the
lock will never be released.


Coroutine Use Case
------------------

A similar situation occurs with coroutines. Coroutine libraries usually
want to interrupt a coroutine with a timeout. There is a
``generator.throw()`` method for this use case, but there is no way to
know whether the generator is currently yielded from inside a
``finally`` clause.

An example that uses yield-based coroutines follows. The code looks
similar with any of the popular coroutine libraries: Monocle [1]_,
Bluelet [2]_, or Twisted [3]_. ::

    def run_locked():
        yield connection.sendall('LOCK')
        try:
            yield do_something()
            yield do_something_else()
        finally:
            yield connection.sendall('UNLOCK')

    with timeout(5):
        yield run_locked()

In the example above, ``yield something`` means: pause executing the
current coroutine and execute the coroutine ``something`` until it
finishes execution. So the library keeps the stack of generators
itself. The ``connection.sendall`` call waits until the socket is
writable and does something similar to what ``socket.sendall`` does.

The ``with`` statement ensures that all that code is executed within a
5 second timeout. It does so by registering a callback in the main
loop, which calls ``generator.throw()`` on the top-most frame in the
coroutine stack when the timeout happens.

The ``greenlets`` extension works in a similar way, except that it
doesn't need ``yield`` to enter a new stack frame. Otherwise the
considerations are similar.
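For illustration, here is a toy trampoline (hypothetical code, not from any of the libraries above; the ``Timeout`` class and step-counting "timeout" are stand-ins for a real event-loop timer). It shows how ``generator.throw()`` interrupts a coroutine, and why the ``finally`` clause is the part at risk:

```python
class Timeout(Exception):
    """Stand-in for the timeout exception a coroutine library raises."""

def run_with_timeout(gen, steps_allowed):
    """Drive `gen`; after `steps_allowed` resumptions, throw Timeout
    into it, which unwinds the generator and runs its finally clauses."""
    steps = 0
    try:
        while True:
            if steps >= steps_allowed:
                gen.throw(Timeout)   # interrupt at the current yield
            next(gen)
            steps += 1
    except (StopIteration, Timeout):
        pass

log = []

def worker():
    try:
        while True:
            log.append('work')
            yield
    finally:
        log.append('cleanup')   # runs even when Timeout is thrown in

run_with_timeout(worker(), 2)
print(log)   # ['work', 'work', 'cleanup']
```

If the ``Timeout`` had arrived while ``worker`` was already yielded inside its ``finally`` clause, the cleanup itself would be interrupted; that is the gap this PEP addresses.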


Specification
=============

Frame Flag 'f_in_cleanup'
-------------------------

A new flag on the frame object is proposed. It is set to ``True`` if
this frame is currently inside a ``finally`` suite.  Internally it must
be implemented as a counter of the nested finally statements currently
being executed.

The internal counter is also incremented when entering the
``SETUP_WITH`` and ``WITH_CLEANUP`` bytecodes, and is decremented when
leaving them. This allows protecting the ``__enter__`` and ``__exit__``
methods too.


Function 'sys.setcleanuphook'
-----------------------------

A new function for the ``sys`` module is proposed. This function sets
a callback which is executed every time ``f_in_cleanup`` becomes
``False``. The callback gets the ``frame`` as its sole argument, so it
can get some evidence of where it is called from.

The setting is thread-local and is stored inside the ``PyThreadState``
structure.


Inspect Module Enhancements
---------------------------

Two new functions are proposed for the ``inspect`` module:
``isframeincleanup`` and ``getcleanupframe``.

``isframeincleanup``, given a ``frame`` or ``generator`` object as its
sole argument, returns the value of the ``f_in_cleanup`` attribute of
the frame itself or of the generator's ``gi_frame`` attribute.

``getcleanupframe``, given a ``frame`` object as its sole argument,
returns the innermost frame that has a true ``f_in_cleanup`` value, or
``None`` if no frame in the stack has the attribute set. It starts
inspecting from the specified frame and walks to outer frames using the
``f_back`` pointers, just like ``getouterframes`` does.
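Assuming frames grow the proposed ``f_in_cleanup`` attribute, a pure-Python sketch of ``getcleanupframe`` is just a walk up the ``f_back`` chain. The attribute does not exist yet, so the demonstration below uses stand-in frame objects:

```python
def getcleanupframe(frame):
    """Return the innermost frame with a true f_in_cleanup, or None."""
    while frame is not None:
        if getattr(frame, 'f_in_cleanup', 0):
            return frame
        frame = frame.f_back
    return None

# Demonstration with stand-in frame objects (real frames would be
# supplied by the interpreter):
class FakeFrame:
    def __init__(self, f_back=None, f_in_cleanup=0):
        self.f_back = f_back
        self.f_in_cleanup = f_in_cleanup

outer = FakeFrame(f_in_cleanup=1)
inner = FakeFrame(f_back=outer)
print(getcleanupframe(inner) is outer)      # True
print(getcleanupframe(FakeFrame()) is None) # True
```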


Example
=======

Example implementation of ``SIGINT`` handler that interrupts safely
might look like::

    import inspect, sys, functools

    def sigint_handler(sig, frame):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt()
        sys.setcleanuphook(functools.partial(sigint_handler, 0))

A coroutine example is out of the scope of this document, because its
implementation depends very much on the trampoline (or main loop) used
by the coroutine library.


Unresolved Issues
=================

Interruption Inside With Statement Expression
---------------------------------------------

Given the statement::

    with open(filename):
        do_something()

Python can be interrupted after ``open`` is called, but before
``SETUP_WITH`` bytecode is executed. There are two possible decisions:

* Protect the expression inside the ``with`` statement. This would
  need another bytecode, since currently there is no delimiter at the
  start of the ``with`` expression.

* Let the user write a wrapper if they consider it important for their
  use case. Safe wrapper code might look like the following::

    class FileWrapper(object):

        def __init__(self, filename, mode):
            self.filename = filename
            self.mode = mode

        def __enter__(self):
            self.file = open(self.filename, self.mode)
            return self.file

        def __exit__(self, exc_type, exc_value, traceback):
            self.file.close()

  Alternatively it can be written using context manager::

    @contextmanager
    def open_wrapper(filename, mode):
        file = open(filename, mode)
        try:
            yield file
        finally:
            file.close()

  This code is safe, as the first part of the generator (before the
  ``yield``) is executed inside the ``SETUP_WITH`` bytecode of the
  caller.


Exception Propagation
---------------------

Sometimes a ``finally`` block or an ``__enter__``/``__exit__`` method
can be exited with an exception. Usually this is not a problem, since a
more important exception like ``KeyboardInterrupt`` or ``SystemExit``
should be raised instead. But it may be nice to be able to keep the
original exception inside a ``__context__`` attribute. So the cleanup
hook signature may grow an exception argument::

    def sigint_handler(sig, frame):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt()
        sys.setcleanuphook(retry_sigint)

    def retry_sigint(frame, exception=None):
        if inspect.getcleanupframe(frame) is None:
            raise KeyboardInterrupt() from exception

.. note::

    There is no need for three arguments like in the ``__exit__``
    method, since we have the ``__traceback__`` attribute on exceptions
    in Python 3.x.

However, this will set ``__cause__`` for the exception, which is not
exactly what's intended. So some hidden interpreter logic may be used
to set the ``__context__`` attribute on every exception raised in the
cleanup hook.


Interruption Between Acquiring Resource and Try Block
-----------------------------------------------------

The example from the first section is not entirely safe. Let's look closer::

    lock.acquire()
    try:
        do_something()
    finally:
        lock.release()

There is no way this can be fixed without modifying the code. The
actual fix depends very much on the use case.

Usually code can be fixed using a ``with`` statement::

    with lock:
        do_something()

However, for coroutines you usually can't use the ``with`` statement
because you need to ``yield`` for both the acquire and release
operations. So the code might be rewritten as follows::

    try:
        yield lock.acquire()
        do_something()
    finally:
        yield lock.release()

The actual lock implementation might need more code to support this use
case, but it is usually trivial: for example, check whether the lock
has been acquired, and unlock only if it has.
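Such a tolerant lock can be sketched as follows (a minimal illustration; the class name and synchronous methods are made up, and a real coroutine lock would yield in ``acquire``):

```python
class TolerantLock:
    """A lock whose release() is a no-op when the lock is not held,
    so it is safe to call from a finally clause that may run even
    though acquire() never completed."""

    def __init__(self):
        self._locked = False

    def acquire(self):
        # A real coroutine lock would yield here until available.
        self._locked = True

    def release(self):
        if self._locked:        # unlock only if actually held
            self._locked = False

lock = TolerantLock()
try:
    lock.acquire()
finally:
    lock.release()   # safe even if acquire() had been interrupted
lock.release()       # an extra release is harmless, not an error
```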


Setting Interruption Context Inside Finally Itself
--------------------------------------------------

Some coroutine libraries may need to set a timeout for the finally
clause itself. For example::

    try:
        do_something()
    finally:
        with timeout(0.5):
            try:
                yield do_slow_cleanup()
            finally:
                yield do_fast_cleanup()

With the current semantics the timeout will either protect the whole
``with`` block or nothing at all, depending on the implementation of
the library. What the author intends is to treat ``do_slow_cleanup``
as ordinary code, and ``do_fast_cleanup`` as cleanup (a
non-interruptible one).

Similar case might occur when using greenlets or tasklets.

This case can be fixed by exposing ``f_in_cleanup`` as a counter, and
by calling the cleanup hook on each decrement.  A coroutine library may
then remember the value when the timeout starts, and compare it on each
hook execution.

But in practice this example is considered too obscure to take into
account.


Alternative Python Implementations Support
==========================================

We consider ``f_in_cleanup`` an implementation detail. The actual
implementation may have some fake frame-like object passed to the
signal handler and the cleanup hook, and returned from
``getcleanupframe``. The only requirement is that the ``inspect``
module functions work as expected on these objects. For this reason we
also allow passing a ``generator`` object to the ``isframeincleanup``
function, which removes the need to use the ``gi_frame`` attribute.

It may need to be specified that ``getcleanupframe`` must return the
same object that will be passed to the cleanup hook at its next
invocation.


Alternative Names
=================

The original proposal had an ``f_in_finally`` flag. The original
intention was to protect ``finally`` clauses. But as it grew to
protecting the ``__enter__`` and ``__exit__`` methods too, the name
``f_in_cleanup`` seems better. Although the ``__enter__`` method is not
a cleanup routine, it at least relates to the cleanup done by context
managers.

``setcleanuphook``, ``isframeincleanup`` and ``getcleanupframe`` could
be unobscured to ``set_cleanup_hook``, ``is_frame_in_cleanup`` and
``get_cleanup_frame``, although they follow the conventions of their
respective modules.


Alternative Proposals
=====================

Propagating 'f_in_cleanup' Flag Automatically
-----------------------------------------------

This could make ``getcleanupframe`` unnecessary. But for yield-based
coroutines you would need to propagate the flag yourself. Making it
writable leads to somewhat unpredictable behavior of
``setcleanuphook``.


Add Bytecodes 'INCR_CLEANUP', 'DECR_CLEANUP'
--------------------------------------------

These bytecodes could be used to protect the expression inside the
``with`` statement, as well as making counter increments more explicit
and easier to debug (visible inside a disassembly). Some middle ground
might be chosen, like having ``END_FINALLY`` and ``SETUP_WITH``
implicitly decrement the counter (``END_FINALLY`` is present at the end
of the ``with`` suite).

However, adding new bytecodes must be considered very carefully.


Expose 'f_in_cleanup' as a Counter
----------------------------------

The original intention was to expose the minimum needed functionality.
However, as we consider the frame flag ``f_in_cleanup`` an
implementation detail, we may expose it as a counter.

Similarly, if we have a counter, we may need to have the cleanup hook
called on every counter decrement. This is unlikely to have much
performance impact, as nested finally clauses are an uncommon case.


Add code object flag 'CO_CLEANUP'
---------------------------------

As an alternative to setting the flag inside the ``SETUP_WITH`` and
``WITH_CLEANUP`` bytecodes, we can introduce a flag ``CO_CLEANUP``.
When the interpreter starts to execute code with ``CO_CLEANUP`` set, it
sets ``f_in_cleanup`` for the whole function body.  This flag is set
for the code objects of the ``__enter__`` and ``__exit__`` special
methods. Technically it might be set on any function called
``__enter__`` or ``__exit__``.

This seems to be a less clear solution. It also covers the case where
``__enter__`` and ``__exit__`` are called manually. This may be
accepted either as a feature or as an unnecessary side effect (but
unlikely as a bug).

It may also pose a problem when ``__enter__`` or ``__exit__`` is
implemented in C, as there is usually no frame to check for the
``f_in_cleanup`` flag.


Have Cleanup Callback on Frame Object Itself
----------------------------------------------

The frame object may be extended to have an ``f_cleanup_callback``
which is called when ``f_in_cleanup`` is reset to 0. It would help to
register different callbacks for different coroutines.

Despite its apparent beauty, this solution doesn't add anything, as
there are only two primary use cases:

* Set the callback in a signal handler. The callback is inherently a
  single one for this case.

* Use a single callback per loop for the coroutine use case. And in
  almost all cases there is only one loop per thread.


No Cleanup Hook
---------------

The original proposal included no cleanup hook specification, as there
are a few ways to achieve the same thing using current tools:

* Use ``sys.settrace`` and the ``f_trace`` callback. This may pose
  some problems for debugging, and has a big performance impact
  (although interruptions don't happen very often).

* Sleep a bit more and try again. For a coroutine library it's easy.
  For signals it may be achieved using ``signal.alarm()``.

Both methods are considered too impractical, so a way to catch the exit
from a ``finally`` statement is proposed.


References
==========

.. [1] Monocle
   https://github.com/saucelabs/monocle

.. [2] Bluelet
   https://github.com/sampsyo/bluelet

.. [3] Twisted: inlineCallbacks
   http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.defer.html

.. [4] Original discussion
   http://mail.python.org/pipermail/python-ideas/2012-April/014705.html


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:


From victor.stinner at gmail.com  Sat Apr  7 12:04:28 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sat, 7 Apr 2012 12:04:28 +0200
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
Message-ID: <CAMpsgwb_u2dhfKUdk8r0f5qepjLQqAjAa3MmxdcOGLtjgJpm5A@mail.gmail.com>

> I'd like to propose a way to protect `finally` clauses from
> interruptions (either by KeyboardInterrupt or by timeout, or any other
> way).

With Python 3.3, you can easily write a context manager disabling
interruptions using signal.pthread_sigmask(). If a signal is sent, the
signal will be waiting in a queue, and the signal handler will be
called when the signals are unblocked. (On some OSes, the signal
handler is not called immediately.)
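Such a context manager might be sketched like this (a minimal sketch, POSIX-only, since signal.pthread_sigmask() is unavailable on Windows; the name ``blocked_signals`` is made up):

```python
import signal
from contextlib import contextmanager

@contextmanager
def blocked_signals(sigs=frozenset({signal.SIGINT})):
    """Block `sigs` for the current thread; any signal delivered in
    the meantime stays pending and is handled once the original mask
    is restored on exit."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, sigs)
    try:
        yield
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)

# Usage: KeyboardInterrupt is deferred until the block is left.
# with blocked_signals():
#     do_cleanup()
```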

pthread_sigmask() only affects the current thread. If you have two
threads, and you block all signals in thread A, the C signal handler
will be called in the thread B. But if I remember correctly, the
Python signal handler is always called in the main thread.

pthread_sigmask() is not available on all platforms (e.g. not on
Windows), and some OSes have a poor support of signals+threads (e.g.
OpenBSD and old versions of FreeBSD).

Calling pthread_sigmask() twice (on entering and exiting the finally
block) has a cost, so I don't think that it should be done by default.
It may also have unexpected behaviours. I prefer to make it explicit.

--

You may hack ceval.c to not call the Python signal handler in a
finally block, but system calls will still be interrupted (EINTR).

Victor


From paul at colomiets.name  Sat Apr  7 13:11:00 2012
From: paul at colomiets.name (Paul Colomiets)
Date: Sat, 7 Apr 2012 14:11:00 +0300
Subject: [Python-ideas] Protecting finally clauses of interruptions
In-Reply-To: <CAMpsgwb_u2dhfKUdk8r0f5qepjLQqAjAa3MmxdcOGLtjgJpm5A@mail.gmail.com>
References: <CAA0gF6o3Vhzg=Zi0UjWUOi-iVAwN70mXanoawooj8iC0FJj8hw@mail.gmail.com>
	<CAMpsgwb_u2dhfKUdk8r0f5qepjLQqAjAa3MmxdcOGLtjgJpm5A@mail.gmail.com>
Message-ID: <CAA0gF6qPfee6q_bQ5vOdm-a1dxDAXVQfyjFHtuLpz5JpBNbo=w@mail.gmail.com>

Hi Victor,

On Sat, Apr 7, 2012 at 1:04 PM, Victor Stinner <victor.stinner at gmail.com> wrote:
>> I'd like to propose a way to protect `finally` clauses from
>> interruptions (either by KeyboardInterrupt or by timeout, or any other
>> way).
>
> With Python 3.3, you can easily write a context manager disabling
> interruptions using signal.pthread_sigmask(). If a signal is send, the
> signal will be waiting in a queue, and the signal handler will be
> called when the signals are unblocked. (On some OSes, the signal
> handler is not called immediatly.)
>

And now you need to patch every library which happens to use a
`finally` statement, to make it work. That doesn't seem realistic.

>
> You may hack ceval.c to not call the Python signal handler in a final
> block, but system calls will still be interrupted (EINTR).
>

This is not a problem for networking IO as it is always prepared for
EINTR, and posix mutexes never return EINTR. So for the primary
use-cases it's ok. But at least I'll add this consideration to the
PEP.

-- 
Paul


From andrew.svetlov at gmail.com  Sat Apr  7 23:08:50 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Sun, 8 Apr 2012 00:08:50 +0300
Subject: [Python-ideas] Draft PEP on protecting finally clauses
In-Reply-To: <CAA0gF6o=Moi8XoHHTN5OAUgUQ6WS5CJh+90+dc=XbLzDO=ANSA@mail.gmail.com>
References: <CAA0gF6o=Moi8XoHHTN5OAUgUQ6WS5CJh+90+dc=XbLzDO=ANSA@mail.gmail.com>
Message-ID: <CAL3CFcX7j+OKBO=LvU5eX60pj4+hfBtCb8w1v9b+cANomduX9g@mail.gmail.com>

I've published this PEP as PEP-419: http://www.python.org/dev/peps/pep-0419/
Thank you, Paul.

On Sat, Apr 7, 2012 at 12:04 AM, Paul Colomiets <paul at colomiets.name> wrote:
> Hi,
>
> I've finally made a PEP. Any feedback is appreciated.
>
> --
> Paul
>
>
> PEP: XXX
> Title: Protecting cleanup statements from interruptions
> Version: $Revision$
> Last-Modified: $Date$
> Author: Paul Colomiets <paul at colomiets.name>
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 06-Apr-2012
> Python-Version: 3.3
>
>
> Abstract
> ========
>
> This PEP proposes a way to protect python code from being interrupted inside
> finally statement or context manager.
>
>
> Rationale
> =========
>
> Python has two nice ways to do cleanup. One is a ``finally`` statement
> and the other is context manager (or ``with`` statement). Although,
> neither of them is protected from ``KeyboardInterrupt`` or
> ``generator.throw()``. For example::
>
> ? ?lock.acquire()
> ? ?try:
> ? ? ? ?print('starting')
> ? ? ? ?do_someting()
> ? ?finally:
> ? ? ? ?print('finished')
> ? ? ? ?lock.release()
>
> If ``KeyboardInterrupt`` occurs just after ``print`` function is
> executed, lock will not be released. Similarly the following code
> using ``with`` statement is affected::
>
> ? ?from threading import Lock
>
> ? ?class MyLock:
>
> ? ? ? ?def __init__(self):
> ? ? ? ? ? ?self._lock_impl = lock
>
> ? ? ? ?def __enter__(self):
> ? ? ? ? ? ?self._lock_impl.acquire()
> ? ? ? ? ? ?print("LOCKED")
>
> ? ? ? ?def __exit__(self):
> ? ? ? ? ? ?print("UNLOCKING")
> ? ? ? ? ? ?self._lock_impl.release()
>
> ? ?lock = MyLock()
> ? ?with lock:
> ? ? ? ?do_something
>
> If ``KeyboardInterrupt`` occurs near any of the ``print`` statements,
> lock will never be released.
>
>
> Coroutine Use Case
> ------------------
>
> Similar case occurs with coroutines. Usually coroutine libraries want
> to interrupt coroutine with a timeout. There is a
> ``generator.throw()`` method for this use case, but there are no
> method to know is it currently yielded from inside a ``finally``.
>
> Example that uses yield-based coroutines follows. Code looks
> similar using any of the popular coroutine libraries Monocle [1]_,
> Bluelet [2]_, or Twisted [3]_. ::
>
> ? ?def run_locked()
> ? ? ? ?yield connection.sendall('LOCK')
> ? ? ? ?try:
> ? ? ? ? ? ?yield do_something()
> ? ? ? ? ? ?yield do_something_else()
> ? ? ? ?finally:
> ? ? ? ? ? ?yield connection.sendall('UNLOCK')
>
> ? ?with timeout(5):
> ? ? ? ?yield run_locked()
>
> In the example above ``yield something`` means pause executing current
> coroutine and execute coroutine ``something`` until it finished
> execution. So that library keeps stack of generators itself. The
> ``connection.sendall`` waits until socket is writable and does thing
> similar to what ``socket.sendall`` does.
>
> The ``with`` statement ensures that all that code is executed within 5
> seconds timeout. It does so by registering a callback in main loop,
> which calls ``generator.throw()`` to the top-most frame in the
> coroutine stack when timeout happens.
>
> The ``greenlets`` extension works in similar way, except it doesn't
> need ``yield`` to enter new stack frame. Otherwise considerations are
> similar.
>
>
> Specification
> =============
>
> Frame Flag 'f_in_cleanup'
> -------------------------
>
> A new flag on frame object is proposed. It is set to ``True`` if this
> frame is currently in the ``finally`` suite. ?Internally it must be
> implemented as a counter of nested finally statements currently
> executed.
>
> The internal counter is also incremented when entering ``WITH_SETUP``
> bytecode and ``WITH_CLEANUP`` bytecode, and is decremented when
> leaving that bytecode. This allows to protect ``__enter__`` and
> ``__exit__`` methods too.
>
>
> Function 'sys.setcleanuphook'
> -----------------------------
>
> A new function for the ``sys`` module is proposed. This function sets
> a callback which is executed every time ``f_in_cleanup`` becomes
> ``False``. Callbacks gets ``frame`` as it's sole argument so it can
> get some evindence where it is called from.
>
> The setting is thread local and is stored inside ``PyThreadState``
> structure.
>
>
> Inspect Module Enhancements
> ---------------------------
>
> Two new functions are proposed for ``inspect`` module:
> ``isframeincleanup`` and ``getcleanupframe``.
>
> ``isframeincleanup``, given a ``frame`` object or a ``generator``
> object as its sole argument, returns the value of the
> ``f_in_cleanup`` attribute of the frame itself or of the generator's
> ``gi_frame`` attribute.
>
> ``getcleanupframe``, given a ``frame`` object as its sole argument,
> returns the innermost frame with a true ``f_in_cleanup`` value, or
> ``None`` if no frame in the stack has the attribute set. It starts
> inspecting from the specified frame and walks to outer frames using
> ``f_back`` pointers, just like ``getouterframes`` does.
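To picture the walk ``getcleanupframe`` performs, here is a pure-Python sketch. Since real frames have no ``f_in_cleanup`` today, the ``FakeFrame`` objects below are invented stand-ins; only the ``f_back`` traversal mirrors the proposal:

```python
# Sketch of the proposed inspect.getcleanupframe semantics over
# fake frame objects (f_in_cleanup does not exist on real frames yet).

class FakeFrame:
    def __init__(self, f_back=None, f_in_cleanup=False):
        self.f_back = f_back              # next outer frame, like a real frame
        self.f_in_cleanup = f_in_cleanup  # the proposed flag

def getcleanupframe(frame):
    """Return the innermost frame with a true f_in_cleanup, else None."""
    while frame is not None:
        if frame.f_in_cleanup:
            return frame
        frame = frame.f_back
    return None

outer = FakeFrame(f_in_cleanup=True)   # e.g. a frame inside a finally
inner = FakeFrame(f_back=outer)        # a call made from that finally
assert getcleanupframe(inner) is outer
assert getcleanupframe(FakeFrame()) is None
```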
>
>
> Example
> =======
>
> An example implementation of a ``SIGINT`` handler that interrupts
> safely might look like::
>
>     import inspect, sys, functools
>
>     def sigint_handler(sig, frame):
>         if inspect.getcleanupframe(frame) is None:
>             raise KeyboardInterrupt()
>         sys.setcleanuphook(functools.partial(sigint_handler, 0))
>
> A coroutine example is out of the scope of this document, because its
> implementation depends very much on the trampoline (or main loop)
> used by the coroutine library.
>
>
> Unresolved Issues
> =================
>
> Interruption Inside With Statement Expression
> ---------------------------------------------
>
> Given the statement::
>
>     with open(filename):
>         do_something()
>
> Python can be interrupted after ``open`` is called, but before the
> ``SETUP_WITH`` bytecode is executed. There are two possible decisions:
>
> * Protect the expression inside the ``with`` statement. This would
>   need another bytecode, since currently there is no delimiter at the
>   start of the ``with`` expression.
>
> * Let users write a wrapper if they consider it important for their
>   use case. Safe wrapper code might look like the following::
>
>     class FileWrapper(object):
>
>         def __init__(self, filename, mode):
>             self.filename = filename
>             self.mode = mode
>
>         def __enter__(self):
>             self.file = open(self.filename, self.mode)
>             return self.file
>
>         def __exit__(self, exc_type, exc_value, traceback):
>             self.file.close()
>
>   Alternatively it can be written using a context manager::
>
>     @contextmanager
>     def open_wrapper(filename, mode):
>         file = open(filename, mode)
>         try:
>             yield file
>         finally:
>             file.close()
>
>   This code is safe, as the first part of the generator (before the
>   ``yield``) is executed inside the ``SETUP_WITH`` bytecode of the
>   caller.
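The ``open_wrapper`` pattern runs unchanged on current Python, and the ``finally`` does close the file even when the body raises; a quick self-contained check (the temp path is invented for the demo):

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def open_wrapper(filename, mode):
    file = open(filename, mode)
    try:
        yield file
    finally:
        file.close()

path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
try:
    with open_wrapper(path, 'w') as f:
        f.write('hello')
        raise RuntimeError('interrupted mid-block')
except RuntimeError:
    pass

assert f.closed                       # cleanup ran despite the exception
with open(path) as g:
    assert g.read() == 'hello'        # the write was flushed by close()
```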
>
>
> Exception Propagation
> ---------------------
>
> Sometimes a ``finally`` block or an ``__enter__``/``__exit__`` method
> can be exited with an exception. Usually that's not a problem, since
> a more important exception like ``KeyboardInterrupt`` or
> ``SystemExit`` should be raised instead. But it may be nice to be
> able to keep the original exception inside a ``__context__``
> attribute. So the cleanup hook signature may grow an exception
> argument::
>
>     def sigint_handler(sig, frame):
>         if inspect.getcleanupframe(frame) is None:
>             raise KeyboardInterrupt()
>         sys.setcleanuphook(retry_sigint)
>
>     def retry_sigint(frame, exception=None):
>         if inspect.getcleanupframe(frame) is None:
>             raise KeyboardInterrupt() from exception
>
> .. note::
>
>     There is no need for three arguments as in the ``__exit__``
>     method, since exceptions have a ``__traceback__`` attribute in
>     Python 3.x.
>
> However, this will set ``__cause__`` for the exception, which is not
> exactly what's intended. So some hidden interpreter logic may be used
> to put the ``__context__`` attribute on every exception raised in the
> cleanup hook.
>
>
> Interruption Between Acquiring Resource and Try Block
> -----------------------------------------------------
>
> The example from the first section is not totally safe. Let's look
> closer::
>
>     lock.acquire()
>     try:
>         do_something()
>     finally:
>         lock.release()
>
> There is no way this can be fixed without modifying the code. The
> actual fix depends very much on the use case.
>
> Usually the code can be fixed using a ``with`` statement::
>
>     with lock:
>         do_something()
>
> However, for coroutines you usually can't use the ``with`` statement,
> because you need to ``yield`` for both the acquire and release
> operations. So the code might be rewritten as follows::
>
>     try:
>         yield lock.acquire()
>         do_something()
>     finally:
>         yield lock.release()
>
> The actual lock code might need more work to support this use case,
> but the implementation is usually trivial, such as checking whether
> the lock has been acquired and unlocking only if it has.
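The "unlock only if acquired" idea might look like the following sketch; the ``SafeLock`` name and its ``_held`` flag are invented for illustration (and the bookkeeping assumes single-threaded coroutine use, not concurrent callers):

```python
import threading

class SafeLock:
    """Illustrative wrapper: release() is a harmless no-op when the
    lock is not held, so a finally clause can call it unconditionally."""

    def __init__(self):
        self._lock = threading.Lock()
        self._held = False

    def acquire(self):
        self._lock.acquire()
        self._held = True

    def release(self):
        # Unlock only if we actually hold the lock; a bare
        # threading.Lock would raise RuntimeError here instead.
        if self._held:
            self._held = False
            self._lock.release()

lock = SafeLock()
lock.release()    # no-op: interrupted before acquire() ever ran
lock.acquire()
lock.release()
lock.release()    # a double release is also harmless
assert not lock._held
```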
>
>
> Setting Interruption Context Inside Finally Itself
> --------------------------------------------------
>
> Some coroutine libraries may need to set a timeout for the
> ``finally`` clause itself. For example::
>
>     try:
>         do_something()
>     finally:
>         with timeout(0.5):
>             try:
>                 yield do_slow_cleanup()
>             finally:
>                 yield do_fast_cleanup()
>
> With the current semantics, the timeout will either protect the whole
> ``with`` block or nothing at all, depending on the implementation of
> the library. What the author intended is to treat ``do_slow_cleanup``
> as ordinary code, and ``do_fast_cleanup`` as cleanup (a
> non-interruptible one).
>
> Similar case might occur when using greenlets or tasklets.
>
> This case can be fixed by exposing ``f_in_cleanup`` as a counter, and
> by calling the cleanup hook on each decrement. A coroutine library
> may then remember the value at timeout start, and compare it on each
> hook execution.
>
> But in practice this example is considered too obscure to take into
> account.
>
>
> Alternative Python Implementations Support
> ==========================================
>
> We consider ``f_in_cleanup`` an implementation detail. The actual
> implementation may have some fake frame-like object passed to the
> signal handler and the cleanup hook, and returned from
> ``getcleanupframe``. The only requirement is that the ``inspect``
> module functions work as expected on those objects. For this reason
> we also allow passing a ``generator`` object to the
> ``isframeincleanup`` function; this removes the need to use the
> ``gi_frame`` attribute.
>
> It may need to be specified that ``getcleanupframe`` must return the
> same object that will be passed to the cleanup hook at the next
> invocation.
>
>
> Alternative Names
> =================
>
> The original proposal had an ``f_in_finally`` flag. The original
> intention was to protect ``finally`` clauses. But as it grew to
> protecting the ``__enter__`` and ``__exit__`` methods too, the name
> ``f_in_cleanup`` seems better. Although the ``__enter__`` method is
> not a cleanup routine, it at least relates to the cleanup done by
> context managers.
>
> ``setcleanuphook``, ``isframeincleanup`` and ``getcleanupframe``
> could be unobscured to ``set_cleanup_hook``, ``is_frame_in_cleanup``
> and ``get_cleanup_frame``, although the current names follow the
> conventions of their respective modules.
>
>
> Alternative Proposals
> =====================
>
> Propagating 'f_in_cleanup' Flag Automatically
> -----------------------------------------------
>
> This could make ``getcleanupframe`` unnecessary. But for yield-based
> coroutines you would need to propagate the flag yourself. Making it
> writable leads to somewhat unpredictable behavior of
> ``setcleanuphook``.
>
>
> Add Bytecodes 'INCR_CLEANUP', 'DECR_CLEANUP'
> --------------------------------------------
>
> These bytecodes could be used to protect the expression inside the
> ``with`` statement, as well as making counter increments more
> explicit and easier to debug (visible in a disassembly). Some middle
> ground might be chosen, such as having ``END_FINALLY`` and
> ``SETUP_WITH`` implicitly decrement the counter (``END_FINALLY`` is
> present at the end of the ``with`` suite).
>
> However, adding new bytecodes must be considered very carefully.
>
>
> Expose 'f_in_cleanup' as a Counter
> ----------------------------------
>
> The original intention was to expose the minimum needed
> functionality. However, as we consider the frame flag
> ``f_in_cleanup`` an implementation detail, we may expose it as a
> counter.
>
> Similarly, if we have a counter, we may need to have the cleanup hook
> called on every counter decrement. It's unlikely to have much
> performance impact, as nested ``finally`` clauses are an uncommon
> case.
>
>
> Add code object flag 'CO_CLEANUP'
> ---------------------------------
>
> As an alternative to setting the flag inside the ``SETUP_WITH`` and
> ``WITH_CLEANUP`` bytecodes, we could introduce a flag ``CO_CLEANUP``.
> When the interpreter starts to execute code with ``CO_CLEANUP`` set,
> it sets ``f_in_cleanup`` for the whole function body. This flag would
> be set for the code objects of the ``__enter__`` and ``__exit__``
> special methods. Technically it might be set on any function named
> ``__enter__`` or ``__exit__``.
>
> This seems to be a less clear solution. It also covers the case where
> ``__enter__`` and ``__exit__`` are called manually. This may be
> accepted either as a feature or as an unnecessary side effect
> (unlikely as a bug).
>
> It may also pose a problem when the ``__enter__`` or ``__exit__``
> functions are implemented in C, as there is usually no frame to check
> for the ``f_in_cleanup`` flag.
>
>
> Have Cleanup Callback on Frame Object Itself
> ----------------------------------------------
>
> The frame may be extended to have an ``f_cleanup_callback`` which is
> called when ``f_in_cleanup`` is reset to 0. It would help to register
> different callbacks for different coroutines.
>
> Despite its apparent beauty, this solution doesn't add anything, as
> there are two primary use cases:
>
> * Setting the callback in a signal handler. The callback is
>   inherently a single one for this case.
>
> * Using a single callback per loop for the coroutine use case. And in
>   almost all cases there is only one loop per thread.
>
>
> No Cleanup Hook
> ---------------
>
> The original proposal included no cleanup hook specification, as
> there are a few ways to achieve the same result using current tools:
>
> * Use ``sys.settrace`` and the ``f_trace`` callback. This may pose
>   problems for debugging, and has a big performance impact (although
>   interruption doesn't happen very often).
>
> * Sleep a bit more and try again. For a coroutine library this is
>   easy. For signals it may be achieved using ``alarm``.
>
> Both methods are considered too impractical, so a way to catch the
> exit from a ``finally`` statement is proposed.
>
>
> References
> ==========
>
> .. [1] Monocle
> ? https://github.com/saucelabs/monocle
>
> .. [2] Bluelet
> ? https://github.com/sampsyo/bluelet
>
> .. [3] Twisted: inlineCallbacks
> ? http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.defer.html
>
> .. [4] Original discussion
> ? http://mail.python.org/pipermail/python-ideas/2012-April/014705.html
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
>
>
> ..
> ? Local Variables:
> ? mode: indented-text
> ? indent-tabs-mode: nil
> ? sentence-end-double-space: t
> ? fill-column: 70
> ? coding: utf-8
> ? End:
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas



-- 
Thanks,
Andrew Svetlov


From andrew.svetlov at gmail.com  Sat Apr  7 23:09:37 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Sun, 8 Apr 2012 00:09:37 +0300
Subject: [Python-ideas] Draft PEP on protecting finally clauses
In-Reply-To: <CAL3CFcX7j+OKBO=LvU5eX60pj4+hfBtCb8w1v9b+cANomduX9g@mail.gmail.com>
References: <CAA0gF6o=Moi8XoHHTN5OAUgUQ6WS5CJh+90+dc=XbLzDO=ANSA@mail.gmail.com>
	<CAL3CFcX7j+OKBO=LvU5eX60pj4+hfBtCb8w1v9b+cANomduX9g@mail.gmail.com>
Message-ID: <CAL3CFcWNebv3rjhus3=W0b5y36gevw0gLLZ6Xy2wwLe++ST1iA@mail.gmail.com>

What about a reference implementation?

On Sun, Apr 8, 2012 at 12:08 AM, Andrew Svetlov
<andrew.svetlov at gmail.com> wrote:
> I've published this PEP as PEP-419: http://www.python.org/dev/peps/pep-0419/
> Thank you, Paul.
>
> On Sat, Apr 7, 2012 at 12:04 AM, Paul Colomiets <paul at colomiets.name> wrote:
>> Hi,
>>
>> I've finally made a PEP. Any feedback is appreciated.
>>
>> --
>> Paul
>>
>>
>> PEP: XXX
>> Title: Protecting cleanup statements from interruptions
>> Version: $Revision$
>> Last-Modified: $Date$
>> Author: Paul Colomiets <paul at colomiets.name>
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/x-rst
>> Created: 06-Apr-2012
>> Python-Version: 3.3
>>
>>
>> Abstract
>> ========
>>
>> This PEP proposes a way to protect Python code from being
>> interrupted inside a ``finally`` clause or a context manager.
>>
>>
>> Rationale
>> =========
>>
>> Python has two nice ways to do cleanup. One is the ``finally``
>> statement and the other is the context manager (or ``with``
>> statement). However, neither of them is protected from
>> ``KeyboardInterrupt`` or ``generator.throw()``. For example::
>>
>>     lock.acquire()
>>     try:
>>         print('starting')
>>         do_something()
>>     finally:
>>         print('finished')
>>         lock.release()
>>
>> If ``KeyboardInterrupt`` occurs just after the ``print`` function is
>> executed, the lock will not be released. Similarly, the following
>> code using the ``with`` statement is affected::
>>
>>     from threading import Lock
>>
>>     class MyLock:
>>
>>         def __init__(self):
>>             self._lock_impl = Lock()
>>
>>         def __enter__(self):
>>             self._lock_impl.acquire()
>>             print("LOCKED")
>>
>>         def __exit__(self, exc_type, exc_value, traceback):
>>             print("UNLOCKING")
>>             self._lock_impl.release()
>>
>>     lock = MyLock()
>>     with lock:
>>         do_something()
>>
>> If ``KeyboardInterrupt`` occurs near either of the ``print`` calls,
>> the lock will never be released.
>>
>>
>> Coroutine Use Case
>> ------------------
>>
>> A similar case occurs with coroutines. Usually coroutine libraries
>> want to interrupt a coroutine with a timeout. There is the
>> ``generator.throw()`` method for this use case, but there is no way
>> to know whether the coroutine is currently suspended inside a
>> ``finally``.
>>
>> An example using yield-based coroutines follows. The code looks
>> similar with any of the popular coroutine libraries Monocle [1]_,
>> Bluelet [2]_, or Twisted [3]_. ::
>>
>> [The rest of the PEP draft is snipped here; it is identical to the
>> text quoted in full earlier in this thread.]
>
>
>
> --
> Thanks,
> Andrew Svetlov



-- 
Thanks,
Andrew Svetlov


From g.brandl at gmx.net  Sun Apr  8 00:45:46 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 08 Apr 2012 00:45:46 +0200
Subject: [Python-ideas] Draft PEP on protecting finally clauses
In-Reply-To: <CAL3CFcX7j+OKBO=LvU5eX60pj4+hfBtCb8w1v9b+cANomduX9g@mail.gmail.com>
References: <CAA0gF6o=Moi8XoHHTN5OAUgUQ6WS5CJh+90+dc=XbLzDO=ANSA@mail.gmail.com>
	<CAL3CFcX7j+OKBO=LvU5eX60pj4+hfBtCb8w1v9b+cANomduX9g@mail.gmail.com>
Message-ID: <jlqg2j$v0r$1@dough.gmane.org>

On 04/07/2012 11:08 PM, Andrew Svetlov wrote:
> I've published this PEP as PEP-419: http://www.python.org/dev/peps/pep-0419/
> Thank you, Paul.
> 
> On Sat, Apr 7, 2012 at 12:04 AM, Paul Colomiets <paul at colomiets.name> wrote:
>> Hi,
>>
>> I've finally made a PEP. Any feedback is appreciated.

NB: After a PEP is checked in, it should be posted to python-dev for general
discussion.

Georg



From luoyonggang at gmail.com  Sun Apr  8 09:39:56 2012
From: luoyonggang at gmail.com (=?UTF-8?B?572X5YuH5YiaKFlvbmdnYW5nIEx1bykg?=)
Date: Sun, 8 Apr 2012 15:39:56 +0800
Subject: [Python-ideas] I found to detect if an object is GCed is very hard
	within python.
Message-ID: <CAE2XoE-yBKw_CvFZsB_juEJTabsXpSzUWSfqXo9CXmY+Y0VFnw@mail.gmail.com>

For example, I wrote a C extension with two objects, Parent and Child,
which reference each other as members, so they are circularly
referenced. How can I use Python's unittest to detect that?

2012/4/7, Alec Taylor <alec.taylor6 at gmail.com>:
> Has been withdrawn... and implemented
>
> http://www.python.org/dev/peps/pep-0274/
> --
> http://mail.python.org/mailman/listinfo/python-list
>

-- 
Yours sincerely,
Yonggang Luo


From phd at phdru.name  Sun Apr  8 12:57:29 2012
From: phd at phdru.name (Oleg Broytman)
Date: Sun, 8 Apr 2012 14:57:29 +0400
Subject: [Python-ideas] I found to detect if an object is GCed is very
 hard within python.
In-Reply-To: <CAE2XoE-yBKw_CvFZsB_juEJTabsXpSzUWSfqXo9CXmY+Y0VFnw@mail.gmail.com>
References: <CAE2XoE-yBKw_CvFZsB_juEJTabsXpSzUWSfqXo9CXmY+Y0VFnw@mail.gmail.com>
Message-ID: <20120408105729.GA23012@iskra.aviel.ru>

Hello.

   We are sorry but we cannot help you. This mailing list is to discuss
new Python ideas; if you're having problems learning, understanding or
using Python, please find another forum. Probably
python-list/comp.lang.python mailing list/news group is the best place;
there are Python developers who participate in it; you may get a faster,
and probably more complete, answer there. See
http://www.python.org/community/ for other lists/news groups/fora. Thank
you for understanding.

On Sun, Apr 08, 2012 at 03:39:56PM +0800, Yonggang Luo <luoyonggang at gmail.com> wrote:
> For example, I write an C extension with two object Parent and Child,
> they referenced each other as member, so circular referenced, how to
> use Python unittest to detect that?

   I think, a weak reference with a callback can help.

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.


From ubershmekel at gmail.com  Mon Apr  9 16:20:22 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Mon, 9 Apr 2012 17:20:22 +0300
Subject: [Python-ideas] pdb.set_trace may not seem long
Message-ID: <CANSw7KwojP6KWQ8ND29KV1MgW0anNrpWZCEYFmKzNArRUAke-w@mail.gmail.com>

Proposal:

pdb.st = pdb.set_trace
-----------

I find myself typing this a lot:

    import pdb;pdb.set_trace()

It's the int3 of python. When I want to debug an exact point in the code I
use the above line.

I hope I don't come off as spoiled, it's just that import pdb;pdb.pm() is
so short that I can't help but wonder how much better my life would be if
I could do:

    import pdb;pdb.st()


What do you guys think? I know aliasing isn't cool since TSBOAPOOOWTDI but
practicality beats purity....


Yuval

From guido at python.org  Mon Apr  9 17:01:11 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 Apr 2012 08:01:11 -0700
Subject: [Python-ideas] pdb.set_trace may not seem long
In-Reply-To: <CANSw7KwojP6KWQ8ND29KV1MgW0anNrpWZCEYFmKzNArRUAke-w@mail.gmail.com>
References: <CANSw7KwojP6KWQ8ND29KV1MgW0anNrpWZCEYFmKzNArRUAke-w@mail.gmail.com>
Message-ID: <CAP7+vJLi+_p2k1AsYsLy9mhNwjH_1K8hnqBVbKiJErAe3gNveA@mail.gmail.com>

I think this is unnecessary; if you find yourself typing that so much,
create a module with a one-letter name and put a bunch of one-letter
convenience functions in it. Or figure out how to create macros for
your editor.
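A minimal sketch of such a convenience module (the module name `d` and the alias names are invented for illustration, not any standard API):

```python
# d.py -- a tiny personal helper module kept somewhere on PYTHONPATH.
# The one-letter names below are illustrative only.
import pdb

st = pdb.set_trace   # "import d; d.st()" drops into the debugger
pm = pdb.pm          # "import d; d.pm()" starts a post-mortem session
```

With this on the path, `import d;d.st()` is as short as the proposed `pdb.st()` without touching the stdlib.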

On Mon, Apr 9, 2012 at 7:20 AM, Yuval Greenfield <ubershmekel at gmail.com> wrote:
> Proposal:
>
> pdb.st = pdb.set_trace
> -----------
>
> I find myself typing this a lot:
>
>     import pdb;pdb.set_trace()
>
> It's the int3 of python. When I want to debug an exact point in the code I
> use the above line.
>
> I hope I don't come off as spoiled, it's just that import pdb;pdb.pm() is so
> short that I can't help but wonder how better my life would be if I could
> do:
>
>     import pdb;pdb.st()
>
>
> What do you guys think? I know aliasing isn't cool since TSBOAPOOOWTDI but
> practicality beats purity....
>
>
> Yuval
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 
--Guido van Rossum (python.org/~guido)


From sven at marnach.net  Mon Apr  9 16:43:33 2012
From: sven at marnach.net (Sven Marnach)
Date: Mon, 9 Apr 2012 15:43:33 +0100
Subject: [Python-ideas] pdb.set_trace may not seem long
In-Reply-To: <CANSw7KwojP6KWQ8ND29KV1MgW0anNrpWZCEYFmKzNArRUAke-w@mail.gmail.com>
References: <CANSw7KwojP6KWQ8ND29KV1MgW0anNrpWZCEYFmKzNArRUAke-w@mail.gmail.com>
Message-ID: <20120409144333.GA24979@bagheera>

Yuval Greenfield schrieb am Mon, 09. Apr 2012, um 17:20:22 +0300:
> I find myself typing this a lot:
> 
>     import pdb;pdb.set_trace()

What I do instead: I start pdb in Emacs, set a break point on the line
and run the script.  This should work in any other interactive
debugger for Python, too.

If for some reason you prefer to insert the above-mentioned line into
your source code, how about defining an editor macro for this purpose,
so you could do it with a single shortcut?

Cheers,
    Sven


From ned at nedbatchelder.com  Mon Apr  9 18:26:17 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Mon, 09 Apr 2012 12:26:17 -0400
Subject: [Python-ideas] pdb.set_trace may not seem long
In-Reply-To: <CANSw7KwojP6KWQ8ND29KV1MgW0anNrpWZCEYFmKzNArRUAke-w@mail.gmail.com>
References: <CANSw7KwojP6KWQ8ND29KV1MgW0anNrpWZCEYFmKzNArRUAke-w@mail.gmail.com>
Message-ID: <4F830DA9.30703@nedbatchelder.com>

On 4/9/2012 10:20 AM, Yuval Greenfield wrote:
> Proposal:
>
> pdb.st = pdb.set_trace
> -----------
>
> I find myself typing this a lot:
>
>     import pdb;pdb.set_trace()
>
> It's the int3 of python. When I want to debug an exact point in the 
> code I use the above line.
>
> I hope I don't come off as spoiled, it's just that import pdb;pdb.pm()
> is so short that I can't help but wonder how better
> my life would be if I could do:
>
>     import pdb;pdb.st()
>
>
> What do you guys think? I know aliasing isn't cool since TSBOAPOOOWTDI 
> but practicality beats purity....
>
I have exactly one abbreviation defined in my .vimrc:

     abbrev pdbxx      import pdb;pdb.set_trace()

--Ned.

>
> Yuval
>
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120409/1a100c29/attachment.html>

From techtonik at gmail.com  Mon Apr  9 19:45:28 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 9 Apr 2012 20:45:28 +0300
Subject: [Python-ideas] pdb.set_trace may not seem long
In-Reply-To: <CANSw7KwojP6KWQ8ND29KV1MgW0anNrpWZCEYFmKzNArRUAke-w@mail.gmail.com>
References: <CANSw7KwojP6KWQ8ND29KV1MgW0anNrpWZCEYFmKzNArRUAke-w@mail.gmail.com>
Message-ID: <CAPkN8xKQpo6_eT0VO2nWnfmVbNzsJjpT4cUoUtOMTpQ07q322g@mail.gmail.com>

On Mon, Apr 9, 2012 at 5:20 PM, Yuval Greenfield <ubershmekel at gmail.com> wrote:
> Proposal:
>
> pdb.st = pdb.set_trace
> -----------
>
> I find myself typing this a lot:
>
>     import pdb;pdb.set_trace()

How about?

    import pdb.trace

--
anatoly t.


From guido at python.org  Mon Apr  9 20:13:10 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 Apr 2012 11:13:10 -0700
Subject: [Python-ideas] pdb.set_trace may not seem long
In-Reply-To: <CAPkN8xKQpo6_eT0VO2nWnfmVbNzsJjpT4cUoUtOMTpQ07q322g@mail.gmail.com>
References: <CANSw7KwojP6KWQ8ND29KV1MgW0anNrpWZCEYFmKzNArRUAke-w@mail.gmail.com>
	<CAPkN8xKQpo6_eT0VO2nWnfmVbNzsJjpT4cUoUtOMTpQ07q322g@mail.gmail.com>
Message-ID: <CAP7+vJK6eLFhocYju94xfQcFza0GiENnzKBmj5SbHaNhD2NX_Q@mail.gmail.com>

On Mon, Apr 9, 2012 at 10:45 AM, anatoly techtonik <techtonik at gmail.com> wrote:
> On Mon, Apr 9, 2012 at 5:20 PM, Yuval Greenfield <ubershmekel at gmail.com> wrote:
>> Proposal:
>>
>> pdb.st = pdb.set_trace
>> -----------
>>
>> I find myself typing this a lot:
>>
>>     import pdb;pdb.set_trace()
>
> How about?
>
>    import pdb.trace

Yuck. An import intended to have a side effect. This also won't work
if pdb.trace was imported before.
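The caching point can be demonstrated in a few lines (a sketch; the module name `sideeffect` and the counter are invented for illustration):

```python
import sys, tempfile, textwrap
from pathlib import Path

# Why "import pdb.trace" as a trigger fails on re-import: a module body
# runs only on the first import; later imports hit the sys.modules cache.
tmp = tempfile.mkdtemp()
Path(tmp, "sideeffect.py").write_text(textwrap.dedent("""
    import builtins
    builtins.RUNS = getattr(builtins, 'RUNS', 0) + 1
"""))
sys.path.insert(0, tmp)

import sideeffect
import sideeffect            # cached: the body does not run again
import builtins
print(builtins.RUNS)         # -> 1
```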

-- 
--Guido van Rossum (python.org/~guido)


From techtonik at gmail.com  Mon Apr  9 20:59:43 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 9 Apr 2012 21:59:43 +0300
Subject: [Python-ideas] pdb.set_trace may not seem long
In-Reply-To: <CAP7+vJK6eLFhocYju94xfQcFza0GiENnzKBmj5SbHaNhD2NX_Q@mail.gmail.com>
References: <CANSw7KwojP6KWQ8ND29KV1MgW0anNrpWZCEYFmKzNArRUAke-w@mail.gmail.com>
	<CAPkN8xKQpo6_eT0VO2nWnfmVbNzsJjpT4cUoUtOMTpQ07q322g@mail.gmail.com>
	<CAP7+vJK6eLFhocYju94xfQcFza0GiENnzKBmj5SbHaNhD2NX_Q@mail.gmail.com>
Message-ID: <CAPkN8x+43LVQVwD6jk-4kBDEd4CU8Nj94rYe3qiJnOr74ppxvA@mail.gmail.com>

On Mon, Apr 9, 2012 at 9:13 PM, Guido van Rossum <guido at python.org> wrote:
> On Mon, Apr 9, 2012 at 10:45 AM, anatoly techtonik <techtonik at gmail.com> wrote:
>> On Mon, Apr 9, 2012 at 5:20 PM, Yuval Greenfield <ubershmekel at gmail.com> wrote:
>>> Proposal:
>>>
>>> pdb.st = pdb.set_trace
>>> -----------
>>>
>>> I find myself typing this a lot:
>>>
>>>     import pdb;pdb.set_trace()
>>
>> How about?
>>
>>    import pdb.trace
>
> Yuck. An import intended to have a side effect.

On the other hand, it is an import intended to debug side effects.
Syntax sugar that makes debugging in Python more intuitive. I always
land on stackoverflow when I need to recall the structure of this pdb
import call. In an ideal world it would even be something like:

    import debug.start

but of course, a builtin

    debug()

which calls the registered debugger for an application, or pdb (by default),
could be even shorter and easier for newbies at the cost of added
magic.

> This also won't work
> if pdb.trace was imported before.

Can pdb.trace remove itself from sys.modules while being imported?
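A sketch of the idea behind that question (module name `retrigger` and the counter are invented; note that popping the cache entry from *inside* the module body while its import is still in progress is fragile, since the import machinery expects the entry to be present, so this sketch evicts it afterwards):

```python
import sys, tempfile, textwrap
from pathlib import Path

# Once the sys.modules entry is evicted, the next import statement
# re-executes the module body, so the side effect fires again.
tmp = tempfile.mkdtemp()
Path(tmp, "retrigger.py").write_text(textwrap.dedent("""
    import builtins
    builtins.TRIGGERS = getattr(builtins, 'TRIGGERS', 0) + 1
"""))
sys.path.insert(0, tmp)

import retrigger
sys.modules.pop('retrigger')   # evict the cached module
import retrigger               # body runs again
import builtins
print(builtins.TRIGGERS)       # -> 2
```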


From victor.varvariuc at gmail.com  Tue Apr 10 11:30:34 2012
From: victor.varvariuc at gmail.com (Victor Varvariuc)
Date: Tue, 10 Apr 2012 12:30:34 +0300
Subject: [Python-ideas] Improve import mechanism for circular imports
Message-ID: <CA+Lge13ttagjgYjdCtrk18e0B-my4zMQx0zX038Q98VDF7W+CA@mail.gmail.com>

Consider the following directory structure (Python 3):

[test]
    main.py
    [tree]
        __init__.py # empty
        branch.py
        root.py

main.py:

print('main: Entered the module')
print('main: from tree import root')
from tree import root  # this first finished importing module 'tree.root'
# then creates local name 'root' which references that imported module
# Why not creating local name 'root' at the same time when sys.modules['tree.root'] is created?
# If import fails:
# - either delete the attribute 'root'
# - or leave it - how it affects the runtime?
from tree import branch
print('main: Creating a root and a leaf attached to it')
_root = root.Root()
_branch = branch.Branch(_root)


branch.py:

print('tree.branch: Entered the module')
import tree, sys
print("tree.branch: 'root' in dir(tree) ->", 'root' in dir(tree))
print("tree.branch: 'tree.root' in sys.modules ->", 'tree.root' in sys.modules)
print('tree.branch: from tree import root')
#from tree import root  # ImportError: cannot import name root
import tree.root  # workaround
root = sys.modules['tree.root']  # though name 'root' does not exist yet
# in module 'tree', sys.modules['tree.root'] already does
print('tree.branch: Defining class branch.Branch()')
class Branch():
    def __init__(self, _parent):
        assert isinstance(_parent, root.Root), 'Pass a `Root` instance'


root.py:

print('tree.root: Entered the module')
print('tree.root: from tree import branch')
from tree import branch
print('tree.root: Defining class Root()')
class Root():
    def attach(self, _branch):
        assert isinstance(_branch, branch.Branch), 'Pass a `Branch` instance'
        self.branch = _branch


Running it:

vic at wic:~/projects/test$ python3 main.py
main: Entered the module
main: from tree import root
tree.root: Entered the module
tree.root: from tree import branch
tree.branch: Entered the module
tree.branch: 'root' in dir(tree) -> False
tree.branch: 'tree.root' in sys.modules -> True
tree.branch: from tree import root
tree.branch: Defining class branch.Branch()
tree.root: Defining class Root()
main: Creating a root and a leaf attached to it


So,
There are circular imports in this example code.
Currently, `root = sys.modules['tree.root']` hack in branch.py works.
Wouldn't it be useful to create attribute `root` in `main` at the same time
`sys.modules['tree.root']` is created when doing `from tree import root` in
main?
This would solve more complex cases when circular imports are
involved, without applying such hacks.

Thank you
--
*Victor *
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120410/a3fa28dc/attachment.html>

From phd at phdru.name  Tue Apr 10 11:55:51 2012
From: phd at phdru.name (Oleg Broytman)
Date: Tue, 10 Apr 2012 13:55:51 +0400
Subject: [Python-ideas] Avoid (was: Improve) circular import
In-Reply-To: <CA+Lge13ttagjgYjdCtrk18e0B-my4zMQx0zX038Q98VDF7W+CA@mail.gmail.com>
References: <CA+Lge13ttagjgYjdCtrk18e0B-my4zMQx0zX038Q98VDF7W+CA@mail.gmail.com>
Message-ID: <20120410095551.GC22368@iskra.aviel.ru>

Hi!

On Tue, Apr 10, 2012 at 12:30:34PM +0300, Victor Varvariuc <victor.varvariuc at gmail.com> wrote:
> Consider the following directory structure (Python 3):
> 
> [test]
>     main.py
>     [tree]
>         __init__.py # empty
>         branch.py
>         root.py
> 
> branch.py:
> 
> import tree

   My opinion is - restructure code to avoid circular import instead of
hacking import machinery.
   Why does a submodule import the entire package instead of importing
just root?

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.


From victor.varvariuc at gmail.com  Tue Apr 10 13:24:18 2012
From: victor.varvariuc at gmail.com (Victor Varvariuc)
Date: Tue, 10 Apr 2012 14:24:18 +0300
Subject: [Python-ideas] Avoid (was: Improve) circular import
In-Reply-To: <20120410095551.GC22368@iskra.aviel.ru>
References: <CA+Lge13ttagjgYjdCtrk18e0B-my4zMQx0zX038Q98VDF7W+CA@mail.gmail.com>
	<20120410095551.GC22368@iskra.aviel.ru>
Message-ID: <CA+Lge13Ti=eQMebcK7hffr78KpntpX=gF-Qz9xZsp2X88c7_dg@mail.gmail.com>

On Tue, Apr 10, 2012 at 12:55 PM, Oleg Broytman <phd at phdru.name> wrote:

> My opinion is - restructure code to avoid circular import instead of
hacking import machinery.

It's not feasible sometimes.

See:
http://stackoverflow.com/questions/1556387/circular-import-dependency-in-python

Yes, they could be considered the same package. But if this results in a
massively huge file then it's impractical. I agree that frequently,
circular dependencies mean the design should be thought through again. But
there ARE some design patterns where it's appropriate (and where merging
the files together would result in a huge file) so I think it's dogmatic to
say that the packages should either be combined or the design should be
re-evaluated. -- Matthew Lund Dec 1 '11 at 21:49


> Why does a submodule import the entire package instead of importing
just root?

import tree, sys
print("tree.branch: 'root' in dir(tree) ->", 'root' in dir(tree))
print("tree.branch: 'tree.root' in sys.modules ->", 'tree.root' in sys.modules)

Ignore this part - my fault.

--
*Victor*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120410/b2c7c475/attachment.html>

From victor.varvariuc at gmail.com  Tue Apr 10 13:30:00 2012
From: victor.varvariuc at gmail.com (Victor Varvariuc)
Date: Tue, 10 Apr 2012 14:30:00 +0300
Subject: [Python-ideas] Improve circular import
Message-ID: <CA+Lge13zev4g2uahQx1Tyrk9_jePkrMq4nNntaBHXVvCCe8DYA@mail.gmail.com>

On Tue, Apr 10, 2012 at 12:55 PM, Oleg Broytman <phd at phdru.name> wrote:

> Why does a submodule import the entire package instead of importing
just root?

import tree, sys
print("tree.branch: 'root' in dir(tree) ->", 'root' in dir(tree))
print("tree.branch: 'tree.root' in sys.modules ->", 'tree.root' in
sys.modules)

Ignore this part - my fault. It should have been:

import sys
print("tree.branch: 'root' in dir(main) ->", 'root' in dir(sys.modules['__main__']))
print("tree.branch: 'tree.root' in sys.modules ->", 'tree.root' in sys.modules)

This shows that `root` attribute does not exist yet in main, though
'tree.root' exists in `sys.modules`.
--
*Victor*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120410/b2805060/attachment.html>

From p.f.moore at gmail.com  Tue Apr 10 13:57:11 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 10 Apr 2012 12:57:11 +0100
Subject: [Python-ideas] Improve import mechanism for circular imports
In-Reply-To: <CA+Lge13ttagjgYjdCtrk18e0B-my4zMQx0zX038Q98VDF7W+CA@mail.gmail.com>
References: <CA+Lge13ttagjgYjdCtrk18e0B-my4zMQx0zX038Q98VDF7W+CA@mail.gmail.com>
Message-ID: <CACac1F9NCvziOwxDO6ArOv1s+sS1RDA2ZeJvfzSH9U-oEaAM7w@mail.gmail.com>

On 10 April 2012 10:30, Victor Varvariuc <victor.varvariuc at gmail.com> wrote:
> There are circular imports in this example code.
> Currently, `root = sys.modules['tree.root']` hack in branch.py works.
> Wouldn't it be useful to create attribute `root` in `main` at the same time
> `sys.modules['tree.root']` is created when doing `from tree import root` in
> main?
> This would solve more complex cases with when circular imports are involved,
> without applying such hacks.

Why does "tree" even exist? Why not just have 2 top-level modules
"root" and "branch". ("It's only an example" isn't a good answer - it
isn't a good example if it doesn't demonstrate why "tree" is needed
for your real use case). Can you explain the purpose of "tree" in your
real code?

Also, having things happen at import time (as opposed to simply
defining classes, functions, etc) is not good form, precisely because
partially-imported modules are in an odd state, and problems like this
can easily arise if you don't know what you are doing.

Instead of giving a made-up example, if you describe what you are
trying to achieve, I'm fairly certain someone here (or more likely on
python-list) could show you a better way to do it, without needing
circular imports.

Paul.


From phd at phdru.name  Tue Apr 10 14:03:15 2012
From: phd at phdru.name (Oleg Broytman)
Date: Tue, 10 Apr 2012 16:03:15 +0400
Subject: [Python-ideas] Avoid circular import
In-Reply-To: <CA+Lge13Ti=eQMebcK7hffr78KpntpX=gF-Qz9xZsp2X88c7_dg@mail.gmail.com>
References: <CA+Lge13ttagjgYjdCtrk18e0B-my4zMQx0zX038Q98VDF7W+CA@mail.gmail.com>
	<20120410095551.GC22368@iskra.aviel.ru>
	<CA+Lge13Ti=eQMebcK7hffr78KpntpX=gF-Qz9xZsp2X88c7_dg@mail.gmail.com>
Message-ID: <20120410120315.GD22368@iskra.aviel.ru>

On Tue, Apr 10, 2012 at 02:24:18PM +0300, Victor Varvariuc <victor.varvariuc at gmail.com> wrote:
> On Tue, Apr 10, 2012 at 12:55 PM, Oleg Broytman <phd at phdru.name> wrote:
> 
> > My opinion is - restructure code to avoid circular import instead of
> hacking import machinery.
> 
> It's not feasible sometimes.
> 
> See:
> http://stackoverflow.com/questions/1556387/circular-import-dependency-in-python

   I don't see anything interesting there. Without deeper knowledge
about the code I'd recommend to import b.d in a/__init__.py before
importing c.

> Yes, they could be considered the same package. But if this results in a
> massively huge file then it's impractical. I agree that frequently,
> circular dependencies mean the design should be thought through again. But
> there ARE some design patterns where it's appropriate (and where merging
> the files together would result in a huge file) so I think it's dogmatic to
> say that the packages should either be combined or the design should be
> re-evaluated. -- Matthew Lund Dec 1 '11 at 21:49

   I didn't and do not recommend merging code into one huge module. Call
me dogmatic but I recommend to refactor and move common parts to avoid
circular import.

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.


From techtonik at gmail.com  Tue Apr 10 14:35:16 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 10 Apr 2012 15:35:16 +0300
Subject: [Python-ideas] Avoid circular import
In-Reply-To: <20120410120315.GD22368@iskra.aviel.ru>
References: <CA+Lge13ttagjgYjdCtrk18e0B-my4zMQx0zX038Q98VDF7W+CA@mail.gmail.com>
	<20120410095551.GC22368@iskra.aviel.ru>
	<CA+Lge13Ti=eQMebcK7hffr78KpntpX=gF-Qz9xZsp2X88c7_dg@mail.gmail.com>
	<20120410120315.GD22368@iskra.aviel.ru>
Message-ID: <CAPkN8xKYkQ3cz1_FFVYCjs=QFutKPOzKqaTZVyFacs0rQ4yB4g@mail.gmail.com>

On Tue, Apr 10, 2012 at 3:03 PM, Oleg Broytman <phd at phdru.name> wrote:
> On Tue, Apr 10, 2012 at 02:24:18PM +0300, Victor Varvariuc <victor.varvariuc at gmail.com> wrote:
>> On Tue, Apr 10, 2012 at 12:55 PM, Oleg Broytman <phd at phdru.name> wrote:
>>
>> > My opinion is - restructure code to avoid circular import instead of
>> hacking import machinery.
>>
>> It's not feasible sometimes.

Just FYI, here is an example of how to do it in the general case:
http://codereview.appspot.com/5449109/

>> See:
>> http://stackoverflow.com/questions/1556387/circular-import-dependency-in-python

Are you really fighting with this specific case? There is a solution to
that with a delayed import.
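A sketch of the delayed-import workaround (the module names `mod_a` and `mod_b` are invented for illustration): two mutually dependent modules work when one defers its import into the function body, so neither needs a partially initialised peer at import time.

```python
import sys, tempfile, textwrap
from pathlib import Path

tmp = tempfile.mkdtemp()
Path(tmp, "mod_a.py").write_text(textwrap.dedent("""
    import mod_b                # safe: mod_b needs mod_a only at call time

    def greet():
        return "a sees " + mod_b.name()
"""))
Path(tmp, "mod_b.py").write_text(textwrap.dedent("""
    def name():
        import mod_a            # delayed: resolved only when name() runs
        assert hasattr(mod_a, 'greet')   # mod_a is fully imported by now
        return "b"
"""))
sys.path.insert(0, tmp)

import mod_a
print(mod_a.greet())   # -> a sees b
```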


From ncoghlan at gmail.com  Tue Apr 10 15:45:04 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 10 Apr 2012 23:45:04 +1000
Subject: [Python-ideas] Improve import mechanism for circular imports
In-Reply-To: <CACac1F9NCvziOwxDO6ArOv1s+sS1RDA2ZeJvfzSH9U-oEaAM7w@mail.gmail.com>
References: <CA+Lge13ttagjgYjdCtrk18e0B-my4zMQx0zX038Q98VDF7W+CA@mail.gmail.com>
	<CACac1F9NCvziOwxDO6ArOv1s+sS1RDA2ZeJvfzSH9U-oEaAM7w@mail.gmail.com>
Message-ID: <CADiSq7f+um_YHfsksOyah=VhPaxYax7AQt3hYXrJqX2voEJS8g@mail.gmail.com>

On Tue, Apr 10, 2012 at 9:57 PM, Paul Moore <p.f.moore at gmail.com> wrote:
> Instead of giving a made-up example, if you describe what you are
> trying to achieve, I'm fairly certain someone here (or more likely on
> python-list) could show you a better way to do it, without needing
> circular imports.

There isn't actually a *strong* philosophical objection to improving
the circular import support. It's just a sufficiently hard problem
that the rote answer is "nobody has cared enough about the problem to
come up with a fix that works properly, is backwards compatible and
doesn't hurt the performance of regular imports".

The relevant tracker issue is http://bugs.python.org/issue992389 (yes,
that issue is approaching its 8th birthday later this year)

With import.c going away soon (courtesy of the migration to importlib
as the main import implementation), it may become easier to devise a
solution (or at least generate a better error message).

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From sven at marnach.net  Tue Apr 10 15:56:36 2012
From: sven at marnach.net (Sven Marnach)
Date: Tue, 10 Apr 2012 14:56:36 +0100
Subject: [Python-ideas] Avoid circular import
In-Reply-To: <20120410120315.GD22368@iskra.aviel.ru>
References: <CA+Lge13ttagjgYjdCtrk18e0B-my4zMQx0zX038Q98VDF7W+CA@mail.gmail.com>
	<20120410095551.GC22368@iskra.aviel.ru>
	<CA+Lge13Ti=eQMebcK7hffr78KpntpX=gF-Qz9xZsp2X88c7_dg@mail.gmail.com>
	<20120410120315.GD22368@iskra.aviel.ru>
Message-ID: <20120410135636.GA30763@bagheera>

Oleg Broytman schrieb am Tue, 10. Apr 2012, um 16:03:15 +0400:
>    I didn't and do not recommend merging code into one huge module. Call
> me dogmatic but I recommend to refactor and move common parts to avoid
> circular import.

I actually did run into cases where this was not possible.  (I just
wrote a lengthy description of such a case, but I figured it wasn't
too helpful, so it's not included here.)

A point to consider is that there are cases of circular imports that
used to work fine with implicit relative imports.  When using explicit
relative imports though, they would stop working -- see [1] for a
minimal example demonstrating this problem.

[1]: http://stackoverflow.com/questions/6351805/cyclic-module-dependencies-and-relative-imports-in-python

This issue can be easily overcome with function-level imports, but
some people don't like function-level imports either.

The same issue turned up when porting the Python Imaging Library to
Python 3.  PIL uses implicit relative, circular imports which have to
be turned into function-level imports to work properly on Python 3,
see [2] for details.

[2]: https://github.com/sloonz/pil-py3k/pull/2

Cheers,
    Sven


From phd at phdru.name  Tue Apr 10 16:11:05 2012
From: phd at phdru.name (Oleg Broytman)
Date: Tue, 10 Apr 2012 18:11:05 +0400
Subject: [Python-ideas] Avoid circular import
In-Reply-To: <20120410135636.GA30763@bagheera>
References: <CA+Lge13ttagjgYjdCtrk18e0B-my4zMQx0zX038Q98VDF7W+CA@mail.gmail.com>
	<20120410095551.GC22368@iskra.aviel.ru>
	<CA+Lge13Ti=eQMebcK7hffr78KpntpX=gF-Qz9xZsp2X88c7_dg@mail.gmail.com>
	<20120410120315.GD22368@iskra.aviel.ru>
	<20120410135636.GA30763@bagheera>
Message-ID: <20120410141105.GA30439@iskra.aviel.ru>

On Tue, Apr 10, 2012 at 02:56:36PM +0100, Sven Marnach <sven at marnach.net> wrote:
> This issue can be easily overcome with function-level imports, but
> some people don't like function-level imports either.

   Can I say I doubt it's a good reason to change Python?

> The same issue turned up when porting the Python Imaging Library to
> Python 3.  PIL uses implicit relative, circular imports which have to
> be turned into function-level imports to work properly on Python 3,
> see [2] for details.
> 
> [2]: https://github.com/sloonz/pil-py3k/pull/2

   Was there any major problem in fixing that?

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.


From sven at marnach.net  Tue Apr 10 16:53:01 2012
From: sven at marnach.net (Sven Marnach)
Date: Tue, 10 Apr 2012 15:53:01 +0100
Subject: [Python-ideas] Avoid circular import
In-Reply-To: <20120410141105.GA30439@iskra.aviel.ru>
References: <CA+Lge13ttagjgYjdCtrk18e0B-my4zMQx0zX038Q98VDF7W+CA@mail.gmail.com>
	<20120410095551.GC22368@iskra.aviel.ru>
	<CA+Lge13Ti=eQMebcK7hffr78KpntpX=gF-Qz9xZsp2X88c7_dg@mail.gmail.com>
	<20120410120315.GD22368@iskra.aviel.ru>
	<20120410135636.GA30763@bagheera>
	<20120410141105.GA30439@iskra.aviel.ru>
Message-ID: <20120410145301.GB30763@bagheera>

Oleg Broytman schrieb am Tue, 10. Apr 2012, um 18:11:05 +0400:
> > This issue can be easily overcome with function-level imports, but
> > some people don't like function-level imports either.
> 
>    Can I say I doubt it's a good reason to change Python?

Sorry, I actually didn't mean to argue in favour of any proposal.  I
just meant to point out issues relevant to this thread that some
people probably are not aware of.

> > The same issue turned up when porting the Python Imaging Library to
> > Python 3.  PIL uses implicit relative, circular imports which have to
> > be turned into function-level imports to work properly on Python 3,
> > see [2] for details.
> > 
> > [2]: https://github.com/sloonz/pil-py3k/pull/2
> 
>    Was there any major problem in fixing that?

This was meant as an example that these issues arise in practice, even
in libraries that can hardly be considered obscure.

Cheers,
    Sven


From steven.samuel.cole at gmail.com  Tue Apr 10 17:22:29 2012
From: steven.samuel.cole at gmail.com (Steven Samuel Cole)
Date: Wed, 11 Apr 2012 01:22:29 +1000
Subject: [Python-ideas] all, any - why no none ?
Message-ID: <4F845035.4020907@gmail.com>

hello,

i'm aware they've been around for quite a while, but for some reason, i 
did not have the builtins all(seq) and any(seq) on the radar thus far. 
when i used them today, i just assumed there was a corresponding 
none(seq), but was surprised to learn that is not true.

why is that ? has this been considered ? discussed ? dismissed ? i did 
search, but the net being neither case-sensitive nor semantic, the 
results were off topic.

sure, i can do not all or not any or any of these:
http://stackoverflow.com/q/6518394/217844
http://stackoverflow.com/q/3583860/217844
but none(seq): True if all items in seq are None
would IMHO be the pythonic way to do this.

what do you think ?

# ssc


From anacrolix at gmail.com  Tue Apr 10 17:44:04 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Tue, 10 Apr 2012 23:44:04 +0800
Subject: [Python-ideas] all, any - why no none ?
In-Reply-To: <4F845035.4020907@gmail.com>
References: <4F845035.4020907@gmail.com>
Message-ID: <CAB4yi1MTyJQwoNnc8GVSR_t2-D=ncCznN20mZ4UeoTQ3xCtwCQ@mail.gmail.com>

Try not any().
On Apr 10, 2012 11:23 PM, "Steven Samuel Cole" <steven.samuel.cole at gmail.com>
wrote:

> hello,
>
> i'm aware they've been around for quite a while, but for some reason, i
> did not have the builtins all(seq) and any(seq) on the radar thus far. when
> i used them today, i just assumed there was a corresponding none(seq), but
> was surprised to learn that is not true.
>
> why is that ? has this been considered ? discussed ? dismissed ? i did
> search, but the net being neither case-sensitive nor semantic, the results
> were off topic.
>
> sure, i can do not all or not any or any of these:
> http://stackoverflow.com/q/6518394/217844
> http://stackoverflow.com/q/3583860/217844
> but none(seq): True if all items in seq are None
> would IMHO be the pythonic way to do this.
>
> what do you think ?
>
> # ssc
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120410/dd1e8190/attachment.html>

From ncoghlan at gmail.com  Tue Apr 10 17:45:22 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 11 Apr 2012 01:45:22 +1000
Subject: [Python-ideas] all, any - why no none ?
In-Reply-To: <4F845035.4020907@gmail.com>
References: <4F845035.4020907@gmail.com>
Message-ID: <CADiSq7dhu3p-ZYOc5Nbb9WUEuLnKb7=TnV4fxb2a3yZhKqo2Rg@mail.gmail.com>

On Wed, Apr 11, 2012 at 1:22 AM, Steven Samuel Cole
<steven.samuel.cole at gmail.com> wrote:
> hello,
>
> i'm aware they've been around for quite a while, but for some reason, i did
> not have the builtins all(seq) and any(seq) on the radar thus far. when i
> used them today, i just assumed there was a corresponding none(seq), but was
> surprised to learn that is not true.
>
> why is that ? has this been considered ? discussed ? dismissed ? i did
> search, but the net being neither case-sensitive nor semantic, the results
> were off topic.
>
> sure, i can do not all or not any or any of these:
> http://stackoverflow.com/q/6518394/217844
> http://stackoverflow.com/q/3583860/217844
> but none(seq): True if all items in seq are None
> what do you think ?

I think it doesn't come up often enough to be worth special casing
over the more explicit "all(x is None for x in seq)".
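The two spellings under discussion, as a quick sketch (there is no builtin `none()`; these genexp forms are the existing equivalents):

```python
seq_all_none = [None, None, None]
seq_mixed = [None, 0, 'x']

# "every item is None" -- the explicit spelling
assert all(x is None for x in seq_all_none)
assert not all(x is None for x in seq_mixed)

# "no item is truthy" -- a different question, answered by not any()
assert not any([None, 0, ''])
assert any(seq_mixed)          # 'x' is truthy
```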

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From bruce at leapyear.org  Tue Apr 10 17:46:18 2012
From: bruce at leapyear.org (Bruce Leban)
Date: Tue, 10 Apr 2012 08:46:18 -0700
Subject: [Python-ideas] all, any - why no none ?
In-Reply-To: <4F845035.4020907@gmail.com>
References: <4F845035.4020907@gmail.com>
Message-ID: <CAGu0AnuHzyAD+nB2K7U-jr_iE_N35UujsBPPVixcBsUbZTxBQA@mail.gmail.com>

If you really mean "if all items in seq are None" then all(x is None for x
in seq) is very clear and explicit that you don't mean  all(x == None for x
in seq) which is not exactly the same thing.

If you don't care about exactly being None and you just want falseness, then not
any(seq) works already.

If I saw none(seq) I would think it meant "none of seq is true" as that is
a more common phrase. You'd need a name like all_none(seq). But then I want
any_none() and none_none() too. And  all_true() and all_false(), etc. Not
enough value here.
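The `is None` vs `== None` distinction can be made concrete (a sketch; the class is contrived to show the difference):

```python
# A class can override __eq__ to compare equal to None without being None,
# which is why the identity check is the right spelling.
class AlwaysEqual:
    def __eq__(self, other):
        return True            # claims equality with everything, None included

seq = [None, AlwaysEqual()]

assert all(x == None for x in seq)        # the equality check is fooled
assert not all(x is None for x in seq)    # the identity check is not
```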

--- Bruce
Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com




On Tue, Apr 10, 2012 at 8:22 AM, Steven Samuel Cole <
steven.samuel.cole at gmail.com> wrote:

> hello,
>
> i'm aware they've been around for quite a while, but for some reason, i
> did not have the builtins all(seq) and any(seq) on the radar thus far. when
> i used them today, i just assumed there was a corresponding none(seq), but
> was surprised to learn that is not true.
>
> why is that ? has this been considered ? discussed ? dismissed ? i did
> search, but the net being neither case-sensitive nor semantic, the results
> were off topic.
>
> sure, i can do not all or not any or any of these:
> http://stackoverflow.com/q/6518394/217844
> http://stackoverflow.com/q/3583860/217844
> but none(seq): True if all items in seq are None
> would IMHO be the pythonic way to do this.
>
> what do you think ?
>
> # ssc
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120410/80c11eca/attachment.html>

From oscar.j.benjamin at gmail.com  Tue Apr 10 17:49:01 2012
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Tue, 10 Apr 2012 16:49:01 +0100
Subject: [Python-ideas] all, any - why no none ?
In-Reply-To: <4F845035.4020907@gmail.com>
References: <4F845035.4020907@gmail.com>
Message-ID: <CAHVvXxR3omAGhBFnz0_5ufsqWU-ZuwKMYJ5yJcHM1rPaPd0Grg@mail.gmail.com>

none looks similar to None.

The code below rightly gives a NameError. If none were a builtin function,
not only would it allow the bug below but it would evaluate to True.

>>> def f(x=none):
...     if x:
...         do_stuff()
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'none' is not defined




On Tue, Apr 10, 2012 at 4:22 PM, Steven Samuel Cole <
steven.samuel.cole at gmail.com> wrote:

> hello,
>
> i'm aware they've been around for quite a while, but for some reason, i
> did not have the builtins all(seq) and any(seq) on the radar thus far. when
> i used them today, i just assumed there was a corresponding none(seq), but
> was surprised to learn that is not true.
>
> why is that ? has this been considered ? discussed ? dismissed ? i did
> search, but the net being neither case-sensitive nor semantic, the results
> were off topic.
>
> sure, i can do not all or not any or any of these:
> http://stackoverflow.com/q/6518394/217844
> http://stackoverflow.com/q/3583860/217844
> but none(seq): True if all items in seq are None
> would IMHO be the pythonic way to do this.
>
> what do you think ?
>
> # ssc
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120410/f4ca9861/attachment.html>

From ubershmekel at gmail.com  Tue Apr 10 17:50:59 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Tue, 10 Apr 2012 18:50:59 +0300
Subject: [Python-ideas] all, any - why no none ?
In-Reply-To: <4F845035.4020907@gmail.com>
References: <4F845035.4020907@gmail.com>
Message-ID: <CANSw7Kwi_U8ZAkkooOpb6RyTYHDBQOEiMb0toS5ma3hPDQs3Kw@mail.gmail.com>

On Tue, Apr 10, 2012 at 6:22 PM, Steven Samuel Cole <
steven.samuel.cole at gmail.com> wrote:

> ....
> what do you think ?
>
>
I don't like how a missed capitalization can be so deadly.

    if x is none:
        return

    assert x is not none

I can see that happening a lot and people being terribly confused and
annoyed.

Strongly against this idea.

Yuval
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120410/08e9abf5/attachment.html>

From sven at marnach.net  Tue Apr 10 20:27:11 2012
From: sven at marnach.net (Sven Marnach)
Date: Tue, 10 Apr 2012 19:27:11 +0100
Subject: [Python-ideas] Allow imports to a global name
Message-ID: <20120410182711.GC30763@bagheera>

Sometimes it is useful to do an import to a global name inside a
function.  A common use case is the 'pylab' module, which must be
imported *after* the backend has been set using 'matplotlib.use()'.
If the backend is configuration-dependent, the statement

    import pylab

will usually be inside a function, but the module should be available
globally, so you would do

    global pylab
    import pylab

While this code works (at least in CPython), the current language
specification forbids it [1], so the correct code should be

    global pylab
    import pylab as _pylab
    pylab = _pylab

I don't see why we shouldn't allow the shorter version -- it is
certainly easier to read.

The behaviour of pylab might be considered a questionable design
choice.  I've encountered the above non-conforming, but working code
out in the wild several times, though, so the language specification
might as well allow it.

Cheers,
    Sven

[1]: http://docs.python.org/dev/reference/simple_stmts.html#the-global-statement
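[Editor's note: a spec-conforming alternative to the global/import workaround above is importlib plus an explicit assignment into the module namespace. A minimal sketch, using the stdlib 'json' module as a stand-in for pylab since the binding mechanics are the same:]

```python
import importlib

def load_backend():
    # Bind the module under a global name from inside a function,
    # without combining 'global' with an import statement.
    globals()["json"] = importlib.import_module("json")

load_backend()
print(json.dumps({"ok": True}))  # the module is now a module-level global
```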


From nathan.alexander.rice at gmail.com  Tue Apr 10 23:10:45 2012
From: nathan.alexander.rice at gmail.com (Nathan Rice)
Date: Tue, 10 Apr 2012 17:10:45 -0400
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <CADiSq7co_=_a=hB4RodhGwwTxCeG=xdV9G+NnxTo+ZJwC6EXEw@mail.gmail.com>
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com>
	<4F638871.1010603@hotpy.org>
	<0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com>
	<DC2487A0-1D84-4A76-A5AA-87019E778022@gmail.com>
	<4F639370.1020609@hotpy.org>
	<93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com>
	<CAL3CFcXUJzfudwobNbfxTKKHvaRRic+_R-2=_Zp0zojQ0GHx5Q@mail.gmail.com>
	<CACC7D08-CACD-4546-8C03-6D80027C593F@gmail.com>
	<8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com>
	<4F684F32.5080006@hotpy.org>
	<8406714B-876E-435E-84AB-716804C92387@gmail.com>
	<CADiSq7dmZ=hA+nTmxpaDsLY+r5yQMreLa5OK2DoBMaBOQ2+i=w@mail.gmail.com>
	<EF548FEA-EFAA-497E-B779-E2052B66D499@gmail.com>
	<CALFfu7CcQ2w3U1Luvh-tsBR7rmnT7oBUzDQmJEqWXPdQRY4DSA@mail.gmail.com>
	<CADiSq7co_=_a=hB4RodhGwwTxCeG=xdV9G+NnxTo+ZJwC6EXEw@mail.gmail.com>
Message-ID: <CAOFbRm+tFKCNvoytyi10Awtvfdi8T=q8uxT4TdLQ3JQjF9yLjg@mail.gmail.com>

As one of the resident crackpot/idealists I don't know that my +1
means much, but I have a decent amount of code where I jump through a
lot of hoops to emulate writable __closure__, including copying a
function using FunctionType() and then replacing one instance of a function
with another in multiple places in a function graph; I also do a lot of
lambda wrapping and unwrapping for the same purpose.  This is
primarily relevant to symbolic computation graphs, such as
dataflow structures, computer algebra systems, etc.


From ben+python at benfinney.id.au  Wed Apr 11 00:34:46 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Wed, 11 Apr 2012 08:34:46 +1000
Subject: [Python-ideas] all, any - why no none ?
References: <4F845035.4020907@gmail.com>
Message-ID: <87ehrvgqrd.fsf@benfinney.id.au>

Steven Samuel Cole
<steven.samuel.cole at gmail.com> writes:

> i'm aware they've been around for quite a while, but for some reason,
> i did not have the builtins all(seq) and any(seq) on the radar thus
> far. when i used them today, i just assumed there was a corresponding
> none(seq), but was surprised to learn that is not true.

And then you realised "not any(seq)" works fine, and continued on
satisfied. Right?
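[Editor's note: the two readings of a hypothetical none() discussed in this thread, as a sketch (both names are hypothetical, not real builtins):]

```python
def none(iterable):
    # "No item is truthy" -- the reading suggested by all()/any() symmetry.
    return not any(iterable)

def all_none(iterable):
    # "Every item is literally None" -- the reading in the original post.
    return all(item is None for item in iterable)

print(none([0, '', False]))      # True: nothing truthy
print(all_none([0, '', False]))  # False: items are falsy, but not None
```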

-- 
 \     "Not to be absolutely certain is, I think, one of the essential |
  `\                        things in rationality." --Bertrand Russell |
_o__)                                                                  |
Ben Finney



From steve at pearwood.info  Wed Apr 11 02:22:31 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 11 Apr 2012 10:22:31 +1000
Subject: [Python-ideas] Allow imports to a global name
In-Reply-To: <20120410182711.GC30763@bagheera>
References: <20120410182711.GC30763@bagheera>
Message-ID: <4F84CEC7.9090402@pearwood.info>

Sven Marnach wrote:
> Sometimes it is useful to do an import to a global name inside a
> function.  A common use case is the 'pylab' module, which must be
> imported *after* the backend has been set using 'matplotlib.use()'.
> If the backend is configuration-dependent, the statement
> 
>     import pylab
> 
> will usually be inside a function, but the module should be available
> globally, so you would do
> 
>     global pylab
>     import pylab
> 
> While this code works (at least in CPython), the current language
> specification forbids it [1]


I quote:

     Names listed in a global statement must not be defined as
     formal parameters or in a for loop control target, class
     definition, function definition, or import statement.

http://docs.python.org/dev/reference/simple_stmts.html#the-global-statement

I can understand that it makes no sense to declare a function parameter as 
global, and I can see an argument in favour of optimizing for loops by ensuring 
that the target is always a local rather than global. But what is the 
rationale for prohibiting globals being used for classes, functions, and imports?

It seems like an unnecessary restriction, particularly since CPython doesn't 
bother to enforce it. The semantics of "global x; import x" is simple and obvious.

+1 on removing the unenforced prohibition on global class/def/import inside 
functions.


-- 
Steven



From guido at python.org  Wed Apr 11 02:36:46 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 10 Apr 2012 17:36:46 -0700
Subject: [Python-ideas] Allow imports to a global name
In-Reply-To: <4F84CEC7.9090402@pearwood.info>
References: <20120410182711.GC30763@bagheera> <4F84CEC7.9090402@pearwood.info>
Message-ID: <CAP7+vJL=2NSqZ+FUVhyzY9NehxVi2nVDTMoS-wYL2-4+YDF5yQ@mail.gmail.com>

+1

--Guido van Rossum (sent from Android phone)
On Apr 10, 2012 5:23 PM, "Steven D'Aprano" <steve at pearwood.info> wrote:

> Sven Marnach wrote:
>
>> Sometimes it is useful to do an import to a global name inside a
>> function.  A common use case is the 'pylab' module, which must be
>> imported *after* the backend has been set using 'matplotlib.use()'.
>> If the backend is configuration-dependent, the statement
>>
>>    import pylab
>>
>> will usually be inside a function, but the module should be available
>> globally, so you would do
>>
>>    global pylab
>>    import pylab
>>
>> While this code works (at least in CPython), the current language
>> specification forbids it [1]
>>
>
>
> I quote:
>
>    Names listed in a global statement must not be defined as
>    formal parameters or in a for loop control target, class
>    definition, function definition, or import statement.
>
> http://docs.python.org/dev/reference/simple_stmts.html#the-global-statement
>
> I can understand that it makes no sense to declare a function parameter as
> global, and I can see an argument in favour of optimizing for loops by ensuring
> that the target is always a local rather than global. But what is the
> rationale for prohibiting globals being used for classes, functions, and
> imports?
>
> It seems like an unnecessary restriction, particularly since CPython
> doesn't bother to enforce it. The semantics of "global x; import x" is
> simple and obvious.
>
> +1 on removing the unenforced prohibition on global class/def/import
> inside functions.
>
>
> --
> Steven
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120410/5982c1cf/attachment.html>

From ncoghlan at gmail.com  Wed Apr 11 02:59:47 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 11 Apr 2012 10:59:47 +1000
Subject: [Python-ideas] Allow imports to a global name
In-Reply-To: <CAP7+vJL=2NSqZ+FUVhyzY9NehxVi2nVDTMoS-wYL2-4+YDF5yQ@mail.gmail.com>
References: <20120410182711.GC30763@bagheera> <4F84CEC7.9090402@pearwood.info>
	<CAP7+vJL=2NSqZ+FUVhyzY9NehxVi2nVDTMoS-wYL2-4+YDF5yQ@mail.gmail.com>
Message-ID: <CADiSq7e=9k-_XNRKBYUHZMMCrxQJBUxb6A6Ys90k3_4pZNuRGg@mail.gmail.com>

On Wed, Apr 11, 2012 at 10:36 AM, Guido van Rossum <guido at python.org> wrote:
> +1

Ditto. FWIW, I'm actually in favour of dropping everything after the
"or" in that paragraph from the language spec, since we don't enforce
*any* of it. Aside from formal parameter definitions (which explicitly
declare local variables), name binding operations are just name
binding operations regardless of the specific syntax.

With global:

>>> def f():
...   global x
...   for x in (): pass
...   class x: pass
...   def x(): pass
...   import sys as x
...
>>> f()
>>> x
<module 'sys' (built-in)>

With nonlocal:

>>> def outer():
...   x = 1
...   def inner():
...     nonlocal x
...     for x in (): pass
...     class x: pass
...     def x(): pass
...     import sys as x
...   inner()
...   return x
...
>>> outer()
<module 'sys' (built-in)>

By contrast:

>>> def f(x):
...   global x
...
  File "<stdin>", line 1
SyntaxError: name 'x' is parameter and global
>>> def outer(x):
...   def inner(x):
...     nonlocal x
...
SyntaxError: name 'x' is parameter and nonlocal

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From raymond.hettinger at gmail.com  Wed Apr 11 05:29:36 2012
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Tue, 10 Apr 2012 23:29:36 -0400
Subject: [Python-ideas] Allow imports to a global name
In-Reply-To: <CADiSq7e=9k-_XNRKBYUHZMMCrxQJBUxb6A6Ys90k3_4pZNuRGg@mail.gmail.com>
References: <20120410182711.GC30763@bagheera> <4F84CEC7.9090402@pearwood.info>
	<CAP7+vJL=2NSqZ+FUVhyzY9NehxVi2nVDTMoS-wYL2-4+YDF5yQ@mail.gmail.com>
	<CADiSq7e=9k-_XNRKBYUHZMMCrxQJBUxb6A6Ys90k3_4pZNuRGg@mail.gmail.com>
Message-ID: <C69DAFA8-D699-463D-86EA-7AA8612D9696@gmail.com>


On Apr 10, 2012, at 8:59 PM, Nick Coghlan wrote:

> FWIW, I'm actually in favour of dropping everything after the
> "or" in that paragraph from the language spec, since we don't enforce
> *any* of it. 

+1  The restriction seems unnecessary to me.

That being said, we should check to make sure that the other
implementations don't need the restrictions for some reason or other.


Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120410/e643f0c2/attachment.html>

From ncoghlan at gmail.com  Wed Apr 11 06:48:35 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 11 Apr 2012 14:48:35 +1000
Subject: [Python-ideas] Allow imports to a global name
In-Reply-To: <C69DAFA8-D699-463D-86EA-7AA8612D9696@gmail.com>
References: <20120410182711.GC30763@bagheera> <4F84CEC7.9090402@pearwood.info>
	<CAP7+vJL=2NSqZ+FUVhyzY9NehxVi2nVDTMoS-wYL2-4+YDF5yQ@mail.gmail.com>
	<CADiSq7e=9k-_XNRKBYUHZMMCrxQJBUxb6A6Ys90k3_4pZNuRGg@mail.gmail.com>
	<C69DAFA8-D699-463D-86EA-7AA8612D9696@gmail.com>
Message-ID: <CADiSq7fJkVBJCv9+8Cq=evjj9G76hAMBKe10oX_JLpsPfZanCA@mail.gmail.com>

On Wed, Apr 11, 2012 at 1:29 PM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
>
> On Apr 10, 2012, at 8:59 PM, Nick Coghlan wrote:
>
> FWIW, I'm actually in favour of dropping everything after the
> "or" in that paragraph from the language spec, since we don't enforce
> *any* of it.
>
>
> +1  The restriction seems unnecessary to me.
>
> That being said, we should check to make sure that the other
> implementations don't need the restrictions for some reason or other.

Agreed. I created http://bugs.python.org/issue14544 to provide a
location for people to object (and will share the link around a bit).

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From tjreedy at udel.edu  Wed Apr 11 07:08:50 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 11 Apr 2012 01:08:50 -0400
Subject: [Python-ideas] Allow imports to a global name
In-Reply-To: <CADiSq7e=9k-_XNRKBYUHZMMCrxQJBUxb6A6Ys90k3_4pZNuRGg@mail.gmail.com>
References: <20120410182711.GC30763@bagheera> <4F84CEC7.9090402@pearwood.info>
	<CAP7+vJL=2NSqZ+FUVhyzY9NehxVi2nVDTMoS-wYL2-4+YDF5yQ@mail.gmail.com>
	<CADiSq7e=9k-_XNRKBYUHZMMCrxQJBUxb6A6Ys90k3_4pZNuRGg@mail.gmail.com>
Message-ID: <jm33l7$qv9$1@dough.gmane.org>

On 4/10/2012 8:59 PM, Nick Coghlan wrote:
> On Wed, Apr 11, 2012 at 10:36 AM, Guido van Rossum<guido at python.org>  wrote:
>> +1
>
> Ditto. FWIW, I'm actually in favour of dropping everything after the
> "or" in that paragraph from the language spec, since we don't enforce
> *any* of it. Aside from formal parameter definitions (which explicitly
> declare local variables), name binding operations are just name
> binding operations regardless of the specific syntax.
>
> With global:
>
>>>> def f():
> ...   global x
> ...   for x in (): pass
> ...   class x: pass
> ...   def x(): pass
> ...   import sys as x
> ...
>>>> f()
>>>> x
> <module 'sys' (built-in)>

I found this slightly surprising, but on checking with dis, each of the 
'implicit' assignments is implemented as the equivalent of x = 
internal_function_call(args).  (For for-loops, the assignment is within 
the loop.) Which is to say, each bytecode sequence ends with store_xxx 
where xxx is 'fast' or 'global'. So now I am not surprised.
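[Editor's note: Terry's observation is easy to verify with dis; a minimal sketch using the modern dis.get_instructions API:]

```python
import dis

def f():
    global x
    import sys as x  # an 'implicit' assignment to a declared global

# The import compiles to IMPORT_NAME followed by an ordinary STORE_GLOBAL,
# just like an explicit assignment to a global name would.
ops = [ins.opname for ins in dis.get_instructions(f)]
print("STORE_GLOBAL" in ops)  # True
```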

-- 
Terry Jan Reedy



From ncoghlan at gmail.com  Wed Apr 11 07:23:12 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 11 Apr 2012 15:23:12 +1000
Subject: [Python-ideas] Allow imports to a global name
In-Reply-To: <jm33l7$qv9$1@dough.gmane.org>
References: <20120410182711.GC30763@bagheera> <4F84CEC7.9090402@pearwood.info>
	<CAP7+vJL=2NSqZ+FUVhyzY9NehxVi2nVDTMoS-wYL2-4+YDF5yQ@mail.gmail.com>
	<CADiSq7e=9k-_XNRKBYUHZMMCrxQJBUxb6A6Ys90k3_4pZNuRGg@mail.gmail.com>
	<jm33l7$qv9$1@dough.gmane.org>
Message-ID: <CADiSq7eCUxtBuvCodpC7bJ+Vv7P6J5QgWNtjxjvoUs4g7QqaZA@mail.gmail.com>

On Wed, Apr 11, 2012 at 3:08 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> I found this slightly surprising, but on checking with dis, each of the
> 'implicit' assignments is implemented as the equivalent of x =
> internal_function_call(args).  (For for-loops, the assignment is within the
> loop.) Which is to say, each bytecode sequence ends with store_xxx where xxx
> is 'fast' or 'global'. So now I am not surprised.

By contrast, I already knew that CPython's underlying implementation
of the name binding step was the same in all these cases, so I was
surprised by the documented restriction in the language spec.

I'm now wondering if the initial restriction that prompted the note in
the language spec was something that existed in the pre-AST version of
the compiler (I only started learning the code generation machinery
when I was helping to get the AST based compiler branch ready for
inclusion in Python 2.5, so I know very little about how the compiler
used to work in 2.4 and earlier).

Regards,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From anacrolix at gmail.com  Wed Apr 11 11:38:39 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Wed, 11 Apr 2012 17:38:39 +0800
Subject: [Python-ideas] generator.close
Message-ID: <CAB4yi1PYOaBHAW+qsnFo4q9SXoxWb1aucODTvEwruXd6Wrms5w@mail.gmail.com>

Why can't generator.close() return the value if a StopIteration is raised?

The PEPs mentioned that it was proposed before, but I can't find any
definitive reason, and it's terribly convenient if it does.
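[Editor's note: for context, under PEP 380 the return value is already reachable today via StopIteration.value; close() itself discards it. A sketch:]

```python
def gen():
    yield 1
    return "result"

g = gen()
next(g)                 # runs to the yield
try:
    next(g)             # exhausts the generator
except StopIteration as e:
    value = e.value     # what a value-returning close() would have given us

print(value)
print(g.close())  # None: once the generator is exhausted, the value is gone
```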
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120411/f1165299/attachment.html>

From mark at hotpy.org  Wed Apr 11 12:04:29 2012
From: mark at hotpy.org (Mark Shannon)
Date: Wed, 11 Apr 2012 11:04:29 +0100
Subject: [Python-ideas] generator.close
In-Reply-To: <CAB4yi1PYOaBHAW+qsnFo4q9SXoxWb1aucODTvEwruXd6Wrms5w@mail.gmail.com>
References: <CAB4yi1PYOaBHAW+qsnFo4q9SXoxWb1aucODTvEwruXd6Wrms5w@mail.gmail.com>
Message-ID: <4F85572D.9000608@hotpy.org>

Matt Joiner wrote:
> Why can't generator.close() return the value if a StopIteration is raised?

No reason as far as I can see. The semantics are clear enough.
 From an implementation point of view it would be a simple patch.

> 
> The PEPs mentioned that it was proposed before, but I can't find any 
> definitive reason, and it's terribly convenient if it does.

I'm sure it is convenient, but I believe it is convention to provide a
use case ;)

Cheers,
Mark.


From steve at pearwood.info  Wed Apr 11 12:42:02 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 11 Apr 2012 20:42:02 +1000
Subject: [Python-ideas] Allow imports to a global name
In-Reply-To: <C69DAFA8-D699-463D-86EA-7AA8612D9696@gmail.com>
References: <20120410182711.GC30763@bagheera>
	<4F84CEC7.9090402@pearwood.info>	<CAP7+vJL=2NSqZ+FUVhyzY9NehxVi2nVDTMoS-wYL2-4+YDF5yQ@mail.gmail.com>	<CADiSq7e=9k-_XNRKBYUHZMMCrxQJBUxb6A6Ys90k3_4pZNuRGg@mail.gmail.com>
	<C69DAFA8-D699-463D-86EA-7AA8612D9696@gmail.com>
Message-ID: <4F855FFA.7060505@pearwood.info>

Raymond Hettinger wrote:
> On Apr 10, 2012, at 8:59 PM, Nick Coghlan wrote:
> 
>> FWIW, I'm actually in favour of dropping everything after the
>> "or" in that paragraph from the language spec, since we don't enforce
>> *any* of it. 
> 
> +1  The restriction seems unnecessary to me.
> 
> That being said, we should check to make sure that the other
> implementations don't need the restrictions for some reason or other.



Seems to work with Jython:

steve at runes:~$ jython
Jython 2.5.1+ (Release_2_5_1, Aug 4 2010, 07:18:19)
[OpenJDK Client VM (Sun Microsystems Inc.)] on java1.6.0_18
Type "help", "copyright", "credits" or "license" for more information.
 >>> def test():
...     global math
...     import math
...
 >>>
 >>> test()
 >>> math
<type 'org.python.modules.math'>



-- 
Steven


From jh at improva.dk  Wed Apr 11 12:50:28 2012
From: jh at improva.dk (Jacob Holm)
Date: Wed, 11 Apr 2012 12:50:28 +0200
Subject: [Python-ideas] generator.close
In-Reply-To: <CAB4yi1PYOaBHAW+qsnFo4q9SXoxWb1aucODTvEwruXd6Wrms5w@mail.gmail.com>
References: <CAB4yi1PYOaBHAW+qsnFo4q9SXoxWb1aucODTvEwruXd6Wrms5w@mail.gmail.com>
Message-ID: <4F8561F4.9060806@improva.dk>

Hello Matt

On 04/11/2012 11:38 AM, Matt Joiner wrote:
> Why can't generator.close() return the value if a StopIteration is raised?
>
> The PEPs mentioned that it was proposed before, but I can't find any
> definitive reason, and it's terribly convenient if it does.
>

What should be returned when you call close on an already-exhausted 
generator?

You can't return the value of the final StopIteration unless you arrange 
to have that value stored somewhere.  Storing the value was deemed 
undesirable by the powers that be.

The alternative is to return None if the generator is already exhausted. 
That would work, but severely reduces the usefulness of the change.

If you don't care about the performance of yield-from, it is quite easy 
to write a class you can use to wrap your generator-iterators and get 
the desired result (see untested example below).


- Jacob


import functools

class generator_result_wrapper(object):
    __slots__ = ('_it', '_result')

    def __init__(self, it):
        self._it = it

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self._it)
        except StopIteration as e:
            self._result = e.value
            raise

    def send(self, value):
        try:
            return self._it.send(value)
        except StopIteration as e:
            self._result = e.value
            raise

    def throw(self, *args, **kwargs):
        try:
            return self._it.throw(*args, **kwargs)
        except StopIteration as e:
            self._result = e.value
            raise

    def close(self):
        try:
            return self._result
        except AttributeError:
            pass
        try:
            self._it.throw(GeneratorExit)
        except StopIteration as e:
            self._result = e.value
            return self._result
        except GeneratorExit:
            pass

def close_result(func):
    @functools.wraps(func)
    def factory(*args, **kwargs):
        return generator_result_wrapper(func(*args, **kwargs))
    return factory



From anacrolix at gmail.com  Wed Apr 11 13:28:18 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Wed, 11 Apr 2012 19:28:18 +0800
Subject: [Python-ideas] generator.close
In-Reply-To: <4F8561F4.9060806@improva.dk>
References: <CAB4yi1PYOaBHAW+qsnFo4q9SXoxWb1aucODTvEwruXd6Wrms5w@mail.gmail.com>
	<4F8561F4.9060806@improva.dk>
Message-ID: <CAB4yi1PTgVmnSTgXa4peW_Nn39pgLo3nvu19iPqeOrhPz6Dewg@mail.gmail.com>

You make an excellent point. I'm inclined to agree with you. Cheers
On Apr 11, 2012 6:50 PM, "Jacob Holm" <jh at improva.dk> wrote:

> Hello Matt
>
> On 04/11/2012 11:38 AM, Matt Joiner wrote:
>
>> Why can't generator.close() return the value if a StopIteration is raised?
>>
>> The PEPs mentioned that it was proposed before, but I can't find any
>> definitive reason, and it's terribly convenient if it does.
>>
>>
> What should be returned when you call close on an already-exhausted
> generator?
>
> You can't return the value of the final StopIteration unless you arrange
> to have that value stored somewhere.  Storing the value was deemed
> undesirable by the powers that be.
>
> The alternative is to return None if the generator is already exhausted.
>  That would work, but severely reduces the usefulness of the change.
>
> If you don't care about the performance of yield-from, it is quite easy to
> write a class you can use to wrap your generator-iterators and get the
> desired result (see untested example below).
>
>
> - Jacob
>
>
> import functools
>
> class generator_result_wrapper(object):
>    __slots__ = ('_it', '_result')
>
>    def __init__(self, it):
>        self._it = it
>
>    def __iter__(self):
>        return self
>
>    def __next__(self):
>        try:
>            return next(self._it)
>        except StopIteration as e:
>            self._result = e.value
>            raise
>
>    def send(self, value):
>        try:
>            return self._it.send(value)
>        except StopIteration as e:
>            self._result = e.value
>            raise
>
>    def throw(self, *args, **kwargs):
>        try:
>            return self._it.throw(*args, **kwargs)
>        except StopIteration as e:
>            self._result = e.value
>            raise
>
>    def close(self):
>        try:
>            return self._result
>        except AttributeError:
>            pass
>        try:
>            self._it.throw(GeneratorExit)
>        except StopIteration as e:
>            self._result = e.value
>            return self._result
>        except GeneratorExit:
>            pass
>
> def close_result(func):
>    @functools.wraps(func)
>    def factory(*args, **kwargs):
>        return generator_result_wrapper(func(*args, **kwargs))
>    return factory
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120411/1f767952/attachment.html>

From ncoghlan at gmail.com  Wed Apr 11 14:26:43 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 11 Apr 2012 22:26:43 +1000
Subject: [Python-ideas] generator.close
In-Reply-To: <CAB4yi1PTgVmnSTgXa4peW_Nn39pgLo3nvu19iPqeOrhPz6Dewg@mail.gmail.com>
References: <CAB4yi1PYOaBHAW+qsnFo4q9SXoxWb1aucODTvEwruXd6Wrms5w@mail.gmail.com>
	<4F8561F4.9060806@improva.dk>
	<CAB4yi1PTgVmnSTgXa4peW_Nn39pgLo3nvu19iPqeOrhPz6Dewg@mail.gmail.com>
Message-ID: <CADiSq7f8AHNP111b-=Tt7cqfm9uX2Jhk=WGMVTLYdTBSOi+GBg@mail.gmail.com>

On Wed, Apr 11, 2012 at 9:28 PM, Matt Joiner <anacrolix at gmail.com> wrote:
> You make an excellent point. I'm inclined to agree with you.

While Jacob does make a valid point about the question of what to do
when close() is called multiple times (or on a generator that has
already been exhausted through iteration), the specific reason that
got the idea killed in PEP 380 [1] is that generators shouldn't be
raising StopIteration as a result of a close() invocation anyway -
they should usually just reraise the GeneratorExit that gets thrown in
to finalise the generator body. If inner generators start making a
habit of converting GeneratorExit to StopIteration, then intervening
"yield from" operations may yield another value instead of terminating
the way they're supposed to in response to close().

Cheers,
Nick.

[1] http://www.python.org/dev/peps/pep-0380/#rejected-ideas

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From tshepang at gmail.com  Wed Apr 11 22:35:54 2012
From: tshepang at gmail.com (Tshepang Lekhonkhobe)
Date: Wed, 11 Apr 2012 22:35:54 +0200
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
Message-ID: <CAA77j2Bcds07k9_QaLL5fTXbQc3X6B88yXYNpnqfxLg81ixidw@mail.gmail.com>

Hi,

I find the fact that 'prefix' in str.startswith(prefix) accepts a tuple
quite useful. That's because one can do a match on more than one
pattern at a time, without ugliness. Would it be a good idea to do the
same for str.replace(old, new)?

before
>>> 'foo bar baz'.replace('foo', 'baz').replace('bar', 'baz')
'baz baz baz'

after
>>> 'foo bar baz'.replace(('foo', 'bar'), 'baz')
'baz baz baz'
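[Editor's note: the proposal above, read as a plain helper function (name hypothetical), applies each pattern in sequence, so order can matter when patterns overlap:]

```python
def replace_multi(s, olds, new):
    # Hypothetical equivalent of the proposed str.replace(tuple, new):
    # replace each pattern in turn, left to right.
    for old in olds:
        s = s.replace(old, new)
    return s

print(replace_multi('foo bar baz', ('foo', 'bar'), 'baz'))  # baz baz baz
```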


From sven at marnach.net  Thu Apr 12 02:41:53 2012
From: sven at marnach.net (Sven Marnach)
Date: Thu, 12 Apr 2012 01:41:53 +0100
Subject: [Python-ideas] in str.replace(old, new),
 allow 'old' to accept a tuple
In-Reply-To: <CAA77j2Bcds07k9_QaLL5fTXbQc3X6B88yXYNpnqfxLg81ixidw@mail.gmail.com>
References: <CAA77j2Bcds07k9_QaLL5fTXbQc3X6B88yXYNpnqfxLg81ixidw@mail.gmail.com>
Message-ID: <20120412004153.GE30763@bagheera>

Tshepang Lekhonkhobe schrieb am Wed, 11. Apr 2012, um 22:35:54 +0200:
> before
> >>> 'foo bar baz'.replace('foo', 'baz').replace('bar', 'baz')
> 'baz baz baz'
> 
> after
> >>> 'foo bar baz'.replace(('foo', 'bar'), 'baz')
> 'baz baz baz'

The usual current solution is to use `re.sub`:

    >>> re.sub("foo|bar", "baz", "foo bar baz")
    'baz baz baz'

or, for a general iterable of patterns

    re.sub("|".join(map(re.escape, patterns)), repl, string)

Cheers,
    Sven


From ben+python at benfinney.id.au  Thu Apr 12 03:47:07 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 12 Apr 2012 11:47:07 +1000
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
References: <CAA77j2Bcds07k9_QaLL5fTXbQc3X6B88yXYNpnqfxLg81ixidw@mail.gmail.com>
Message-ID: <87lim1g1r8.fsf@benfinney.id.au>

Tshepang Lekhonkhobe <tshepang at gmail.com>
writes:

> >>> 'foo bar baz'.replace(('foo', 'bar'), 'baz')
> baz baz baz

How about:

    'foo bar baz'.replace(('foo', 'bar'), 'foobar')

You can't replace multiple matches "at the same time", as you're
implying. The order of replacements is important, since it will affect
the outcome in many cases.

Do you think it's important to allow a set as the first argument to
str.replace()?

    search_strings = set(['foo', 'bar'])
    'foo bar baz'.replace(search_strings, 'foobar')

I think that would be at least as desirable as your proposal; but what
would be the order of replacements?

-- 
 \      "Shepherds ... look after their sheep so they can, first, fleece |
  `\   them and second, turn them into meat. That's much more like the |
_o__)     priesthood as I know it." --Christopher Hitchens, 2008-10-29 |
Ben Finney



From cmjohnson.mailinglist at gmail.com  Thu Apr 12 03:59:34 2012
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Wed, 11 Apr 2012 15:59:34 -1000
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <87lim1g1r8.fsf@benfinney.id.au>
References: <CAA77j2Bcds07k9_QaLL5fTXbQc3X6B88yXYNpnqfxLg81ixidw@mail.gmail.com>
	<87lim1g1r8.fsf@benfinney.id.au>
Message-ID: <F35470A2-84EB-4885-89B3-EDD87C355D16@gmail.com>

On Apr 11, 2012, at 3:47 PM, Ben Finney wrote:

>   'foo bar baz'.replace(('foo', 'bar'), 'foobar')
> 
> You can't replace multiple matches "at the same time", as you're
> implying. The order of replacements is important, since it will affect
> the outcome in many cases.

Can't you say the same about 'a b c'.replace("a", "aa")? 

I think the case of the needles overlapping is more to your point though.

"abc".replace( ("ab", "bc"), "b")

What should that produce? "bc"? "b"? "ab" even (if we ignore the order of the tuple)?
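[Editor's note: for what it's worth, regex alternation already pins down one possible answer to this question: matching is leftmost, alternatives are tried in the order listed, and replaced text is not rescanned.]

```python
import re

# 'ab' matches first at index 0 and becomes 'b'; scanning resumes after
# the replacement, so the leftover 'c' is kept and 'bc' never matches.
print(re.sub("ab|bc", "b", "abc"))  # bc
```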

From ben+python at benfinney.id.au  Thu Apr 12 04:46:19 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 12 Apr 2012 12:46:19 +1000
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
References: <CAA77j2Bcds07k9_QaLL5fTXbQc3X6B88yXYNpnqfxLg81ixidw@mail.gmail.com>
	<87lim1g1r8.fsf@benfinney.id.au>
	<F35470A2-84EB-4885-89B3-EDD87C355D16@gmail.com>
Message-ID: <878vi13bwk.fsf@benfinney.id.au>

"Carl M. Johnson"
<cmjohnson.mailinglist at gmail.com> writes:

> On Apr 11, 2012, at 3:47 PM, Ben Finney wrote:
> > You can't replace multiple matches "at the same time", as you're
> > implying. The order of replacements is important, since it will
> > affect the outcome in many cases.
>
> Can't you say the same about 'a b c'.replace("a", "aa")? 

Not the same thing. The matches *can* be all "at the same time", in
every case, since only a single pattern is being matched. Then, once all
those matches are found, they're all replaced. So it's not a problem.

I'm pointing out that, if distinct patterns are being matched and
replaced, then the order of replacement matters.

> I think the case of the needles overlapping is more to your point
> though.
>
> "abc".replace( ("ab", "bc"), "b")
>
> What should that produce? "bc"? "b"? "ab" even (if we ignore the order
> of the tuple)?

Yes, these and other cases make it problematic to think in terms of
"replace them all at the same time". The replacements should be done in
an order predictable by the person reading the code.

And if they should be done in order, then that order should be explicit.
I think the existing solution helps with that.
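A tiny illustration with hypothetical strings, showing how the order of the two replacements changes the outcome:

```python
s = "ab"
# Replacing "ab" first consumes the whole string, leaving no "b" to match;
# replacing "b" first leaves "aY", which "ab" then never matches.
print(s.replace("ab", "X").replace("b", "Y"))  # X
print(s.replace("b", "Y").replace("ab", "X"))  # aY
```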

-- 
 \     "... it's best to confuse only one issue at a time." --Brian W. |
  `\   Kernighan and Dennis M. Ritchie, _The C Programming Language_, |
_o__)                                                            1988 |
Ben Finney



From greg.ewing at canterbury.ac.nz  Thu Apr 12 04:47:42 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 12 Apr 2012 14:47:42 +1200
Subject: [Python-ideas] in str.replace(old, new),
 allow 'old' to accept a tuple
In-Reply-To: <87lim1g1r8.fsf@benfinney.id.au>
References: <CAA77j2Bcds07k9_QaLL5fTXbQc3X6B88yXYNpnqfxLg81ixidw@mail.gmail.com>
	<87lim1g1r8.fsf@benfinney.id.au>
Message-ID: <4F86424E.7090403@canterbury.ac.nz>

Ben Finney wrote:
> Tshepang Lekhonkhobe <tshepang at gmail.com>
> writes:

>>>>>'foo bar baz'.replace(('foo', 'bar'), 'baz')

> You can't replace multiple matches "at the same time", as you're
> implying.

An obvious thing to do is to try them in the order they
appear in the sequence. That would argue against allowing
an unordered collection.

Not quite so obvious is whether the replacements should
be considered as candidates for further replacements.
I would say not, because it complicates the algorithm
and in my experience is rarely needed. If you want that,
you would just have to do multiple replace calls like
you do now.

And how about allowing a sequence of (old, new) pairs
instead of just a single replacement? That would be even
more useful.
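A minimal sketch of that (old, new) pairs idea (multi_replace is a hypothetical name, not a proposed stdlib API):

```python
def multi_replace(s, pairs):
    # One full pass per (old, new) pair, applied in the order given.
    # Note: text produced by an earlier pair CAN be re-matched by a later
    # one, which is exactly the cascading behaviour discussed above.
    for old, new in pairs:
        s = s.replace(old, new)
    return s

print(multi_replace("foo bar baz", [("foo", "baz"), ("bar", "baz")]))
# baz baz baz
```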

-- 
Greg


From cs at zip.com.au  Thu Apr 12 05:58:36 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Thu, 12 Apr 2012 13:58:36 +1000
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <4F86424E.7090403@canterbury.ac.nz>
References: <4F86424E.7090403@canterbury.ac.nz>
Message-ID: <20120412035836.GA19824@cskk.homeip.net>

On 12Apr2012 14:47, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
| Ben Finney wrote:
| > Tshepang Lekhonkhobe <tshepang at gmail.com>
| > writes:
| >>>>>'foo bar baz'.replace(('foo', 'bar'), 'baz')
| 
| > You can't replace multiple matches "at the same time", as you're
| > implying.
| 
| An obvious thing to do is to try them in the order they
| appear in the sequence. That would argue against allowing
| an unordered collection.

And likewise with Ben's set() suggestion.

I for one would allow it.

If the order matters, the caller can produce a sequence with the
required order. If the order doesn't matter (you know no replacement
overlaps, and no replacement introduces text that itself should get
replaced), then why not allow a set?

I vote for any iterable if this goes ahead. The specification should say
that replacements happen in the order items come from the iterable,
leaving the choice of control up to the caller but providing predictable
behaviour if the caller provides a predictable sequence.

| Not quite so obvious is whether the replacements should
| be considered as candidates for further replacements.
| I would say not, because it complicates the algorithm
| and in my experience is rarely needed.

Not to mention recursion!

| If you want that,
| you would just have to do multiple replace calls like
| you do now.
| 
| And how about allowing a sequence of (old, new) pairs
| instead of just a single replacement? That would be even
| more useful.

Sure. But doesn't that break the function signature? I suppose we're
already there though.

Do you want to special case the single string replacement or
require callers to use zip(repls, [ "foo" for s in repls ])?
Personally, I would require the zip; the, um, flexibility of the
%-format operator with string-vs-list has long bothered me to the point
of always providing a sequence, even a single element tuple.
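As an aside, the zip above can be spelled without the throwaway list comprehension; itertools.repeat supplies the constant side (a sketch):

```python
import itertools

repls = ("foo", "bar")
# repeat("baz") yields "baz" forever; zip stops when repls is exhausted
pairs = list(zip(repls, itertools.repeat("baz")))
print(pairs)  # [('foo', 'baz'), ('bar', 'baz')]
```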
-- 
Cameron Simpson <cs at zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

Mountain rescue teams insist that all climbers wear helmets, and fall headfirst.
They are then impacted into a small globular mass easily stowed in a rucksack.
        - Tom Patey, who didn't, and wasn't


From cs at zip.com.au  Thu Apr 12 06:01:02 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Thu, 12 Apr 2012 14:01:02 +1000
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <87lim1g1r8.fsf@benfinney.id.au>
References: <87lim1g1r8.fsf@benfinney.id.au>
Message-ID: <20120412040102.GA21051@cskk.homeip.net>

On 12Apr2012 11:47, Ben Finney <ben+python at benfinney.id.au> wrote:
| Tshepang Lekhonkhobe <tshepang at gmail.com>
| writes:
| 
| > >>> 'foo bar baz'.replace(('foo', 'bar'), 'baz')
| > baz baz baz
| 
| How about:
| 
|     'foo bar baz'.replace(('foo', 'bar'), 'foobar')
| 
| You can't replace multiple matches "at the same time", as you're
| implying. The order of replacements is important, since it will affect
| the outcome in many cases.

"At the same time" might imply something equivalent to the cited
"re.sub('foo|bar',...)" suggestion. And that is different to an iterated
"replace foo, then replace bar" if the possible matched overlap.

Just a thought about what semantics the OP may have envisaged.

Personally, given re.sub and the ease of running replace a few times in
a loop, I'm -0.3 on the suggestion itself.

Cheers,
-- 
Cameron Simpson <cs at zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

A software engineering discussion from Perl-Porters:
Chip Salzenberg:        The wise one has seen the calamity,
                        and has proceeded to hide himself.
                        - Ecclesiastes
Gurusamy Sarathy:       He that observeth the wind shall not sow;
                        and he that regardeth the clouds shall not reap.


From ben+python at benfinney.id.au  Thu Apr 12 06:33:51 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 12 Apr 2012 14:33:51 +1000
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
References: <CAA77j2Bcds07k9_QaLL5fTXbQc3X6B88yXYNpnqfxLg81ixidw@mail.gmail.com>
	<87lim1g1r8.fsf@benfinney.id.au> <4F86424E.7090403@canterbury.ac.nz>
Message-ID: <874nsp36xc.fsf@benfinney.id.au>

Greg Ewing <greg.ewing at canterbury.ac.nz> writes:

> An obvious thing to do is to try them in the order they appear in the
> sequence. That would argue against allowing an unordered collection.

For that reason, I'm -0.5 on the proposal. If we're to specify multiple
match patterns and do them all in a single operation, I'd prefer to
specify them in e.g. a set or some other efficient non-ordered
collection.

-- 
 \     "On the other hand, you have different fingers." --Steven Wright |
  `\                                                                   |
_o__)                                                                  |
Ben Finney



From ben+python at benfinney.id.au  Thu Apr 12 06:37:01 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 12 Apr 2012 14:37:01 +1000
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
References: <87lim1g1r8.fsf@benfinney.id.au>
	<20120412040102.GA21051@cskk.homeip.net>
Message-ID: <87zkah1s7m.fsf@benfinney.id.au>

Cameron Simpson <cs at zip.com.au> writes:

> "At the same time" might imply something equivalent to the cited
> "re.sub('foo|bar',...)" suggestion. And that is different to an iterated
> "replace foo, then replace bar" if the possible matched overlap.

Yes, it is; but the OP presented a proposal as though it were to have
the same semantics as a sequence of replace operations.

If the OP wants to specify different semantics, let's hear it.

-- 
 \         "A child of five could understand this. Fetch me a child of |
  `\                                             five." --Groucho Marx |
_o__)                                                                  |
Ben Finney



From ncoghlan at gmail.com  Thu Apr 12 06:56:16 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 12 Apr 2012 14:56:16 +1000
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <87zkah1s7m.fsf@benfinney.id.au>
References: <87lim1g1r8.fsf@benfinney.id.au>
	<20120412040102.GA21051@cskk.homeip.net>
	<87zkah1s7m.fsf@benfinney.id.au>
Message-ID: <CADiSq7dJaHWYgtgS+ztL2Psj9fd8F261JEW7gaXve33TP1Xw9A@mail.gmail.com>

On Thu, Apr 12, 2012 at 2:37 PM, Ben Finney <ben+python at benfinney.id.au> wrote:
> If the OP wants to specify different semantics, let's hear it.

Whatever semantics were chosen, they would end up being confusing to *someone*.

With prefix and suffix matching, the implicit OR is simple and
obvious. The same can't be said for the replacement command,
particularly if it can be used with unordered collections.

Far better to leave this task to re.sub (which uses regex syntax to
avoid ambiguity) or to explicit flow control and multiple invocations
of replace().

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From rob.cliffe at btinternet.com  Thu Apr 12 09:32:44 2012
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Thu, 12 Apr 2012 08:32:44 +0100
Subject: [Python-ideas] in str.replace(old, new),
 allow 'old' to accept a tuple
In-Reply-To: <CADiSq7dJaHWYgtgS+ztL2Psj9fd8F261JEW7gaXve33TP1Xw9A@mail.gmail.com>
References: <87lim1g1r8.fsf@benfinney.id.au>
	<20120412040102.GA21051@cskk.homeip.net>
	<87zkah1s7m.fsf@benfinney.id.au>
	<CADiSq7dJaHWYgtgS+ztL2Psj9fd8F261JEW7gaXve33TP1Xw9A@mail.gmail.com>
Message-ID: <4F86851C.9060504@btinternet.com>



On 12/04/2012 05:56, Nick Coghlan wrote:
> On Thu, Apr 12, 2012 at 2:37 PM, Ben Finney<ben+python at benfinney.id.au>  wrote:
>> If the OP wants to specify different semantics, let's hear it.
> Whatever semantics were chosen, they would end up being confusing to *someone*.
>
> With prefix and suffix matching, the implicit OR is simple and
> obvious. The same can't be said for the replacement command,
> particularly if it can be used with unordered collections.
>
> Far better to leave this task to re.sub (which uses regex syntax to
> avoid ambiguity) or to explicit flow control and multiple invocations
> of replace().
>
> Cheers,
> Nick.
>
I rather like this proposal.
The semantics for s.replace(strings, replacementString) could be:
     'strings', if not a string, must be a tuple, for consistency with 
str.startswith
         (although I don't see why a list shouldn't be allowed for both).
     Scan s from left to right;
     whenever a match is found with any member of 'strings' (tested in 
the order specified by 'strings'), do the replacement.
     The replaced text is not eligible for further replacement.
But the real value for such proposals is not the complicated cases where 
the precise semantics matter,
but the convenience in simple cases (almost any language feature CAN be 
used in an obscure way), e.g.

def dequote(s):
     singlequote = "'"
     doublequote = '"'
     return s.replace((singlequote, doublequote), '')

+0.8
Rob Cliffe


From tshepang at gmail.com  Thu Apr 12 11:39:45 2012
From: tshepang at gmail.com (Tshepang Lekhonkhobe)
Date: Thu, 12 Apr 2012 11:39:45 +0200
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <87zkah1s7m.fsf@benfinney.id.au>
References: <87lim1g1r8.fsf@benfinney.id.au>
	<20120412040102.GA21051@cskk.homeip.net>
	<87zkah1s7m.fsf@benfinney.id.au>
Message-ID: <CAA77j2DPGOSGeA35EyJSNgWCbstZGDMrC99iz5U1YjjuP9iGig@mail.gmail.com>

On Thu, Apr 12, 2012 at 06:37, Ben Finney <ben+python at benfinney.id.au> wrote:
> Cameron Simpson <cs at zip.com.au> writes:
>
>> "At the same time" might imply something equivalent to the cited
>> "re.sub('foo|bar',...)" suggestion. And that is different to an iterated
>> "replace foo, then replace bar" if the possible matched overlap.
>
> Yes, it is; but the OP presented a proposal as though it were to have
> the same semantics as a sequence of replace operations.
>
> If the OP wants to specify different semantics, let's hear it.

You guys are thinking more deeply about this than I was. I don't even
see a difference between the 2:

>>> 'foo bar baz'.replace('foo', 'baz').replace('bar', 'baz') == re.sub('foo|bar', 'baz', 'foo bar baz')
True

I was not even thinking about ordering, but it would help to have it
to avoid confusion I think. The example I gave was just the closest I
could think of.


From songofacandy at gmail.com  Thu Apr 12 13:32:45 2012
From: songofacandy at gmail.com (INADA Naoki)
Date: Thu, 12 Apr 2012 20:32:45 +0900
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <CAA77j2DPGOSGeA35EyJSNgWCbstZGDMrC99iz5U1YjjuP9iGig@mail.gmail.com>
References: <87lim1g1r8.fsf@benfinney.id.au>
	<20120412040102.GA21051@cskk.homeip.net>
	<87zkah1s7m.fsf@benfinney.id.au>
	<CAA77j2DPGOSGeA35EyJSNgWCbstZGDMrC99iz5U1YjjuP9iGig@mail.gmail.com>
Message-ID: <CAEfz+TzCX9HqH+R0XCa6-VO+851stsOrx266Rds0cYvvwOcAEA@mail.gmail.com>

I want multiple replace at once. For example html escape looks like:

>>> "<>&".replace('<', '&lt;', '>', '&gt;', '&', '&amp;')
'&lt;&gt;&amp;'

or

>>> "<>&".replace( ('<', '&lt;'), ('>', '&gt;'), ('&', '&amp;') )
'&lt;&gt;&amp;'


On Thu, Apr 12, 2012 at 6:39 PM, Tshepang Lekhonkhobe
<tshepang at gmail.com> wrote:
> On Thu, Apr 12, 2012 at 06:37, Ben Finney <ben+python at benfinney.id.au> wrote:
>> Cameron Simpson <cs at zip.com.au> writes:
>>
>>> "At the same time" might imply something equivalent to the cited
>>> "re.sub('foo|bar',...)" suggestion. And that is different to an iterated
>>> "replace foo, then replace bar" if the possible matched overlap.
>>
>> Yes, it is; but the OP presented a proposal as though it were to have
>> the same semantics as a sequence of replace operations.
>>
>> If the OP wants to specify different semantics, let's hear it.
>
> You guys are thinking more deeply about this than I was. I don't even
> see a difference between the 2:
>
>>>> 'foo bar baz'.replace('foo', 'baz').replace('bar', 'baz') == re.sub('foo|bar', 'baz', 'foo bar baz')
> True
>
> I was not even thinking about ordering, but it would help to have it
> to avoid confusion I think. The example I gave was just the closest I
> could think of.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas



-- 
INADA Naoki <songofacandy at gmail.com>


From sven at marnach.net  Thu Apr 12 15:10:23 2012
From: sven at marnach.net (Sven Marnach)
Date: Thu, 12 Apr 2012 14:10:23 +0100
Subject: [Python-ideas] in str.replace(old, new),
 allow 'old' to accept a tuple
In-Reply-To: <CAEfz+TzCX9HqH+R0XCa6-VO+851stsOrx266Rds0cYvvwOcAEA@mail.gmail.com>
References: <87lim1g1r8.fsf@benfinney.id.au>
	<20120412040102.GA21051@cskk.homeip.net>
	<87zkah1s7m.fsf@benfinney.id.au>
	<CAA77j2DPGOSGeA35EyJSNgWCbstZGDMrC99iz5U1YjjuP9iGig@mail.gmail.com>
	<CAEfz+TzCX9HqH+R0XCa6-VO+851stsOrx266Rds0cYvvwOcAEA@mail.gmail.com>
Message-ID: <20120412131023.GF30763@bagheera>

INADA Naoki schrieb am Thu, 12. Apr 2012, um 20:32:45 +0900:
> >>> "<>&".replace( ('<', '&lt;'), ('>', '&gt;'), ('&', '&amp;') )
> '&lt;&gt;&amp;'

In current Python, it's

    >>> t = str.maketrans({"<": "&lt;", ">": "&gt;", "&": "&amp;"})
    >>> "<>&".translate(t)
    '&lt;&gt;&amp;'

Looks good enough for me.
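As an aside, for HTML escaping specifically the stdlib already ships a dedicated helper, html.escape (added in 3.2), so even translate isn't needed for that particular case:

```python
import html

# quote=False leaves " and ' alone, matching the three-character example
print(html.escape("<>&", quote=False))  # &lt;&gt;&amp;
```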

Cheers,
    Sven


From songofacandy at gmail.com  Thu Apr 12 15:17:30 2012
From: songofacandy at gmail.com (INADA Naoki)
Date: Thu, 12 Apr 2012 22:17:30 +0900
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <20120412131023.GF30763@bagheera>
References: <87lim1g1r8.fsf@benfinney.id.au>
	<20120412040102.GA21051@cskk.homeip.net>
	<87zkah1s7m.fsf@benfinney.id.au>
	<CAA77j2DPGOSGeA35EyJSNgWCbstZGDMrC99iz5U1YjjuP9iGig@mail.gmail.com>
	<CAEfz+TzCX9HqH+R0XCa6-VO+851stsOrx266Rds0cYvvwOcAEA@mail.gmail.com>
	<20120412131023.GF30763@bagheera>
Message-ID: <CAEfz+TwnzM3NEJNWCj-MqsDBhOcq7L+gPXxzFUxm_pqKaAoLTg@mail.gmail.com>

Oh, I didn't know that. Thank you.
But what about unescape? str.translate accepts only one character key.


On Thu, Apr 12, 2012 at 10:10 PM, Sven Marnach <sven at marnach.net> wrote:
> INADA Naoki schrieb am Thu, 12. Apr 2012, um 20:32:45 +0900:
>> >>> "<>&".replace( ('<', '&lt;'), ('>', '&gt;'), ('&', '&amp;') )
>> '&lt;&gt;&amp;'
>
> In current Python, it's
>
>     >>> t = str.maketrans({"<": "&lt;", ">": "&gt;", "&": "&amp;"})
>     >>> "<>&".translate(t)
>     '&lt;&gt;&amp;'
>
> Looks good enough for me.
>
> Cheers,
>     Sven



-- 
INADA Naoki <songofacandy at gmail.com>


From sven at marnach.net  Thu Apr 12 18:20:13 2012
From: sven at marnach.net (Sven Marnach)
Date: Thu, 12 Apr 2012 17:20:13 +0100
Subject: [Python-ideas] in str.replace(old, new),
 allow 'old' to accept a tuple
In-Reply-To: <CAEfz+TwnzM3NEJNWCj-MqsDBhOcq7L+gPXxzFUxm_pqKaAoLTg@mail.gmail.com>
References: <87lim1g1r8.fsf@benfinney.id.au>
	<20120412040102.GA21051@cskk.homeip.net>
	<87zkah1s7m.fsf@benfinney.id.au>
	<CAA77j2DPGOSGeA35EyJSNgWCbstZGDMrC99iz5U1YjjuP9iGig@mail.gmail.com>
	<CAEfz+TzCX9HqH+R0XCa6-VO+851stsOrx266Rds0cYvvwOcAEA@mail.gmail.com>
	<20120412131023.GF30763@bagheera>
	<CAEfz+TwnzM3NEJNWCj-MqsDBhOcq7L+gPXxzFUxm_pqKaAoLTg@mail.gmail.com>
Message-ID: <20120412162013.GG30763@bagheera>

INADA Naoki schrieb am Thu, 12. Apr 2012, um 22:17:30 +0900:
> Oh, I didn't know that. Thank you.
> But what about unescape? str.translate accepts only one character key.

You'd currently need to use the `re` module:

    >>> d = {"&amp;": "&", "&gt;": ">", "&lt;": "<"}
    >>> re.sub("|".join(d), lambda m: d[m.group()], "&lt;&gt;&amp;")
    '<>&'
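One caveat worth noting with that pattern: re alternation tries branches left to right, so if one key were a prefix of another (say a hypothetical "&g" alongside "&gt;"), the longer key should come first. Sorting by length makes that explicit:

```python
import re

d = {"&amp;": "&", "&gt;": ">", "&lt;": "<"}
# Longest keys first so no key is shadowed by a shorter prefix of it
pattern = "|".join(sorted(map(re.escape, d), key=len, reverse=True))
print(re.sub(pattern, lambda m: d[m.group()], "&lt;&gt;&amp;"))  # <>&
```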

Cheers,
    Sven


From songofacandy at gmail.com  Thu Apr 12 18:32:58 2012
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 13 Apr 2012 01:32:58 +0900
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <20120412162013.GG30763@bagheera>
References: <87lim1g1r8.fsf@benfinney.id.au>
	<20120412040102.GA21051@cskk.homeip.net>
	<87zkah1s7m.fsf@benfinney.id.au>
	<CAA77j2DPGOSGeA35EyJSNgWCbstZGDMrC99iz5U1YjjuP9iGig@mail.gmail.com>
	<CAEfz+TzCX9HqH+R0XCa6-VO+851stsOrx266Rds0cYvvwOcAEA@mail.gmail.com>
	<20120412131023.GF30763@bagheera>
	<CAEfz+TwnzM3NEJNWCj-MqsDBhOcq7L+gPXxzFUxm_pqKaAoLTg@mail.gmail.com>
	<20120412162013.GG30763@bagheera>
Message-ID: <CAEfz+Ty0eKx4r=p1MEy3sN=Wtj4kDHdJSPrsL5625gNhWX3gng@mail.gmail.com>

Yes, I know it.
But if str.replace() or str.translate() can do it, it is simpler and
faster than re.sub().


On Fri, Apr 13, 2012 at 1:20 AM, Sven Marnach <sven at marnach.net> wrote:
> INADA Naoki schrieb am Thu, 12. Apr 2012, um 22:17:30 +0900:
>> Oh, I didn't know that. Thank you.
>> But what about unescape? str.translate accepts only one character key.
>
> You'd currently need to use the `re` module:
>
>     >>> d = {"&amp;": "&", "&gt;": ">", "&lt;": "<"}
>     >>> re.sub("|".join(d), lambda m: d[m.group()], "&lt;&gt;&amp;")
>     '<>&'
>
> Cheers,
>     Sven



-- 
INADA Naoki <songofacandy at gmail.com>


From stefan_ml at behnel.de  Thu Apr 12 19:08:58 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 12 Apr 2012 19:08:58 +0200
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <CAEfz+Ty0eKx4r=p1MEy3sN=Wtj4kDHdJSPrsL5625gNhWX3gng@mail.gmail.com>
References: <87lim1g1r8.fsf@benfinney.id.au>
	<20120412040102.GA21051@cskk.homeip.net>
	<87zkah1s7m.fsf@benfinney.id.au>
	<CAA77j2DPGOSGeA35EyJSNgWCbstZGDMrC99iz5U1YjjuP9iGig@mail.gmail.com>
	<CAEfz+TzCX9HqH+R0XCa6-VO+851stsOrx266Rds0cYvvwOcAEA@mail.gmail.com>
	<20120412131023.GF30763@bagheera>
	<CAEfz+TwnzM3NEJNWCj-MqsDBhOcq7L+gPXxzFUxm_pqKaAoLTg@mail.gmail.com>
	<20120412162013.GG30763@bagheera>
	<CAEfz+Ty0eKx4r=p1MEy3sN=Wtj4kDHdJSPrsL5625gNhWX3gng@mail.gmail.com>
Message-ID: <jm727a$114$1@dough.gmane.org>

INADA Naoki, 12.04.2012 18:32:
> On Fri, Apr 13, 2012 at 1:20 AM, Sven Marnach wrote:
>> INADA Naoki schrieb am Thu, 12. Apr 2012, um 22:17:30 +0900:
>>> Oh, I didn't know that. Thank you.
>>> But what about unescape? str.translate accepts only one character key.
>>
>> You'd currently need to use the `re` module:
>>
>>    >>> d = {"&amp;": "&", "&gt;": ">", "&lt;": "<"}
>>    >>> re.sub("|".join(d), lambda m: d[m.group()], "&lt;&gt;&amp;")
>>    '<>&'
>
> Yes, I know it.
> But if str.replace() or str.translate() can do it, it is simpler and
> faster than re.sub().

Simpler, maybe, at least at the API level. But faster? Not necessarily. It
could use Aho-Corasick, but that means it needs to construct the search
graph on each call, which is fairly expensive. And str.replace() isn't the
right interface for anything but a one-shot operation if the intention is
to pass in a sequence of keywords.
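That suggests the natural workaround today: build the pattern once and reuse it across calls. A minimal sketch (MultiReplacer is a hypothetical name, not a proposed API):

```python
import re

class MultiReplacer:
    """Compile the alternation once, then reuse it for many inputs."""

    def __init__(self, mapping):
        self._mapping = mapping
        # Longest keys first so no key is shadowed by a shorter prefix of it
        self._pattern = re.compile(
            "|".join(sorted(map(re.escape, mapping), key=len, reverse=True)))

    def __call__(self, s):
        return self._pattern.sub(lambda m: self._mapping[m.group()], s)

esc = MultiReplacer({"<": "&lt;", ">": "&gt;", "&": "&amp;"})
print(esc("<>&"))  # &lt;&gt;&amp;
```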

Stefan



From greg.ewing at canterbury.ac.nz  Fri Apr 13 01:01:46 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 13 Apr 2012 11:01:46 +1200
Subject: [Python-ideas] in str.replace(old, new),
 allow 'old' to accept a tuple
In-Reply-To: <jm727a$114$1@dough.gmane.org>
References: <87lim1g1r8.fsf@benfinney.id.au>
	<20120412040102.GA21051@cskk.homeip.net>
	<87zkah1s7m.fsf@benfinney.id.au>
	<CAA77j2DPGOSGeA35EyJSNgWCbstZGDMrC99iz5U1YjjuP9iGig@mail.gmail.com>
	<CAEfz+TzCX9HqH+R0XCa6-VO+851stsOrx266Rds0cYvvwOcAEA@mail.gmail.com>
	<20120412131023.GF30763@bagheera>
	<CAEfz+TwnzM3NEJNWCj-MqsDBhOcq7L+gPXxzFUxm_pqKaAoLTg@mail.gmail.com>
	<20120412162013.GG30763@bagheera>
	<CAEfz+Ty0eKx4r=p1MEy3sN=Wtj4kDHdJSPrsL5625gNhWX3gng@mail.gmail.com>
	<jm727a$114$1@dough.gmane.org>
Message-ID: <4F875EDA.500@canterbury.ac.nz>

Stefan Behnel wrote:
> And str.replace() isn't the
> right interface for anything but a one-shot operation if the intention is
> to pass in a sequence of keywords.

So maybe a better approach would be to enhance maketrans so
that both keys and replacements can be more than one character
long?

Behind the scenes, it could build a DFA or whatever is needed
to do it efficiently.

-- 
Greg


From greg at krypto.org  Fri Apr 13 02:40:01 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Thu, 12 Apr 2012 17:40:01 -0700
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <CADiSq7dJaHWYgtgS+ztL2Psj9fd8F261JEW7gaXve33TP1Xw9A@mail.gmail.com>
References: <87lim1g1r8.fsf@benfinney.id.au>
	<20120412040102.GA21051@cskk.homeip.net>
	<87zkah1s7m.fsf@benfinney.id.au>
	<CADiSq7dJaHWYgtgS+ztL2Psj9fd8F261JEW7gaXve33TP1Xw9A@mail.gmail.com>
Message-ID: <CAGE7PNLO5OKSGup1ATS16_itQVRjZexXNAFCZs+rgbT5=QqwSw@mail.gmail.com>

On Wed, Apr 11, 2012 at 9:56 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On Thu, Apr 12, 2012 at 2:37 PM, Ben Finney <ben+python at benfinney.id.au>
> wrote:
> > If the OP wants to specify different semantics, let's hear it.
>
> Whatever semantics were chosen, they would end up being confusing to
> *someone*.
>
> With prefix and suffix matching, the implicit OR is simple and
> obvious. The same can't be said for the replacement command,
> particular if it can be used with unordered collections.
>
> Far better to leave this task to re.sub (which uses regex syntax to
> avoid ambiguity) or to explicit flow control and multiple invocations
> of replace().
>
>
Agreed.  Which is why I'm -1 on the proposed change to str.replace().

From songofacandy at gmail.com  Fri Apr 13 03:00:33 2012
From: songofacandy at gmail.com (INADA Naoki)
Date: Fri, 13 Apr 2012 10:00:33 +0900
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <jm727a$114$1@dough.gmane.org>
References: <87lim1g1r8.fsf@benfinney.id.au>
	<20120412040102.GA21051@cskk.homeip.net>
	<87zkah1s7m.fsf@benfinney.id.au>
	<CAA77j2DPGOSGeA35EyJSNgWCbstZGDMrC99iz5U1YjjuP9iGig@mail.gmail.com>
	<CAEfz+TzCX9HqH+R0XCa6-VO+851stsOrx266Rds0cYvvwOcAEA@mail.gmail.com>
	<20120412131023.GF30763@bagheera>
	<CAEfz+TwnzM3NEJNWCj-MqsDBhOcq7L+gPXxzFUxm_pqKaAoLTg@mail.gmail.com>
	<20120412162013.GG30763@bagheera>
	<CAEfz+Ty0eKx4r=p1MEy3sN=Wtj4kDHdJSPrsL5625gNhWX3gng@mail.gmail.com>
	<jm727a$114$1@dough.gmane.org>
Message-ID: <CAEfz+TyrY1bMohJQCLa+kf301jw=qwKScqvCD0xGW8beQ=7dFA@mail.gmail.com>

> Simpler, maybe, at least at the API level. But faster? Not necessarily. It
> could use Aho-Corasick, but that means it needs to construct the search
> graph on each call, which is fairly expensive.

You're right. But in simple situations, the overhead of making match objects
and calling a callback is more expensive. (ex. https://gist.github.com/2369648 )

I think chaining replace is not so bad for such simple cases.
So the problem is that there is no "one obvious way" to replace multiple keywords.


On Fri, Apr 13, 2012 at 2:08 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
> INADA Naoki, 12.04.2012 18:32:
>> On Fri, Apr 13, 2012 at 1:20 AM, Sven Marnach wrote:
>>> INADA Naoki schrieb am Thu, 12. Apr 2012, um 22:17:30 +0900:
>>>> Oh, I didn't know that. Thank you.
>>>> But what about unescape? str.translate accepts only one character key.
>>>
>>> You'd currently need to use the `re` module:
>>>
>>>     >>> d = {"&amp;": "&", "&gt;": ">", "&lt;": "<"}
>>>     >>> re.sub("|".join(d), lambda m: d[m.group()], "&lt;&gt;&amp;")
>>>     '<>&'
>>
>> Yes, I know it.
>> But if str.replace() or str.translate() can do it, it is simpler and
>> faster than re.sub().
>
> Simpler, maybe, at least at the API level. But faster? Not necessarily. It
> could use Aho-Corasick, but that means it needs to construct the search
> graph on each call, which is fairly expensive. And str.replace() isn't the
> right interface for anything but a one-shot operation if the intention is
> to pass in a sequence of keywords.
>
> Stefan
>



-- 
INADA Naoki <songofacandy at gmail.com>


From raymond.hettinger at gmail.com  Fri Apr 13 04:03:00 2012
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Thu, 12 Apr 2012 22:03:00 -0400
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <CAA77j2Bcds07k9_QaLL5fTXbQc3X6B88yXYNpnqfxLg81ixidw@mail.gmail.com>
References: <CAA77j2Bcds07k9_QaLL5fTXbQc3X6B88yXYNpnqfxLg81ixidw@mail.gmail.com>
Message-ID: <19BE1705-7076-4E1E-AF5A-7C4C99800032@gmail.com>


On Apr 11, 2012, at 4:35 PM, Tshepang Lekhonkhobe wrote:

> I find the fact that 'prefix' in str.startswith(prefix) accept a tuple
> quite useful. That's because one can do a match on more than one
> pattern at a time, without ugliness. Would it be a good idea to do the
> same for str.replace(old, new)?
> 
> before
>>>> 'foo bar baz'.replace('foo', 'baz').replace('bar', 'baz')
> baz baz baz
> 
> after
>>>> 'foo bar baz'.replace(('foo', 'bar'), 'baz')
> baz baz baz

It seems to me that it is a rare use case to want
to replace many things with a single replacement string.
I can't remember a single case of ever needing this.
The only thing that comes to mind is automated redaction.

What I have needed and have seen others need is a dictionary
based replace:     {'customer': 'client', 'headquarters': 'office', 'now': 'soon'}.
Even that case is fraught with peril -- I would want "now" to change
to "soon" but not have "snow" change to "ssoon".

In the end, I think what people want is to have the power
and control afforded by re.sub() but without having
to learn regular expressions.


Raymond

From stephen at xemacs.org  Fri Apr 13 05:14:51 2012
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 13 Apr 2012 12:14:51 +0900
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <19BE1705-7076-4E1E-AF5A-7C4C99800032@gmail.com>
References: <CAA77j2Bcds07k9_QaLL5fTXbQc3X6B88yXYNpnqfxLg81ixidw@mail.gmail.com>
	<19BE1705-7076-4E1E-AF5A-7C4C99800032@gmail.com>
Message-ID: <CAL_0O18TPBX3p6MRd_NbJX6nUmy2kdcaYafwKtTCWX=e-oj4=Q@mail.gmail.com>

On Fri, Apr 13, 2012 at 11:03 AM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:

> What I have needed and have seen others need is a dictionary
> based replace:     {'customer': 'client', 'headquarters': 'office', 'now':
> 'soon'}.
> Even that case is fraught with peril -- I would want "now" to change
> to "soon" but not have "snow" change to "ssoon".
>
> In the end, I think what people want is to have the power
> and control afforded by re.sub() but without having
> to learn regular expressions.

There is one very attractive special case, however, which is an
invertible translation like URL-escaping (or HTML-escaping), where at
least one side of the transform is single characters.  Then there is
no ambiguity.  Nevertheless, I think that case is special enough that
it may as well be done in the modules that deal with URLs and HTML
respectively.


From breamoreboy at yahoo.co.uk  Fri Apr 13 05:43:40 2012
From: breamoreboy at yahoo.co.uk (Mark Lawrence)
Date: Fri, 13 Apr 2012 04:43:40 +0100
Subject: [Python-ideas] in str.replace(old, new),
	allow 'old' to accept a tuple
In-Reply-To: <878vi13bwk.fsf@benfinney.id.au>
References: <CAA77j2Bcds07k9_QaLL5fTXbQc3X6B88yXYNpnqfxLg81ixidw@mail.gmail.com>
	<87lim1g1r8.fsf@benfinney.id.au>
	<F35470A2-84EB-4885-89B3-EDD87C355D16@gmail.com>
	<878vi13bwk.fsf@benfinney.id.au>
Message-ID: <jm87d5$706$1@dough.gmane.org>

On 12/04/2012 03:46, Ben Finney wrote:

> And if they should be done in order, then that order should be explicit.
> I think the existing solution helps with that.
>

Something along the lines of

 >>> mystr = 'foo bar baz'
 >>> for old in 'foo', 'bar':
...     mystr = mystr.replace(old, 'baz')
...
 >>> mystr
'baz baz baz'

Or can this be simplified with the Python Swiss Army Knife aka the 
itertools module? :)
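Not itertools, but functools.reduce can fold that loop into a single expression with the same semantics (a sketch):

```python
import functools

# Fold the replacements left to right: each step replaces one old string
result = functools.reduce(lambda s, old: s.replace(old, 'baz'),
                          ('foo', 'bar'), 'foo bar baz')
print(result)  # baz baz baz
```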

-- 
Cheers.

Mark Lawrence.



From xorninja at gmail.com  Fri Apr 13 17:41:21 2012
From: xorninja at gmail.com (Itzik Kotler)
Date: Fri, 13 Apr 2012 18:41:21 +0300
Subject: [Python-ideas] Pythonect 0.1.0 Release
In-Reply-To: <CAA+RL7Fd7L+5K8Z+RqrboVS59y3n9OM4vZK3TTgKf=HV26=Fmw@mail.gmail.com>
References: <CAD-_V752mgDPUcy61o9+hf060rBZdPXS_nTLauYf_39y3auFow@mail.gmail.com>
	<CAA+RL7Fd7L+5K8Z+RqrboVS59y3n9OM4vZK3TTgKf=HV26=Fmw@mail.gmail.com>
Message-ID: <CAD-_V77vUvbicTx3KK8AfqbMhGYfzXAk7qVDce3+DRYMnCoCGw@mail.gmail.com>

Hi,

I have just committed PEP8 fixes to Pythonect (
https://github.com/ikotler/pythonect).
And, I also made a Pythonect Tutorial: Learn By Example
<https://github.com/ikotler/pythonect/wiki/Pythonect-Tutorial:-Learn-By-Example>

Regards,
Itzik Kotler | http://www.ikotler.org

On Sun, Apr 1, 2012 at 6:58 PM, Jakob Bowyer <jkbbwr at gmail.com> wrote:

> You might want to PEP8 your code, move imports to the top, and lose some
> of the unneeded lines.
>
> On Sun, Apr 1, 2012 at 3:34 PM, Itzik Kotler <xorninja at gmail.com> wrote:
> > Hi All,
> >
> > I'm pleased to announce the first beta release of Pythonect interpreter.
> >
> > Pythonect is a new, experimental, general-purpose dataflow programming
> > language based on Python.
> >
> > It aims to combine the intuitive feel of shell scripting (and all of its
> > perks like implicit parallelism) with the flexibility and agility of
> Python.
> >
> > Pythonect interpreter (and reference implementation) is written in
> Python,
> > and is available under the BSD license.
> >
> > Here's a quick tour of Pythonect:
> >
> > The canonical "Hello, world" example program in Pythonect:
> >
> >>>> "Hello World" -> print
> > <MainProcess:Thread-1> : Hello World
> > Hello World
> >>>>
> >
> > '->' and '|' are both Pythonect operators.
> >
> > The pipe operator (i.e. '|') passes one item at a time, while the other
> > operator passes all items at once.
> >
> >
> > Python statements and other None-returning functions act as a
> > pass-through:
> >
> >>>> "Hello World" -> print -> print
> > <MainProcess:Thread-2> : Hello World
> > <MainProcess:Thread-2> : Hello World
> > Hello World
> >>>>
> >
> >>>> 1 -> import math -> math.log
> > 0.0
> >>>>
> >
> >
> > Parallelization in Pythonect:
> >
> >>>> "Hello World" -> [ print , print ]
> > <MainProcess:Thread-4> : Hello World
> > <MainProcess:Thread-5> : Hello World
> > ['Hello World', 'Hello World']
> >
> >>>> range(0,3) -> import math -> math.sqrt
> > [0.0, 1.0, 1.4142135623730951]
> >>>>
> >
> > In the future, I am planning on adding support for multi-processing, and
> > even distributed computing.
> >
> >
> > The '_' identifier allows access to the current item:
> >
> >>>> "Hello World" -> [ print , print ] -> _ + " and Python"
> > <MainProcess:Thread-7> : Hello World
> > <MainProcess:Thread-8> : Hello World
> > ['Hello World and Python', 'Hello World and Python']
> >>>>
> >
> >>>> [ 1 , 2 ] -> _**_
> > [1, 4]
> >>>>
> >
> >
> > True/False return values as filters:
> >
> >>>> "Hello World" -> _ == "Hello World" -> print
> > <MainProcess:Thread-9> : Hello World
> >>>>
> >
> >>>> "Hello World" -> _ == "Hello World1" -> print
> > False
> >>>>
> >
> >>>> range(1,10) -> _ % 2 == 0
> > [2, 4, 6, 8]
> >>>>
> >
> >
> > Last but not least, I have also added extra syntax for making remote
> > procedure calls easy:
> >
> >>> 1 -> inc@xmlrpc://localhost:8000 -> print
> > <MainProcess:Thread-2> : 2
> > 2
> >>>>
> >
> > Download Pythonect v0.1.0 from:
> > http://github.com/downloads/ikotler/pythonect/Pythonect-0.1.0.tar.gz
> >
> > More information can be found at: http://www.pythonect.org
> >
> >
> > I will appreciate any input / feedback that you can give me.
> >
> > Also, for those interested in working on the project, I'm actively
> > interested in welcoming and supporting both new developers and new users.
> > Feel free to contact me.
> >
> >
> > Regards,
> > Itzik Kotler | http://www.ikotler.org
> >
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > http://mail.python.org/mailman/listinfo/python-ideas
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120413/d3a9d806/attachment.html>

From maxmoroz at gmail.com  Wed Apr 18 03:23:10 2012
From: maxmoroz at gmail.com (Max Moroz)
Date: Tue, 17 Apr 2012 18:23:10 -0700
Subject: [Python-ideas] Providing a guarantee that instances of user-defined
 classes have distinct identities
In-Reply-To: <CAOVPiMgRRMv4iYg90LoGoE9xbw10f-0Zw5Ln8o_raYP_4YcO+A@mail.gmail.com>
References: <CAOVPiMgRRMv4iYg90LoGoE9xbw10f-0Zw5Ln8o_raYP_4YcO+A@mail.gmail.com>
Message-ID: <CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>

Suppose I model a game of cards, where suits don't matter. I might find the
integer representation of cards (14 for "Ace", 13 for "King", ..., 2 for
"2") to be convenient.

The deck has 4 copies of each card, which I need to distinguish.

I was thinking to model a card as follows:

class Card(int):
    __hash__ = int.__hash__
    def __eq__(self, other):
        return self is other

This works precisely as I want (at least in CPython 3.2):

x = Card(14)
y = Card(14)
assert x != y # x and y are two different Aces
z = x
assert x == z # x and z are bound to the same Ace

But this behavior is implementation dependent, so the above code may one
day break (very painfully for whoever happens to maintain it at the time).

Is it possible to add a guarantee to the language that would make the above
code safe to use? Currently the language promises:

"For immutable types, operations that compute new values may actually
return a reference to any existing object with the same type and value,
while for mutable objects this is not allowed."
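The caveat in that passage is easy to demonstrate in CPython, where small integers are cached (an implementation detail, not a language guarantee):

```python
# CPython caches small integers, so equal immutable values may share identity.
a, b = 256, 256
print(a is b)       # True in CPython (cached small int)

# For values built at runtime, CPython happens to create distinct objects,
# but the language reserves the right to reuse one here as well.
x, y = int('500'), int('500')
print(x == y)       # True: equal values, whatever their identity
```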

Nowhere in the documentation is it clearly defined which objects are
considered "immutable" for the purpose of this promise. As a result, a
Python implementation, now or in the future, may decide that it's ok to
return a reference to an existing object when a Card instance is created -
since arguably, class Card is immutable (since it derives from an immutable
base class, and doesn't add any new attributes).

Perhaps a change like this would be fine (it obviously won't break any
existing code):

"For certain types, operations that compute new values may actually return
a reference to an existing object with the same type and value. The only
types for which this may happen are:

- built-in immutable types
- user-defined classes that explicitly override __new__ to return a
reference to an existing object

Note that a user-defined class that inherits from a built-in immutable
types, without overriding __new__, will not exhibit this behavior."
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120417/50d540ff/attachment.html>

From ncoghlan at gmail.com  Wed Apr 18 03:44:47 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 18 Apr 2012 11:44:47 +1000
Subject: [Python-ideas] Providing a guarantee that instances of
 user-defined classes have distinct identities
In-Reply-To: <CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>
References: <CAOVPiMgRRMv4iYg90LoGoE9xbw10f-0Zw5Ln8o_raYP_4YcO+A@mail.gmail.com>
	<CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>
Message-ID: <CADiSq7ciLgt-DXxX5_-tUs_NhdhP6pBKbijEeFGvuA8TLpF_rA@mail.gmail.com>

On Wed, Apr 18, 2012 at 11:23 AM, Max Moroz <maxmoroz at gmail.com> wrote:
> "For immutable types, operations that compute new values may actually return
> a reference to any existing object with the same type and value, while for
> mutable objects this is not allowed."
>
> Nowhere in the documentation is it clearly defined which objects are
> considered "immutable" for the purpose of this promise. As a result, a
> Python implementation, now or in the future, may decide that it's ok to
> return a reference to an existing object when a Card instance is created -
> since arguably, class Card is immutable (since it derives from an immutable
> base class, and doesn't add any new attributes).

It's up to the objects themselves (and their metaclasses) - any such
optimisation must be implemented in cls.__new__ or metacls.__call__.

So, no, you're not going to get a stronger guarantee than is already
in place (and you'd be better off just writing Card properly -
inheriting from int for an object that should model a "value, suit"
2-tuple is a bad idea; using collections.namedtuple would be a much
better option).
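Nick's suggestion might look like this (the field names are illustrative, not from the original mail):

```python
from collections import namedtuple

# Sketch of the namedtuple approach: model a card as an explicit
# (rank, suit) pair instead of subclassing int.
Card = namedtuple('Card', ['rank', 'suit'])

ace_of_spades = Card(rank=14, suit='spades')
ace_of_hearts = Card(rank=14, suit='hearts')
print(ace_of_spades == ace_of_hearts)   # False: the suits differ
print(ace_of_spades.rank)               # 14
```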

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From simon.sapin at kozea.fr  Wed Apr 18 07:26:06 2012
From: simon.sapin at kozea.fr (Simon Sapin)
Date: Wed, 18 Apr 2012 07:26:06 +0200
Subject: [Python-ideas] Providing a guarantee that instances of
 user-defined classes have distinct identities
In-Reply-To: <CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>
References: <CAOVPiMgRRMv4iYg90LoGoE9xbw10f-0Zw5Ln8o_raYP_4YcO+A@mail.gmail.com>
	<CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>
Message-ID: <4F8E506E.7040606@kozea.fr>

On 18/04/2012 03:23, Max Moroz wrote:
> Nowhere in the documentation is it clearly defined which objects are
> considered "immutable" for the purpose of this promise. As a result, a
> Python implementation, now or in the future, may decide that it's ok to
> return a reference to an existing object when a Card instance is created
> - since arguably, class Card is immutable (since it derives from an
> immutable base class, and doesn't add any new attributes).

Hi,

I agree that the definition of "immutable" is not very clear, but I
don't think that your Card class is immutable.

As Card inherits without __slots__, it gets a __dict__ and can hold 
arbitrary attributes. Even if none of its methods do so, this is 
perfectly okay:

x = Card(14)
y = Card(14)
x.foo = 42
y.foo  # AttributeError

Because of their __dict__, Card objects can never be considered immutable.
(Now, I'm not sure what would happen with an empty __slots__.)
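For what it's worth, a quick sketch of what an empty __slots__ does in CPython (no per-instance __dict__, so the attribute assignment above fails):

```python
class SlottedCard(int):
    __slots__ = ()   # no per-instance __dict__, so no arbitrary attributes

x = SlottedCard(14)
try:
    x.foo = 42
except AttributeError as exc:
    print('AttributeError:', exc)
```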

Regards,
-- 
Simon Sapin


From maxmoroz at gmail.com  Wed Apr 18 11:57:37 2012
From: maxmoroz at gmail.com (Max Moroz)
Date: Wed, 18 Apr 2012 02:57:37 -0700
Subject: [Python-ideas] Providing a guarantee that instances of
 user-defined classes have distinct identities
In-Reply-To: <4F8E506E.7040606@kozea.fr>
References: <CAOVPiMgRRMv4iYg90LoGoE9xbw10f-0Zw5Ln8o_raYP_4YcO+A@mail.gmail.com>
	<CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>
	<4F8E506E.7040606@kozea.fr>
Message-ID: <CAOVPiMinTXt2a8WKFWBGADGdhtpyzTYa_pkoYtwpY39NXduKdQ@mail.gmail.com>

Simon Sapin <simon.sapin at kozea.fr> wrote:
> I agree that the definition of "immutable" is not very clear, but I don?t
> think that your Card class is immutable.

I shouldn't have used the word "immutable"; it is a bit confusing and
distracts from my real concern.

I'm really just trying to get a guarantee from the language that would
make my original code safe. As is, it relies on a very reasonable, but
undocumented, assumption about the behavior of built-in classes'
__new__ method

The exact guarantee I need is: "Any built-in class' __new__ method
called with the cls argument set to a user-defined subclass, will
always return a new instance of type cls."

(Emphasis on "new instance" - as opposed to a reference to an existing object.)

Nick Coghlan <ncoghlan at gmail.com> wrote:
> and you'd be better off just writing Card properly -
> inheriting from int for an object that should model a "value, suit"
> 2-tuple is a bad idea. Using collections.namedtuple would be a much
> better option.

My example might have been poor. Still, I have use cases for objects
nearly identical to `int`, `tuple`, etc., but where I want to
distinguish two objects created at different times (place them both in
a set, compare them as unequal, etc.).

Thanks for your comments.

Max


From mwm at mired.org  Wed Apr 18 15:19:17 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 18 Apr 2012 09:19:17 -0400
Subject: [Python-ideas] Providing a guarantee that instances of
 user-defined classes have distinct identities
In-Reply-To: <CADiSq7ciLgt-DXxX5_-tUs_NhdhP6pBKbijEeFGvuA8TLpF_rA@mail.gmail.com>
References: <CAOVPiMgRRMv4iYg90LoGoE9xbw10f-0Zw5Ln8o_raYP_4YcO+A@mail.gmail.com>
	<CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>
	<CADiSq7ciLgt-DXxX5_-tUs_NhdhP6pBKbijEeFGvuA8TLpF_rA@mail.gmail.com>
Message-ID: <20120418091917.0e201797@bhuda.mired.org>

On Wed, 18 Apr 2012 11:44:47 +1000
Nick Coghlan <ncoghlan at gmail.com> wrote:

> On Wed, Apr 18, 2012 at 11:23 AM, Max Moroz <maxmoroz at gmail.com> wrote:
> > "For immutable types, operations that compute new values may actually return
> > a reference to any existing object with the same type and value, while for
> > mutable objects this is not allowed."
> >
> > Nowhere in the documentation is it clearly defined which objects are
> > considered "immutable" for the purpose of this promise. As a result, a
> > Python implementation, now or in the future, may decide that it's ok to
> > return a reference to an existing object when a Card instance is created -
> > since arguably, class Card is immutable (since it derives from an immutable
> > base class, and doesn't add any new attributes).
> 
> It's up to the objects themselves (and their metaclasses) - any such
> optimisation must be implemented in cls.__new__ or metacls.__call__.

First, the Python docs don't clearly tell you what objects are
immutable because, well, it's an extensible language. With that
constraint, the best you can do about that is what it says not far
above the section you quoted:

      An object's mutability is determined by its type; for instance,
      numbers, strings and tuples are immutable, while dictionaries
      and lists are mutable.

I.e., mutability is determined by an object's type, plus a list of
built-in types that are immutable.

> So, no, you're not going to get a stronger guarantee than is already
> in place

I believe that this guarantee is strong enough to ensure that
classes that inherit from immutable types won't share values unless
the class code does something to make that happen. The type of such an
object is *not* the type that it inherits from; it's a Python class
type. As demonstrated, such classes aren't immutable, so Python needs
to make different instances different objects even if they share the
same value. If you want behavior different from that, the class or
metaclass has to provide it.

    <mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From sven at marnach.net  Wed Apr 18 14:55:15 2012
From: sven at marnach.net (Sven Marnach)
Date: Wed, 18 Apr 2012 13:55:15 +0100
Subject: [Python-ideas] Providing a guarantee that instances of
 user-defined classes have distinct identities
In-Reply-To: <CAOVPiMinTXt2a8WKFWBGADGdhtpyzTYa_pkoYtwpY39NXduKdQ@mail.gmail.com>
References: <CAOVPiMgRRMv4iYg90LoGoE9xbw10f-0Zw5Ln8o_raYP_4YcO+A@mail.gmail.com>
	<CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>
	<4F8E506E.7040606@kozea.fr>
	<CAOVPiMinTXt2a8WKFWBGADGdhtpyzTYa_pkoYtwpY39NXduKdQ@mail.gmail.com>
Message-ID: <20120418125515.GN30763@bagheera>

Max Moroz wrote on Wed, 18 Apr 2012, 02:57:37 -0700:
> I'm really just trying to get a guarantee from the language that would
> make my original code safe. As is, it relies on a very reasonable, but
> undocumented, assumption about the behavior of built-in classes'
> __new__ method.

Simon's point is that your current code *is* safe since your instances
are not immutable.

> The exact guarantee I need is: "Any built-in class' __new__ method
> called with the cls argument set to a user-defined subclass, will
> always return a new instance of type cls."

As long as your class does not set `__slots__` to an empty sequence,
you already have this guarantee, since your type is not immutable.
And while the current documentation might suggest that built-in types
would be allowed to check for empty `__slots__` and reuse already
created instances of a subclass in that case, it's very unlikely they
will ever implement such a mechanism.

So just don't define `__slots__` if you want this kind of guarantee,
or even better, add an ID to your instances to make the differences
you rely on explicit.

Cheers,
    Sven


From steve at pearwood.info  Wed Apr 18 15:22:55 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 18 Apr 2012 23:22:55 +1000
Subject: [Python-ideas] Providing a guarantee that instances of
 user-defined classes have distinct identities
In-Reply-To: <CAOVPiMinTXt2a8WKFWBGADGdhtpyzTYa_pkoYtwpY39NXduKdQ@mail.gmail.com>
References: <CAOVPiMgRRMv4iYg90LoGoE9xbw10f-0Zw5Ln8o_raYP_4YcO+A@mail.gmail.com>	<CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>	<4F8E506E.7040606@kozea.fr>
	<CAOVPiMinTXt2a8WKFWBGADGdhtpyzTYa_pkoYtwpY39NXduKdQ@mail.gmail.com>
Message-ID: <4F8EC02F.3090009@pearwood.info>

Max Moroz wrote:

> I'm really just trying to get a guarantee from the language that would
> make my original code safe. As is, it relies on a very reasonable, but
> undocumented, assumption about the behavior of built-in classes'
> __new__ method
> 
> The exact guarantee I need is: "Any built-in class' __new__ method
> called with the cls argument set to a user-defined subclass, will
> always return a new instance of type cls."
> 
> (Emphasis on "new instance" - as opposed to a reference to an existing object.)

I can't help feeling that you are worrying about nothing. Why would a built-in
class ever return an existing instance of a sub-class? While technically it
would be possible, it would require the built-in class to keep a cache of
instances for each subclass. Who is going to do that, and why would they bother?

It seems to me that you're asking for a guarantee like:

"Calling len() on a list will never randomly shuffle the list as a side-effect."

The fact that len() doesn't shuffle the list as a side-effect is not a 
documented promise of the language. But does it have to be? Some things would 
be just stupid for any implementation to do. There is no limit to the number 
of stupid things an implementation might do, and for the language to specify 
that it doesn't do any of them is impossible.

I think that __new__ returning an existing instance of a subclass would be one 
of those stupid things. After all, it is a *constructor*, it is supposed to 
construct a new instance, if it doesn't do so in the case of being called with 
a subclass argument it isn't living up to the implied contract.

I guess what this comes down to is that I'm quite satisfied with the implied
promise that constructors will construct new instances and don't think it is
necessary to make that explicit. It's not so much that I object to your
request as that I think it's unnecessary.



-- 
Steven


From maxmoroz at gmail.com  Wed Apr 18 18:49:43 2012
From: maxmoroz at gmail.com (Max Moroz)
Date: Wed, 18 Apr 2012 09:49:43 -0700
Subject: [Python-ideas] Providing a guarantee that instances of
 user-defined classes have distinct identities
In-Reply-To: <4F8EC02F.3090009@pearwood.info>
References: <CAOVPiMgRRMv4iYg90LoGoE9xbw10f-0Zw5Ln8o_raYP_4YcO+A@mail.gmail.com>
	<CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>
	<4F8E506E.7040606@kozea.fr>
	<CAOVPiMinTXt2a8WKFWBGADGdhtpyzTYa_pkoYtwpY39NXduKdQ@mail.gmail.com>
	<4F8EC02F.3090009@pearwood.info>
Message-ID: <CAOVPiMgbMYDCB+OGVbmXSQhuom9Z5=SnZGQkRf1HkuQR-cz+aw@mail.gmail.com>

After reading the comments, and especially the one below, I am now
persuaded that this implied guarantee is sufficient.

Part of the problem was that I didn't have a clear picture of what
__new__ is supposed to do when called with a (proper) subclass
argument. Now I (hopefully correctly) understand that a well-behaved
__new__ should in this case simply pass the call to object.__new__, or
at least do something very similar: the subclass has every right to
expect this behavior to remain unchanged whether or not one of its
parent classes defined a custom __new__. No explicit guarantee is
needed to confirm that this is what built-in classes do.
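A sketch of that convention: an interning __new__ that reuses instances only for the exact class, and always builds fresh objects for subclasses (class names are illustrative):

```python
class Interned:
    _cache = {}

    def __new__(cls, value):
        # Reuse instances only for the base class itself; subclasses
        # always get a genuinely new object, matching the convention
        # described above.
        if cls is Interned and value in cls._cache:
            return cls._cache[value]
        self = super().__new__(cls)
        self.value = value
        if cls is Interned:
            cls._cache[value] = self
        return self

class Sub(Interned):
    pass

print(Interned(1) is Interned(1))   # True: interned
print(Sub(1) is Sub(1))             # False: subclass gets fresh instances
```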

Steven D'Aprano wrote:
> "Calling len() on a list will never randomly shuffle the list as a
> side-effect."
>
> The fact that len() doesn't shuffle the list as a side-effect is not a
> documented promise of the language. But does it have to be? Some things
> would be just stupid for any implementation to do. There is no limit to the
> number of stupid things an implementation might do, and for the language to
> specify that it doesn't do any of them is impossible.


From sven at marnach.net  Thu Apr 19 12:35:13 2012
From: sven at marnach.net (Sven Marnach)
Date: Thu, 19 Apr 2012 11:35:13 +0100
Subject: [Python-ideas] Providing a guarantee that instances of
 user-defined classes have distinct identities
In-Reply-To: <4F8EC02F.3090009@pearwood.info>
References: <CAOVPiMgRRMv4iYg90LoGoE9xbw10f-0Zw5Ln8o_raYP_4YcO+A@mail.gmail.com>
	<CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>
	<4F8E506E.7040606@kozea.fr>
	<CAOVPiMinTXt2a8WKFWBGADGdhtpyzTYa_pkoYtwpY39NXduKdQ@mail.gmail.com>
	<4F8EC02F.3090009@pearwood.info>
Message-ID: <20120419103513.GO30763@bagheera>

Steven D'Aprano wrote on Wed, 18 Apr 2012, 23:22:55 +1000:
> I can't help feel that you are worrying about nothing. Why would a
> built-in class ever return an existing instance of a sub-class?
> While technically it would be possible, it would require the
> built-in class to keep a cache of instances for each subclass.

This was also my first reaction; there is one case, though, which you
wouldn't need a cache for:  if the constructor is called with an
instance of the subclass as an argument.  As an example, the tuple
implementation does not have a cache of instances, and reuses only
tuples that are directly passed to the constructor:

    >>> a = 1, 2
    >>> b = 1, 2
    >>> a is b
    False
    >>> b = tuple(a)
    >>> a is b
    True

It wouldn't be completely unthinkable that a Python implementation
chooses to extend this behaviour to immutable subclasses of immutable
types.  I don't think there is any reason to disallow such an
implementation either.
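Note that, at least in current CPython, this pass-through already stops at subclasses: tuple() reuses only exact tuples, and builds a fresh plain tuple from a subclass instance. A quick check:

```python
class MyTuple(tuple):
    pass

a = (1, 2)
print(tuple(a) is a)      # True in CPython: exact tuples are passed through

m = MyTuple((1, 2))
t = tuple(m)
print(t is m)             # False: a new plain tuple is built
print(type(t) is tuple)   # True
```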

Cheers,
    Sven


From mwm at mired.org  Thu Apr 19 14:58:28 2012
From: mwm at mired.org (Mike Meyer)
Date: Thu, 19 Apr 2012 08:58:28 -0400
Subject: [Python-ideas] Providing a guarantee that instances of
 user-defined classes have distinct identities
In-Reply-To: <20120419103513.GO30763@bagheera>
References: <CAOVPiMgRRMv4iYg90LoGoE9xbw10f-0Zw5Ln8o_raYP_4YcO+A@mail.gmail.com>
	<CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>
	<4F8E506E.7040606@kozea.fr>
	<CAOVPiMinTXt2a8WKFWBGADGdhtpyzTYa_pkoYtwpY39NXduKdQ@mail.gmail.com>
	<4F8EC02F.3090009@pearwood.info> <20120419103513.GO30763@bagheera>
Message-ID: <20120419085828.6c58185d@bhuda.mired.org>

On Thu, 19 Apr 2012 11:35:13 +0100
Sven Marnach <sven at marnach.net> wrote:
> It wouldn't be completely unthinkable that a Python implementation
> chooses to extend this behaviour to immutable subclasses of immutable
> types.  I don't think there is any reason to disallow such an
> implementation either.

How would the implementation determine that the subclass was
immutable?

	<mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From simon.sapin at kozea.fr  Thu Apr 19 15:31:02 2012
From: simon.sapin at kozea.fr (Simon Sapin)
Date: Thu, 19 Apr 2012 15:31:02 +0200
Subject: [Python-ideas] Providing a guarantee that instances of
 user-defined classes have distinct identities
In-Reply-To: <20120419085828.6c58185d@bhuda.mired.org>
References: <CAOVPiMgRRMv4iYg90LoGoE9xbw10f-0Zw5Ln8o_raYP_4YcO+A@mail.gmail.com>
	<CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>
	<4F8E506E.7040606@kozea.fr>
	<CAOVPiMinTXt2a8WKFWBGADGdhtpyzTYa_pkoYtwpY39NXduKdQ@mail.gmail.com>
	<4F8EC02F.3090009@pearwood.info> <20120419103513.GO30763@bagheera>
	<20120419085828.6c58185d@bhuda.mired.org>
Message-ID: <4F901396.5000506@kozea.fr>

On 19/04/2012 14:58, Mike Meyer wrote:
> On Thu, 19 Apr 2012 11:35:13 +0100
> Sven Marnach<sven at marnach.net>  wrote:
>> >  It wouldn't be completely unthinkable that a Python implementation
>> >  chooses to extend this behaviour to immutable subclasses of immutable
>> >  types.  I don't think there is any reason to disallow such an
>> >  implementation either.
> How would the implementation determine that the subclass was
> immutable?

When all classes in the MRO except 'object' have an empty __slots__?

(object behaves just as if it had an empty __slots__, but accessing 
object.__slots__ raises an AttributeError.)
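Simon's rule could be sketched as a function like the following. It is only a heuristic (built-in bases such as int don't define __slots__ at all, so they would fail this check), and the function name is illustrative:

```python
def looks_immutable(cls):
    # Every class in the MRO except object must declare an empty
    # __slots__; object itself behaves as if its __slots__ were empty.
    for c in cls.__mro__:
        if c is object:
            continue
        slots = getattr(c, '__slots__', None)
        if slots is None or len(tuple(slots)) != 0:
            return False
    return True

class Frozen:
    __slots__ = ()

class Loose:
    pass

print(looks_immutable(Frozen))  # True
print(looks_immutable(Loose))   # False
```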

-- 
Simon Sapin


From amauryfa at gmail.com  Thu Apr 19 15:48:42 2012
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Thu, 19 Apr 2012 15:48:42 +0200
Subject: [Python-ideas] Providing a guarantee that instances of
 user-defined classes have distinct identities
In-Reply-To: <4F901396.5000506@kozea.fr>
References: <CAOVPiMgRRMv4iYg90LoGoE9xbw10f-0Zw5Ln8o_raYP_4YcO+A@mail.gmail.com>
	<CAOVPiMgB518p3AreNd6wytYLsRpkUjeUcs10A4C1Vfa1J4nPPQ@mail.gmail.com>
	<4F8E506E.7040606@kozea.fr>
	<CAOVPiMinTXt2a8WKFWBGADGdhtpyzTYa_pkoYtwpY39NXduKdQ@mail.gmail.com>
	<4F8EC02F.3090009@pearwood.info> <20120419103513.GO30763@bagheera>
	<20120419085828.6c58185d@bhuda.mired.org>
	<4F901396.5000506@kozea.fr>
Message-ID: <CAGmFidZ1UFoAmOyGJao0=5nw3baXX58mqd0qj+HJQGze8JRJ6Q@mail.gmail.com>

2012/4/19 Simon Sapin <simon.sapin at kozea.fr>

> On 19/04/2012 14:58, Mike Meyer wrote:
>
>  On Thu, 19 Apr 2012 11:35:13 +0100
>> Sven Marnach<sven at marnach.net>  wrote:
>>
>>> >  It wouldn't be completely unthinkable that a Python implementation
>>> >  chooses to extend this behaviour to immutable subclasses of immutable
>>> >  types.  I don't think there is any reason to disallow such an
>>> >  implementation either.
>>>
>> How would the implementation determine that the subclass was
>> immutable?
>>
>
> When all classes in the MRO except 'object' have an empty __slots__?
>

But you can still change the __class__ of an object, even immutable ones.
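For instance, even between two classes with empty __slots__, CPython permits the reassignment when the instance layouts are compatible (class names are illustrative):

```python
class A:
    __slots__ = ()

class B:
    __slots__ = ()
    def label(self):
        return 'B'

obj = A()
obj.__class__ = B    # allowed: the instance layouts are compatible
print(obj.label())   # B
```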

-- 
Amaury Forgeot d'Arc
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120419/d869b07a/attachment.html>

From contact at xavierho.com  Fri Apr 20 14:32:55 2012
From: contact at xavierho.com (Xavier Ho)
Date: Fri, 20 Apr 2012 22:32:55 +1000
Subject: [Python-ideas] Have dict().update() return its own reference.
Message-ID: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>

Hello,

What's the rationale behind the fact that `dict().update()` returns nothing?
If it returned the dictionary reference, at least we could chain methods,
or assign it to another variable, or pass it into a function, etc.

What was the design decision behind this?

Cheers,
Xav
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120420/87bfdc2a/attachment.html>

From contact at xavierho.com  Fri Apr 20 14:37:45 2012
From: contact at xavierho.com (Xavier Ho)
Date: Fri, 20 Apr 2012 22:37:45 +1000
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
Message-ID: <CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>

Thanks, that's fair, for consistency.

One use case for my question was a Stack Overflow question regarding merging
two dicts.  If update() returned its own reference, and if we explicitly
wanted a copy (instead of an in-place modification), we could have used

    dict(x).update(y)

given x and y are both dict() instances.
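As things stand, that expression evaluates to None; the working copy-then-merge spelling needs two statements:

```python
x = {'a': 1}
y = {'b': 2}

# update() mutates in place and returns None, so this cannot work today:
print(dict(x).update(y))   # None

# The current copy-then-merge spelling:
merged = dict(x)
merged.update(y)
print(merged)              # {'a': 1, 'b': 2}
```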

Cheers,
Xav



On 20 April 2012 22:35, Laurens Van Houtven <_ at lvh.cc> wrote:

> As a general rule, methods/functions in Python either *mutate* or
> *return*. (Obviously, mutating methods also return, they just return None)
>
> For example: random.shuffle shuffles in place so doesn't return anything
> list.sort sorts in place so doesn't return anything
> sorted creates a new sorted thing, so returns that sorted thing
>
> cheers
> lvh
>
>
>
> On 20 Apr 2012, at 14:32, Xavier Ho wrote:
>
> > Hello,
> >
> > What's the rationale behind the fact that `dict().update()` returns
> nothing?  If it returned the dictionary reference, at least we could chain
> methods, or assign it to another variable, or pass it into a function, etc..
> >
> > What's the design decision made behind this?
> >
> > Cheers,
> > Xav
> >
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > http://mail.python.org/mailman/listinfo/python-ideas
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120420/43d93765/attachment.html>

From masklinn at masklinn.net  Fri Apr 20 14:47:34 2012
From: masklinn at masklinn.net (Masklinn)
Date: Fri, 20 Apr 2012 14:47:34 +0200
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
Message-ID: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>


On 2012-04-20, at 14:37 , Xavier Ho wrote:

> Thanks, that's fair, for consistency.
> 
> One use case for my question was a stackoverflow question regarding merging
> two dict's.  If update() returned its own reference, and if we explicitly
> wanted a copy (instead of an in-place modification), we could have used
> 
>    dict(x).update(y)
> 
> given x and y are both dict() instances.

If you start from dict instances, you could always use:

    merged = dict(x, **y)


From contact at xavierho.com  Fri Apr 20 14:48:38 2012
From: contact at xavierho.com (Xavier Ho)
Date: Fri, 20 Apr 2012 22:48:38 +1000
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
Message-ID: <CALWePYyK-aV0NFEgWRB5MetrWDqpqZjRNYhax0iuJZ-nq=45kA@mail.gmail.com>

On 20 April 2012 22:47, Masklinn <masklinn at masklinn.net> wrote:
>
> If you start from dict instances, you could always use:
>
>    merged = dict(x, **y)
>

I heard that Guido wasn't a fan of this.

Cheers,
Xav
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120420/5f1776e7/attachment.html>

From masklinn at masklinn.net  Fri Apr 20 14:58:46 2012
From: masklinn at masklinn.net (Masklinn)
Date: Fri, 20 Apr 2012 14:58:46 +0200
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <CALWePYyK-aV0NFEgWRB5MetrWDqpqZjRNYhax0iuJZ-nq=45kA@mail.gmail.com>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
	<CALWePYyK-aV0NFEgWRB5MetrWDqpqZjRNYhax0iuJZ-nq=45kA@mail.gmail.com>
Message-ID: <0D667BF1-9EB0-447C-AF8F-01E262F98B65@masklinn.net>

On 2012-04-20, at 14:48 , Xavier Ho wrote:

> On 20 April 2012 22:47, Masklinn <masklinn at masklinn.net> wrote:
>> 
>> If you start from dict instances, you could always use:
>> 
>>   merged = dict(x, **y)
>> 
> 
> I heard that Guido wasn't a fan of this.

Works to merge two dicts in a single expression, if you don't
want to define a wrapper function and find a name for it.


From sven at marnach.net  Fri Apr 20 15:37:10 2012
From: sven at marnach.net (Sven Marnach)
Date: Fri, 20 Apr 2012 14:37:10 +0100
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
Message-ID: <20120420133710.GR30763@bagheera>

Masklinn wrote on Fri, 20 Apr 2012, 14:47:34 +0200:
> If you start from dict instances, you could always use:
> 
>     merged = dict(x, **y)

No, not always.  Only if all keys of `y` are strings (and probably
they should also be valid Python identifiers.)
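
A quick illustration of the restriction, as it behaves on CPython 3.x (a
hedged sketch with invented example dicts, not taken from the original mail):

```python
x = {"a": 1}
y_str = {"b": 2}
y_int = {1: 2}

# String keys pass through the ** expansion without complaint:
merged = dict(x, **y_str)
print(merged)  # {'a': 1, 'b': 2}

# Non-string keys are rejected by the keyword-expansion machinery
# before dict() ever sees them:
try:
    dict(x, **y_int)
except TypeError as exc:
    print("rejected:", exc)
```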

Cheers,
    Sven


From stefan_ml at behnel.de  Fri Apr 20 16:28:28 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 20 Apr 2012 16:28:28 +0200
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <20120420133710.GR30763@bagheera>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
	<20120420133710.GR30763@bagheera>
Message-ID: <jmrrqc$q5o$2@dough.gmane.org>

Sven Marnach, 20.04.2012 15:37:
> Masklinn schrieb am Fri, 20. Apr 2012, um 14:47:34 +0200:
>> If you start from dict instances, you could always use:
>>
>>     merged = dict(x, **y)
> 
> No, not always.  Only if all keys of `y` are strings (and probably
> they should also be valid Python identifiers.)

Also, it's not immediately clear from the expression what happens for
duplicate keys, and the intended behaviour for that case may be different
from what the above does.
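
For reference, what `dict(x, **y)` actually does on duplicate keys (a small
sketch with made-up dicts; the point stands that none of this is obvious
from the expression itself):

```python
x = {"a": 1, "b": 2}
y = {"b": 20, "c": 30}

# The keyword arguments are applied after the positional mapping,
# so y's value silently wins for the duplicated key "b".
merged = dict(x, **y)
print(merged)  # {'a': 1, 'b': 20, 'c': 30}
```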

Stefan



From alexander.belopolsky at gmail.com  Fri Apr 20 16:35:55 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri, 20 Apr 2012 10:35:55 -0400
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <20120420133710.GR30763@bagheera>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
	<20120420133710.GR30763@bagheera>
Message-ID: <CAP7h-xbwoW+Ds9TPKnnsRVrnfKKvC6p7i9avZPxVG3tzttfhHQ@mail.gmail.com>

On Fri, Apr 20, 2012 at 9:37 AM, Sven Marnach <sven at marnach.net> wrote:
>> If you start from dict instances, you could always use:
>>
>>     merged = dict(x, **y)
>
> No, not always.  Only if all keys of `y` are strings (and probably
> they should also be valid Python identifiers.)

>>> a = {}
>>> b = {1:2}
>>> dict(a, **b)
{1: 2}


From _ at lvh.cc  Fri Apr 20 16:41:24 2012
From: _ at lvh.cc (Laurens Van Houtven)
Date: Fri, 20 Apr 2012 16:41:24 +0200
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <20120420133710.GR30763@bagheera>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
	<20120420133710.GR30763@bagheera>
Message-ID: <61BA1F20-2717-4CEF-9AFA-8582D5DC23A2@lvh.cc>

That's not actually true. **kwargs can contain things that aren't strings :)

cheers
lvh



On 20 Apr 2012, at 15:37, Sven Marnach wrote:

> Masklinn schrieb am Fri, 20. Apr 2012, um 14:47:34 +0200:
>> If you start from dict instances, you could always use:
>> 
>>    merged = dict(x, **y)
> 
> No, not always.  Only if all keys of `y` are strings (and probably
> they should also be valid Python identifiers.)
> 
> Cheers,
>    Sven
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas



From stefan_ml at behnel.de  Fri Apr 20 16:49:17 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 20 Apr 2012 16:49:17 +0200
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <CAP7h-xbwoW+Ds9TPKnnsRVrnfKKvC6p7i9avZPxVG3tzttfhHQ@mail.gmail.com>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
	<20120420133710.GR30763@bagheera>
	<CAP7h-xbwoW+Ds9TPKnnsRVrnfKKvC6p7i9avZPxVG3tzttfhHQ@mail.gmail.com>
Message-ID: <jmrt1d$9k7$1@dough.gmane.org>

Alexander Belopolsky, 20.04.2012 16:35:
> On Fri, Apr 20, 2012 at 9:37 AM, Sven Marnach wrote:
>>> If you start from dict instances, you could always use:
>>>
>>>     merged = dict(x, **y)
>>
>> No, not always.  Only if all keys of `y` are strings (and probably
>> they should also be valid Python identifiers.)
> 
> >>> a = {}
> >>> b = {1:2}
> >>> dict(a, **b)
> {1: 2}

That's not guaranteed behaviour, though. It doesn't work in PyPy, for example:

>>>> a={}
>>>> b={1:2}
>>>> dict(a,**b)
Traceback (most recent call last):
  File "<console>", line 1, in <module>
TypeError: keywords must be strings

(and, no, it's not PyPy that's wrong here)

Stefan



From masklinn at masklinn.net  Fri Apr 20 16:56:04 2012
From: masklinn at masklinn.net (Masklinn)
Date: Fri, 20 Apr 2012 16:56:04 +0200
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <jmrrqc$q5o$2@dough.gmane.org>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
	<20120420133710.GR30763@bagheera> <jmrrqc$q5o$2@dough.gmane.org>
Message-ID: <D0820DE7-9167-4DB2-A693-91847429DD02@masklinn.net>


On 2012-04-20, at 16:28 , Stefan Behnel wrote:

> Sven Marnach, 20.04.2012 15:37:
>> Masklinn schrieb am Fri, 20. Apr 2012, um 14:47:34 +0200:
>>> If you start from dict instances, you could always use:
>>> 
>>>    merged = dict(x, **y)
>> 
>> No, not always.  Only if all keys of `y` are strings (and probably
>> they should also be valid Python identifiers.)
> 
> Also, it's not immediately clear from the expression what happens for
> duplicate keys

Not sure why: as with `dict.update`, `dict` is defined as setting from the
first argument, then setting from the keyword arguments (overriding keys
originally set, if any).

Now of course that might not be obvious to people who don't know how dict
works, but I fail to see why another function, which they don't know either,
would be any more "immediately clear".

You may counter that a function taking (and merging) a sequence of mappings
would "obviously" apply a left fold in merging the mappings, but in that
case the dict constructor would "obviously" copy the positional argument and
then apply the keyword arguments (which come after the positional).

Which is exactly what happens.


From alexander.belopolsky at gmail.com  Fri Apr 20 17:00:28 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri, 20 Apr 2012 11:00:28 -0400
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <jmrt1d$9k7$1@dough.gmane.org>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
	<20120420133710.GR30763@bagheera>
	<CAP7h-xbwoW+Ds9TPKnnsRVrnfKKvC6p7i9avZPxVG3tzttfhHQ@mail.gmail.com>
	<jmrt1d$9k7$1@dough.gmane.org>
Message-ID: <CAP7h-xbHocZkyeBUvCXPzJ7-Oj3pKeuUr8cn61PkrZ+eTf5gsg@mail.gmail.com>

On Fri, Apr 20, 2012 at 10:49 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
>> >>> a = {}
>> >>> b = {1:2}
>> >>> dict(a, **b)
>> {1: 2}
>
> That's not guaranteed behaviour, though. It doesn't work in PyPy, for example.

I seem to recall that CPython had a similar limitation in the past,
but it was removed at some point.  I will try to dig out the relevant
discussion, but I think the consensus was that ** should not attempt to
validate the keys.


From sven at marnach.net  Fri Apr 20 17:47:05 2012
From: sven at marnach.net (Sven Marnach)
Date: Fri, 20 Apr 2012 16:47:05 +0100
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <CAP7h-xbHocZkyeBUvCXPzJ7-Oj3pKeuUr8cn61PkrZ+eTf5gsg@mail.gmail.com>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
	<20120420133710.GR30763@bagheera>
	<CAP7h-xbwoW+Ds9TPKnnsRVrnfKKvC6p7i9avZPxVG3tzttfhHQ@mail.gmail.com>
	<jmrt1d$9k7$1@dough.gmane.org>
	<CAP7h-xbHocZkyeBUvCXPzJ7-Oj3pKeuUr8cn61PkrZ+eTf5gsg@mail.gmail.com>
Message-ID: <20120420154705.GS30763@bagheera>

Alexander Belopolsky schrieb am Fri, 20. Apr 2012, um 11:00:28 -0400:
> I seem to recall that CPython had a similar limitation in the past,
> but it was removed at some point.  I will try to dig out the relevant
> discussion, but I think the consensus was that ** should not attempt
> validate the keys.

It's the other way around.  Your code used to work in Python 2.x, but
it doesn't work in Python 3.x.

Cheers,
    Sven


From guido at python.org  Fri Apr 20 18:21:50 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 20 Apr 2012 09:21:50 -0700
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <CAP7h-xbHocZkyeBUvCXPzJ7-Oj3pKeuUr8cn61PkrZ+eTf5gsg@mail.gmail.com>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
	<20120420133710.GR30763@bagheera>
	<CAP7h-xbwoW+Ds9TPKnnsRVrnfKKvC6p7i9avZPxVG3tzttfhHQ@mail.gmail.com>
	<jmrt1d$9k7$1@dough.gmane.org>
	<CAP7h-xbHocZkyeBUvCXPzJ7-Oj3pKeuUr8cn61PkrZ+eTf5gsg@mail.gmail.com>
Message-ID: <CAP7+vJLRM5_YhGfgMYBgBt9LO+SP+_KGzRMn6bYvFTOfdiidzw@mail.gmail.com>

On Fri, Apr 20, 2012 at 8:00 AM, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:
> On Fri, Apr 20, 2012 at 10:49 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
>>> >>> a = {}
>>> >>> b = {1:2}
>>> >>> dict(a, **b)
>>> {1: 2}
>>
>> That's not guaranteed behaviour, though. It doesn't work in PyPy, for example.
>
> I seem to recall that CPython had a similar limitation in the past,
> but it was removed at some point.  I will try to dig out the relevant
> discussion, but I think the consensus was that ** should not attempt to
> validate the keys.

They should be strings. There is no requirement that they be valid identifiers.

-- 
--Guido van Rossum (python.org/~guido)


From victor.varvariuc at gmail.com  Fri Apr 20 20:30:09 2012
From: victor.varvariuc at gmail.com (Victor Varvariuc)
Date: Fri, 20 Apr 2012 21:30:09 +0300
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <CAP7+vJLRM5_YhGfgMYBgBt9LO+SP+_KGzRMn6bYvFTOfdiidzw@mail.gmail.com>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
	<20120420133710.GR30763@bagheera>
	<CAP7h-xbwoW+Ds9TPKnnsRVrnfKKvC6p7i9avZPxVG3tzttfhHQ@mail.gmail.com>
	<jmrt1d$9k7$1@dough.gmane.org>
	<CAP7h-xbHocZkyeBUvCXPzJ7-Oj3pKeuUr8cn61PkrZ+eTf5gsg@mail.gmail.com>
	<CAP7+vJLRM5_YhGfgMYBgBt9LO+SP+_KGzRMn6bYvFTOfdiidzw@mail.gmail.com>
Message-ID: <CA+Lge10ZMYguDnWSb+EduWw_2iBXhuTpSgw6u2Cz_hEUqeTWcw@mail.gmail.com>

>>> a = {}
>>> b = {1:2}
>>> dict(a, **b)

If b is a huge dict - not a good approach

--
*Victor Varvariuc*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120420/b692fd1a/attachment.html>

From masklinn at masklinn.net  Fri Apr 20 21:49:10 2012
From: masklinn at masklinn.net (Masklinn)
Date: Fri, 20 Apr 2012 21:49:10 +0200
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <CA+Lge10ZMYguDnWSb+EduWw_2iBXhuTpSgw6u2Cz_hEUqeTWcw@mail.gmail.com>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
	<20120420133710.GR30763@bagheera>
	<CAP7h-xbwoW+Ds9TPKnnsRVrnfKKvC6p7i9avZPxVG3tzttfhHQ@mail.gmail.com>
	<jmrt1d$9k7$1@dough.gmane.org>
	<CAP7h-xbHocZkyeBUvCXPzJ7-Oj3pKeuUr8cn61PkrZ+eTf5gsg@mail.gmail.com>
	<CAP7+vJLRM5_YhGfgMYBgBt9LO+SP+_KGzRMn6bYvFTOfdiidzw@mail.gmail.com>
	<CA+Lge10ZMYguDnWSb+EduWw_2iBXhuTpSgw6u2Cz_hEUqeTWcw@mail.gmail.com>
Message-ID: <E2743EFF-2515-457E-BF2F-CA1DF1CAFE5C@masklinn.net>


On 2012-04-20, at 20:30 , Victor Varvariuc wrote:

>>>> a = {}
>>>> b = {1:2}
>>>> dict(a, **b)
> 
> If b is a huge dict - not a good approach

If they're huge mappings, you probably don't want to go around copying
them either way[0] and would instead use more custom mappings, either
some sort of joining proxy or something out of Okasaki (a clojure-style
tree-based map with structural sharing for instance)

[0] I'm pretty sure "being fast to copy when bloody huge" is not at the
    forefront of Python's dict priorities.


From p.f.moore at gmail.com  Fri Apr 20 23:41:42 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 20 Apr 2012 22:41:42 +0100
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <E2743EFF-2515-457E-BF2F-CA1DF1CAFE5C@masklinn.net>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
	<20120420133710.GR30763@bagheera>
	<CAP7h-xbwoW+Ds9TPKnnsRVrnfKKvC6p7i9avZPxVG3tzttfhHQ@mail.gmail.com>
	<jmrt1d$9k7$1@dough.gmane.org>
	<CAP7h-xbHocZkyeBUvCXPzJ7-Oj3pKeuUr8cn61PkrZ+eTf5gsg@mail.gmail.com>
	<CAP7+vJLRM5_YhGfgMYBgBt9LO+SP+_KGzRMn6bYvFTOfdiidzw@mail.gmail.com>
	<CA+Lge10ZMYguDnWSb+EduWw_2iBXhuTpSgw6u2Cz_hEUqeTWcw@mail.gmail.com>
	<E2743EFF-2515-457E-BF2F-CA1DF1CAFE5C@masklinn.net>
Message-ID: <CACac1F8YE0Rm4cTo5KJ1W07jnnbZmz87JuPY_pnvR+WxL--j4Q@mail.gmail.com>

On 20 April 2012 20:49, Masklinn <masklinn at masklinn.net> wrote:
> If they're huge mappings, you probably don't want to go around copying
> them either way[0] and would instead use more custom mappings, either
> some sort of joining proxy or something out of Okasaki (a clojure-style
> tree-based map with structural sharing for instance)

Python 3.3 has collections.ChainMap for this sort of case.
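
A minimal sketch of that approach (the example mappings are invented for
illustration):

```python
from collections import ChainMap

defaults = {"colour": "red", "user": "guest"}
overrides = {"user": "admin"}

# No copying takes place: lookups search the maps left to right.
combined = ChainMap(overrides, defaults)
print(combined["user"])    # found in the first map
print(combined["colour"])  # falls through to defaults

# A flat copy can still be materialised on demand:
print(dict(combined))
```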

Paul.


From masklinn at masklinn.net  Sat Apr 21 01:10:35 2012
From: masklinn at masklinn.net (Masklinn)
Date: Sat, 21 Apr 2012 01:10:35 +0200
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <CACac1F8YE0Rm4cTo5KJ1W07jnnbZmz87JuPY_pnvR+WxL--j4Q@mail.gmail.com>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>
	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>
	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>
	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>
	<20120420133710.GR30763@bagheera>
	<CAP7h-xbwoW+Ds9TPKnnsRVrnfKKvC6p7i9avZPxVG3tzttfhHQ@mail.gmail.com>
	<jmrt1d$9k7$1@dough.gmane.org>
	<CAP7h-xbHocZkyeBUvCXPzJ7-Oj3pKeuUr8cn61PkrZ+eTf5gsg@mail.gmail.com>
	<CAP7+vJLRM5_YhGfgMYBgBt9LO+SP+_KGzRMn6bYvFTOfdiidzw@mail.gmail.com>
	<CA+Lge10ZMYguDnWSb+EduWw_2iBXhuTpSgw6u2Cz_hEUqeTWcw@mail.gmail.com>
	<E2743EFF-2515-457E-BF2F-CA1DF1CAFE5C@masklinn.net>
	<CACac1F8YE0Rm4cTo5KJ1W07jnnbZmz87JuPY_pnvR+WxL--j4Q@mail.gmail.com>
Message-ID: <27E61C7A-0F78-4825-A235-227882A64C6A@masklinn.net>


On 2012-04-20, at 23:41 , Paul Moore wrote:

> On 20 April 2012 20:49, Masklinn <masklinn at masklinn.net> wrote:
>> If they're huge mappings, you probably don't want to go around copying
>> them either way[0] and would instead use more custom mappings, either
>> some sort of joining proxy or something out of Okasaki (a clojure-style
>> tree-based map with structural sharing for instance)
> 
> Python 3.3 has collections.ChainMap for this sort of case.

Yeah, it's an example of the "joining proxy" thing. Though I'm not sure
I like it being editable, or the lookup order when providing a sequence
of maps (I haven't tested it but it appears the maps sequence is traversed
front-to-back, I'd have found the other way around more "obvious", as if
each sub-mapping was applied to a base through an update call).

Another potential weirdness of this solution (I don't know how ChainMap
behaves there, the documentation is unclear): iteration over the map and
mapping.items() versus [(key, mapping[key]) for key in mapping] could have
very different behaviours/values, since the former would return all
key:value pairs while the latter would only return the key:(first value
for key) pairs, which may lead to significant repetitions any time a key
is present in multiple maps.

From steve at pearwood.info  Sat Apr 21 02:24:41 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 21 Apr 2012 10:24:41 +1000
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <27E61C7A-0F78-4825-A235-227882A64C6A@masklinn.net>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>	<20120420133710.GR30763@bagheera>	<CAP7h-xbwoW+Ds9TPKnnsRVrnfKKvC6p7i9avZPxVG3tzttfhHQ@mail.gmail.com>	<jmrt1d$9k7$1@dough.gmane.org>	<CAP7h-xbHocZkyeBUvCXPzJ7-Oj3pKeuUr8cn61PkrZ+eTf5gsg@mail.gmail.com>	<CAP7+vJLRM5_YhGfgMYBgBt9LO+SP+_KGzRMn6bYvFTOfdiidzw@mail.gmail.com>	<CA+Lge10ZMYguDnWSb+EduWw_2iBXhuTpSgw6u2Cz_hEUqeTWcw@mail.gmail.com>	<E2743EFF-2515-457E-BF2F-CA1DF1CAFE5C@masklinn.net>	<CACac1F8YE0Rm4cTo5KJ1W07jnnbZmz87JuPY_pnvR+WxL--j4Q@mail.gmail.com>
	<27E61C7A-0F78-4825-A235-227882A64C6A@masklinn.net>
Message-ID: <4F91FE49.2000704@pearwood.info>

Masklinn wrote:
> On 2012-04-20, at 23:41 , Paul Moore wrote:
> 
>> On 20 April 2012 20:49, Masklinn <masklinn at masklinn.net> wrote:
>>> If they're huge mappings, you probably don't want to go around copying
>>> them either way[0] and would instead use more custom mappings, either
>>> some sort of joining proxy or something out of Okasaki (a clojure-style
>>> tree-based map with structural sharing for instance)
>> Python 3.3 has collections.ChainMap for this sort of case.
> 
> Yeah, it's an example of the "joining proxy" thing. Though I'm not sure
> I like it being editable, or the lookup order when providing a sequence
> of maps (I haven't tested it but it appears the maps sequence is traversed
> front-to-back, I'd have found the other way around more "obvious", as if
> each sub-mapping was applied to a base through an update call).

ChainMap is meant to emulate scoped lookups, e.g. builtins + globals + 
nonlocals + locals. Hence, newer scopes mask older scopes. "Locals" should be 
fast, hence it is at the front.

As for being editable, I'm not sure what you mean here, but surely you don't 
object to it being mutable?


> Another potential weirdness of this solution (I don't know how ChainMap
> behaves there, the documentation is unclear): iteration over the map and
> mapping.items() versus [(key, mapping[key]) for key in mapping] could have
> very different behaviours/values, since the former would return all
> key:value pairs while the latter would only return the key:(first value
> for key) pairs, which may lead to significant repetitions any time a key
> is present in multiple maps.

No, iteration over the ChainMap returns unique keys, not duplicates.


 >>> from collections import ChainMap
 >>> mapping = ChainMap(dict(a=1, b=2, c=3, d=4))
 >>> mapping = mapping.new_child()
 >>> mapping.update(dict(d=5, e=6, f=7))
 >>> mapping = mapping.new_child()
 >>> mapping.update(dict(f=8, g=9, h=10))
 >>>
 >>> len(mapping)
8
 >>> mapping
ChainMap({'h': 10, 'g': 9, 'f': 8}, {'e': 6, 'd': 5, 'f': 7}, {'a': 1, 'c': 3, 
'b': 2, 'd': 4})
 >>> list(mapping.keys())
['h', 'a', 'c', 'b', 'e', 'd', 'g', 'f']
 >>> list(mapping.values())
[10, 1, 3, 2, 6, 5, 9, 8]




-- 
Steven


From masklinn at masklinn.net  Sat Apr 21 14:53:27 2012
From: masklinn at masklinn.net (Masklinn)
Date: Sat, 21 Apr 2012 14:53:27 +0200
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <4F91FE49.2000704@pearwood.info>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>	<20120420133710.GR30763@bagheera>	<CAP7h-xbwoW+Ds9TPKnnsRVrnfKKvC6p7i9avZPxVG3tzttfhHQ@mail.gmail.com>	<jmrt1d$9k7$1@dough.gmane.org>	<CAP7h-xbHocZkyeBUvCXPzJ7-Oj3pKeuUr8cn61PkrZ+eTf5gsg@mail.gmail.com>	<CAP7+vJLRM5_YhGfgMYBgBt9LO+SP+_KGzRMn6bYvFTOfdiidzw@mail.gmail.com>	<CA+Lge10ZMYguDnWSb+EduWw_2iBXhuTpSgw6u2Cz_hEUqeTWcw@mail.gmail.com>	<E2743EFF-2515-457E-BF2F-CA1DF1CAFE5C@masklinn.net>	<CACac1F8YE0Rm4cTo5KJ1W07jnnbZmz87JuPY_pnvR+WxL--j4Q@mail.gmail.com>
	<27E61C7A-0F78-4825-A235-227882A64C6A@masklinn.net>
	<4F91FE49.2000704@pearwood.info>
Message-ID: <55920183-6C39-49F6-9017-8D358F7ED739@masklinn.net>


On 2012-04-21, at 02:24 , Steven D'Aprano wrote:

> Masklinn wrote:
>> On 2012-04-20, at 23:41 , Paul Moore wrote:
>>> On 20 April 2012 20:49, Masklinn <masklinn at masklinn.net> wrote:
>>>> If they're huge mappings, you probably don't want to go around copying
>>>> them either way[0] and would instead use more custom mappings, either
>>>> some sort of joining proxy or something out of Okasaki (a clojure-style
>>>> tree-based map with structural sharing for instance)
>>> Python 3.3 has collections.ChainMap for this sort of case.
>> Yeah, it's an example of the "joining proxy" thing. Though I'm not sure
>> I like it being editable, or the lookup order when providing a sequence
>> of maps (I haven't tested it but it appears the maps sequence is traversed
>> front-to-back, I'd have found the other way around more "obvious", as if
>> each sub-mapping was applied to a base through an update call).
> 
> ChainMap is meant to emulate scoped lookups

Yes, my notes were in the context of the thread, which considered ChainMap
as a proxy for multiple mappings; I understand this was not the primary use
case for ChainMap.

> , e.g. builtins + globals + nonlocals + locals. Hence, newer scopes mask older scopes. "Locals" should be fast, hence it is at the front.

That's just a question of traversal order for the maps sequence. If the
sequence is in the order you specify there, [builtins, globals, nonlocals,
locals], then it can be traversed from the back so that locals have the
highest priority. The difference in speed should be almost nil.
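
That back-to-front ordering can be had today by reversing the sequence
before constructing the ChainMap (a sketch with made-up scope dicts standing
in for the real namespaces):

```python
from collections import ChainMap

builtins_ns = {"x": 0, "len": len}
globals_ns = {"x": 1}
locals_ns = {"x": 3}

# Base-to-top order, as in the example above:
scopes = [builtins_ns, globals_ns, locals_ns]

# ChainMap searches its maps front-to-back, so reverse the sequence
# to give locals the highest priority.
cm = ChainMap(*reversed(scopes))
print(cm["x"])    # 3 - locals shadow globals and builtins
print(cm["len"])  # falls through to the builtins mapping
```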

> As for being editable, I'm not sure what you mean here, but surely you don't object to it being mutable?

I do, though again that's considering the usage of chainmap as a proxy, not
as a scope chain.

>> Another potential weirdness of this solution (I don't know how ChainMap
>> behaves there, the documentation is unclear): iteration over the map and
>> mapping.items() versus [(key, mapping[key]) for key in mapping] could have
>> very different behaviours/values, since the former would return all
>> key:value pairs while the latter would only return the key:(first value
>> for key) pairs, which may lead to significant repetitions any time a key
>> is present in multiple maps.
> 
> No, iteration over the ChainMap returns unique keys, not duplicates.

Ah, that's good. Would probably warrant mention in the documentation though.

From stefan_ml at behnel.de  Sat Apr 21 15:40:37 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 21 Apr 2012 15:40:37 +0200
Subject: [Python-ideas] Have dict().update() return its own reference.
In-Reply-To: <55920183-6C39-49F6-9017-8D358F7ED739@masklinn.net>
References: <CALWePYy3DhmoqUx_tZuFCc_a-+=dV=MT=B-mQhG7FU93V05NzA@mail.gmail.com>	<DC5051AF-DE5B-4477-8358-7BBBDDC4D8C1@lvh.cc>	<CALWePYwwnb+ijeBeNvGW4Z_mKCLBZGKO33TQKH8dokPqAe1a4A@mail.gmail.com>	<6ECC43EE-3AB1-4C1B-BFF4-557C153CD6F3@masklinn.net>	<20120420133710.GR30763@bagheera>	<CAP7h-xbwoW+Ds9TPKnnsRVrnfKKvC6p7i9avZPxVG3tzttfhHQ@mail.gmail.com>	<jmrt1d$9k7$1@dough.gmane.org>	<CAP7h-xbHocZkyeBUvCXPzJ7-Oj3pKeuUr8cn61PkrZ+eTf5gsg@mail.gmail.com>	<CAP7+vJLRM5_YhGfgMYBgBt9LO+SP+_KGzRMn6bYvFTOfdiidzw@mail.gmail.com>	<CA+Lge10ZMYguDnWSb+EduWw_2iBXhuTpSgw6u2Cz_hEUqeTWcw@mail.gmail.com>	<E2743EFF-2515-457E-BF2F-CA1DF1CAFE5C@masklinn.net>	<CACac1F8YE0Rm4cTo5KJ1W07jnnbZmz87JuPY_pnvR+WxL--j4Q@mail.gmail.com>
	<27E61C7A-0F78-4825-A235-227882A64C6A@masklinn.net>
	<4F91FE49.2000704@pearwood.info>
	<55920183-6C39-49F6-9017-8D358F7ED739@masklinn.net>
Message-ID: <jmudcl$fd5$1@dough.gmane.org>

Masklinn, 21.04.2012 14:53:
> On 2012-04-21, at 02:24 , Steven D'Aprano wrote:
>> Masklinn wrote:
>>> Another potential weirdness of this solution (I don't know how ChainMap
>>> behaves there, the documentation is unclear): iteration over the map and
>>> mapping.items() versus [(key, mapping[key]) for key in mapping] could have
>>> very different behaviours/values, since the former would return all
>>> key:value pairs while the latter would only return the key:(first value
>>> for key) pairs, which may lead to significant repetitions any time a key
>>> is present in multiple maps.
>>
>> No, iteration over the ChainMap returns unique keys, not duplicates.
> 
> Ah, that's good. Would probably warrant mention in the documentation though.

What would you want to see there? "This class works as expected even when
iterating over it"?

Stefan



From ericsnowcurrently at gmail.com  Sun Apr 22 10:59:54 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sun, 22 Apr 2012 02:59:54 -0600
Subject: [Python-ideas] sys.implementation
In-Reply-To: <CADiSq7fj7X1BxfSqJLnB9pERnyW_+idp5wEi+cbXTs4rAi+gfQ@mail.gmail.com>
References: <CALFfu7DYyZMUp40MDR9-vhpOkPvr=cwt5EmMHEGTrmix_kZbYg@mail.gmail.com>
	<CALFfu7AnbLQGPn2vJdwSoweYsNgWbGpiS8m7h7q-asYLwgjPQg@mail.gmail.com>
	<loom.20120321T193710-866@post.gmane.org>
	<CADiSq7fj7X1BxfSqJLnB9pERnyW_+idp5wEi+cbXTs4rAi+gfQ@mail.gmail.com>
Message-ID: <CALFfu7DadfGGFxU+BHOj5GhnhOQKUJ4vL=qn2_=K_4qx6xTVnw@mail.gmail.com>

On Thu, Mar 22, 2012 at 7:22 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Thu, Mar 22, 2012 at 4:37 AM, Benjamin Peterson <benjamin at python.org> wrote:
>> Eric Snow <ericsnowcurrently at ...> writes:
>>>
>>> I'd like to move this forward, so any objections or feedback at this
>>> point would be helpful.
>>
>> I would like to see a concrete proposal of what would get put in there.
>
> +1
>
> A possible starting list:
>
> - impl name (with a view to further standardising the way we check for
> impl specific tests in the regression test suite)
> - impl version (official place to store the implementation version,
> potentially independent of the language version as it already is in
> PyPy)
> - cache tag (replacement for imp.get_tag())

This is a great start, Nick.  Having a solid sys.implementation would
(for one) help with the import machinery, as your list suggests.  I'm
planning on reviewing that old thread over the next few days. [1]  In
the meantime, any additional thoughts by anyone on what would go into
sys.implementation would be very helpful.

-eric

p.s. is python-ideas the right place for this discussion?  Also,
ultimately this topic should go into a PEP, right?


[1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html


From mark at hotpy.org  Sun Apr 22 11:57:13 2012
From: mark at hotpy.org (Mark Shannon)
Date: Sun, 22 Apr 2012 10:57:13 +0100
Subject: [Python-ideas] sys.implementation
In-Reply-To: <CALFfu7DadfGGFxU+BHOj5GhnhOQKUJ4vL=qn2_=K_4qx6xTVnw@mail.gmail.com>
References: <CALFfu7DYyZMUp40MDR9-vhpOkPvr=cwt5EmMHEGTrmix_kZbYg@mail.gmail.com>	<CALFfu7AnbLQGPn2vJdwSoweYsNgWbGpiS8m7h7q-asYLwgjPQg@mail.gmail.com>	<loom.20120321T193710-866@post.gmane.org>	<CADiSq7fj7X1BxfSqJLnB9pERnyW_+idp5wEi+cbXTs4rAi+gfQ@mail.gmail.com>
	<CALFfu7DadfGGFxU+BHOj5GhnhOQKUJ4vL=qn2_=K_4qx6xTVnw@mail.gmail.com>
Message-ID: <4F93D5F9.4050402@hotpy.org>

Eric Snow wrote:
> On Thu, Mar 22, 2012 at 7:22 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> On Thu, Mar 22, 2012 at 4:37 AM, Benjamin Peterson <benjamin at python.org> wrote:
>>> Eric Snow <ericsnowcurrently at ...> writes:
>>>> I'd like to move this forward, so any objections or feedback at this
>>>> point would be helpful.
>>> I would like to see a concrete proposal of what would get put in there.
>> +1
>>
>> A possible starting list:
>>
>> - impl name (with a view to further standardising the way we check for
>> impl specific tests in the regression test suite)
>> - impl version (official place to store the implementation version,
>> potentially independent of the language version as it already is in
>> PyPy)
>> - cache tag (replacement for imp.get_tag())

One more:

GC (reference-counting, copying, mark-and-sweep, generational, etc.)

> 
> This is a great start, Nick.  Having a solid sys.implementation would
> (for one) help with the import machinery, as your list suggests.  I'm
> planning on reviewing that old thread over the next few days. [1]  In
> the meantime, any additional thoughts by anyone on what would go into
> sys.implementation would be very helpful.
> 
> -eric
> 
> p.s. is python-ideas the right place for this discussion?  Also,
> ultimately this topic should go into a PEP, right?
> 
> 
> [1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas



From solipsis at pitrou.net  Sun Apr 22 13:30:24 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 22 Apr 2012 13:30:24 +0200
Subject: [Python-ideas] sys.implementation
References: <CALFfu7DYyZMUp40MDR9-vhpOkPvr=cwt5EmMHEGTrmix_kZbYg@mail.gmail.com>
	<CALFfu7AnbLQGPn2vJdwSoweYsNgWbGpiS8m7h7q-asYLwgjPQg@mail.gmail.com>
	<loom.20120321T193710-866@post.gmane.org>
	<CADiSq7fj7X1BxfSqJLnB9pERnyW_+idp5wEi+cbXTs4rAi+gfQ@mail.gmail.com>
	<CALFfu7DadfGGFxU+BHOj5GhnhOQKUJ4vL=qn2_=K_4qx6xTVnw@mail.gmail.com>
	<4F93D5F9.4050402@hotpy.org>
Message-ID: <20120422133024.71125555@pitrou.net>

On Sun, 22 Apr 2012 10:57:13 +0100
Mark Shannon <mark at hotpy.org> wrote:
> 
> One more:
> 
> GC (reference-counting, copying, mark-and-sweep, generational, etc.)

I think this would sound more natural in the gc module itself.

Regards

Antoine.




From nestornissen at gmail.com  Mon Apr 23 03:07:36 2012
From: nestornissen at gmail.com (Nestor)
Date: Sun, 22 Apr 2012 21:07:36 -0400
Subject: [Python-ideas] Haskell envy
Message-ID: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>

The other day a colleague of mine submitted this challenge taken from
some website to us coworkers:

Have the function ArrayAddition(arr) take the array of numbers stored
in arr and print true if any combination of numbers in the array can
be added up to equal the largest number in the array, otherwise print
false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output
should print true because 4 + 6 + 10 + 3 = 23.  The array will not be
empty, will not contain all the same elements, and may contain
negative numbers.

Examples:
5,7,16,1,2
3,5,-1,8,12

I proposed this solution:

from itertools import combinations, chain
def array_test(arr):
    biggest = max(arr)
    arr.remove(biggest)
    my_comb = chain(*(combinations(arr,i) for i in range(1,len(arr)+1)))
    for one_comb in my_comb:
        if sum(one_comb) == biggest:
            return True, one_comb
    return False


But then somebody else submitted this Haskell solution:

import Data.List
f :: (Ord a, Num a) => [a] -> Bool
f lst = (\y -> elem y $ map sum $ subsequences $ [ x | x <- lst, x /= y ])
        $ maximum lst


I wonder if we should add a subsequences function to itertools or make
the "r" parameter of combinations optional to return all combinations
up to len(iterable).


From ncoghlan at gmail.com  Mon Apr 23 03:26:38 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 23 Apr 2012 11:26:38 +1000
Subject: [Python-ideas] Haskell envy
In-Reply-To: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
References: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
Message-ID: <CADiSq7fSJFdKTqvE6bwi0xuBFZeopL4AidfeSfvVV9CJJ_fZNA@mail.gmail.com>

On Mon, Apr 23, 2012 at 11:07 AM, Nestor <nestornissen at gmail.com> wrote:
> I wonder if we should add a subsequences function to itertools or make
> the "r" parameter of combinations optional to return all combinations
> up to len(iterable).

Why? It just makes itertools-using code that much harder to read,
since you'd have yet another variant to learn. If you need it, then
just define a separate "all_combinations()" that makes it clear what
is going on (replace the yield from usage with itertools.chain() for
Python < 3.3):

from itertools import combinations

def all_combinations(data):
    for num_items in range(1, len(data)+1):
        yield from combinations(data, num_items)

def array_test(arr):
    biggest = max(arr)
    data = [x for x in arr if x != biggest]
    for combo in all_combinations(data):
        if sum(combo) == biggest:
            return True, combo
    return False, None
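As the parenthetical above notes, on Python < 3.3 the ``yield from`` line can
be replaced with itertools.chain; a minimal sketch of that variant:

```python
from itertools import chain, combinations

def all_combinations(data):
    # chain.from_iterable plays the role of "yield from" on Python < 3.3:
    # it flattens the sequence of combinations() iterators into one stream.
    return chain.from_iterable(
        combinations(data, num_items)
        for num_items in range(1, len(data) + 1))
```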

When a gain in brevity increases the necessary level of assumed
knowledge for future maintainers, it isn't a clear win from a language
design point of view.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From tjreedy at udel.edu  Mon Apr 23 04:55:49 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 22 Apr 2012 22:55:49 -0400
Subject: [Python-ideas] Haskell envy
In-Reply-To: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
References: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
Message-ID: <jn2gbm$fm5$1@dough.gmane.org>

On 4/22/2012 9:07 PM, Nestor wrote:
> The other day a colleague of mine submitted this challenge taken from
> some website to us coworkers:

This is more of a python-list post than a python-ideas post.
So is my response.

> Have the function ArrayAddition(arr) take the array of numbers stored
> in arr and print true if any combination of numbers in the array can
> be added up to equal the largest number in the array, otherwise print
> false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output
> should print true because 4 + 6 + 10 + 3 = 23.  The array will not be
> empty, will not contain all the same elements, and may contain
> negative numbers.

Since the order of the numbers is arbitrary and irrelevant to the 
problem, it should be formulated in terms of a set of numbers.
Non-empty means that max(set) exists.
No duplicates means that max(set) is unique and there is no question of 
whether to remove one or all copies.

> Examples:
> 5,7,16,1,2

False (5+7+1+2 == 15)

 > 3,5,-1,8,12

True (8+5-1 == 12)

> I proposed this solution:
>
> from itertools import combinations, chain
> def array_test(arr):
>      biggest = max(arr)
>      arr.remove(biggest)
>      my_comb = chain(*(combinations(arr,i) for i in range(1,len(arr)+1)))

I believe the above works for sets as well as lists.

>      for one_comb in my_comb:
>          if sum(one_comb) == biggest:
>              return True, one_comb
>      return False

The above is much clearer to me than the Haskell (or my equivalent) 
below, and I suspect that would be true even if I really knew Haskell. 
If you want something more like the Haskell version, try

return any(x == biggest for x in map(sum, my_comb))

To make it even more like a one-liner, replace my_comb with its expression.

> But then somebody else submitted this Haskell solution:
>
> import Data.List
> f :: (Ord a,Num a) =>  [a] ->  Bool
> f lst = (\y ->  elem y $ map sum $ subsequences $ [ x | x<- lst, x /=
> y ]) $ maximum lst

If you really want one logical line that I believe matches the above ;-)

def sum_to_max(numbers):
     return (
         (lambda num_max:
             (lambda nums:
                 any(map(lambda n: n == num_max, map(sum,
                         chain(*(combinations(nums,i) for i in
                                 range(1,len(nums)+1))))))
             ) (numbers - {num_max})
               # because numbers.remove(num_max) is None
         ) (max(numbers))
     )

print(sum_to_max({5,7,16,1,2}) == False,
       sum_to_max({3,5,-1,8,12}) == True)
# sets because of numbers - {num_max}
True, True

Comment 1: debugging nested expressions is harder than debugging a 
sequence of statements because one cannot insert print statements.

Comment 2: the function should really return an iterator that yields the 
sets that add to the max. Then the user can decide whether or not to 
collapse the information to True/False and stop after the first.

> I wonder if we should add a subsequences function to itertools or make
> the "r" parameter of combinations optional to return all combinations
> up to len(iterable).

I think not. The goal of itertools is to provide basic iterators that 
can be combined to produce specific iterators, just as you did.

-- 
Terry Jan Reedy



From pyideas at rebertia.com  Mon Apr 23 05:18:36 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Sun, 22 Apr 2012 20:18:36 -0700
Subject: [Python-ideas] Haskell envy
In-Reply-To: <jn2gbm$fm5$1@dough.gmane.org>
References: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
	<jn2gbm$fm5$1@dough.gmane.org>
Message-ID: <CAMZYqRQE1gAF4o-ne6RXKttR8qNicpRXn5sZgTEGd4KLhZ1O4w@mail.gmail.com>

On Sun, Apr 22, 2012 at 7:55 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 4/22/2012 9:07 PM, Nestor wrote:
<snip>
>> Have the function ArrayAddition(arr) take the array of numbers stored
>> in arr and print true if any combination of numbers in the array can
>> be added up to equal the largest number in the array, otherwise print
>> false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output
> should print true because 4 + 6 + 10 + 3 = 23.  The array will not be
>> empty, will not contain all the same elements, and may contain
>> negative numbers.
>
> Since the order of the numbers is arbitrary and irrelevant to the problem,
> it should be formulated in terms of a set of numbers.

Er, multiplicity still matters, so it should be a multiset/bag. One
possible representation thereof would be a list...

Cheers,
Chris


From jeanpierreda at gmail.com  Mon Apr 23 06:30:43 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Mon, 23 Apr 2012 00:30:43 -0400
Subject: [Python-ideas] Haskell envy
In-Reply-To: <jn2gbm$fm5$1@dough.gmane.org>
References: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
	<jn2gbm$fm5$1@dough.gmane.org>
Message-ID: <CABicbJLQbGXgHB5__CTDSdHiJ-b-7=R-P=oNEGsW21zbyqpObg@mail.gmail.com>

On Sun, Apr 22, 2012 at 10:55 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> Comment 1: debugging nested expressions is harder than debugging a sequence
> of statements because cannot insert print statements.

Also, pdb has no support for debugging expressions.

-- Devin


From hobsonlane at gmail.com  Mon Apr 23 07:21:10 2012
From: hobsonlane at gmail.com (Hobson Lane)
Date: Mon, 23 Apr 2012 13:21:10 +0800
Subject: [Python-ideas] Anyone working on a platform-agnostic os.startfile()
Message-ID: <CACZ_DoceeafEpu9wsyG0JPuAGUay0PR91xOZSwzrVy4BGZo1nQ@mail.gmail.com>

There is significant interest in a cross-platform
file-launcher.[1][2][3][4]  The ideal implementation would be
an operating-system-agnostic interface that launches a file for editing or
viewing, similar to the way os.startfile() works for Windows, but
generalized to allow caller-specification of view vs. edit preference and
support all registered os.name operating systems, not just 'nt'.

Mercurial has a mature python implementation for cross-platform launching
of an editor (either GUI editor or terminal-based editor like vi).[5][6]
 The python std lib os.startfile obviously works for Windows.

The Mercurial functionality could be rolled into os.startfile() with
additional named parameters for edit or view preference and gui or non-gui
preference. Perhaps that would enable backporting below Python 3.x. Or is
there a better place to incorporate this multi-platform file launching
capability?

  [1]:
http://stackoverflow.com/questions/1856792/intelligently-launching-the-default-editor-from-inside-a-python-cli-program
  [2]:
http://stackoverflow.com/questions/434597/open-document-with-default-application-in-python
  [3]:
http://stackoverflow.com/questions/1442841/lauch-default-editor-like-webbrowser-module
  [4]:
http://stackoverflow.com/questions/434597/open-document-with-default-application-in-python
  [5]: http://selenic.com/repo/hg-stable/file/2770d03ae49f/mercurial/ui.py
  [6]: http://selenic.com/repo/hg-stable/file/2770d03ae49f/mercurial/util.py

From pyideas at rebertia.com  Mon Apr 23 07:57:30 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Sun, 22 Apr 2012 22:57:30 -0700
Subject: [Python-ideas] Anyone working on a platform-agnostic
	os.startfile()
In-Reply-To: <CACZ_DoceeafEpu9wsyG0JPuAGUay0PR91xOZSwzrVy4BGZo1nQ@mail.gmail.com>
References: <CACZ_DoceeafEpu9wsyG0JPuAGUay0PR91xOZSwzrVy4BGZo1nQ@mail.gmail.com>
Message-ID: <CAMZYqRRiRcXGZa50ggxrRRHKjncXq2mv5=+aueAtazB76KfntQ@mail.gmail.com>

On Sun, Apr 22, 2012 at 10:21 PM, Hobson Lane <hobsonlane at gmail.com> wrote:
> There is significant interest in a cross-platform
> file-launcher.[1][2][3][4]  The ideal implementation would be
> an operating-system-agnostic interface that launches a file for editing or
> viewing, similar to the way os.startfile() works for Windows, but
> generalized to allow caller-specification of view vs. edit preference and
> support all registered os.name operating systems, not just 'nt'.

There is an existing open bug that requests such a feature:
"Add shutil.open": http://bugs.python.org/issue3177

Cheers,
Chris


From hobsonlane at gmail.com  Mon Apr 23 13:10:27 2012
From: hobsonlane at gmail.com (Hobson Lane)
Date: Mon, 23 Apr 2012 19:10:27 +0800
Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43
In-Reply-To: <mailman.31.1335175203.9054.python-ideas@python.org>
References: <mailman.31.1335175203.9054.python-ideas@python.org>
Message-ID: <CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>

Formatted and finished Rebert's solution to this issue

http://bugs.python.org/issue3177

But the question of where to put it is still open ( shutil.open vs.
shutil.launch vs. os.startfile ):

1. `shutil.open()` will break anyone that does `from shutil import *` or
edits the shutil.py file and tries to use the builtin open() after the
shutil.open() definition.

2. `shutil.launch()` is better than shutil.open() due to reduced breakage,
but not as simple or DRY or reverse-compatible as putting it in
os.startfile() in my mind. This fix just implements the functionality of
os.startfile() for non-Windows OSes.

3. `shutil.startfile()` was recommended against by a developer or two on
this mailing list, but seems appropriate to me. The only upstream
"breakage" for an os.startfile() location that I can think of is the
failure to raise exceptions on non-Windows OSes. Any legacy (<3.0) code
that relies on os.startfile() exceptions in order to detect a non-windows
OS is misguided and needs re-factoring anyway, IMHO. Though their only
indication of a "problem" in their code would be the successful launching
of a viewer for whatever path they pointed to...

4. `os.launch()` anyone? Not me.

On Mon, Apr 23, 2012 at 6:00 PM, <python-ideas-request at python.org> wrote:

> Send Python-ideas mailing list submissions to
>        python-ideas at python.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>        http://mail.python.org/mailman/listinfo/python-ideas
> or, via email, send a message with subject or body 'help' to
>        python-ideas-request at python.org
>
> You can reach the person managing the list at
>        python-ideas-owner at python.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Python-ideas digest..."
>

From pyideas at rebertia.com  Mon Apr 23 14:01:15 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Mon, 23 Apr 2012 05:01:15 -0700
Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43
In-Reply-To: <CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
References: <mailman.31.1335175203.9054.python-ideas@python.org>
	<CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
Message-ID: <CAMZYqRTobyMQrMsNu-8yC8TqxaRjxZD0UWGJNOdmL2AfzSi6Mg@mail.gmail.com>

On Mon, Apr 23, 2012 at 4:10 AM, Hobson Lane <hobsonlane at gmail.com> wrote:
<snip>
> On Mon, Apr 23, 2012 at 6:00 PM, <python-ideas-request at python.org> wrote:
>>
>> Send Python-ideas mailing list submissions to
>>        python-ideas at python.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>        http://mail.python.org/mailman/listinfo/python-ideas
>> or, via email, send a message with subject or body 'help' to
>>        python-ideas-request at python.org
>>
>> You can reach the person managing the list at
>>        python-ideas-owner at python.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Python-ideas digest..."

Please avoid replying to the digest; it breaks conversation threading.
Switch to a non-digest mailing list subscription when not lurking.

Cheers,
Chris


From solipsis at pitrou.net  Mon Apr 23 14:02:33 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 23 Apr 2012 14:02:33 +0200
Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43
References: <mailman.31.1335175203.9054.python-ideas@python.org>
	<CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
	<CAMZYqRTobyMQrMsNu-8yC8TqxaRjxZD0UWGJNOdmL2AfzSi6Mg@mail.gmail.com>
Message-ID: <20120423140233.052ed5e0@pitrou.net>


Or at least, if you do reply to the digest, please change the e-mail
subject line to something informative.

Regards

Antoine.



On Mon, 23 Apr 2012 05:01:15 -0700
Chris Rebert <pyideas at rebertia.com> wrote:
> On Mon, Apr 23, 2012 at 4:10 AM, Hobson Lane <hobsonlane at gmail.com> wrote:
> <snip>
> > On Mon, Apr 23, 2012 at 6:00 PM, <python-ideas-request at python.org> wrote:
> >>
> >> Send Python-ideas mailing list submissions to
> >>        python-ideas at python.org
> >>
> >> To subscribe or unsubscribe via the World Wide Web, visit
> >>        http://mail.python.org/mailman/listinfo/python-ideas
> >> or, via email, send a message with subject or body 'help' to
> >>        python-ideas-request at python.org
> >>
> >> You can reach the person managing the list at
> >>        python-ideas-owner at python.org
> >>
> >> When replying, please edit your Subject line so it is more specific
> >> than "Re: Contents of Python-ideas digest..."
> 
> Please avoid replying to the digest; it breaks conversation threading.
> Switch to a non-digest mailing list subscription when not lurking.
> 
> Cheers,
> Chris
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas




From solipsis at pitrou.net  Mon Apr 23 14:04:32 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 23 Apr 2012 14:04:32 +0200
Subject: [Python-ideas] shutil.launch / open/ startfile
References: <mailman.31.1335175203.9054.python-ideas@python.org>
	<CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
Message-ID: <20120423140432.2980d52b@pitrou.net>

On Mon, 23 Apr 2012 19:10:27 +0800
Hobson Lane <hobsonlane at gmail.com> wrote:
> Formatted and finished Rebert's solution to this issue
> 
> http://bugs.python.org/issue3177
> 
> But the question of where to put it is still open ( shutil.open vs.
> shutil.launch vs. os.startfile ):
> 
> 1. `shutil.open()` will break anyone that does `from shutil import *` or
> edits the shutil.py file and tries to use the builtin open() after the
> shutil.open() definition.
> 
> 2. `shutil.launch()` is better than shutil.open() due to reduced breakage,
> but not as simple or DRY or reverse-compatible as putting it in
> os.startfile() in my mind. This fix just implements the functionality of
> os.startfile() for non-Windows OSes.
> 
> 3. `shutil.startfile()` was recommended against by a developer or two on
> this mailing list, but seems appropriate to me. The only upstream
> "breakage" for an os.startfile() location that I can think of is the
> failure to raise exceptions on non-Windows OSes. Any legacy (<3.0) code
> that relies on os.startfile() exceptions in order to detect a non-windows
> OS is misguided and needs re-factoring anyway, IMHO. Though their only
> indication of a "problem" in their code would be the successful launching
> of a viewer for whatever path they pointed to...

Both shutil.launch() and shutil.startfile() are fine to me. I must
admit launch() sounds a bit more obvious than startfile().

I am -1 on shutil.open().

Regards

Antoine.




From ncoghlan at gmail.com  Mon Apr 23 18:43:59 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 24 Apr 2012 02:43:59 +1000
Subject: [Python-ideas] shutil.launch / open/ startfile
In-Reply-To: <20120423140432.2980d52b@pitrou.net>
References: <mailman.31.1335175203.9054.python-ideas@python.org>
	<CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
	<20120423140432.2980d52b@pitrou.net>
Message-ID: <CADiSq7dmLQBBKjFQtdUNf2HxB0=SkgHB_Xbyoa9pkbS40RhCww@mail.gmail.com>

On Mon, Apr 23, 2012 at 10:04 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Mon, 23 Apr 2012 19:10:27 +0800
> Hobson Lane <hobsonlane at gmail.com> wrote:
>> Formatted and finished Rebert's solution to this issue
>>
>> http://bugs.python.org/issue3177
>>
>> But the question of where to put it is still open ( shutil.open vs.
>> shutil.launch vs. os.startfile ):
>>
>> 1. `shutil.open()` will break anyone that does `from shutil import *` or
>> edits the shutil.py file and tries to use the builtin open() after the
>> shutil.open() definition.
>>
>> 2. `shutil.launch()` is better than shutil.open() due to reduced breakage,
>> but not as simple or DRY or reverse-compatible as putting it in
>> os.startfile() in my mind. This fix just implements the functionality of
>> os.startfile() for non-Windows OSes.
>>
>> 3. `shutil.startfile()` was recommended against by a developer or two on
>> this mailing list, but seems appropriate to me. The only upstream
>> "breakage" for an os.startfile() location that I can think of is the
>> failure to raise exceptions on non-Windows OSes. Any legacy (<3.0) code
>> that relies on os.startfile() exceptions in order to detect a non-windows
>> OS is misguided and needs re-factoring anyway, IMHO. Though their only
>> indication of a "problem" in their code would be the successful launching
>> of a viewer for whatever path they pointed to...
>
> Both shutil.launch() and shutil.startfile() are fine to me. I must
> admit launch() sounds a bit more obvious than startfile().
>
> I am -1 on shutil.open().

Indeed, my preference is the same (i.e. shutil.launch()). The file
isn't being started - an application is being started to read the
file.

And, despite the existence of os.startfile(), this functionality feels
too high level to be generally appropriate for the os module.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From steve at pearwood.info  Mon Apr 23 18:52:16 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 24 Apr 2012 02:52:16 +1000
Subject: [Python-ideas] Haskell envy
In-Reply-To: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
References: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
Message-ID: <4F9588C0.2020005@pearwood.info>

Nestor wrote:
[...]
> But then somebody else submitted this Haskell solution:
> 
> import Data.List
> f :: (Ord a,Num a) => [a] -> Bool
> f lst = (\y -> elem y $ map sum $ subsequences $ [ x | x <- lst, x /=
> y ]) $ maximum lst
> 
> 
> I wonder if we should add a subsequences function to itertools or make
> the "r" parameter of combinations optional to return all combinations
> up to len(iterable).

You really don't do your idea any favours by painting it as "Haskell envy".

I have mixed ideas on this. On the one hand, subsequences is a natural 
function to include along with permutations and combinations in any 
combinatorics tool set. As you point out, Haskell has it. So does Mathematica, 
under the name "subsets".

But on the other hand, itertools is arguably not the right place for it. If we 
add subsequences, will people then ask for partitions and derangements next? 
Where will it end? At some point the line needs to be drawn, with some 
functions declared "too specialised" for the general itertools module.

(Perhaps there should be a separate combinatorics module.)

And on the third hand, what you want is a simple two-liner, easy enough to 
write in place:

from itertools import chain, combinations
allcombs = chain(*(combinations(data, i) for i in range(len(data)+1)))

which is close to what your code used. However, the discoverability of this 
solution is essentially zero (if you don't know how to do this, it's hard to 
find out), and it is hardly as self-documenting as

from itertools import subsequences
allcombs = subsequences(data)


Overall, I would vote +0.5 on a subsequences function.



-- 
Steven


From steve at pearwood.info  Mon Apr 23 19:24:54 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 24 Apr 2012 03:24:54 +1000
Subject: [Python-ideas] Haskell envy
In-Reply-To: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
References: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
Message-ID: <4F959066.6090504@pearwood.info>

Nestor wrote:
> The other day a colleague of mine submitted this challenge taken from
> some website to us coworkers:
> 
> Have the function ArrayAddition(arr) take the array of numbers stored
> in arr and print true if any combination of numbers in the array can
> be added up to equal the largest number in the array, otherwise print
> false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output
> should print true because 4 + 6 + 10 + 3 = 23.  The array will not be
> empty, will not contain all the same elements, and may contain
> negative numbers.

By the way, your solution is wrong. Consider this sample data:

-2, -5, 0, -1, -3

In this case, the Haskell solution should correctly print true, while yours 
will print false, because you skip the empty subset. The empty sum equals the 
maximum value of the set, 0.

The ease with which people can get this wrong is an argument in favour of a 
standard solution.
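The fix is to use the documented ``powerset`` recipe (which starts the range
at 0 and therefore includes the empty combination, whose sum is 0); a minimal
sketch:

```python
from itertools import chain, combinations

def powerset(iterable):
    # Includes the empty tuple, whose sum is 0 -- the case the original missed.
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def array_test(arr):
    biggest = max(arr)
    rest = [x for x in arr if x != biggest]
    return any(sum(combo) == biggest for combo in powerset(rest))

print(array_test([-2, -5, 0, -1, -3]))  # True: the empty sum equals the max, 0
```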



-- 
Steven


From raymond.hettinger at gmail.com  Mon Apr 23 21:55:42 2012
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 23 Apr 2012 12:55:42 -0700
Subject: [Python-ideas] Haskell envy
In-Reply-To: <4F9588C0.2020005@pearwood.info>
References: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
	<4F9588C0.2020005@pearwood.info>
Message-ID: <647E82A1-48ED-4025-8195-19982D8BC441@gmail.com>


On Apr 23, 2012, at 9:52 AM, Steven D'Aprano wrote:

> However, the discoverability of this solution is essentially zero 

That exact code has been in the documentation for years:

def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

The whole purpose of the itertools recipes is to teach how
the itertools can be readily combined to build new tools.

http://docs.python.org/library/itertools.html#module-itertools


Raymond

From tjreedy at udel.edu  Mon Apr 23 22:14:11 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 23 Apr 2012 16:14:11 -0400
Subject: [Python-ideas] Anyone working on a platform-agnostic
	os.startfile()
In-Reply-To: <CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
References: <mailman.31.1335175203.9054.python-ideas@python.org>
	<CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
Message-ID: <jn4d6k$rv$1@dough.gmane.org>

[subject corrected from digest...]

On 4/23/2012 7:10 AM, Hobson Lane wrote:
> Formatted and finished Rebert's solution to this issue
>
> http://bugs.python.org/issue3177
>
> But the question of where to put it is still open ( shutil.open vs.
> shutil.launch vs. os.startfile ):
>
> 1. `shutil.open()` will break anyone that does `from shutil import *` or
> edits the shutil.py file and tries to use the builtin open() after the
> shutil.open() definition.
>
> 2. `shutil.launch()` is better than shutil.open() due to reduced
> breakage, but not as simple or DRY or reverse-compatible as putting it
> in os.startfile() in my mind. This fix just implements the functionality
> of os.startfile() for non-Windows OSes.
>
> 3. `shutil.startfile()` was recommended against by a developer or two on
> this mailing list, but seems appropriate to me. The only upstream
> "breakage" for an os.startfile() location that I can think of is the
> failure to raise exceptions on non-Windows OSes. Any legacy (<3.0) code
> that relies on os.startfile() exceptions in order to detect a
> non-windows OS is misguided and needs re-factoring anyway, IMHO. Though
> their only indication of a "problem" in their code would be the
> successful launching of a viewer for whatever path they pointed to...
>
> 4. `os.launch()` anyone? Not me.

The functions in os are intended to be thin wrappers around os 
functions, and a launcher is not. So shutil seems a better place. Since 
launch or startfile means more than just 'open', but open to edit or 
run, I would not use 'open'.

-- 
Terry Jan Reedy



From tjreedy at udel.edu  Mon Apr 23 22:28:50 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 23 Apr 2012 16:28:50 -0400
Subject: [Python-ideas] Haskell envy
In-Reply-To: <CAMZYqRQE1gAF4o-ne6RXKttR8qNicpRXn5sZgTEGd4KLhZ1O4w@mail.gmail.com>
References: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
	<jn2gbm$fm5$1@dough.gmane.org>
	<CAMZYqRQE1gAF4o-ne6RXKttR8qNicpRXn5sZgTEGd4KLhZ1O4w@mail.gmail.com>
Message-ID: <jn4e24$7ka$1@dough.gmane.org>

On 4/22/2012 11:18 PM, Chris Rebert wrote:
> On Sun, Apr 22, 2012 at 7:55 PM, Terry Reedy<tjreedy at udel.edu>  wrote:
>> On 4/22/2012 9:07 PM, Nestor wrote:
> <snip>
>>> Have the function ArrayAddition(arr) take the array of numbers stored
>>> in arr and print true if any combination of numbers in the array can
>>> be added up to equal the largest number in the array, otherwise print
>>> false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output
>>> should print true because 4 + 6 + 10 + 3 = 23.  The array will not be
>>> empty, will not contain all the same elements, and may contain
>>> negative numbers.
>>
>> Since the order of the numbers is arbitrary and irrelevant to the problem,
>> it should be formulated in terms of a set of numbers.
>
> Er, multiplicity still matters, so it should be a multiset/bag. One
> possible representation thereof would be a list...

Er, yes. Given the examples, I (too quickly) misread 'will not contain 
all the same elements' as 'no duplicates'. In any case, a set was needed 
for the functional version as there is no 'list1 - list2' expression 
that returns list1 minus the items in list2. (Well, I could have 
defined an auxiliary list sub function, but that is beside the point of 
the example.)
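
Such an auxiliary multiset subtraction is easy enough with 
collections.Counter; a minimal sketch (the function name is mine, not 
from the original thread):

```python
from collections import Counter

def multiset_sub(list1, list2):
    """Return list1 minus the items in list2, respecting multiplicity."""
    # Counter subtraction never goes negative; counts drop to zero instead.
    remaining = Counter(list1) - Counter(list2)
    return list(remaining.elements())

assert sorted(multiset_sub([4, 6, 6, 3], [6])) == [3, 4, 6]
```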

-- 
Terry Jan Reedy



From ethan at stoneleaf.us  Mon Apr 23 21:51:19 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 23 Apr 2012 12:51:19 -0700
Subject: [Python-ideas] shutil.launch / open/ startfile
In-Reply-To: <CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
References: <mailman.31.1335175203.9054.python-ideas@python.org>
	<CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
Message-ID: <4F95B2B7.6030409@stoneleaf.us>

Hobson Lane wrote:
> Formatted and finished Rebert's solution to this issue
> 
> http://bugs.python.org/issue3177
> 
> But the question of where to put it is still open ( shutil.open vs. 
> shutil.launch vs. os.startfile ):
> 
> 1. `shutil.open()` will break anyone that does `from shutil import *` or 
> edits the shutil.py file and tries to use the builtin open() after the 
> shutil.open() definition.

`from ... import *` should not be used with any module that does not 
explicitly support it, and shutil does not.

How often do people modify library modules?


> 2. `shutil.launch()` is better than shutil.open() due to reduced 
> breakage, but not as simple or DRY or reverse-compatible as putting it 
> in os.startfile() in my mind. This fix just implements the functionality 
> of os.startfile() for non-Windows OSes.

+1 for shutil.launch()

~Ethan~


From tjreedy at udel.edu  Mon Apr 23 22:30:16 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 23 Apr 2012 16:30:16 -0400
Subject: [Python-ideas] Haskell envy
In-Reply-To: <4F959066.6090504@pearwood.info>
References: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
	<4F959066.6090504@pearwood.info>
Message-ID: <jn4e4p$7ka$2@dough.gmane.org>

On 4/23/2012 1:24 PM, Steven D'Aprano wrote:
> Nestor wrote:
>> The other day a colleague of mine submitted this challenge taken from
>> some website to us coworkers:
>>
>> Have the function ArrayAddition(arr) take the array of numbers stored
>> in arr and print true if any combination of numbers in the array can
>> be added up to equal the largest number in the array, otherwise print
>> false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output
>> should print true because 4 + 6 + 10 + 3 = 23. The array will not be
>> empty, will not contain all the same elements, and may contain
>> negative numbers.
>
> By the way, your solution is wrong. Consider this sample data:
>
> -2, -5, 0, -1, -3
>
> In this case, the Haskell solution should correctly print true, while
> yours will print false, because you skip the empty subset. The empty sum
> equals the maximum value of the set, 0.
>
> The ease at which people can get this wrong is an argument in favour of
> a standard solution.

I call it an argument for writing tests first, starting with the most 
simple and trivial case(s).
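
A test-first version covering Steven's edge case might start like this 
(function name hypothetical; the empty subset is deliberately included 
so the all-negatives-plus-zero input comes out true):

```python
from itertools import chain, combinations

def array_addition(nums):
    """True if some subset (possibly empty) of the remaining numbers
    sums to the largest number in nums."""
    s = sorted(nums)
    target, rest = s[-1], s[:-1]
    # range(len(rest) + 1) includes r == 0, i.e. the empty subset.
    subsets = chain.from_iterable(
        combinations(rest, r) for r in range(len(rest) + 1))
    return any(sum(c) == target for c in subsets)

assert array_addition([4, 6, 23, 10, 1, 3])   # 4 + 6 + 10 + 3 == 23
assert array_addition([-2, -5, 0, -1, -3])    # empty sum == 0, the maximum
```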

-- 
Terry Jan Reedy



From ned at nedbatchelder.com  Tue Apr 24 01:45:37 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Mon, 23 Apr 2012 19:45:37 -0400
Subject: [Python-ideas] Haskell envy
In-Reply-To: <647E82A1-48ED-4025-8195-19982D8BC441@gmail.com>
References: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
	<4F9588C0.2020005@pearwood.info>
	<647E82A1-48ED-4025-8195-19982D8BC441@gmail.com>
Message-ID: <4F95E9A1.3070001@nedbatchelder.com>

On 4/23/2012 3:55 PM, Raymond Hettinger wrote:
>
> On Apr 23, 2012, at 9:52 AM, Steven D'Aprano wrote:
>
>> However, the discoverability of this solution is essentially zero
>
> That exact code has been in the documentation for years:
>
> def powerset(iterable):
>     "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
>     s = list(iterable)
>     return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
>
> The whole purpose of the itertools recipes is to teach how
> the itertools can be readily combined to build new tools.
>
> http://docs.python.org/library/itertools.html#module-itertools
>
>
Raymond's "that code has been in the docs for years" and Steven's "the 
discoverability of this solution is essentially zero" are not 
contradictory.  It sounds like we need a better way to find the 
information in the itertools docs.  For example, there is no index entry 
for "powerset", and I don't know what term Steven tried looking it up 
with.  Perhaps you two could work together to make people more aware 
of the tools we've already got.
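
For the record, the recipe Raymond quotes behaves like this:

```python
from itertools import chain, combinations

def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

# The empty tuple comes first, then subsets in increasing size.
assert list(powerset([1, 2])) == [(), (1,), (2,), (1, 2)]
```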

--Ned.

> Raymond
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120423/99f5f873/attachment.html>

From ericsnowcurrently at gmail.com  Tue Apr 24 08:42:54 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 24 Apr 2012 00:42:54 -0600
Subject: [Python-ideas] sys.implementation
In-Reply-To: <CALFfu7DYyZMUp40MDR9-vhpOkPvr=cwt5EmMHEGTrmix_kZbYg@mail.gmail.com>
References: <CALFfu7DYyZMUp40MDR9-vhpOkPvr=cwt5EmMHEGTrmix_kZbYg@mail.gmail.com>
Message-ID: <CALFfu7B4GpCNgyxLUmtji1LLKLgFPPvR2Zd3e2V6UW4_0HoR7g@mail.gmail.com>

On Mon, Mar 19, 2012 at 10:41 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> In October 2009 there was a short flurry of interest in adding
> "sys.implementation" as an object to encapsulate some
> implementation-specific information [1].  Does anyone recollect where
> this proposal went?  Would anyone object to reviving it (or a
> variant)?

The premise is that sys.implementation would be a "namedtuple" (like
sys.version_info).  It would contain (as much as is practical) the
information that is specific to a particular implementation of Python.
 "Required" attributes of sys.implementation would be those that the
standard library makes use of.  For instance, importlib would make use
of sys.implementation.name (or sys.implementation.cache_tag) if there
were one.  The thread from 2009 covered a lot of this ground already.
[1]

Here are the "required" attributes of sys.implementation that I advocate:

* name (mixed case; sys.implementation.name.lower() used as an identifier)
* version (of the implementation, not of the targeted language
version; structured like sys.version_info?)

Here are other variables that _could_ go in sys.implementation:

* cache_tag (e.g. 'cpython33' for CPython 3.3)
* repository
* repository_revision
* build_toolchain
* url (or website)
* site_prefix
* runtime

Let's start with a minimum set of expected attributes, which would
have an immediate purpose in the stdlib.  However, let's not disallow
implementations from adding whatever other attributes are meaningful
for them.
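
A purely hypothetical sketch of what such an object might look like for
CPython, with only the two "required" attributes advocated above (the
namedtuple shape mirrors sys.version_info):

```python
import sys
from collections import namedtuple

# Hypothetical: attribute set follows the "required" list above.
Implementation = namedtuple('Implementation', ['name', 'version'])

implementation = Implementation(
    name='CPython',                # mixed case; .lower() is the identifier
    version=sys.version_info[:3],  # version of the implementation itself
)

assert implementation.name.lower() == 'cpython'
```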

-eric


[1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html


From p.f.moore at gmail.com  Mon Apr 23 20:50:14 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 23 Apr 2012 19:50:14 +0100
Subject: [Python-ideas] Haskell envy
In-Reply-To: <4F9588C0.2020005@pearwood.info>
References: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
	<4F9588C0.2020005@pearwood.info>
Message-ID: <CACac1F9XGvdeKkoiRnsHpmQGQcD4zfB=N9+jNbxqC6a8McWjNw@mail.gmail.com>

On 23 April 2012 17:52, Steven D'Aprano <steve at pearwood.info> wrote:
> I have mixed ideas on this. On the one hand, subsequences is a natural
> function to include along with permutations and combinations in any
> combinatorics tool set. As you point out, Haskell has it. So does
> Mathematica, under the name "subsets".
>
> But on the other hand, itertools is arguably not the right place for it. If
> we add subsequences, will people then ask for partitions and derangements
> next? Where will it end? At some point the line needs to be drawn, with some
> functions declared "too specialised" for the general itertools module.
>
> (Perhaps there should be a separate combinatorics module.)

It seems to me that you're precisely right here - it's *not* right to
keep adding this sort of thing to itertools, or as you say it will
never end. On the other hand, it's potentially useful, and not
necessarily immediately obvious.

So I'd say it's exactly right for a 3rd party "combinatorics" module
on PyPI. Or if no-one thinks it's sufficiently useful to write and
maintain one, then as a simple named function in your application.
Once your application has a combinatorics module within it, that's the
point where you should think about releasing that module separately...

Of course, I'm arguing theoretically here. I've never even used
itertools.combinations, so I have no real need for *any* of this. If
people who do use this sort of thing on a regular basis have other
opinions, then that's a much stronger argument :-)

Paul.


From ncoghlan at gmail.com  Tue Apr 24 13:37:18 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 24 Apr 2012 21:37:18 +1000
Subject: [Python-ideas] Haskell envy
In-Reply-To: <CACac1F9XGvdeKkoiRnsHpmQGQcD4zfB=N9+jNbxqC6a8McWjNw@mail.gmail.com>
References: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
	<4F9588C0.2020005@pearwood.info>
	<CACac1F9XGvdeKkoiRnsHpmQGQcD4zfB=N9+jNbxqC6a8McWjNw@mail.gmail.com>
Message-ID: <CADiSq7eUEvcEtDSz+JOFeD1FKuXxLoKjXvzAXNXjtS0PAB7E7w@mail.gmail.com>

On Tue, Apr 24, 2012 at 4:50 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> Of course, I'm arguing theoretically here. I've never even used
> itertools.combinations, so I have no real need for *any* of this. If
> people who do use this sort of thing on a regular basis have other
> opinions, then that's a much stronger argument :-)

In practice, you'll be doing prefiltering and other conditioning on
your combinations to weed out nonsensical variants cheaply, or your
combinations will be coming from a database query or map reduce
result. So I personally think it's the kind of thing that comes up in
toy programming exercises and various academic explorations rather
than something that solves a significant practical need.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From techtonik at gmail.com  Tue Apr 24 17:42:24 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 24 Apr 2012 18:42:24 +0300
Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43
In-Reply-To: <CAMZYqRTobyMQrMsNu-8yC8TqxaRjxZD0UWGJNOdmL2AfzSi6Mg@mail.gmail.com>
References: <mailman.31.1335175203.9054.python-ideas@python.org>
	<CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
	<CAMZYqRTobyMQrMsNu-8yC8TqxaRjxZD0UWGJNOdmL2AfzSi6Mg@mail.gmail.com>
Message-ID: <CAPkN8xKFXYu8BssNdqMWqbHPvcnvKroG0kWzMXbOmjXTFp46nQ@mail.gmail.com>

On Mon, Apr 23, 2012 at 3:01 PM, Chris Rebert <pyideas at rebertia.com> wrote:
> On Mon, Apr 23, 2012 at 4:10 AM, Hobson Lane <hobsonlane at gmail.com> wrote:
> <snip>
>> On Mon, Apr 23, 2012 at 6:00 PM, <python-ideas-request at python.org> wrote:
>>>
>>> Send Python-ideas mailing list submissions to
>>>        python-ideas at python.org
>>>
>>> To subscribe or unsubscribe via the World Wide Web, visit
>>>        http://mail.python.org/mailman/listinfo/python-ideas
>>> or, via email, send a message with subject or body 'help' to
>>>        python-ideas-request at python.org
>>>
>>> You can reach the person managing the list at
>>>        python-ideas-owner at python.org
>>>
>>> When replying, please edit your Subject line so it is more specific
>>> than "Re: Contents of Python-ideas digest..."
>
> Please avoid replying to the digest; it breaks conversation threading.
> Switch to a non-digest mailing list subscription when not lurking.

But to reply to a non-digest message you need to have received it in
non-digest mode, which didn't happen here. The only way around that is
to ask Mailman to resend the message; I don't know if that's possible.
--
anatoly t.


From ericsnowcurrently at gmail.com  Tue Apr 24 21:23:53 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 24 Apr 2012 13:23:53 -0600
Subject: [Python-ideas] breaking out of module execution
Message-ID: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>

In a function you can use a return statement to break out of execution
in the middle of the function.  With modules you have no recourse.
This is akin to return statements being allowed only at the end of a
function.

There are a small number of ways you can work around this, but they
aren't great.  This includes using wrapper modules or import hooks or
sometimes from-import-*.  Otherwise, if your module's execution is
conditional, you end up indenting everything inside an if/else
statement.

Proposal: introduce a non-error mechanism to break out of module
execution.  This could be satisfied by a statement like break or
return, though those specific ones could be confusing.  It could also
involve raising a special subclass of ImportError that the import
machinery simply handles as not-an-error.

This came up last year on python-list with mixed results. [1]
However, time has not dimmed the appeal for me so I'm rebooting here.

While the proposal seems relatively minor, the use cases are not
extensive. <wink>  The three main ones I've encountered are these:

1. C extension module with fallback to pure Python:

  try:
      from _extension_module import *
  except ImportError:
      pass
  else:
      break  # or whatever color the bikeshed is

  # pure python implementation goes here

2. module imported under different name:

  if __name__ != "expected_name":
      from expected_name import *
      break

  # business as usual

3. module already imported under a different name:

  if "other_module" in sys.modules:
      exec("from other_module import *", globals())
      break

  # module code here
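
For comparison, use case 1 already works today without new syntax when
the pure-Python code lives in a sibling module; a runnable sketch, where
the fake `_extension_module` stands in for a real C extension (names
hypothetical):

```python
import sys
import types

# Stand-in for the C extension; in real code this module either exists
# (fast build) or is absent (and the pure-Python fallback kicks in).
_fast = types.ModuleType('_extension_module')
_fast.flavour = 'C'
sys.modules['_extension_module'] = _fast

try:
    from _extension_module import flavour   # fast variant, if importable
except ImportError:
    flavour = 'pure Python'                  # fallback implementation

assert flavour == 'C'
```

The cost this proposal is trying to avoid is exactly that split: two real
files instead of one.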

Thoughts?

-eric


[1] http://mail.python.org/pipermail/python-list/2011-June/1274424.html


From solipsis at pitrou.net  Tue Apr 24 21:40:58 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 24 Apr 2012 21:40:58 +0200
Subject: [Python-ideas] breaking out of module execution
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
Message-ID: <20120424214058.22540616@pitrou.net>

On Tue, 24 Apr 2012 13:23:53 -0600
Eric Snow <ericsnowcurrently at gmail.com>
wrote:
> In a function you can use a return statement to break out of execution
> in the middle of the function.  With modules you have no recourse.
> This is akin to return statements being allowed only at the end of a
> function.
> 
> There are a small number of ways you can work around this, but they
> aren't great.  This includes using wrapper modules or import hooks or
> sometimes from-import-*.  Otherwise, if your module's execution is
> conditional, you end up indenting everything inside an if/else
> statement.

I think good practice should lead you to put your initialization code
in a dedicated function that you call from your module toplevel. In
this case, breaking out of execution is a matter of adding a return
statement.

I'm not sure the particular use cases you brought up are a good enough
reason to add a syntactical construct.

Regards

Antoine.




From mal at egenix.com  Tue Apr 24 21:58:35 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 24 Apr 2012 21:58:35 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <20120424214058.22540616@pitrou.net>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
Message-ID: <4F9705EB.1070702@egenix.com>

Antoine Pitrou wrote:
> On Tue, 24 Apr 2012 13:23:53 -0600
> Eric Snow <ericsnowcurrently at gmail.com>
> wrote:
>> In a function you can use a return statement to break out of execution
>> in the middle of the function.  With modules you have no recourse.
>> This is akin to return statements being allowed only at the end of a
>> function.
>>
>> There are a small number of ways you can work around this, but they
>> aren't great.  This includes using wrapper modules or import hooks or
>> sometimes from-import-*.  Otherwise, if your module's execution is
>> conditional, you end up indenting everything inside an if/else
>> statement.
> 
> I think good practice should lead you to put your initialization code
> in a dedicated function that you call from your module toplevel. In
> this case, breaking out of execution is a matter of adding a return
> statement.

True, but that doesn't prevent import from being run, functions and
classes from being defined and resources being bound which are not
going to get used.

Think of code like this (let's assume the "break" statement is used
for stopping module execution):

"""
#
# MyModule
#

### Try using the fast variant

try:
    from MyModule_C_Extension import *
except ImportError:
    pass
else:
    # Stop execution of the module code object right here
    break

### Ah, well, so go ahead with the slow version

import os, sys
from MyOtherPackage import foo, bar, baz

class MyClass:
    ...

def MyFunc(a,b,c):
    ...

def main():
    ...

if __name__ == '__main__':
    main()
"""

You can solve this by using two separate modules and a top-level
module to switch between the implementations, but that's cumbersome
if you have more than just a few of such modules in a package.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Apr 24 2012)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2012-04-28: PythonCamp 2012, Cologne, Germany               4 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From g.brandl at gmx.net  Tue Apr 24 22:11:53 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 24 Apr 2012 22:11:53 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F9705EB.1070702@egenix.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
Message-ID: <jn71dc$3on$1@dough.gmane.org>

On 24.04.2012 21:58, M.-A. Lemburg wrote:
> Antoine Pitrou wrote:
>> On Tue, 24 Apr 2012 13:23:53 -0600
>> Eric Snow <ericsnowcurrently at gmail.com>
>> wrote:
>>> In a function you can use a return statement to break out of execution
>>> in the middle of the function.  With modules you have no recourse.
>>> This is akin to return statements being allowed only at the end of a
>>> function.
>>>
>>> There are a small number of ways you can work around this, but they
>>> aren't great.  This includes using wrapper modules or import hooks or
>>> sometimes from-import-*.  Otherwise, if your module's execution is
>>> conditional, you end up indenting everything inside an if/else
>>> statement.
>> 
>> I think good practice should lead you to put your initialization code
>> in a dedicated function that you call from your module toplevel. In
>> this case, breaking out of execution is a matter of adding a return
>> statement.
> 
> True, but that doesn't prevent import from being run, functions and
> classes from being defined and resources being bound which are not
> going to get used.

What's wrong with an if statement on module level, if you even care
about this?

> Think of code like this (let's assume the "break" statement is used
> for stopping module execution):
> 
> """
> #
> # MyModule
> #
> 
> ### Try using the fast variant
> 
> try:
>     from MyModule_C_Extension import *
> except ImportError:
>     pass
> else:
>     # Stop execution of the module code object right here
>     break
> 
> ### Ah, well, so go ahead with the slow version
> 
> import os, sys
> from MyOtherPackage import foo, bar, baz
> 
> class MyClass:
>     ...
> 
> def MyFunc(a,b,c):
>     ...
> 
> def main():
>     ...
> 
> if __name__ == '__main__':
>     main()
> """

There's a subtle bug here that shows that the proposed feature has its
awkward points:  you probably want to execute the "if __name__ == '__main__'"
block in the C extension case as well.

Georg



From mark at hotpy.org  Tue Apr 24 22:15:57 2012
From: mark at hotpy.org (Mark Shannon)
Date: Tue, 24 Apr 2012 21:15:57 +0100
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F9705EB.1070702@egenix.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com>
Message-ID: <4F9709FD.4060901@hotpy.org>


This has to be one of the few feature requests that can be 
implemented by removing code :)

I implemented this by deleting 2 lines of code from the compiler.

M.-A. Lemburg wrote:
> Antoine Pitrou wrote:
>> On Tue, 24 Apr 2012 13:23:53 -0600
>> Eric Snow <ericsnowcurrently at gmail.com>
>> wrote:
>>> In a function you can use a return statement to break out of execution
>>> in the middle of the function.  With modules you have no recourse.
>>> This is akin to return statements being allowed only at the end of a
>>> function.
>>>
>>> There are a small number of ways you can work around this, but they
>>> aren't great.  This includes using wrapper modules or import hooks or
>>> sometimes from-import-*.  Otherwise, if your module's execution is
>>> conditional, you end up indenting everything inside an if/else
>>> statement.
>> I think good practice should lead you to put your initialization code
>> in a dedicated function that you call from your module toplevel. In
>> this case, breaking out of execution is a matter of adding a return
>> statement.
> 
> True, but that doesn't prevent import from being run, functions and
> classes from being defined and resources being bound which are not
> going to get used.
> 
> Think of code like this (let's assume the "break" statement is used
> for stopping module execution):
> 
> """
> #
> # MyModule
> #
> 
> ### Try using the fast variant
> 
> try:
>     from MyModule_C_Extension import *
> except ImportError:
>     pass
> else:
>     # Stop execution of the module code object right here
>     break

It will have to be "return" not "break".

> 
> ### Ah, well, so go ahead with the slow version
> 
> import os, sys
> from MyOtherPackage import foo, bar, baz
> 
> class MyClass:
>     ...
> 
> def MyFunc(a,b,c):
>     ...
> 
> def main():
>     ...
> 
> if __name__ == '__main__':
>     main()
> """
> 
> You can solve this by using two separate modules and a top-level
> module to switch between the implementations, but that's cumbersome
> if you have more than just a few of such modules in a package.
> 



From mal at egenix.com  Tue Apr 24 22:20:38 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 24 Apr 2012 22:20:38 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <jn71dc$3on$1@dough.gmane.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
Message-ID: <4F970B16.2020702@egenix.com>

Georg Brandl wrote:
> On 24.04.2012 21:58, M.-A. Lemburg wrote:
>> Antoine Pitrou wrote:
>>> On Tue, 24 Apr 2012 13:23:53 -0600
>>> Eric Snow <ericsnowcurrently at gmail.com>
>>> wrote:
>>>> In a function you can use a return statement to break out of execution
>>>> in the middle of the function.  With modules you have no recourse.
>>>> This is akin to return statements being allowed only at the end of a
>>>> function.
>>>>
>>>> There are a small number of ways you can work around this, but they
>>>> aren't great.  This includes using wrapper modules or import hooks or
>>>> sometimes from-import-*.  Otherwise, if your module's execution is
>>>> conditional, you end up indenting everything inside an if/else
>>>> statement.
>>>
>>> I think good practice should lead you to put your initialization code
>>> in a dedicated function that you call from your module toplevel. In
>>> this case, breaking out of execution is a matter of adding a return
>>> statement.
>>
>> True, but that doesn't prevent import from being run, functions and
>> classes from being defined and resources being bound which are not
>> going to get used.
> 
> What's wrong with an if statement on module level, if you even care
> about this?

You'd have to indent the whole module. Been there, done that, doesn't
look nice :-)

>> Think of code like this (let's assume the "break" statement is used
>> for stopping module execution):
>>
>> """
>> #
>> # MyModule
>> #
>>
>> ### Try using the fast variant
>>
>> try:
>>     from MyModule_C_Extension import *
>> except ImportError:
>>     pass
>> else:
>>     # Stop execution of the module code object right here
>>     break
>>
>> ### Ah, well, so go ahead with the slow version
>>
>> import os, sys
>> from MyOtherPackage import foo, bar, baz
>>
>> class MyClass:
>>     ...
>>
>> def MyFunc(a,b,c):
>>     ...
>>
>> def main():
>>     ...
>>
>> if __name__ == '__main__':
>>     main()
>> """
> 
> There's a subtle bug here that shows that the proposed feature has its
> awkward points:  you probably want to execute the "if __name__ == '__main__'"
> block in the C extension case as well.

No, you don't :-) If you would have wanted that to happen, you'd
put the "if __name__..." into the else: branch.

You think of the "break" as having the same functionality as a "return"
in a function.

If reusing a statement is too much trouble, the same functionality
could be had with an exception that gets caught by the executing
(import) code.
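
A hypothetical sketch of that exception-based variant, using exec() to
play the role of the import machinery (the exception name is invented
for illustration):

```python
# Proposed: a special ImportError subclass that the import machinery
# would swallow, ending module execution cleanly rather than failing.
class StopModuleExecution(ImportError):
    pass

module_source = """\
x = 1
raise StopModuleExecution   # proposed: treated as a clean early exit
x = 2                       # never reached
"""

namespace = {'StopModuleExecution': StopModuleExecution}
try:
    exec(module_source, namespace)   # stands in for executing the module
except StopModuleExecution:
    pass                             # the importer would treat this as success

assert namespace['x'] == 1
```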

-- 
Marc-Andre Lemburg
eGenix.com



From g.brandl at gmx.net  Tue Apr 24 22:29:28 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 24 Apr 2012 22:29:28 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F970B16.2020702@egenix.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com>
Message-ID: <jn72eb$el9$1@dough.gmane.org>

On 24.04.2012 22:20, M.-A. Lemburg wrote:

>>> Think of code like this (let's assume the "break" statement is used
>>> for stopping module execution):
>>>
>>> """
>>> #
>>> # MyModule
>>> #
>>>
>>> ### Try using the fast variant
>>>
>>> try:
>>>     from MyModule_C_Extension import *
>>> except ImportError:
>>>     pass
>>> else:
>>>     # Stop execution of the module code object right here
>>>     break
>>>
>>> ### Ah, well, so go ahead with the slow version
>>>
>>> import os, sys
>>> from MyOtherPackage import foo, bar, baz
>>>
>>> class MyClass:
>>>     ...
>>>
>>> def MyFunc(a,b,c):
>>>     ...
>>>
>>> def main():
>>>     ...
>>>
>>> if __name__ == '__main__':
>>>     main()
>>> """
>> 
>> There's a subtle bug here that shows that the proposed feature has its
>> awkward points:  you probably want to execute the "if __name__ == '__main__'"
>> block in the C extension case as well.
> 
> No, you don't :-) If you would have wanted that to happen, you'd
> put the "if __name__..." into the else: branch.

Not sure I understand.  Your example code is flawed because it doesn't execute
the main() for the C extension case.  Of course you can duplicate the code in
the else branch, but you didn't do it in the first place, which was the bug.

Georg



From mal at egenix.com  Tue Apr 24 22:38:14 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 24 Apr 2012 22:38:14 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <jn72eb$el9$1@dough.gmane.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com> <jn72eb$el9$1@dough.gmane.org>
Message-ID: <4F970F36.2070208@egenix.com>

Georg Brandl wrote:
> On 24.04.2012 22:20, M.-A. Lemburg wrote:
> 
>>>> Think of code like this (let's assume the "break" statement is used
>>>> for stopping module execution):
>>>>
>>>> """
>>>> #
>>>> # MyModule
>>>> #
>>>>
>>>> ### Try using the fast variant
>>>>
>>>> try:
>>>>     from MyModule_C_Extension import *
>>>> except ImportError:
>>>>     pass
>>>> else:
>>>>     # Stop execution of the module code object right here
>>>>     break
>>>>
>>>> ### Ah, well, so go ahead with the slow version
>>>>
>>>> import os, sys
>>>> from MyOtherPackage import foo, bar, baz
>>>>
>>>> class MyClass:
>>>>     ...
>>>>
>>>> def MyFunc(a,b,c):
>>>>     ...
>>>>
>>>> def main():
>>>>     ...
>>>>
>>>> if __name__ == '__main__':
>>>>     main()
>>>> """
>>>
>>> There's a subtle bug here that shows that the proposed feature has its
>>> awkward points:  you probably want to execute the "if __name__ == '__main__'"
>>> block in the C extension case as well.
>>
>> No, you don't :-) If you would have wanted that to happen, you'd
>> put the "if __name__..." into the else: branch.
> 
> Not sure I understand.  Your example code is flawed because it doesn't execute
> the main() for the C extension case.  Of course you can duplicate the code in
> the else branch, but you didn't do it in the first place, which was the bug.

Ok, you got me :-) Should've paid more attention.

-- 
Marc-Andre Lemburg
eGenix.com



From ethan at stoneleaf.us  Tue Apr 24 22:43:44 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 24 Apr 2012 13:43:44 -0700
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <jn72eb$el9$1@dough.gmane.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>	<20120424214058.22540616@pitrou.net>	<4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org>	<4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org>
Message-ID: <4F971080.4000303@stoneleaf.us>

Georg Brandl wrote:
> On 24.04.2012 22:20, M.-A. Lemburg wrote:
> 
>>>> Think of code like this (let's assume the "break" statement is used
>>>> for stopping module execution):
>>>>
>>>> """
>>>> #
>>>> # MyModule
>>>> #
>>>>
>>>> ### Try using the fast variant
>>>>
>>>> try:
>>>>     from MyModule_C_Extension import *
>>>> except ImportError:
>>>>     pass
>>>> else:
>>>>     # Stop execution of the module code object right here
>>>>     break
>>>>
>>>> ### Ah, well, so go ahead with the slow version
>>>>
>>>> import os, sys
>>>> from MyOtherPackage import foo, bar, baz
>>>>
>>>> class MyClass:
>>>>     ...
>>>>
>>>> def MyFunc(a,b,c):
>>>>     ...
>>>>
>>>> def main():
>>>>     ...
>>>>
>>>> if __name__ == '__main__':
>>>>     main()
>>>> """
>>> There's a subtle bug here that shows that the proposed feature has its
>>> awkward points:  you probably want to execute the "if __name__ == '__main__'"
>>> block in the C extension case as well.
>> No, you don't :-) If you would have wanted that to happen, you'd
>> put the "if __name__..." into the else: branch.
> 
> Not sure I understand.  Your example code is flawed because it doesn't execute
> the main() for the C extension case.  Of course you can duplicate the code in
> the else branch, but you didn't do it in the first place, which was the bug.

It's only a bug if you *want* main() to execute in the C extension case 
-- so for M.A.L. it's not a bug (apparently he meant "I" when he wrote 
"you" ;)

~Ethan~


From fuzzyman at gmail.com  Tue Apr 24 23:45:28 2012
From: fuzzyman at gmail.com (Michael Foord)
Date: Tue, 24 Apr 2012 22:45:28 +0100
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
Message-ID: <CAKCKLWzHipaWUx9=T3o3zqXvR77VS+m9sS_-0gTpR4ktODuLmA@mail.gmail.com>

On 24 April 2012 20:23, Eric Snow <ericsnowcurrently at gmail.com> wrote:

> In a function you can use a return statement to break out of execution
> in the middle of the function.  With modules you have no recourse.
> This is akin to return statements being allowed only at the end of a
> function.
>
> There are a small number of ways you can work around this, but they
> aren't great.  This includes using wrapper modules or import hooks or
> sometimes from-import-*.  Otherwise, if your module's execution is
> conditional, you end up indenting everything inside an if/else
> statement.
>
> Proposal: introduce a non-error mechanism to break out of module
> execution.  This could be satisfied by a statement like break or
> return, though those specific ones could be confusing.  It could also
> involve raising a special subclass of ImportError that the import
> machinery simply handles as not-an-error.
>
> This came up last year on python-list with mixed results. [1]
> However, time has not dimmed the appeal for me so I'm rebooting here.
>
>

For what it's worth I've wanted this a couple of times. There are always
workarounds of course (but not particularly pretty sometimes).

Michael



> While the proposal seems relatively minor, the use cases are not
> extensive. <wink>  The three main ones I've encountered are these:
>
> 1. C extension module with fallback to pure Python:
>
>  try:
>      from _extension_module import *
>  except ImportError:
>      pass
>  else:
>      break  # or whatever color the bikeshed is
>
>  # pure python implementation goes here
>
> 2. module imported under different name:
>
>  if __name__ != "expected_name":
>      from expected_name import *
>      break
>
>  # business as usual
>
> 3. module already imported under a different name:
>
>  if "other_module" in sys.modules:
>      exec("from other_module import *", globals())
>      break
>
>  # module code here
>
> Thoughts?
>
> -eric
>
>
> [1] http://mail.python.org/pipermail/python-list/2011-June/1274424.html
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 

http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120424/09111968/attachment.html>

From mwm at mired.org  Wed Apr 25 00:45:47 2012
From: mwm at mired.org (Mike Meyer)
Date: Tue, 24 Apr 2012 18:45:47 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43
In-Reply-To: <CAPkN8xKFXYu8BssNdqMWqbHPvcnvKroG0kWzMXbOmjXTFp46nQ@mail.gmail.com>
References: <mailman.31.1335175203.9054.python-ideas@python.org>
	<CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
	<CAMZYqRTobyMQrMsNu-8yC8TqxaRjxZD0UWGJNOdmL2AfzSi6Mg@mail.gmail.com>
	<CAPkN8xKFXYu8BssNdqMWqbHPvcnvKroG0kWzMXbOmjXTFp46nQ@mail.gmail.com>
Message-ID: <20120424184547.0bca79f5@bhuda.mired.org>

On Tue, 24 Apr 2012 18:42:24 +0300
anatoly techtonik <techtonik at gmail.com> wrote:

> On Mon, Apr 23, 2012 at 3:01 PM, Chris Rebert <pyideas at rebertia.com> wrote:
> > On Mon, Apr 23, 2012 at 4:10 AM, Hobson Lane <hobsonlane at gmail.com> wrote:
> > <snip>
> >> On Mon, Apr 23, 2012 at 6:00 PM, <python-ideas-request at python.org> wrote:
> >>>
> >>> Send Python-ideas mailing list submissions to
> >>>         python-ideas at python.org
> >>>
> >>> To subscribe or unsubscribe via the World Wide Web, visit
> >>>         http://mail.python.org/mailman/listinfo/python-ideas
> >>> or, via email, send a message with subject or body 'help' to
> >>>         python-ideas-request at python.org
> >>>
> >>> You can reach the person managing the list at
> >>>         python-ideas-owner at python.org
> >>>
> >>> When replying, please edit your Subject line so it is more specific
> >>> than "Re: Contents of Python-ideas digest..."
> >
> > Please avoid replying to the digest; it breaks conversation threading.
> > Switch to a non-digest mailing list subscription when not lurking.
> 
> But to reply to a non-digest message you need to receive it in
> non-digest mode, which didn't happen here. The only way it makes
> sense is if you ask Mailman to resend the message. I don't
> know if that's possible.

Your initial statement is - or at least may be - wrong. If the digest
is in one of the well-known formats, a good MUA will let you burst a
digest into individual messages and reply to them just like any other
message to a list. mh, nmh, and GUIs built on top of them can do this.

I haven't subscribed to a digest of any kind in years, so don't even
know if any of my current mail readers can do that. The reason I don't
is a different solution to the same problem: current mail agents are
sufficiently powerful that I can automatically mark/tag/file all messages
to a list and then deal with them outside the normal flow of email,
just like a digest.

There are other, more esoteric solutions (like the formail program),
available as well.

      <mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From hobsonlane at gmail.com  Wed Apr 25 01:10:49 2012
From: hobsonlane at gmail.com (Hobson Lane)
Date: Wed, 25 Apr 2012 07:10:49 +0800
Subject: [Python-ideas] Haskell envy (Terry Reedy)
Message-ID: <CACZ_DoeP+GsWmDYNSFvuSFR+4tVgcM3r2FJHwUvTYSmG1Sdqcw@mail.gmail.com>

> On 4/22/2012 11:18 PM, Chris Rebert wrote:
> > On Sun, Apr 22, 2012 at 7:55 PM, Terry Reedy<tjreedy at udel.edu>  wrote:
> >> On 4/22/2012 9:07 PM, Nestor wrote:
> > <snip>
> >>> false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output
> >>> should print true because 4 + 6 + 10 + 3 = 23.



> >> Since the order of the numbers is arbitrary and irrelevant to the
> problem,
> >> it should be formulated in term of a set of numbers.
> >
> > Er, multiplicity still matters, so it should be a multiset/bag. One
> > possible representation thereof would be a list...
>
> Er, yes. Given the examples, I (too quickly) misread 'will not contain
> all the same elements' as 'no duplicates'. In any case, a set was needed
>

And doesn't ordering matter too (for efficiency)? A sorted list of
positive integers may solve in much less time, right?
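The puzzle under discussion (does some element equal the sum of a subset of the other elements?) can be brute-forced over the powerset, which is the approach the original poster's powerset-iterator request implies. A sketch of that brute force, with a function name invented for illustration:

```python
from itertools import combinations

def has_subset_summing_to_member(nums):
    """Return True if some element equals the sum of a subset of the
    remaining elements, checked by brute force over all subsets."""
    for i, target in enumerate(nums):
        rest = nums[:i] + nums[i + 1:]
        for r in range(1, len(rest) + 1):
            if any(sum(c) == target for c in combinations(rest, r)):
                return True
    return False
```

For `[4, 6, 23, 10, 1, 3]` this finds 4 + 6 + 10 + 3 = 23, matching the thread's example; the cost is exponential in the list length, which is what motivates the sorting/pruning discussion below.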
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120425/c56a20b2/attachment.html>

From tjreedy at udel.edu  Wed Apr 25 02:09:18 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 24 Apr 2012 20:09:18 -0400
Subject: [Python-ideas] Haskell envy (Terry Reedy)
In-Reply-To: <CACZ_DoeP+GsWmDYNSFvuSFR+4tVgcM3r2FJHwUvTYSmG1Sdqcw@mail.gmail.com>
References: <CACZ_DoeP+GsWmDYNSFvuSFR+4tVgcM3r2FJHwUvTYSmG1Sdqcw@mail.gmail.com>
Message-ID: <jn7fbi$sit$1@dough.gmane.org>

On 4/24/2012 7:10 PM, Hobson Lane wrote:

> And doesn't ordering matter too (for efficiency)? A sorted list of
> positive integers may solve in much less time, right?

The OP gave the brute-force solution of testing all subsets as a 
justification for adding a powerset iterator. With negative numbers 
allowed (as in this problem), that may be the best one can do.

But if negatives are excluded, sorting allows a pruning strategy. For 
instance, all subsets can be separated into those with and without the 
2nd highest. For those with the 2nd highest, one need only consider the 
initial slice of numbers <= hi - 2nd_hi. If two numbers add to more than 
the target, then all larger subsets containing the pair can be excluded 
and not generated and summed. One can apply this idea recursively. But 
there is no guarantee of any saving.
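One way to sketch the general sort-and-prune idea for positive integers (this is a generic branch-and-bound subset-sum search, not necessarily Terry's exact strategy): sort descending, precompute suffix totals, and cut a branch when the remaining numbers cannot possibly reach the target.

```python
def subset_sums_to(nums, target):
    """Can some subset of the positive integers in nums sum to target?
    Sorting descending plus suffix totals lets us prune branches early."""
    nums = sorted(nums, reverse=True)
    # suffix_total[i] = sum of nums[i:], used as an upper bound.
    suffix_total = [0] * (len(nums) + 1)
    for i in range(len(nums) - 1, -1, -1):
        suffix_total[i] = suffix_total[i + 1] + nums[i]

    def search(i, remaining):
        if remaining == 0:
            return True
        if i == len(nums) or suffix_total[i] < remaining:
            return False  # prune: not enough left to reach the target
        if nums[i] <= remaining and search(i + 1, remaining - nums[i]):
            return True   # branch that takes nums[i]
        return search(i + 1, remaining)  # branch that skips nums[i]

    return search(0, target)
```

As Terry notes, the pruning helps on many inputs but offers no worst-case guarantee; subset sum remains NP-hard.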

-- 
Terry Jan Reedy



From tjreedy at udel.edu  Wed Apr 25 02:17:21 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 24 Apr 2012 20:17:21 -0400
Subject: [Python-ideas] Haskell envy (Terry Reedy)
In-Reply-To: <jn7fbi$sit$1@dough.gmane.org>
References: <CACZ_DoeP+GsWmDYNSFvuSFR+4tVgcM3r2FJHwUvTYSmG1Sdqcw@mail.gmail.com>
	<jn7fbi$sit$1@dough.gmane.org>
Message-ID: <jn7fqm$vft$1@dough.gmane.org>

On 4/24/2012 8:09 PM, Terry Reedy wrote:
> On 4/24/2012 7:10 PM, Hobson Lane wrote:
>
>> And doesn't ordering matter too (for efficiency). A sorted list of the
>> positive integers may solve in much less less time, right?
>
> The OP gave the brute-force solution of testing all subsets as a
> justification for adding a powerset iterator. With negative numbers
> allowed (as in this problem), that may be the best one can do.
>
> But if negatives are excluded, sorting allows a pruning strategy. For
> instance, all subsets can be separated into those with and without the
> 2nd highest. For those with the 2nd highest, one need only consider the
> initial slice of numbers <= hi - 2nd_hi. If two numbers add to more than
> the target, then all larger subsets containing the pair can be excluded
> and not generated and summed. One can apply this idea recursively. But
> there is no guarantee of any saving.

As an algorithm question and answer, this is off topic. However, it 
reinforces Nick's claim that powerset() is not too useful because
"In practice, you'll be doing prefiltering and other conditioning on
your combinations to weed out nonsensical variants cheaply,"

-- 
Terry Jan Reedy



From steve at pearwood.info  Wed Apr 25 02:40:32 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 25 Apr 2012 10:40:32 +1000
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F9705EB.1070702@egenix.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com>
Message-ID: <4F974800.5000301@pearwood.info>

M.-A. Lemburg wrote:
> Antoine Pitrou wrote:

>> I think good practice should lead you to put your initialization code
>> in a dedicated function that you call from your module toplevel. In
>> this case, breaking out of execution is a matter of adding a return
>> statement.
> 
> True, but that doesn't prevent import from being run, functions and
> classes from being defined and resources being bound which are not
> going to get used.

Premature micro-optimizations. Just starting the interpreter runs imports, 
defines functions and classes, and binds resources which your script may never 
use. Lists and dicts are over-allocated with space you may never need. Ints and 
strings are boxed. Python is a language that takes Knuth seriously:

"We should forget about small efficiencies, say about 97% of the time: 
premature optimization is the root of all evil."

In the majority of cases, a few more small inefficiencies, especially those 
that are one-off costs at startup, are not a big deal.

In those cases where it is a big deal, you can place code inside if-blocks, 
factor it out into separate modules, or use delayed execution by putting it 
inside functions. It may not be quite so pretty, but it gets the job done, and 
it requires no new features.


> Think of code like this (let's assume the "break" statement is used
> for stopping module execution):
> 
> """
> #
> # MyModule
> #
> 
> ### Try using the fast variant
> 
> try:
>     from MyModule_C_Extension import *
> except ImportError:
>     pass
> else:
>     # Stop execution of the module code object right here
>     break
> 
> ### Ah, well, so go ahead with the slow version
> 
> import os, sys
> from MyOtherPackage import foo, bar, baz
> 
> class MyClass:
>     ...
> 
> def MyFunc(a,b,c):
>     ...
> 
> def main():
>     ...
> 
> if __name__ == '__main__':
>     main()
> """
> 
> You can solve this by using two separate modules and a top-level
> module to switch between the implementations, but that's cumbersome
> if you have more than just a few of such modules in a package.


You can also solve it by defining the slow version first, as a fall-back, then 
replacing it with the fast version, if it exists:

import os, sys
from MyOtherPackage import foo, bar, baz

class MyClass:  ...
def MyFunc(a,b,c):  ...

def main():  ...

try:
     from MyModule_C_Extension import *
except ImportError:
     pass

if __name__ == '__main__':
     main()



The advantage of this is that MyModule_C_Extension doesn't need to supply 
everything or nothing. It may only supply (say) MyClass, and MyFunc will 
naturally fall back on the pre-defined Python versions.

This is the idiom used by at least the bisect and pickle modules, and I think 
it is the Pythonic way.
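A self-contained sketch of that idiom, runnable as-is (the accelerator module name `_myaccel` is hypothetical, so the import is expected to fail and the pure-Python definition survives):

```python
def insort(seq, item):
    """Pure-Python fallback, defined first: insert item into a sorted
    list, keeping it sorted."""
    pos = 0
    for i, x in enumerate(seq):
        if x > item:
            break
        pos = i + 1
    seq.insert(pos, item)

try:
    from _myaccel import insort  # hypothetical accelerated version
except ImportError:
    pass  # keep the pure-Python definition above
```

This is the same shape as bisect's relationship to `_bisect`: the star-import (or named import) silently replaces only the names the extension actually provides.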



-- 
Steven


From steve at pearwood.info  Wed Apr 25 02:55:07 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 25 Apr 2012 10:55:07 +1000
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F970F36.2070208@egenix.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>	<20120424214058.22540616@pitrou.net>	<4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org>	<4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
Message-ID: <4F974B6B.3040208@pearwood.info>

M.-A. Lemburg wrote:
> Georg Brandl wrote:
>> On 24.04.2012 22:20, M.-A. Lemburg wrote:

>>>> There's a subtle bug here that shows that the proposed feature has its
>>>> awkward points:  you probably want to execute the "if __name__ == '__main__'"
>>>> block in the C extension case as well.
>>> No, you don't :-) If you would have wanted that to happen, you'd
>>> put the "if __name__..." into the else: branch.
>> Not sure I understand.  Your example code is flawed because it doesn't execute
>> the main() for the C extension case.  Of course you can duplicate the code in
>> the else branch, but you didn't do it in the first place, which was the bug.
> 
> Ok, you got me :-) Should've paid more attention.


I think you have inadvertently demonstrated that this proposed feature is 
hard to use correctly. Possibly even harder to use than existing idioms for 
solving the problems this is meant to solve. If the user does use it, they 
will likely need to duplicate code, which encourages copy-and-paste programming.

Even if break at the module level is useful on rare occasions, I think the 
usefulness is far outweighed by the costs:

- hard to use correctly, hence code using this feature risks being buggy
- encourages premature micro-optimization, or at least the illusion of
   optimization
- encourages or requires duplicate code and copy-and-paste programming
- complicates the top-level program flow

Today, if you successfully import a module, you know that all the top-level 
code in that module was executed. If this feature is added, you cannot be sure 
what top-level code was reached unless you scan through all the code above it.

In my opinion, this is an attractive nuisance.

-1 on the feature.



-- 
Steven



From nathan.alexander.rice at gmail.com  Wed Apr 25 04:03:43 2012
From: nathan.alexander.rice at gmail.com (Nathan Rice)
Date: Tue, 24 Apr 2012 22:03:43 -0400
Subject: [Python-ideas] Haskell envy
In-Reply-To: <CADiSq7eUEvcEtDSz+JOFeD1FKuXxLoKjXvzAXNXjtS0PAB7E7w@mail.gmail.com>
References: <CAFJiHS-=g6QFrG_zWkRAx4AB8kYG6PRr3vL+XqAV4ku1gqEwOA@mail.gmail.com>
	<4F9588C0.2020005@pearwood.info>
	<CACac1F9XGvdeKkoiRnsHpmQGQcD4zfB=N9+jNbxqC6a8McWjNw@mail.gmail.com>
	<CADiSq7eUEvcEtDSz+JOFeD1FKuXxLoKjXvzAXNXjtS0PAB7E7w@mail.gmail.com>
Message-ID: <CAOFbRm+uwYTh9pOEq7gquz0xNDARkPhF3o7eUpcqR=9=mqEyiw@mail.gmail.com>

Why not repeat the same practice as is currently in place for range,
enumerate, etc.?  Take 3 arguments: the iterables, the repeat minimum
size, and the repeat maximum size.  If the third argument is None, just
treat the minimum as the maximum (yielding the current behavior).

Most things in Python have gobs of keyword arguments and most of
itertools has relatively few.  I don't think it is going to harm
anyone to give them a few more toys to play with.
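The min/max-size variant being suggested could be sketched as a small wrapper over itertools (the function name and signature below are illustrative, not an actual itertools API):

```python
from itertools import chain, combinations

def combinations_sized(iterable, min_size, max_size=None):
    """Yield combinations whose length runs from min_size to max_size.
    With max_size=None, treat the minimum as the maximum, i.e. behave
    like plain combinations(iterable, min_size)."""
    pool = tuple(iterable)
    if max_size is None:
        max_size = min_size
    return chain.from_iterable(
        combinations(pool, r) for r in range(min_size, max_size + 1))
```

With `min_size=0` and `max_size=len(pool)` this degenerates into the powerset recipe from the itertools documentation.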


From ncoghlan at gmail.com  Wed Apr 25 07:33:54 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 25 Apr 2012 15:33:54 +1000
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F974B6B.3040208@pearwood.info>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
Message-ID: <CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>

On Wed, Apr 25, 2012 at 10:55 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> In my opinion, this is an attractive nuisance.
>
> -1 on the feature.

Agreed (and my preferred idiom for all the cited cases is also "always
define the Python version, override at the end with the accelerated
version").

Although, if we *did* do it, I think allowing "return" at module level
would be the way to proceed (as Mark Shannon noted, that's only
*disallowed* now because the compiler specifically prevents it. The
eval loop itself understands it just fine and ceases execution as soon
as it encounters the relevant bytecode. It isn't quite as simple as
just deleting those lines though, since we likely still wouldn't want
to allow return statements in class bodies).
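The compiler-level restriction Nick describes is easy to observe from Python itself (a small CPython sketch): `return` compiles inside a function but is rejected at module level before any bytecode is produced.

```python
# 'return' inside a function body compiles fine...
compile("def f():\n    return 1", "<test>", "exec")

# ...but at module level the compiler rejects it outright, so the
# eval loop never even sees the RETURN_VALUE bytecode.
try:
    compile("return 1", "<test>", "exec")
except SyntaxError as exc:
    print("rejected:", exc.msg)
```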

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From greg.ewing at canterbury.ac.nz  Wed Apr 25 09:58:34 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 25 Apr 2012 19:58:34 +1200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
Message-ID: <4F97AEAA.90009@canterbury.ac.nz>

Nick Coghlan wrote:
> It isn't quite as simple as
> just deleting those lines though, since we likely still wouldn't want
> to allow return statements in class bodies.

I'm sure there's someone out there with a twisted enough
mind to think of a use for that...

-- 
Greg


From mark at hotpy.org  Wed Apr 25 10:11:10 2012
From: mark at hotpy.org (Mark Shannon)
Date: Wed, 25 Apr 2012 09:11:10 +0100
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97AEAA.90009@canterbury.ac.nz>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com>	<jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com>	<jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com>	<4F974B6B.3040208@pearwood.info>	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97AEAA.90009@canterbury.ac.nz>
Message-ID: <4F97B19E.7000604@hotpy.org>

Greg Ewing wrote:
> Nick Coghlan wrote:
>> It isn't quite as simple as
>> just deleting those lines though, since we likely still wouldn't want
>> to allow return statements in class bodies.
> 
> I'm sure there's someone out there with a twisted enough
> mind to think of a use for that...
> 

It's not that twisted.

class C:

    def basic_feature(self):
        ...

    if use_simple_api:
        return

    def advanced_feature(self):
        ...

Not that this is a good use, but it is a use :)

Cheers,
Mark.



From anacrolix at gmail.com  Wed Apr 25 10:24:30 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Wed, 25 Apr 2012 16:24:30 +0800
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97B19E.7000604@hotpy.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97AEAA.90009@canterbury.ac.nz> <4F97B19E.7000604@hotpy.org>
Message-ID: <CAB4yi1M4S=J9sHQysBm6e1d6Lau6MY2aDNj-TQn+xcqrH72D-w@mail.gmail.com>

if not use_simple_api:
    class C:
On Apr 25, 2012 4:19 PM, "Mark Shannon" <mark at hotpy.org> wrote:

> Greg Ewing wrote:
>
>> Nick Coghlan wrote:
>>
>>> It isn't quite as simple as
>>> just deleting those lines though, since we likely still wouldn't want
>>> to allow return statements in class bodies.
>>>
>>
>> I'm sure there's someone out there with a twisted enough
>> mind to think of a use for that...
>>
>>
> It's not that twisted.
>
> class C:
>
>   def basic_feature(self):
>       ...
>
>   if use_simple_api:
>       return
>
>   def advanced_feature(self):
>       ...
>
> Not that this is a good use, but it is a use :)
>
> Cheers,
> Mark.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120425/30edfea9/attachment.html>

From mal at egenix.com  Wed Apr 25 11:41:35 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 25 Apr 2012 11:41:35 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com> <jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
Message-ID: <4F97C6CF.8060106@egenix.com>

Nick Coghlan wrote:
> On Wed, Apr 25, 2012 at 10:55 AM, Steven D'Aprano <steve at pearwood.info> wrote:
>> In my opinion, this is an attractive nuisance.
>>
>> -1 on the feature.
> 
> Agreed (and my preferred idiom for all the cited cases is also "always
> define the Python version, override at the end with the accelerated
> version").

IMO, defining things twice in the same module is not a very Pythonic
way of designing Python software.

Leaving aside the resource leakage, it also makes it difficult to find
the implementation that actually gets used, bypasses "explicit is
better than implicit", and it doesn't address possible side-effects
of the definitions that you eventually override at the end of the
module.

Python is normally written with a top-to-bottom view in mind, where
you don't expect things to suddenly change near the end.

This is why we introduced decorators before the function definition,
rather than place them after the function definition. It's also why
we tend to put imports, globals, helpers at the top of the file.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Apr 25 2012)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2012-04-28: PythonCamp 2012, Cologne, Germany               3 days to go
2012-04-25: Released eGenix mx Base 3.2.4         http://egenix.com/go27

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From ncoghlan at gmail.com  Wed Apr 25 11:52:17 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 25 Apr 2012 19:52:17 +1000
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97C6CF.8060106@egenix.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com>
Message-ID: <CADiSq7fTK6+tDGnusRp=KJ3k3TLG_HGUQA=4-KXkS_+ocEzXYQ@mail.gmail.com>

On Wed, Apr 25, 2012 at 7:41 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> Nick Coghlan wrote:
>> On Wed, Apr 25, 2012 at 10:55 AM, Steven D'Aprano <steve at pearwood.info> wrote:
>>> In my opinion, this is an attractive nuisance.
>>>
>>> -1 on the feature.
>>
>> Agreed (and my preferred idiom for all the cited cases is also "always
>> define the Python version, override at the end with the accelerated
>> version").
>
> IMO, defining things twice in the same module is not a very Pythonic
> way of designing Python software.
>
> Leaving aside the resource leakage, it also makes it difficult to find
> the implementation that actually gets used, bypasses "explicit is
> better than implicit", and it doesn't address possible side-effects
> of the definitions that you eventually override at the end of the
> module.
>
> Python is normally written with a top-to-bottom view in mind, where
> you don't expect things to suddenly change near the end.
>
> This is why we introduced decorators before the function definition,
> rather than place them after the function definition. It's also why
> we tend to put imports, globals, helpers at the top of the file.

I agree overwriting at the end isn't ideal, but I don't think allowing
returns at module level is a significant improvement. I'd rather see a
higher level approach that specifically set out to tackle the problem
of choosing between multiple implementations of a module at runtime
that cleanly supported *testing* all the implementations in a single
process, while still having one implementation that was used by
default.
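That higher-level approach could be sketched as follows (entirely hypothetical module names; no such API exists today): gather whichever implementation modules import successfully, then run the same behavioural checks against each one in a single process.

```python
# Sketch: test every available implementation of one API in-process.
# "mypkg.pure" and "mypkg._accel" are hypothetical modules, so on this
# machine load_implementations() simply returns an empty list.
import importlib

def load_implementations(names=("mypkg.pure", "mypkg._accel")):
    impls = []
    for name in names:
        try:
            impls.append(importlib.import_module(name))
        except ImportError:
            pass  # that variant isn't available here
    return impls

def check_impl(impl):
    # Identical checks run against every variant that was found.
    assert impl.square(3) == 9

for impl in load_implementations():
    check_impl(impl)
```

CPython's own test suite does something similar for modules like pickle, importing the pure-Python version with the accelerator deliberately blocked so both get exercised.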

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ubershmekel at gmail.com  Wed Apr 25 12:22:41 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Wed, 25 Apr 2012 13:22:41 +0300
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <CADiSq7fTK6+tDGnusRp=KJ3k3TLG_HGUQA=4-KXkS_+ocEzXYQ@mail.gmail.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com>
	<CADiSq7fTK6+tDGnusRp=KJ3k3TLG_HGUQA=4-KXkS_+ocEzXYQ@mail.gmail.com>
Message-ID: <CANSw7KxRpaa4OvYfhxdoDENDSha4Zk0PeD30Qp=iXpRxFgeSnw@mail.gmail.com>

On Wed, Apr 25, 2012 at 12:52 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> I agree overwriting at the end isn't ideal, but I don't think allowing
> returns at module level is a significant improvement. I'd rather see a
> higher level approach that specifically set out to tackle the problem
> of choosing between multiple implementations of a module at runtime
> that cleanly supported *testing* all the implementations in a single
> process, while still having one implementation that was used by
> default.
>

+1 for Nick's remark.

-1 for the current proposal.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120425/9faeed39/attachment.html>

From mal at egenix.com  Wed Apr 25 12:37:54 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 25 Apr 2012 12:37:54 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <CADiSq7fTK6+tDGnusRp=KJ3k3TLG_HGUQA=4-KXkS_+ocEzXYQ@mail.gmail.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com> <jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com>
	<CADiSq7fTK6+tDGnusRp=KJ3k3TLG_HGUQA=4-KXkS_+ocEzXYQ@mail.gmail.com>
Message-ID: <4F97D402.5050701@egenix.com>

Nick Coghlan wrote:
> On Wed, Apr 25, 2012 at 7:41 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>> Nick Coghlan wrote:
>>> On Wed, Apr 25, 2012 at 10:55 AM, Steven D'Aprano <steve at pearwood.info> wrote:
>>>> In my opinion, this is an attractive nuisance.
>>>>
>>>> -1 on the feature.
>>>
>>> Agreed (and my preferred idiom for all the cited cases is also "always
>>> define the Python version, override at the end with the accelerated
>>> version").
>>
>> IMO, defining things twice in the same module is not a very Pythonic
>> way of designing Python software.
>>
>> Leaving aside the resource leakage, it also makes it difficult to find
>> the implementation that actually gets used, bypasses "explicit is
>> better than implicit", and it doesn't address possible side-effects
>> of the definitions that you eventually override at the end of the
>> module.
>>
>> Python is normally written with a top-to-bottom view in mind, where
>> you don't expect things to suddenly change near the end.
>>
>> This is why we introduced decorators before the function definition,
>> rather than place them after the function definition. It's also why
>> we tend to put imports, globals, helpers at the top of the file.
> 
> I agree overwriting at the end isn't ideal, but I don't think allowing
> returns at module level is a significant improvement. I'd rather see a
> higher level approach that specifically set out to tackle the problem
> of choosing between multiple implementations of a module at runtime
> that cleanly supported *testing* all the implementations in a single
> process, while still having one implementation that was used be
> default.

Isn't that an application developer choice to make rather than
one that we force upon the developer and one which only addresses
a single use case (having multiple implementation variants in a
module) ?

What about other use cases, where you e.g.

* know that the subsequent function/class definitions are going to
  fail, because your runtime environment doesn't provide the
  needed functionality ?

* want to limit the available defined APIs based on flags or
  other settings ?

* want to make modules behave more like functions or classes ?

* want to debug import loops ?

Since the module body is run more or less like a function or
class body, it seems natural to allow the same statements
available there in modules as well.

Esp. with the new importlib, tapping into the wealth of
functionality in that area has become a lot easier than before.
Only the compiler is preventing it.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Apr 25 2012)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2012-04-28: PythonCamp 2012, Cologne, Germany               3 days to go
2012-04-25: Released eGenix mx Base 3.2.4         http://egenix.com/go27

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From steve at pearwood.info  Wed Apr 25 13:49:06 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 25 Apr 2012 21:49:06 +1000
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97C6CF.8060106@egenix.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com> <jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com>
Message-ID: <4F97E4B2.6070302@pearwood.info>

M.-A. Lemburg wrote:
> Nick Coghlan wrote:
>> On Wed, Apr 25, 2012 at 10:55 AM, Steven D'Aprano <steve at pearwood.info> wrote:
>>> In my opinion, this is an attractive nuisance.
>>>
>>> -1 on the feature.
>> Agreed (and my preferred idiom for all the cited cases is also "always
>> define the Python version, override at the end with the accelerated
>> version").
> 
> IMO, defining things twice in the same module is not a very Pythonic
> way of designing Python software.

You're not defining things twice in the same module. You're defining them 
twice in two modules, one in Python and one in a C extension module.

Your own example does the same thing. The only difference is that you try to 
avoid creating the pure-Python versions if you don't need them, but you still 
have the source code for them in the module.


> Left aside the resource leakage, 

What resource leakage?

If I do this:

def f(): pass
from module import f

then the first function f is garbage collected and there is no resource leakage.
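The claim is straightforward to verify in CPython with a weak reference (a small sketch; the `gc.collect()` call is only for good measure, since reference counting already reclaims the rebound function):

```python
import gc
import weakref


def f():
    pass

ref = weakref.ref(f)   # weakly track the first definition

def f():               # rebind the name, as "from module import f" would
    pass

gc.collect()           # not strictly needed in CPython; refcount hits zero
assert ref() is None   # the original function has been collected
```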

As for the rest of your post, I'm afraid that I find most of it "not even 
wrong" so I won't address it directly. I will say this:

I'm sure you can come up with all sorts of reasons for not liking the current 
idiom of "define pure-Python code first, then replace with accelerated C 
version if available", e.g. extremely unlikely scenarios for code that has 
side-effects that you might want to avoid. That's all fine. The argument is 
not that there is never a use for a top level return, or that alternatives are 
perfect. The argument is that a top level return has more disadvantages than 
advantages.
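For reference, the idiom in question looks like this in practice (the function and the `_fastbits` module name here are invented for illustration; the stdlib uses the same shape in e.g. `heapq` and `datetime`):

```python
# Pure-Python implementation, always defined first.
def count_bits(n):
    """Count the set bits in a non-negative integer."""
    total = 0
    while n:
        total += n & 1
        n >>= 1
    return total


# Override with the accelerated version at the end, if available.
try:
    from _fastbits import count_bits  # hypothetical C extension
except ImportError:
    pass  # fall back to the pure-Python definition above
```

Either way, callers of the module see a single `count_bits` name, and the pure-Python version remains importable for testing.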

Unless you address the disadvantages and costs of top level return, you won't 
convince me, and I doubt you will convince many others.




-- 
Steven


From mark at hotpy.org  Wed Apr 25 14:11:03 2012
From: mark at hotpy.org (Mark Shannon)
Date: Wed, 25 Apr 2012 13:11:03 +0100
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97E4B2.6070302@pearwood.info>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>	<20120424214058.22540616@pitrou.net>	<4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org>	<4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org>	<4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>	<4F97C6CF.8060106@egenix.com>
	<4F97E4B2.6070302@pearwood.info>
Message-ID: <4F97E9D7.5090804@hotpy.org>

Steven D'Aprano wrote:

[snip]

> Unless you address the disadvantages and costs of top level return, you 
> won't convince me, and I doubt you will convince many others.
> 

What costs?


From steve at pearwood.info  Wed Apr 25 14:30:32 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 25 Apr 2012 22:30:32 +1000
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97E9D7.5090804@hotpy.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>	<20120424214058.22540616@pitrou.net>	<4F9705EB.1070702@egenix.com>	<jn71dc$3on$1@dough.gmane.org>	<4F970B16.2020702@egenix.com>	<jn72eb$el9$1@dough.gmane.org>	<4F970F36.2070208@egenix.com>	<4F974B6B.3040208@pearwood.info>	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>	<4F97C6CF.8060106@egenix.com>	<4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org>
Message-ID: <4F97EE68.4080102@pearwood.info>

Mark Shannon wrote:
> Steven D'Aprano wrote:
> 
> [snip]
> 
>> Unless you address the disadvantages and costs of top level return, 
>> you won't convince me, and I doubt you will convince many others.
>>
> 
> What costs?

Quoting from my post earlier today:


[quote]
Even if break at the module level is useful on rare occasions, I think the 
usefulness is far outweighed by the costs:

- hard to use correctly, hence code using this feature risks being buggy
- encourages premature micro-optimization, or at least the illusion of
   optimization
- encourages or requires duplicate code and copy-and-paste programming
- complicates the top-level program flow

Today, if you successfully import a module, you know that all the top-level 
code in that module was executed. If this feature is added, you cannot be sure 
what top-level code was reached unless you scan through all the code above it.
[end quote]


And to see the context:

http://mail.python.org/pipermail/python-ideas/2012-April/014897.html




-- 
Steven



From mark at hotpy.org  Wed Apr 25 14:43:08 2012
From: mark at hotpy.org (Mark Shannon)
Date: Wed, 25 Apr 2012 13:43:08 +0100
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97EE68.4080102@pearwood.info>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>	<20120424214058.22540616@pitrou.net>	<4F9705EB.1070702@egenix.com>	<jn71dc$3on$1@dough.gmane.org>	<4F970B16.2020702@egenix.com>	<jn72eb$el9$1@dough.gmane.org>	<4F970F36.2070208@egenix.com>	<4F974B6B.3040208@pearwood.info>	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>	<4F97C6CF.8060106@egenix.com>	<4F97E4B2.6070302@pearwood.info>	<4F97E9D7.5090804@hotpy.org>
	<4F97EE68.4080102@pearwood.info>
Message-ID: <4F97F15C.6000509@hotpy.org>

Steven D'Aprano wrote:
> Mark Shannon wrote:
>> Steven D'Aprano wrote:
>>
>> [snip]
>>
>>> Unless you address the disadvantages and costs of top level return, 
>>> you won't convince me, and I doubt you will convince many others.
>>>
>>
>> What costs?
> 
> Quoting from my post earlier today:
> 
> 
> [quote]
> Even if break at the module level is useful on rare occasions, I think 
> the usefulness is far outweighed by the costs:
> 
> - hard to use correctly, hence code using this feature risks being buggy
> - encourages premature micro-optimization, or at least the illusion of
>   optimization
> - encourages or requires duplicate code and copy-and-paste programming
> - complicates the top-level program flow

I would have taken these to be the disadvantages, rather costs.
By costs, I assumed you meant implementation effort or runtime overhead.

Also, I don't see the difference between a return in a module, and a 
return in a function. Both terminate execution and return to the caller. 
Why do these four points apply to module-level returns any more than 
function-level returns?

> 
> Today, if you successfully import a module, you know that all the 
> top-level code in that module was executed. If this feature is added, 
> you cannot be sure what top-level code was reached unless you scan 
> through all the code above it.
> [end quote]

I don't know if this is a good idea or not, but the fact that it can
be implemented by removing a single restriction in the compiler suggests
it might have some merit.

Cheers,
Mark.


From ronaldoussoren at mac.com  Wed Apr 25 14:37:56 2012
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Wed, 25 Apr 2012 14:37:56 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97E9D7.5090804@hotpy.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org>
Message-ID: <4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com>


On 25 Apr, 2012, at 14:11, Mark Shannon wrote:

> Steven D'Aprano wrote:
> 
> [snip]
> 
>> Unless you address the disadvantages and costs of top level return, you won't convince me, and I doubt you will convince many others.
> 
> What costs?

Harder to understand code is one disadvantage. The "return" that ends execution can easily be hidden in a list of definitions, such as


... some definitions ...

if sys.platform != 'win32': return

... more definitions for win32 specific functionality ...



That's easy to read with a 10 line module, but not when the module gets significantly larger.

Also, why use the proposed module-scope return instead of an if-statement with nested definitions, this works just fine:

: def foo(): pass
:
: if sys.platform == 'linux':
:
:    def linux_bar(): pass




> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas


From mal at egenix.com  Wed Apr 25 16:00:58 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 25 Apr 2012 16:00:58 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com> <jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org>
	<4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com>
Message-ID: <4F98039A.3090301@egenix.com>

Ronald Oussoren wrote:
> 
> Also, why use the proposed module-scope return instead of an if-statement with nested definitions, this works just fine:
> 
> : def foo(): pass
> :
> : if sys.platform == 'linux':
> :
> :    def linux_bar(): pass

Because this only works reasonably if you have a few lines of code
to indent. As soon as you have hundreds of lines, this becomes
both unreadable and difficult to edit.

The above is how the thread was started, BTW :-)

-- 
Marc-Andre Lemburg
eGenix.com



From arnodel at gmail.com  Wed Apr 25 16:44:11 2012
From: arnodel at gmail.com (Arnaud Delobelle)
Date: Wed, 25 Apr 2012 15:44:11 +0100
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F98039A.3090301@egenix.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org>
	<4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com>
	<4F98039A.3090301@egenix.com>
Message-ID: <CAJ6cK1Zr0pZ1XLZd1ZP4bbkP4KzEUO-3t604ROm14WYReyXLYw@mail.gmail.com>

(sent from my phone)
On Apr 25, 2012 3:01 PM, "M.-A. Lemburg" <mal at egenix.com> wrote:
>
> Ronald Oussoren wrote:
> >
> > Also, why use the proposed module-scope return instead of an
if-statement with nested definitions, this works just fine:
> >
> > : def foo(): pass
> > :
> > : if sys.platform == 'linux':
> > :
> > :    def linux_bar(): pass
>
> Because this only works reasonably if you have a few lines of code
> to indent. As soon as you have hundreds of lines, this becomes
> both unreadable and difficult to edit.

OTOH the return statement becomes really hard to spot...

Arnaud

From mal at egenix.com  Wed Apr 25 17:09:16 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 25 Apr 2012 17:09:16 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <CAJ6cK1Zr0pZ1XLZd1ZP4bbkP4KzEUO-3t604ROm14WYReyXLYw@mail.gmail.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com> <jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org>
	<4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com>
	<4F98039A.3090301@egenix.com>
	<CAJ6cK1Zr0pZ1XLZd1ZP4bbkP4KzEUO-3t604ROm14WYReyXLYw@mail.gmail.com>
Message-ID: <4F98139C.1020003@egenix.com>

Arnaud Delobelle wrote:
> (sent from my phone)
> On Apr 25, 2012 3:01 PM, "M.-A. Lemburg" <mal at egenix.com> wrote:
>>
>> Ronald Oussoren wrote:
>>>
>>> Also, why use the proposed module-scope return instead of an
> if-statement with nested definitions, this works just fine:
>>>
>>> : def foo(): pass
>>> :
>>> : if sys.platform == 'linux':
>>> :
>>> :    def linux_bar(): pass
>>
>> Because this only works reasonably if you have a few lines of code
>> to indent. As soon as you have hundreds of lines, this becomes
>> both unreadable and difficult to edit.
> 
> OTOH the return statement becomes really hard to spot...

People don't appear to have a problem with this in long functions
or methods, so I'm not sure how much weight that argument carries.

The programmer can of course add an easy-to-spot comment where
the return is used. It's just a question of programming style.

-- 
Marc-Andre Lemburg
eGenix.com



From ericsnowcurrently at gmail.com  Wed Apr 25 17:28:36 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 25 Apr 2012 09:28:36 -0600
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97EE68.4080102@pearwood.info>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info>
Message-ID: <CALFfu7CBVb30aUCqRDUR7BXGZMxMm-2=yBizKOM+nVfX0gbNTw@mail.gmail.com>

I'm actually fine with the way things are and would _really_ have used
a module "break" perhaps once.  As I alluded to originally, there just
aren't enough good use cases to make it worth it over the status quo.
[1][2]  At the very least, the mailing list archives will have a
pretty good discussion on the idea (of which I could not find one
previously).

To recap, this idea is about making the intent/context of a module
explicit at the beginning, rather than the end -- without resorting to
the extra level(s) of indent that an if/else solution would require.
Decorators and the with statement (both targeting code blocks) came
about for the same reason.  However, a simple return/break statement
would allow much more than that.  As Nick suggested, a more specific,
targeted solution would be better.  ("import-else" doesn't fit the
bill. [3])

For the record, I still think the status quo is sub-optimal.  My
original post lists what I think are legitimate (if uncommon) use
cases.  Here's my list of (nitpick-ish?) concerns with the current
solutions:

  1. if/else to make context explicit:
one extra level of indent (or ever-so-slightly-possibly more)
  2. conditionally replace module contents at the end:
without a clear comment at the beginning, may miscommunicate the final
contents of the module
  3. put the code in the else of #1 in a separate module:
one more import involved (weak, I know), and one more level of FS
indirection ("flat is better than nested")
  4. special exception + import hook:
not worth the trouble
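A minimal sketch of option 4 (the names `ModuleReturn` and `exec_module_source` are invented here for illustration; a real version would hook into the import machinery rather than exec source directly):

```python
import types


class ModuleReturn(Exception):
    """Sentinel exception used to 'return' early from a module body."""


def exec_module_source(source, name):
    """Execute module source, treating ModuleReturn as an early exit."""
    mod = types.ModuleType(name)
    # Make the sentinel visible inside the module body.
    mod.__dict__['ModuleReturn'] = ModuleReturn
    try:
        exec(source, mod.__dict__)
    except ModuleReturn:
        pass  # early exit: keep whatever was defined so far
    return mod


mod = exec_module_source(
    "x = 1\n"
    "raise ModuleReturn\n"
    "y = 2\n",
    "demo",
)
assert mod.x == 1 and not hasattr(mod, 'y')
```

Workable, but as noted above, wiring this into an import hook for every module that wants it is indeed more trouble than it is worth.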

-eric

[1] http://www.boredomandlaziness.org/2011/02/justifying-python-language-changes.html
[2] http://www.boredomandlaziness.org/2011/02/status-quo-wins-stalemate.html
[3] An idea that has come up before (and at least once recently):
  import cdecimal else decimal as decimal
    <==>
  try:
      import cdecimal as decimal
  except ImportError:
      import decimal


From ethan at stoneleaf.us  Wed Apr 25 17:31:34 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 25 Apr 2012 08:31:34 -0700
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <CAB4yi1M4S=J9sHQysBm6e1d6Lau6MY2aDNj-TQn+xcqrH72D-w@mail.gmail.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com>	<jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com>	<jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com>	<4F974B6B.3040208@pearwood.info>	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>	<4F97AEAA.90009@canterbury.ac.nz>
	<4F97B19E.7000604@hotpy.org>
	<CAB4yi1M4S=J9sHQysBm6e1d6Lau6MY2aDNj-TQn+xcqrH72D-w@mail.gmail.com>
Message-ID: <4F9818D6.5010808@stoneleaf.us>

Matt Joiner wrote:
> If not use_simple_api:
>     class C:

More like:

   class C:
     def basic_method(self):
       pass
     if not use_simple_gui:
       def advanced_method(self, this, that):
         pass

~Ethan~


From ron3200 at gmail.com  Wed Apr 25 17:52:49 2012
From: ron3200 at gmail.com (Ron Adam)
Date: Wed, 25 Apr 2012 10:52:49 -0500
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97AEAA.90009@canterbury.ac.nz>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com> <jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97AEAA.90009@canterbury.ac.nz>
Message-ID: <jn96kn$605$1@dough.gmane.org>



On 04/25/2012 02:58 AM, Greg Ewing wrote:
> Nick Coghlan wrote:
>> It isn't quite as simple as
>> just deleting those lines though, since we likely still wouldn't want
>> to allow return statements in class bodies.
>
> I'm sure there's someone out there with a twisted enough
> mind to think of a use for that...

Currently return isn't allowed in class bodies defined inside functions, so 
it probably won't work at the top level either.

>>> def foo():
  ...    class A:
  ...       return
  ...
    File "<stdin>", line 3
  SyntaxError: 'return' outside function


As far as the feature goes, it wouldn't be consistent with class behaviour 
unless you allow returns in classes to work too.

Think of modules as a type of class where ...

   import module

is equivalent to ...

   module module_name:
       <module file contents here>

Like classes the module body would execute to define the module, and return 
inside the module body would be a syntax error.

Of course since modules aren't specifically defined in this way, there is 
the option not to follow that convention.

Cheers,
    Ron












From ron3200 at gmail.com  Wed Apr 25 18:06:23 2012
From: ron3200 at gmail.com (Ron Adam)
Date: Wed, 25 Apr 2012 11:06:23 -0500
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F9709FD.4060901@hotpy.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <4F9709FD.4060901@hotpy.org>
Message-ID: <jn97e0$c6h$1@dough.gmane.org>



On 04/24/2012 03:15 PM, Mark Shannon wrote:
>
> This has to be the only few feature request that can implemented by
> removing code :)
>
> I implemented this by deleting 2 lines of code from the compiler.



Does it also allow returns in class bodies when you do that?

>>> class A:
...    return
...
   File "<stdin>", line 2
SyntaxError: 'return' outside function



Whether or not it does, I think modules are closer to class bodies and 
return should be a syntax error in that case as well.  They aren't 
functions and we shouldn't think of them that way.  IMHO, it would make 
Python harder to learn when the lines between them get blurred.

Cheers,
    Ron




From ericsnowcurrently at gmail.com  Wed Apr 25 18:28:38 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 25 Apr 2012 10:28:38 -0600
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97EE68.4080102@pearwood.info>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info>
Message-ID: <CALFfu7DX73WG5D4aE64Xna2qF__XHAkDGbbp1vmEp+LtjX9+jg@mail.gmail.com>

On Wed, Apr 25, 2012 at 6:30 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> - hard to use correctly, hence code using this feature risks being buggy
> - encourages premature micro-optimization, or at least the illusion of
>   optimization
> - encourages or requires duplicate code and copy-and-paste programming
> - complicates the top-level program flow
>
> Today, if you successfully import a module, you know that all the top-level code in
> that module was executed. If this feature is added, you cannot be sure what top-level
> code was reached unless you scan through all the code above it.


This got me thinking "well, you get the same thing with functions and
the return statement".  Then I realized there's a problem with that
line of thinking and stepped back.

Modules and functions have distinct purposes (by design) and we
shouldn't help make that distinction blurrier.  We should (and mostly
do) teach the concept of a module as a top-level (singleton) namespace
definition.  The idioms presented in this thread mostly bear this out.

Python doesn't force the distinction syntactically, nor am I
suggesting it should.  However, it seems to me that this is not how most
people think of modules.  The culprit here is the lack of distinction
between modules and scripts.  If a module is like a class, a script is
like a function.  Perhaps we should consider ways of making the
difference between scripts and modules clearer, whether in the docs,
with syntax, or otherwise.

-eric


From rob.cliffe at btinternet.com  Wed Apr 25 18:34:44 2012
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Wed, 25 Apr 2012 17:34:44 +0100
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <CALFfu7DX73WG5D4aE64Xna2qF__XHAkDGbbp1vmEp+LtjX9+jg@mail.gmail.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com> <jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info>
	<CALFfu7DX73WG5D4aE64Xna2qF__XHAkDGbbp1vmEp+LtjX9+jg@mail.gmail.com>
Message-ID: <4F9827A4.6070901@btinternet.com>

Not, please, with adding differences to the syntax allowed in scripts 
and in modules.  (More to learn and remember.)
But in the docs: absolutely.
Rob Cliffe

On 25/04/2012 17:28, Eric Snow wrote:
>
> Perhaps we should consider ways of making the
> difference between scripts and modules clearer, whether in the docs,
> with syntax, or otherwise.
>
> -eric


From steve at pearwood.info  Wed Apr 25 18:54:58 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 26 Apr 2012 02:54:58 +1000
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97F15C.6000509@hotpy.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>	<20120424214058.22540616@pitrou.net>	<4F9705EB.1070702@egenix.com>	<jn71dc$3on$1@dough.gmane.org>	<4F970B16.2020702@egenix.com>	<jn72eb$el9$1@dough.gmane.org>	<4F970F36.2070208@egenix.com>	<4F974B6B.3040208@pearwood.info>	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>	<4F97C6CF.8060106@egenix.com>	<4F97E4B2.6070302@pearwood.info>	<4F97E9D7.5090804@hotpy.org>	<4F97EE68.4080102@pearwood.info>
	<4F97F15C.6000509@hotpy.org>
Message-ID: <4F982C62.9040405@pearwood.info>

Mark Shannon wrote:
[...]
> I would have taken these to be the disadvantages, rather costs.
> By costs, I assumed you meant implementation effort or runtime overhead.

No, costs as in "costs versus benefits".


> Also, I don't see the difference between a return in a module, and a 
> return in a function. Both terminate execution and return to the caller. 
> Why do these four points apply to module-level returns any more than 
> function-level returns?

A very good point. There is a school of thought that functions should always 
have a single entry (the top of the function) and a single exit (the bottom). 
If I recall correctly, Pascal is like that: there's no way to return early 
from a function in Pascal.

However, code inside a function normally performs a calculation and returns a 
value, so once that value is calculated there's no point hanging around. The 
benefit of early return in functions outweighs the cost. It is my argument 
that this is not the case for top level module code.

Some differences between return in a function and return in a module:

1) Modules don't have a caller as such, so it isn't clear what you are 
returning to. (If the module is being imported, I suppose you could call the 
importing module the caller; but when the module is being run instead, there 
is no importing module.) So a top level return is more like an exit than a 
return, except it doesn't actually exit the Python interpreter.

2) When functions unexpectedly return early, you can sometimes get a clue why 
by inspecting the return value. Modules don't have a return value.

3) Functions tend to be relatively small (or at least, they should be 
relatively small), so while an early return in the middle of a function can be 
surprising, the cost of discovering that is not very high. In contrast, 
modules tend to be relatively large, hundreds or even thousands of lines. An 
early return could be anywhere.

4) Code at the top level of modules is usually transparent: the details of 
what gets done are important. People will want to know which functions, 
classes and global variables are actually created, and which are skipped due 
to an early return. In contrast, functions are usually treated as opaque 
blackboxes: people usually care about the interface, not the implementation. 
So typically they don't care whether the function returns out early or not.

There may be other differences.


>> Today, if you successfully import a module, you know that all the 
>> top-level code in that module was executed. If this feature is added, 
>> you cannot be sure what top-level code was reached unless you scan 
>> through all the code above it.
>> [end quote]
> 
> I don't know if this is a good idea or not, but the fact that it can
> be implemented by removing a single restriction in the compiler suggests
> it might have some merit.

Do you really mean to say that *because* something is easy, it therefore might 
be a good idea?

rm -rf /

Easy, and therefore a good idea, yes? *wink*



-- 
Steven



From steve at pearwood.info  Wed Apr 25 18:58:28 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 26 Apr 2012 02:58:28 +1000
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F98039A.3090301@egenix.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>	<20120424214058.22540616@pitrou.net>	<4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org>	<4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org>	<4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>	<4F97C6CF.8060106@egenix.com>
	<4F97E4B2.6070302@pearwood.info>	<4F97E9D7.5090804@hotpy.org>	<4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com>
	<4F98039A.3090301@egenix.com>
Message-ID: <4F982D34.4010709@pearwood.info>

M.-A. Lemburg wrote:
> Ronald Oussoren wrote:
>> Also, why use the proposed module-scope return instead of an if-statement with nested definitions, this works just fine:
>>
>> : def foo(): pass
>> :
>> : if sys.platform == 'linux':
>> :
>> :    def linux_bar(): pass
> 
> Because this only works reasonably if you have a few lines of code
> to indent. As soon as you have hundreds of lines, this becomes
> both unreadable and difficult to edit.

I think that is wrong. Why would hundreds of lines suddenly become unreadable 
and hard to edit because they have a little bit of leading whitespace in front 
of them?


-- 
Steven



From mwm at mired.org  Wed Apr 25 19:18:30 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 25 Apr 2012 13:18:30 -0400
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F982C62.9040405@pearwood.info>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info>
	<4F97F15C.6000509@hotpy.org> <4F982C62.9040405@pearwood.info>
Message-ID: <20120425131830.3f58c130@bhuda.mired.org>

On Thu, 26 Apr 2012 02:54:58 +1000
Steven D'Aprano <steve at pearwood.info> wrote:
> Mark Shannon wrote:
> > I don't know if this is a good idea or not, but the fact that it can
> > be implemented by removing a single restriction in the compiler suggests
> > it might have some merit.
> Do you really mean to say that *because* something is easy, it therefore might 
> be a good idea?

I read it as an expression of the language design philosophy that the
best way to add power is to remove restrictions. Personally, I agree
with that philosophy, as removing a single restriction is a much
better alternative than having a flock of tools, syntax and special
cases to compensate. Compare Python - where functions are first-class
objects and can be trivially passed as arguments - to pretty much any
modern language that restricts such usage.

That said, "more power" is not always the best choice from a language
design point of few. In this case there's really only one use case for
lifting the restriction against return in classes and modules, and the
problems already pointed out that lifting this restriction creates
outweigh the benefits of that use case.

-1.

	<mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From anacrolix at gmail.com  Wed Apr 25 20:35:11 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Thu, 26 Apr 2012 02:35:11 +0800
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <20120425131830.3f58c130@bhuda.mired.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info>
	<4F97F15C.6000509@hotpy.org> <4F982C62.9040405@pearwood.info>
	<20120425131830.3f58c130@bhuda.mired.org>
Message-ID: <CAB4yi1PBCyVx5akg-e4LdXFMNjvNXchEzeia4QFLsswJ3iWnsw@mail.gmail.com>

If this is to be done I'd like to see all special methods supported. One of
particular interest to modules is __getattr__...

I think the idea is crazy and will lead to chicken and egg discussions.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120426/db10cd77/attachment.html>

From g.brandl at gmx.net  Wed Apr 25 21:31:05 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 25 Apr 2012 21:31:05 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <jn96kn$605$1@dough.gmane.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com> <jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97AEAA.90009@canterbury.ac.nz> <jn96kn$605$1@dough.gmane.org>
Message-ID: <jn9jcr$dvd$1@dough.gmane.org>

On 25.04.2012 17:52, Ron Adam wrote:

> Think of modules as a type of class where ...
> 
>    import module
> 
> is equivalent to ...
> 
>    module module_name:
>        <module file contents here>
> 
> Like classes the module body would execute to define the module, and return 
> inside the module body would be a syntax error.

No, sorry, that's not a good equivalence.  It reinforces the impression some
people have of "import" working like "#include" in C or (God forbid) "require"
in PHP.

Georg



From bboe at cs.ucsb.edu  Wed Apr 25 21:27:03 2012
From: bboe at cs.ucsb.edu (Bryce Boe)
Date: Wed, 25 Apr 2012 12:27:03 -0700
Subject: [Python-ideas] Structured Error Output
Message-ID: <CAJ6ei3SRGMRocFb7uBkp3t0RyvCeAAs-th5U=o0NuM8jpMhvBA@mail.gmail.com>

Hi,

I looked through the man page for python's interpreter and it appears
that there is no way to properly distinguish between error messages
output to stderr by the interpreter and output produced by a
user program to stderr.

What I would really like to have are two things:

1) an option to output interpreter generated messages to a specified
file, whether these messages are uncatchable syntax errors, or
catchable runtime errors that result in the termination of the
interpreter. This feature would allow a wrapper program to distinguish
between user-output and python interpreter output.

2) an option to provide a structured error output in some common
easy-to-parse and extendable format that can be used to associate the
file, line number, error type/number in some post-processing error
handler. This feature would make the parsing of error messages more
deterministic, and would be of significant benefit if other
compilers/interpreters also provide the same functionality in the same
common format.

Does anyone know if there is already such a way to do what I've asked?
If not, do you think having such features added to python would be
something that would actually be included?

Thanks,
Bryce Boe


From g.brandl at gmx.net  Wed Apr 25 21:37:24 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 25 Apr 2012 21:37:24 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F98039A.3090301@egenix.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com> <jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org>
	<4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com>
	<4F98039A.3090301@egenix.com>
Message-ID: <jn9jom$hio$1@dough.gmane.org>

On 25.04.2012 16:00, M.-A. Lemburg wrote:
> Ronald Oussoren wrote:
>> 
>> Also, why use the proposed module-scope return instead of an if-statement with nested definitions, this works just fine:
>> 
>> : def foo(): pass
>> :
>> : if sys.platform == 'linux':
>> :
>> :    def linux_bar(): pass
> 
> Because this only works reasonably if you have a few lines of code
> to indent. As soon as you have hundreds of lines, this becomes
> both unreadable and difficult to edit.

So you don't have any classes that span hundreds of lines?  I don't
see how this is different in terms of editing difficulty.

As for readability, at least you can see from the indentation that
there's something special about the module code in question.  You
don't see that if there's a "return" scattered somewhere.

cheers,
Georg

PS: I don't buy the "it's no problem with functions" argument.  Even
though I'm certainly guilty of writing 100+ line functions, I find them
quite ungraspable (is that a word?) and usually try to limit functions
and methods to a screenful.



From g.brandl at gmx.net  Wed Apr 25 21:38:27 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 25 Apr 2012 21:38:27 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97F15C.6000509@hotpy.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>	<20120424214058.22540616@pitrou.net>	<4F9705EB.1070702@egenix.com>	<jn71dc$3on$1@dough.gmane.org>	<4F970B16.2020702@egenix.com>	<jn72eb$el9$1@dough.gmane.org>	<4F970F36.2070208@egenix.com>	<4F974B6B.3040208@pearwood.info>	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>	<4F97C6CF.8060106@egenix.com>	<4F97E4B2.6070302@pearwood.info>	<4F97E9D7.5090804@hotpy.org>
	<4F97EE68.4080102@pearwood.info> <4F97F15C.6000509@hotpy.org>
Message-ID: <jn9jql$hio$2@dough.gmane.org>

On 25.04.2012 14:43, Mark Shannon wrote:

>> Today, if you successfully import a module, you know that all the 
>> top-level code in that module was executed. If this feature is added, 
>> you cannot be sure what top-level code was reached unless you scan 
>> through all the code above it.
>> [end quote]
> 
> I don't know if this is a good idea or not, but the fact that it can
> be implemented by removing a single restriction in the compiler suggests
> it might have some merit.

That is a strange argument.  The restriction was placed there for a reason.
(With the same argument you could argue for "None = 1" to be valid code.)

Georg



From ron3200 at gmail.com  Wed Apr 25 22:27:33 2012
From: ron3200 at gmail.com (Ron Adam)
Date: Wed, 25 Apr 2012 15:27:33 -0500
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <jn9jcr$dvd$1@dough.gmane.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com> <jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97AEAA.90009@canterbury.ac.nz>
	<jn96kn$605$1@dough.gmane.org> <jn9jcr$dvd$1@dough.gmane.org>
Message-ID: <jn9mnm$8pd$1@dough.gmane.org>



On 04/25/2012 02:31 PM, Georg Brandl wrote:
> On 25.04.2012 17:52, Ron Adam wrote:
>
>> Think of modules as a type of class where ...
>>
>>     import module
>>
>> is equivalent to ...
>>
>>     module module_name:
>>         <module file contents here>
>>
>> Like classes the module body would execute to define the module, and return
>> inside the module body would be a syntax error.
>
> No, sorry, that's not a good equivalence.  It reinforces the impression some
> people have of "import" working like "#include" in C or (God forbid) "require"
> in PHP.

Not quite the same thing, but I see how you would think that from the way I 
wrote the example.

I didn't mean the file to be inserted, but instead as if it was written in 
the module statement body.  That is, if we even had a "module" keyword, 
which we don't. ;-)

The point is that a module's contents execute from beginning to end to create 
a module, in the same way a class's contents execute from beginning to end 
to create a class.

There are also differences, such as where a module is stored and how its 
contents are accessed, and so they are not the same thing. You can't just 
change a class into a module and vice versa by just changing its header or 
moving its body into a separate file.

I was just trying to point out a module is closer to a class than it is to 
a function, and that is a good thing.  Allowing a return or break in a 
module could make things more confusing.  Also, by not allowing return or 
break, it catches errors where the indentation is lost in functions or 
methods more quickly.

Cheers,
   Ron





From tjreedy at udel.edu  Wed Apr 25 23:33:21 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 25 Apr 2012 17:33:21 -0400
Subject: [Python-ideas] Structured Error Output
In-Reply-To: <CAJ6ei3SRGMRocFb7uBkp3t0RyvCeAAs-th5U=o0NuM8jpMhvBA@mail.gmail.com>
References: <CAJ6ei3SRGMRocFb7uBkp3t0RyvCeAAs-th5U=o0NuM8jpMhvBA@mail.gmail.com>
Message-ID: <jn9qj6$5go$1@dough.gmane.org>

On 4/25/2012 3:27 PM, Bryce Boe wrote:
> Hi,
>
> I looked through the man page for python's interpreter and it appears
> that there is no way to properly distinguish between error messages
> output to stderr by the interpreter and output produced by a
> user program to stderr.

That should be a false distinction. User programs should only print 
error messages to stderr. Some modify error messages before they get 
printed. Some raise exceptions themselves with messages. The interpreter 
makes no distinction between user code, 3rd party code, and stdlib 
code.

> What I would really like to have are two things:
>
> 1) an option to output interpreter generated messages to a specified
> file, whether these messages are uncatchable syntax errors, or
> catchable runtime errors that result in the termination of the
> interpreter. This feature would allow a wrapper program to distinguish
> between user-output and python interpreter output.

'Raw' interpreter error messages start with 'SyntaxError' or 
'Traceback'. Runtime errors do not seem to go to the normal stderr channel.

> 2) an option to provide a structured error output in some common
> easy-to-parse and extendable format that can be used to associate the
> file, line number, error type/number in some post-processing error
> handler. This feature would make the parsing of error messages more
> deterministic, and would be of significant benefit if other
> compilers/interpreters also provide the same functionality in the same
> common format.

Exception instances have a .__traceback__ attribute that is used to print 
the default traceback message. So it has or can generate much of what 
you request. I believe traceback objects are documented somewhere. Some 
apps wrap everything in

try: run_app()
except Exception as e: custom_handle(e)

-- 
Terry Jan Reedy



From ckaynor at zindagigames.com  Wed Apr 25 23:46:22 2012
From: ckaynor at zindagigames.com (Chris Kaynor)
Date: Wed, 25 Apr 2012 14:46:22 -0700
Subject: [Python-ideas] Structured Error Output
In-Reply-To: <jn9qj6$5go$1@dough.gmane.org>
References: <CAJ6ei3SRGMRocFb7uBkp3t0RyvCeAAs-th5U=o0NuM8jpMhvBA@mail.gmail.com>
	<jn9qj6$5go$1@dough.gmane.org>
Message-ID: <CALvWhxuK_byV=GzTCcsBz9kWF4_nO-CwPasHXGGN0EyoEnXK4A@mail.gmail.com>

On Wed, Apr 25, 2012 at 2:33 PM, Terry Reedy <tjreedy at udel.edu> wrote:

> On 4/25/2012 3:27 PM, Bryce Boe wrote:
>
>> 2) an option to provide a structured error output in some common
>> easy-to-parse and extendable format that can be used to associate the
>> file, line number, error type/number in some post-processing error
>> handler. This feature would make the parsing of error messages more
>> deterministic, and would be of significant benefit if other
>> compilers/interpreters also provide the same functionality in the same
>> common format.
>>
>
> Exception instances have a .__traceback__ attribute that is used to print
> the default traceback message. So it has or can generate much of what you
> request. I believe traceback objects are documented somewhere. Some apps
> wrap everything in
>
> try: run_app()
> except Exception as e: custom_handle(e)


The sys module also has an excepthook (1) which can be overridden to
customize the exception handling. Note that it does not function with the
threading module, however.

(1) http://docs.python.org/library/sys.html#sys.excepthook
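A minimal sketch of such a hook; the one-line "PYERR|..." output format
here is invented purely for illustration of what a wrapper program could
watch for:

```python
import sys

def structured_hook(exc_type, exc_value, tb):
    # Replace the default traceback dump with one machine-parseable
    # line; a wrapper program could watch stderr for this prefix.
    print(f"PYERR|{exc_type.__name__}|{exc_value}", file=sys.stderr)

# Installing the hook only affects uncaught exceptions in the main
# thread, as noted above (threading has its own handling).
sys.excepthook = structured_hook
```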


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20120425/98b5dcb9/attachment.html>

From tjreedy at udel.edu  Wed Apr 25 23:51:41 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 25 Apr 2012 17:51:41 -0400
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <jn9mnm$8pd$1@dough.gmane.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net>
	<4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com> <jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97AEAA.90009@canterbury.ac.nz>
	<jn96kn$605$1@dough.gmane.org> <jn9jcr$dvd$1@dough.gmane.org>
	<jn9mnm$8pd$1@dough.gmane.org>
Message-ID: <jn9rli$d5j$1@dough.gmane.org>

On 4/25/2012 4:27 PM, Ron Adam wrote:

> I was just trying to point out a module is closer to a class than it is
> to a function, and that is a good thing.

Each should only be executed once, and this is enforced for modules by 
the sys.modules cache.
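That cache is directly observable: a second import of the same module is
a lookup in sys.modules rather than a re-execution of the module body.

```python
import sys

import json            # first import executes json's module body
first = json
import json            # second import is just a cache hit in sys.modules
assert json is first
assert sys.modules['json'] is first
```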

 > Allowing a return or break in a
> module could make things more confusing.

Agreed.

It is not actually necessary for functions to have an explicit return 
statement. I believe that there are languages that define the function 
return value as the last value assigned to the function name.

def fact(n):
   if n > 1: fact = n*fact(n-1)
   else: fact = 1

(This is pretty close to how mathematicians might write the definition. 
It does, however, require special-casing the function name on the left 
of '='.)
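For contrast, the same function in actual Python needs an explicit
return in each branch, since a function body that falls off the end
returns None:

```python
def fact(n):
    # Python has no implicit "assign to the function name" return;
    # each branch must return its value explicitly.
    if n > 1:
        return n * fact(n - 1)
    return 1
```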

'return' is, however, useful for returning out of more deeply nested 
constructs, especially loops, without setting flag variables or having 
multi-level breaks. This consideration does not really apply to the use 
case for module return.

A virtue of defining everything in Python and then trying to import the 
accelerated override is that different implementations and versions 
thereof can use the same file even though they accelerate different 
parts and amounts (even none) of the module.

-- 
Terry Jan Reedy



From bboe at cs.ucsb.edu  Thu Apr 26 00:06:16 2012
From: bboe at cs.ucsb.edu (Bryce Boe)
Date: Wed, 25 Apr 2012 15:06:16 -0700
Subject: [Python-ideas] Structured Error Output
In-Reply-To: <jn9qj6$5go$1@dough.gmane.org>
References: <CAJ6ei3SRGMRocFb7uBkp3t0RyvCeAAs-th5U=o0NuM8jpMhvBA@mail.gmail.com>
	<jn9qj6$5go$1@dough.gmane.org>
Message-ID: <CAJ6ei3T1WFZKNUmvnfe3rieqHiP0EyGRnyk8zhrTn79+gSkcJA@mail.gmail.com>

>> I looked through the man page for python's interpreter and it appears
>> that there is no way to properly distinguish between error messages
>> output to stderr by the interpreter and output produced by a
>> user program to stderr.
>
>
> That should be a false distinction. User programs should only print error
> messages to stderr. Some modify error messages before they get printed. Some
> raise exceptions themselves with messages. The interpreter makes no
> distinction between user code, 3rd party code, and stdlib code.

Perhaps I wasn't very clear. I want to write a tool to collect error
messages when I run a program. Ideally the tool should be agnostic to
what language is used and should be able to identify syntax errors,
parser errors, and runtime errors. While I can parse both the stdout
and stderr streams to find this information, from what I can tell
there is no way to distinguish between a real syntax error (output to
stderr):

  File "./test.py", line 5
    class
        ^
SyntaxError: invalid syntax

and a program that outputs that exact output to stderr and exits with status 1.
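The ambiguity is easy to demonstrate; this sketch (names invented)
writes a byte-for-byte copy of the interpreter's report to stderr:

```python
import sys

FAKE_REPORT = (
    '  File "./test.py", line 5\n'
    '    class\n'
    '        ^\n'
    'SyntaxError: invalid syntax\n'
)

def fake_syntax_error(stream=sys.stderr):
    # Emits output indistinguishable from a real interpreter
    # SyntaxError; the caller can then exit with status 1 to
    # complete the illusion.
    stream.write(FAKE_REPORT)

fake_syntax_error()
```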

This "channel" sharing of control (error messages) and data is a
problem that affects more than just the python interpreter. I am
hoping to start with python and provide a way to separate the control
and data information so I can be certain that output on the "control"
file descriptor is guaranteed to be generated by the interpreter.

> Exception instances have a .__traceback__ attribute that is used to print the default traceback message

I am aware that I can obtain this information and output it however I want
from my own program (syntax errors excepted); however, the goal is to run
third party code and provide a more detailed error report.

-Bryce


From zuo at chopin.edu.pl  Thu Apr 26 00:27:47 2012
From: zuo at chopin.edu.pl (Jan Kaliszewski)
Date: Thu, 26 Apr 2012 00:27:47 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97D402.5050701@egenix.com>
References: <4F9705EB.1070702@egenix.com> <jn71dc$3on$1@dough.gmane.org>
	<4F970B16.2020702@egenix.com> <jn72eb$el9$1@dough.gmane.org>
	<4F970F36.2070208@egenix.com> <4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com>
	<CADiSq7fTK6+tDGnusRp=KJ3k3TLG_HGUQA=4-KXkS_+ocEzXYQ@mail.gmail.com>
	<4F97D402.5050701@egenix.com>
Message-ID: <20120425222747.GA1806@chopin.edu.pl>

M.-A. Lemburg dixit (2012-04-25, 12:37):

> Isn't that an application developer choice to make rather than
> one that we force upon the developer and one which only addresses
> a single use case (having multiple implementation variants in a
> module) ?
> 
> What about other use cases, where you e.g.
> 
> * know that the subsequent function/class definitions are going to
>   fail, because your runtime environment doesn't provide the
>   needed functionality ?

IMHO then you should refactor your module into a few smaller ones.

> * want to limit the available defined APIs based on flags or
>   other settings ?

As above if there are many and/or long variants. Otherwise top-level
if/else should be sufficient.

> * want to make modules behave more like functions or classes ?

What for? Use functions or classes then.

> * want to debug import loops ?

Don't we have some good ways and tools to do it anyway?

> Since the module body is run more or less like a function or
> class body, it seems natural to allow the same statements
> available there in modules as well.

Function body and class body are completely different stories.

* The former can contain return, the latter not (and IMHO this
limitation is a good thing -- see below...).

* The former is executed many times (each time creating a new local
namespace); the latter is executed immediately and only once.

* The former takes some input (arguments) and returns some output;
the latter is a container for some callables and/or some data.

* The former is mostly relatively small; the latter is often quite
large (breaking execution flow of large bodies is often described as a
bad practice, and not without good reasons).

Modules are somehow similar to singleton classes (instantiated with
the first import).

Cheers.
*j



From greg.ewing at canterbury.ac.nz  Thu Apr 26 01:14:39 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 26 Apr 2012 11:14:39 +1200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97F15C.6000509@hotpy.org>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info>
	<4F97F15C.6000509@hotpy.org>
Message-ID: <4F98855F.1000704@canterbury.ac.nz>

Mark Shannon wrote:

> Also, I don't see the difference between a return in a module, and a 
> return in a function. Both terminate execution and return to the caller. 
> Why do these four points apply to module-level returns any more than 
> function-level returns?

I think the difference is that functions are usually small,
whereas modules can be very large. So there is much more scope
for surprises due to a return lurking in a module.

-- 
Greg


From mwm at mired.org  Thu Apr 26 01:15:00 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 25 Apr 2012 19:15:00 -0400
Subject: [Python-ideas] Structured Error Output
In-Reply-To: <CAJ6ei3T1WFZKNUmvnfe3rieqHiP0EyGRnyk8zhrTn79+gSkcJA@mail.gmail.com>
References: <CAJ6ei3SRGMRocFb7uBkp3t0RyvCeAAs-th5U=o0NuM8jpMhvBA@mail.gmail.com>
	<jn9qj6$5go$1@dough.gmane.org>
	<CAJ6ei3T1WFZKNUmvnfe3rieqHiP0EyGRnyk8zhrTn79+gSkcJA@mail.gmail.com>
Message-ID: <20120425191500.081ce8f0@bhuda.mired.org>

On Wed, 25 Apr 2012 15:06:16 -0700
Bryce Boe <bboe at cs.ucsb.edu> wrote:

> Perhaps I wasn't very clear. I want to write a tool to collect error
> messages when I run a program. Ideally the tool should be agnostic to
> what language is used and should be able to identify syntax errors,
> parser errors, and runtime errors. While I can parse both the stdout
> and stderr streams to find this information, from what I can tell
> there is no way to distinguish between a real syntax error (output to
> stderr):
> 
>   File "./test.py", line 5
>     class
>         ^
> SyntaxError: invalid syntax
> 
> and a program that outputs that exact output to stderr and exits with status 1.

Correct. And it isn't a problem, because it shouldn't matter if the
programmer wrote a bit of code that caused the compiler to raise a
syntax error, raised the syntax error themselves, evaled a string that
raised the syntax error, or wrote the message explicitly to standard
error: all those cases represent a syntax error to the programmer.

> This "channel" sharing of control (error messages) and data is a
> problem that affects more than just the python interpreter. I am
> hoping to start with python and provide a way to separate the control
> and data information so I can be certain that output on the "control"
> file descriptor is guaranteed to be generated by the interpreter.

If your program is writing *data* to stderr, it's badly designed. You
should fix it instead of trying to get Python changed to accommodate
it.

Or maybe you're trying to draw a distinction between messages
purposely generated by the programmer, and messages the programmer
didn't want? I think that's an artificial distinction. An error is an
error is an error, and by any other name would be just as
smelly. Whether it's an exception raised by the interpreter, an
exception raised by the programmer, or just a message printed by the
programmer really doesn't matter.

Python programmers have complete control over all of this, and can
make it do what they want. Trying to make distinctions based on
default behaviors is misguided.

        <mike

-- Mike Meyer <mwm at mired.org> http://www.mired.org/ Independent
Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From greg.ewing at canterbury.ac.nz  Thu Apr 26 01:24:50 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 26 Apr 2012 11:24:50 +1200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F98139C.1020003@egenix.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org>
	<4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com>
	<4F98039A.3090301@egenix.com>
	<CAJ6cK1Zr0pZ1XLZd1ZP4bbkP4KzEUO-3t604ROm14WYReyXLYw@mail.gmail.com>
	<4F98139C.1020003@egenix.com>
Message-ID: <4F9887C2.6090803@canterbury.ac.nz>

M.-A. Lemburg wrote:

> People don't appear to have a problem with this in long functions
> or methods, so I'm not sure how well that argument qualifies.

Well, I have a problem with long functions whether they
have embedded returns or not.

But another difference is that functions usually consist
of a set of related statements that need to be read in
order from top to bottom. Modules, on the other hand,
typically consist of multiple relatively small units
whose order mostly doesn't matter. Sticking a return
in the middle introduces an unexpected ordering
dependency.

-- 
Greg


From cs at zip.com.au  Thu Apr 26 01:32:10 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Thu, 26 Apr 2012 09:32:10 +1000
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
Message-ID: <20120425233210.GA1109@cskk.homeip.net>

On 24Apr2012 13:23, Eric Snow <ericsnowcurrently at gmail.com> wrote:
| In a function you can use a return statement to break out of execution
| in the middle of the function.  With modules you have no recourse.
| This is akin to return statements being allowed only at the end of a
| function.
[...]

Something very similar came up some months ago I think.

My personal suggestion was:

  raise StopImport

to cleanly exit a module without causing an ImportError, just as exiting
a generator raises StopIteration in the caller. In this case the caller
is the import machinery, which should consider this exception a clean
completion of the import.
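StopImport does not exist in Python; purely as a toy illustration of the
proposed semantics, one could mimic them with an exception class and a
small exec-based loader (all names here are invented):

```python
import sys
import types

class StopImport(Exception):
    # Stand-in for the proposed built-in: raising it in a module body
    # would end execution early without marking the import as failed.
    pass

DEMO_SOURCE = """
x = 1
raise StopImport
y = 2          # under the proposal, never executed
"""

def import_with_stop(name, source):
    # Toy loader that treats StopImport as normal completion.
    mod = types.ModuleType(name)
    mod.__dict__['StopImport'] = StopImport
    try:
        exec(source, mod.__dict__)
    except StopImport:
        pass               # clean early completion
    sys.modules[name] = mod
    return mod

demo = import_with_stop("demo", DEMO_SOURCE)
```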

Cheers,
-- 
Cameron Simpson <cs at zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

Disclaimer: Opinions expressed here are CORRECT, mine, and not PSLs or NMSUs..
        - Larry Cunningham <larry at psl.nmsu.edu>


From greg.ewing at canterbury.ac.nz  Thu Apr 26 01:42:51 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 26 Apr 2012 11:42:51 +1200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F982C62.9040405@pearwood.info>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org> <4F97EE68.4080102@pearwood.info>
	<4F97F15C.6000509@hotpy.org> <4F982C62.9040405@pearwood.info>
Message-ID: <4F988BFB.5020103@canterbury.ac.nz>

Steven D'Aprano wrote:
> If I recall correctly, Pascal is like that: there's no way 
> to return early from a function in Pascal.

Actually, there is -- you can 'goto' a label at the
bottom.

-- 
Greg




From jimjjewett at gmail.com  Thu Apr 26 02:04:25 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 25 Apr 2012 20:04:25 -0400
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F974B6B.3040208@pearwood.info>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
Message-ID: <CA+OGgf7=RrgVZOZJ-Wf3OdNOrNQMigqtzjHscjwTGiBr_qAnwQ@mail.gmail.com>

On Tue, Apr 24, 2012 at 8:55 PM, Steven D'Aprano <steve at pearwood.info> wrote:

> - encourages or requires duplicate code and copy-and-paste programming

How does stopping a module definition early encourage duplicate code?
Just because you might have to react differently after a full import
vs a partial?


> Today, if you successfully import a module, you know that all the top-level
> code in that module was executed.

No, you don't.

On the other hand, the counterexamples (circular imports, some lazy
imports, module-name clashes, top-level definitions dependent on what
was already imported by other modules, etc) are already painful to
debug.

-jJ


From jimjjewett at gmail.com  Thu Apr 26 02:27:54 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 25 Apr 2012 20:27:54 -0400
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F97C6CF.8060106@egenix.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com>
Message-ID: <CA+OGgf6WQELGrMS3t_eJNDwHe6EUuNxoGVRXXnq3ZfFaJFkA3w@mail.gmail.com>

On Wed, Apr 25, 2012 at 5:41 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> Nick Coghlan wrote:
>> On Wed, Apr 25, 2012 at 10:55 AM, Steven D'Aprano <steve at pearwood.info> wrote:

> IMO, defining things twice in the same module is not a very Pythonic
> way of designing Python software.

Agreed, which is one problem with accelerator modules.

> Leaving aside the resource leakage, it also makes it difficult to find
> the implementation that actually gets used,

Agreed, which is another problem with accelerator modules.

> bypasses "explicit is better than implicit",

Agreed, which is another problem with

    from accelerator import *

So far, these reasons have been saying "Don't do that if you can help
it", and if you do it anyhow, making the code stick out is better than
nothing.

(That said, making it stick out by having top-level definitions that
are *not* at the far left is ... unfortunate.)

> and it doesn't address possible side-effects
> of the definitions that you eventually override at the end of the
> module.

That is a real problem, but the answer is to be more explicit.

try:
    from _foo import A
except ImportError:
    ...  # define exactly A, as it will be needed

or

try:
    A
except NameError:
    ...  # define exactly A, as it will be needed

or even create a wrapper that imports or defines your object the first
time it is called...

> Python is normally written with a top-to-bottom view in mind, where
> you don't expect things to suddenly change near the end.

Actually, I tend to (wrongly) view the module-level code as fixed
declarations rather than commands, so that leaving a name undefined or
defining it conditionally is just as bad.

Having an alternative definition in the same location is less of a
problem. I have learned to look for import * at the beginning and end
of a module, but more special cases would not be helpful.

-jJ


From bboe at cs.ucsb.edu  Thu Apr 26 02:30:45 2012
From: bboe at cs.ucsb.edu (Bryce Boe)
Date: Wed, 25 Apr 2012 17:30:45 -0700
Subject: [Python-ideas] Structured Error Output
In-Reply-To: <20120425191500.081ce8f0@bhuda.mired.org>
References: <CAJ6ei3SRGMRocFb7uBkp3t0RyvCeAAs-th5U=o0NuM8jpMhvBA@mail.gmail.com>
	<jn9qj6$5go$1@dough.gmane.org>
	<CAJ6ei3T1WFZKNUmvnfe3rieqHiP0EyGRnyk8zhrTn79+gSkcJA@mail.gmail.com>
	<20120425191500.081ce8f0@bhuda.mired.org>
Message-ID: <CAJ6ei3Q_==a=maCSS2-+QDYEqCN1X9yboyNw+iGK==j_1a7X_g@mail.gmail.com>

> Correct. And it isn't a problem, because it shouldn't matter if the
> programmer wrote a bit of code that caused the compiler to raise a
> syntax error, raised the syntax error themselves, evaled a string that
> raised the syntax error, or wrote the message explicitly to standard
> error: all those cases represent a syntax error to the programmer.

Perhaps you cannot envision a case where it matters because you don't
work with people that intentionally try to cheat or mislead a system.
I am merely pointing out that currently there is no way to distinguish
between these behaviors, and I would personally like to add that
support because I have a need to deterministically differentiate
between them. I don't want to break backwards compatibility, so what I
propose would only take place via a command line argument.

>> This "channel" sharing of control (error messages) and data is a
>> problem that affects more than just the python interpreter. I am
>> hoping to start with python and provide a way to separate the control
>> and data information so I can be certain that output on the "control"
>> file descriptor is guaranteed to be generated by the interpreter.
>
> If your program is writing *data* to stderr, it's badly designed. You
> should fix it instead of trying to get Python changed to accommodate
> it.

That is a generalization which is not true. Counterexample: I have
students writing a simple interpreter in Python, and their interpreter
should output syntax errors to stderr when the program it is
interpreting contains one. Now, if their error messages look very
similar to Python's, how does one distinguish between an error in the
implementation of their interpreter and an error raised by their
interpreter?

Furthermore, having this separation is somewhat pointless without the
structured part, as ideally I would like it if all compilers and
interpreters produced similar output so I could easily measure how
many errors beginning programmers have in various languages and group
them by type. I have to start somewhere with this project and I was
hoping the python community would be in favor of adding such support
as I feel the changes are relatively trivial.

-Bryce


From ncoghlan at gmail.com  Thu Apr 26 02:58:04 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 26 Apr 2012 10:58:04 +1000
Subject: [Python-ideas] Structured Error Output
In-Reply-To: <CAJ6ei3Q_==a=maCSS2-+QDYEqCN1X9yboyNw+iGK==j_1a7X_g@mail.gmail.com>
References: <CAJ6ei3SRGMRocFb7uBkp3t0RyvCeAAs-th5U=o0NuM8jpMhvBA@mail.gmail.com>
	<jn9qj6$5go$1@dough.gmane.org>
	<CAJ6ei3T1WFZKNUmvnfe3rieqHiP0EyGRnyk8zhrTn79+gSkcJA@mail.gmail.com>
	<20120425191500.081ce8f0@bhuda.mired.org>
	<CAJ6ei3Q_==a=maCSS2-+QDYEqCN1X9yboyNw+iGK==j_1a7X_g@mail.gmail.com>
Message-ID: <CADiSq7ceE6Cj7zhsGVYJ2vNexbpdZrFFjH_pAjho2Y3piLXaVg@mail.gmail.com>

On Thu, Apr 26, 2012 at 10:30 AM, Bryce Boe <bboe at cs.ucsb.edu> wrote:
> Furthermore, having this separation is somewhat pointless without the
> structured part, as ideally I would like it if all compilers and
> interpreters produced similar output so I could easily measure how
> many errors beginning programmers have in various languages and group
> them by type. I have to start somewhere with this project and I was
> hoping the python community would be in favor of adding such support
> as I feel the changes are relatively trivial.

And we're telling you that no, the changes you're interested in are
not trivial - the use of stderr is embedded deep within many parts of
the interpreter.

I suggest just raising the bar for your students and require that they
write their errors to both stderr *and* to an error log. Interpreter
generated errors will show up in the former, but will never appear in
the latter.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From bboe at cs.ucsb.edu  Thu Apr 26 04:04:15 2012
From: bboe at cs.ucsb.edu (Bryce Boe)
Date: Wed, 25 Apr 2012 19:04:15 -0700
Subject: [Python-ideas] Structured Error Output
In-Reply-To: <CADiSq7ceE6Cj7zhsGVYJ2vNexbpdZrFFjH_pAjho2Y3piLXaVg@mail.gmail.com>
References: <CAJ6ei3SRGMRocFb7uBkp3t0RyvCeAAs-th5U=o0NuM8jpMhvBA@mail.gmail.com>
	<jn9qj6$5go$1@dough.gmane.org>
	<CAJ6ei3T1WFZKNUmvnfe3rieqHiP0EyGRnyk8zhrTn79+gSkcJA@mail.gmail.com>
	<20120425191500.081ce8f0@bhuda.mired.org>
	<CAJ6ei3Q_==a=maCSS2-+QDYEqCN1X9yboyNw+iGK==j_1a7X_g@mail.gmail.com>
	<CADiSq7ceE6Cj7zhsGVYJ2vNexbpdZrFFjH_pAjho2Y3piLXaVg@mail.gmail.com>
Message-ID: <CAJ6ei3Tsq1XAJq9caaWJ9HS_2vGvPmYn+TXYJeyVh6ypZBoSfg@mail.gmail.com>

> And we're telling you that no, the changes you're interested in are
> not trivial - the use of stderr is embedded deep within many parts of
> the interpreter.

This is the first constructive comment I've received thus far. Perhaps
I am a bit optimistic that grepping for output to stderr and replacing
the write or fprintf calls with a function call would be appropriate.
Maybe a tedious procedure, but it still seems trivial. It's not like
trying to replace the GIL ;)

> I suggest just raising the bar for your students and require that they
> write their errors to both stderr *and* to an error log. Interpreter
> generated errors will show up in the former, but will never appear in
> the latter.

The example I gave was simply an example. Sure, we could ask the
students to do more, but ideally this tool would work on any recent
Python source without requiring source modification. Many of these
solutions have been presented before; however, they fail to work in
the general case.

> Oh, I work with such cases often enough to know that if they've got a
> complete programming language as a tool, you've already lost.
> And if the people writing the code are antagonistic, there is no way
> to differentiate those behaviors. Anything the python interpreter can
> do the programmer can also do. And, for that matter, suppress.

I realize that even if there were another output stream the user could
write to it via os.write(3, "foobar"); however, static checks can be
made on the code to detect such function calls. Also, I'm curious: can
a Python program suppress later syntax errors?

>> That is a generalization which is not true. Counterexample: I have
>> students writing a simple interpreter in Python, and their interpreter
>> should output syntax errors to stderr when the program it is
>> interpreting contains one. Now, if their error messages look very
>> similar to Python's, how does one distinguish between an error in the
>> implementation of their interpreter and an error raised by their
>> interpreter?

> That's not writing data to stderr, that's writing errors. The problem
> in this case is that the program in question isn't handling errors in
> the implementation properly.  If your students aren't bright enough to
> figure out how to catch errors in their implementation and flag them
> as such, flunk them.

Again this was just a contrived example that demonstrates my want to
differentiate between the two. Whether or not the students get stuck
isn't the problem, the problem is that it's not possible to do the
differentiation. But simply put, why not allow for such
differentiation? What is lost by doing so?

> Trying to get all language processors to produce similar error
> messages is tilting at windmills.  The existence of IDE's that parse
> error message and let the user go through them in order hasn't been
> sufficient to cause that to happen. Some abstract wish to study
> beginners errors will have even less effect.

It is my opinion that most people make do with what they have
available to them. Of course, I can do exactly what I want to do
without modifying the interpreter, however, it suffers from the
ambiguity problem I've already mentioned. One of the great things
about open source software is the ability to adapt it to suit your own
needs, and thus I prefer to take the approach of making things better.
Perhaps no one before me has even considered separating interpreter
output from the programs that the interpreter interprets, but I find
that quite hard to believe. This really isn't a problem with compiled
code, because the compilation and type checking process is separate
from the execution process, though I'll admit there is the same
problem with runtime errors such as segmentation faults but these can
be discovered with proper signal handling.

> But you claim the structure is the import part.  Want to give an
> example of how you would "structure the error output" so that errors
> in a program processing program source can be distinguished from
> errors in the processed source, yet at the same time be similar enough
> so that some tool could be used on both sets of errors?

First, the two changes should work in tandem; thus both interpreters
would have a flag, say --structured-error-output, that takes a
filename. With such a flag, directing the error output to different
files is quite trivial. However, even if they went to the same stream
(poor design, in my opinion), the structured messages could have an
attribute indicating which interpreter produced the message, thus
allowing for differentiation. Of course, if they went to the same
stream, then you still have the possibility of spoofing the other
program, which is why the separation is necessary.
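As an illustration only, such a structured message might be one JSON
object per line with an attribute naming the producer (every field name
here is invented, not part of any existing proposal):

```python
import json
import sys

def emit_structured_error(stream, *, source, category, message, lineno=None):
    # Hypothetical format: one JSON object per error, tagged with which
    # interpreter produced it ("host" vs. "student"), so the two can be
    # told apart even if they ever share a stream.
    record = {"source": source, "category": category,
              "message": message, "lineno": lineno}
    stream.write(json.dumps(record) + "\n")

emit_structured_error(sys.stderr, source="student",
                      category="SyntaxError",
                      message="unexpected token ')'", lineno=3)
```

A consumer then filters on the "source" field instead of guessing from
the message text.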

Anyway, I appreciate the argument. It is fairly clear that if I were
to implement this support it is not something that would be integrated
in python thus it's not worth my time. I'll take the band-aid approach
as everyone before me has.

Thanks,
Bryce


From steve at pearwood.info  Thu Apr 26 04:44:01 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 26 Apr 2012 12:44:01 +1000
Subject: [Python-ideas] Structured Error Output
In-Reply-To: <CAJ6ei3Tsq1XAJq9caaWJ9HS_2vGvPmYn+TXYJeyVh6ypZBoSfg@mail.gmail.com>
References: <CAJ6ei3SRGMRocFb7uBkp3t0RyvCeAAs-th5U=o0NuM8jpMhvBA@mail.gmail.com>
	<jn9qj6$5go$1@dough.gmane.org>
	<CAJ6ei3T1WFZKNUmvnfe3rieqHiP0EyGRnyk8zhrTn79+gSkcJA@mail.gmail.com>
	<20120425191500.081ce8f0@bhuda.mired.org>
	<CAJ6ei3Q_==a=maCSS2-+QDYEqCN1X9yboyNw+iGK==j_1a7X_g@mail.gmail.com>
	<CADiSq7ceE6Cj7zhsGVYJ2vNexbpdZrFFjH_pAjho2Y3piLXaVg@mail.gmail.com>
	<CAJ6ei3Tsq1XAJq9caaWJ9HS_2vGvPmYn+TXYJeyVh6ypZBoSfg@mail.gmail.com>
Message-ID: <20120426024401.GB30490@ando>

On Wed, Apr 25, 2012 at 07:04:15PM -0700, Bryce Boe wrote:

> I realize that even if there was another output stream the user could
> write to it via, os.write(3, "foobar"), however, static checks can be
> made on the code to detect such function calls.

I don't think so. Or at least, not easily.

funcs = [len, str, eval, map]
value = funcs[2]("__imp" + "port"[1:] + "__('sys')")  # eval("__import__('sys')")
f = getattr(value, ''.join(map(chr, (115, 116, 100, 101, 114, 114))))  # sys.stderr
y = getattr(f, ''.join(map(chr, (119, 114, 105, 116, 101))))  # stderr.write
y("statically check this!\n")


> Also, I'm curious, can
> a python program suppress later syntax errors?

try:
    exec("x = )(")
except SyntaxError:
    print("nothing to see here, move along")


[...]
> Again this was just a contrived example that demonstrates my want to
> differentiate between the two. Whether or not the students get stuck
> isn't the problem, the problem is that it's not possible to do the
> differentiation. But simply put, why not allow for such
> differentiation? What is lost by doing so?

Apart from simplicity?

You risk infinite regress. stderr exists to differentiate "good" output 
from "error" output. So you propose a new stream to differentiate "good 
errors" (those raised by the students' interpreter, call it stderr2) 
from "bad errors" (those raised by the interpreter running the students' 
interpreter). At some point, someone will have some compelling (to them, 
not necessarily everyone else) use-case that needs to distinguish 
between "real" good errors going to stderr2 and "fake" good errors going 
to stderr2, and propose stderr3. And deeper and deeper down the rabbit 
hole we go...

At some point, people will just say "Enough!". I suggest that the 
distinction between stdout and stderr is exactly that point.

(Although, having said that, I wish there was a stdinfo for 
informational messages that are neither the intended program output nor 
unintended program errors, e.g. status messages, progress indicators, 
etc.)



-- 
Steven


From greg.ewing at canterbury.ac.nz  Thu Apr 26 06:04:17 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 26 Apr 2012 16:04:17 +1200
Subject: [Python-ideas] Structured Error Output
In-Reply-To: <CAJ6ei3Tsq1XAJq9caaWJ9HS_2vGvPmYn+TXYJeyVh6ypZBoSfg@mail.gmail.com>
References: <CAJ6ei3SRGMRocFb7uBkp3t0RyvCeAAs-th5U=o0NuM8jpMhvBA@mail.gmail.com>
	<jn9qj6$5go$1@dough.gmane.org>
	<CAJ6ei3T1WFZKNUmvnfe3rieqHiP0EyGRnyk8zhrTn79+gSkcJA@mail.gmail.com>
	<20120425191500.081ce8f0@bhuda.mired.org>
	<CAJ6ei3Q_==a=maCSS2-+QDYEqCN1X9yboyNw+iGK==j_1a7X_g@mail.gmail.com>
	<CADiSq7ceE6Cj7zhsGVYJ2vNexbpdZrFFjH_pAjho2Y3piLXaVg@mail.gmail.com>
	<CAJ6ei3Tsq1XAJq9caaWJ9HS_2vGvPmYn+TXYJeyVh6ypZBoSfg@mail.gmail.com>
Message-ID: <4F98C941.2050406@canterbury.ac.nz>

On 26/04/12 14:04, Bryce Boe wrote:
> Perhaps
> I am a bit optimistic that grepping for output to stderr and replacing
> the write or fprintf calls with a function call would be appropriate.

I don't see how that would solve your problem anyway. If your student's
interpreter code, written in Python, raises e.g. a TypeError, and nothing
catches it, the error message for it will get printed by the very same
printf call as any other uncaught exception.

Seems to me the solution to your problem lies in sandboxing the student's
code inside something that catches any exceptions emanating from it and
logs them in a distinctive way. You could also replace sys.stdout and
sys.stderr with objects that perform a similar function.
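A minimal sketch of that sandboxing idea, assuming the student code is
available as a source string (the STUDENT-ERROR tag is an arbitrary
choice, not a standard):

```python
import io
import sys

def run_student_code(source, log):
    # Run the student's code with stdout/stderr captured, and log any
    # uncaught exception in a distinctive format so it cannot be
    # confused with the host interpreter's own tracebacks.
    out, err = io.StringIO(), io.StringIO()
    old_out, old_err = sys.stdout, sys.stderr
    sys.stdout, sys.stderr = out, err
    try:
        exec(source, {"__name__": "__student__"})
    except BaseException as exc:
        log.write("STUDENT-ERROR %s: %s\n" % (type(exc).__name__, exc))
    finally:
        sys.stdout, sys.stderr = old_out, old_err
    return out.getvalue(), err.getvalue()
```

Anything the host interpreter itself prints to the real stderr can then
be attributed unambiguously, since student errors only ever reach the log.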

-- 
Greg


From ncoghlan at gmail.com  Thu Apr 26 06:07:38 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 26 Apr 2012 14:07:38 +1000
Subject: [Python-ideas] Structured Error Output
In-Reply-To: <20120426024401.GB30490@ando>
References: <CAJ6ei3SRGMRocFb7uBkp3t0RyvCeAAs-th5U=o0NuM8jpMhvBA@mail.gmail.com>
	<jn9qj6$5go$1@dough.gmane.org>
	<CAJ6ei3T1WFZKNUmvnfe3rieqHiP0EyGRnyk8zhrTn79+gSkcJA@mail.gmail.com>
	<20120425191500.081ce8f0@bhuda.mired.org>
	<CAJ6ei3Q_==a=maCSS2-+QDYEqCN1X9yboyNw+iGK==j_1a7X_g@mail.gmail.com>
	<CADiSq7ceE6Cj7zhsGVYJ2vNexbpdZrFFjH_pAjho2Y3piLXaVg@mail.gmail.com>
	<CAJ6ei3Tsq1XAJq9caaWJ9HS_2vGvPmYn+TXYJeyVh6ypZBoSfg@mail.gmail.com>
	<20120426024401.GB30490@ando>
Message-ID: <CADiSq7edUkcQMDw2MDpS4vu7iSpycmB+L3jFfMfvxHo9hGHsgw@mail.gmail.com>

On Thu, Apr 26, 2012 at 12:44 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> (Although, having said that, I wish there was a stdinfo for
> informational messages that are neither the intended program output nor
> unintended program errors, e.g. status messages, progress indicators,
> etc.)

And, indeed, such a channel exists: it's called the logging system,
which separates event *generation* (calls to the logging message API)
from event *display* (configuration of logging handlers). In
particular, see the following table in the logging HOWTO guide:
http://docs.python.org/howto/logging.html#when-to-use-logging
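For instance, status messages can go through the logging system while
real output stays on stdout (a minimal sketch; handler configuration is
up to the application):

```python
import logging
import sys

# Event *generation*: code just calls the logging API.
status = logging.getLogger("status")
status.setLevel(logging.INFO)

# Event *display*: a handler decides where messages go (here, stderr),
# keeping stdout exclusively for the program's real output.
status.addHandler(logging.StreamHandler(sys.stderr))

status.info("starting processing")  # informational -> stderr
print("42")                         # actual output  -> stdout
status.info("done")
```

Swapping the handler (file, syslog, null) changes the display without
touching any of the call sites.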

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From mwm at mired.org  Thu Apr 26 06:18:12 2012
From: mwm at mired.org (Mike Meyer)
Date: Thu, 26 Apr 2012 00:18:12 -0400
Subject: [Python-ideas] Structured Error Output
In-Reply-To: <CAJ6ei3Tsq1XAJq9caaWJ9HS_2vGvPmYn+TXYJeyVh6ypZBoSfg@mail.gmail.com>
References: <CAJ6ei3SRGMRocFb7uBkp3t0RyvCeAAs-th5U=o0NuM8jpMhvBA@mail.gmail.com>
	<jn9qj6$5go$1@dough.gmane.org>
	<CAJ6ei3T1WFZKNUmvnfe3rieqHiP0EyGRnyk8zhrTn79+gSkcJA@mail.gmail.com>
	<20120425191500.081ce8f0@bhuda.mired.org>
	<CAJ6ei3Q_==a=maCSS2-+QDYEqCN1X9yboyNw+iGK==j_1a7X_g@mail.gmail.com>
	<CADiSq7ceE6Cj7zhsGVYJ2vNexbpdZrFFjH_pAjho2Y3piLXaVg@mail.gmail.com>
	<CAJ6ei3Tsq1XAJq9caaWJ9HS_2vGvPmYn+TXYJeyVh6ypZBoSfg@mail.gmail.com>
Message-ID: <8851db61-4657-4368-9eae-78dbb2ad046c@email.android.com>

Bryce Boe <bboe at cs.ucsb.edu> wrote:

>I realize that even if there was another output stream the user could
>write to it via, os.write(3, "foobar"), however, static checks can be
>made on the code to detect such function calls. Also, I'm curious, can
>a python program suppress later syntax errors?

Yes, a Python program can suppress later syntax errors.

>This really isn't a problem with compiled
>code, because the compilation and type checking process is separate
>from the execution process

For sorting out error messages in modern languages, compilation and execution
are not necessarily separate. You can get compilation errors at pretty much any
point in the execution of the program. Most such languages - including Python, but
also languages that compile to machine or JVM code - provide both the ability to
import uncompiled source, compiling it along the way, and the ability to compile and
run code fragments (aka "eval") in the program. Both of these can generate compilation
errors in the middle of runtime. If my program imports a config module the user provided
and it has a syntax error in it, is that syntax error a runtime error or a compilation error?

>> But you claim the structure is the import part.  Want to give an
>> example of how you would "structure the error output" so that errors
>> in a program processing program source can be distinguished from
>> errors in the processed source, yet at the same time be similar
>enough
>> so that some tool could be used on both sets of errors?>
>First, the two changes should work in tandem thus both interpreters
>would have a flag, say --structured-error-output that takes a
>filename. With such a flag directing the different errors to different
>files is quite trivial. 

It is? How do you distinguish between an actual syntax error and a syntax
error raised by an explicit raise statement? And which of those two cases
would a syntax error raised by passing a bad code fragment to eval be, or
is that a third case requiring yet another flag?

>Anyway, I appreciate the argument. It is fairly clear that if I were
>to implement this support it is not something that would be integrated
>in python thus it's not worth my time. I'll take the band-aid approach
>as everyone before me has.

I'm still waiting for a proposal solid enough to evaluate. I like the idea
of more structured error output, and think it might be a nice addition to
the interpreter. A python programmer already has complete control over
all error messages, though it can be hard to get to. Making that easier is
a worthy goal, but it's got to be more than the ability to send some ill-defined
set of exceptions to a different output stream.

    <mike

-- 
Sent from my Android tablet. Please excuse my swyping.


From ericsnowcurrently at gmail.com  Thu Apr 26 07:34:21 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 25 Apr 2012 23:34:21 -0600
Subject: [Python-ideas] sys.implementation
In-Reply-To: <CALFfu7B4GpCNgyxLUmtji1LLKLgFPPvR2Zd3e2V6UW4_0HoR7g@mail.gmail.com>
References: <CALFfu7DYyZMUp40MDR9-vhpOkPvr=cwt5EmMHEGTrmix_kZbYg@mail.gmail.com>
	<CALFfu7B4GpCNgyxLUmtji1LLKLgFPPvR2Zd3e2V6UW4_0HoR7g@mail.gmail.com>
Message-ID: <CALFfu7CFG0dqi709ZHutWgXWs6Rzb_bO8rUqvZx4FRQtuo0dMQ@mail.gmail.com>

On Tue, Apr 24, 2012 at 12:42 AM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> The premise is that sys.implementation would be a "namedtuple" (like
> sys.version_info). It would contain (as much as is practical) the
> information that is specific to a particular implementation of Python.
> "Required" attributes of sys.implementation would be those that the
> standard library makes use of. For instance, importlib would make use
> of sys.implementation.name (or sys.implementation.cache_tag) if there
> were one. The thread from 2009 covered a lot of this ground already.
> [1]
>
> Here are the "required" attributes of sys.implementation that I advocate:
>
> * name (mixed case; sys.implementation.name.lower() used as an identifier)
> * version (of the implementation, not of the targeted language
> version; structured like sys.version_info?)
>
> Here are other variables that _could_ go in sys.implementation:
>
> * cache_tag (e.g. 'cpython33' for CPython 3.3)
> * repository
> * repository_revision
> * build_toolchain
> * url (or website)
> * site_prefix
> * runtime
>
> Let's start with a minimum set of expected attributes, which would
> have an immediate purpose in the stdlib. However, let's not disallow
> implementations from adding whatever other attributes are meaningful
> for them.

FYI, I've created a tracker ticket with a patch and moved this over to
python-dev.

-eric

[1] http://bugs.python.org/issue14673


From jimjjewett at gmail.com  Thu Apr 26 16:36:04 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 26 Apr 2012 10:36:04 -0400
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <CADiSq7fTK6+tDGnusRp=KJ3k3TLG_HGUQA=4-KXkS_+ocEzXYQ@mail.gmail.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com>
	<CADiSq7fTK6+tDGnusRp=KJ3k3TLG_HGUQA=4-KXkS_+ocEzXYQ@mail.gmail.com>
Message-ID: <CA+OGgf4BC18E6S7UTxPX-gn35WNcUnsP12vxfuXJNUaRt4uy0g@mail.gmail.com>

On Wed, Apr 25, 2012 at 5:52 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> ... tackle the problem of choosing between multiple
> implementations of a module at runtime ...

I like the idea, but I'm not sure I would like the result.

The obvious path (at least for experimenting) is to replace the import
statement with an import function.

(a)  At the interactive prompt, the function would return the module,
and therefore allow it to be referenced as _.  (I don't always
remember the "as shortname" until I've already hit enter.  And often
all I really want is to see help(module), but without switching to
another window.)

(b)  The functional interface could expose a configuration object, so
that in addition to deciding between alternate implementations, a
single implementation could set up objects differently.  (Do I really
need to define that type of handler?  What loggers should I enable
initially, even while setting up the rest of the logging machinery?
OK, let me open that database connection before I define this class.)

These are both features I have often wanted.  But I don't want to deal
with the resulting questions, like "Wait, how can that logger not
exist?  Oh, someone else imported logging first..."

And I'm not sure it is possible to get the freedom of (b) without
those problems, unless modules stop being singletons.

-jJ


From jimjjewett at gmail.com  Thu Apr 26 22:30:32 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 26 Apr 2012 16:30:32 -0400
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F9818D6.5010808@stoneleaf.us>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97AEAA.90009@canterbury.ac.nz> <4F97B19E.7000604@hotpy.org>
	<CAB4yi1M4S=J9sHQysBm6e1d6Lau6MY2aDNj-TQn+xcqrH72D-w@mail.gmail.com>
	<4F9818D6.5010808@stoneleaf.us>
Message-ID: <CA+OGgf6iMUkGNHoqf1SY43xBYLoD4XaMT2KDYK7QOc6ZwaYHtA@mail.gmail.com>

On Wed, Apr 25, 2012 at 11:31 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
> Matt Joiner wrote:
>>
>> if not use_simple_api:
>>     class C:

> More like:
>
>   class C:
>       def basic_method(self):
>           pass
>       if not use_simple_gui:
>           def advanced_method(self, this, that):
>               pass


He may have been thinking of a replacement idiom

    class C:
        ...

    if system_has_propertyX():
        class C(C):
            # Extend the Base class, but keep the name...
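Spelled out as runnable code (system_has_propertyX is a stand-in for
whatever capability check applies):

```python
def system_has_propertyX():
    # Stand-in for a real capability check; always true here.
    return True

class C:
    def basic_method(self):
        return "basic"

if system_has_propertyX():
    class C(C):  # extend the base class, but keep the name
        def advanced_method(self):
            return "advanced"
```

At the point of the second class statement, the name C still refers to
the base class, so the subclass extends it and then shadows the name.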


-jJ


From techtonik at gmail.com  Fri Apr 27 07:45:20 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Fri, 27 Apr 2012 08:45:20 +0300
Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43
In-Reply-To: <20120424184547.0bca79f5@bhuda.mired.org>
References: <mailman.31.1335175203.9054.python-ideas@python.org>
	<CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
	<CAMZYqRTobyMQrMsNu-8yC8TqxaRjxZD0UWGJNOdmL2AfzSi6Mg@mail.gmail.com>
	<CAPkN8xKFXYu8BssNdqMWqbHPvcnvKroG0kWzMXbOmjXTFp46nQ@mail.gmail.com>
	<20120424184547.0bca79f5@bhuda.mired.org>
Message-ID: <CAPkN8x+Cwj_RPHEe5cN+ws7r3biB9D3D0psj_gfuJOX5T0ijug@mail.gmail.com>

On Wed, Apr 25, 2012 at 1:45 AM, Mike Meyer <mwm at mired.org> wrote:
> On Tue, 24 Apr 2012 18:42:24 +0300
> anatoly techtonik <techtonik at gmail.com> wrote:
>
>> On Mon, Apr 23, 2012 at 3:01 PM, Chris Rebert <pyideas at rebertia.com> wrote:
>> > On Mon, Apr 23, 2012 at 4:10 AM, Hobson Lane <hobsonlane at gmail.com> wrote:
>> > <snip>
>> >> On Mon, Apr 23, 2012 at 6:00 PM, <python-ideas-request at python.org> wrote:
>> >>>
>> >>> Send Python-ideas mailing list submissions to
>> >>>         python-ideas at python.org
>> >>>
>> >>> To subscribe or unsubscribe via the World Wide Web, visit
>> >>>         http://mail.python.org/mailman/listinfo/python-ideas
>> >>> or, via email, send a message with subject or body 'help' to
>> >>>         python-ideas-request at python.org
>> >>>
>> >>> You can reach the person managing the list at
>> >>>         python-ideas-owner at python.org
>> >>>
>> >>> When replying, please edit your Subject line so it is more specific
>> >>> than "Re: Contents of Python-ideas digest..."
>> >
>> > Please avoid replying to the digest; it breaks conversation threading.
>> > Switch to a non-digest mailing list subscription when not lurking.
>>
>> But to reply to a non-digest message you need to have received it in
>> non-digest mode, which didn't happen here. The only way that would
>> work is to ask Mailman to resend the message; I don't know if that's
>> possible.
>
> Your initial statement is - or at least may be - wrong. If the digest
> is in one of the well-known formats, a good MUA will let you burst a
> digest into individual messages and reply to them just like any other
> message to a list. mh, nmh and GUI's built on top them can do this.

I use GMail, which is a good MUA to me (much better than mh, nmh and
their GUIs). I believe most users feel comfortable with email in
their browsers and won't install an awkward terminal client like mh or
nmh just to reply to digest messages. Requiring people to "use
something proper beforehand" is neither a solution nor an alternative.

I am not sure that `programs` nowadays make any sense if you cannot
access your data from all the entry points.
--
anatoly t.


From phd at phdru.name  Fri Apr 27 08:25:57 2012
From: phd at phdru.name (Oleg Broytman)
Date: Fri, 27 Apr 2012 10:25:57 +0400
Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43
In-Reply-To: <CAPkN8x+Cwj_RPHEe5cN+ws7r3biB9D3D0psj_gfuJOX5T0ijug@mail.gmail.com>
References: <mailman.31.1335175203.9054.python-ideas@python.org>
	<CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
	<CAMZYqRTobyMQrMsNu-8yC8TqxaRjxZD0UWGJNOdmL2AfzSi6Mg@mail.gmail.com>
	<CAPkN8xKFXYu8BssNdqMWqbHPvcnvKroG0kWzMXbOmjXTFp46nQ@mail.gmail.com>
	<20120424184547.0bca79f5@bhuda.mired.org>
	<CAPkN8x+Cwj_RPHEe5cN+ws7r3biB9D3D0psj_gfuJOX5T0ijug@mail.gmail.com>
Message-ID: <20120427062557.GA32663@iskra.aviel.ru>

Hell!

On Fri, Apr 27, 2012 at 08:45:20AM +0300, anatoly techtonik <techtonik at gmail.com> wrote:
> won't install some awkward terminal

   "awkward" indeed! Trolling as usual, yeah?

> Requirement to "use something proper
> beforehand" was neither a solution nor an alternative.

   It was not a requirement, just an advice. But that was an advice to
solve a real problem. Replying to digest brings a lot of problems and
thus prevents effective communication. Email, mailing lists and archives
don't make sense if they don't help to communicate.
   And it was only AN advice, not THE advice. Another solution for the
same problem would be not to reply to digest. I am sure there are other
solutions.

> I am not sure that `programs` nowadays make any sense if you cannot
> access your data from all the entry points.

   Do you believe that those "awkward" terminal programs work remotely
quite fine?!

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.


From ericsnowcurrently at gmail.com  Fri Apr 27 08:36:26 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 27 Apr 2012 00:36:26 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
Message-ID: <CALFfu7AYmTqsNzfc-hX-+i7voVNFh28z1pgeBF1WXh1vhtpcLA@mail.gmail.com>

I've written up a PEP for the sys.implementation idea.  Feedback is welcome!

You'll notice some gaps which I'll be working on to fill in over the
next couple days.  Don't mind the gaps. <wink>  They are in less
critical (?) portions and I wanted to get this out to you before the
weekend.  Thanks!

-eric

--------------------------------------------------------------

PEP: 4XX
Title: Adding sys.implementation
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 26-April-2012
Python-Version: 3.3


Abstract
========

This PEP introduces a new variable for the sys module: ``sys.implementation``.
The variable holds consolidated information about the implementation of
the running interpreter.  Thus ``sys.implementation`` is the source to
which the standard library may look for implementation-specific
information.

The proposal in this PEP is in line with a broader emphasis on making
Python friendlier to alternate implementations.  It describes the new
variable and the constraints on what that variable contains.  The PEP
also explains some immediate use cases for ``sys.implementation``.


Motivation
==========

For a number of years now, the distinction between Python-the-language
and CPython (the reference implementation) has been growing.  Most of
this change is due to the emergence of Jython, IronPython, and PyPy as
viable alternate implementations of Python.

Consider, however, the nearly two decades of CPython-centric Python
(i.e. most of its existence).  That focus has understandably contributed
to quite a few CPython-specific artifacts both in the standard library
and exposed in the interpreter.  Though the core developers have made an
effort in recent years to address this, quite a few of the artifacts
remain.

Part of the solution is presented in this PEP:  a single namespace on
which to consolidate implementation specifics.  This will help focus
efforts to differentiate the implementation specifics from the language.
Additionally, it will foster a multiple-implementation mindset.


Proposal
========

We will add ``sys.implementation``, in the sys module, as a namespace to
contain implementation-specific information.

The contents of this namespace will remain fixed during interpreter
execution and through the course of an implementation version.  This
ensures that behaviors which depend on variables in
``sys.implementation`` don't change between versions.

``sys.implementation`` is a dictionary, as opposed to any form of "named"
tuple (a la ``sys.version_info``).  This is partly because it doesn't
have meaning as a sequence, and partly because it's a potentially more
variable data structure.

The namespace will contain at least the variables described in the
`Required Variables`_ section below.  However, implementations are free
to add other implementation information there.  Some possible extra
variables are described in the `Other Possible Variables`_ section.

This proposal takes a conservative approach in requiring only two
variables.  As more become appropriate, they may be added with discretion.


Required Variables
--------------------

These are variables in ``sys.implementation`` on which the standard
library would rely, meaning they would need to be defined:

name
   the name of the implementation (case sensitive).

version
   the version of the implementation, as opposed to the version of the
   language it implements.  This would use a standard format, similar to
   ``sys.version_info`` (see `Version Format`_).
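As a concrete illustration (values hypothetical; the exact version
format is still an open question), CPython 3.3 might populate the
namespace like this:

```python
# Hypothetical contents for CPython 3.3 under this proposal; only
# 'name' and 'version' are required.  The version describes the
# implementation itself, not the language it implements.
implementation = {
    'name': 'cpython',                 # case sensitive
    'version': (3, 3, 0, 'alpha', 2),  # sys.version_info-style tuple
}

assert implementation['name'] == 'cpython'
assert implementation['version'][:2] == (3, 3)
```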


Other Possible Variables
------------------------

These variables could be useful, but don't necessarily have a clear use
case presently:

cache_tag
   a string used for the PEP 3147 cache tag (e.g. 'cpython33' for
   CPython 3.3).  The name and version from above could be used to
   compose this, though an implementation may want something else.
   However, module caching is not a requirement of implementations, nor
   is the use of cache tags.

repository
   the implementation's repository URL.

repository_revision
   the revision identifier for the implementation.

build_toolchain
   identifies the tools used to build the interpreter.

url (or website)
   the URL of the implementation's site.

site_prefix
   the preferred site prefix for this implementation.

runtime
   the run-time environment in which the interpreter is running.

gc_type
   the type of garbage collection used.


Version Format
--------------

XXX same as sys.version_info?


Rationale
=========

The status quo for implementation-specific information gives us that
information in a more fragile, harder to maintain way.  It's spread out
over different modules or inferred from other information, as we see with
``platform.python_implementation()``.

This PEP is the main alternative to that approach.  It consolidates the
implementation-specific information into a single namespace and makes
explicit that which was implicit.

With the single-namespace-under-sys approach being so straightforward,
no alternatives have been considered for this PEP.

Discussion
==========

The topic of ``sys.implementation`` came up on the python-ideas list in
2009, where the reception was broadly positive [1]_.  I revived the
discussion recently while working on a pure-python ``imp.get_tag()`` [2]_.
The messages in `issue 14673`_ are also relevant.


Use-cases
=========

``platform.python_implementation()``
------------------------------------

"explicit is better than implicit"

The platform module guesses the python implementation by looking for
clues in a couple different sys variables [3]_.  However, this approach
is fragile.  Beyond that, it's limited to those implementations that core
developers have blessed by special-casing them in the platform module.

With ``sys.implementation`` the various implementations would *explicitly*
set the values in their own version of the ``sys`` module.

Aside from the guessing, another concern is that the platform module is
part of the stdlib, which ideally would keep implementation details such
as those moving to ``sys.implementation`` to a minimum.

Any overlap between ``sys.implementation`` and the platform module would
simply defer to ``sys.implementation`` (with the same interface in
platform wrapping it).
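A minimal sketch of how that deferral might look, using a plain dict to
stand in for the proposed ``sys.implementation`` (the function name
mirrors the platform module's):

```python
def python_implementation(implementation):
    """Return the implementation name explicitly rather than guessing.

    ``implementation`` stands in for the proposed ``sys.implementation``;
    each implementation sets its own values in its version of the sys
    module, so no special-casing in the platform module is needed.
    """
    return implementation['name']

# No clue-sniffing, and no blessing of particular implementations:
assert python_implementation({'name': 'CPython'}) == 'CPython'
assert python_implementation({'name': 'IronPython'}) == 'IronPython'
```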


Cache Tag Generation in Frozen Importlib
----------------------------------------

PEP 3147 defined the use of a module cache and cache tags for file names.
The importlib bootstrap code, frozen into the Python binary as of 3.3,
uses the cache tags during the import process.  Part of the project to
bootstrap importlib has been to clean out of Python/import.c any code
that did not need to be there.

The cache tag defined in Python/import.c was hard-coded to
``"cpython" MAJOR MINOR`` [4]_.  For importlib the options are either
hard-coding it in the same way, or guessing the implementation in the
same way as does ``platform.python_implementation()``.

As long as the hard-coded tag is limited to CPython-specific code, it's
livable.  However, inasmuch as other Python implementations use the
importlib code to work with the module cache, a hard-coded tag would
become a problem.

Directly using the platform module in this case is a non-starter.  Any
module used in the importlib bootstrap must be built-in or frozen,
neither of which apply to the platform module.  This is the point that
led to the recent interest in ``sys.implementation``.

Regardless of how the implementation name is gotten, the version to use
for the cache tag is more likely to be the implementation version rather
than the language version.  That implementation version is not readily
identified anywhere in the standard library.
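For illustration, a cache tag composed from the two required variables
might look like this (a hypothetical helper only; implementations remain
free to choose a different tag entirely):

```python
def make_cache_tag(implementation):
    """Compose a PEP 3147 cache tag from the proposed required variables.

    importlib could derive the tag from the implementation name and the
    *implementation* version, instead of hard-coding "cpython" plus the
    language version.
    """
    major, minor = implementation['version'][:2]
    return '{0}{1}{2}'.format(implementation['name'].lower(), major, minor)

assert make_cache_tag({'name': 'CPython', 'version': (3, 3, 0)}) == 'cpython33'
assert make_cache_tag({'name': 'PyPy', 'version': (1, 8, 0)}) == 'pypy18'
```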


Implementation-Specific Tests
-----------------------------

XXX

http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l509
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1246
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1252
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1275


Jython's ``os.name`` Hack
-------------------------

XXX

http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l512


Impact on CPython
=================

XXX


Feedback From Other Python Implementors
=========================================

IronPython
----------

XXX

Jython
------

XXX

PyPy
----

XXX


Past Efforts
============

XXX PEP 3139
XXX PEP 399


Open Issues
===========

* What are the long-term objectives for sys.implementation?

  - pull in implementation detail from the main sys namespace and
    elsewhere (PEP 3139 lite).

* Alternatives to the approach dictated by this PEP?

* ``sys.implementation`` as a proper namespace rather than a dict.  It
  would be its own module or an instance of a concrete class.


Implementation
==============

The implementation of this PEP is covered in `issue 14673`_.


References
==========

.. [1]

   http://mail.python.org/pipermail/python-dev/2009-October/092893.html

.. [2]

   http://mail.python.org/pipermail/python-ideas/2012-April/014878.html

.. [3]

   http://hg.python.org/cpython/file/2f563908ebc5/Lib/platform.py#l1247

.. [4]

   http://hg.python.org/cpython/file/2f563908ebc5/Python/import.c#l121

.. _issue 14673:

   http://bugs.python.org/issue14673


Copyright
=========

    This document has been placed in the public domain.

Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:


From stefan_ml at behnel.de  Fri Apr 27 09:05:26 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 27 Apr 2012 09:05:26 +0200
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
Message-ID: <jndgfm$rl6$1@dough.gmane.org>

Eric Snow, 24.04.2012 21:23:
> In a function you can use a return statement to break out of execution
> in the middle of the function.  With modules you have no recourse.
> This is akin to return statements being allowed only at the end of a
> function.
> 
> There are a small number of ways you can work around this, but they
> aren't great.  This includes using wrapper modules or import hooks or
> sometimes from-import-*.  Otherwise, if your module's execution is
> conditional, you end up indenting everything inside an if/else
> statement.
> 
> Proposal: introduce a non-error mechanism to break out of module
> execution.  This could be satisfied by a statement like break or
> return, though those specific ones could be confusing.  It could also
> involve raising a special subclass of ImportError that the import
> machinery simply handles as not-an-error.
> 
> This came up last year on python-list with mixed results. [1]
> However, time has not dimmed the appeal for me so I'm rebooting here.
> 
> While the proposal seems relatively minor, the use cases are not
> extensive. <wink>  The three main ones I've encountered are these:
> 
> 1. C extension module with fallback to pure Python:
> 
>   try:
>       from _extension_module import *
>   except ImportError:
>       pass
>   else:
>       break  # or whatever color the bikeshed is
> 
>   # pure python implementation goes here
> 
> 2. module imported under different name:
> 
>   if __name__ != "expected_name":
>       from expected_name import *
>       break
> 
>   # business as usual
> 
> 3. module already imported under a different name:
> 
>   if "other_module" in sys.modules:
>       exec("from other_module import *", globals())
>       break
> 
>   # module code here
> 
> Thoughts?

Without having read through the thread, I think that code that needs this
is just badly structured. All of the above cases can be fixed by moving the
code into a separate (and appropriately named) module and importing
conditionally from that.
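For instance, the C-extension fallback (case 1 above) restructures
without any break-out statement. Module names here are hypothetical, and
the pure-Python fallback is inlined so the sketch runs on its own:

```python
# The accelerated module, when present, wins; otherwise the pure-Python
# code (normally `from _pure_module import *`) is used instead.
try:
    from _extension_module import count  # hypothetical C accelerator
except ImportError:
    def count(data):                     # stand-in for the pure version
        return len(data)

assert count('abc') == 3
```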

I'm generally -1 on anything that would allow non-error code at an
arbitrary place further up in a module to prevent the non-indented module
code I'm looking at from being executed. Whether the result of that
execution makes it into the module API or not is a different question that
is commonly answered either by "__all__" at the very top of a module or by
the code at the very end of the module, not in between.

Stefan



From jimjjewett at gmail.com  Fri Apr 27 15:19:26 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 27 Apr 2012 09:19:26 -0400
Subject: [Python-ideas] breaking out of module execution
In-Reply-To: <4F982D34.4010709@pearwood.info>
References: <CALFfu7A=Bp6rcfBTRGknC_tXrqqsOyXQ7Gx1GzqUio3CRh-M7g@mail.gmail.com>
	<20120424214058.22540616@pitrou.net> <4F9705EB.1070702@egenix.com>
	<jn71dc$3on$1@dough.gmane.org> <4F970B16.2020702@egenix.com>
	<jn72eb$el9$1@dough.gmane.org> <4F970F36.2070208@egenix.com>
	<4F974B6B.3040208@pearwood.info>
	<CADiSq7fnz74FQnQrCMo6gvKx541t6wKZJKLNTUvt0hgnTKeyTg@mail.gmail.com>
	<4F97C6CF.8060106@egenix.com> <4F97E4B2.6070302@pearwood.info>
	<4F97E9D7.5090804@hotpy.org>
	<4A8CACF1-860D-48F7-9538-596A3EEB4445@mac.com>
	<4F98039A.3090301@egenix.com> <4F982D34.4010709@pearwood.info>
Message-ID: <CA+OGgf7=Wf+Z352X-xP6eW5UT8rdje1Rq3B2xxa8nidv=69oPg@mail.gmail.com>

On Wed, Apr 25, 2012 at 12:58 PM, Steven D'Aprano <steve at pearwood.info> wrote:

> Why would hundreds of lines suddenly become
> unreadable and hard to edit because they have a little bit of leading
> whitespace in front of them?

A single indent isn't that bad for reading or editing; the problem is
with skimming.

def X...


class A....

ZZZ= ....


Anything not on the far left is part of the undifferentiated "...".

-jJ


From jimjjewett at gmail.com  Fri Apr 27 15:31:02 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 27 Apr 2012 09:31:02 -0400
Subject: [Python-ideas] Module __getattr__ [Was: breaking out of module
	execution]
Message-ID: <CA+OGgf6d+J6wdqr6f0PVT44QpmXOXt=iw+-sQwKcFz9KY-SBOA@mail.gmail.com>

On Wed, Apr 25, 2012 at 2:35 PM, Matt Joiner <anacrolix at gmail.com> wrote:
> If this is to be done I'd like to see all special methods supported. One of
> particular interest to modules is __getattr__...

For What It's Worth, supporting __setattr__ and __getattr__ is one of
the few reasons that I have considered subclassing modules.

The workarounds of either offering public set_varX and get_varX
functions, or moving configuration to a separate singleton, just feel
kludgy.
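A sketch of what that subclassing workaround looks like today, with no
new language support (module and attribute names hypothetical):

```python
import sys
import types

class ConfigModule(types.ModuleType):
    """Module subclass computing attributes on demand via __getattr__,
    instead of public get_varX()/set_varX() functions or a separate
    configuration singleton."""
    def __getattr__(self, name):
        if name == 'debug_level':
            return 0  # a computed or lazily-loaded default
        raise AttributeError(name)

# Installing the instance in sys.modules makes it importable:
sys.modules['example_config'] = ConfigModule('example_config')

import example_config
assert example_config.debug_level == 0
```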

Since those module methods would be defined at the far left, I don't
think it would mess up understanding any more than they already do on
regular classes.  (There is always *some* surprise, just because they
are unusual.)

That said, I personally tend to view modules as a special case of
classes, so I wouldn't be shocked if others found it more confusing
than I would -- particularly as to whether or not the module's
__getattr__ would somehow affect the lookup chain for classes defined
within the module.

-jJ


From techtonik at gmail.com  Fri Apr 27 20:12:15 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Fri, 27 Apr 2012 21:12:15 +0300
Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43
In-Reply-To: <20120427062557.GA32663@iskra.aviel.ru>
References: <mailman.31.1335175203.9054.python-ideas@python.org>
	<CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
	<CAMZYqRTobyMQrMsNu-8yC8TqxaRjxZD0UWGJNOdmL2AfzSi6Mg@mail.gmail.com>
	<CAPkN8xKFXYu8BssNdqMWqbHPvcnvKroG0kWzMXbOmjXTFp46nQ@mail.gmail.com>
	<20120424184547.0bca79f5@bhuda.mired.org>
	<CAPkN8x+Cwj_RPHEe5cN+ws7r3biB9D3D0psj_gfuJOX5T0ijug@mail.gmail.com>
	<20120427062557.GA32663@iskra.aviel.ru>
Message-ID: <CAPkN8xKD_WGWHucuknfwkL2NanwR1yOJyv9AOC-JoAUwKy9Sow@mail.gmail.com>

On Fri, Apr 27, 2012 at 9:25 AM, Oleg Broytman <phd at phdru.name> wrote:
> Hell!

Ok. Let's discuss.

> On Fri, Apr 27, 2012 at 08:45:20AM +0300, anatoly techtonik <techtonik at gmail.com> wrote:
>> won't install some awkward terminal
>
>   "awkward" indeed! Trolling as usual, yeah?

You won't win this fight. =)

"Awkward (Adjective): Causing difficulty; hard to do or deal with."

I can't see what's wrong with that. You don't want to say that
installing an unknown program and learning how to work with the whole
terminal toolchain and syncing your mail archive across nodes is as
easy as working with GMail in Chrome, do you? You can record a session
at shelr.tv and try to convince me, but my experience tells me that
Linux keyboard terminal input is sick, and that is the reason why
terminal programs are awkward to deal with - the final explanation
I reached rests at http://www.leonerd.org.uk/hacks/fixterms/ - I wish
the library was a part of the Python distribution.

>> Requirement to "use something proper
>> beforehand" was neither a solution nor an alternative.
>
>   It was not a requirement, just an advice. But that was an advice to
> solve a real problem. Replying to digest brings a lot of problems and
> thus prevents effective communication. Email, mailing lists and archives
> don't make sense if they don't help to communicate.

That's why I prefer Google Groups. You can use email, subscribe as a
mailing list, read the web, have a searchable archive and reply to any
thread you haven't been subscribed to. Everything from a single
interface - no need to carry your mail archive around anymore if you
want to search it without 3rd party services. That is my definition of
effective communication platform. The constructive advice - research a
tutorial how to properly integrate full sync between Mailman and
Google Groups and ban usage of digests altogether.

In fact I've asked in the Mailman group how to properly set it up to
automatically accept mail from Groups subscribers, but the stuff
got too complicated, so it was postponed for a better time to learn
email protocol intricacies.

>   And it was only AN advice, not THE advice. Another solution for the
> same problem would be not to reply to digest. I am sure there are other
> solutions.

Well, sorry for my tone. It seems I've entered "that" favorite style
again. Of course, I accept the idea that mh (which is Public Domain
and that's awesome) can solve the problem with digest reading, but the
story is too exotic for me, and I certainly won't sacrifice features
of web based mail services to make sure I can properly reply to
digests. I'd better not reply to them at all next time, just because
my mail agent doesn't allow it. That's my personal choice, but it is also
the choice that makes people feel excluded when they are faced with such
requirements.

> >> I am not sure that `programs` nowadays make any sense if you cannot
> >> access your data from all the entry points.
>
>   Do you believe that those "awkward" terminal programs work remotely
> quite fine?!

But I can't use them from my (imaginary) tablet version 3. =)
And for me any SSH session interaction is still slow - maybe I am too
picky, but the typing delay in comparison with a browser is more than
enough to feel the difference.

--
anatoly t.


From mwm at mired.org  Fri Apr 27 21:57:12 2012
From: mwm at mired.org (Mike Meyer)
Date: Fri, 27 Apr 2012 15:57:12 -0400
Subject: [Python-ideas] Python-ideas Digest, Vol 65, Issue 43
In-Reply-To: <CAPkN8xKD_WGWHucuknfwkL2NanwR1yOJyv9AOC-JoAUwKy9Sow@mail.gmail.com>
References: <mailman.31.1335175203.9054.python-ideas@python.org>
	<CACZ_DodeyqgbVMCceyXU=HzBzeXS9nFTX=2oseQ8=zoBDw-_-A@mail.gmail.com>
	<CAMZYqRTobyMQrMsNu-8yC8TqxaRjxZD0UWGJNOdmL2AfzSi6Mg@mail.gmail.com>
	<CAPkN8xKFXYu8BssNdqMWqbHPvcnvKroG0kWzMXbOmjXTFp46nQ@mail.gmail.com>
	<20120424184547.0bca79f5@bhuda.mired.org>
	<CAPkN8x+Cwj_RPHEe5cN+ws7r3biB9D3D0psj_gfuJOX5T0ijug@mail.gmail.com>
	<20120427062557.GA32663@iskra.aviel.ru>
	<CAPkN8xKD_WGWHucuknfwkL2NanwR1yOJyv9AOC-JoAUwKy9Sow@mail.gmail.com>
Message-ID: <20120427155712.2929246d@bhuda.mired.org>

On Fri, 27 Apr 2012 21:12:15 +0300
anatoly techtonik <techtonik at gmail.com> wrote:

> On Fri, Apr 27, 2012 at 9:25 AM, Oleg Broytman <phd at phdru.name> wrote:
> > Hell!
> Ok. Let's discuss.

Since I made the suggestion, I'll step in here.

> > On Fri, Apr 27, 2012 at 08:45:20AM +0300, anatoly techtonik <techtonik at gmail.com> wrote:
> >> won't install some awkward terminal
> >   "awkward" indeed! Trolling as usual, yeah?
> You won't win this fight. =)

A truism in any web discussion.

> "Awkward (Adjective): Causing difficulty; hard to do or deal with."
> I can't see what's wrong with that. You don't want to say that
> installing an unknown program and learning how to work with the whole
> terminal toolchain and syncing your mail archive across nodes is as
> easy as working with GMail in Chrome, do you?

Yes, it is. The setup process is a relatively straightforward,
one-time-per-node thing. If you're willing to do a little research
instead of expecting to only use the work of others, you might also be
able to avoid the terminal toolchain issues.

The GMail web client has a number of problems that crop up over and
over again. Its mail marking facilities are subpar, making simply
reading mail more difficult than it is on other clients *every time
you read mail*.  Its mail quoting mechanism is fundamentally broken,
requiring hand-editing the quote *every time you reply to mail*. And, of
course, GMail can't burst a digest, meaning you either don't reply to
digest posts, fix the headers by hand, or piss other people off - all
creating unneeded difficulty *every time you reply to a digest
message*.

So eventually using only GMail in a web browser will cause more
difficulty than setting up a proper mail client. Unless you read
very little mail, in which case - why are you getting the digests?

> >> Requirement to "use something proper
> >> beforehand" was neither a solution nor an alternative.
> >   It was not a requirement, just an advice. But that was an advice to
> > solve a real problem. Replying to digest brings a lot of problems and
> > thus prevents effective communication. Email, mailing lists and archives
> > don't make sense if they don't help to communicate.
> That's why I prefer Google Groups. You can use email, subscribe as a
> mailing list, read the web, have a searchable archive and reply to any
> thread you haven't been subscribed to. Everything from a single
> interface - no need to carry your mail archive around anymore if you
> want to search it without 3rd party services. That is my definition of
> effective communication platform. The constructive advice - research a
> tutorial how to properly integrate full sync between Mailman and
> Google Groups and ban usage of digests altogether.

Google Groups is nothing more than a web interface to a netnews
system. IIUC, one with a broken news<->mail gateway. It doesn't
provide anything that the mail list doesn't provide, except the
ability to reply from Google Groups. And that's broken because Google
Groups is broken. If you really want to use a news or web interface
and reply properly, take the time to find one that doesn't have a
broken mail gateway.

> >   And it was only AN advice, not THE advice. Another solution for the
> > same problem would be not to reply to digest. I am sure there are other
> > solutions.
> Well, sorry for my tone. It seems I've entered "that" favorite style
> again. Of course, I accept the idea that mh (which is Public Domain
> and that's awesome) can solve the problem with digest reading, but the
> story is too exotic for me, and I certainly won't sacrifice features
> of web based mail services to make sure I can properly reply to
> digests.

Nobody said you had to make that sacrifice. There aren't any features
of generic web based mail services that aren't available in proper
mail readers.  Sure, mh isn't one of those. But it may not be the only
mail reader that can burst a digest. So long as you waste effort
trying to change the world rather than changing the part you can
change, you'll never find out if that's true or not.

And of course, you always have the option of only using mh (or a gui
wrapper for same) to read digests. Treating a digest as a single
message is awkward enough that the difficulty of setting up mh and a
GUI wrapper will be lost in the noise if you read enough digests.

> I'd better not reply to them at all next time, just because
> my mail agent doesn't allow it. That's my personal choice, but it is also
> the choice that makes people feel excluded when they are faced with such
> requirements.

If you chose to act in a way that makes you feel excluded, that's your
problem. Sure, it's not the one I want people following, but I'm not
going to waste time trying to change your behavior. I'll point out
your errors to help other people avoid them, but you can do what you
want with that information.

> >> I am not sure that `programs` nowadays make any sense if you cannot
> >> access your data from all the entry points.
> >   Do you believe that those "awkward" terminal programs work remotely
> > quite fine?!
> But I can't use them from my (imaginary) tablet version 3. =)

I could, if I really wanted to. But here you're fundamentally correct
- the mh mail readers only have one interaction with mail servers:
"Load unread mail". That makes using them in an environment where you
want to deal with mail from multiple machines problematical, at
best. That's why I quit using them.

Of course, there are other mail readers besides GMail that don't have
that problem. They also don't have the problems that GMail has. They
may well have other problems, but only you can figure out what those
are and change the things you can control to best deal with them.

       <mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From miki.tebeka at gmail.com  Sat Apr 28 00:52:17 2012
From: miki.tebeka at gmail.com (Miki Tebeka)
Date: Fri, 27 Apr 2012 15:52:17 -0700 (PDT)
Subject: [Python-ideas] Anyone working on a platform-agnostic
 os.startfile()
In-Reply-To: <CACZ_DoceeafEpu9wsyG0JPuAGUay0PR91xOZSwzrVy4BGZo1nQ@mail.gmail.com>
References: <CACZ_DoceeafEpu9wsyG0JPuAGUay0PR91xOZSwzrVy4BGZo1nQ@mail.gmail.com>
Message-ID: <2596653.232.1335567137261.JavaMail.geo-discussion-forums@ynbv36>

There's http://pypi.python.org/pypi/desktop/0.4, but it seems to be
unmaintained.
It provides an "open" command.

On Sunday, April 22, 2012 10:21:10 PM UTC-7, Hobson Lane wrote:
>
> There is significant interest in a cross-platform 
> file-launcher.[1][2][3][4]  The ideal implementation would be 
> an operating-system-agnostic interface that launches a file for editing or 
> viewing, similar to the way os.startfile() works for Windows, but 
> generalized to allow caller-specification of view vs. edit preference and 
> support all registered os.name operating systems, not just 'nt'.
>
> Mercurial has a mature python implementation for cross-platform launching 
> of an editor (either GUI editor or terminal-based editor like vi).[5][6] 
>  The python std lib os.startfile obviously works for Windows.
>
> The Mercurial functionality could be rolled into os.startfile() with 
> additional named parameters for edit or view preference and gui or non-gui 
> preference. Perhaps that would enable backporting below Python 3.x. Or is
> there a better place to incorporate this multi-platform file launching 
> capability?
>
>   [1]: 
> http://stackoverflow.com/questions/1856792/intelligently-launching-the-default-editor-from-inside-a-python-cli-program
>   [2]: 
> http://stackoverflow.com/questions/434597/open-document-with-default-application-in-python
>   [3]: 
> http://stackoverflow.com/questions/1442841/lauch-default-editor-like-webbrowser-module
>   [4]: 
> http://stackoverflow.com/questions/434597/open-document-with-default-application-in-python
>   [5]: http://selenic.com/repo/hg-stable/file/2770d03ae49f/mercurial/ui.py
>   [6]: 
> http://selenic.com/repo/hg-stable/file/2770d03ae49f/mercurial/util.py
>
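The dispatch such a generalized launcher might do can be sketched as
follows (helper name hypothetical; platform strings follow
``sys.platform``; on Windows the existing os.startfile() would be called
directly rather than spawning a command):

```python
def launcher_command(path, platform):
    """Return the command a cross-platform file launcher might run.

    A sketch of the dispatch only; real code would execute the result
    (and os.startfile on Windows is a direct call, not a subprocess).
    """
    if platform.startswith('win'):
        return ('os.startfile', path)
    if platform == 'darwin':
        return ('open', path)      # OS X Launch Services
    return ('xdg-open', path)      # freedesktop.org systems

assert launcher_command('doc.pdf', 'darwin') == ('open', 'doc.pdf')
assert launcher_command('doc.pdf', 'linux2') == ('xdg-open', 'doc.pdf')
```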

From ericsnowcurrently at gmail.com  Sat Apr 28 08:06:29 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sat, 28 Apr 2012 00:06:29 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <CALFfu7AYmTqsNzfc-hX-+i7voVNFh28z1pgeBF1WXh1vhtpcLA@mail.gmail.com>
References: <CALFfu7AYmTqsNzfc-hX-+i7voVNFh28z1pgeBF1WXh1vhtpcLA@mail.gmail.com>
Message-ID: <CALFfu7DEzWXnRxxg_MK4+GMaJFEsA_aHREQcqzJpFfPfvK9Tdg@mail.gmail.com>

Here's an update to the PEP.  Though I have indirect or old feedback
already, I'd love to hear from the other main Python implementations,
particularly regarding the version variable.  Thanks.

-eric

-------------------------------------------------------------


PEP: 421
Title: Adding sys.implementation
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow <ericsnowcurrently at gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 26-April-2012
Post-History: 26-April-2012


Abstract
========

This PEP introduces a new variable for the ``sys`` module:
``sys.implementation``.  The variable holds consolidated information
about the implementation of the running interpreter.  Thus
``sys.implementation`` is the source to which the standard library may
look for implementation-specific information.

The proposal in this PEP is in line with a broader emphasis on making
Python friendlier to alternate implementations.  It describes the new
variable and the constraints on what that variable contains.  The PEP
also explains some immediate use cases for ``sys.implementation``.


Motivation
==========

For a number of years now, the distinction between Python-the-language
and CPython (the reference implementation) has been growing.  Most of
this change is due to the emergence of Jython, IronPython, and PyPy as
viable alternate implementations of Python.

Consider, however, the nearly two decades of CPython-centric Python
(i.e. most of its existence).  That focus understandably contributed
to quite a few CPython-specific artifacts both in the standard library
and exposed in the interpreter.  Though the core developers have made an
effort in recent years to address this, quite a few of the artifacts
remain.

Part of the solution is presented in this PEP:  a single namespace on
which to consolidate implementation specifics.  This will help focus
efforts to differentiate the implementation specifics from the language.
Additionally, it will foster a multiple-implementation mindset.


Proposal
========

We will add ``sys.implementation``, in the ``sys`` module, as a namespace
to contain implementation-specific information.

The contents of this namespace will remain fixed during interpreter
execution and through the course of an implementation version.  This
ensures that behaviors depending on variables in ``sys.implementation``
do not change between versions.

``sys.implementation`` will be a dictionary, as opposed to any form of
"named" tuple (a la ``sys.version_info``).  This is partly because it
doesn't have meaning as a sequence, and partly because it's a potentially
more variable data structure.

The namespace will contain at least the variables described in the
`Required Variables`_ section below.  However, implementations are free
to add other implementation information there.  Some possible extra
variables are described in the `Other Possible Variables`_ section.

This proposal takes a conservative approach in requiring only two
variables.  As more become appropriate, they may be added with discretion.


Required Variables
------------------

These are variables in ``sys.implementation`` on which the standard
library would rely, meaning implementors must define them:

name
   the name of the implementation (case sensitive).

version
   the version of the implementation, as opposed to the version of the
   language it implements.  This would use a standard format, similar to
   ``sys.version_info`` (see `Version Format`_).
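For illustration only (nothing beyond the two names is mandated), a
CPython build might populate the namespace along these lines; the exact
values are a sketch:

```python
import sys

# Illustrative sketch of the two required variables for a CPython
# build.  For CPython the implementation version happens to match the
# language version, so sys.version_info serves; an alternate
# implementation would supply its own version tuple here.
implementation = {
    'name': 'cpython',              # case sensitive
    'version': sys.version_info,    # same format as sys.version_info
}
```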


Other Possible Variables
------------------------

These variables could be useful, but don't necessarily have a clear use
case presently:

cache_tag
   a string used for the PEP 3147 cache tag (e.g. 'cpython33' for
   CPython 3.3).  The name and version from above could be used to
   compose this, though an implementation may want something else.
   However, module caching is not a requirement of implementations, nor
   is the use of cache tags.

repository
   the implementation's repository URL.

repository_revision
   the revision identifier for the implementation.

build_toolchain
   identifies the tools used to build the interpreter.

url (or website)
   the URL of the implementation's site.

site_prefix
   the preferred site prefix for this implementation.

runtime
   the run-time environment in which the interpreter is running.

gc_type
   the type of garbage collection used.


Version Format
--------------

A main point of ``sys.implementation`` is to contain information that
will be used in the standard library.  In order to facilitate the
usefulness of a version variable, its value should be in a consistent
format across implementations.

XXX Subject to feedback

As such, the format of ``sys.implementation['version']`` must follow that
of ``sys.version_info``, which is effectively a named tuple.  It is a
familiar format and generally consistent with normal version format
conventions.
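As a sketch of what "effectively a named tuple" means here, using the
field names of ``sys.version_info`` (the values below are hypothetical):

```python
import collections

# Sketch: an implementation version in sys.version_info's format,
# expressed as a named tuple with the same field names.  The values
# are hypothetical (an implementation at its own version 1.9.0).
VersionInfo = collections.namedtuple(
    'version_info', 'major minor micro releaselevel serial')
impl_version = VersionInfo(1, 9, 0, 'final', 0)
```

Ordinary tuple comparisons then work just as they do for
``sys.version_info``, e.g. ``impl_version[:2] >= (1, 9)``.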


Rationale
=========

The status quo provides implementation-specific information in a
fragile, hard-to-maintain way.  It's spread out over different modules
or inferred from other information, as we see with
``platform.python_implementation()``.

This PEP is the main alternative to that approach.  It consolidates the
implementation-specific information into a single namespace and makes
explicit that which was implicit.  The ``sys`` module should hold the new
namespace because ``sys`` is the depot for interpreter-centric variables
and functions.

Since the single-namespace-under-``sys`` approach is so straightforward,
no alternatives have been considered for this PEP.


Discussion
==========

The topic of ``sys.implementation`` came up on the python-ideas list in
2009, where the reception was broadly positive [1]_.  I revived the
discussion recently while working on a pure-python ``imp.get_tag()`` [2]_.
The messages in `issue #14673`_ are also relevant.


Use-cases
=========

``platform.python_implementation()``
------------------------------------

"explicit is better than implicit"

The ``platform`` module guesses the Python implementation by looking for
clues in a couple of different ``sys`` variables [3]_.  However, this
approach is fragile.  Beyond that, it's limited to those implementations
that core developers have blessed by special-casing them in the
``platform`` module.

With ``sys.implementation`` the various implementations would
*explicitly* set the values in their own version of the ``sys`` module.

Aside from the guessing, another concern is that the ``platform`` module
is part of the stdlib, which ideally should carry as few implementation
details as possible; such details would instead move to
``sys.implementation``.

Any overlap between ``sys.implementation`` and the ``platform`` module
would simply defer to ``sys.implementation`` (with the same interface in
``platform`` wrapping it).
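A sketch of what that wrapper might reduce to (the ``impl`` parameter
stands in for the proposed ``sys.implementation`` dict, and the string
fallback stands in for the module's current sys-sniffing heuristics):

```python
# Sketch: platform.python_implementation() deferring to the proposed
# dict instead of guessing from sys.version.  'impl' stands in for
# sys.implementation; the fallback represents the existing heuristics.
def python_implementation(impl=None):
    if impl is not None:
        return impl['name']
    return 'CPython'    # ...existing guessing logic would go here
```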


Cache Tag Generation in Frozen Importlib
----------------------------------------

PEP 3147 defined the use of a module cache and cache tags for file names.
The importlib bootstrap code, frozen into the Python binary as of 3.3,
uses the cache tags during the import process.  Part of the project to
bootstrap importlib has been to clean out of `Python/import.c` any code
that did not need to be there.

The cache tag defined in `Python/import.c` was hard-coded to
``"cpython" MAJOR MINOR`` [4]_.  For importlib the options are either
hard-coding it in the same way, or guessing the implementation in the
same way as does ``platform.python_implementation()``.

As long as the hard-coded tag is limited to CPython-specific code, it's
livable.  However, inasmuch as other Python implementations use the
importlib code to work with the module cache, a hard-coded tag would
become a problem.

Directly using the ``platform`` module in this case is a non-starter.  Any
module used in the importlib bootstrap must be built-in or frozen,
neither of which apply to the ``platform`` module.  This is the point that
led to the recent interest in ``sys.implementation``.

Regardless of the outcome for the implementation name used, another
problem relates to the version used in the cache tag.  That version is
likely to be the implementation version rather than the language version.
However, the implementation version is not readily identified anywhere in
the standard library.
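One possible sketch of how the bootstrap could derive the tag without
hard-coding it (``impl`` stands in for ``sys.implementation``; composing
a fallback from the required variables is only one option):

```python
# Sketch: deriving the PEP 3147 cache tag from the proposed namespace
# instead of hard-coding "cpython" MAJOR MINOR in Python/import.c.
# 'impl' stands in for sys.implementation.
def cache_tag(impl):
    tag = impl.get('cache_tag')
    if tag is None:
        # fall back to composing one from the required variables
        tag = '{}{}{}'.format(impl['name'], *impl['version'][:2])
    return tag
```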


Implementation-Specific Tests
-----------------------------

Currently there are a number of implementation-specific tests in the test
suite under ``Lib/test``.  The test support module (`Lib/test/support.py`_)
provides some functionality for dealing with these tests.  However, like
the ``platform`` module, ``test.support`` must do some guessing that
``sys.implementation`` would render unnecessary.
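For instance, an implementation-specific skip decorator could read the
proposed dict directly (sketch only; ``impl_name`` stands in for
``sys.implementation['name']``):

```python
import unittest

# Sketch: an implementation-specific skip built on the proposed dict
# rather than on test.support's current guessing.  'impl_name' stands
# in for sys.implementation['name'].
def impl_only(required, impl_name):
    return unittest.skipUnless(impl_name == required,
                               '{} specific test'.format(required))
```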


Jython's ``os.name`` Hack
-------------------------

XXX

http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l512


Feedback From Other Python Implementers
=======================================

IronPython
----------

XXX

Jython
------

XXX

PyPy
----

XXX


Past Efforts
============

PEP 3139
--------

This PEP from 2008 recommended a clean-up of the ``sys`` module in part
by extracting implementation-specific variables and functions into a
separate module.  While PEP 3139 was rejected, its goals are reflected
in PEP 421 to a large extent, though with a much lighter approach.


PEP 399
-------

This informational PEP dictates policy regarding the standard library,
helping to make it friendlier to alternate implementations.  PEP 421 is
proposed in that same spirit.


Open Issues
===========

* What are the long-term objectives for ``sys.implementation``?

  - possibly pull in implementation details from the main ``sys`` namespace
    and elsewhere (PEP 3139 lite).

* Alternatives to the approach dictated by this PEP?

* ``sys.implementation`` as a proper namespace rather than a dict.  It
  would be its own module or an instance of a concrete class.


Implementation
==============

The implementation of this PEP is covered in `issue #14673`_.


References
==========

.. [1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html

.. [2] http://mail.python.org/pipermail/python-ideas/2012-April/014878.html

.. [3] http://hg.python.org/cpython/file/2f563908ebc5/Lib/platform.py#l1247

.. [4] http://hg.python.org/cpython/file/2f563908ebc5/Python/import.c#l121

.. [5] Examples of implementation-specific handling in test.support:

| http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l509
| http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1246
| http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1252
| http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py#l1275

.. _issue #14673: http://bugs.python.org/issue14673

.. _Lib/test/support.py:
http://hg.python.org/cpython/file/2f563908ebc5/Lib/test/support.py

Copyright
=========

    This document has been placed in the public domain.


From pyideas at rebertia.com  Sat Apr 28 08:22:25 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Fri, 27 Apr 2012 23:22:25 -0700
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <CALFfu7DEzWXnRxxg_MK4+GMaJFEsA_aHREQcqzJpFfPfvK9Tdg@mail.gmail.com>
References: <CALFfu7AYmTqsNzfc-hX-+i7voVNFh28z1pgeBF1WXh1vhtpcLA@mail.gmail.com>
	<CALFfu7DEzWXnRxxg_MK4+GMaJFEsA_aHREQcqzJpFfPfvK9Tdg@mail.gmail.com>
Message-ID: <CAMZYqRTomCXg1UZH33fpdvADRFqU37GRTMB7dqPO-aG5XExXSA@mail.gmail.com>

On Fri, Apr 27, 2012 at 11:06 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
<snip>
> Proposal
> ========
>
> We will add ``sys.implementation``, in the ``sys`` module, as a namespace
> to contain implementation-specific information.
>
> The contents of this namespace will remain fixed during interpreter
> execution and through the course of an implementation version.  This
> ensures behaviors don't change between versions which depend on variables
> in ``sys.implementation``.
>
> ``sys.implementation`` will be a dictionary, as opposed to any form of
> "named" tuple (a la ``sys.version_info``). ?This is partly because it
> doesn't have meaning as a sequence, and partly because it's a potentially
> more variable data structure.
<snip>
> Open Issues
> ===========
>
> * What are the long-term objectives for ``sys.implementation``?
>
>  - possibly pull in implementation details from the main ``sys`` namespace
>    and elsewhere (PEP 3137 lite).
>
> * Alternatives to the approach dictated by this PEP?
>
> * ``sys.implementation`` as a proper namespace rather than a dict.  It
>   would be its own module or an instance of a concrete class.

So, what's the justification for it being a dict rather than an object
with attributes? The PEP merely (sensibly) concludes that it cannot be
considered a sequence.

Relatedly, I find the PEP's use of the term "namespace" in reference
to a dict to be somewhat confusing.

Cheers,
Chris


From victor.stinner at gmail.com  Sun Apr 29 03:39:53 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 29 Apr 2012 03:39:53 +0200
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <CALFfu7AYmTqsNzfc-hX-+i7voVNFh28z1pgeBF1WXh1vhtpcLA@mail.gmail.com>
References: <CALFfu7AYmTqsNzfc-hX-+i7voVNFh28z1pgeBF1WXh1vhtpcLA@mail.gmail.com>
Message-ID: <CAMpsgwaAcs_SPAxjiXjpy2n=b8DzXJHYDx4X_pHVjNAxHqfE0A@mail.gmail.com>

> I've written up a PEP for the sys.implementation idea.  Feedback is welcome!

Cool, it's better with a PEP, even if the change looks trivial!

> name
>  the name of the implementation (case sensitive).

It would help if the PEP (and the documentation of sys.implementation)
listed at least the most common names. I suppose that we would have
something like: "CPython", "PyPy", "Jython", "IronPython".

> version
>  the version of the implementation, as opposed to the version of the
>  language it implements.  This would use a standard format, similar to
>  ``sys.version_info`` (see `Version Format`_).

Dummy question: what is sys.version/sys.version_info? The version of
the implementation or the version of the Python language? The PEP
should explain that, and maybe also the documentation of
sys.implementation.version (something like "use sys.version_info to
get the version of the Python language").

> cache_tag

Why not add this information to the imp module?

Victor


From mwm at mired.org  Mon Apr 30 19:33:38 2012
From: mwm at mired.org (Mike Meyer)
Date: Mon, 30 Apr 2012 13:33:38 -0400
Subject: [Python-ideas] argparse FileType v.s default arguments...
Message-ID: <20120430133338.33b2f75d@bhuda.mired.org>

While I really like the argparse module, I've run into a case I think
it ought to handle that it doesn't.

So I'm asking here to see if 1) I've overlooked something, and it can
do this, or 2) there's a good reason for it not to do this or maybe 3)
this is a bad idea.

The usage I ran into looks like this:

parser.add_argument('configfile', default='/my/default/config',
		     type=FileType('r'), nargs='?')

If I provide the argument, everything works fine, and it opens the
named file for me. If I don't, parser.configfile is set to the string,
which doesn't work very well when I try to use its read method.
Unfortunately, setting default to open('/my/default/config') has the
side effect of opening the file. Or raising an exception if the file
doesn't exist (which is a common reason for wanting to provide an
alternative!)
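For now the workaround I've ended up with (sketch) is to drop FileType
for this one argument and do the conversion lazily after parse_args(),
so the default file is only opened if it's actually used:

```python
import argparse

# Sketch of a workaround: leave the default as a plain string, skip
# FileType, and convert after parsing, so the default file is opened
# lazily rather than at parser-construction time.
parser = argparse.ArgumentParser()
parser.add_argument('configfile', default='/my/default/config', nargs='?')
args = parser.parse_args([])        # simulate "no argument given"
if isinstance(args.configfile, str):
    # args.configfile = open(args.configfile)  # open only when needed
    pass
```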

Could default handling be made smarter: if 1) type is set
and 2) the value of default is a string, pass the value of
default to type? Or maybe a flag to make that happen, or even a
default_factory argument (incompatible with default) that would accept
something like default_factory=lambda: open('/my/default/config')?

	  Thanks,
	  <mike
-- 
Mike Meyer <mwm at mired.org>		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From greg at krypto.org  Mon Apr 30 21:59:09 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Mon, 30 Apr 2012 12:59:09 -0700
Subject: [Python-ideas] argparse FileType v.s default arguments...
In-Reply-To: <20120430133338.33b2f75d@bhuda.mired.org>
References: <20120430133338.33b2f75d@bhuda.mired.org>
Message-ID: <CAGE7PNK5TUyRFc7nhSrHsh2VBuTRYG9qsF53DJCXQkNKcXj3Rw@mail.gmail.com>

On Mon, Apr 30, 2012 at 10:33 AM, Mike Meyer <mwm at mired.org> wrote:

> While I really like the argparse module, I've run into a case I think
> it ought to handle that it doesn't.
>
> So I'm asking here to see if 1) I've overlooked something, and it can
> do this, or 2) there's a good reason for it not to do this or maybe 3)
> this is a bad idea.
>
> The usage I ran into looks like this:
>
> parser.add_argument('configfile', default='/my/default/config',
>                     type=FileType('r'), nargs='?')
>
> If I provide the argument, everything works fine, and it opens the
> named file for me. If I don't, parser.configfile is set to the string,
> which doesn't work very well when I try to use it's read method.
> Unfortunately, setting default to open('/my/default/config') has the
> side affect of opening the file. Or raising an exception if the file
> doesn't exist (which is a common reason for wanting to provide an
> alternative!)
>
> Could default handling could be made smarter, and if 1) type is set
> and 2) the value of default is a string, call pass the value of
> default to type? Or maybe a flag to make that happen, or even a
> default_factory argument (incompatible with default) that would accept
> something like default_factory=lambda: open('/my/default/config')?
>
>          Thanks,
>          <mike
>

This makes sense to me as described.  I suggest going ahead and filing an
issue on bugs.python.org with the above.

-gps (who hasn't actually used argparse enough)

From barry at python.org  Mon Apr 30 23:04:54 2012
From: barry at python.org (Barry Warsaw)
Date: Mon, 30 Apr 2012 17:04:54 -0400
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
References: <CALFfu7AYmTqsNzfc-hX-+i7voVNFh28z1pgeBF1WXh1vhtpcLA@mail.gmail.com>
Message-ID: <20120430170454.08d73f74@resist.wooz.org>

On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:

>I've written up a PEP for the sys.implementation idea.  Feedback is welcome!

Thanks for working on this PEP, Eric!

>``sys.implementation`` is a dictionary, as opposed to any form of "named"
>tuple (a la ``sys.version_info``).  This is partly because it doesn't
>have meaning as a sequence, and partly because it's a potentially more
>variable data structure.

I agree that sequence semantics are meaningless here.  Presumably, a
dictionary is proposed because this

    cache_tag = sys.implementation.get('cache_tag')

is nicer than

    cache_tag = getattr(sys.implementation, 'cache_tag', None)

OTOH, maybe we need a nameddict type!
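Something like this minimal sketch, perhaps: dict lookup via attribute
access, raising AttributeError (rather than KeyError) for missing keys
so that getattr() with a default still works:

```python
# Minimal "nameddict" sketch: attribute access over a plain dict.
# Missing keys raise AttributeError so getattr(d, 'x', None) works.
class nameddict(dict):
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)
```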

>repository
>   the implementation's repository URL.

What does this mean?  Oh, I think you mean the URL for the VCS used to develop
this version of the implementation.  Maybe vcs_url (and even then there could
be alternative blessed mirrors in other vcs's).  A Debian analog is the Vcs-*
headers (e.g. Vcs-Git, Vcs-Bzr, etc.).

>repository_revision
>   the revision identifier for the implementation.

I'm not sure what this is.  Is it like the hexgoo you see in the banner of a
from-source build that identifies the revision used to build this interpreter?
Is this key a replacement for that?

>build_toolchain
>   identifies the tools used to build the interpreter.

As a tuple of free-form strings?

>url (or website)
>   the URL of the implementation's site.

Maybe 'homepage' (another Debian analog).


>site_prefix
>   the preferred site prefix for this implementation.
>
>runtime
>   the run-time environment in which the interpreter is running.

I'm not sure what this means either. ;)

>gc_type
>   the type of garbage collection used.

Another free-form string?  What would the values be, say, for CPython and
Jython?

>Version Format
>--------------
>
>XXX same as sys.version_info?

Why not? :)  It might be useful also to have something similar to
sys.hexversion, which I often find convenient.
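The hexversion packing is mechanical enough to derive from a
version_info-style tuple; a sketch following CPython's 0xMMmmppRS
layout:

```python
# Sketch: packing a version_info-style tuple into a sys.hexversion-style
# integer (major byte, minor byte, micro byte, release-level nibble,
# serial nibble), matching CPython's layout.
_LEVELS = {'alpha': 0xA, 'beta': 0xB, 'candidate': 0xC, 'final': 0xF}

def hexversion(info):
    major, minor, micro, releaselevel, serial = info
    return (major << 24 | minor << 16 | micro << 8
            | _LEVELS[releaselevel] << 4 | serial)
```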

>* What are the long-term objectives for sys.implementation?
>
>  - pull in implementation detail from the main sys namespace and
>    elsewhere (PEP 3137 lite).

That's where this seems to be leaning.  Even if it's a good idea, I bet it
will be a long time before the old sys names can be removed.

>* Alternatives to the approach dictated by this PEP?
>
>* ``sys.implementation`` as a proper namespace rather than a dict.  It
>  would be it's own module or an instance of a concrete class.

Which might make sense, as would perhaps a top-level `implementation` module.
IOW, why situate it in sys?

>The implementatation of this PEP is covered in `issue 14673`_.

s/implementatation/implementation

Nicely done!  Let's see how those placeholders shake out.

Cheers,
-Barry


From greg at krypto.org  Mon Apr 30 23:21:08 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Mon, 30 Apr 2012 14:21:08 -0700
Subject: [Python-ideas] Module __getattr__ [Was: breaking out of module
	execution]
In-Reply-To: <CA+OGgf6d+J6wdqr6f0PVT44QpmXOXt=iw+-sQwKcFz9KY-SBOA@mail.gmail.com>
References: <CA+OGgf6d+J6wdqr6f0PVT44QpmXOXt=iw+-sQwKcFz9KY-SBOA@mail.gmail.com>
Message-ID: <CAGE7PNKy5TiOsw_=4BbUWdPikKPrUQQGe+qw8JqEYTrrs26+eA@mail.gmail.com>

On Fri, Apr 27, 2012 at 6:31 AM, Jim Jewett <jimjjewett at gmail.com> wrote:

> On Wed, Apr 25, 2012 at 2:35 PM, Matt Joiner <anacrolix at gmail.com> wrote:
> > If this is to be done I'd like to see all special methods supported. One
> of
> > particular interest to modules is __getattr__...
>
> For What It's Worth, supporting __setattr__ and __getattr__ is one of
> the few reasons that I have considered subclassing modules.
>
> The workarounds of either offering public set_varX and get_varX
> functions, or moving configuration to a separate singleton, just feel
> kludgy.
>
> Since those module methods would be defined at the far left, I don't
> think it would mess up understanding any more than they already do on
> regular classes.  (There is always *some* surprise, just because they
> are unusual.)
>
> That said, I personally tend to view modules as a special case of
> classes, so I wouldn't be shocked if others found it more confusing
> than I would -- particularly as to whether or not the module's
> __getattr__ would somehow affect the lookup chain for classes defined
> within the module.
>
> -jJ
>

Making modules "simply" be a class that could be subclasses rather than
their own thing _would_ be nice for one particular project I've worked on
where the project including APIs and basic implementations were open source
but which allowed for site specific code to override many/most of those
base implementations as a way of customizing it for your own specific (non
open source) environment.  Any APIs that were unfortunately defined as a
module with a bunch of functions in it was a real pain to make site
specific overrides for.  Anything APIs that were thankfully defined as a
class within a module even when there wasn't a real need for a class was
much easier to make site specific.

But this is not an easy thing to do.  I wouldn't want functions in a module
to need to be declared as class methods with a cls parameter nor would I
want an implicit named equivalent of cls; or does that already exist
through an existing __name__ style variable today that I've been ignoring?
 This could lead to basically treating a module's globals() dict as the
class dict, which at first glance seems surprising, but I'd have to ponder
this more.

(and yes, note that I am thinking of a module as a class, not an instance)
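FWIW, a trick along these lines already works today without any language
change: replace the entry in sys.modules with an instance of a
types.ModuleType subclass, which then gets __getattr__ (and could get
__setattr__).  The module name and override values below are made up:

```python
import sys
import types

# Sketch of the existing trick: a ModuleType subclass installed into
# sys.modules gains __getattr__ with no language change.  The module
# name and the override values here are hypothetical.
class SiteOverridableModule(types.ModuleType):
    _overrides = {'timeout': 30}      # hypothetical site-specific config

    def __getattr__(self, name):
        try:
            return self._overrides[name]
        except KeyError:
            raise AttributeError(name)

mod = SiteOverridableModule('fake_site_config')
sys.modules['fake_site_config'] = mod   # later imports get the subclass
```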

-gps