From steve at pearwood.info  Tue May  1 02:47:09 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 01 May 2012 10:47:09 +1000
Subject: [Python-ideas] Module __getattr__ [Was: breaking out of module
	execution]
In-Reply-To: 
References: 
Message-ID: <4F9F328D.1040003@pearwood.info>

Gregory P. Smith wrote:

> Making modules "simply" be a class that could be subclasses rather than
> their own thing _would_ be nice for one particular project I've worked on
> where the project including APIs and basic implementations were open source
> but which allowed for site specific code to override many/most of those
> base implementations as a way of customizing it for your own specific (non
> open source) environment.

This makes no sense to me. What does the *licence* of a project have to do
with the library API? I mean, yes, you could do such a thing, but surely you
shouldn't. That would be like saying that the accelerator pedal should be on
the right in cars you buy outright, but on the left for cars you get on
hire-purchase.

Nevertheless, I think your focus here is on the wrong thing. It seems to me
that you are jumping to an implementation, namely that modules should stop
being instances of a type and become classes, without having a clear idea of
your functional requirements.

The functional requirements *might* be:

"There ought to be an easy way to customize the behaviour of attribute
access in modules."

Or perhaps:

"There ought to be an easy way for one module to shadow another module with
the same name, but still inherit behaviour from the shadowed module."

neither of which *require* modules to become classes.

Or perhaps it is something else... it is unclear to me exactly what problems
you and Jim wish to solve, or whether they're the same kind of problem,
which is why I say the functional requirements are unclear.

Changing modules from an instance of ModuleType to "a class that could be a
subclass" is surely going to break code. Somewhere, someone is relying on
the fact that modules are not types and you're going to break their
application.

> Any APIs that were unfortunately defined as a
> module with a bunch of functions in it was a real pain to make site
> specific overrides for.

It shouldn't be. Just ensure the site-specific override module comes first
in the path, and "import module" will pick up the override module instead of
the standard one. This is a simple exercise in shadowing modules.

Of course, this implies that the override module has to override
*everything*. There's currently no simple way for the shadowing module to
inherit functionality from the shadowed module. You can probably hack
something together, but it would be a PITA.


-- 
Steven


From guido at python.org  Tue May  1 04:33:40 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 30 Apr 2012 19:33:40 -0700
Subject: [Python-ideas] Module __getattr__ [Was: breaking out of module
	execution]
In-Reply-To: <4F9F328D.1040003@pearwood.info>
References: <4F9F328D.1040003@pearwood.info>
Message-ID: 

On Mon, Apr 30, 2012 at 5:47 PM, Steven D'Aprano  wrote:
> Gregory P. Smith wrote:
>
>> Making modules "simply" be a class that could be subclasses rather than
>> their own thing _would_ be nice for one particular project I've worked on
>> where the project including APIs and basic implementations were open
>> source
>> but which allowed for site specific code to override many/most of those
>> base implementations as a way of customizing it for your own specific (non
>> open source) environment.
> This makes no sense to me. What does the *licence* of a project have to do
> with the library API? I mean, yes, you could do such a thing, but surely you
> shouldn't. That would be like saying that the accelerator pedal should be on
> the right in cars you buy outright, but on the left for cars you get on
> hire-purchase.

That's an irrelevant, surprising and unfair criticism of Greg's
message. He just tried to give a specific example without being too
specific.

> Nevertheless, I think your focus here is on the wrong thing. It seems to me
> that you are jumping to an implementation, namely that modules should stop
> being instances of a type and become classes, without having a clear idea of
> your functional requirements.
>
> The functional requirements *might* be:
>
> "There ought to be an easy way to customize the behaviour of attribute
> access in modules."
>
> Or perhaps:
>
> "There ought to be an easy way for one module to shadow another module with
> the same name, but still inherit behaviour from the shadowed module."
>
> neither of which *require* modules to become classes.
>
> Or perhaps it is something else... it is unclear to me exactly what problems
> you and Jim wish to solve, or whether they're the same kind of problem,
> which is why I say the functional requirements are unclear.
>
> Changing modules from an instance of ModuleType to "a class that could be a
> subclass" is surely going to break code. Somewhere, someone is relying on
> the fact that modules are not types and you're going to break their
> application.
>
>
>
>> Any APIs that were unfortunately defined as a
>> module with a bunch of functions in it was a real pain to make site
>> specific overrides for.
>
>
> It shouldn't be. Just ensure the site-specific override module comes first
> in the path, and "import module" will pick up the override module instead of
> the standard one. This is a simple exercise in shadowing modules.
>
> Of course, this implies that the override module has to override
> *everything*. There's currently no simple way for the shadowing module to
> inherit functionality from the shadowed module. You can probably hack
> something together, but it would be a PITA.

If there is a bunch of functions and you want to replace a few of
those, you can probably get the desired effect quite easily:

  from base_module import *  # Or the specific set of functions that comprise the API.

  def funct1(): ...
  def funct2(): ...

Not that I would recommend this -- it's easy to get confused if there
are more than a very small number of functions. Also if
base_module.funct3 were to call funct2, it wouldn't call the
overridden version.

But all attempts to view modules as classes or instances have led to
negative results. (I'm sure I've thought about it at various times in
the past.) I think the reason is that a module at best acts as a class
where every method is a *static* method, but implicitly so. And we all
know how limited static methods are. (They're basically an accident --
back in the Python 2.2 days when I was inventing new-style classes and
descriptors, I meant to implement class methods but at first I didn't
understand them and accidentally implemented static methods first.
Then it was too late to remove them and only provide class methods.)

There is actually a hack that is occasionally used and recommended: a
module can define a class with the desired functionality, and then at
the end, replace itself in sys.modules with an instance of that class
(or with the class, if you insist, but that's generally less useful).
E.g.:

  # module foo.py

  import sys

  class Foo:
      def funct1(self, *args): ...

      def funct2(self, *args): ...

  sys.modules[__name__] = Foo()

This works because the import machinery is actively enabling this
hack, and as its final step pulls the actual module out of
sys.modules, after loading it. (This is no accident. The hack was
proposed long ago and we decided we liked it enough to support it in the
import machinery.)
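
A minimal, self-contained sketch of the hack described above, assuming a
current Python 3. The module name foo_hack_demo, the method bodies and the
__getattr__ behaviour are invented for illustration; only the final
sys.modules assignment is taken from the message.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Source of a hypothetical module that replaces itself with a class instance.
SOURCE = textwrap.dedent('''
    import sys

    class Foo:
        def funct1(self):
            return "funct1 called on a Foo instance"

        def __getattr__(self, name):
            # Only reached when normal lookup fails. Dunder names are left
            # alone so the import machinery sees ordinary "missing" answers.
            if name.startswith("__"):
                raise AttributeError(name)
            return "dynamic:" + name

    sys.modules[__name__] = Foo()
''')

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "foo_hack_demo.py"), "w") as handle:
        handle.write(SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # The import system hands back whatever is in sys.modules
        # once the module body has finished executing.
        foo = importlib.import_module("foo_hack_demo")
    finally:
        sys.path.remove(tmp)

print(type(foo).__name__)
print(foo.funct1())
print(foo.anything)
```

The object bound to foo is the Foo instance rather than a module object,
which is what makes the attribute hooks available.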

You can easily override __getattr__ / __getattribute__ / __setattr__
this way. It also makes "subclassing" the module a little easier
(although accessing the class to be used as a base class is a little
tricky: you'd have to use foo.__class__). But of course the kind of
API that Greg was griping about would never be implemented this way,
so that's fairly useless. And if you were designing a module as an
inheritable class right from the start you're much better off just
using a class instead of the above hack.

But all in all I don't think there's a great future in store for the
idea of allowing modules to be "subclassed". In the vast, vast
majority of cases it's better to clearly have a separation between
modules, which provide no inheritance and no instantiation, and
classes, which provide both. I think Python is better off this way
than Java, where all you have is classes (its packages cannot contain
anything except class definitions).
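
The caveat about star-import overrides earlier in this message can be
sketched as follows. This is only an illustration: base_module and
site_module are built in memory as stand-ins, and the function bodies are
invented.

```python
import types

# Stand-in for base_module, built in memory so the sketch is self-contained.
base_module = types.ModuleType("base_module")
exec(
    "def funct2():\n"
    "    return 'base funct2'\n"
    "\n"
    "def funct3():\n"
    "    return funct2()  # resolved in base_module's own globals\n",
    base_module.__dict__,
)

# Stand-in for the site-specific override module.
site_module = types.ModuleType("site_module")

# Rough equivalent of "from base_module import *": copy the public names.
for name, value in vars(base_module).items():
    if not name.startswith("_"):
        setattr(site_module, name, value)


def funct2():
    return "site funct2"


site_module.funct2 = funct2  # the site-specific override

print(site_module.funct2())
print(site_module.funct3())
```

Calling site_module.funct3() still reaches the base funct2, because a
function looks up global names in the module where it was defined, not in
the module it was copied into.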

-- 
--Guido van Rossum (python.org/~guido)


From ericsnowcurrently at gmail.com  Tue May  1 04:39:47 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 30 Apr 2012 20:39:47 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: 
References: 
	
	
Message-ID: 

On Sat, Apr 28, 2012 at 12:22 AM, Chris Rebert  wrote:
> On Fri, Apr 27, 2012 at 11:06 PM, Eric Snow  wrote:
>>
>> * ``sys.implementation`` as a proper namespace rather than a dict.  It
>>  would be its own module or an instance of a concrete class.
>
> So, what's the justification for it being a dict rather than an object
> with attributes? The PEP merely (sensibly) concludes that it cannot be
> considered a sequence.

At this point I'm not aware of any strong justification either way.
However, sys.implementation is currently intended as a simple
collection of variables.  A dict reflects that.

One obvious concern is that if we start off with a dict we're binding
ourselves to that interface.  If we later want a concrete class with
dotted lookup, we'd be looking at backwards-incompatibility.  This is
the part of the PEP that still needs more serious thought.

> Relatedly, I find the PEP's use of the term "namespace" in reference
> to a dict to be somewhat confusing.

In my mind a mapping is a namespace.  I don't have a problem changing
that to mitigate any confusion.  Thanks for the feedback.

-eric


From ncoghlan at gmail.com  Tue May  1 04:48:02 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 1 May 2012 12:48:02 +1000
Subject: [Python-ideas] Module __getattr__ [Was: breaking out of module
	execution]
In-Reply-To: 
References: 
	
	<4F9F328D.1040003@pearwood.info>
	
Message-ID: 

On Tue, May 1, 2012 at 12:33 PM, Guido van Rossum  wrote:
> But all in all I don't think there's a great future in store for the
> idea of allowing modules to be "subclassed". In the vast, vast
> majority of cases it's better to clearly have a separation between
> modules, which provide no inheritance and no instantiation, and
> classes, which provide both. I think Python is better off this way
> than Java, where all you have is classes (its packages cannot contain
> anything except class definitions).

FWIW, in 3.3 the full import machinery will be exposed in
sys.meta_path (and sys.path_hooks), so third parties will be free to
experiment with whatever crazy things they want without having to work
around the implicit import behaviour :)
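
As a rough illustration of the kind of experiment this allows, here is a
sketch of a finder placed on sys.meta_path that only records which names
are looked up and then defers to the normal machinery. It assumes a current
Python 3, where finders implement find_spec (a protocol that postdates this
thread); RecordingFinder is a made-up name.

```python
import importlib
import sys


class RecordingFinder:
    """Hypothetical meta path finder that observes imports without handling them."""

    def __init__(self):
        self.seen = []

    def find_spec(self, fullname, path=None, target=None):
        self.seen.append(fullname)
        return None  # None means "not mine"; the next finder gets a turn


finder = RecordingFinder()
sys.meta_path.insert(0, finder)
try:
    # Drop any cached copy so the import actually consults sys.meta_path.
    sys.modules.pop("colorsys", None)
    importlib.import_module("colorsys")
finally:
    sys.meta_path.remove(finder)

print(finder.seen)
```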

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ericsnowcurrently at gmail.com  Tue May  1 04:50:24 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 30 Apr 2012 20:50:24 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: 
References: 
	
Message-ID: 

On Sat, Apr 28, 2012 at 7:39 PM, Victor Stinner
 wrote:
>> I've written up a PEP for the sys.implementation idea.  Feedback is welcome!
>
> Cool, it's better with a PEP, even if the change looks trivial.
>
>> name
>>  the name of the implementation (case sensitive).
>
> It would help if the PEP (and the documentation of sys.implementation)
> lists at least the most common names. I suppose that we would have
> something like: "CPython", "PyPy", "Jython", "IronPython".

Good point.  I'll do that.

>> version
>>  the version of the implementation, as opposed to the version of the
>>  language it implements.  This would use a standard format, similar to
>>  ``sys.version_info`` (see `Version Format`_).
>
> Dummy question: what is sys.version/sys.version_info? The version of
> the implementation or the version of the Python language? The PEP
> should explain that, and maybe also the documentation of
> sys.implementation.version (something like "use sys.version_info to
> get the version of the Python language").

Yeah, sys.version (et al.) is the version of the language.  It just
happens to be the same as the implementation version for CPython.
I'll make that more clear.
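
A small sketch of the distinction, assuming an interpreter that provides
the sys.implementation attribute this PEP proposes; the getattr fallback
covers interpreters that do not. The variable names are illustrative.

```python
import sys

# Version of the Python language the interpreter implements.
language_version = tuple(sys.version_info[:3])

# Version of the implementation itself, when the interpreter exposes it.
impl = getattr(sys, "implementation", None)
if impl is not None:
    implementation_name = impl.name
    implementation_version = tuple(impl.version[:3])
else:
    implementation_name = None
    implementation_version = None

print("language:", language_version)
print("implementation:", implementation_name, implementation_version)
```

On CPython the two tuples match; on another implementation they may not.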

>> cache_tag
>
> Why not add this information to the imp module?

This is certainly something I need to clarify.  Either the different
implementors set these values in the various modules to which they
pertain; or they set them all in one place (sys.implementation).  I
really think we should avoid having a mix.

In my mind sys.implementation makes more sense.  For example, in the
case of cache_tag (which is merely a potential future variable), its
value is an implementation detail used by importlib.  Having it in
sys.implementation would emphasize this point.

-eric


From ncoghlan at gmail.com  Tue May  1 04:57:49 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 1 May 2012 12:57:49 +1000
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: 
References: 
	
	
	
Message-ID: 

On Tue, May 1, 2012 at 12:39 PM, Eric Snow  wrote:
> On Sat, Apr 28, 2012 at 12:22 AM, Chris Rebert  wrote:
>> On Fri, Apr 27, 2012 at 11:06 PM, Eric Snow  wrote:
>>>
>>> * ``sys.implementation`` as a proper namespace rather than a dict.  It
>>>  would be its own module or an instance of a concrete class.
>>
>> So, what's the justification for it being a dict rather than an object
>> with attributes? The PEP merely (sensibly) concludes that it cannot be
>> considered a sequence.
>
> At this point I'm not aware of any strong justification either way.
> However, sys.implementation is currently intended as a simple
> collection of variables.  A dict reflects that.
>
> One obvious concern is that if we start off with a dict we're binding
> ourselves to that interface.  If we later want a concrete class with
> dotted lookup, we'd be looking at backwards-incompatibility.  This is
> the part of the PEP that still needs more serious thought.

I think it's a case where practicality beats purity. By using
structseq, we get a nice representation and dotted attribute access,
just as we have for sys.float_info. Providing this kind of convenience
is the same reason collections.namedtuple exists.

We should just document that the length of the tuple and the order of
items is not guaranteed (either across implementations or between
versions), and even the ability to iterate over the items or access
them by index is not mandatory in an implementation. Would it be
better if we had a separate "namespace" type in CPython that simply
*disallowed* iteration and indexing? Perhaps, but we've survived long
enough without it that I have my doubts about the practical need.
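
To make the trade-off concrete, here is a sketch that uses
collections.namedtuple as a stand-in for structseq. The field names and
values are examples only, not a proposed layout.

```python
from collections import namedtuple

# Illustrative record type; the fields are placeholders.
Implementation = namedtuple("Implementation", ["name", "version", "cache_tag"])

impl = Implementation(name="CPython", version=(3, 3, 0), cache_tag="cpython-33")

# Dotted access and a readable repr are the conveniences being argued for.
print(impl.name)
print(impl)

# Indexing, len() and iteration come along as well, which is why the
# documentation would have to say that order and length are not guaranteed.
print(impl[0], len(impl), list(impl))
```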

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ncoghlan at gmail.com  Tue May  1 05:08:44 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 1 May 2012 13:08:44 +1000
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: 
References: 
	
	
Message-ID: 

On Tue, May 1, 2012 at 12:50 PM, Eric Snow  wrote:
> In my mind sys.implementation makes more sense.  For example, in the
> case of cache_tag (which is merely a potential future variable), its
> value is an implementation detail used by importlib.  Having it in
> sys.implementation would emphasize this point.

Personally, I think cache_tag should be part of the initial proposal.
Implementations may want to use different cache tags depending on
additional information that importlib shouldn't need to care about,
and I think it would also be reasonable to allow "cache_tag=None" to
disable the implicit caching altogether.

The ultimate goal would be for us to be able to eliminate
implementation checks from other parts of the standard library.
importlib is a good place to start, since the idea is that, aside from
the mechanism used to bootstrap it into place, along with optional
acceleration of __import__, importlib itself should be implementation
independent.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ericsnowcurrently at gmail.com  Tue May  1 05:22:30 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 30 Apr 2012 21:22:30 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <20120430170454.08d73f74@resist.wooz.org>
References: 
	<20120430170454.08d73f74@resist.wooz.org>
Message-ID: 

On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw  wrote:
> On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
>>``sys.implementation`` is a dictionary, as opposed to any form of "named"
>>tuple (a la ``sys.version_info``).  This is partly because it doesn't
>>have meaning as a sequence, and partly because it's a potentially more
>>variable data structure.
>
> I agree that sequence semantics are meaningless here.  Presumably, a
> dictionary is proposed because this
>
>    cache_tag = sys.implementation.get('cache_tag')
>
> is nicer than
>
>    cache_tag = getattr(sys.implementation, 'cache_tag', None)

That's a good point.  Also, a dict better reflects a collection of
variables than a dotted-access object, which to me implies the
potential for methods as well.
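
The two lookups being compared can be tried side by side. In this sketch a
plain dict and a types.SimpleNamespace stand in for the two candidate
designs; neither is the actual sys.implementation object.

```python
from types import SimpleNamespace

# Two stand-ins holding the same data, with no cache_tag entry.
as_dict = {"name": "CPython"}
as_object = SimpleNamespace(name="CPython")

# Optional value: both spellings fall back to None when it is absent.
tag_from_dict = as_dict.get("cache_tag")
tag_from_object = getattr(as_object, "cache_tag", None)

# Value known to exist: subscript versus dotted access.
name_from_dict = as_dict["name"]
name_from_object = as_object.name

print(tag_from_dict, tag_from_object, name_from_dict, name_from_object)
```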

> OTOH, maybe we need a nameddict type!

You won't have to convince _me_. :)

>>repository
>>   the implementation's repository URL.
>
> What does this mean?  Oh, I think you mean the URL for the VCS used to develop
> this version of the implementation.  Maybe vcs_url (and even then there could
> be alternative blessed mirrors in other vcs's).  The Debian analogs are the
> Vcs-* headers (e.g. Vcs-Git, Vcs-Bzr, etc.).

Yeah, you got it.  For CPython it would be
"http://hg.python.org/cpython".  You're right that vcs_url is more
clear.  I'll update it.

Perhaps I should clarify "Other Possible Values" in the PEP?  I'd
intended it as a list of meaningful names, most of which others had
suggested, that could be considered at some later point.  That's part
of why I didn't develop the descriptions there too much.  Rather, I
wanted to focus on the two primary names for now.

Should those potential names be considered more seriously right now?
I was hoping to keep it light to start out, just the things we'd use
immediately.

>>repository_revision
>>   the revision identifier for the implementation.
>
> I'm not sure what this is.  Is it like the hexgoo you see in the banner of a
> from-source build that identifies the revision used to build this interpreter?
> Is this key a replacement for that?

I was thinking along those lines.  For CPython, it could be 76678 or
ab63e874265e or both.  The decision on any constraints for this one
would be subject to further discussion.

>
>>build_toolchain
>>   identifies the tools used to build the interpreter.
>
> As a tuple of free-form strings?

That would work.  I expect it would depend on how it would be used.

>>url (or website)
>>   the URL of the implementation's site.
>
> Maybe 'homepage' (another Debian analog).

Sounds good to me.

>>site_prefix
>>   the preferred site prefix for this implementation.
>>
>>runtime
>>   the run-time environment in which the interpreter is running.
>
> I'm not sure what this means either. ;)

Yeah, it's not so clear there.  For Jython it would be something like
"jvm X.X", for IronPython it would be ".net CLR X.X" or whatever.
Again the actual definition would be subject to more discussion
relative to the use case, be it information or otherwise.

>>gc_type
>>   the type of garbage collection used.
>
> Another free-form string?  What would the values be, say, for CPython and
> Jython?

I was imagining a free-form string, like "reference counting" or "mark
and sweep".  It just depends on what people need it for.

>>Version Format
>>--------------
>>
>>XXX same as sys.version_info?
>
> Why not? :)  It might be useful also to have something similar to
> sys.hexversion, which I often find convenient.

That's the way I'm leaning.  I've covered it a little more in the
newer version of the PEP (on python-ideas).
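
For reference, sys.hexversion packs the five fields of sys.version_info
into a single integer. The helper below (to_hexversion is a made-up name)
sketches that encoding; an implementation-level analogue would presumably
follow the same layout, though that is an assumption.

```python
import sys

# Release level codes used in the hexversion encoding.
RELEASE_LEVELS = {"alpha": 0xA, "beta": 0xB, "candidate": 0xC, "final": 0xF}


def to_hexversion(info):
    """Pack (major, minor, micro, releaselevel, serial) into one integer."""
    major, minor, micro, level, serial = info
    return (
        (major << 24)
        | (minor << 16)
        | (micro << 8)
        | (RELEASE_LEVELS[level] << 4)
        | serial
    )


print(hex(to_hexversion((3, 3, 0, "final", 0))))
print(to_hexversion(sys.version_info) == sys.hexversion)
```

A single integer like this makes version comparisons a plain numeric test.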

>>* What are the long-term objectives for sys.implementation?
>>
>>  - pull in implementation detail from the main sys namespace and
>>    elsewhere (PEP 3137 lite).
>
> That's where this seems to be leaning.  Even if it's a good idea, I bet it
> will be a long time before the old sys names can be removed.

Yeah, it's definitely not the focus of the PEP, but I think it's a
valid long-term goal of which we should be cognizant.

>>* Alternatives to the approach dictated by this PEP?
>>
>>* ``sys.implementation`` as a proper namespace rather than a dict.  It
>>  would be its own module or an instance of a concrete class.
>
> Which might make sense, as would perhaps a top-level `implementation` module.
> IOW, why situate it in sys?
>
>>The implementatation of this PEP is covered in `issue 14673`_.
>
> s/implementatation/implementation

Got it.

> Nicely done!  Let's see how those placeholders shake out.

Thanks.  I'm glad to get this rolling.  And yeah, I need to poke the
folks with the other implementations to get their feedback (rather
than rely on nods from 3 years ago). :)

-eric


From ericsnowcurrently at gmail.com  Tue May  1 05:43:47 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 30 Apr 2012 21:43:47 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: 
References: 
	
	
	
	
Message-ID: 

On Mon, Apr 30, 2012 at 8:57 PM, Nick Coghlan  wrote:
> On Tue, May 1, 2012 at 12:39 PM, Eric Snow  wrote:
>> At this point I'm not aware of any strong justification either way.
>> However, sys.implementation is currently intended as a simple
>> collection of variables.  A dict reflects that.
>>
>> One obvious concern is that if we start off with a dict we're binding
>> ourselves to that interface.  If we later want a concrete class with
>> dotted lookup, we'd be looking at backwards-incompatibility.  This is
>> the part of the PEP that still needs more serious thought.
>
> I think it's a case where practicality beats purity. By using
> structseq, we get a nice representation and dotted attribute access,
> just as we have for sys.float_info. Providing this kind of convenience
> is the same reason collections.namedtuple exists.

That was my original sentiment, partly for the "this is how it's
already been done" aspect.  Barry made a good point about
sys.implementation.get(name) vs. getattr(sys.implementation, name,
None).  However, having dotted access still seems more correct.
(continued below...)

> We should just document that the length of the tuple and the order of
> items is not guaranteed (either across implementations or between
> versions), and even the ability to iterate over the items or access
> them by index is not mandatory in an implementation. Would it be
> better if we had a separate "namespace" type in CPython that simply
> *disallowed* iteration and indexing? Perhaps, but we've survived long
> enough without it that I have my doubts about the practical need.

That's a good point.  Perhaps it depends on how general we expect the
consumption of sys.implementation to be.  If its practicality is
oriented toward internal use then the data structure is not as
critical.  However, sys.implementation is intended to have a
non-localized impact across the standard library and the interpreter.
I'd rather not let hacking on it become an attractive nuisance,
regardless of our intentions for usage.

This is where I usually defer to those who have been dealing for
years with the aftermath of these types of decisions.


-eric


From ericsnowcurrently at gmail.com  Tue May  1 05:47:51 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 30 Apr 2012 21:47:51 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: 
References: 
	
	
	
Message-ID: 

On Mon, Apr 30, 2012 at 9:08 PM, Nick Coghlan  wrote:
> On Tue, May 1, 2012 at 12:50 PM, Eric Snow  wrote:
>> In my mind sys.implementation makes more sense.  For example, in the
>> case of cache_tag (which is merely a potential future variable), its
>> value is an implementation detail used by importlib.  Having it in
>> sys.implementation would emphasize this point.
>
> Personally, I think cache_tag should be part of the initial proposal.
> Implementations may want to use different cache tags depending on
> additional information that importlib shouldn't need to care about,
> and I think it would also be reasonable to allow "cache_tag=None" to
> disable the implicit caching altogether.

Agreed.  This is how I was thinking of it.  I just wanted to keep
things as minimal as possible to start.  In importlib we can fall back
to name+version if cache_tag isn't there.  Still, of the potential
variables, cache_tag is the strongest candidate, having a solid (if
optional) use-case right now.
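
A sketch of that fallback, under stated assumptions: resolve_cache_tag is a
hypothetical helper, not importlib code, and the "name-majorminor" format
of the derived tag is a guess made for illustration.

```python
from types import SimpleNamespace

_MISSING = object()


def resolve_cache_tag(impl):
    """Return impl.cache_tag if present, else derive one from name and version.

    An explicit cache_tag of None is passed through unchanged, so a caller
    can treat it as "caching disabled".
    """
    tag = getattr(impl, "cache_tag", _MISSING)
    if tag is not _MISSING:
        return tag
    major, minor = impl.version[:2]
    return "{}-{}{}".format(impl.name.lower(), major, minor)


with_tag = SimpleNamespace(name="CPython", version=(3, 3, 0), cache_tag="cpython-33")
without_tag = SimpleNamespace(name="ExamplePy", version=(1, 2, 0))
disabled = SimpleNamespace(name="ExamplePy", version=(1, 2, 0), cache_tag=None)

print(resolve_cache_tag(with_tag))
print(resolve_cache_tag(without_tag))
print(resolve_cache_tag(disabled))
```

The sentinel keeps "attribute absent" distinct from "attribute set to
None", which matters if None is meant to switch caching off.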

> The ultimate goal would be for us to be able to eliminate
> implementation checks from other parts of the standard library.
> importlib is a good place to start, since the idea is that, aside from
> the mechanism used to bootstrap it into place, along with optional
> acceleration of __import__, importlib itself should be implementation
> independent.

Spot on!

-eric


From ericsnowcurrently at gmail.com  Tue May  1 08:10:18 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 1 May 2012 00:10:18 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: 
References: 
	
	
	
Message-ID: 

On Mon, Apr 30, 2012 at 9:08 PM, Nick Coghlan  wrote:
> On Tue, May 1, 2012 at 12:50 PM, Eric Snow  wrote:
>> In my mind sys.implementation makes more sense.  For example, in the
>> case of cache_tag (which is merely a potential future variable), its
>> value is an implementation detail used by importlib.  Having it in
>> sys.implementation would emphasize this point.
>
> Personally, I think cache_tag should be part of the initial proposal.
> Implementations may want to use different cache tags depending on
> additional information that importlib shouldn't need to care about,
> and I think it would also be reasonable to allow "cache_tag=None" to
> disable the implicit caching altogether.

I'm going to leave it as-is for the moment, but I'm leaning toward doing this.

-eric


From ericsnowcurrently at gmail.com  Tue May  1 21:05:52 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 1 May 2012 13:05:52 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: 
References: 
	
Message-ID: 

Updated:

http://www.python.org/dev/peps/pep-0421/

-eric


From barry at python.org  Wed May  2 00:25:29 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 1 May 2012 18:25:29 -0400
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
References: 
	<20120430170454.08d73f74@resist.wooz.org>
	
Message-ID: <20120501182529.5ea3d94d@resist.wooz.org>

On Apr 30, 2012, at 09:22 PM, Eric Snow wrote:

>Perhaps I should clarify "Other Possible Values" in the PEP?  I'd
>intended it as a list of meaningful names, most of which others had
>suggested, that could be considered at some later point.  That's part
>of why I didn't develop the descriptions there too much.  Rather, I
>wanted to focus on the two primary names for now.
>
>Should those potential names be considered more seriously right now?
>I was hoping to keep it light to start out, just the things we'd use
>immediately.

I think you could keep it light (but +1 for adding cache_tag now).

I'd suggest making it clear that neither the keys, values, nor semantics are
actually being proposed in this PEP.  The PEP could just include some examples
for future additions (and thus de-emphasize that section of the PEP).

It might be helpful to describe a mechanism by which future values would be
added to sys.implementation.  E.g. is a new PEP required for each?  (I don't
have an opinion on that right now. :)

-Barry

From barry at python.org  Wed May  2 00:28:26 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 1 May 2012 18:28:26 -0400
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
References: 
	<20120430170454.08d73f74@resist.wooz.org>
	
Message-ID: <20120501182826.0864f84b@resist.wooz.org>

On Apr 30, 2012, at 09:22 PM, Eric Snow wrote:

>> I agree that sequence semantics are meaningless here.  Presumably, a
>> dictionary is proposed because this
>>
>>    cache_tag = sys.implementation.get('cache_tag')
>>
>> is nicer than
>>
>>    cache_tag = getattr(sys.implementation, 'cache_tag', None)
>
>That's a good point.  Also, a dict better reflects a collection of
>variables than a dotted-access object, which to me implies the
>potential for methods as well.
>
>> OTOH, maybe we need a nameddict type!
>
>You won't have to convince _me_. :)

Well, I was being a bit facetious.  You can easily implement those semantics
in pure Python.  5 minute hack below.

Cheers,
-Barry

-----snip snip-----
#! /usr/bin/python3

import operator
import unittest

# Sentinel that distinguishes "no such attribute" from a stored None.
_missing = object()

class Implementation:
    cache_tag = 'cpython33'
    name = 'CPython'

    def __getitem__(self, name, default=_missing):
        result = getattr(self, name, default)
        if result is _missing:
            raise AttributeError("'{}' object has no attribute '{}'".format(
                self.__class__.__name__, name))
        return result

    def __setitem__(self, name, value):
        raise TypeError('read only')

    def __setattr__(self, name, value):
        raise TypeError('read only')


implementation = Implementation()


class TestImplementation(unittest.TestCase):
    def test_cache_tag(self):
        self.assertEqual(implementation.cache_tag, 'cpython33')
        self.assertEqual(implementation['cache_tag'], 'cpython33')

    def test_name(self):
        self.assertEqual(implementation.name, 'CPython')
        self.assertEqual(implementation['name'], 'CPython')

    def test_huh(self):
        self.assertRaises(AttributeError, operator.getitem,
                          implementation, 'droids')
        self.assertRaises(AttributeError, getattr,
                          implementation, 'droids')

    def test_read_only(self):
        self.assertRaises(TypeError, operator.setitem,
                          implementation, 'droids', 'looking')
        self.assertRaises(TypeError, setattr,
                          implementation, 'droids', 'looking')
        self.assertRaises(TypeError, operator.setitem,
                          implementation, 'cache_tag', 'xpython99')
        self.assertRaises(TypeError, setattr,
                          implementation, 'cache_tag', 'xpython99')


if __name__ == '__main__':
    unittest.main()

From steve at pearwood.info  Wed May  2 03:09:17 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 02 May 2012 11:09:17 +1000
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: 
References: 	<20120430170454.08d73f74@resist.wooz.org>
	
Message-ID: <4FA0893D.4090903@pearwood.info>

Eric Snow wrote:
> On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw  wrote:
>> On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
>>> ``sys.implementation`` is a dictionary, as opposed to any form of "named"
>>> tuple (a la ``sys.version_info``).  This is partly because it doesn't
>>> have meaning as a sequence, and partly because it's a potentially more
>>> variable data structure.
>> I agree that sequence semantics are meaningless here.  Presumably, a
>> dictionary is proposed because this
>>
>>    cache_tag = sys.implementation.get('cache_tag')
>>
>> is nicer than
>>
>>    cache_tag = getattr(sys.implementation, 'cache_tag', None)
> 
> That's a good point.  Also, a dict better reflects a collection of
> variables than a dotted-access object, which to me implies the
> potential for methods as well.

Dicts have methods, and support iteration.

A dict suggests to me that an arbitrary number of items could be included, 
rather than suggesting a record-like structure with a fixed number of items. 
(Even if that number varies from release to release.)

On the other hand, a dict supports iteration, and len, so even if you don't 
know how many fields there are, you can always find them by iterating over the 
record.

Syntax-wise, dotted name access seems right to me for this, similar to 
sys.float_info. If you know a field exists, sys.implementation.field is much 
nicer than sys.implementation['field'].

I hate to admit it, but I'm starting to think that the right solution here is 
something like a dict with dotted name access.

http://code.activestate.com/recipes/473786
http://code.activestate.com/recipes/576586

sort of thing.
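
For illustration, here is a minimal sketch of what "a dict with dotted
name access" could look like. The class name AttrDict and the sample
values are assumptions made up for this example; this is not the code
from the recipes linked above, nor anything the PEP specifies.

```python
class AttrDict(dict):
    """A dict whose keys can also be read as attributes (illustrative only)."""

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails,
        # so dict methods such as .get() and .keys() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


# Hypothetical values standing in for sys.implementation.
impl = AttrDict(name='CPython', cache_tag='cpython-33')
print(impl.name, impl['name'], impl.get('cache_tag'), sorted(impl))
```

One consequence of this design is that a key which collides with a dict
method name (say 'keys' or 'items') can only be reached with subscript
syntax, which is one of the usual objections to this kind of hybrid.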



-- 
Steven


From steve at pearwood.info  Wed May  2 03:24:19 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 02 May 2012 11:24:19 +1000
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: 
References: 			
	
Message-ID: <4FA08CC3.5070602@pearwood.info>

Nick Coghlan wrote:

> Would it be
> better if we had a separate "namespace" type in CPython that simply
> *disallowed* iteration and indexing? Perhaps, but we've survived long
> enough without it that I have my doubts about the practical need.


I have often wanted a namespace type, with class-like syntax and module-like 
semantics. In pseudocode:


namespace Spam:
     x = 1

     def ham(a):
         return x+a

     def cheese(a):
         return ham(a)*10


Spam.cheese(5)
=> returns 60


But I suspect that's not what you're talking about here in context.
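
Something close to the pseudocode above can be approximated today,
assuming one is willing to build a module object by hand. This is only a
rough sketch: the source string and the use of types.ModuleType plus
exec are choices made for this example, not a proposed implementation of
a namespace statement.

```python
import types

# Body of the hypothetical "namespace Spam" block, as ordinary source.
SOURCE = '''
x = 1

def ham(a):
    return x + a

def cheese(a):
    return ham(a) * 10
'''

# Executing the source in the module's __dict__ gives the functions
# module-like semantics: x and ham are looked up as globals of Spam.
Spam = types.ModuleType('Spam')
exec(SOURCE, Spam.__dict__)

print(Spam.cheese(5))
```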


-- 
Steven


From ncoghlan at gmail.com  Wed May  2 04:37:08 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 2 May 2012 12:37:08 +1000
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <4FA0893D.4090903@pearwood.info>
References: 
	<20120430170454.08d73f74@resist.wooz.org>
	
	<4FA0893D.4090903@pearwood.info>
Message-ID: 

On Wed, May 2, 2012 at 11:09 AM, Steven D'Aprano  wrote:
> Syntax-wise, dotted name access seems right to me for this, similar to
> sys.float_info. If you know a field exists, sys.implementation.field is much
> nicer than sys.implementation['field'].
>
> I hate to admit it, but I'm starting to think that the right solution here
> is something like a dict with dotted name access.

Whereas I'm thinking it makes sense to explicitly separate out
"standard, must be defined by all conforming Python implementations"
and "implementation specific extras"

Under that model, we'd add an extra "metadata" field at the standard
level to hold implementation specific fields. The initial set of
standard fields would then be:

name: the name of the implementation (e.g. "CPython", "IronPython",
"PyPy", "Jython")
version: the version of the implementation (in sys.version_info format)
cache_tag: the identifier used by importlib when caching bytecode
files in __pycache__ (set to None to disable bytecode caching)
metadata: a dict containing arbitrary additional information about a
particular implementation

sys.implementation.metadata would then give a home for information
that needs to be builtin, without having to pollute the main sys
namespace.
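
As a rough sketch of the shape being described here (not the real
sys.implementation), the layout could look like the following. All
values are invented for illustration, and types.SimpleNamespace is used
only as a convenient dotted-access container; it needs a Python version
that provides it (3.3 or later).

```python
import sys
import types

# Hypothetical implementation record: three standard fields plus a
# "metadata" dict for implementation-specific extras.
impl = types.SimpleNamespace(
    name='ExamplePython',          # made-up implementation name
    version=sys.version_info,      # stand-in for "sys.version_info format"
    cache_tag='examplepython-01',  # made-up cache tag
    metadata={'jit': True},        # made-up implementation-specific extra
)

# Standard fields use dotted access; extras are looked up defensively.
print(impl.name, impl.cache_tag, impl.metadata.get('vcs_url'))
```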

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ericsnowcurrently at gmail.com  Thu May  3 03:23:14 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 2 May 2012 19:23:14 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <20120501182529.5ea3d94d@resist.wooz.org>
References: 
	<20120430170454.08d73f74@resist.wooz.org>
	
	<20120501182529.5ea3d94d@resist.wooz.org>
Message-ID: 

On Tue, May 1, 2012 at 4:25 PM, Barry Warsaw  wrote:
> On Apr 30, 2012, at 09:22 PM, Eric Snow wrote:
>
>>Perhaps I should clarify "Other Possible Values" in the PEP?  I'd
>>intended it as a list of meaningful names, most of which others had
>>suggested, that could be considered at some later point.  That's part
>>of why I didn't develop the descriptions there too much.  Rather, I
>>wanted to focus on the two primary names for now.
>>
>>Should those potential names be considered more seriously right now?
>>I was hoping to keep it light to start out, just the things we'd use
>>immediately.
>
> I think you could keep it light (but +1 for adding cache_tag now).

cache_tag it is.

> I'd suggest making it clear that neither the keys, values, nor semantics are
> actually being proposed in this PEP.  The PEP could just include some examples
> for future additions (and thus de-emphasize that section of the PEP).
>
> It might be helpful to describe a mechanism by which future values would be
> added to sys.implementation.  E.g. is a new PEP required for each?  (I don't
> have an opinion on that right now. :)

This is a good direction.  I'll update the PEP.  Thanks!

-eric


From mwm at mired.org  Thu May  3 03:28:21 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 2 May 2012 21:28:21 -0400
Subject: [Python-ideas] argparse FileType v.s default arguments...
In-Reply-To: 
References: <20120430133338.33b2f75d@bhuda.mired.org>
	
Message-ID: 

On Mon, Apr 30, 2012 at 3:59 PM, Gregory P. Smith  wrote:
> On Mon, Apr 30, 2012 at 10:33 AM, Mike Meyer  wrote:
>> While I really like the argparse module, I've run into a case I think
>> it ought to handle that it doesn't.
>>
>> So I'm asking here to see if 1) I've overlooked something, and it can
>> do this, or 2) there's a good reason for it not to do this or maybe 3)
>> this is a bad idea.
>>
>> The usage I ran into looks like this:
>>
>> parser.add_argument('configfile', default='/my/default/config',
>>                     type=FileType('r'), nargs='?')
>>
>> If I provide the argument, everything works fine, and it opens the
>> named file for me. If I don't, parser.configfile is set to the string,
>> which doesn't work very well when I try to use its read method.
>> Unfortunately, setting default to open('/my/default/config') has the
>> side effect of opening the file. Or raising an exception if the file
>> doesn't exist (which is a common reason for wanting to provide an
>> alternative!)
>>
>> Could default handling be made smarter, and if 1) type is set
>> and 2) the value of default is a string, pass the value of
>> default to type? Or maybe a flag to make that happen, or even a
>> default_factory argument (incompatible with default) that would accept
>> something like default_factory=lambda: open('/my/default/config')?
> This makes sense to me as described.  I suggest going ahead and filing an
> issue on bugs.python.org with the above.

I finally got around to this. There are already two issues that
address this problem, though in different ways: 12776 and 11389.

12776 includes a patch that worked against a build of a checkout
today. I've added a patch for test_argparse that adds tests to verify
that a default filename that doesn't exist with type=FileType
complains if you don't specify the argument, and opens the correct
file if you do.
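
For reference, one way to sidestep the problem without relying on how a
particular argparse version treats string defaults is to keep the
argument a plain string and open the file after parsing. This is only a
sketch of that workaround, not the patch from either issue; the helper
names parse_config_path and open_config are made up for the example.

```python
import argparse

DEFAULT_CONFIG = '/my/default/config'


def parse_config_path(argv):
    """Return the config path from argv, or the default if none is given."""
    parser = argparse.ArgumentParser()
    # No type= here, so nothing is opened while arguments are parsed.
    parser.add_argument('configfile', nargs='?', default=DEFAULT_CONFIG)
    return parser.parse_args(argv).configfile


def open_config(argv):
    """Open whichever path was selected; a missing file raises here."""
    return open(parse_config_path(argv), 'r')


print(parse_config_path([]))
print(parse_config_path(['site.cfg']))
```

The cost is that a missing file is reported as an ordinary exception at
the point of use rather than as an argparse usage error.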

     
References: 
	<20120430170454.08d73f74@resist.wooz.org>
	
	<4FA0893D.4090903@pearwood.info>
	
Message-ID: 

On Tue, May 1, 2012 at 8:37 PM, Nick Coghlan  wrote:
> On Wed, May 2, 2012 at 11:09 AM, Steven D'Aprano  wrote:
>> Syntax-wise, dotted name access seems right to me for this, similar to
>> sys.float_info. If you know a field exists, sys.implementation.field is much
>> nicer than sys.implementation['field'].
>>
>> I hate to admit it, but I'm starting to think that the right solution here
>> is something like a dict with dotted name access.
>
> Whereas I'm thinking it makes sense to explicitly separate out
> "standard, must be defined by all conforming Python implementations"
> and "implementation specific extras"
>
> Under that model, we'd add an extra "metadata" field at the standard
> level to hold implementation specific fields. The initial set of
> standard fields would then be:
>
> name: the name of the implementation (e.g. "CPython", "IronPython",
> "PyPy", "Jython")
> version: the version of the implementation (in sys.version_info format)
> cache_tag: the identifier used by importlib when caching bytecode
> files in __pycache__ (set to None to disable bytecode caching)
> metadata: a dict containing arbitrary additional information about a
> particular implementation
>
> sys.implementation.metadata would then give a home for information
> that needs to be builtin, without having to pollute the main sys
> namespace.

I really like this approach, particularly the separation aspect.
Presumably sys.implementation would be more struct-like (static-ish,
dotted-access namespace).  I'll give it a day or two to stew and if it
still seems like a good idea I'll weave it into the PEP.

One question though: having it be iterable (a la structseq or
namedtuple) doesn't seem to be a good fit, but does it matter?
Likewise with mutability.  Thoughts?

-eric


From ericsnowcurrently at gmail.com  Thu May  3 04:17:40 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 2 May 2012 20:17:40 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <20120430170454.08d73f74@resist.wooz.org>
References: 
	<20120430170454.08d73f74@resist.wooz.org>
Message-ID: 

On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw  wrote:
> On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
>>Version Format
>>--------------
>>
>>XXX same as sys.version_info?
>
> Why not? :)  It might be useful also to have something similar to
> sys.hexversion, which I often find convenient.

Would it be worth mirroring all 3 (sys.version, sys.version_info,
sys.hexversion)?  Symmetry is nice, but it also makes sense if
each would be as meaningful as they are in sys.

-eric


From steve at pearwood.info  Thu May  3 05:49:59 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 3 May 2012 13:49:59 +1000
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: 
References: 
	<20120430170454.08d73f74@resist.wooz.org>
	
Message-ID: <20120503034959.GA19401@ando>

On Wed, May 02, 2012 at 08:17:40PM -0600, Eric Snow wrote:
> On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw  wrote:
> > On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
> >>Version Format
> >>--------------
> >>
> >>XXX same as sys.version_info?
> >
> > Why not? :)  It might be useful also to have something similar to
> > sys.hexversion, which I often find convenient.
> 
> Would it be worth mirroring all 3 (sys.version, sys.version_info,
> sys.hexversion)?  Symmetry is nice, but it also makes sense if
> each would be as meaningful as they are in sys.

I am still unclear what justification there is for having a separate 
sys.version (from PEP 421: "the version of the Python language") and 
sys.implementation.version ("the version of the Python implementation"). 
Under what circumstances will one change but not the other?


-- 
Steven



From carl at oddbird.net  Thu May  3 06:30:28 2012
From: carl at oddbird.net (Carl Meyer)
Date: Wed, 02 May 2012 22:30:28 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <20120503034959.GA19401@ando>
References: 
	<20120430170454.08d73f74@resist.wooz.org>
	
	<20120503034959.GA19401@ando>
Message-ID: <4FA209E4.6040900@oddbird.net>

On 05/02/2012 09:49 PM, Steven D'Aprano wrote:
> On Wed, May 02, 2012 at 08:17:40PM -0600, Eric Snow wrote:
>> On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw  wrote:
>>> On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:
>>>> Version Format
>>>> --------------
>>>>
>>>> XXX same as sys.version_info?
>>>
>>> Why not? :)  It might be useful also to have something similar to
>>> sys.hexversion, which I often find convenient.
>>
>> Would it be worth mirroring all 3 (sys.version, sys.version_info,
>> sys.hexversion)?  Symmetry is nice, but it also makes sense if
>> each would be as meaningful as they are in sys.
>
> I am still unclear what justification there is for having a separate
> sys.version (from PEP 421: "the version of the Python language") and
> sys.implementation.version ("the version of the Python implementation").
> Under what circumstances will one change but not the other?

I know at least PyPy has separate "PyPy version" and "Python language 
compatibility version" numbers. They might choose to do a release that 
increments the PyPy version (because they've made improvements to the 
JIT or any number of other implementation-quality issues) but doesn't 
change the bundled stdlib version or language-compatibility version at 
all. Seems pretty reasonable to me.

Carl


From pyideas at rebertia.com  Thu May  3 06:31:47 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Wed, 2 May 2012 21:31:47 -0700
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <20120503034959.GA19401@ando>
References: 
	<20120430170454.08d73f74@resist.wooz.org>
	
	<20120503034959.GA19401@ando>
Message-ID: 

On Wed, May 2, 2012 at 8:49 PM, Steven D'Aprano  wrote:
> On Wed, May 02, 2012 at 08:17:40PM -0600, Eric Snow wrote:
>> On Mon, Apr 30, 2012 at 3:04 PM, Barry Warsaw  wrote:
>> > On Apr 27, 2012, at 12:36 AM, Eric Snow wrote:

> I am still unclear what justification there is for having a separate
> sys.version (from PEP 421: "the version of the Python language") and
> sys.implementation.version ("the version of the Python implementation").
> Under what circumstances will one change but not the other?

In the event of an implementation bugfix? The Python version
implemented would be unchanged, but the implementation version would
be incremented slightly.

Cheers,
Chris


From ncoghlan at gmail.com  Thu May  3 08:06:44 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 3 May 2012 16:06:44 +1000
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <20120503034959.GA19401@ando>
References: 
	<20120430170454.08d73f74@resist.wooz.org>
	
	<20120503034959.GA19401@ando>
Message-ID: 

On Thu, May 3, 2012 at 1:49 PM, Steven D'Aprano  wrote:
> I am still unclear what justification there is for having a separate
> sys.version (from PEP 421: "the version of the Python language") and
> sys.implementation.version ("the version of the Python implementation").
> Under what circumstances will one change but not the other?

The PyPy example is the real motivator. It allows "sys.version" to
declare what version of Python the implementation intends to
implement, while sys.implementation.version may be completely
different.

For example, a new implementation might declare sys.version_info as
(3, 3, etc...) to indicate they're aiming at 3.3 compatibility, while
setting sys.implementation.version to (0, 1, etc...) to reflect its
actual immaturity as an implementation.

Implementations are of course free to set the two numbers in lock
step, and CPython, IronPython and Jython will likely continue to do
exactly that.
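
A small sketch of how code could read the two numbers side by side,
assuming an interpreter that already provides sys.implementation (it did
not exist yet when this was written). The variable names are arbitrary.

```python
import sys

# Version of the language the implementation aims to be compatible with.
language_target = tuple(sys.version_info[:3])

# Version of the implementation itself; free to differ from the above.
impl = sys.implementation
impl_version = tuple(impl.version[:3])

# True where an implementation keeps both numbers in lock step.
in_lock_step = impl_version == language_target

print(impl.name, impl_version, 'targets Python', language_target, in_lock_step)
```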

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From mal at egenix.com  Thu May  3 10:20:39 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 03 May 2012 10:20:39 +0200
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: 
References: 
	<20120430170454.08d73f74@resist.wooz.org>
	
	<20120503034959.GA19401@ando>
	
Message-ID: <4FA23FD7.3070906@egenix.com>

Some corrections to the PEP text:

platform.python_implementation()
--------------------------------

The following text in the PEP needs to be updated:

"""
The platform module guesses the python implementation by looking for
clues in a couple different sys variables [3]. However, this approach
is fragile.
"""

Fact is, that sys.version parsing is documented to be done by the
platform module (see the docs on sys.version), so implementations
are free to provide patches in case they choose different ways of
formatting sys.version.

A sys.implementation record would make things easier for the platform
module, though, so it's an improvement.
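
To make the comparison concrete, here is a short sketch that reads the
implementation name both ways. It assumes an interpreter new enough to
have sys.implementation; the variable names are arbitrary.

```python
import platform
import sys

# Derived by the platform module from sys.version and related clues.
guessed = platform.python_implementation()   # e.g. 'CPython', 'PyPy'

# Declared directly by the implementation itself.
declared = sys.implementation.name           # e.g. 'cpython', 'pypy'

print(guessed, declared)
```

Note that the two spellings differ in case on CPython, so code comparing
them should normalise first.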

sys.version
-----------

sys.version is defined as "A string containing the version number
of the Python interpreter plus additional information on the build
number and compiler used. This string is displayed when the interactive
interpreter is started. Do not extract version information out of it,
rather, use version_info and the functions provided by the platform module."

It's not defined as "version of the Python language" as the PEP
appears to indicate.

Other things:

Making sys.implementation a dictionary
--------------------------------------

This is not a good idea, since it allows for monkey-patching
the values and will also result in new undocumented or per-implementation
keys.

Better use a namedtuple like we do for all other such informational
resources.

sys.implementation information
------------------------------

While I'm not sure whether details such as VCS URLs and revision ids
should really be part of a data structure that is supposed to
identify the implementation (sys.version is better for that),
if you do want to add such information, then please add all of it,
not just part of the available build information.

See platform._sys_version(), which returns (name, version, branch, revision,
buildno, builddate, compiler).

-- 
Marc-Andre Lemburg
eGenix.com



From ericsnowcurrently at gmail.com  Thu May  3 22:50:49 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 3 May 2012 14:50:49 -0600
Subject: [Python-ideas] PEP 4XX: Adding sys.implementation
In-Reply-To: <4FA23FD7.3070906@egenix.com>
References: 
	<20120430170454.08d73f74@resist.wooz.org>
	
	<20120503034959.GA19401@ando>
	
	<4FA23FD7.3070906@egenix.com>
Message-ID: 

On Thu, May 3, 2012 at 2:20 AM, M.-A. Lemburg  wrote:
> Some corrections to the PEP text:
>
> platform.python_implementation()
> --------------------------------
>
> The following text in the PEP needs to be updated:
>
> """
> The platform module guesses the python implementation by looking for
> clues in a couple different sys variables [3]. However, this approach
> is fragile.
> """
>
> Fact is, that sys.version parsing is documented to be done by the
> platform module (see the docs on sys.version), so implementations
> are free to provide patches in case they choose different ways of
> formatting sys.version.
>
> A sys.implementation record would make things easier for the platform
> module, though, so it's an improvement.

Yeah, I'll update that to be softer and more clear.

> sys.version
> -----------
>
> sys.version is defined as "A string containing the version number
> of the Python interpreter plus additional information on the build
> number and compiler used. This string is displayed when the interactive
> interpreter is started. Do not extract version information out of it,
> rather, use version_info and the functions provided by the platform module."
>
> It's not defined as "version of the Python language" as the PEP
> appears to indicate.

This is an excellent point.  sys.(version|version_info|hexversion)
reflect CPython specifics, rather than the language itself.  As far as
I know the language does not have a "micro" version, nor a release
level or serial.

So where does that leave us?  Undoubtedly no small number of people
already depend on the sys variables for CPython release info, so
we can't just change the semantics.  I'll clarify the PEP and add this
to the open issues list because the PEP definitely needs to be clear
here.  Any suggestions on this point would be great.

> Other things:
>
> Making sys.implementation a dictionary
> --------------------------------------
>
> This is not a good idea, since it allows for monkey-patching
> the values and will also result in new undocumented or per-implementation
> keys.
>
> Better use a namedtuple like we do for all other such informational
> resources.

Nick Coghlan made a good suggestion on this front that I'm likely going
to adopt: sys.implementation as an object (namespace with dotted
access) with required attributes.  One required attribute would be
'metadata', a dict where optional/per-implementation values could go.

Having it be immutable (make monkey-patching hard) didn't seem like it
mattered, though I'm not opposed.  I just don't see that as a
convincing reason for it to be a named tuple (structseq, etc.).

To be honest, I'd like to avoid making sys.implementation any kind of
sequence.  It has no meaning as a sequence (hence why the PEP shifted
from named tuple to dict).  Unlike other informational sources, we
expect that the namespace of required attributes will grow over time.
As such, people shouldn't rely on a fixed number of attributes, which
a named tuple would imply.  As well, I'm not convinced that the order
of the attributes is significant, nor that sequence unpacking is
useful here.

So in order to send the right message on both points, I'd rather not
make it a sequence.  It *could* be meaningful to implement the Mapping
ABC, but I'm not going to specify that in the PEP without good reason.
 (I will add that as an open issue though.)

Unless there is a good reason to use a named tuple, as opposed to a
regular object, let's not.  However, I'm still quite open to hearing
out arguments on this point.

-eric


From nbvfour at gmail.com  Thu May  3 22:57:33 2012
From: nbvfour at gmail.com (nbv4)
Date: Thu, 3 May 2012 13:57:33 -0700 (PDT)
Subject: [Python-ideas] one line class definitions
Message-ID: <22532650.695.1336078653821.JavaMail.geo-discussion-forums@yndu27>

Instead of

class CustomException(Exception):
    pass

how about just

class CustomException(Exception)

(no colon, no 'pass')
I'm sure this has been suggested before, but I couldn't find any...

From pyideas at rebertia.com  Fri May  4 00:19:32 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Thu, 3 May 2012 15:19:32 -0700
Subject: [Python-ideas] one line class definitions
In-Reply-To: <22532650.695.1336078653821.JavaMail.geo-discussion-forums@yndu27>
References: <22532650.695.1336078653821.JavaMail.geo-discussion-forums@yndu27>
Message-ID: 

On Thu, May 3, 2012 at 1:57 PM, nbv4  wrote:
> Instead of
>
> class CustomException(Exception):
>     pass
>
> how about just
>
> class CustomException(Exception)
>
> (no colon, no 'pass')
> I'm sure this has been suggested before, but I couldn't find any...

"Special cases aren't special enough to break the rules." -- PEP 20

Just use a docstring-only body; you should be documenting what the
exception means anyways:

class CustomException(Exception):
    """This exception means that a custom error happened."""
# other module-level code?

Cheers,
Chris


From ned at nedbatchelder.com  Fri May  4 00:44:42 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Thu, 03 May 2012 18:44:42 -0400
Subject: [Python-ideas] one line class definitions
In-Reply-To: <22532650.695.1336078653821.JavaMail.geo-discussion-forums@yndu27>
References: <22532650.695.1336078653821.JavaMail.geo-discussion-forums@yndu27>
Message-ID: <4FA30A5A.7050407@nedbatchelder.com>

On 5/3/2012 4:57 PM, nbv4 wrote:
> Instead of
>
> class CustomException(Exception):
>     pass
>
> how about just
>
> class CustomException(Exception)
>
How about just:

     class CustomException(Exception): pass

or better yet, using more lines?

--Ned.

> (no colon, no 'pass')
> I'm sure this has been suggested before, but I couldn't find any...
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From guido at python.org  Fri May  4 00:56:07 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 3 May 2012 15:56:07 -0700
Subject: [Python-ideas] one line class definitions
In-Reply-To: 
References: <22532650.695.1336078653821.JavaMail.geo-discussion-forums@yndu27>
	
Message-ID: 

(Repeat, somehow the message to which I replied had
python-ideas at googlegroups.com which doesn't exist.)

On Thu, May 3, 2012 at 2:53 PM, Guido van Rossum  wrote:
> That's just asking for more mysterious errors if you forget the colon.
> You can always write
>
> class Foo(Bar): pass
>
> On Thu, May 3, 2012 at 1:57 PM, nbv4  wrote:
>> Instead of
>>
>> class CustomException(Exception):
>>     pass
>>
>> how about just
>>
>> class CustomException(Exception)
>>
>> (no colon, no 'pass')
>> I'm sure this has been suggested before, but I couldn't find any...
>>
>
>
>
> --
> --Guido van Rossum (python.org/~guido)



-- 
--Guido van Rossum (python.org/~guido)


From Ronny.Pfannschmidt at gmx.de  Sun May  6 09:48:46 2012
From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt)
Date: Sun, 06 May 2012 09:48:46 +0200
Subject: [Python-ideas] package based import
Message-ID: <4FA62CDE.8050307@gmx.de>

Hi,

this one is still pretty rough (not just on the edges)

after some tinkering with tools like npm,

i got the idea of package based imports

the idea is to have a lookup based on package toplevel names first, 
instead of just walking sys.path

that way it becomes more natural for packages to be in their own dir instead 
of everything being merged in site-packages
and of course, far fewer files to walk to find a particular package, 
since the mapping of package name to import paths is already known

in order to add such packages, some kind of registration would be necessary


a basic example could be something like

packages.pth::
   import pkgutil
   # wherever __init__.py is
   pkgutil.register_package('flask', '~/Projects/flask/flask')
   pkgutil.register_module('hgdistver', '~/Projects/hgdistver.py')

although for convenience an ini file with sections and dotted name to path 
mappings might be better.

once that is in place that opens up the path for fun local envs
like simply looping over eggs/dirs in a packages subdir and adding them,
instead of needing buildout/virtualenv

-- Ronny




From p.f.moore at gmail.com  Sun May  6 10:02:09 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 6 May 2012 09:02:09 +0100
Subject: [Python-ideas] package based import
In-Reply-To: <4FA62CDE.8050307@gmx.de>
References: <4FA62CDE.8050307@gmx.de>
Message-ID: 

On 6 May 2012 08:48, Ronny Pfannschmidt  wrote:
>
> the idea is to have a lookup based on package toplevel names first, instead
> of just walking sys.path
>
> that way it becomes more natural for packages to be in their own dir instead of
> everything being merged in site-packages
> and of course, far fewer files to walk to find a particular package, since
> the mapping of package name to import paths is already known
>
> in order to add such packages, some kind of registration would be necessary

This should be relatively easy to do using importlib - as a custom
meta hook (in PEP 302 terms). It could probably be written as a 3rd
party module, at least as a proof of concept.
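
As a rough proof-of-concept sketch of such a meta hook: the class name
RegistryFinder and its register_* methods are hypothetical (they mirror
the pkgutil.register_* calls proposed above, which do not exist), and
the code uses the find_spec protocol from later importlib versions
rather than the PEP 302 find_module API.

```python
import importlib.abc
import importlib.util
import os
import sys


class RegistryFinder(importlib.abc.MetaPathFinder):
    """Resolve registered top-level names without walking sys.path."""

    def __init__(self):
        # Maps top-level name -> (file to load, package search path or None).
        self._registry = {}

    def register_package(self, name, directory):
        directory = os.path.expanduser(directory)
        init_file = os.path.join(directory, '__init__.py')
        self._registry[name] = (init_file, [directory])

    def register_module(self, name, path):
        self._registry[name] = (os.path.expanduser(path), None)

    def find_spec(self, fullname, path=None, target=None):
        entry = self._registry.get(fullname)
        if entry is None:
            # Not registered: let the remaining finders handle it.
            return None
        location, search_locations = entry
        return importlib.util.spec_from_file_location(
            fullname, location,
            submodule_search_locations=search_locations)


finder = RegistryFinder()
# Put the registry ahead of the normal sys.path-based lookup.
sys.meta_path.insert(0, finder)
```

Submodules of a registered package are not in the registry; they are
found by the regular path-based finder through the package's __path__,
which is set from the directory given at registration.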

Paul.


From tjreedy at udel.edu  Sun May  6 23:24:40 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 06 May 2012 17:24:40 -0400
Subject: [Python-ideas] Should range() == range(0)?
Message-ID: 

It is a general principle that if a built-in class C has a unique (up to 
equality) null object, then C() returns that null object.

 >>> for f in (bool, int, float, complex, tuple, list, dict, set, 
frozenset, str, bytes, bytearray):
	print(bool(f()))

# 12 lines of False

Some imported classes such as fractions.Fraction and collections.deque 
can be added to the list.

I add 'up to equality' because in the case of floats, 0.0 and -0.0 are 
distinct but equal, and float() returns the obvious 0.0.
 >>> 0.0 == -0.0
True
 >>> m.copysign(1, 0.0)
1.0
 >>> m.copysign(1, -0.0)
-1.0
 >>> m.copysign(1, float())
1.0

The notable exception to the rule is
 >>> range()
Traceback (most recent call last):
   File "", line 1, in 
     range()
TypeError: range expected 1 arguments, got 0
 >>> bool(range(0))
False

It is true that there are multiple distinct null range objects (because 
the defining start,stop,step args are kept as attributes) but they are 
all equal.
 >>> range(1,1) == range(0)
True

range(0) == range(0, 0, 1) would be the obvious choice for range().

Another advantage of doing this, beside consistency, is that it would 
emphasize that range() produces a re-iterable sequence, not just an 
iterator.

Possible objections and responses:

1. This would slightly complicate the already messy code and doc for 
range().

Pass, for now.

2. There is little need as there is already the alternative.

This is just as true or even more true for the other classes. While 
int() is slightly easier to type than int(0), 0 is even easier.

3. There is little or no use case.

The justification I have seen for all the other classes behaving as they 
do is expressions like type(x)(), which gets the null object 
corresponding to x. This requires a parameterless call rather than a 
literal (or display) or call with typed arg.

A proper objection of this sort would have to argue that range() is less 
useful than all 12+ cases that we have now.
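
The type(x)() idiom mentioned under point 3 can be sketched as below.
The helper name empty_like is made up for this example; the last call
shows the current behaviour that this proposal would change.

```python
def empty_like(x):
    """Return the null object of x's class by calling the class with no args."""
    return type(x)()


print(empty_like([1, 2, 3]), empty_like(3.5), repr(empty_like('spam')))

try:
    empty_like(range(3))
except TypeError as exc:
    # range() with no arguments is rejected today.
    print('range:', exc)
```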

4. memoryview() does not work.

Even though memoryview(bytes()) and memoryview(bytearray()) are both 
False and equal, other empty memoryviews would not all be equal. Besides 
which, a memoryview is dependent on another object, and there is no 
reason to create any particular object for it to be dependent on.

5. The dict view methods, such as dict.keys, do not work.

These also return views that are dependent on a primary object, and the 
views also are null if the primary object is. Here there is a unique 
null primary object, so it would at least be possible to create an empty 
dict whose only reference is held by a read-only view. On the other 
hand, '.keys' is a function, not a class.

6. filter() does not work.

While filter is a class, its instances, again, are dependent on another 
object, not just at creation but during its lifetime. Moreover, 
bool(empty-iterable) is not False. Ditto for map() and, for instance, 
open(), even though in the latter case the primary object is external.

-- 
Terry Jan Reedy



From g.brandl at gmx.net  Mon May  7 00:24:21 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 07 May 2012 00:24:21 +0200
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: 
References: 
Message-ID: 

On 05/06/2012 11:24 PM, Terry Reedy wrote:
> It is a general principle that if a built-in class C has a unique (up to 
> equality) null object, then C() returns that null object.
> 
>  >>> for f in (bool, int, float, complex, tuple, list, dict, set, 
> frozenset, str, bytes, bytearray):
> 	print(bool(f()))
> 
> # 12 lines of False
> 
> Some imported classes such as fractions.Fraction and collections.deque 
> can be added to the list.
> 
> I add 'up to equality' because in the case of floats, 0.0 and -0.0 are 
> distinct but equal, and float() returns the obvious 0.0.
>  >>> 0.0 == -0.0
> True
>  >>> m.copysign(1, 0.0)
> 1.0
>  >>> m.copysign(1, -0.0)
> -1.0
>  >>> m.copysign(1, float())
> 1.0
> 
> The notable exception to the rule is
>  >>> range()
> Traceback (most recent call last):
>    File "", line 1, in 
>      range()
> TypeError: range expected 1 arguments, got 0
>  >>> bool(range(0))
> False
> 
> It is true that there are multiple distinct null range objects (because 
> the defining start,stop,step args are kept as attributes) but they are 
> all equal.
>  >>> range(1,1) == range(0)
> True
> 
> range(0) == range(0, 0, 1) would be the obvious choice for range().
> 
> Another advantage of doing this, beside consistency, is that it would 
> emphasize that range() produces a re-iterable sequence, not just an 
> iterator.
> 
> Possible objections and responses:

[1. - 6.]

7. The "default value" is only really useful for types that are best
described as "data-like".  range is not a data-like type, it's a helper
for iteration, just as filter or dictviews aren't data-like.

Georg




From jeanpierreda at gmail.com  Mon May  7 01:52:00 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Sun, 6 May 2012 19:52:00 -0400
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: 
References: 
Message-ID: 

On Sun, May 6, 2012 at 5:24 PM, Terry Reedy  wrote:
> Another advantage of doing this, beside consistency, is that it would
> emphasize that range() produces a re-iterable sequence, not just an
> iterator.

How? The empty sequence is the exact case where reiterable objects and
iterators have identical iteration behavior. (Both immediately stop
every time you try.)

> Possible objections and responses:
>
> 1. This would slightly complicate the already messy code and doc for
> range().
>
> Pass, for now

By this, do you mean don't write new documentation? That just defers
the problem to later.

> 3. There is little or no use case.
>
> The justification I have seen for all the other classes behaving as they do
> is expressions like type(x)(), which gets the null object corresponding to
> x. This requires a parameterless call rather than a literal (or display) or
> call with typed arg.
>
> A proper objection of this sort would have to argue that range() is less
> useful than all 12+ cases that we have now.

Most of the other types are useful as parameters to something such as
collections.defaultdict. In the case of range, why not use tuple for
this?

Although, I actually like this idea, because it feels more consistent.
I imagine that isn't a good reason to like things though.

-- Devin


From steve at pearwood.info  Mon May  7 02:46:32 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 07 May 2012 10:46:32 +1000
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: 
References: 
Message-ID: <4FA71B68.4010400@pearwood.info>

Terry Reedy wrote:
> It is a general principle that if a built-in class C has a unique (up to 
> equality) null object, then C() returns that null object.
> 
>  >>> for f in (bool, int, float, complex, tuple, list, dict, set, 
> frozenset, str, bytes, bytearray):
>     print(bool(f()))
> 
> # 12 lines of False

I don't think that's so much a general principle that should be aspired to as 
a general observation that many objects have an obvious "nothing" (empty) 
value that intuitively matches the zero-argument case, e.g. set, dict, list 
and so forth.

The cases of int, float, complex etc. are a little more dubious; I'm not 
convinced there's a general philosophical reason why int() should be allowed 
at all. E.g. int("") fails, int([]) fails, etc. so there's no general 
principle that the int of "emptiness" is expected to return 0.
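
The contrast drawn above can be checked directly. The following is a
minimal sketch, not part of the original message; the helper name
try_int is hypothetical and exists only for illustration.

```python
# Zero-argument calls return the zero value of each numeric type.
assert int() == 0
assert float() == 0.0
assert complex() == 0j


def try_int(value):
    """Return int(value), or the exception class name if conversion fails.

    Hypothetical helper used only to make the failures observable.
    """
    try:
        return int(value)
    except (ValueError, TypeError) as exc:
        return type(exc).__name__


# "Empty" inputs are rejected rather than mapped to 0.
empty_string_result = try_int("")   # int("") raises ValueError
empty_list_result = try_int([])     # int([]) raises TypeError
```

So int() returning 0 is a special case of the zero-argument call, not a
consequence of a rule that empty inputs convert to 0.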

The fact that float() has to choose between two zero objects, complex() 
between four, and Fraction and Decimal between an infinity of zero objects, 
highlights that the choice of a "default" is at least in part an arbitrary 
choice. If Python has any general principle here, it is that we should be 
reluctant to make arbitrary choices in the face of ambiguity.

For the avoidance of doubt, I'm not arguing for changing the behaviour of int. 
The current behaviour is fine. But I don't think we should treat it as a 
general principle that other objects should necessarily follow.



> Some imported classes such as fractions.Fraction and collections.deque 
> can be added to the list.
[...]
> It is true that there are multiple distinct null range objects (because 
> the defining start,stop,step args are kept as attributes) but they are 
> all equal.
>  >>> range(1,1) == range(0)
> True


Are you using Python 2 here? If so, you should be looking at xrange, not 
range. In Python 3, range objects are equal if their start, stop and step 
attributes are equal, not if their output values are equal:

py> range(0) == range(1,1)
False
py> range(1, 6, 2) == range(1, 7, 2)
False


> range(0) == range(0, 0, 1) would be the obvious choice for range().

I'm not entirely sure that is quite so obvious. range() defaults to a start of 
0 and a step of 1, so it's natural to reason that range() => range(0, end, 1). 
But surely we should treat end to be a required argument? If end is not 
required, that suggests the possibility of calling range with (say) a start 
value only, using the default end and step values.

I think there is great value in keeping range simple, and the simplest thing 
is to keep end as a required argument and refuse the temptation to guess if it 
is not given.

I do think this is a line-call though. If I were designing range from scratch, 
I too would be sorely tempted to have range() => range(0).


> Another advantage of doing this, beside consistency, is that it would 
> emphasize that range() produces a re-iterable sequence, not just an 
> iterator.

I don't follow your reasoning there. Whether range(*args) succeeds or fails 
for some arbitrary value of args has no bearing on whether it is re-iterable. 
Consider zip().


> 6. filter() does not work.
> 
> While filter is a class, its instances, again, are dependent on another 
> object, not just at creation but during its lifetime. Moreover, 
> bool(empty-iterable) is not False. Ditto for map() and, for instance, 
> open(), even though in the latter case the primary object is external.

Likewise reversed() and iter().

sorted() is an interesting case, because although it returns a list rather 
than a (hypothetical) SortedSequence object, it could choose to return [] when 
called with no arguments. I think it is right to not do so.

zip() on the other hand is a counter-example, and it is informative to think 
about why zip() succeeds while range() fails. zip takes an arbitrary number of 
arguments, where no particular argument is required or treated differently 
from the others. Also there is a unique interpretation of zip() with no 
arguments: an empty zip object (or list in the case of Python 2).

Nevertheless, I consider it somewhat surprising that zip() succeeds, and don't 
think that it is a good match for range.

Given the general principle "the status quo wins", I'm going to vote -0 on the 
suggested change.


-- 
Steven



From tjreedy at udel.edu  Mon May  7 04:20:35 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 06 May 2012 22:20:35 -0400
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: <4FA71B68.4010400@pearwood.info>
References:  <4FA71B68.4010400@pearwood.info>
Message-ID: 

On 5/6/2012 8:46 PM, Steven D'Aprano wrote:
> Terry Reedy wrote:
>> It is a general principle that if a built-in class C has a unique (up
>> to equality) null object, then C() returns that null object.
>>
>> >>> for f in (bool, int, float, complex, tuple, list, dict, set,
>> frozenset, str, bytes, bytearray):
>> print(bool(f()))
>>
>> # 12 lines of False
>
> I don't think that's so much a general principle that should be aspired
> to as a general observation that many objects have an obvious "nothing"
> (empty) value that intuitively matches the zero-argument case, e.g. set,
> dict, list and so forth.

The general principle, including consistency, *has* been invoked in 
discussions about making the code example above true. It is not just an 
accident. To me, an empty range is nearly as obvious as any other empty 
collection.

> The cases of int, float, complex etc. are a little more dubious; I'm not
> convinced there's a general philosophical reason why int() should be
> allowed at all. E.g. int("") fails, int([]) fails, etc. so there's no
> general principle that the int of "emptiness" is expected to return 0.
>
> The fact that float() has to choose between two zero objects, complex()
> between four, and Fraction and Decimal between an infinity of zero
> objects,

Fraction normalizes all 0 fractions to 0/1, so there is no choice ;-)
 >>> from fractions import Fraction
 >>> Fraction(0, 2)
Fraction(0, 1)
 >>> Fraction()
Fraction(0, 1)

I believe there was consideration given to similarly normalizing ranges 
so that equal ranges (in 3.3, see below) would have the same start, 
stop, and step attributes. But I believe Guido said that recording the 
input might help debugging. Or there might have been some point about 
consistency with slice objects.

If list objects, for instance, had a .source_type attribute (for 
debugging), there would be multiple different but equal empty lists. 
Both [] and list() would then, most sensibly, use list as the default 
.source_type.

> highlights that the choice of a "default" is at least in part
> an arbitrary choice. If Python has any general principle here, it is
> that we should be reluctant to make arbitrary choices in the face of
> ambiguity.
>
> For the avoidance of doubt, I'm not arguing for changing the behaviour
> of int. The current behaviour is fine. But I don't think we should treat
> it as a general principle that other objects should necessarily follow.

The consistent list above *is* a result of treating the principle as one 
that 'other' classes should follow.

> Are you using Python 2 here? If so, you should be looking at xrange, not
> range. In Python 3, range objects are equal if their start, stop and
> step attributes are equal, not if their output values are equal:
>
> py> range(0) == range(1,1)
> False
> py> range(1, 6, 2) == range(1, 7, 2)
> False

Python 3.3.0a3 (default, May  1 2012, 16:46:00) [MSC v.1500 64 bit 
(AMD64)] on win32
>>> range(0) == range(1,1)
True
>>> range(1, 6, 2) == range(1, 7, 2)
True

I remember there being a discussion about this, which Guido was part of, 
that since ranges are sequences, not their source inputs, == should 
reflect what they are, and not how they came to be. If ranges A and B 
are equal, len(A) == len(B), A[i] == B[i], and iter(A) and iter(B) 
produce the same sequence -- and vice versa.
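
A small sketch of that value-based comparison, assuming Python 3.3 or
later (earlier 3.x releases compare differently, as discussed in this
thread). The variable names are illustrative only.

```python
a = range(1, 6, 2)   # yields 1, 3, 5
b = range(1, 7, 2)   # also yields 1, 3, 5; only the stop attribute differs

# The ranges compare equal because they define the same sequence of values.
equal = (a == b)
empty_equal = (range(0) == range(1, 1))

# Equality goes together with equal length, items, and iteration order.
same_len = (len(a) == len(b))
same_items = all(a[i] == b[i] for i in range(len(a)))
same_iteration = (list(iter(a)) == list(iter(b)))

# The defining attributes are still recorded as given, so they can differ.
different_attrs = (a.start, a.stop, a.step) != (b.start, b.stop, b.step)
```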

>> range(0) == range(0, 0, 1) would be the obvious choice for range().
>
> I'm not entirely sure that is quite so obvious. range() defaults to a
> start of 0 and a step of 1, so it's natural to reason that range() =>
> range(0, end, 1). But surely we should treat end to be a required
> argument? If end is not required, that suggests the possibility of
> calling range with (say) a start value only, using the default end and
> step values.
>
> I think there is great value in keeping range simple, and the simplest
> thing is to keep end as a required argument and refuse the temptation to
> guess if it is not given.
>
> I do think this is a line-call though. If I were designing range from
> scratch, I too would be sorely tempted to have range() => range(0).
>
>> Another advantage of doing this, beside consistency, is that it would
>> emphasize that range() produces a re-iterable sequence, not just an
>> iterator.

Sorry, that is mis-worded to the point of being erroneous. I meant to 
say 'a non-iterator, re-iterable sequence *instead of* an iterator', 
just like a list or tuple or deque.

> I don't follow your reasoning there. Whether range(*args) succeeds or
> fails for some arbitrary value of args has no bearing on whether it is
> re-iterable.

Whether range is a non-iterator iterable sequence or an iterator has 
everything to do with whether it is reiterable.

> Consider zip().

That surprises me. Zip is a one-time iterator, like map, dependent on 
underlying iterables. I wonder whether it is really intentional, or an 
accident of the definition or some implementation, that zip() returns an 
exhausted iterator instead of raising. In any case, bool(zip()) returns 
True, not False, so it has nothing to do with the return null principle.
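
The difference can be sketched as follows, assuming Python 3, where
zip() returns an iterator rather than a list. This is an illustration
only; the variable names are not from the thread.

```python
z = zip()

# An iterator defines neither __bool__ nor __len__, so it is always truthy,
# even when it has nothing left to yield.
zip_is_truthy = bool(z)
zip_items = list(z)      # exhausted immediately: no items

# An empty range is a sequence of length 0 and is therefore falsy.
empty_range_is_truthy = bool(range(0))
```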

>> 6. filter() does not work.
>>
>> While filter is a class, its instances, again, are dependent on
>> another object, not just at creation but during its lifetime.
>> Moreover, bool(empty-iterable) is not False. Ditto for map() and, for
>> instance, open(), even though in the latter case the primary object is
>> external.
>
> Likewise reversed() and iter().

both fail, as I expected.

> sorted() is an interesting case, because although it returns a list
> rather than a (hypothetical) SortedSequence object, it could choose to
> return [] when called with no arguments. I think it is right to not do so.

It is a function, not a class. I would not suggest that all functions of 
one arg should have a default input and therefore a default output. This 
is certainly not a Python design principle.

> zip() on the other hand is a counter-example, and it is informative to
> think about why zip() succeeds while range() fails. zip takes an
> arbitrary number of arguments, where no particular argument is required
> or treated differently from the others. Also there is a unique
> interpretation of zip() with no arguments: an empty zip object (or list
> in the case of Python 2).
>
> Nevertheless, I consider it somewhat surprising that zip() succeeds, and
> don't think that it is a good match for range.

They are not in the same sub-categories of iterables.

-- 
Terry Jan Reedy



From tjreedy at udel.edu  Mon May  7 04:43:20 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 06 May 2012 22:43:20 -0400
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: 
References:  
Message-ID: 

On 5/6/2012 6:24 PM, Georg Brandl wrote:

> 7. The "default value" is only really useful for types that are best
> described as "data-like".  range is not a data-like type, it's a helper
> for iteration, just as filter or dictviews aren't data-like.

Not knowing your definition of 'data-like', it is hard to respond.

A range is an immutable, indexable, reiterable sequence of regularly 
spaced ints with a definite length. It compactly represents a finite 
but possibly long arithmetic sequence. While mostly used for iteration, 
it is not limited to iteration. It implements the sequence protocol. It 
is not an iterator. It is not dependent on an underlying iterable. It is 
properly documented with the other sequence types.
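
These properties can be checked with a short sketch, assuming Python 3
and the abstract base classes in collections.abc. The example range is
arbitrary.

```python
from collections.abc import Iterator, Sequence

r = range(10, 20, 3)                 # represents 10, 13, 16, 19

is_sequence = isinstance(r, Sequence)
is_iterator = isinstance(r, Iterator)

# Sequence behaviour beyond plain iteration.
length = len(r)
second = r[1]
last = r[-1]
contains_16 = 16 in r
position_of_16 = r.index(16)
middle = list(r[1:3])                # slicing a range gives another range

# Re-iterable: each pass starts from the beginning again.
first_pass = list(r)
second_pass = list(r)
```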

It is most like a bytes object in being an immutable sequence of ints. 
In that regard, it is different in not restricting the ints to [0,255] 
while restricting the differences to being equal.

(Dict views, especially .keys() are also, to me, somewhat data-like and 
not limited to iteration. But, unlike ranges, they are dependencies.)

-- 
Terry Jan Reedy



From tjreedy at udel.edu  Mon May  7 04:56:19 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 06 May 2012 22:56:19 -0400
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: 
References:  <4FA71B68.4010400@pearwood.info>
	
Message-ID: 

On 5/6/2012 10:20 PM, Terry Reedy wrote:
> On 5/6/2012 8:46 PM, Steven D'Aprano wrote:

>> Are you using Python 2 here? If so, you should be looking at xrange, not
>> range. In Python 3, range objects are equal if their start, stop and
>> step attributes are equal, not if their output values are equal:
>>
>> py> range(0) == range(1,1)
>> False
>> py> range(1, 6, 2) == range(1, 7, 2)
>> False
>
> Python 3.3.0a3 (default, May 1 2012, 16:46:00) [MSC v.1500 64 bit
> (AMD64)] on win32
> >>> range(0) == range(1,1)
> True
> >>> range(1, 6, 2) == range(1, 7, 2)
> True
>
> I remember there being a discussion about this, which Guido was part of,
> that since ranges are sequences, not their source inputs, == should
> reflect what they are, and not how they came to be. If ranges A and B
> are equal, len(A) == len(B), A[i] == B[i], and iter(A) and iter(B)
> produce the same sequence -- and vice versa.

I found the change notice in the library manual.
"Changed in version 3.3: Define '==' and '!=' to compare range objects 
based on the sequence of values they define (instead of comparing based 
on object identity)."

That implies, for instance, "range(1,6,2) != range(1,6,2)" in 3.2, which 
is rather useless. Python slowly improves in many little ways.

-- 
Terry Jan Reedy




From tjreedy at udel.edu  Mon May  7 05:22:18 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 06 May 2012 23:22:18 -0400
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: 
References: 
	
Message-ID: 

On 5/6/2012 7:52 PM, Devin Jeanpierre wrote:
> On Sun, May 6, 2012 at 5:24 PM, Terry Reedy  wrote:
>> Another advantage of doing this, beside consistency, is that it would
>> emphasize that range() produces a re-iterable sequence, not just an
>> iterator.

My apology for mis-writing that. A range is a non-iterator, re-iterable 
sequence rather than an iterator.

> How? The empty sequence is the exact case where reiterable objects and
> iterators have identical iteration behavior. (Both immediately stop
> every time you try.)

That is also true of empty tuples, lists, sets, and dicts. An iterator 
can only be used to iterate - once. Non-iterator iterables (usually) 
have other behaviors.

>> Possible objections and responses:
>>
>> 1. This would slightly complicate the already messy code and doc for
>> range().
>>
>> Pass, for now
>
> By this, do you mean don't write new documentation?

No, it means I was deferring discussing this possible objection unless 
someone raises it as a show-stopper, or it becomes the last issue. The 
current messiness is that the signature in the doc "range([start], 
stop[, step])" is non-standard in that it does not follow the rule that 
optional parameters and arguments follow required ones. It would 
perhaps be more accurate, but also possibly more confusing, to give it 
as "range(start_stop, [[stop], [step])", where start_stop is interpreted 
as start if stop is given and stop if stop is not (otherwise) given. 
Either version would just need an outer '[]' added: "range([[start], 
stop, [step]])" and a note "If no arguments are given, return range(0)."

For a Python version, adding "= 0" to start_stop in the header should be 
sufficient. But I do not know how the C version works.
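
As a rough illustration of the Python-level idea, here is a hypothetical
wrapper (sketch_range is an invented name, not a real or proposed API)
that gives the first parameter a default of 0 and delegates to the
built-in range. It is a sketch of the signature only, not of how the C
implementation parses its arguments.

```python
def sketch_range(start_stop=0, stop=None, step=1):
    """Hypothetical stand-in showing the 'start_stop=0' signature.

    With no arguments it behaves like range(0); with one argument the
    value is treated as stop; with two or three it is treated as start.
    """
    if stop is None:
        # Only start_stop was supplied (or nothing at all): it acts as stop.
        return range(0, start_stop, step)
    return range(start_stop, stop, step)


no_args = list(sketch_range())           # empty, like range(0)
one_arg = list(sketch_range(3))          # like range(3)
three_args = list(sketch_range(1, 6, 2)) # like range(1, 6, 2)
```

One limitation of this sketch is that it cannot tell an explicit
stop=None from an omitted stop, which the real range rejects.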

> Most of the other types are useful as parameters to something such as
> collections.defaultdict.

I admit range() would be seemingly useless there.

> Although, I actually like this idea, because it feels more consistent.
> I imagine that isn't a good reason to like things though.

I believe, though, it was a reason for the consistency of everything 
other than range.

-- 
Terry Jan Reedy



From greg.ewing at canterbury.ac.nz  Mon May  7 07:05:30 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 07 May 2012 17:05:30 +1200
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: <4FA71B68.4010400@pearwood.info>
References:  <4FA71B68.4010400@pearwood.info>
Message-ID: <4FA7581A.6050807@canterbury.ac.nz>

Steven D'Aprano wrote:

> The cases of int, float, complex etc. are a little more dubious; I'm not 
> convinced there's a general philosophical reason why int() should be 
> allowed at all.

A philosophical reason would be that list() and int()
both return false values. Pragmatically, it makes them useful
as arguments to defaultdict.

The fact that there is sometimes more than one representation
of zero isn't much of a problem, since they all give the same
result when you add a nonzero value to them.

The defaultdict argument doesn't apply to range() in Python 3, or
xrange() in Python 2, since you can't apply += to them. It
also doesn't apply much to range() in Python 2, since list
would work just as well as a defaultdict argument as a range
that accepted no arguments.
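
The defaultdict point can be sketched like this, assuming Python 3. The
sample data is made up for illustration.

```python
from collections import defaultdict

# int() returns 0, so missing keys start at zero and += just works.
counts = defaultdict(int)
for word in "a b a".split():
    counts[word] += 1

# list() returns [], so missing keys start as empty lists.
groups = defaultdict(list)
for n in range(6):
    groups[n % 2].append(n)

# A Python 3 range does not support + (and hence +=), so a zero-argument
# range() would not be useful as a default factory in the same way.
try:
    range(3) + range(3)
    range_supports_add = True
except TypeError:
    range_supports_add = False
```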

-- 
Greg


From greg.ewing at canterbury.ac.nz  Mon May  7 07:16:39 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 07 May 2012 17:16:39 +1200
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: 
References:  <4FA71B68.4010400@pearwood.info>
	
Message-ID: <4FA75AB7.1000702@canterbury.ac.nz>

Terry Reedy wrote:

> I believe there was consideration given to similarly normalizing ranges 
> so that equal ranges (in 3.3, see below) would have the same start, 
> stop, and step attributes.

That might make sense if there were a well-defined algebra of
range objects, but there isn't. For example, concatenating the
sequences represented by two ranges with different step sizes
results in a sequence that can't be represented by a single
range object.
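
A sketch of that observation, assuming Python 3.3 or later for range
equality. The helper as_range is hypothetical: it looks for a range
equal to a given list of ints and returns None when there is none.

```python
def as_range(values):
    """Return a range equal to the given ints, or None if none exists.

    Hypothetical helper for illustration only.
    """
    values = list(values)
    if not values:
        return range(0)
    if len(values) == 1:
        return range(values[0], values[0] + 1)
    step = values[1] - values[0]
    if step == 0:
        return None  # a range cannot repeat a value
    # Build the only candidate with this start and step, then verify it.
    stop = values[-1] + (1 if step > 0 else -1)
    candidate = range(values[0], stop, step)
    return candidate if list(candidate) == values else None


# Same step and contiguous: the concatenation is still a range.
joined_same_step = as_range(list(range(0, 6, 2)) + list(range(6, 12, 2)))

# Different steps: 0, 2, 4, 10, 15 is not an arithmetic sequence.
joined_mixed_step = as_range(list(range(0, 6, 2)) + list(range(10, 20, 5)))
```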

Also I can't remember seeing a plethora of use cases for
comparing range objects.

-- 
Greg


From ncoghlan at gmail.com  Mon May  7 09:06:59 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 7 May 2012 17:06:59 +1000
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: <4FA75AB7.1000702@canterbury.ac.nz>
References:  <4FA71B68.4010400@pearwood.info>
	 <4FA75AB7.1000702@canterbury.ac.nz>
Message-ID: 

On Mon, May 7, 2012 at 3:16 PM, Greg Ewing  wrote:
> Also I can't remember seeing a plethora of use cases for
> comparing range objects.

Most of the changes to range() in 3.3 are about making them live up to
their claim to implement the Sequence ABC. The approach taken to
achieve this is to follow the philosophy that a Python 3.3 range
object should behave as much as possible like a memory efficient
representation for a tuple of regularly spaced integers (but ignoring
the concatenation and repetition operations that tuples support but
aren't part of the Sequence ABC).

Having range() return an empty range in the same way that tuple()
returns an empty tuple would be a natural extension of that
philosophy.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From g.brandl at gmx.net  Mon May  7 12:50:26 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 07 May 2012 12:50:26 +0200
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: 
References:  <4FA71B68.4010400@pearwood.info>
	 <4FA75AB7.1000702@canterbury.ac.nz>
	
Message-ID: 

On 05/07/2012 09:06 AM, Nick Coghlan wrote:
> On Mon, May 7, 2012 at 3:16 PM, Greg Ewing  wrote:
>> Also I can't remember seeing a plethora of use cases for
>> comparing range objects.
> 
> Most of the changes to range() in 3.3 are about making them live up to
> their claim to implement the Sequence ABC. The approach taken to
> achieve this is to follow the philosophy that a Python 3.3 range
> object should behave as much as possible like a memory efficient
> representation for a tuple of regularly spaced integers (but ignoring
> the concatenation and repetition operations that tuples support but
> aren't part of the Sequence ABC).
> 
> Having range() return an empty range in the same way that tuple()
> returns an empty tuple would be a natural extension of that
> philosophy.

For what gain?  At the moment, I cannot think of any arguments in favor
of the change, which is the point where arguments against it aren't
even needed to keep the status quo.

Ah yes: and I would rather have the bug

for i in range():   # <- "n" (or equivalent) missing

give me an explicit exception than silently "skipping" the loop.
After all, the primary use case for range() is loops, and we should not
make that use worse for the benefit of hypothetical other use cases.
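
The behaviour being defended can be sketched as follows. It assumes the
status quo in which range() with no arguments raises TypeError;
count_iterations is a hypothetical helper written for this illustration.

```python
def count_iterations(*args):
    """Count loop passes over range(*args), or report a TypeError.

    Hypothetical helper: it makes the difference between "loop body
    never ran" and "the call itself was rejected" visible.
    """
    try:
        return sum(1 for _ in range(*args))
    except TypeError:
        return "TypeError"


forgot_argument = count_iterations()    # the bug: "n" is missing
explicit_empty = count_iterations(0)    # deliberately empty loop
normal_loop = count_iterations(3)
```

Under the proposed change the first call would be indistinguishable from
the second, which is the silent skip described above.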

Georg



From ncoghlan at gmail.com  Mon May  7 13:14:27 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 7 May 2012 21:14:27 +1000
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: 
References:  <4FA71B68.4010400@pearwood.info>
	 <4FA75AB7.1000702@canterbury.ac.nz>
	
	
Message-ID: 

On Mon, May 7, 2012 at 8:50 PM, Georg Brandl  wrote:
> For what gain?  At the moment, I cannot think of any arguments in favor
> of the change, which is the point where arguments against it aren't
> even needed to keep the status quo.
>
> Ah yes: and I would rather have the bug
>
> for i in range():   # <- "n" (or equivalent) missing
>
> give me an explicit exception than silently "skipping" the loop.
> After all, the primary use case for range() is loops, and we should not
> make that use worse for the benefit of hypothetical other use cases.

Now *that's* a good reason to nix the idea :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ram.rachum at gmail.com  Mon May  7 13:42:34 2012
From: ram.rachum at gmail.com (Ram Rachum)
Date: Mon, 7 May 2012 04:42:34 -0700 (PDT)
Subject: [Python-ideas] bool(datetime.time(0, 0))
Message-ID: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>

Hello,

Currently, `bool(datetime.time(0, 0)) is False`.

Can we change that to `True`?

There is nothing False-y about midnight.


Ram.

From steve at pearwood.info  Mon May  7 16:38:19 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 08 May 2012 00:38:19 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
Message-ID: <4FA7DE5B.8000703@pearwood.info>

Ram Rachum wrote:
> Hello,
> 
> Currently, `bool(datetime.time(0, 0)) is False`.
> 
> Can we change that to `True`?
> 
> There is nothing False-y about midnight.


Of course there is -- it is the witching hour, and witches are known to be 
deceivers whose words and actions are false.

*wink*

I fear that backwards compatibility will prevent any change, but I don't see 
any good reasons for treating any date or time as a false value.


By the way, the "To:" address on your post is set to 
python-ideas at googlegroups.com, which does not exist.



-- 
Steven



From solipsis at pitrou.net  Mon May  7 17:02:19 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 7 May 2012 17:02:19 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
Message-ID: <20120507170219.266304f2@pitrou.net>

On Tue, 08 May 2012 00:38:19 +1000
Steven D'Aprano  wrote:
> Ram Rachum wrote:
> > Hello,
> > 
> > Currently, `bool(datetime.time(0, 0)) is False`.
> > 
> > Can we change that to `True`?
> > 
> > There is nothing False-y about midnight.
> 
> 
> Of course there is -- it is the witching hour, and witches are known to be 
> deceivers whose words and actions are false.
> 
> *wink*
> 
> I fear that backwards compatibility will prevent any change, but I don't see 
> any good reasons for treating any date or time as a false value.

I, too, think it would be desirable to make the change.

Regards

Antoine.




From dickinsm at gmail.com  Mon May  7 17:11:55 2012
From: dickinsm at gmail.com (Mark Dickinson)
Date: Mon, 7 May 2012 16:11:55 +0100
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120507170219.266304f2@pitrou.net>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
Message-ID: 

> Steven D'Aprano  wrote:
> I fear that backwards compatibility will prevent any change, but I don't see
> any good reasons for treating any date or time as a false value.

I agree for the date, time and datetime classes.  Having timedelta(0)
be False makes sense to me, though.

But see:

http://bugs.python.org/issue13936

Mark


From alexander.belopolsky at gmail.com  Mon May  7 17:19:04 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 7 May 2012 11:19:04 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
Message-ID: 

On Mon, May 7, 2012 at 11:11 AM, Mark Dickinson  wrote:
>> Steven D'Aprano  wrote:
>> I fear that backwards compatibility will prevent any change, but I don't see
>> any good reasons for treating any date or time as a false value.
>
> I agree for the date, time and datetime classes.

Can anyone show a use case where the change will result in an
improvement?  It seems to me that the issue mostly shows up in the
code like "if t: ..." which would work better with "if t is not None:
...".


From solipsis at pitrou.net  Mon May  7 17:32:54 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 7 May 2012 17:32:54 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
Message-ID: <20120507173254.6a6aee5b@pitrou.net>

On Mon, 7 May 2012 11:19:04 -0400
Alexander Belopolsky
 wrote:
> On Mon, May 7, 2012 at 11:11 AM, Mark Dickinson  wrote:
> >> Steven D'Aprano  wrote:
> >> I fear that backwards compatibility will prevent any change, but I don't see
> >> any good reasons for treating any date or time as a false value.
> >
> > I agree for the date, time and datetime classes.
> 
> Can anyone show a use case where the change will result in an
> improvement?

Well, less occasional puzzlement is an improvement in itself.
Unintuitive behaviour is always a risk for software quality.

Regards

Antoine.




From alexander.belopolsky at gmail.com  Mon May  7 17:57:34 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 7 May 2012 11:57:34 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120507173254.6a6aee5b@pitrou.net>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
Message-ID: 

On Mon, May 7, 2012 at 11:32 AM, Antoine Pitrou  wrote:
> Well, less occasional puzzlement is an improvement in itself.
> Unintuitive behaviour is always a risk for software quality.

I don't find the current behavior unintuitive.  It is common to
represent time of day as an integer (number of minutes or seconds
since midnight) or as a float (fraction of the 24-hour day).  In these
cases one gets bool(midnight) -> False as an artifact of the
representation.  For someone who wants to switch from typeless time
variables to datetime module types, bool(midnight) -> True may present
an extra hurdle.  One can improve the quality of his software by
avoiding constructs that he finds unintuitive.  For example, I claim
that in most cases a test for bool(t) is really a lazy version of the
more appropriate test for t is None.
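
A minimal sketch of the explicit test, using only datetime.time from the
standard library. The function describe is a hypothetical example; it
gives the same answer whatever bool() returns for midnight in a given
Python version.

```python
import datetime


def describe(t):
    """Format an optional time-of-day value.

    Hypothetical example: None means "no time given"; every actual
    time, midnight included, is formatted.
    """
    if t is None:                # explicit test instead of "if not t:"
        return "no time given"
    return t.strftime("%H:%M")


missing = describe(None)
midnight = describe(datetime.time(0, 0))
noon = describe(datetime.time(12, 0))
```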

Note that if we make bool(midnight) -> True, it will not be trivial to
faithfully reproduce the old behavior.  I want the proponents of the
change to try it before I explain why it is not easy.


From solipsis at pitrou.net  Mon May  7 18:06:53 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 7 May 2012 18:06:53 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
Message-ID: <20120507180653.25a654d1@pitrou.net>

On Mon, 7 May 2012 11:57:34 -0400
Alexander Belopolsky
 wrote:
> On Mon, May 7, 2012 at 11:32 AM, Antoine Pitrou  wrote:
> > Well, less occasional puzzlement is an improvement in itself.
> > Unintuitive behaviour is always a risk for software quality.
> 
> I don't find the current behavior unintuitive.  It is common to
> represent time of day as an integer (number of minutes or seconds
> since midnight) or as a float (fraction of the 24-hour day).

I'm not sure it's common. I don't remember seeing it myself. When I use
an integer or a float as you say, it's to represent a *duration*, not
an absolute time.

> In these
> cases one gets bool(midnight) -> False as an artifact of the
> representation.

That's part of why the integer or float representation is worse than a
higher-level structure.

> One can improve the quality of his software by
> avoiding constructs that he finds unintuitive.  For example, I claim
> that in most cases a test for bool(t) is really a lazy version of the
> more appropriate test for t is None.

From a purity standpoint, you are right, but people still do it
intuitively, and it works for well-behaved types.

Either we try to lecture people into "the one way of writing Python
code using time objects", or we make it so that common uses are not
broken (i.e. a piece of code that gets wrongly executed in the rare
case they encounter a midnight time object).

> Note that if we make bool(midnight) -> True, it will not be trivial to
> faithfully reproduce the old behavior.

Why do you want to reproduce it? Does midnight warrant any special
shortcut for testing? Especially one that is confusing to many
readers.

Regards

Antoine.




From alexander.belopolsky at gmail.com  Mon May  7 18:24:21 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 7 May 2012 12:24:21 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120507180653.25a654d1@pitrou.net>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
Message-ID: 

On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou  wrote:
> Why do you want to reproduce it?

If I am porting my software to the hypothetical Python 3.4 and see
that the time.__bool__ changed, I would prefer to simply replace every
occurrence of time tested for truth with something equivalent.  In a
porting scenario, I don't want to second guess the intent of the
original programmer or "improve" the code.

> Does midnight warrant any special shortcut for testing?

I never needed it, but apparently it is common enough for users to
notice and complain.  That's why I asked my original question: if
you've seen a time variable being tested for truth, was it a bug that
can be fixed by a change in time.__bool__ or a deliberate test for the
midnight value?

> Especially one that is confusing to many readers.

I have a feeling that "readers" here are readers of documentation or
tutorials rather than readers of actual code.  If this is the case, we
can discuss how to improve the documentation and not change the
behavior.


From solipsis at pitrou.net  Mon May  7 18:33:43 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 7 May 2012 18:33:43 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
Message-ID: <20120507183343.5552cccb@pitrou.net>

On Mon, 7 May 2012 12:24:21 -0400
Alexander Belopolsky
 wrote:
> On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou  wrote:
> > Does midnight warrant any special shortcut for testing?
> 
> I never needed it, but apparently it is common enough for users to
> notice and complain.

How so? Those users complain that midnight is false, not that they have
trouble testing for midnight.
That's the whole point really: they don't think about midnight as a
special value, and they are surprised that it is.

>  That's why I asked my original question: if
> you've seen a time variable being tested for truth, was it a bug that
> can be fixed by a change in time.__bool__ or a deliberate test for the
> midnight value?

Most likely it's a bug, unless the code is written by an expert in the
datetime module. I don't expect many people to remember such oddities
(and I don't remember them myself), let alone willfully rely on them
instead of writing more explicit code.

> > Especially one that is confusing to many readers.
> 
> I have a feeling that "readers" here are readers of documentation or
> tutorials rather than readers of actual code.

I was talking about readers of code. If I read code where boolean
testing of a time object is done, I wouldn't assume the intent is to
test for midnight (unless there's a comment indicating so).

Regards

Antoine.




From mal at egenix.com  Mon May  7 18:53:28 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 07 May 2012 18:53:28 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
Message-ID: <4FA7FE08.2050901@egenix.com>

Alexander Belopolsky wrote:
> On Mon, May 7, 2012 at 11:32 AM, Antoine Pitrou  wrote:
>> Well, less occasional puzzlement is an improvement in itself.
>> Unintuitive behaviour is always a risk for software quality.
> 
> I don't find the current behavior unintuitive.  It is common to
> represent time of day as an integer (number of minutes or seconds
> since midnight) or as a float (fraction of the 24-hour day).  In these
> cases one gets bool(midnight) -> False as an artifact of the
> representation.

In Python 2.x, the slot used by bool() is called nb_nonzero, which
returns 1/0 depending on whether a value is considered zero or not.

This makes it quite natural for any special object representing a
value akin to zero in its domain to be false.

In Python 3.x, nb_nonzero was renamed to nb_bool without really
paying attention to the fact that many types implemented the original
meaning instead of a notion of boolean value, so I guess we'll just
have to live with it, unless we want to introduce yet another
subtle difference between Python 2.x and 3.x.
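
At the Python level the same rename shows up as __nonzero__ (2.x)
versus __bool__ (3.x).  A minimal sketch of a type whose zero value is
false, assuming a made-up Duration class that is not part of any
library:

```python
class Duration:
    """Hypothetical value type, used only for illustration."""

    def __init__(self, seconds):
        self.seconds = seconds

    def __bool__(self):
        # Python 3 spelling: false only for the zero value.
        return self.seconds != 0

    # Python 2 looks up __nonzero__ instead; the alias lets one class
    # body serve both.
    __nonzero__ = __bool__
```

With this, "if some_duration:" skips a zero-length duration, which is
the "zero in its domain" reading described above.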

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 07 2012)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2012-07-02: EuroPython 2012, Florence, Italy               56 days to go
2012-04-26: Released mxODBC 3.1.2                 http://egenix.com/go28
2012-04-25: Released eGenix mx Base 3.2.4         http://egenix.com/go27

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From alexander.belopolsky at gmail.com  Mon May  7 18:57:47 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 7 May 2012 12:57:47 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120507183343.5552cccb@pitrou.net>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<20120507183343.5552cccb@pitrou.net>
Message-ID: 

On Mon, May 7, 2012 at 12:33 PM, Antoine Pitrou  wrote:
>> I have a feeling that "readers" here are readers of documentation or
>> tutorials rather than readers of actual code.
>
> I was talking about readers of code. If I read code where boolean
> testing of a time object is done, I wouldn't assume the intent is to
> test for midnight (unless there's a comment indicating so).

I understand your hypothetical, but does such code actually exist in
the wild or are we debating the number of angels that can dance at
midnight?  Yes, if I were to rely on the exact behavior of
bool(time(0)), I would comment my code.  Do we know one way or another
whether such code exists?

As a matter of coding style, I recommend avoiding use of datetime.time
objects.  More often than not, time values are meaningless when
detached from the date value.  This is particularly true when timezone
aware instances are used.  Lack of support for time + timedelta makes
naked time values inconvenient even in reasonable applications that
deal with schedules that repeat from day to day.   If perceived
uncertainty over the truth value will further dissuade anyone from
using naked time objects, I am all for it.
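
To illustrate the missing time + timedelta: one common workaround is
to attach the time to a throwaway date, do the arithmetic on the
resulting datetime, and take the time part back.  A sketch only; the
helper name shift_time and the anchor date are assumptions made for
this example.

```python
import datetime

# Arbitrary anchor date (an assumption of this sketch); only the time
# part of the result is kept.
_ANCHOR = datetime.date(2000, 1, 1)


def shift_time(t, delta):
    """Return t shifted by delta, wrapping silently around midnight."""
    moved = datetime.datetime.combine(_ANCHOR, t) + delta
    # timetz() keeps any tzinfo that t carried.
    return moved.timetz()
```

Note that the day rollover is lost, which is exactly why a time
detached from its date is awkward for anything but repeating
schedules.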

Note that since date range starts at date(1,1,1) we don't have the
same problem with the date or datetime objects.


From tjreedy at udel.edu  Mon May  7 19:04:17 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 07 May 2012 13:04:17 -0400
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: 
References:  <4FA71B68.4010400@pearwood.info>
	 <4FA75AB7.1000702@canterbury.ac.nz>
	
	
	
Message-ID: 

On 5/7/2012 7:14 AM, Nick Coghlan wrote:
> On Mon, May 7, 2012 at 8:50 PM, Georg Brandl  wrote:
>> For what gain?  At the moment, I cannot think of any arguments in favor
>> of the change, which is the point where arguments against it aren't
>> even needed to keep the status quo.
>>
>> Ah yes: and I would rather have the bug
>>
>> for i in range():   #<- "n" (or equivalent) missing
>>
>> give me an explicit exception than silently "skipping" the loop.
>> After all, the primary use case for range() is loops, and we should not
>> make that use worse for the benefit of hypothetical other use cases.
>
> Now *that's* a good reason to nix the idea :)

I agree that the bug possibility is by far the strongest for range 
whereas the usefulness is probably the weakest. So this seems a case of 
practicality beats purity.

-- 
Terry Jan Reedy



From solipsis at pitrou.net  Mon May  7 19:03:05 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 7 May 2012 19:03:05 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<20120507183343.5552cccb@pitrou.net>
	
Message-ID: <20120507190305.5255b030@pitrou.net>

On Mon, 7 May 2012 12:57:47 -0400
Alexander Belopolsky
 wrote:
> On Mon, May 7, 2012 at 12:33 PM, Antoine Pitrou  wrote:
> >> I have a feeling that "readers" here are readers of documentation or
> >> tutorials rather than readers of actual code.
> >
> > I was talking about readers of code. If I read code where boolean
> > testing of a time object is done, I wouldn't assume the intent is to
> > test for midnight (unless there's a comment indicating so).
> 
> I understand your hypothetical, but does such code actually exist in
> the wild or are we debating the number of angels that can dance at
> midnight?

Well, people complained about it, so they did try to write such code
and got bitten, right? Whether or not such code still exists "in the
wild" probably depends on how fast people get bitten and fix it :-)
But regardless, it's still an annoyance for people who write new code.

On the other hand, nobody chimed in to say that they relied on boolean
testing to check for midnight.

> As a matter of coding style, I recommend avoiding use of datetime.time
> objects.  More often than not, time values are meaningless when
> detached from the date value.

I tend to agree.

> Note that since date range starts at date(1,1,1) we don't have the
> same problem with the date or datetime objects.

That's fortunate.

Regards

Antoine.




From solipsis at pitrou.net  Mon May  7 19:06:12 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 7 May 2012 19:06:12 +0200
Subject: [Python-ideas] Should range() == range(0)?
References:  <4FA71B68.4010400@pearwood.info>
	 <4FA75AB7.1000702@canterbury.ac.nz>
	
	
	
	
Message-ID: <20120507190612.2173a86e@pitrou.net>

On Mon, 07 May 2012 13:04:17 -0400
Terry Reedy  wrote:
> On 5/7/2012 7:14 AM, Nick Coghlan wrote:
> > On Mon, May 7, 2012 at 8:50 PM, Georg Brandl  wrote:
> >> For what gain?  At the moment, I cannot think of any arguments in favor
> >> of the change, which is the point where arguments against it aren't
> >> even needed to keep the status quo.
> >>
> >> Ah yes: and I would rather have the bug
> >>
> >> for i in range():   #<- "n" (or equivalent) missing
> >>
> >> give me an explicit exception than silently "skipping" the loop.
> >> After all, the primary use case for range() is loops, and we should not
> >> make that use worse for the benefit of hypothetical other use cases.
> >
> > Now *that's* a good reason to nix the idea :)
> 
> I agree that the bug possibility is by far the strongest for range 
> whereas the usefulness is probably the weakest. So this seems a case of 
> practicality beats purity.

The fact that there's absolutely no use case to call range() without an
argument is enough to dismiss the idea, IMO.
Just because something can be done doesn't mean it should be done.

Regards

Antoine.




From alexander.belopolsky at gmail.com  Mon May  7 19:16:11 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 7 May 2012 13:16:11 -0400
Subject: [Python-ideas] Should range() == range(0)?
In-Reply-To: 
References:  <4FA71B68.4010400@pearwood.info>
	 <4FA75AB7.1000702@canterbury.ac.nz>
	
	
Message-ID: 

On Mon, May 7, 2012 at 6:50 AM, Georg Brandl  wrote:
>> Having range() return an empty range in the same way that tuple()
>> returns an empty tuple would be a natural extension of that
>> philosophy.
>
> For what gain?

The lack of a default constructor is a pain for generic programming in
Python.  It is not uncommon to require an arbitrary instance of a
given type, and calling the type without arguments is a convenient way
to get one.  I never missed a working range(), mostly because I don't
recall ever using range as an actual type rather than as the Python
way to spell the C for loop.  I do, however, often miss default
constructors for datetime objects, so I understand why some people may
desire range().
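
A small sketch of the generic-programming point, assuming a
hypothetical helper named has_default_constructor: it simply tries to
call the type with no arguments and reports whether that worked.

```python
import datetime


def has_default_constructor(cls):
    """Report whether cls() can be called without arguments."""
    try:
        cls()
    except TypeError:
        return False
    return True


# Types that hand back an "arbitrary instance" when called bare.
with_default = [int, str, list, tuple, datetime.timedelta]

# Types that insist on arguments, including range as it stands.
without_default = [range, datetime.date, datetime.datetime]
```

Generic code that needs a placeholder instance has to special-case
everything in the second list.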


From tjreedy at udel.edu  Mon May  7 19:27:01 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 07 May 2012 13:27:01 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120507183343.5552cccb@pitrou.net>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<20120507183343.5552cccb@pitrou.net>
Message-ID: 

On 5/7/2012 12:33 PM, Antoine Pitrou wrote:
> On Mon, 7 May 2012 12:24:21 -0400
> Alexander Belopolsky
>   wrote:
>> On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou  wrote:
>>> Does midnight warrant any special shortcut for testing?
>>
>> I never needed it, but apparently it is common enough for users to
>> notice and complain.
>
> How so? Those users complain that midnight is false, not that they have
> trouble testing for midnight.
> That's the whole point really: they don't think about midnight as a
> special value, and they are surprised that it is.

It is only special in the representation because 24:00 == 00:00. I have 
the impression that European train timetables at least in decades past 
printed midnight arrival times as 24:00 instead of 00:00. I agree that 
that is an extremely thin reason for the current behavior. Someone 
printing timetables in that style should explicitly test arrivals for 
being midnight.

There have been cultures that started the day at dawn or noon, and in 
the US at least, we still restart half-days at both noon and midnight, 
so both would be special here.

Rather unusually, I disagree with Tim here: "It is odd, but really no 
odder than "zero values" of other types evaluating to false in Boolean 
contexts ;-)". Numerical 0 and empty collections are special and often 
need to be treated specially in a way that is untrue of midnight. I 
think treating it as special was a design mistake.

There have been discussions on python-list to the effect that if one 
wants to branch on something being None or not, one should be explicit 
-- 'is None' or 'is not None' -- to avoid accidentally picking up other 
null values.
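
The difference can be sketched with two made-up helpers (the names
pick_truthy and pick_not_none are illustrative only):

```python
def pick_truthy(value, default):
    # Falls back for None, but also for 0, "", [] and any other
    # value that happens to be false.
    return value if value else default


def pick_not_none(value, default):
    # Falls back only when the value is really missing.
    return default if value is None else value
```

pick_truthy(0, 10) gives 10, silently discarding a legitimate zero,
while pick_not_none(0, 10) gives 0.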

-- 
Terry Jan Reedy



From steve at pearwood.info  Mon May  7 19:31:59 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 08 May 2012 03:31:59 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>	<4FA7DE5B.8000703@pearwood.info>	<20120507170219.266304f2@pitrou.net>			<20120507173254.6a6aee5b@pitrou.net>
	
Message-ID: <4FA8070F.1040107@pearwood.info>

Alexander Belopolsky wrote:
> On Mon, May 7, 2012 at 11:32 AM, Antoine Pitrou  wrote:
>> Well, less occasional puzzlement is an improvement in itself.
>> Unintuitive behaviour is always a risk for software quality.
> 
> I don't find the current behavior unintuitive.  It is common to
> represent time of day as an integer (number of minutes or seconds
> since midnight) or as a float (fraction of the 24-hour day).  In these
> cases one gets bool(midnight) -> False as an artifact of the
> representation.  

I think you have made a good point there: the behaviour of bool(midnight) is 
an artifact of the internal representation. Unless this behaviour is 
documented, that makes it an implementation detail, and therefore lowers (but 
does not eliminate) the barrier to changing it.



[...]
> Note that if we make bool(midnight) -> True, it will not be trivial to
> faithfully reproduce the old behavior.  I want the proponents of the
> change to try it before I explain why it is not easy.

I think it is easy. Instead of either of these:

     if bool(some_time):
         ...
     if some_time:
         ...

write this:

     _MIDNIGHT = datetime.time(0, 0)  # defined once
     if some_time != _MIDNIGHT:
         ...

For code where some_time could be None, write this:

     if not (some_time is None or some_time == _MIDNIGHT):
         ...


Have I missed any common cases?



-- 
Steven



From alexander.belopolsky at gmail.com  Mon May  7 19:44:29 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 7 May 2012 13:44:29 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <4FA8070F.1040107@pearwood.info>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<4FA8070F.1040107@pearwood.info>
Message-ID: 

On Mon, May 7, 2012 at 1:31 PM, Steven D'Aprano  wrote:
> I think it is easy. Instead of either of these:
>
>    if bool(some_time):
>        ...
>    if some_time:
>        ...
>
> write this:
>
>    _MIDNIGHT = datetime.time(0, 0)  # defined once
>    if some_time != _MIDNIGHT:
>        ...
>
> For code where some_time could be None, write this:
>
>    if not (some_time is None or some_time == _MIDNIGHT):
>        ...
>
>
> Have I missed any common cases?


Yes, your code will raise an exception if some_time has tzinfo set.
This is exactly the issue that I expected you to miss, so I rest my
case. :-)


From breamoreboy at yahoo.co.uk  Mon May  7 19:44:34 2012
From: breamoreboy at yahoo.co.uk (Mark Lawrence)
Date: Mon, 07 May 2012 18:44:34 +0100
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<20120507183343.5552cccb@pitrou.net> 
Message-ID: 

On 07/05/2012 18:27, Terry Reedy wrote:
> It is only special in the representation because 24:00 == 00:00. I have
> the impression that European train timetables at least in decades past
> printed midnight arrival times as 24:00 instead of 00:00. I agree that
> that is an extremely thin reason for the current behavior. Someone
> printing timetables in that style should explicitly test arrivals for
> being midnight.
>
> There have been cultures that started the day at dawn or noon, and in
> the US at least, we still restart half-days at both noon and midnight,
> so both would be special here.
>

My understanding is that 24:00 hours is only really used by the military 
to avoid misunderstanding over the actual day they're talking about. 
The night of 5th/6th June 1944 springs to my mind.  I'll happily be 
corrected.

-- 
Cheers.

Mark Lawrence.



From mal at egenix.com  Mon May  7 19:49:56 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 07 May 2012 19:49:56 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<20120507183343.5552cccb@pitrou.net>
	 
Message-ID: <4FA80B44.8090608@egenix.com>

Mark Lawrence wrote:
> On 07/05/2012 18:27, Terry Reedy wrote:
>> It is only special in the representation because 24:00 == 00:00. I have
>> the impression that European train timetables at least in decades past
>> printed midnight arrival times as 24:00 instead of 00:00. I agree that
>> that is an extremely thin reason for the current behavior. Someone
>> printing timetables in that style should explicitly test arrivals for
>> being midnight.
>>
>> There have been cultures that started the day at dawn or noon, and in
>> the US at least, we still restart half-days at both noon and midnight,
>> so both would be special here.
>>
> 
> My understanding is that 24:00 hours is only really used by the military to avoid misunderstanding
> over the actual day they're talking about. The night of 5th/6th June 1944 springs to my mind.  I'll
> happily be corrected.

It's in common use in Germany, e.g. for describing opening hours.

-- 
Marc-Andre Lemburg
eGenix.com



From alexander.belopolsky at gmail.com  Mon May  7 20:01:19 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 7 May 2012 14:01:19 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <4FA80B44.8090608@egenix.com>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<20120507183343.5552cccb@pitrou.net> 
	 <4FA80B44.8090608@egenix.com>
Message-ID: 

On Mon, May 7, 2012 at 1:49 PM, M.-A. Lemburg  wrote:
>> My understanding is that 24:00 hours is only really used by the military to avoid misunderstanding
>> over the actual day they're talking about. The night of 5th/6th June 1944 springs to my mind.  I'll
>> happily be corrected.
>
> It's in common use in Germany, e.g. for describing opening hours.

Properly supporting 24:00 timestamps in datetime module is actually a
more interesting issue than what bool(time(0)) should be.  See
http://bugs.python.org/issue10427


From mal at egenix.com  Mon May  7 20:25:26 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 07 May 2012 20:25:26 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<20120507183343.5552cccb@pitrou.net>
	 
	<4FA80B44.8090608@egenix.com>
	
Message-ID: <4FA81396.4020606@egenix.com>

Alexander Belopolsky wrote:
> On Mon, May 7, 2012 at 1:49 PM, M.-A. Lemburg  wrote:
>>> My understanding is that 24:00 hours is only really used by the military to avoid misunderstanding
>>> over the actual day they're talking about. The night of 5th/6th June 1944 springs to my mind.  I'll
>>> happily be corrected.
>>
>> It's in common use in Germany, e.g. for describing opening hours.
> 
> Properly supporting 24:00 timestamps in datetime module is actually a
> more interesting issue than what bool(time(0)) should be.  See
> http://bugs.python.org/issue10427

Just to clarify: 24:00 is used when describing times, but not in
timestamps (those use 00:00 and the next day). E.g. it's common
to write: "open 00:00-24:00" or "open 18:00-24:00".

I've never seen anything like "2011-12-31 24:00" in Germany,
but Google suggests that it's in common use in Asia:

https://www.google.de/search?q=%222011-12-31+24:00%22

-- 
Marc-Andre Lemburg
eGenix.com



From steve at pearwood.info  Mon May  7 20:31:51 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 08 May 2012 04:31:51 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>	<4FA7DE5B.8000703@pearwood.info>	<20120507170219.266304f2@pitrou.net>			<20120507173254.6a6aee5b@pitrou.net>		<4FA8070F.1040107@pearwood.info>
	
Message-ID: <4FA81517.8070300@pearwood.info>

Alexander Belopolsky wrote:
> On Mon, May 7, 2012 at 1:31 PM, Steven D'Aprano  wrote:

>> For code where some_time could be None, write this:
>>
>>    if not (some_time is None or some_time == _MIDNIGHT):
>>        ...
>>
>>
>> Have I missed any common cases?
> 
> 
> Yes, your code will raise an exception if some_time has tzinfo set.
> This is exactly the issue that I expected you to miss, so I rest my
> case. :-)

But if you set the timezone, midnight is not necessarily false.


py> import datetime
py> class GMT5(datetime.tzinfo):
...     def utcoffset(self, dt):
...         return datetime.timedelta(hours=5)
...     def tzname(self, dt):
...         return "GMT +5"
...     def dst(self, dt):
...         return datetime.timedelta(0)
...
py> gmt5 = GMT5()
py> bool(datetime.time(0, 0, tzinfo=gmt5))
True
py> bool(datetime.time(5, 0, tzinfo=gmt5))
False


So I assume anyone using tzinfo will probably know enough not to be testing 
against time objects directly. Or at least not be using bool(some_time) to 
detect midnight.
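
The rule the session above points to (a time is false when the time of
day minus its UTC offset comes out as zero) can be written as an
explicit helper that does not depend on what bool() does.  This is a
sketch only: the name legacy_time_bool is made up, and
datetime.timezone can stand in for the hand-written GMT5 class when
trying it out.

```python
import datetime


def legacy_time_bool(t):
    """Hypothetical helper: false when t minus its UTC offset is 00:00."""
    # Naive times have no offset; treat that as zero.
    offset = t.utcoffset() or datetime.timedelta(0)
    since_midnight = datetime.timedelta(
        hours=t.hour, minutes=t.minute,
        seconds=t.second, microseconds=t.microsecond)
    return since_midnight != offset
```

Code that really did mean "is this midnight UTC?" could call such a
helper by name instead of relying on truth testing.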




-- 
Steven



From jkbbwr at gmail.com  Mon May  7 20:43:17 2012
From: jkbbwr at gmail.com (Jakob Bowyer)
Date: Mon, 7 May 2012 19:43:17 +0100
Subject: [Python-ideas] Replacing shelve in the next 3.x release.
Message-ID: 

I suggest that we either replace the internals of shelve, deprecate
it, or remove it in favour of other dbms such as
http://packages.python.org/sqlite3dbm/dbm.html. Many people feel that
shelve is a pointless module that should not be used because it relies
too much on pickle, an insecure format. In my own searches, Google
showed only 13 projects using shelve, and GitHub showed only 3000-odd
snippets containing shelve.open, so it's time this module either died
quietly or got its internals replaced. For a start, shelve doesn't
support integer keys, whereas the alternative suggested above clearly
does. I'm sure there is other stuff I'm missing, which is why I'm
posting here first.
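
For reference, the integer-key limitation is usually worked around by
converting keys to strings by hand.  A minimal sketch, assuming a
made-up helper roundtrip_int_key and a throwaway shelf in a temporary
directory:

```python
import os
import shelve
import tempfile


def roundtrip_int_key(key, value):
    """Store value under an integer key by stringifying the key."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_shelf")
        with shelve.open(path) as db:
            # shelve keys must be strings, so convert explicitly.
            db[str(key)] = value
        # Reopen to show the value survived the round trip.
        with shelve.open(path) as db:
            return db[str(key)]
```

It works, but every caller has to remember the str() on both the
write and the read side.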


From solipsis at pitrou.net  Mon May  7 20:50:11 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 7 May 2012 20:50:11 +0200
Subject: [Python-ideas] Replacing shelve in the next 3.x release.
References: 
Message-ID: <20120507205011.676362c0@pitrou.net>

On Mon, 7 May 2012 19:43:17 +0100
Jakob Bowyer  wrote:
> I suggest that we either replace the internals of shelve, or deprecate
> it, or remove it in favour of other dbm's like
> http://packages.python.org/sqlite3dbm/dbm.html. Many people feel that
> shelve is a pointless module that should not be used because it relies
> too much on pickle, an insecure format,

pickle is only insecure if you want to accept data from untrusted
sources. shelve would obviously be very bad for an exchange format, but
I don't think that's what it's used for.

Someone should post a proper comparison of shelve with its alternatives
(including functionality and performance) before a decision is made.

Regards

Antoine.




From ubershmekel at gmail.com  Mon May  7 21:20:56 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Mon, 7 May 2012 22:20:56 +0300
Subject: [Python-ideas] Replacing shelve in the next 3.x release.
In-Reply-To: <20120507205011.676362c0@pitrou.net>
References: 
	<20120507205011.676362c0@pitrou.net>
Message-ID: 

On Mon, May 7, 2012 at 9:50 PM, Antoine Pitrou  wrote:

> On Mon, 7 May 2012 19:43:17 +0100
> Jakob Bowyer  wrote:
> > I suggest that we either replace the internals of shelve, or deprecate
> > it, or remove it in favour of other dbm's like
> > http://packages.python.org/sqlite3dbm/dbm.html. Many people feel that
> > shelve is a pointless module that should not be used because it relies
> > too much on pickle, an insecure format,
>
> pickle is only insecure if you want to accept data from untrusted
> sources. shelve would obviously be very bad for an exchange format, but
> I don't think that's what it's used for.
>
> Someone should post a proper comparison of shelve with its alternatives
> (including functionality and performance) before a decision is made.
>
> Regards
>
> Antoine.
>
>
I used shelve for a long time on multiple projects as it's really easy to
use but I had to deal with random data corruption on abrupt process
termination. That was my motivator to implement an sqlite backend for
shelve though I guess I wasn't motivated strongly enough to follow through.

Yuval

From solipsis at pitrou.net  Mon May  7 21:25:19 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 7 May 2012 21:25:19 +0200
Subject: [Python-ideas] Replacing shelve in the next 3.x release.
References: 
	<20120507205011.676362c0@pitrou.net>
	
Message-ID: <20120507212519.58d25b26@pitrou.net>

On Mon, 7 May 2012 22:20:56 +0300
Yuval Greenfield 
wrote:
> I used shelve for a long time on multiple projects as it's really easy to
> use but I had to deal with random data corruption on abrupt process
> termination. That was my motivator to implement an sqlite backend for
> shelve though I guess I wasn't motivated strongly enough to follow through.

Atomic replacement of the shelve file is probably an improvement worth
adding.
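
For illustration only, a minimal sketch of what atomic replacement could
look like: write the new contents to a temporary file in the same
directory, then rename it over the old file. The helper name
atomic_write_bytes is made up for this example and is not part of shelve
or dbm; os.replace needs Python 3.3 or later.

```python
import os
import tempfile


def atomic_write_bytes(path, data):
    """Write data to path so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename cannot be a single step.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # rename over the old file
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A process killed in the middle of the write would then leave the previous
file in place rather than a half-written one, at least on filesystems
where the rename is atomic.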

Regards

Antoine.




From greg.ewing at canterbury.ac.nz  Tue May  8 02:18:20 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 08 May 2012 12:18:20 +1200
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
Message-ID: <4FA8664C.2090201@canterbury.ac.nz>

Alexander Belopolsky wrote:

> I don't find the current behavior unintuitive.  It is common to
> represent time of day as an integer (number of minutes or seconds
> since midnight) or as a float (fraction of the 24-hour day).  In these
> cases one gets bool(midnight) -> False as an artifact of the
> representation.

Relying on that artifact by using midnight as a kind of
null value seems like a bad idea to me, though. Any
code doing that almost deserves to be broken.

-- 
Greg


From greg.ewing at canterbury.ac.nz  Tue May  8 02:32:23 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 08 May 2012 12:32:23 +1200
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <4FA7FE08.2050901@egenix.com>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<4FA7FE08.2050901@egenix.com>
Message-ID: <4FA86997.5000809@canterbury.ac.nz>

M.-A. Lemburg wrote:
> In Python 3.x, nb_nonzero was renamed to nb_bool without really
> paying attention to the fact that many types implemented the original
> meaning instead of a notion of boolean value

I don't think it was wrong to do that. The fact that the
C slot was called "nonzero" was never visible to the Python
programmer, who always thought of the operation it represents
as truth-testing.

If there's any fault here, it's with C type implementors who
have taken "nonzero" too literally.

-- 
Greg


From ncoghlan at gmail.com  Tue May  8 03:57:08 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 8 May 2012 11:57:08 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <4FA86997.5000809@canterbury.ac.nz>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<4FA7FE08.2050901@egenix.com> <4FA86997.5000809@canterbury.ac.nz>
Message-ID: 

On Tue, May 8, 2012 at 10:32 AM, Greg Ewing  wrote:
> M.-A. Lemburg wrote:
>>
>> In Python 3.x, nb_nonzero was renamed to nb_bool without really
>> paying attention to the fact that many types implemented the original
>> meaning instead of a notion of boolean value
>
>
> I don't think it was wrong to do that. The fact that the
> C slot was called "nonzero" was never visible to the Python
> programmer, who always thought of the operation it represents
> as truth-testing.
>
> If there's any fault here, it's with C type implementors who
> have taken "nonzero" too literally.

The Python level special method in 2.x is also __nonzero__ (or you
could just implement __len__).

In 3.x, the two relevant special methods are now __bool__ and __len__.
Type designers are, of course, still free to use "non-zero" as their
definition for how they choose to implement __bool__.

For myself, I don't see any harm in having the zero hour be treated as
the zero hour at the language level ("zero hour" is another term for
midnight, which, as far as I know, stems from the military Zulu
notation where it's written as "0000Z"). Certainly I don't see
adequate justification for changing the boolean behaviour of time
objects at this late date.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ethan at stoneleaf.us  Tue May  8 04:57:20 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 07 May 2012 19:57:20 -0700
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>	<4FA7DE5B.8000703@pearwood.info>	<20120507170219.266304f2@pitrou.net>			<20120507173254.6a6aee5b@pitrou.net>		<20120507180653.25a654d1@pitrou.net>
	
Message-ID: <4FA88B90.7060309@stoneleaf.us>

Alexander Belopolsky wrote:
> On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou  wrote:
>> Does midnight warrant any special shortcut for testing?
> 
> I never needed it, but apparently it is common enough for users to
> notice and complain.  That's why I asked my original question: if
> you've seen a time variable being tested for truth, was it a bug that
> can be fixed by a change in time.__bool__ or a deliberate test for the
> midnight value?

Complain or rewrite to something reasonable, which is what I did.  Much 
better to have the built-in time behave properly than have users either 
work around it or constantly create new classes.

>> Especially one that is confusing to many readers.
> 
> I have a feeling that "readers" here are readers of documentation or
> tutorials rather than readers of actual code.  If this is the case, we
> can discuss how to improve the documentation and not change the
> behavior.

The behavior is broken.  Midnight is not False.

~Ethan~


From ncoghlan at gmail.com  Tue May  8 05:57:13 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 8 May 2012 13:57:13 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <4FA88B90.7060309@stoneleaf.us>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
Message-ID: 

On Tue, May 8, 2012 at 12:57 PM, Ethan Furman  wrote:
> The behavior is broken.  Midnight is not False.

Whereas I disagree - I see having zero hour be false as perfectly
reasonable behaviour (not necessarily *useful*, but then having all
time objects report as True in boolean context isn't particularly
useful either).

In matters of opinion, the status quo reigns.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From steve at pearwood.info  Tue May  8 07:42:49 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 8 May 2012 15:42:49 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
Message-ID: <20120508054249.GB3797@ando>

On Tue, May 08, 2012 at 01:57:13PM +1000, Nick Coghlan wrote:
> On Tue, May 8, 2012 at 12:57 PM, Ethan Furman  wrote:
> > The behavior is broken.  Midnight is not False.
> 
> Whereas I disagree - I see having zero hour be false as perfectly
> reasonable behaviour (not necessarily *useful*, but then having all
> time objects report as True in boolean context isn't particularly
> useful either).

On the contrary, it can be very useful to have all objects of some 
classes treated as true. For example, we can write:

mo = re.search(a, b)
if mo:
    do_something_with(mo)

without having to worry about the case where a valid MatchObject happens 
to be false.

Consider:

t = get_job_start_time()  # returns a datetime.time object, or None
if t:
    do_something_with(t)


Oops, we have a bug. If the job happens to have started at exactly 
midnight, it will wrongly be treated as false.

But wait, it's worse than that. Because it's not actually midnight that 
gets treated as false, but some arbitrary time of the day which depends 
on your timezone. It's only midnight if you don't specify a tzinfo, or 
if you do and happen to be using GMT.
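
A rough sketch of the rule being described, assuming the behaviour as
documented at the time (convert to minutes, subtract utcoffset(), compare
with zero; non-zero seconds or microseconds always count as true). The
function legacy_time_truth is a hypothetical stand-in written for this
example, not the actual C implementation:

```python
import datetime


def legacy_time_truth(t):
    """Approximate the truth rule under discussion for a datetime.time."""
    # Any non-zero seconds or microseconds make the value true.
    if t.second or t.microsecond:
        return True
    # Naive times have no offset; treat that as zero.
    offset = t.utcoffset() or datetime.timedelta(0)
    offset_minutes = offset // datetime.timedelta(minutes=1)
    # False only when "minutes since midnight, minus the UTC offset" is zero.
    return (t.hour * 60 + t.minute - offset_minutes) != 0
```

Under this rule, 10:00 with a UTC+10 tzinfo comes out false, while 00:00
with the same tzinfo comes out true.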

Midnight (modulo timezone) is not special enough to treat it as a false 
value. It's not an empty container or mapping, or the identity element 
under addition, or the only string that contains no substrings except 
itself. It's just another hour.

(Midnight is only special if you care about the change from one day to 
another. But if you care about that, you're probably using datetime 
objects rather than time objects, and then you don't have this problem 
because "midnight last Tuesday" is not treated as false.)

I believe that having time(0,0) be treated as false is at best a 
misfeature and at worst a bug. It is as unnecessary a special case as it 
would be to have the string "\0" treated as false.

The only good defence for keeping it, in my opinion, would be fear that 
there is working code that relies on this.


> In matters of opinion, the status quo reigns.

That's somewhat of an exaggeration. The mere existence of a single 
dissenting opinion isn't enough to block all progress/changes. (Not 
unless it's Guido *wink*.) Consensus doesn't require every single person 
to agree.


-- 
Steven



From alexander.belopolsky at gmail.com  Tue May  8 08:21:07 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 8 May 2012 02:21:07 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508054249.GB3797@ando>
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
Message-ID: 

On Tue, May 8, 2012 at 1:42 AM, Steven D'Aprano  wrote:
>> In matters of opinion, the status quo reigns.
>
> That's somewhat of an exaggeration. The mere existence of a single
> dissenting opinion isn't enough to block all progress/changes.

For what it's worth, I am also against changing the status quo.
time(0) is special: it is the smallest possible value.  If you deal
with low resolution time values, say hourly schedules, it is not
unreasonable to test for time(0).  For example, when estimating daily
averages, midnight samples can be weighted by 1/2 to account for the
uncertainty in assigning midnight to a given day.


From ethan at stoneleaf.us  Tue May  8 08:33:56 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 07 May 2012 23:33:56 -0700
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <4FA7DE5B.8000703@pearwood.info>	<20120507170219.266304f2@pitrou.net>			<20120507173254.6a6aee5b@pitrou.net>		<20120507180653.25a654d1@pitrou.net>		<4FA88B90.7060309@stoneleaf.us>		<20120508054249.GB3797@ando>
	
Message-ID: <4FA8BE54.6060909@stoneleaf.us>

Alexander Belopolsky wrote:
> On Tue, May 8, 2012 at 1:42 AM, Steven D'Aprano  wrote:
>>> In matters of opinion, the status quo reigns.
>> That's somewhat of an exaggeration. The mere existence of a single
>> dissenting opinion isn't enough to block all progress/changes.
> 
> For what it's worth, I am also against changing the status quo.
> time(0) is special: it is the smallest possible value.  If you deal
> with low resolution time values, say hourly schedules, it is not
> unreasonable to test for time(0).  For example, when estimating daily
> averages, midnight samples can be weighted by 1/2 to account for the
> uncertainty in assigning midnight to a given day.

Testing for midnight does not require midnight to be False.
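
For example, the half-weight scheme quoted above can be written with an
explicit comparison. This is only a sketch; weighted_daily_average and
the (time, value) sample layout are assumptions made up for illustration:

```python
import datetime

MIDNIGHT = datetime.time(0)


def weighted_daily_average(samples):
    """Average (time, value) pairs, counting midnight samples at half weight."""
    total = 0.0
    weight_sum = 0.0
    for t, value in samples:
        # Explicit equality test; no reliance on the truth value of t.
        weight = 0.5 if t == MIDNIGHT else 1.0
        total += weight * value
        weight_sum += weight
    return total / weight_sum  # raises ZeroDivisionError for no samples
```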

And no, I don't maintain any hope of winning this argument -- that's why 
I wrote my own class.  With it it is possible to create an unspecified 
moment... and guess what?  It evaluates to False; all actual times 
evaluate as True.  (Including midnight. ;)
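
Roughly along these lines; this is a stripped-down sketch of the idea
rather than the real class, and the name Moment is invented here:

```python
import datetime


class Moment:
    """Hypothetical wrapper: false only when no time has been specified."""

    def __init__(self, value=None):
        # value is a datetime.time, or None for an unspecified moment
        self.value = value

    def __bool__(self):
        # Every actual time is true, midnight included.
        return self.value is not None

    def __repr__(self):
        return "Moment(%r)" % (self.value,)
```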

~Ethan~


From ncoghlan at gmail.com  Tue May  8 09:02:04 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 8 May 2012 17:02:04 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508054249.GB3797@ando>
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
Message-ID: 

On Tue, May 8, 2012 at 3:42 PM, Steven D'Aprano  wrote:
> On Tue, May 08, 2012 at 01:57:13PM +1000, Nick Coghlan wrote:
>> On Tue, May 8, 2012 at 12:57 PM, Ethan Furman  wrote:
>> > The behavior is broken.  Midnight is not False.
>>
>> Whereas I disagree - I see having zero hour be false as perfectly
>> reasonable behaviour (not necessarily *useful*, but then having all
>> time objects report as True in boolean context isn't particularly
>> useful either).
>
> On the contrary, it can be very useful to have all objects of some
> classes treated as true. For example, we can write:
>
> mo = re.search(a, b)
> if mo:
>     do_something_with(mo)
>
> without having to worry about the case where a valid MatchObject happens
> to be false.
>
> Consider:
>
> t = get_job_start_time()  # returns a datetime.time object, or None
> if t:
>     do_something_with(t)
>
>
> Oops, we have a bug. If the job happens to have started at exactly
> midnight, it will wrongly be treated as false.

IMO, you've completely misdiagnosed the source of that bug. Never
*ever* rely on boolean evaluation when testing against None. *Always*
use the "is not None" trailer.
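
A small sketch of the explicit form, assuming a value that is either a
datetime.time or None as in the quoted example; describe_start is a
made-up name for illustration:

```python
import datetime


def describe_start(t):
    """Report a job start time; t is a datetime.time or None."""
    # Identity test against None, independent of how type(t) defines truth.
    if t is not None:
        return "started at " + t.isoformat()
    return "not started"
```

Written this way, a job that started at exactly midnight is reported the
same as any other, whatever bool(t) happens to return.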

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ncoghlan at gmail.com  Tue May  8 09:09:25 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 8 May 2012 17:09:25 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508054249.GB3797@ando>
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
Message-ID: 

On Tue, May 8, 2012 at 3:42 PM, Steven D'Aprano  wrote:
> On Tue, May 08, 2012 at 01:57:13PM +1000, Nick Coghlan wrote:
>> In matters of opinion, the status quo reigns.
>
> That's somewhat of an exaggeration. The mere existence of a single
> dissenting opinion isn't enough to block all progress/changes. (Not
> unless it's Guido *wink*.) Consensus doesn't require every single person
> to agree.

The current behaviour is perfectly consistent and well-defined, so
changing it will break any code that relies on the current behaviour.
The burden is not on me to prove that there *is* such code in the
wild, it's on those proposing a change to prove that there *isn't*
such code (which can't be done), or else to provide a sufficiently
compelling rationale that the risk of breakage can be justified.

"I don't like it" is not a valid argument for a change, nor is "I like
using a boolean test when I really mean an 'is not None' test".

*If* such a change were to be made, it would require at least one
release where a DeprecationWarning was emitted before returning False,
and then the return value could change in the next release. Why
bother?
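
To make the cost concrete, the transitional release would need something
like the following. This is a sketch on a hypothetical subclass
(TransitionalTime is not a real or proposed API), and it only handles the
naive-midnight case:

```python
import datetime
import warnings


class TransitionalTime(datetime.time):
    """Hypothetical subclass: keeps naive midnight false, but warns about it."""

    def __bool__(self):
        is_naive_midnight = (
            self.tzinfo is None
            and (self.hour, self.minute, self.second, self.microsecond)
            == (0, 0, 0, 0)
        )
        if is_naive_midnight:
            # Announce the planned change while keeping the old result.
            warnings.warn(
                "time(0, 0) will be treated as true in a future release",
                DeprecationWarning,
                stacklevel=2,
            )
            return False
        return True
```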

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ubershmekel at gmail.com  Tue May  8 09:17:31 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Tue, 8 May 2012 10:17:31 +0300
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508054249.GB3797@ando>
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
Message-ID: 

On Tue, May 8, 2012 at 8:42 AM, Steven D'Aprano  wrote:
[...]

> It's only midnight if you don't specify a tzinfo, or
> if you do and happen to be using GMT.
>

Arbitrary and unexpected times evaluating to False is a bug waiting to
happen. Personally I'd prefer all datetime.time objects had no boolean
value at all.

> The only good defence for keeping it, in my opinion, would be fear that
> there is working code that relies on this.


+1

Yuval

From solipsis at pitrou.net  Tue May  8 11:56:26 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 8 May 2012 11:56:26 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
Message-ID: <20120508115626.32706f28@pitrou.net>

On Tue, 8 May 2012 17:02:04 +1000
Nick Coghlan  wrote:
> 
> IMO, you've completely misdiagnosed the source of that bug. Never
> *ever* rely on boolean evaluation when testing against None.

Nick, that's just plain silly. If we didn't want people to rely on
boolean evaluation, we wouldn't define __bool__ at all (or we would
make it return a random value).

Regards

Antoine.




From ncoghlan at gmail.com  Tue May  8 12:08:07 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 8 May 2012 20:08:07 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508115626.32706f28@pitrou.net>
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<20120508115626.32706f28@pitrou.net>
Message-ID: 

The problem is not using boolean evaluation - it's assuming that boolean
evaluation is defined as "x is not None". Doing so introduces a completely
unnecessary dependency on the type of "x". I'm frankly astonished that so
many people seem to think it's a reasonable thing to do.

--
Sent from my phone, thus the relative brevity :)
On May 8, 2012 8:01 PM, "Antoine Pitrou"  wrote:

> On Tue, 8 May 2012 17:02:04 +1000
> Nick Coghlan  wrote:
> >
> > IMO, you've completely misdiagnosed the source of that bug. Never
> > *ever* rely on boolean evaluation when testing against None.
>
> Nick, that's just plain silly. If we didn't want people to rely on
> boolean evaluation, we wouldn't define __bool__ at all (or we would
> make it return a random value).
>
> Regards
>
> Antoine.
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From solipsis at pitrou.net  Tue May  8 12:11:04 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 08 May 2012 12:11:04 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<20120508115626.32706f28@pitrou.net>
	
Message-ID: <1336471864.3376.1.camel@localhost.localdomain>

On Tuesday, 8 May 2012 at 20:08 +1000, Nick Coghlan wrote:
> The problem is not using boolean evaluation - it's assuming that
> boolean evaluation is defined as "x is not None". Doing so introduces
> a completely unnecessary dependency on the type of "x".

Well, the dependency is obvious when the type is already well-known.

Regards

Antoine.




From g.brandl at gmx.net  Tue May  8 12:25:48 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 08 May 2012 12:25:48 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508115626.32706f28@pitrou.net>
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<20120508115626.32706f28@pitrou.net>
Message-ID: 

On 05/08/2012 11:56 AM, Antoine Pitrou wrote:
> On Tue, 8 May 2012 17:02:04 +1000
> Nick Coghlan  wrote:
>> 
>> IMO, you've completely misdiagnosed the source of that bug. Never
>> *ever* rely on boolean evaluation when testing against None.
> 
> Nick, that's just plain silly. If we didn't want people to rely on
> boolean evaluation, we wouldn't define __bool__ at all (or we would
> make it return a random value).

Read again: he's talking about people using "bool(x)" (implicitly) when
they mean "x is not None".

Georg



From mal at egenix.com  Tue May  8 12:34:32 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 08 May 2012 12:34:32 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
Message-ID: <4FA8F6B8.3030307@egenix.com>

Nick Coghlan wrote:
> On Tue, May 8, 2012 at 3:42 PM, Steven D'Aprano  wrote:
>> On Tue, May 08, 2012 at 01:57:13PM +1000, Nick Coghlan wrote:
>>> On Tue, May 8, 2012 at 12:57 PM, Ethan Furman  wrote:
>>>> The behavior is broken.  Midnight is not False.
>>>
>>> Whereas I disagree - I see having zero hour be false as perfectly
>>> reasonable behaviour (not necessarily *useful*, but then having all
>>> time objects report as True in boolean context isn't particularly
>>> useful either).
>>
>> On the contrary, it can be very useful to have all objects of some
>> classes treated as true. For example, we can write:
>>
>> mo = re.search(a, b)
>> if mo:
>>    do_something_with(mo)
>>
>> without having to worry about the case where a valid MatchObject happens
>> to be false.
>>
>> Consider:
>>
>> t = get_job_start_time()  # returns a datetime.time object, or None
>> if t:
>>    do_something_with(t)
>>
>>
>> Oops, we have a bug. If the job happens to have started at exactly
>> midnight, it will wrongly be treated as false.
> 
> IMO, you've completely misdiagnosed the source of that bug. Never
> *ever* rely on boolean evaluation when testing against None. *Always*
> use the "is not None" trailer.

Fully agreed.

The above code is just plain wrong and often causes problems
in larger applications - besides, it's also slower in most
cases, esp. if determining the length of an object or
converting it to a numeric value is slow.

If you want to test for None return values, you need to use
"if is None" or "if is not None".

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 08 2012)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2012-07-02: EuroPython 2012, Florence, Italy               55 days to go
2012-04-26: Released mxODBC 3.1.2                 http://egenix.com/go28
2012-04-25: Released eGenix mx Base 3.2.4         http://egenix.com/go27

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From solipsis at pitrou.net  Tue May  8 12:51:48 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 8 May 2012 12:51:48 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<20120508115626.32706f28@pitrou.net> 
Message-ID: <20120508125148.030614cb@pitrou.net>

On Tue, 08 May 2012 12:25:48 +0200
Georg Brandl  wrote:
> On 05/08/2012 11:56 AM, Antoine Pitrou wrote:
> > On Tue, 8 May 2012 17:02:04 +1000
> > Nick Coghlan  wrote:
> >> 
> >> IMO, you've completely misdiagnosed the source of that bug. Never
> >> *ever* rely on boolean evaluation when testing against None.
> > 
> > Nick, that's just plain silly. If we didn't want people to rely on
> > boolean evaluation, we wouldn't define __bool__ at all (or we would
> > make it return a random value).
> 
> Read again: he's talking about people using "bool(x)" (implicitly) when
> they mean "x is not None".

That's what I read.





From solipsis at pitrou.net  Tue May  8 12:53:42 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 8 May 2012 12:53:42 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<4FA8F6B8.3030307@egenix.com>
Message-ID: <20120508125342.50a02242@pitrou.net>

On Tue, 08 May 2012 12:34:32 +0200
"M.-A. Lemburg"  wrote:
> 
> Fully agreed.
> 
> The above code is just plain wrong and often causes problems
> in larger applications - besides, it's also slower in most
> cases, esp. if determining the length of an object or
> converting it to a numeric value is slow.
> 
> If you want to test for None return values, you need to use
> "if is None" or "if is not None".

So who writes the PEP to deprecate __bool__ methods wholesale?

Regards

Antoine.




From p.f.moore at gmail.com  Tue May  8 13:12:05 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 8 May 2012 12:12:05 +0100
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508125342.50a02242@pitrou.net>
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net>
Message-ID: 

On 8 May 2012 11:53, Antoine Pitrou  wrote:
>> If you want to test for None return values, you need to use
>> "if is None" or "if is not None".
>
> So who writes the PEP to deprecate __bool__ methods wholesale?

I see no need - if you're testing for "true" return values, bool is
correct. But if you're testing for None vs an actual value, it's not.

The fact that testing for boolean true values is a lot rarer than
people think, or that it's not appropriate in certain situations,
doesn't mean it's useless.

Paul.


From solipsis at pitrou.net  Tue May  8 13:17:22 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 8 May 2012 13:17:22 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net>
	
Message-ID: <20120508131722.1353070e@pitrou.net>

On Tue, 8 May 2012 12:12:05 +0100
Paul Moore  wrote:
> On 8 May 2012 11:53, Antoine Pitrou  wrote:
> >> If you want to test for None return values, you need to use
> >> "if is None" or "if is not None".
> >
> > So who writes the PEP to deprecate __bool__ methods wholesale?
> 
> I see no need - if you're testing for "true" return values, bool is
> correct. But if you're testing for None vs an actual value, it's not.

Well, again, if that's the case, then __bool__ should be deprecated for
all types where being "true" or "false" doesn't make obvious sense.
Which is most types, actually.

Of course, this is a completely vacuous discussion. The reality is that
a __bool__ exists for all types, we are not deprecating it, and people
rely on it even though PEP 8 zealots may recommend otherwise.

Again, the question is whether time.__bool__ is sane and, if not, why
not make it saner? Lecturing people on style doesn't make Python better.

Regards

Antoine.




From mal at egenix.com  Tue May  8 13:36:52 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 08 May 2012 13:36:52 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508125342.50a02242@pitrou.net>
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net>
Message-ID: <4FA90554.9010505@egenix.com>

Antoine Pitrou wrote:
> On Tue, 08 May 2012 12:34:32 +0200
> "M.-A. Lemburg"  wrote:
>>
>> Fully agreed.
>>
>> The above code is just plain wrong and often causes problems
>> in larger applications - besides, it's also slower in most
>> cases, esp. if determining the length of an object or
>> converting it to a numeric value is slow.
>>
>> If you want to test for None return values, you need to use
>> "if is None" or "if is not None".
> 
> So who writes the PEP to deprecate __bool__ methods wholesale?

I think I lost you there. What does the above have to do with
__bool__ methods ?

Whether or not a type implements the notion of a boolean value
is really up to the specific implementation and not a question
that can be answered in general.

It's perfectly fine for a time value to mimic a boolean value
by following the same paradigm as a float "seconds since midnight"
value.

As such, reusing the __nonzero__ or __len__ slots for boolean
values is fine as well. It may not always make sense in every
conceivable way, but as long as there is a valid explanation
that can be documented, I don't see that as a problem.

If you're a purist, you'd probably disallow __bool__ methods
on non-boolean types, but this is Python, so we pass on
control to object and type implementers.

-- 
Marc-Andre Lemburg
eGenix.com



From g.brandl at gmx.net  Tue May  8 13:45:57 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 08 May 2012 13:45:57 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508131722.1353070e@pitrou.net>
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net>
	
	<20120508131722.1353070e@pitrou.net>
Message-ID: 

On 05/08/2012 01:17 PM, Antoine Pitrou wrote:
> On Tue, 8 May 2012 12:12:05 +0100
> Paul Moore  wrote:
>> On 8 May 2012 11:53, Antoine Pitrou  wrote:
>> >> If you want to test for None return values, you need to use
>> >> "if is None" or "if is not None".
>> >
>> > So who writes the PEP to deprecate __bool__ methods wholesale?
>> 
>> I see no need - if you're testing for "true" return values, bool is
>> correct. But if you're testing for None vs an actual value, it's not.
> 
> Well, again, if that's the case, then __bool__ should be deprecated for
> all types where being "true" or "false" doesn't make obvious sense.
> Which is most types, actually.

Repeating a strange argument does not make it more true.

Georg



From mwm at mired.org  Tue May  8 13:58:20 2012
From: mwm at mired.org (Mike Meyer)
Date: Tue, 08 May 2012 07:58:20 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<20120508115626.32706f28@pitrou.net>
	
Message-ID: <6fc091a5-c4bf-46e5-b9ba-8f3251ba1c1d@email.android.com>

Nick Coghlan  wrote:

>The problem is not using boolean evaluation - it's assuming that
>boolean
>evaluation is defined as "x is not None". Doing so introduces a
>completely
>unnecessary dependency on the type of "x". I'm frankly astonished that
>so
>many people seem to think it's a reasonable thing to do.

+1


-- 
Sent from my Android tablet. Please excuse my swyping.


From solipsis at pitrou.net  Tue May  8 14:00:44 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 8 May 2012 14:00:44 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net>
	
	<20120508131722.1353070e@pitrou.net> 
Message-ID: <20120508140044.3547fff9@pitrou.net>

On Tue, 08 May 2012 13:45:57 +0200
Georg Brandl  wrote:
> On 05/08/2012 01:17 PM, Antoine Pitrou wrote:
> > On Tue, 8 May 2012 12:12:05 +0100
> > Paul Moore  wrote:
> >> On 8 May 2012 11:53, Antoine Pitrou  wrote:
> >> >> If you want to test for None return values, you need to use
> >> >> "if is None" or "if is not None".
> >> >
> >> > So who writes the PEP to deprecate __bool__ methods wholesale?
> >> 
> >> I see no need - if you're testing for "true" return values, bool is
> >> correct. But if you're testing for None vs an actual value, it's not.
> > 
> > Well, again, if that's the case, then __bool__ should be deprecated for
> > all types where being "true" or "false" doesn't make obvious sense.
> > Which is most types, actually.
> 
> Repeating a strange argument does not make it more true.

Well, it does follow from what you wrote above...

Regards

Antoine.




From mwm at mired.org  Tue May  8 17:15:53 2012
From: mwm at mired.org (Mike Meyer)
Date: Tue, 8 May 2012 11:15:53 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508131722.1353070e@pitrou.net>
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net>
	
	<20120508131722.1353070e@pitrou.net>
Message-ID: <20120508111553.6c9023cd@bhuda.mired.org>

On Tue, 8 May 2012 13:17:22 +0200
Antoine Pitrou  wrote:
> On Tue, 8 May 2012 12:12:05 +0100
> Paul Moore  wrote:
> > On 8 May 2012 11:53, Antoine Pitrou  wrote:
> > >> If you want to test for None return values, you need to use
> > >> "if is None" or "if is not None".
> > > So who writes the PEP to deprecate __bool__ methods wholesale?
> > I see no need - if you're testing for "true" return values, bool is
> > correct. But if you're testing for None vs an actual value, it's not.
> Well, again, if that's the case, then __bool__ should be deprecated for
> all types where being "true" or "false" doesn't make obvious sense.
> Which is most types, actually.

Not quite, because "practicality beats purity". It should be "all
types where there aren't obviously useful values for 'true' and
'false' in a boolean context."

For container types, "not empty" provides obviously useful values.  For
numeric types, "nonzero" does that. I think that covers most of the
builtins.

Arguably, the ability to write "if not " instead of "if
empty()" isn't worth the price of the all-too-common bug of
writing "if not " when you should be writing "if
 is None" passing quietly instead of possibly throwing an
exception. But that battle is already lost (and I prefer the current
behavior anyway).

For the case at hand - datetime.time() - the current behavior isn't
obviously useful. If we were doing it from scratch, yeah, maybe it
ought to be true all the time. Or maybe we should follow your
suggestion here, and make converting a datetime.time() to bool throw
an exception. But it doesn't appear to be a problem worth the cost of
fixing.

> Of course, this is a completely vacuous discussion. The reality is that
> a __bool__ exists for all types, we are not deprecating it, and people
> rely on it even though PEP 8 zealots may recommend otherwise.

PEP 8 doesn't recommend against using __bool__. It warns about the
common python coding error of writing "if not x" instead of "if x is
not None" when x may have a value that's both false and not None. This
is a code correctness issue. Checking whether converting something to
bool yields false when you're trying to see if it's some specific
value that happens to convert to false is wrong. It's just as wrong to
write "if not x" rather than "if len(x)" to check to see if x is an
empty container if x might be None as to write "if not x" rather than
"if x is not None" to check to see if x is None. The latter is the far
more common bug in Python programs, which is why PEP 8 warns people
about it instead of the former.
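
To make the distinction above concrete, here is a minimal sketch. The
function names describe and buggy_describe are hypothetical, chosen only
for illustration. It shows how a plain truth test lumps every false
value together with None, while an identity test does not.

```python
def describe(x):
    """Separate 'no value at all' (None) from 'a value that is false'."""
    if x is None:
        return "missing"
    if not x:
        return "present but false"
    return "present and true"


def buggy_describe(x):
    # The bug discussed above: a truth test is used where the real
    # question is "is x None?", so 0, "" and [] are reported as missing.
    if not x:
        return "missing"
    return "present"


for value in (None, 0, [], "text"):
    print(repr(value), "->", describe(value), "/", buggy_describe(value))
```

The two functions only disagree on values that are false but not None,
which is exactly the case the PEP 8 recommendation is about.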

    		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From guido at python.org  Tue May  8 19:11:39 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 8 May 2012 10:11:39 -0700
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120508111553.6c9023cd@bhuda.mired.org>
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net>
	
	<20120508131722.1353070e@pitrou.net>
	<20120508111553.6c9023cd@bhuda.mired.org>
Message-ID: 

I think there's nothing to be done. It's clear that making
datetime.time() be false was a conscious decision when the datetime
module was designed (we thought about *every* aspect of the design
quite a lot -- see (*) below). At the same time knowing what I know
now about common usage I wouldn't design it this way today.

Note that the date and datetime types don't have this problem because
a zero date is invalid; and for the timedelta type having zero be
false is more useful than harmful. But for time, the use case is
marginal and the "trap" is real, even if it is due to poor coding
style. (Good API design should consider avoiding traps for poor coders
one of its goals.)

However, given that it's been a feature for so long I don't think we
can change it. Perhaps we could call it out more in the documentation
(though it's already quite prominent).
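
The contrast between the types mentioned above can be sketched as
follows. The helper names are hypothetical. The sketch deliberately
avoids calling bool() on a time object, since that is the behaviour
under dispute; it only exercises timedelta and date.

```python
from datetime import date, timedelta


def elapsed_label(delta):
    # A zero timedelta is false, which reads naturally as "nothing elapsed".
    if not delta:
        return "no time elapsed"
    return f"{delta.total_seconds():.0f}s elapsed"


def zero_date_is_invalid():
    # There is no "zero" date to be false: year 0 is rejected outright.
    try:
        date(0, 1, 1)
    except ValueError:
        return True
    return False


print(elapsed_label(timedelta(0)))
print(elapsed_label(timedelta(seconds=90)))
print(zero_date_is_invalid(), bool(date.min))
```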

(*) I trawled through some history. The original design wiki is only
accessible on the wayback machine:

http://wayback.archive.org/web/20020801000000*/http://zope.org/Members/fdrake/DateTimeWiki/FrontPage

(has many versions between 2002 and 2006). The wiki has no mention of
boolean interpretation for the time type, but the earliest docs for
the C implementation mentions Boolean context:

http://web.archive.org/web/20030119231337/http://www.python.org/dev/doc/devel/lib/datetime-time.html

-- 
--Guido van Rossum (python.org/~guido)


From tjreedy at udel.edu  Tue May  8 19:57:32 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 08 May 2012 13:57:32 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <4FA90554.9010505@egenix.com>
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net>
	<4FA90554.9010505@egenix.com>
Message-ID: 

On 5/8/2012 7:36 AM, M.-A. Lemburg wrote:

> It's perfectly fine for time value to mimic a boolean value
> by following the same paradigm as a float "seconds since midnight"
> value.

Ah, I think this is the key to the dispute as to whether midnight should 
be False or True. Is the implementation of time of day as seconds since 
midnight essential (then midnight should be False) or accidental (then 
midnight should be True like all other times)? Different discussants 
disagree on the premise and hence the conclusion.

If one first implements time-of-day as a number representing seconds 
from midnight, then bool(midnight) is bool(0) is False, like it or not. 
If one later wraps the number as a Time object, as Python did, then 
seconds from midnight and the specialness of midnight is essential for 
the new object to be a completely back-compatible drop-in replacement 
(with augmentations). Anyway, if 'from midnight' is part of the core 
concept of the class, the current behavior is correct.

If one starts with time-of-day as a concept independent of linear 
numbers, as smoothly flowing around a circle, then making any particular 
time of day (or point on the circle) special seems wrong. Indeed, time 
of day is the same as local rotation angle with respect to the sun. So 
it is as much geometric as numeric.

Abstractly, the second viewpoint seems correct. Pragmatically, however, 
civilized humans (those with clocks ;-) have standardized on local 
nominal midnight as the base point for numerically measuring time of day.

---
We can also argue the issue both ways from the viewpoint of code 
compactness.

False: Let t be a Time instance and midnight be Time(0). Then False 
midnight allows 'if t == midnight', which is needed occasionally, to be 
abbreviated 'if not t'.

True: Let t be a Time instance or None, such as might be the return from 
a function just prior to testing t. Then True midnight allows 'if t is 
None', which may be needed on such occasions, to be abbreviated 'if not t'.

While I am comfortable with and love the abbreviations for 0 and empty 
(and occasionally None), I would be disinclined, at least at present, to 
use either abbreviation for Time conditions. Typing the actual 
conditions above was as fast as thinking about getting the abbreviation 
to match correctly.

---
If the stdlib had an Elevation class, we could have the same argument 
about whether Elevation(0) should be True, like all others, or False.

-- 
Terry Jan Reedy



From greg at krypto.org  Tue May  8 20:03:17 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Tue, 8 May 2012 11:03:17 -0700
Subject: [Python-ideas] Module __getattr__ [Was: breaking out of module
	execution]
In-Reply-To: 
References: 
	
	<4F9F328D.1040003@pearwood.info>
	
Message-ID: 

On Mon, Apr 30, 2012 at 7:33 PM, Guido van Rossum  wrote:

> On Mon, Apr 30, 2012 at 5:47 PM, Steven D'Aprano 
> wrote:
> > Gregory P. Smith wrote:
> >
> >> Making modules "simply" be a class that could be subclasses rather than
> >> their own thing _would_ be nice for one particular project I've worked
> on
> >> where the project including APIs and basic implementations were open
> >> source
> >> but which allowed for site specific code to override many/most of those
> >> base implementations as a way of customizing it for your own specific
> (non
> >> open source) environment.
>
> > This makes no sense to me. What does the *licence* of a project have to
> do
> > with the library API? I mean, yes, you could do such a thing, but surely
> you
> > shouldn't. That would be like saying that the accelerator pedal should
> be on
> > the right in cars you buy outright, but on the left for cars you get on
> > hire-purchase.
>
> That's an irrelevant, surprising and unfair criticism of Greg's
> message. He just tried to give a specific example without being too
> specific.
>

heh. right. I didn't want to drag people into a project that shall not be
named because I dislike its design to begin with... I was just suggesting a
case where we _would_ have found it useful to treat a module as a class
that could be subclassed.  A better overall design pattern using classes in
the code's public APIs to start with would have prevented all such need.

I am not strongly in favor of modules being classes but it is an
interesting thing to ponder from time to time.


> > Nevertheless, I think your focus here is on the wrong thing. It seems to
> me
> > that you are jumping to an implementation, namely that modules should
> stop
> > being instances of a type and become classes, without having a clear
> idea of
> > your functional requirements.
> >
> > The functional requirements *might* be:
> >
> > "There ought to be an easy way to customize the behaviour of attribute
> > access in modules."
> >
> > Or perhaps:
> >
> > "There ought to be an easy way for one module to shadow another module
> with
> > the same name, but still inherit behaviour from the shadowed module."
> >
> > neither of which *require* modules to become classes.
> >
> > Or perhaps it is something else... it is unclear to me exactly what
> problems
> > you and Jim wish to solve, or whether they're the same kind of problem,
> > which is why I say the functional requirements are unclear.
> >
> > Changing modules from an instance of ModuleType to "a class that could
> be a
> > subclass" is surely going to break code. Somewhere, someone is relying on
> > the fact that modules are not types and you're going to break their
> > application.
> >
> >
> >
> >> Any APIs that were unfortunately defined as a
> >> module with a bunch of functions in it was a real pain to make site
> >> specific overrides for.
> >
> >
> > It shouldn't be. Just ensure the site-specific override module comes
> first
> > in the path, and "import module" will pick up the override module
> instead of
> > the standard one. This is a simple exercise in shadowing modules.
> >
> > Of course, this implies that the override module has to override
> > *everything*. There's currently no simple way for the shadowing module to
> > inherit functionality from the shadowed module. You can probably hack
> > something together, but it would be a PITA.
>
> If there is a bunch of functions and you want to replace a few of
> those, you can probably get the desired effect quite easily:
>
>  from base_module import *  # Or the specific set of functions that
> comprise the API.
>
>  def funct1(): 
>  def funct2(): 
>
> Not that I would recommend this -- it's easy to get confused if there
> are more than a very small number of functions. Also if
> base_module.funct3 were to call funct2, it wouldn't call the overridden
> version.
>
> But all attempts to view modules as classes or instances have led to
> negative results. (I'm sure I've thought about it at various times in
> the past.)
>
> I think the reason is that a module at best acts as a class where
> every method is a *static* method, but implicitly so. And we all know
> how limited static methods are. (They're basically an accident -- back
> in the Python 2.2 days when I was inventing new-style classes and
> descriptors, I meant to implement class methods but at first I didn't
> understand them and accidentally implemented static methods first.
> Then it was too late to remove them and only provide class methods.)
>

This "oops, I implemented static methods" is a wonderful bit of history! :)


>
> There is actually a hack that is occasionally used and recommended: a
> module can define a class with the desired functionality, and then at
> the end, replace itself in sys.modules with an instance of that class
> (or with the class, if you insist, but that's generally less useful).
> E.g.:
>
>  # module foo.py
>
>  import sys
>
>  class Foo:
>    def funct1(self, ): 
>    def funct2(self, ): 
>
>  sys.modules[__name__] = Foo()
>
> This works because the import machinery is actively enabling this
> hack, and as its final step pulls the actual module out of
> sys.modules, after loading it. (This is no accident. The hack was
> proposed long ago and we decided we liked it enough to support it in the
> import machinery.)
>
> You can easily override __getattr__ / __getattribute__ / __setattr__
> this way. It also makes "subclassing" the module a little easier
> (although accessing the class to be used as a base class is a little
> tricky: you'd have to use foo.__class__). But of course the kind of
> API that Greg was griping about would never be implemented this way,
> so that's fairly useless. And if you were designing a module as an
> inheritable class right from the start you're much better off just
> using a class instead of the above hack.
>
> But all in all I don't think there's a great future in stock for the
> idea of allowing modules to be "subclassed". In the vast, vast
> majority of cases it's better to clearly have a separation between
> modules, which provide no inheritance and no instantiation, and
> classes, which provide both. I think Python is better off this way
> than Java, where all you have is classes (its packages cannot contain
> anything except class definitions).
>
> --
> --Guido van Rossum (python.org/~guido)
>
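
The sys.modules replacement hack quoted above can be tried out with a
self-contained sketch like the one below. Everything named here is an
assumption made for illustration: the module name foo_hack_demo, the
"dyn_" prefix handled by __getattr__, and the SiteFoo subclass. The
module source is written to a temporary directory so that the example
does not depend on any existing file.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Source of a hypothetical module that replaces itself with an instance.
MODULE_SOURCE = textwrap.dedent(
    '''
    import sys

    class Foo:
        def funct1(self):
            return "funct1 from Foo"

        def __getattr__(self, name):
            # Reached only when normal attribute lookup fails.
            if name.startswith("dyn_"):
                return "computed " + name
            raise AttributeError(name)

    # Last statement of the module body: swap the module for an instance.
    sys.modules[__name__] = Foo()
    '''
)

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "foo_hack_demo.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # The import machinery returns whatever ended up in sys.modules.
        foo = importlib.import_module("foo_hack_demo")
    finally:
        sys.path.remove(tmp)


# "Subclassing the module" really means subclassing the instance's class.
class SiteFoo(type(foo)):
    def funct1(self):
        return "site override, was: " + super().funct1()


print(type(foo).__name__)
print(foo.funct1())
print(foo.dyn_anything)
print(SiteFoo().funct1())
```

Note that the base class has to be reached through type(foo) (or
foo.__class__), which is the awkwardness mentioned in the quoted text.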

From alexander.belopolsky at gmail.com  Tue May  8 20:30:39 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 8 May 2012 14:30:39 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net>
	
	<20120508131722.1353070e@pitrou.net>
	<20120508111553.6c9023cd@bhuda.mired.org>
	
Message-ID: 

On Tue, May 8, 2012 at 1:11 PM, Guido van Rossum  wrote:
> At the same time knowing what I know
> now about common usage I wouldn't design it this way today.

I agree with this.  Note that the latest addition to the datetime
module, the timezone type, is designed differently:

>>> bool(timezone.utc)
True

In many ways the timezone type is similar to time: it represents a
point on a 24-hour circle.  Even though the "zero" timezone is even
more special than midnight, the potential for a coding mistake testing
for truth instead of identity with None is even greater because None
is (unfortunately) a very common value for tzinfo.

I still don't think we can change bool(time(0)), though.
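
As a small sketch of why the identity test matters for tzinfo (the
helper name tz_label is hypothetical): a naive datetime carries
tzinfo=None, so code that wants to know whether a timezone is attached
should compare against None rather than rely on truthiness.

```python
from datetime import datetime, timezone

naive = datetime(2012, 5, 8, 14, 30)
aware = naive.replace(tzinfo=timezone.utc)


def tz_label(dt):
    # Identity test against None: correct whatever bool(dt.tzinfo) returns.
    if dt.tzinfo is None:
        return "naive"
    return str(dt.tzinfo)


print(tz_label(naive))
print(tz_label(aware))
print(bool(timezone.utc))
```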


From ethan at stoneleaf.us  Tue May  8 20:46:25 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 08 May 2012 11:46:25 -0700
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <4FA7DE5B.8000703@pearwood.info>	<20120507170219.266304f2@pitrou.net>			<20120507173254.6a6aee5b@pitrou.net>		<20120507180653.25a654d1@pitrou.net>		<4FA88B90.7060309@stoneleaf.us>		<20120508054249.GB3797@ando>		<4FA8F6B8.3030307@egenix.com>
	<20120508125342.50a02242@pitrou.net>	<4FA90554.9010505@egenix.com>
	
Message-ID: <4FA96A01.6040503@stoneleaf.us>

Terry Reedy wrote:
> If the stdlib had an Elevation class, we could have the same argument 
> about whether Elevation(0) should be True, like all others, or False.

I liked your explanation, Terry.

For me, it comes down to the something vs. nothing argument:  empty 
containers, the number 0 (when it represents Nothing), False, etc., are 
all instances of nothing.

I do not see midnight as a representation of nothing.

I would probably go with Something for an elevation of zero as well.

But not money.  ;)  positive is money I have, negative is money I owe, 
and zero is nothing.

~Ethan~


From mal at egenix.com  Tue May  8 21:55:15 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 08 May 2012 21:55:15 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net>
	<4FA90554.9010505@egenix.com> 
Message-ID: <4FA97A23.7010009@egenix.com>

Terry Reedy wrote:
> On 5/8/2012 7:36 AM, M.-A. Lemburg wrote:
> 
>> It's perfectly fine for time value to mimic a boolean value
>> by following the same paradigm as a float "seconds since midnight"
>> value.
> 
> Ah, I think this is the key to the dispute as to whether midnight should be False or True. Is the
> implementation of time of day as seconds since midnight essential (then midnight should be False) or
> accidental (then midnight should be True like all other times)? Different discussants disagree on
> the premise and hence the conclusion.
> 
> If one first implements time-of-day as a number representing seconds from midnight, then
> bool(midnight) is bool(0) is False, like it or not. If one later wraps the number as a Time object,
> as Python did, then seconds from midnight and the specialness of midnight is essential for the new
> object to be a completely back-compatible drop-in replacement (with augmentations). Anyway, if 'from
> midnight' is part of the core concept of the class, the current behavior is correct.
> 
> If one starts with time-of-day as a concept independent of linear numbers, as smoothly flowing
> around a circle, then making any particular time of day (or point on the circle) special seems
> wrong. Indeed, time of day is the same as local rotation angle with respect to the sun. So it is as
> much geometric as numeric.
> 
> Abstractly, the second viewpoint seems correct. Pragmatically, however, civilized humans (those with
> clocks ;-) have standardized on local nominal midnight as the base point for numerically measuring
> time of day.

I think you have to broaden that view a bit :-)

The Julian day starts at noon and in other date/time concepts, the day
starts at sunrise or sunset, so it depends on the location as well
as the day of the year (and various other astronomical corrections).
See e.g.

    http://en.wikipedia.org/wiki/Day

The whole date/time topic is full of mysteries, oddities and very
human errors and misconceptions. It also demonstrates that there's
no single right way to capture date/time.

-- 
Marc-Andre Lemburg
eGenix.com



From ben+python at benfinney.id.au  Wed May  9 02:10:18 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Wed, 09 May 2012 10:10:18 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<20120508115626.32706f28@pitrou.net> 
	<20120508125148.030614cb@pitrou.net>
Message-ID: <874nrq6wph.fsf@benfinney.id.au>

Antoine Pitrou 
writes:

> On Tue, 08 May 2012 12:25:48 +0200
> Georg Brandl  wrote:
> > On 05/08/2012 11:56 AM, Antoine Pitrou wrote:
> > > On Tue, 8 May 2012 17:02:04 +1000
> > > Nick Coghlan  wrote:
> > >> 
> > >> IMO, you've completely misdiagnosed the source of that bug. Never
> > >> *ever* rely on boolean evaluation when testing against None.
> > > 
> > > Nick, that's just plain silly. If we didn't want people to rely on
> > > boolean evaluation, we wouldn't define __bool__ at all (or we
> > > would make it return a random value).
> > 
> > Read again: he's talking about people using "bool(x)" (implicitly)
> > when they mean "x is not None".
>
> That's what I read.

Yet you mis-represent him, omitting the crucial qualifier "when testing
against None" when you quote him as saying "we don't want people to rely
on boolean evaluation". Then you call your straw man silly.

-- 
 \        "I took it easy today. I just pretty much layed around in my |
  `\        underwear all day. … Got kicked out of quite a few places, |
_o__)                              though." -Bug-Eyed Earl, _Red Meat_ |
Ben Finney



From jimjjewett at gmail.com  Wed May  9 02:53:58 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 8 May 2012 20:53:58 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <20120507180653.25a654d1@pitrou.net>
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info> <20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
Message-ID: 

On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou  wrote:

> Why do you want to reproduce it? Does midnight warrant any special
> shortcut for testing? Especially one that is confusing to many
> readers.

Why do you think that 0 represents midnight?   *Because* it is zero,
it will often be used as a default for missing data.

Python at least offers alternatives, but that doesn't mean people will
use them, and certainly doesn't mean that the data wasn't already
corrupted before python ever saw it.  And if I know the hour but not
the minute or second, I myself would generally use zeros for the
missing data even in python.

Saying that it represents midnight because it is defined that way is
true only in the same sense that evaluating to False is correct
because it is defined that way.

With a sufficiently powerful time machine, I would use 1-60 to
represent the minute/second being traversed and leave 0 for missing
data.  (And making this the right answer would probably involve going
back long before python considered the issue.)

Without such a time machine, none of the options are good enough to
justify breaking backwards compatibility.

-jJ


From stephen at xemacs.org  Wed May  9 03:42:42 2012
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 09 May 2012 10:42:42 +0900
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: <4FA96A01.6040503@stoneleaf.us>
References: <4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net>
	<4FA90554.9010505@egenix.com> 
	<4FA96A01.6040503@stoneleaf.us>
Message-ID: <87lil2ruy5.fsf@uwakimon.sk.tsukuba.ac.jp>

Ethan Furman writes:

 > But not money.  ;)  positive is money I have, negative is money I owe, 
 > and zero is nothing.

Accountants and economists take exception.  Unless you live on a
desert island with Friday, an aggregate zero is an accident, just like
a Poisson arrival at exactly midnight is an accident.

On the other hand, there are zeros that are None in accounting, but
they are *always* associated with a real zero (no sales of an item, no
production, worker absent, ...).

I have to admit I find Terry's circular reasoning[1] compelling as an
ex ante argument, but looking at the clock I discover it's ex post.
And it's really not that big a deal; if a factory function is
documented to return None, the test *should* be "x is not None", not
"bool(x)".

Footnotes: 
[1]  Sorry, Terry, I couldn't resist!



From steve at pearwood.info  Wed May  9 05:20:56 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 09 May 2012 13:20:56 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <4FA7DE5B.8000703@pearwood.info>	<20120507170219.266304f2@pitrou.net>			<20120507173254.6a6aee5b@pitrou.net>		<20120507180653.25a654d1@pitrou.net>		<4FA88B90.7060309@stoneleaf.us>		<20120508054249.GB3797@ando>
	
Message-ID: <4FA9E298.20704@pearwood.info>

Nick Coghlan wrote:
> On Tue, May 8, 2012 at 3:42 PM, Steven D'Aprano  wrote:

>> Consider:
>>
>> t = get_job_start_time()  # returns a datetime.time object, or None
>> if t:
>>    do_something_with(t)
>>
>>
>> Oops, we have a bug. If the job happens to have started at exactly
>> midnight, it will wrongly be treated as false.
> 
> IMO, you've completely misdiagnosed the source of that bug. Never
> *ever* rely on boolean evaluation when testing against None. *Always*
> use the "is not None" trailer.

and then in a follow-up post:

> The problem is not using boolean evaluation - it's assuming that boolean
> evaluation is defined as "x is not None". Doing so introduces a completely
> unnecessary dependency on the type of "x". I'm frankly astonished that so
> many people seem to think it's a reasonable thing to do.

I am perfectly aware that None is not the only falsey value, and that bool 
tests are not implemented as comparisons against None. It is unfortunate that 
this thread has been hijacked into an argument about testing objects in a bool 
context, because that's not the fundamental problem.

In my example code I intentionally assumed that time values don't have a false 
value, to show how the time values encourage buggy code. At the time I wrote 
that example I thought that the behaviour of time values in a bool context was 
undocumented, an easy mistake to make: the only documentation I have found is 
buried under the entry for time.tzinfo:

     a time object is considered to be true if and only if, after
     converting it to minutes and subtracting utcoffset() (or 0
     if that's None), the result is non-zero.

http://docs.python.org/py3k/library/datetime.html#datetime.time.tzinfo


The complicated nature of this bool context should have been a clue that it 
might have been a bad idea. Calling the false time value "unpredictable" is 
perhaps a little strong, but it is certainly surprising. If my timezone calculations are 
correct, the local time which is falsey for me is 10am. Anyone keen to defend 
having 10am local time treated as false?
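
The rule quoted from the documentation can be worked through with a
short sketch. The function old_time_truth is a hypothetical
reimplementation of the rule as quoted, not a call to bool() and not the
actual C implementation; it is written this way because later Python
releases (3.5 onwards) changed time objects to always be true. The
fixed +10:00 offset stands in for the local timezone mentioned above.

```python
from datetime import time, timedelta, timezone


def old_time_truth(t):
    """Apply the documented rule: minutes since midnight minus utcoffset()."""
    minutes = (t.hour * 60 + t.minute
               + t.second / 60 + t.microsecond / 60_000_000)
    offset = t.utcoffset()
    if offset is None:          # naive time: treat the offset as 0
        offset = timedelta(0)
    return minutes - offset / timedelta(minutes=1) != 0


tz_plus_10 = timezone(timedelta(hours=10))  # assumed local offset, UTC+10

print(old_time_truth(time(0, 0)))                      # naive midnight
print(old_time_truth(time(10, 0, tzinfo=tz_plus_10)))  # 10am at UTC+10
print(old_time_truth(time(0, 0, tzinfo=tz_plus_10)))   # local midnight
```

Under this rule the false value is midnight UTC, so for an aware time at
UTC+10 it is 10am local time that comes out false, while local midnight
comes out true.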


Also, be careful about dogmatic prohibitions like "*never* ever rely on 
boolean evaluation when testing against None". The equivalent comparison for 
re MatchObjects is unproblematic. They are explicitly documented as always 
being true, with the explicit aim of allowing boolean evaluation to work 
correctly:

"Match objects always have a boolean value of True. This lets you use a simple 
if-statement to test whether a match was found."

http://docs.python.org/py3k/library/re.html#match-objects

and indeed, there are at least five instances in the standard library that use 
the idiom

mo = re.match(blah blah)
if mo:
     ...

or similar.
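
A small runnable version of that idiom, as a sketch. The function name first_number and the pattern are made up for the example.

```python
import re

def first_number(text):
    """Hypothetical helper: return the first run of digits as an int, or None."""
    mo = re.search(r"\d+", text)
    if mo:                       # safe: match objects are documented as always true
        return int(mo.group())
    return None                  # no match found

found = first_number("room 42")
missing = first_number("no digits here")
zero = first_number("0 items")   # a real result that happens to be falsey
```

Note that a caller of this helper has to test "is not None", because 0 is a legitimate result that is false in a bool context.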


It is unfortunate that time values don't operate similarly.



-- 
Steven


From steve at pearwood.info  Wed May  9 07:16:23 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 9 May 2012 15:16:23 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: 
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
Message-ID: <20120509051622.GA8882@ando>

On Tue, May 08, 2012 at 02:21:07AM -0400, Alexander Belopolsky wrote:
> On Tue, May 8, 2012 at 1:42 AM, Steven D'Aprano  wrote:
> >> In matters of opinion, the status quo reigns.
> >
> > That's somewhat of an exaggeration. The mere existence of a single
> > dissenting opinion isn't enough to block all progress/changes.
> 
> For what it's worth, I am also against changing the status quo.
> time(0) is special: it is the smallest possible value.  If you deal
> with low resolution time values, say hourly schedules, it is not
> unreasonable to test for time(0).  For example, when estimating daily
> averages, midnight samples can be weighted by 1/2 to account for the
> uncertainty in assigning midnight to a given day.

I think this demonstrates the insidious nature of this design flaw in 
time objects. Alexander, you caught me in a mistake earlier, when I 
neglected to take tzinfo into account, and here you are doing the same 
sort of thing: you can't reliably detect midnight with a simple 
bool(timevalue) test.
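
If detecting midnight is really what is wanted, an explicit test says so directly. This is only a sketch; is_local_midnight is a hypothetical helper name, and it deliberately looks at the wall-clock fields and ignores tzinfo.

```python
from datetime import time, timedelta, timezone

def is_local_midnight(t):
    """Hypothetical helper: True when every wall-clock field is zero."""
    return (t.hour, t.minute, t.second, t.microsecond) == (0, 0, 0, 0)

east = timezone(timedelta(hours=10))   # assumed fixed UTC+10 zone

naive = is_local_midnight(time(0, 0))
aware = is_local_midnight(time(0, 0, tzinfo=east))
ten_am = is_local_midnight(time(10, 0, tzinfo=east))
```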


-- 
Steven


From steve at pearwood.info  Wed May  9 07:42:28 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 9 May 2012 15:42:28 +1000
Subject: [Python-ideas] bool(datetime.time(0, 0))
In-Reply-To: 
References: <20120507180653.25a654d1@pitrou.net>
	
	<4FA88B90.7060309@stoneleaf.us>
	
	<20120508054249.GB3797@ando>
	
	<4FA8F6B8.3030307@egenix.com> <20120508125342.50a02242@pitrou.net>
	<4FA90554.9010505@egenix.com> 
Message-ID: <20120509054227.GB8882@ando>

On Tue, May 08, 2012 at 01:57:32PM -0400, Terry Reedy wrote:
> On 5/8/2012 7:36 AM, M.-A. Lemburg wrote:
> 
> >It's perfectly fine for time value to mimic a boolean value
> >by following the same paradigm as a float "seconds since midnight"
> >value.
> 
> Ah, I think this is the key to the dispute as to whether midnight should 
> be False or True. Is the implementation of time of day as seconds since 
> midnight essential (then midnight should be False) or accidental (then 
> midnight should be True like all other times)? Different discussants 
> disagree on the premise and hence the conclusion.

If we implemented times as an Hour-Minute-Second tuple would that imply 
that midnight was True because the tuple (0, 0, 0) is not an empty 
tuple? I don't think so.


> If one first implements time-of-day as a number representing seconds 
> from midnight, then bool(midnight) is bool(0) is False, like it or not. 
> If one later wraps the number as a Time object, as Python did, then 
> seconds from midnight and the specialness of midnight is essential for 
> the new object to be a completely back-compatible drop-in replacement 
> (with augmentations). Anyway, if 'from midnight' is part of the core 
> concept of the class, the current behavior is correct.

Just because times can be implemented as (say) floats doesn't mean it is 
sensible to treat them as floats. "Square root of 3:15pm" isn't a 
meaningful concept.


> If one starts with time-of-day as a concept independent of linear 
> numbers, as smoothly flowing around a circle, then making any particular 
> time of day (or point on the circle) special seems wrong.

Precisely.

Especially when that special-time-of-day depends on the timezone.

[...]
> Abstractly, the second viewpoint seems correct. Pragmatically, however, 
> civilized humans (those with clocks ;-) have standardized on local 
> nominal midnight as the base point for numerically measuring time of day.

Even if that is correct, and I think that orthodox Jews may disagree 
that the day begins at midnight (even those with clocks), the 
datetime.time class does not fit that model. Local midnight is not 
necessarily false.



-- 
Steven


From solipsis at pitrou.net  Wed May  9 08:07:46 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 9 May 2012 08:07:46 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0))
References: <2577765.3188.1336390954337.JavaMail.geo-discussion-forums@vbq5>
	<4FA7DE5B.8000703@pearwood.info>
	<20120507170219.266304f2@pitrou.net>
	
	
	<20120507173254.6a6aee5b@pitrou.net>
	
	<20120507180653.25a654d1@pitrou.net>
	
Message-ID: <20120509080746.5fb45e3b@pitrou.net>

On Tue, 8 May 2012 20:53:58 -0400
Jim Jewett  wrote:
> On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou  wrote:
> 
> > Why do you want to reproduce it? Does midnight warrant any special
> > shortcut for testing? Especially one that is confusing to many
> > readers.
> 
> Why do you think that 0 represents midnight?   *Because* it is zero,
> it will often be used as a default for missing data.

Well, if you've decided upfront that midnight "is zero", then you may
argue it's special. But as others have shown, there's nothing obvious
about midnight being "zero", especially with timezones factored in.
For example, there are no binary operators where midnight is a "zero"
i.e. a neutral element.

Besides, we have a special value called None exactly for the purpose of
representing missing data.
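
A sketch of what that looks like in practice. The function describe_alarm and its messages are invented for the example.

```python
from datetime import time

def describe_alarm(alarm=None):
    """Hypothetical helper: None means 'no time given'; every time is valid."""
    if alarm is None:
        return "no alarm set"
    return "alarm at " + alarm.strftime("%H:%M")

unset = describe_alarm()
midnight = describe_alarm(time(0, 0))
```

With an explicit "is None" test, midnight is treated like any other time.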

Regards

Antoine.




From jimjjewett at gmail.com  Wed May  9 18:38:19 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 9 May 2012 12:38:19 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data
Message-ID: 

On Wed, May 9, 2012 at 2:07 AM, Antoine Pitrou  wrote:
> On Tue, 8 May 2012 20:53:58 -0400
> Jim Jewett  wrote:
>> On Mon, May 7, 2012 at 12:06 PM, Antoine Pitrou  wrote:

>> > ... Does midnight warrant any special ...

>> Why do you think that 0 represents midnight?   *Because* it is zero,
>> it will often be used as a default for missing data.

> Well, if you've decided upfront that midnight "is zero", then you may
> argue it's special. But as others have shown, there's nothing obvious
> about midnight being "zero", especially with timezones factored in.
> For example, there are no binary operators where midnight is a "zero"
> i.e. a neutral element.

The cyclic groups Z/n have a zero element, so *something* has to be
effectively zero; start of day is as reasonable as anything else.  Or
are you just saying that there aren't *any* meaningful binary
operators on hour-of-the-day, beyond __eq__ and __ne__?

Practicality Beats Purity suggests that at least comparisons should
work consistently, so that time-of-day can be consistently ordered,
and that requires a least element.  (You could make it noon, or  mean
sunrise, or actual sundown at a certain monument, but you do need
one.)

> Besides, we have a special value called None exactly for the purpose of
> representing missing data.

Not really at the moment, since datetime.time doesn't accept it as an
argument, and itself uses 0 for missing data.  That could *probably*
be fixed in an upwards compatible way, but you would still have to
special case how missing-data times should compare to current class
instances.

    # Should they ever be equal?
    # If not, mixing types is a problem.
    time(hour, min, sec, microsecond)  ==
    Time(hour, min, sec, microsecond)  ?

    # Should microseconds be required?
    # If not, mixing types is a problem.
    time(hour, min, sec, microsecond=0)  ==
    Time(hour, min, sec, microsecond=None)  ?

    # Should even hours be required?
    # If so, how much precision is logically required?
    time(hour=0, min=0, sec=0, microsecond=0)  ==
    Time(hour=None, min=None, sec=None, microsecond=None)  ?

    # datetime.time already skips the date.
    # Can hours be skipped too, to indicate "every hour on the hour"?
    Time(hour=None, min=0, sec=0, microsecond=0)  ?

    # Should missing data never match, match 0, or match anything
    # (the "on the hour" case)?
    time(hour=7, min=15, sec=0, microsecond=0)  ==
    Time(hour=None, min=15, sec=0, microsecond=0)  ?

-jJ


From solipsis at pitrou.net  Wed May  9 18:58:03 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 9 May 2012 18:58:03 +0200
Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data
References: 
Message-ID: <20120509185803.49487659@pitrou.net>

On Wed, 9 May 2012 12:38:19 -0400
Jim Jewett  wrote:
> Or
> are you just saying that there aren't *any* meaningful binary
> operators on hour-of-the-day, beyond __eq__ and __ne__?

There aren't indeed. __eq__ and __ne__ don't produce a time result, so
they can't be used as a basis for a group. Hence, the notion of a "zero"
which you invoked is undefined here.

> Practicality Beats Purity suggests that at least comparisons should
> work consistently, so that time-of-day can be consistently ordered,
> and that requires a least element.  (You could make it noon, or  mean
> sunrise, or actual sundown at a certain monument, but you do need
> one.)

Sure, so what? Let's say you are creating a day-of-week class with 7
possible instances, and you make them orderable. Does it mean that the
least of them (say, Monday or Sunday) should evaluate to false?

> > Besides, we have a special value called None exactly for the purpose of
> > representing missing data.
> 
> Not really at the moment, since datetime.time doesn't accept it as an
> argument, and itself uses 0 for missing data.

That's bogus. int() doesn't take None as an argument, yet None is
often used to indicate a missing integer argument in other APIs.
This fact is true for most types under the sun. You are just inventing
non-existing constraints.

Regards

Antoine.




From jeanpierreda at gmail.com  Wed May  9 19:10:32 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Wed, 9 May 2012 13:10:32 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data
In-Reply-To: 
References: 
Message-ID: 

On Wed, May 9, 2012 at 12:38 PM, Jim Jewett  wrote:
> The cyclic groups Z/n have a zero element, so *something* has to be
> effectively zero; start of day is as reasonable as anything else.  Or
> are you just saying that there aren't *any* meaningful binary
> operators on hour-of-the-day, beyond __eq__ and __ne__?

Times are not a group -- there's no addition or multiplication
operator among times. They only add against timedeltas, and timedeltas
are the ones that need a 0 in order for that to work properly in some
sense (since the result is a time).

-- Devin


From jeanpierreda at gmail.com  Wed May  9 20:27:20 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Wed, 9 May 2012 14:27:20 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data
In-Reply-To: 
References: 
	
Message-ID: 

Egh, maybe I should elaborate.

On Wed, May 9, 2012 at 1:10 PM, Devin Jeanpierre  wrote:
> Times are not a group -- there's no addition or multiplication
> operator among times. They only add against timedeltas, and timedeltas
> are the ones that need a 0 in order for that to work properly in some
> sense (since the result is a time).

In some system with addition, you generally want an "identity element"
(called 0), such that for every x, 0 + x = x.

+ is only defined with times for timedeltas, not with times and other
times. It doesn't make sense to add 3 o'clock to 5 o'clock. So if
we're talking about some 0 such that 0 + x = x, either 0 is the time
and x is the timedelta, or 0 is the timedelta and x is the time.

It doesn't make sense for the zero to be with the times, since the
result of addition with a timedelta shouldn't be a timedelta. There is
no time such that time + x = x for any timedelta x. This basically
means that there is _no time that makes sense as a "zero"_. At all.
You can pick some arbitrary one and call it zero, but it isn't zero in
an arithmetical sense.

On the other hand, there definitely is a zero timedelta: x +
timedelta(0) = x for every time x.

-- Devin


From alexander.belopolsky at gmail.com  Wed May  9 20:36:32 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 9 May 2012 14:36:32 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data
In-Reply-To: 
References: 
	
Message-ID: 

On Wed, May 9, 2012 at 1:10 PM, Devin Jeanpierre  wrote:
> Times are not a group -- there's no addition or multiplication
> operator among times. They only add against timedeltas, ...

No, they don't.  There is really very little that you can do with
detached time objects.  While they have the tzinfo, with any timezone
that observes DST it is useless.  That's the main reason I am so
skeptical about any ideas about improving the time type.  Users should
just learn to avoid using it and use full datetime instead.
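
A short sketch of the difference, using only the standard datetime module. The variable names are illustrative.

```python
from datetime import datetime, time, timedelta

moment = datetime(2012, 5, 9, 14, 36)

# timedelta(0) is an identity for datetime addition.
unchanged = moment + timedelta(0)

# A detached time object does not support addition with a timedelta.
try:
    time(14, 36) + timedelta(hours=1)
except TypeError as exc:
    error = exc
else:
    error = None

# Going through a full datetime works, then the time part can be extracted.
shifted = (datetime.combine(moment.date(), time(14, 36))
           + timedelta(hours=1)).time()
```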


From jeanpierreda at gmail.com  Wed May  9 20:52:12 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Wed, 9 May 2012 14:52:12 -0400
Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data
In-Reply-To: 
References: 
	
	
Message-ID: 

On Wed, May 9, 2012 at 2:36 PM, Alexander Belopolsky
 wrote:
> No, they don't.  There is really very little that you can do with
> detached time objects.  While they have the tzinfo, with any timezone
> that observes DST it is useless.  That's the main reason I am so
> skeptical about any ideas about improving the time type.  Users should
> just learn to avoid using it and use full datetime instead.

Blagh, that's even worse. But thanks for the correction. I admit I was
just assuming they were like datetimes.

-- Devin


From sven at marnach.net  Wed May  9 20:48:56 2012
From: sven at marnach.net (Sven Marnach)
Date: Wed, 9 May 2012 19:48:56 +0100
Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins`
Message-ID: <20120509184856.GC3133@bagheera>

With the reintroduction of u"Unicode literals", Python 3.3 will remove
one of the major stumbling stones for supporting Python 2.x and 3.3
within the same code base.  Another rather trivial stumbling stone
could be removed by adding the alias `future_builtins` for the
`builtins` module.  Currently, you need to use a try/except block,
which isn't too bad, but I think it would be nicer if a line like

    from future_builtins import map

continues to work, just like __future__ imports continue to work.  I
think the above actually *is* a kind of __future__ import which just
happens to be in a regular module because it doesn't need any special
compiler support.
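
For reference, the try/except block mentioned above would look roughly like this (a sketch; it assumes nothing else named future_builtins is installed on Python 3):

```python
try:
    # Python 2.6/2.7: replace the list-returning map with the iterator version.
    from future_builtins import map
except ImportError:
    # Python 3: the module is gone and the builtin map is already lazy.
    pass

doubled = list(map(lambda n: n * 2, [1, 2, 3]))
```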

I know a few module names changed and some modules have been
reorganised to packages, so you will still need try/except blocks for
other imports.  However, I think `future_builtins` is special because
its sole raison d'être is forward-compatibility and because of the
analogy with `__future__`.

Cheers,
    Sven


From mikegraham at gmail.com  Wed May  9 22:26:00 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Wed, 9 May 2012 16:26:00 -0400
Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins`
In-Reply-To: <20120509184856.GC3133@bagheera>
References: <20120509184856.GC3133@bagheera>
Message-ID: 

On Wed, May 9, 2012 at 2:48 PM, Sven Marnach  wrote:

> With the reintroduction of u"Unicode literals", Python 3.3 will remove
> one of the major stumbling stones for supporting Python 2.x and 3.3
> within the same code base.  Another rather trivial stumbling stone
> could be removed by adding the alias `future_builtins` for the
> `builtins` module.  Currently, you need to use a try/except block,
> which isn't too bad, but I think it would be nicer if a line like
>
>    from future_builtins import map
>
> continues to work, just like __future__ imports continue to work.  I
> think the above actually *is* a kind of __future__ import which just
> happens to be in a regular module because it doesn't need any special
> compiler support.
>
> I know a few module names changed and some modules have been
> reorganised to packages, so you will still need try/except blocks for
> other imports.  However, I think `future_builtins` is special because
> its sole raison d'être is forward-compatibility and because of the
> analogy with `__future__`.
>
> Cheers,
>    Sven


Sounds like it will do more good than harm.

+1

Mike

From jkbbwr at gmail.com  Wed May  9 22:28:31 2012
From: jkbbwr at gmail.com (Jakob Bowyer)
Date: Wed, 9 May 2012 21:28:31 +0100
Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins`
In-Reply-To: <20120509184856.GC3133@bagheera>
References: <20120509184856.GC3133@bagheera>
Message-ID: 

For naming, why not call it `from __future__.builtins import map`?

On Wed, May 9, 2012 at 7:48 PM, Sven Marnach  wrote:
> With the reintroduction of u"Unicode literals", Python 3.3 will remove
> one of the major stumbling stones for supporting Python 2.x and 3.3
> within the same code base.  Another rather trivial stumbling stone
> could be removed by adding the alias `future_builtins` for the
> `builtins` module.  Currently, you need to use a try/except block,
> which isn't too bad, but I think it would be nicer if a line like
>
>    from future_builtins import map
>
> continues to work, just like __future__ imports continue to work.  I
> think the above actually *is* a kind of __future__ import which just
> happens to be in a regular module because it doesn't need any special
> compiler support.
>
> I know a few module names changed and some modules have been
> reorganised to packages, so you will still need try/except blocks for
> other imports.  However, I think `future_builtins` is special because
> its sole raison d'être is forward-compatibility and because of the
> analogy with `__future__`.
>
> Cheers,
>    Sven


From mikegraham at gmail.com  Wed May  9 23:17:33 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Wed, 9 May 2012 17:17:33 -0400
Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins`
In-Reply-To: 
References: <20120509184856.GC3133@bagheera>
	
Message-ID: 

On Wed, May 9, 2012 at 4:28 PM, Jakob Bowyer  wrote:

> For naming, why not call it `from __future__.builtins import map`?
>

Because future_builtins already exists.

Mike

From tjreedy at udel.edu  Wed May  9 23:59:03 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 09 May 2012 17:59:03 -0400
Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins`
In-Reply-To: <20120509184856.GC3133@bagheera>
References: <20120509184856.GC3133@bagheera>
Message-ID: 

On 5/9/2012 2:48 PM, Sven Marnach wrote:
> With the reintroduction of u"Unicode literals", Python 3.3 will remove
> one of the major stumbling stones for supporting Python 2.x and 3.3
> within the same code base.

The justification for that is that some people need *three* types of 
fast string/byte literals:
1. things that should be bytes in both Py 2 and Py 3;
2. things that should be bytes in Py 2 and unicode in Py 3;
3. things that should be unicode in both Py 2 and Py 3;
and that there is no way to accomplish that with imports.

> Another rather trivial stumbling stone
> could be removed by adding the alias `future_builtins` for the
> `builtins` module.  Currently, you need to use a try/except block,
> which isn't too bad, but I think it would be nicer if a line like
>      from future_builtins import map
> continues to work, just like __future__ imports continue to work.

This proposal, as admitted above and below, does not have the 
justification of the u prefix reversion. By bloating Python 3 with 
obsolete stuff, it would make Python 3 less nice. I was dubious about 
the u prefix addition, because I anticipated that additional, less 
justified, reversion proposals would follow;-).

A strong -1

Every deprecation and removal or change of a name introduces a stumbling 
block that could be 'removed' by keeping the old name. So we do not 
remove things casually. This one was deprecated on introduction, so 
there is no surprise. I do not see any particular reason to special case 
it, and I notice that you had the sense to not propose that all 
changed/deleted module names be duplicated;-).

I do not know whether this: "The 2to3 tool that ports Python 2 code to 
Python 3 will recognize this usage and leave the new builtins alone" is 
because 2to3 special-cases imports from future_builtins or because it 
always leaves explicitly imported names alone, even if they duplicate a 
built-in name. But I don't think it matters for a single code base, even 
if you do use 2to3 to help write that.

In any case, if you do not like how you have to directly use 
future_builtins with a single code base, wrap it. Install the 
work-a-like wrapper module in either site-packages or include one in 
your application package. In either case, use whatever name you prefer.

app/x.py:
from app.builtins import map  # or *, or whatever.

app/builtins.py:
try:
   from future_builtins import *  # Py2.6/2.7
except ImportError:  # Py3: the builtin is already the iterator version
   map = map
   

This will work with all versions of Python 3.

> I know a few module names changed and some modules have been
> reorganised to packages, so you will still need try/except blocks for
> other imports.

If you really dislike conditional imports, you could wrap them all in an 
application 'stdlib' module that hid the version details. Is something 
like this really not part of existing compatibility packages?

-- 
Terry Jan Reedy



From ncoghlan at gmail.com  Thu May 10 00:04:38 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 10 May 2012 08:04:38 +1000
Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins`
In-Reply-To: <20120509184856.GC3133@bagheera>
References: <20120509184856.GC3133@bagheera>
Message-ID: 

No, because it is trivial to do the following during application startup
(with appropriate version checks or try blocks):

import sys, builtins
sys.modules["future_builtins"] = builtins

Or, use the six package instead.
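
Spelled out with the try block, the aliasing above might look like this. It is a sketch of the idea rather than a recommended recipe, and it has to run before any module tries the import.

```python
import sys

try:
    import future_builtins          # Python 2.6/2.7: the real module exists
except ImportError:
    import builtins                 # Python 3
    # Register the alias so later imports of future_builtins succeed.
    sys.modules["future_builtins"] = builtins

# From here on, the Python 2 spelling works on both versions.
from future_builtins import map, zip

pairs = list(zip("ab", map(str.upper, "ab")))
```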

--
Sent from my phone, thus the relative brevity :)

From greg.ewing at canterbury.ac.nz  Thu May 10 02:18:45 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 10 May 2012 12:18:45 +1200
Subject: [Python-ideas] bool(datetime.time(0, 0)) vs missing data
In-Reply-To: 
References: 
Message-ID: <4FAB0965.90009@canterbury.ac.nz>

Jim Jewett wrote:

> The cyclic groups Z/n have a zero element, so *something* has to be
> effectively zero;

But times of day are not a cyclic group. Time of day
*differences* are, but not the times themselves.

-- 
Greg


From charlesw123456 at gmail.com  Sat May 12 00:27:03 2012
From: charlesw123456 at gmail.com (li wang)
Date: Sat, 12 May 2012 06:27:03 +0800
Subject: [Python-ideas] I have an encrypted python module format: .pye
Message-ID: 

hi all:

I want to use Python in my product because I have liked and been familiar
with Python for many years, but I won't let the customer read and modify
my code. So the best way is to encrypt my module from .py to .pye.

Python currently writes compiled byte code to .pyc or .pyo when a .py is
imported; I have written a patch to add .pye support for encrypted byte
code.

When a .pye is imported, Python will check the environment variable
PYTHONENCRYPT. If this environment variable is defined with a non-blank
value, the value is used to generate the AES key and CBC initialization
vector which will be used to encrypt the .py and decrypt the .pye.

It works for me now; is the Python community interested in it?
I believe this feature can help Python be used in
business use cases.

Thanks greatly.

Charles Wang   May/12, 2012.


From mwm at mired.org  Sat May 12 00:39:07 2012
From: mwm at mired.org (Mike Meyer)
Date: Fri, 11 May 2012 18:39:07 -0400
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
Message-ID: <20120511183907.1c256ed8@bhuda.mired.org>

On Sat, 12 May 2012 06:27:03 +0800
li wang  wrote:
> When a .pye is imported, Python will check the environment variable
> PYTHONENCRYPT. If this environment variable is defined with a non-blank
> value, the value is used to generate the AES key and CBC initialization
> vector which will be used to encrypt the .py and decrypt the .pye.

And what prevents the customer from doing that themselves in order to
read the source?

> It works for me now; is the Python community interested in it?
> I believe this feature can help Python be used in
> business use cases.

While the ability to hide code is a recurring request, it really
doesn't get a lot of support. The problem is that you have to have the
plain text of the code available on the customers machine in order to
run it. So everything they need to know to decrypt it has to be on the
machine, meaning you're relying on obscuring some part of that
information to keep them from decrypting it outside of the execution
environment.  Security through obscurity is a bad idea, and never
really works for very long.

The recommended solution is to package your software so that reading
the source isn't really a requirement. One alternative is to ship both
a Python executable and .pyo files without the .py files. I believe
there's even a tool for windows that bundles all of that up into a
.exe file. This is really just more obscurity, though. It's not like
extracting the .pyo files from the .exe is impossible, and turning
.pyo files back into python code is straightforward.
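
A rough illustration of why bytecode-only distribution hides so little. The source string and the function name secret are invented for the example; the round trip through marshal stands in for reading the payload of a .pyc/.pyo file, which stores a marshalled code object after a short header.

```python
import marshal

source = "def secret(x):\n    return x * 42\n"

# Compile to a code object and serialise it, as the import system does.
code = compile(source, "<hidden>", "exec")
blob = marshal.dumps(code)

# Anyone holding the bytes can load the code object back...
recovered = marshal.loads(blob)

# ...list the functions it defines...
function_names = [const.co_name for const in recovered.co_consts
                  if hasattr(const, "co_name")]

# ...and run it.
namespace = {}
exec(recovered, namespace)
result = namespace["secret"](2)
```

The dis module or a decompiler can then be pointed at the recovered code object.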

The better approach is to refactor the critical code into a web
service, and sell the users a client and an account. Or give away the
client and just sell the account.

       		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From alexander.belopolsky at gmail.com  Sat May 12 00:43:41 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri, 11 May 2012 18:43:41 -0400
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
Message-ID: <1BD6DC17-D858-48CF-AA75-461F6E6FC655@gmail.com>





On May 11, 2012, at 6:27 PM, li wang  wrote:

> I won't let the customer read and modify
> my code. 

What you describe sounds impossible: how can your customer run your code without an encryption key? If you deliver the key, how can you prevent the customer from reading your code?  Preventing modification is feasible with various signed code schemes, but software DRM can never work. 

From guido at python.org  Sat May 12 01:02:55 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 11 May 2012 16:02:55 -0700
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: <1BD6DC17-D858-48CF-AA75-461F6E6FC655@gmail.com>
References: 
	<1BD6DC17-D858-48CF-AA75-461F6E6FC655@gmail.com>
Message-ID: 

It is impossible in the same way that it is impossible to lock the
front door of your house.

The Dropbox client for most major OS'es is written in Python and they
use a similar technique. They are very happy with it.

--Guido

On Fri, May 11, 2012 at 3:43 PM, Alexander Belopolsky
 wrote:
>
>
>
>
> On May 11, 2012, at 6:27 PM, li wang  wrote:
>
>> I won't let the customer read and modify
>> my code.
>
> What you describe sounds impossible: how can your customer run your code without an encryption key? If you deliver the key, how can you prevent the customer from reading your code?  Preventing modification is feasible with various signed code schemes, but software DRM can never work.



-- 
--Guido van Rossum (python.org/~guido)


From grosser.meister.morti at gmx.net  Sat May 12 01:14:24 2012
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Sat, 12 May 2012 01:14:24 +0200
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
	<1BD6DC17-D858-48CF-AA75-461F6E6FC655@gmail.com>
	
Message-ID: <4FAD9D50.9030209@gmx.net>

Well, a quick Google search found this:
http://itooktheredpill.dyndns.org/2012/dropbox-decrypt/

So their encryption is pretty useless. The difference from breaking a door lock is that breaking a 
lock requires some effort each time you do it. Breaking the encryption of such code only requires a 
one-time effort by someone interested in cracking such things (provided he/she then publishes 
his/her findings, which they often do).

On 05/12/2012 01:02 AM, Guido van Rossum wrote:
> It is impossible in the same way that it is impossible to lock the
> front door of your house.
>
> The Dropbox client for most major OS'es is written in Python and they
> use a similar technique. They are very happy with it.
>
> --Guido
>
> On Fri, May 11, 2012 at 3:43 PM, Alexander Belopolsky
>   wrote:
>>
>>
>>
>>
>> On May 11, 2012, at 6:27 PM, li wang  wrote:
>>
>>> I won't let the customer read and modify
>>> my code.
>>
>> What you describe sounds impossible: how can your customer run your code without an encryption key? If you deliver the key, how can you prevent the customer from reading your code?  Preventing modification is feasible with various signed code schemes, but software DRM can never work.
>
>
>



From techtonik at gmail.com  Sat May 12 10:21:17 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Sat, 12 May 2012 11:21:17 +0300
Subject: [Python-ideas] [...].join(sep)
Message-ID: 

I am certain this was proposed many times, but still - why is it rejected?

"real man don't use spaces".split().join('+').upper()
    instead of
'+'.join("real man don't use spaces".split()).upper()


The class purity (not being dependent on objects of another class) is
not an argument here:
    string.join() produces list, why list.join() couldn't produce strings?

The impedance mismatch could be one, but it is a pain already and
string.join() doesn't help:
    that means you still get exception when trying to join lists with
no strings inside


Can practicality still beat purity in this case?


From phd at phdru.name  Sat May 12 10:37:08 2012
From: phd at phdru.name (Oleg Broytman)
Date: Sat, 12 May 2012 12:37:08 +0400
Subject: [Python-ideas] [...].join(sep)
In-Reply-To: 
References: 
Message-ID: <20120512083708.GA3901@iskra.aviel.ru>

On Sat, May 12, 2012 at 11:21:17AM +0300, anatoly techtonik  wrote:
> I am certain this was proposed many times

   Thousands.

>     string.join() produces list, why list.join() couldn't produce strings?

   string.split() produces list.

   There is no list.join() because list is only one of many containers.
Should tuple have its own .join() method? What about other containers?
iterables? generators?
   string.join() can accept any iterable, not only a list. That's the
explanation why it's preferred.
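
   A quick sketch of that point: the same separator method handles a list, a tuple and a generator, and non-string items only need an explicit str() on the way in.

```python
words = "real man don't use spaces".split()

from_list = "+".join(words)
from_tuple = "+".join(tuple(words))
from_generator = "+".join(w.upper() for w in words)

# Non-string items must be converted first, whichever object owns join().
from_numbers = "+".join(str(n) for n in [1, 2, 3])
```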

>     that means you still get exception when trying to join lists with
> no strings inside

   In what way do you expect list.join(string) would help?

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.


From techtonik at gmail.com  Sat May 12 10:59:03 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Sat, 12 May 2012 11:59:03 +0300
Subject: [Python-ideas] hexdump
Message-ID: 

Just an idea for a usability fix for Python 3:
a hexdump module (a function or bytes method would be better) as a simple,
easy and intuitive way to dump binary data when writing programs in
Python.

hexdump(bytes)   - produce human readable dump of binary data,
byte-by-byte representation, separated by space, 16-byte rows


Rationale:
1. Debug.
    Generic binary data can't be output to console. A separate helper
is needed to print, log or store its value in human readable format in
database. This takes time.
2. Usability.
    binascii is ugly: name is not intuitive any more, there are a lot
of functions, and it is not clear how it relates to unicode.
3. Serialization.
    It is convenient to have a format that can be displayed in a text
editor. Simple tools encourage people to use them.

Practical example:
>>> print(b)
? ? ? ?? ?? ? ?? ?? ?
 ?  ? ?
>>> b
'\xe6\xb0\x08\x04\xe7\x9e\x08\x04\xe7\xbc\x08\x04\xe7\xd5\x08\x04\xe7\xe4\x08\x04\xe6\xb0\x08\x04\xe7\xf0\x08\x04\xe7\xff\x08\x04\xe8\x0b\x08\x04\xe8\x1a\x08\x04\xe6\xb0\x08\x04\xe6\xb0\x08\x04'
>>> print(binascii.hexlify(b))
e6b00804e79e0804e7bc0804e7d50804e7e40804e6b00804e7f00804e7ff0804e80b0804e81a0804e6b00804e6b00804
>>>
>>> data = hexdump(b)
>>> print(data)
E6 B0 08 04 E7 9E 08 04 E7 BC 08 04 E7 D5 08 04
E7 E4 08 04 E6 B0 08 04 E7 F0 08 04 E7 FF 08 04
E8 0B 08 04 E8 1A 08 04 E6 B0 08 04 E6 B0 08 04
>>>
>>> # achieving the same output with binascii is overcomplicated
>>> data_lines = [binascii.hexlify(b)[i:min(i+32, len(binascii.hexlify(b)))] for i in xrange(0, len(binascii.hexlify(b)), 32)]
>>> data_lines = [' '.join(l[i:min(i+2, len(l))] for i in xrange(0, len(l), 2)).upper() for l in data_lines]
>>> print('\n'.join(data_lines))
E6 B0 08 04 E7 9E 08 04 E7 BC 08 04 E7 D5 08 04
E7 E4 08 04 E6 B0 08 04 E7 F0 08 04 E7 FF 08 04
E8 0B 08 04 E8 1A 08 04 E6 B0 08 04 E6 B0 08 04

On the other side, getting rather useless binascii output from
hexdump() is quite trivial:
>>> data.replace(' ','').replace('\n','').lower()
'e6b00804e79e0804e7bc0804e7d50804e7e40804e6b00804e7f00804e7ff0804e80b0804e81a0804e6b00804e6b00804'

But more practical, for example, would be counting offset from hexdump:
>>> print( ''.join( '%05x: %s\n' % (i*16,l) for i,l in enumerate(hexdump(b).split('\n'))))

Etc.

Conclusion:
By providing better building blocks at a basic level, Python will become
a better tool for more useful tasks.


References:
[1] http://stackoverflow.com/questions/2340319/python-3-1-1-string-to-hex
[2] http://en.wikipedia.org/wiki/Hex_dump

--
anatoly t.


From phd at phdru.name  Sat May 12 11:15:43 2012
From: phd at phdru.name (Oleg Broytman)
Date: Sat, 12 May 2012 13:15:43 +0400
Subject: [Python-ideas] hexdump
In-Reply-To: 
References: 
Message-ID: <20120512091543.GA5284@iskra.aviel.ru>

On Sat, May 12, 2012 at 11:59:03AM +0300, anatoly techtonik  wrote:
> Just an idea of usability fix for Python 3.
> hexdump module (function or bytes method is better) as simple, easy
> and intuitive way for dumping binary data when writing programs in
> Python.

   Well, you know, the way to add such modules to Python is via
Cheeseshop.

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.


From simon.sapin at kozea.fr  Sat May 12 10:34:26 2012
From: simon.sapin at kozea.fr (Simon Sapin)
Date: Sat, 12 May 2012 10:34:26 +0200
Subject: [Python-ideas] [...].join(sep)
In-Reply-To: 
References: 
Message-ID: <4FAE2092.70309@kozea.fr>

On 12/05/2012 10:21, anatoly techtonik wrote:
> I am certain this was proposed many times, but still - why it is rejected?
>
> "real man don't use spaces".split().join('+').upper()
>      instead of
> '+'.join("real man don't use spaces".split()).upper()
>
>
> The class purity (not being dependent from objects of other class) is
> not an argument here:
>      string.join() produces list, why list.join() couldn't produce strings?
>
> The impedance mismatch can be, but it is a pain already and
> string.join() doesn't help:
>      that means you still get exception when trying to join lists with
> no strings inside
>
>
> Can practicality still beat purity in this case?

Hi,

I'm not sure what you mean by "class purity", but the argument against 
this is practical: list.join would work but we want to join iterables, 
not just lists.

bytes.join and str.join accept any iterable (including user-defined 
ones), while not every iterable would have a join method.

Having the burden of defining join on user-defined string-like types 
(not very common) is better than on user-defined iterables (more 
common). Also, a "string-like" already needs many methods while __iter__ 
is enough to make an iterable.
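
To make the asymmetry concrete, here is a minimal sketch with a hypothetical user-defined iterable (the Countdown class is invented for this example). It defines nothing but __iter__, yet both str.join and bytes.join accept it.

```python
class Countdown:
    """Hypothetical iterable: defines __iter__ and nothing else."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Yield the numbers start, start-1, ..., 1 as strings.
        return (str(n) for n in range(self.start, 0, -1))


as_text = ", ".join(Countdown(3))
as_bytes = b"-".join(s.encode("ascii") for s in Countdown(3))
print(as_text)
print(as_bytes)
```

With a join method on the container side, Countdown and every other iterable would each have to grow their own.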

-- 
Simon Sapin


From timothy.c.delaney at gmail.com  Sat May 12 15:11:00 2012
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Sat, 12 May 2012 23:11:00 +1000
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
Message-ID: 

On 12 May 2012 08:27, li wang  wrote:

> When a .pye is imported, python will check the environment variable
> PYTHONENCRYPT, if this environment variable is defined with non-blank
> value, the value is used to generate AES key and CBC initialize vector
> which will be used to encrypt .py and decrypt .pye.
>

As others have noted, this is essentially useless for protecting your code.
How do you set that environment variable on your customer's system, without
giving them the key they need?

You can erect a somewhat higher barrier by using Pyrex or Cython to compile
your modules to .pyd/.so. It's still quite possible to extract your logic
and/or patch around things, but it's a little harder.

The only reasonably secure method (again, as noted by others) is to not
have your code on the client machine e.g. using a web service for the
critical logic.

Tim Delaney

From g.rodola at gmail.com  Sat May 12 16:29:35 2012
From: g.rodola at gmail.com (Giampaolo Rodolà)
Date: Sat, 12 May 2012 16:29:35 +0200
Subject: [Python-ideas] Move tarfile.filemode() into stat module
Message-ID: 

http://hg.python.org/cpython/file/9d9495fabeb9/Lib/tarfile.py#l304
I discovered this undocumented function by accident several years
ago and have reused it a couple of times since then.
I think that leaving it hidden inside the tarfile module is unfortunate.
What about moving it into the stat module and documenting it?

Regards,

--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/


From solipsis at pitrou.net  Sat May 12 16:41:10 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 12 May 2012 16:41:10 +0200
Subject: [Python-ideas] Move tarfile.filemode() into stat module
References: 
Message-ID: <20120512164110.27316aec@pitrou.net>

On Sat, 12 May 2012 16:29:35 +0200
Giampaolo Rodolà
wrote:
> http://hg.python.org/cpython/file/9d9495fabeb9/Lib/tarfile.py#l304
> I discovered this undocumented function by accident different years
> ago and reused it a couple of times since then.
> I think that leaving it hidden inside tarfile module is unfortunate.
> What about moving it into stat module and document it?

I don't know which of stat or shutil would be the better recipient, but
it's a good idea anyway.

Regards

Antoine.




From g.rodola at gmail.com  Sat May 12 17:23:48 2012
From: g.rodola at gmail.com (Giampaolo Rodolà)
Date: Sat, 12 May 2012 17:23:48 +0200
Subject: [Python-ideas] Move tarfile.filemode() into stat module
In-Reply-To: <20120512164110.27316aec@pitrou.net>
References: 
	<20120512164110.27316aec@pitrou.net>
Message-ID: 

2012/5/12 Antoine Pitrou :
> On Sat, 12 May 2012 16:29:35 +0200
> Giampaolo Rodolà
> wrote:
>> http://hg.python.org/cpython/file/9d9495fabeb9/Lib/tarfile.py#l304
>> I discovered this undocumented function by accident different years
>> ago and reused it a couple of times since then.
>> I think that leaving it hidden inside tarfile module is unfortunate.
>> What about moving it into stat module and document it?
>
> I don't know which of stat or shutil would be the better recipient, but
> it's a good idea anyway.

Hmm... right. It's controversial.
On one hand stat module looks better because the "mode" concept is
scattered all over the place, on the other hand this is a perfect
example of file-related "utility" function.
Let's wait in order to collect some bikeshedding then. =)


--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/


From tjreedy at udel.edu  Sat May 12 18:18:33 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 12 May 2012 12:18:33 -0400
Subject: [Python-ideas] hexdump
In-Reply-To: 
References: 
Message-ID: 

On 5/12/2012 4:59 AM, anatoly techtonik wrote:
> Just an idea of usability fix for Python 3.
> hexdump module (function or bytes method is better) as simple, easy
> and intuitive way for dumping binary data when writing programs in
> Python.
>
> hexdump(bytes)   - produce human readable dump of binary data,
> byte-by-byte representation, separated by space, 16-byte rows

Hexdump, as you propose it, does three things. In each case, it fixes a 
parameter that could reasonably have a different value.

1. Splits the hex characters into groups of two characters, each
representing one byte. For some uses, larger chunks would be more useful.

2. Uppercases the alpha hex characters. This is a holdover from the
ancient all-uppercase world, where there was no choice. While it may
make the block visually more 'even' and 'aesthetic' when not actually
being read, it makes it harder to tell the difference between a 0-9
digit and an alpha digit. B and 8 become very similar. There is
justification for binascii.hexlify using lowercase.

3. Group the hex-represented units into lines of 16 each. This is only 
useful when the bytes come from memory with hex addresses, when the 
point is to determine the specific bytes at specific addresses. For 
displaying decimal-length byte strings, 25 bytes per line would be better.

What it does not do.

4. Break lines into blocks. One might want to break up multiple lines of 
25 into blocks of four lines each.

5. Label the rows and column either with hex or decimal labels.

6. Add 'dotted ascii' translation to reveal embedded ascii strings.

Output: choices are an iterator of lines, a list of lines, and a string 
with embedded newlines. The second and third are easily derived from the 
first, so I propose the first as the best choice. An iterator can also be
used to write to a file.
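
A rough sketch of the iterator-of-lines idea, with the three fixed choices turned into parameters. The name iter_hexdump and the parameter names group, per_line and upper are made up for illustration; none of this is an existing stdlib API.

```python
def iter_hexdump(data, group=1, per_line=16, upper=False):
    """Yield hex text lines: per_line units per line, group bytes per unit.

    Hypothetical helper; name and parameters are illustrative only.
    """
    byte_format = "{:02X}" if upper else "{:02x}"
    row_bytes = group * per_line
    for start in range(0, len(data), row_bytes):
        row = data[start:start + row_bytes]
        # Iterating over bytes yields ints in Python 3.
        units = [
            "".join(byte_format.format(byte) for byte in row[i:i + group])
            for i in range(0, len(row), group)
        ]
        yield " ".join(units)


sample = b"\xe6\xb0\x08\x04\xe7\x9e\x08\x04"
for line in iter_hexdump(sample, group=2, per_line=2):
    print(line)
```

The list and string forms then follow from the iterator: list(iter_hexdump(data)) and "\n".join(iter_hexdump(data)).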

A flexible module would be a good addition to pypi if not there already. 
Let's see...

hexencoder 1.0
hex encode decode and compare
This project offers 3 basic tools for manipulating binary files: 1) 
flexible hexdump
Home Page: http://sourceforge.net/projects/hexencoder

I did not look to see how flexible is 'flexible', but there it is.

> Rationale:
> 1. Debug.
>      Generic binary data can't be output to console.

That depends on the console. Old IBM PCs had a character for every byte. 
That was meant for line-drawing, accents, and symbols, but could also be 
used for binary dumps. I believe there are Windows codepages that will 
do similar. Any bytes can be decoded as latin-1 and then printed.
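
A quick check of the latin-1 claim: that codec maps each byte value 0-255 to the code point with the same number, so decoding cannot fail and round-trips exactly. Whether the result displays usefully still depends on the console, which is why the snippet prints a repr.

```python
data = bytes(range(256))        # every possible byte value
text = data.decode("latin-1")   # never raises for any bytes input

# The mapping is one-to-one, so encoding gives back the original bytes.
round_tripped = text.encode("latin-1")
print(round_tripped == data)
print(repr(text[0x41:0x47]))
```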

> A separate helper
> is needed to print, log or store its value in human readable format in
> database. This takes time.

A custom helper gives custom output.

> 2. Usability.
>      binascii is ugly: name is not intuitive any more, there are a lot
> of functions, and it is not clear how it relates to unicode.

Even if there are lots of functions, one might be added.
What does 'it' refer to? hexdump or binascii? Both are about binary 
bytes and not about unicode characters, so neither relate to abstract 
unicode. Encoded unicode characters are binary data like any other, 
though if the encoding is utf-16 or utf-32, one would want 2 or 4 bytes 
dumped together, as I suggested above.

-- 
Terry Jan Reedy



From flub at devork.be  Sat May 12 18:20:40 2012
From: flub at devork.be (Floris Bruynooghe)
Date: Sat, 12 May 2012 17:20:40 +0100
Subject: [Python-ideas] hexdump
In-Reply-To: 
References: 
Message-ID: 

On 12 May 2012 09:59, anatoly techtonik  wrote:
> hexdump(bytes)   - produce human readable dump of binary data,

+1 on this basic function, that would be very nice in the stdlib.  Now
I always need to go and dig up my own function from somewhere.

A certain deal of bikeshedding would be required on the function
signature however, I'd go with something like:

hexdump(data, rowsize=16, offsets=True, ascii=True)

Where rowsize is the number of bytes on one row, offsets controls
showing the byte number (in hex) of the first byte of each row and
ascii controls showing the 7-bit printable characters in a right hand
column.
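
One possible reading of that signature, as a sketch only. The column layout (8-digit hex offset with a colon, lowercase hex, '.' for non-printable bytes) is my assumption; the message does not specify it. Note that the parameter name ascii, taken from the proposed signature, shadows the built-in ascii() inside the function.

```python
def hexdump(data, rowsize=16, offsets=True, ascii=True):
    """Return a multi-line hex dump of data (layout details are assumed)."""
    lines = []
    for start in range(0, len(data), rowsize):
        row = data[start:start + rowsize]
        parts = []
        if offsets:
            # Offset of the first byte of the row, in hex.
            parts.append("{:08x}:".format(start))
        hex_column = " ".join("{:02x}".format(byte) for byte in row)
        if ascii:
            # Pad short rows so the text column stays aligned.
            hex_column = hex_column.ljust(rowsize * 3 - 1)
        parts.append(hex_column)
        if ascii:
            # 7-bit printable characters as-is, everything else as '.'.
            parts.append("".join(
                chr(byte) if 32 <= byte < 127 else "." for byte in row))
        lines.append(" ".join(parts))
    return "\n".join(lines)


print(hexdump(b"Python-ideas\x00\x01\x02\xff hexdump", rowsize=8))
```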

This would cover my needs, I'm sure other people will come up with
more must-haves.

Regards,
Floris

-- 
Debian GNU/Linux -- The Power of Freedom
www.debian.org | www.gnu.org | www.kernel.org


From jkbbwr at gmail.com  Sat May 12 18:44:13 2012
From: jkbbwr at gmail.com (Jakob Bowyer)
Date: Sat, 12 May 2012 17:44:13 +0100
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
	
Message-ID: 

http://chargen.matasano.com/chargen/2009/7/22/if-youre-typing-the-letters-a-e-s-into-your-code-youre-doing.html

On Sat, May 12, 2012 at 2:11 PM, Tim Delaney
 wrote:
> On 12 May 2012 08:27, li wang  wrote:
>>
>> When a .pye is imported, python will check the environment variable
>> PYTHONENCRYPT, if this environment variable is defined with non-blank
>> value, the value is used to generate AES key and CBC initialize vector
>> which will be used to encrypt .py and decrypt .pye.
>
>
> As others have noted, this is essentially useless for protecting your code.
> How do you set that environment variable on your customer's system, without
> giving them the key they need?
>
> You can erect a somewhat higher barrier by using Pyrex or Cython to compile
> your modules to .pyd/.so. It's still quite possible to extract your logic
> and/or patch around things, but it's a little harder.
>
> The only reasonably secure method (again, as noted by others) is to not have
> your code on the client machine e.g. using a web service for the critical
> logic.
>
> Tim Delaney


From brett at python.org  Sat May 12 19:13:59 2012
From: brett at python.org (Brett Cannon)
Date: Sat, 12 May 2012 13:13:59 -0400
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
Message-ID: 

On Fri, May 11, 2012 at 6:27 PM, li wang  wrote:

> hi all:
>
> I want to use python in my product because I like and familiar with
> python for many years, but I won't let the customer to read and modify
> my code. So the best way is to encrypt my module .py to .pye.
>
>
Actually it's better to simply ship the .pyc/.pyo files and/or to minify
the code to make it unreadable. As everyone pointed out, the encryption you
are proposing won't stop anyone from reading your source, it will just make
it a little harder.
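
For reference, a minimal sketch of the "ship only the bytecode" route using the standard py_compile module. The helper name and the module name shipped_demo are invented for the example. The .pyc is written next to where the source was rather than into __pycache__, on the assumption that this is what makes it importable once the .py is gone.

```python
import importlib
import os
import py_compile
import sys
import tempfile


def import_from_bytecode_only(module_name, source_text):
    """Compile source_text to a .pyc, delete the .py, import the .pyc.

    Hypothetical helper for illustration only.
    """
    workdir = tempfile.mkdtemp()
    source_path = os.path.join(workdir, module_name + ".py")
    bytecode_path = os.path.join(workdir, module_name + ".pyc")
    with open(source_path, "w") as handle:
        handle.write(source_text)
    # Put the bytecode directly in the directory, not in __pycache__.
    py_compile.compile(source_path, cfile=bytecode_path, doraise=True)
    os.remove(source_path)
    sys.path.insert(0, workdir)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(workdir)


module = import_from_bytecode_only(
    "shipped_demo", "def answer():\n    return 42\n")
print(module.answer())
```

The bytecode can still be inspected, for example with the dis module, so this only raises the bar for casual reading.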

-Brett


> Now python will write compiled byte code .pyc or .pyo when a .py is
> imported, I have write a patch to add .pye support for encrypted byte
> code.
>
> When a .pye is imported, python will check the environment variable
> PYTHONENCRYPT, if this environment variable is defined with non-blank
> value, the value is used to generate AES key and CBC initialize vector
> which will be used to encrypt .py and decrypt .pye.
>
> Now it is work for me, does the python community is interested for it?
> I believe this feature can be helpful to let the python to be used in
> bussiness use case.
>
> Thanks greatly.
>
> Charles Wang   May/12, 2012.

From guido at python.org  Sat May 12 19:14:46 2012
From: guido at python.org (Guido van Rossum)
Date: Sat, 12 May 2012 10:14:46 -0700
Subject: [Python-ideas] hexdump
In-Reply-To: 
References: 
	
Message-ID: 

Rather than bikeshedding, why not implement the common formats and
flags implemented by the venerable 'od' command? It's been
time-tested...

On Sat, May 12, 2012 at 9:20 AM, Floris Bruynooghe  wrote:
> On 12 May 2012 09:59, anatoly techtonik  wrote:
>> hexdump(bytes)   - produce human readable dump of binary data,
>
> +1 on this basic function, that would be very nice in the stdlib.  Now
> I always need to go and dig up my own function from somewhere.
>
> A certain deal of bikeshedding would be required on the function
> signature however, I'd go with something like:
>
> hexdump(data, rowsize=16, offsets=True, ascii=True)
>
> Where rowsize is the number of bytes on one row, offsets controls
> showing the byte number (in hex) of the first byte of each row and
> ascii controls showing the 7-bit printable characters in a right hand
> column.
>
> This would cover my needs, I'm sure other people will come up with
> more must-haves.
>
> Regards,
> Floris
>
> --
> Debian GNU/Linux -- The Power of Freedom
> www.debian.org | www.gnu.org | www.kernel.org



-- 
--Guido van Rossum (python.org/~guido)


From tjreedy at udel.edu  Sat May 12 19:16:05 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 12 May 2012 13:16:05 -0400
Subject: [Python-ideas] Move tarfile.filemode() into stat module
In-Reply-To: <20120512164110.27316aec@pitrou.net>
References: 
	<20120512164110.27316aec@pitrou.net>
Message-ID: 

On 5/12/2012 10:41 AM, Antoine Pitrou wrote:
> On Sat, 12 May 2012 16:29:35 +0200
> Giampaolo Rodolà
> wrote:
>> http://hg.python.org/cpython/file/9d9495fabeb9/Lib/tarfile.py#l304
>> I discovered this undocumented function by accident different years
>> ago and reused it a couple of times since then.
>> I think that leaving it hidden inside tarfile module is unfortunate.
>> What about moving it into stat module and document it?
>
> I don't know which of stat or shutil would be the better recipient, but
> it's a good idea anyway.

I think I would more likely look in stat, and as noted below, the 
constants used for the table used in the function are already in stat.py.

I checked, and

# Bits used in the mode field, values in octal.
#---------------------------------------------------------
S_IFLNK = 0o120000        # symbolic link
...

are only used in

filemode_table = (
     ((S_IFLNK,      "l"),
     ...

which is only used in

def filemode(mode): ...

So all three can be cleanly extracted into another module.

However 1) the bit definitions themselves should just be deleted as they 
*duplicate* those in stat.py. The S_Ixxx names are the same, the other 
names are variations of the other stat.S_Ixxxx names. So filemode_table 
(with '_' added?) could/should be re-written in stat.py to use the 
public, documented constants already defined there.

However 2) stat.py lacks the nice comments explaining the constants in 
the file itself, so I *would* copy the comments to the appropriate lines.


There only seems to be one use of the function in tarfile.py:
Line 1998:                 print(filemode(tarinfo.mode), end=' ')

All the other uses of 'filemode' are as a local name inside the open 
method, derived from its mode parameter:
             filemode, comptype = mode.split(":", 1)

+1 on moving the table (probably with private name, and using the 
existing, documented stat S_Ixxxx constants) and function (public) to 
stat.py.
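
A sketch of how the function could look when written against the public stat constants, as suggested above. The private table name _filemode_table is an assumption, and the code follows the table-driven approach described in this message rather than being copied from tarfile.py.

```python
import stat

# Hypothetical private table: one inner tuple per output character.
# Within a tuple, the first entry whose bits are all set wins.
_filemode_table = (
    ((stat.S_IFLNK, "l"),                  # symbolic link
     (stat.S_IFREG, "-"),                  # regular file
     (stat.S_IFBLK, "b"),                  # block device
     (stat.S_IFDIR, "d"),                  # directory
     (stat.S_IFCHR, "c"),                  # character device
     (stat.S_IFIFO, "p")),                 # fifo
    ((stat.S_IRUSR, "r"),),
    ((stat.S_IWUSR, "w"),),
    ((stat.S_IXUSR | stat.S_ISUID, "s"),   # setuid and executable
     (stat.S_ISUID, "S"),                  # setuid only
     (stat.S_IXUSR, "x")),
    ((stat.S_IRGRP, "r"),),
    ((stat.S_IWGRP, "w"),),
    ((stat.S_IXGRP | stat.S_ISGID, "s"),
     (stat.S_ISGID, "S"),
     (stat.S_IXGRP, "x")),
    ((stat.S_IROTH, "r"),),
    ((stat.S_IWOTH, "w"),),
    ((stat.S_IXOTH | stat.S_ISVTX, "t"),   # sticky and executable
     (stat.S_ISVTX, "T"),
     (stat.S_IXOTH, "x")),
)


def filemode(mode):
    """Convert a file mode to a string like '-rwxr-xr-x' (as ls -l shows)."""
    chars = []
    for table in _filemode_table:
        for bits, char in table:
            if mode & bits == bits:
                chars.append(char)
                break
        else:
            chars.append("-")
    return "".join(chars)


print(filemode(0o100755))
print(filemode(0o040750))
```

The order inside the first tuple matters, because the file-type values share bits (S_IFLNK includes the S_IFREG bit, for instance).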

-- 
Terry Jan Reedy




From mwm at mired.org  Sat May 12 20:39:55 2012
From: mwm at mired.org (Mike Meyer)
Date: Sat, 12 May 2012 14:39:55 -0400
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
	
Message-ID: <20120512143955.174c4a76@bhuda.mired.org>

On Sat, 12 May 2012 13:13:59 -0400
Brett Cannon  wrote:

> On Fri, May 11, 2012 at 6:27 PM, li wang  wrote:
> > I want to use python in my product because I like and familiar with
> > python for many years, but I won't let the customer to read and modify
> > my code. So the best way is to encrypt my module .py to .pye.
> Actually it's better to simply ship the .pyc/.pyo files and/or to minify
> the code to make it unreadable. As everyone pointed out, the encryption you
> are proposing won't stop anyone from reading your source, it will just make
> it a little harder.

I think it's worth explaining why just shipping the .pyc/.pyo files is
"better".

If it's not clear by now, a fancy encryption scheme won't protect your
sources from someone who really wants to read them. On the other hand,
shipping just the .pyc/.pyo files will stop casual browsing. The only
real difference here is how much effort it takes to get the source. To
carry Guido's analogy further, both lock your front door, one just
uses a better lock. Neither will stop a determined burglar.

On the other hand, if you ship code with a fancy encryption scheme,
you're shipping more moving parts, which means more things to go
wrong, which means more support calls. With the particular scheme you
proposed, you'll get calls from people who managed to run the code
without properly setting the environment variable, or set it to the
wrong thing, and those are just the obvious problems.

In summary, your encryption scheme will make life just a little harder
for everyone when compared to simply not shipping the source.

    		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From cmjohnson.mailinglist at gmail.com  Sun May 13 02:49:54 2012
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Sat, 12 May 2012 14:49:54 -1000
Subject: [Python-ideas] Printf function?
Message-ID: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>

I was looking at this jokey page on the evolution of programming language syntax -- http://alan.dipert.org/post/153430634/the-march-of-progress -- and it made me think about where Python is now. The Python 2 version of the example from the page is

	print "%10.2f" % x

and the Python 3 is

	print("{:10.2f}".format(x))

Personally, I prefer the new style {} formatting to the old % formatting, but it is pretty busy when you want to do a print and format in one step. Why not add a printf function to the built-ins, so you could just write

	printf("{:10.2f}", x)

Of course, writing a printf function for oneself is trivial and "not every three line function needs to be a built-in," but I do feel like this would be a win for Python's legibility.
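
For concreteness, the "trivial" function might look like the sketch below. The file and end keyword parameters are my additions, mirroring print(); they also mean a format field cannot itself be named file or end.

```python
import io


def printf(fmt, *args, file=None, end="\n", **kwargs):
    """Format with str.format and print the result in one step (sketch)."""
    # file=None makes print() fall back to sys.stdout.
    print(fmt.format(*args, **kwargs), file=file, end=end)


x = 1.5
printf("{:10.2f}", x)

# Capturing the output requires passing a stream explicitly.
buffer = io.StringIO()
printf("{name}: {:10.2f}", x, name="x", file=buffer)
```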


What do you all think?


-- Carl Johnson 

From cs at zip.com.au  Sun May 13 03:50:10 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Sun, 13 May 2012 11:50:10 +1000
Subject: [Python-ideas] Printf function?
In-Reply-To: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>
References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>
Message-ID: <20120513015010.GA30528@cskk.homeip.net>

On 12May2012 14:49, Carl M. Johnson  wrote:
| I was looking at this jokey page on the evolution of programming language syntax -- http://alan.dipert.org/post/153430634/the-march-of-progress -- and it made me think about where Python is now. The Python 2 version of the example from the page is
| 
| 	print "%10.2f" % x
| 
| and the Python 3 is
| 
| 	print("{:10.2f}".format(x))
| 
| Personally, I prefer the new style {} formatting to the old %
| formatting, but it is pretty busy when you want to do a print and
| format in one step. Why not add a printf function to the built-ins,
| so you could just write
| 
| 	printf("{:10.2f}", x)
| 
| Of course, writing a printf function for oneself is trivial and "not
| every three line function needs to be a built-in," but I do feel like
| this would be a win for Python's legibility.

I'm -1 on it:

  - as you say, it could be a three line function

  - %-formatting isn't going away

  - neither %-formatting nor {}-formatting has anything to do with
    printing; they are both string operations.
    So the printf idea does not achieve anything anyway.

Observe my Python 3.2:

  [/home/cameron]janus*> python3.2
  Python 3.2.2 (default, May  2 2012, 09:04:59) 
  [GCC 4.5.3] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> x=1.5
  >>> print("%10.2f" % x)
        1.50
  >>> 

Printf isn't needed.

Cheers,
-- 
Cameron Simpson  DoD#743
http://www.cskk.ezoshosting.com/cs/

Senior ego adepto, ocius ego eram.


From cmjohnson.mailinglist at gmail.com  Sun May 13 05:37:54 2012
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Sat, 12 May 2012 17:37:54 -1000
Subject: [Python-ideas] Printf function?
In-Reply-To: <20120513015010.GA30528@cskk.homeip.net>
References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>
	<20120513015010.GA30528@cskk.homeip.net>
Message-ID: 


On May 12, 2012, at 3:50 PM, Cameron Simpson wrote:

> Observe my Python 3.2:
> 
>  [/home/cameron]janus*> python3.2
>  Python 3.2.2 (default, May  2 2012, 09:04:59) 
>  [GCC 4.5.3] on linux2
>  Type "help", "copyright", "credits" or "license" for more information.
>  >>> x=1.5
>  >>> print("%10.2f" % x)
>        1.50
>  >>>
> 
> Printf isn't needed.

Well, if that's the solution, why do we even have .format in the first place? I know there are a lot of people who still prefer % formatting, but I personally never liked it, and I prefer not to use it if I have any choice about it. But that's neither here nor there. My question is, given that we have .format, why not make it easier to use?

From steve at pearwood.info  Sun May 13 09:05:43 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 13 May 2012 17:05:43 +1000
Subject: [Python-ideas] Printf function?
In-Reply-To: <20120513015010.GA30528@cskk.homeip.net>
References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>
	<20120513015010.GA30528@cskk.homeip.net>
Message-ID: <4FAF5D47.3090503@pearwood.info>

Cameron Simpson wrote:

> Printf isn't needed.

Agreed. printf does two things, formatting and printing, and Python can 
already do both. There's no point in a format-then-print function when you can 
just format then print.

However a lightweight alternative to regexes, something similar to scanf only 
safe, might be a nice idea. You can simulate scanf with regexes, but of course 
that's hardly lightweight.

(But now I'm indulging in idle speculation, not a serious proposal.)
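
To show what "simulating scanf with regexes" involves, here is a small sketch that translates a tiny subset of scanf conversions (%d, %f, %s) into a regular expression. The sscanf name, the conversion table and the None-on-mismatch behaviour are all assumptions for the example, not an existing API.

```python
import re

# Hypothetical mapping from a few scanf conversions to
# (regex fragment, converter) pairs.
_CONVERSIONS = {
    "%d": (r"([-+]?\d+)", int),
    "%f": (r"([-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?)", float),
    "%s": (r"(\S+)", str),
}


def sscanf(fmt, text):
    """Parse text according to fmt; return a tuple, or None on mismatch."""
    pattern = ""
    converters = []
    position = 0
    for match in re.finditer(r"%[dfs]", fmt):
        # Literal text between conversions must match exactly.
        pattern += re.escape(fmt[position:match.start()])
        fragment, converter = _CONVERSIONS[match.group()]
        pattern += fragment
        converters.append(converter)
        position = match.end()
    pattern += re.escape(fmt[position:])
    found = re.match(pattern, text)
    if found is None:
        return None
    return tuple(
        convert(group) for convert, group in zip(converters, found.groups()))


print(sscanf("%s: %d items at %f each", "apples: 3 items at 0.25 each"))
```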


-- 
Steven


From steve at pearwood.info  Sun May 13 09:22:45 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 13 May 2012 17:22:45 +1000
Subject: [Python-ideas] Printf function?
In-Reply-To: 
References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>	<20120513015010.GA30528@cskk.homeip.net>
	
Message-ID: <4FAF6145.7090804@pearwood.info>

Carl M. Johnson wrote:

> Well, if that's the solution, why do we even have .format in the first
> place? I know there are a lot of people who still prefer % formatting, but
> I personally never liked it, and I prefer not to use it if I have any
> choice about it. But that's neither here nor there. My question is, being
> that we have .format, why not make it easier to use? 

Define "easier to use". Calling a method and passing its output to print seems 
to be pretty easy to me.

The major issue with printf is that it prints AND formats. That means you 
can't easily capture its output. A simpler approach is to have one function 
that handles the printing, and another function (or possibly a choice of 
multiple functions) that handles the formatting, then simply pass the output 
of the second to the first. That is to say, multiple simple tools that do one 
thing each are simpler *and* more flexible than a single tool to do multiple 
things: a hammer and a  wrench together are less complex than a combination 
hammer-wrench, and you can do more with them as separate tools than as a combo.



-- 
Steven



From steve at pearwood.info  Sun May 13 11:36:52 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 13 May 2012 19:36:52 +1000
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: <20120512143955.174c4a76@bhuda.mired.org>
References: 	
	<20120512143955.174c4a76@bhuda.mired.org>
Message-ID: <4FAF80B4.1040507@pearwood.info>

Mike Meyer wrote:
> On Sat, 12 May 2012 13:13:59 -0400
> Brett Cannon  wrote:
> 
>> On Fri, May 11, 2012 at 6:27 PM, li wang  wrote:
>>> I want to use python in my product because I like and familiar with
>>> python for many years, but I won't let the customer to read and modify
>>> my code. So the best way is to encrypt my module .py to .pye.
>> Actually it's better to simply ship the .pyc/.pyo files and/or to minify
>> the code to make it unreadable. As everyone pointed out, the encryption you
>> are proposing won't stop anyone from reading your source, it will just make
>> it a little harder.
> 
> I think it's worth explaining why just shipping the .pyc/.pyo files is
> "better".
> 
> If it's not clear by now, a fancy encryption scheme won't protect your
> sources from someone who really wants to read them. On the other hand,
> shipping just the .pyc/.pyo files will stop casual browsing. The only
> real difference here is how much effort it takes to get the source. To
> carry Guido's analogy further, both lock your front door, one just
> uses a better lock. Neither will stop a determined burglar.

I think Guido's analogy is bogus and wrongly suggests that encrypting 
applications just might work if you try hard enough. If we can lock the door 
and keep strangers from peeking inside, why can't we encrypt apps and stop 
people from peeking at the code? But the analogy doesn't follow. In the front 
door example, untrusted people don't have a key and are forced to pick or 
break the lock to get it. In the encryption example, untrusted people are 
given the key (as an environment variable), then trusted not to use it to read 
the source code!

(Possibly on the assumption that they don't realise they have the key, or that 
using it manually is too difficult for them.)

Ultimately, on a computer the user controls, with a key they have access to, 
they can bypass any encryption or security you install. That's why e.g. so 
many forms of copy protection and digital restrictions software try to take 
control away from the user, to some greater or lesser degree of success.


-- 
Steven



From masklinn at masklinn.net  Sun May 13 11:51:58 2012
From: masklinn at masklinn.net (Masklinn)
Date: Sun, 13 May 2012 11:51:58 +0200
Subject: [Python-ideas] Printf function?
In-Reply-To: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>
References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>
Message-ID: <45E5027E-9A8C-4748-BAE8-54ACA49D70E7@masklinn.net>


On 2012-05-13, at 02:49 , Carl M. Johnson wrote:

> I was looking at this jokey page on the evolution of programming language syntax -- http://alan.dipert.org/post/153430634/the-march-of-progress -- and it made me think about where Python is now. The Python 2 version of the example from the page is
> 
> 	print "%10.2f" % x
> 
> and the Python 3 is
> 
> 	print("{:10.2f}".format(x))
> 
> Personally, I prefer the new style {} formatting to the old % formatting, but it is pretty busy when you want to do a print and format in one step. Why not add a printf function to the built-ins, so you could just write
> 
> 	printf("{:10.2f}", x)
> 
> Of course, writing a printf function for oneself is trivial and "not every three line function needs to be a built-in," but I do feel like this would be a win for Python's legibility.
> 
> 
> What do you all think?

I'm -1 on two counts personally:

1. Even with Python 3's slightly more verbose string formatting, I don't
think there's much (if any) gain in having a builtin merging print and
format
2. If I see a function called `printf` (or with `printf` as part of its
name), I expect it to use printf-style format strings (that is, Python
2-style formatting). A function called printf with new-style format
string would be far more confusing than the current situation, I think.

From masklinn at masklinn.net  Sun May 13 12:00:49 2012
From: masklinn at masklinn.net (Masklinn)
Date: Sun, 13 May 2012 12:00:49 +0200
Subject: [Python-ideas] Printf function?
In-Reply-To: <4FAF6145.7090804@pearwood.info>
References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>	<20120513015010.GA30528@cskk.homeip.net>
	
	<4FAF6145.7090804@pearwood.info>
Message-ID: 

On 2012-05-13, at 09:22 , Steven D'Aprano wrote:

> Define "easier to use". Calling a method and passing its output to print seems to be pretty easy to me.
> 
> The major issue with printf is that it prints AND formats. That means you can't easily capture its output. A simpler approach is to have one function that handles the printing, and another function (or possibly a choice of multiple functions) that handles the formatting, then simply pass the output of the second to the first.

Another option is to have a formatting function similar to Common
Lisp's format[0]: it formats a string and

* If provided with a stream (or stream-like) argument writes the
  formatted string to the stream and returns `nil`
* Otherwise returns the formatted string

The function formats and prints, but capturing the output (to a
non-standard stream or to a string) is trivial.

[0] http://www.lispworks.com/documentation/HyperSpec/Body/f_format.htm#format
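
A rough Python sketch of that idea, assuming a hypothetical helper named `fmt` built on `str.format`. It is only loosely modelled on the Common Lisp function and is not an existing Python API:

```python
import io


def fmt(stream, template, *args, **kwargs):
    """Format `template`; write to `stream` if one is given, else return the string."""
    text = template.format(*args, **kwargs)
    if stream is None:
        return text
    stream.write(text)
    return None


# No stream: the formatted string is returned to the caller.
as_string = fmt(None, "{:10.2f}", 5.0)

# Stream-like argument: the text is written there and None is returned.
buffer = io.StringIO()
result = fmt(buffer, "{:10.2f}\n", 5.0)
```

Passing `sys.stdout` as the stream gives the printing behaviour, so one function covers both uses.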



From ncoghlan at gmail.com  Sun May 13 16:44:02 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 14 May 2012 00:44:02 +1000
Subject: [Python-ideas] Printf function?
In-Reply-To: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>
References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>
Message-ID: 

On Sun, May 13, 2012 at 10:49 AM, Carl M. Johnson
 wrote:
> I was looking at this jokey page on the evolution of programming language syntax -- http://alan.dipert.org/post/153430634/the-march-of-progress -- and it made me think about where Python is now. The Python 2 version of the example from the page is
>
>        print "%10.2f" % x
>
> and the Python 3 is
>
>        print("{:10.2f}".format(x))

The main reason the format() builtin exists is to easily format single fields:

>>> x = 5.0
>>> print(format(x, "10.2f"))
      5.00

The new formatting system doesn't scale down as well as the old one,
so "format a single field value with no surrounding text" is handled
as a special case.

> Personally, I prefer the new style {} formatting to the old % formatting, but it is pretty busy when you want to do a print and format in one step. Why not add a printf function to the built-ins, so you could just write
>
>        printf("{:10.2f}", x)
>
> Of course, writing a printf function for oneself is trivial and "not every three line function needs to be a built-in," but I do feel like this would be a win for Python's legibility.

The problem is that you have two competing uses for your arguments -
"print" wants to accept "file", "end", etc, while format() wants to
accept *args and **kwds. It's better to keep the two separate and
allow people to compose them as they wish.
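
A small sketch of that clash, assuming a hypothetical `printf` helper that forwards `file` and `end` to print() and everything else to `str.format`:

```python
import io


def printf(template, *args, file=None, end="", **kwargs):
    """Hypothetical helper: format with str.format, then print the result."""
    print(template.format(*args, **kwargs), file=file, end=end)


out = io.StringIO()
printf("{:10.2f}\n", 5.0, file=out)

# 'end' is claimed by the print side of the signature, so a template
# field named {end} can no longer be filled through a keyword argument.
clash = None
try:
    printf("{start}-{end}", start=1, end=2, file=out)
except KeyError as exc:
    clash = exc
```

Keeping the two calls separate, as in `print("{start}-{end}".format(start=1, end=2))`, avoids the ambiguity entirely.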

For myself, aside from temporary debugging messages, I rarely call
"print" directly in any non-trivial code - instead I'll have a
"display" utility module that tweaks things appropriately for the
specific application. Maybe it will redirect to logging, maybe it will
print directly to a stream or to a file - the utility module gives me
a single point of control without having to change the rest of the
script.

Cheers,
Nick.
-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ncoghlan at gmail.com  Sun May 13 16:44:57 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 14 May 2012 00:44:57 +1000
Subject: [Python-ideas] Printf function?
In-Reply-To: <4FAF5D47.3090503@pearwood.info>
References: <10A9C44B-E20A-4803-80A8-DB416C3D0CAE@gmail.com>
	<20120513015010.GA30528@cskk.homeip.net>
	<4FAF5D47.3090503@pearwood.info>
Message-ID: 

On Sun, May 13, 2012 at 5:05 PM, Steven D'Aprano  wrote:
> However a lightweight alternative to regexes, something similar to scanf
> only safe, might be a nice idea. You can simulate scanf with regexes, but of
> course that's hardly lightweight.
>
> (But now I'm indulging in idle speculation, not a serious proposal.)

Take a look at the parse module on PyPI :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From gahtune at gmail.com  Sun May 13 17:00:24 2012
From: gahtune at gmail.com (Gabriel AHTUNE)
Date: Sun, 13 May 2012 23:00:24 +0800
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: <4FAF80B4.1040507@pearwood.info>
References: 
	
	<20120512143955.174c4a76@bhuda.mired.org>
	<4FAF80B4.1040507@pearwood.info>
Message-ID: 

>
> I think Guido's analogy is bogus and wrongly suggests that encrypting
> applications just might work if you try hard enough. If we can lock the
> door and keep strangers from peeking inside, why can't we encrypt apps and
> stop people from peeking at the code? But the analogy doesn't follow.
>

The analogy is that you want a lock to protect the door from the very
person you gave the key to. (source: the house, encryption: the lock, the
means of decrypting in order to run: the key)


> In the front door example, untrusted people don't have a key and are
> forced to pick or break the lock to get it. In the encryption example,
> untrusted people are given the key
>
(as an environment variable), then trusted not to use it to read the source
> code!
>

The problem is that he doesn't trust the customer.


> (Possibly on the assumption that they don't realise they have the key, or
> that using it manually is too difficult for them.)
>
> Ultimately, on a computer the user controls, with a key they have access
> to, they can bypass any encryption or security you install. That's why e.g.
> so many forms of copy protection and digital restrictions software try to
> take control away from the user, to some greater or lesser degree of
> success.
>
>
> --
> Steven
>
>

From guido at python.org  Sun May 13 17:30:39 2012
From: guido at python.org (Guido van Rossum)
Date: Sun, 13 May 2012 08:30:39 -0700
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: <4FAF80B4.1040507@pearwood.info>
References: 
	
	<20120512143955.174c4a76@bhuda.mired.org>
	<4FAF80B4.1040507@pearwood.info>
Message-ID: 

On Sun, May 13, 2012 at 2:36 AM, Steven D'Aprano  wrote:
> I think Guido's analogy is bogus and wrongly suggests that encrypting
> applications just might work if you try hard enough.

Eh? I didn't mean that at all. To the contrary I meant that every
encryption can be broken but that it may still be a useful deterrent.
I wasn't aware of the detail of the OP's proposal that the key was
right in the user's environment -- but that actually has an exact
analogy in the front door example: hiding the key under the mat.

-- 
--Guido van Rossum (python.org/~guido)


From alexander.belopolsky at gmail.com  Sun May 13 18:43:08 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sun, 13 May 2012 12:43:08 -0400
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
	
	<20120512143955.174c4a76@bhuda.mired.org>
	<4FAF80B4.1040507@pearwood.info>
	
Message-ID: 

On Sun, May 13, 2012 at 11:30 AM, Guido van Rossum  wrote:
> I wasn't aware of the detail of the OP's proposal that the key was
> right in the user's environment -- but that actually has an exact
> analogy in the front door example: hiding the key under the mat.

This sounds like an argument not to include this functionality in
stdlib.  If hiding the key under the mat becomes standard, a key under
the mat will be as inviting as an open front door.  Those interested
in obscurity should not invite public discussion of clandestine
advantages of doormats over garden rocks.


From guido at python.org  Sun May 13 19:33:07 2012
From: guido at python.org (Guido van Rossum)
Date: Sun, 13 May 2012 10:33:07 -0700
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
	
	<20120512143955.174c4a76@bhuda.mired.org>
	<4FAF80B4.1040507@pearwood.info>
	
	
Message-ID: 

--Guido van Rossum (sent from Android phone)
On May 13, 2012 9:43 AM, "Alexander Belopolsky" <
alexander.belopolsky at gmail.com> wrote:
>
> On Sun, May 13, 2012 at 11:30 AM, Guido van Rossum wrote:
> > I wasn't aware of the detail of the OP's proposal that the key was
> > right in the user's environment -- but that actually has an exact
> > analogy in the front door example: hiding the key under the mat.
>
> This sounds like an argument not to include this functionality in
> stdlib.  If hiding the key under the mat becomes standard, a key under
> the mat will be as inviting as an open front door.  Those interested
> in obscurity should not invite public discussion of clandestine
> advantages of doormats over garden rocks.

Agreed, definitely not for the stdlib.

From mwm at mired.org  Sun May 13 20:26:19 2012
From: mwm at mired.org (Mike Meyer)
Date: Sun, 13 May 2012 14:26:19 -0400
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: <4FAF80B4.1040507@pearwood.info>
References: 
	
	<20120512143955.174c4a76@bhuda.mired.org>
	<4FAF80B4.1040507@pearwood.info>
Message-ID: <20120513142619.07bce1a8@bhuda.mired.org>

On Sun, 13 May 2012 19:36:52 +1000
Steven D'Aprano  wrote:
> Mike Meyer wrote:
> > If it's not clear by now, a fancy encryption scheme won't protect your
> > sources from someone who really wants to read them. On the other hand,
> > shipping just the .pyc/.pyo files will stop casual browsing. The only
> > real difference here is how much effort it takes to get the source. To
> > carry Guido's analogy further, both lock your front door, one just
> > uses a better lock. Neither will stop a determined burglar.
> I think Guido's analogy is bogus and wrongly suggests that encrypting 
> applications just might work if you try hard enough. If we can lock the door 
> and keep strangers from peeking inside, why can't we encrypt apps and stop 
> people from peeking at the code?

But locking the door *won't* keep strangers from peeking inside. Not
if they really want to. It'll keep people from casually opening the
door, but it won't stop someone who really wants to see the insides
because they can:

> But the analogy doesn't follow. In the front 
> door example, untrusted people don't have a key and are forced to pick or 
> break the lock to get it.

Exactly. You can easily get tools to do all these things, as well as
others, to get past the lock.

> In the encryption example, untrusted people are given the key (as an
> environment variable), then trusted not to use it to read the source
> code!

This is pretty much required in any form of DRM. You have to give the
end user the keys in order for them to use what you gave them. Trying
to prevent them from then doing *other* things is done by obfuscating
how you get from the cyphertext to the plaintext. That it can't work
is why the US container companies got laws passed making doing so
illegal.

> (Possibly on the assumption that they don't realise they have the key, or that 
> using it manually is too difficult for them.)

The difficulty level is immaterial. With the proper training and
tools, none of these things (picking locks, breaking down doors,
reverse engineering code obfuscation) is difficult. On the other hand,
you can raise the difficulty level of any of them by investing more in
whatever obstacles you're putting in the way.

They both do the same thing. That's why the analogy works.

			http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From g.rodola at gmail.com  Sun May 13 23:04:38 2012
From: g.rodola at gmail.com (Giampaolo Rodolà)
Date: Sun, 13 May 2012 23:04:38 +0200
Subject: [Python-ideas] Move tarfile.filemode() into stat module
In-Reply-To: 
References: 
	<20120512164110.27316aec@pitrou.net> 
Message-ID: 

2012/5/12 Terry Reedy :
> On 5/12/2012 10:41 AM, Antoine Pitrou wrote:
>>
>> On Sat, 12 May 2012 16:29:35 +0200
>> Giampaolo Rodolà
>> wrote:
>>>
>>> http://hg.python.org/cpython/file/9d9495fabeb9/Lib/tarfile.py#l304
>>> I discovered this undocumented function by accident several years
>>> ago and have reused it a couple of times since then.
>>> I think that leaving it hidden inside the tarfile module is unfortunate.
>>> What about moving it into the stat module and documenting it?
>>
>>
>> I don't know which of stat or shutil would be the better recipient, but
>> it's a good idea anyway.
>
>
> I think I would more likely look in stat, and as noted below, the constants
> used for the table used in the function are already in stat.py.
>
> I checked, and
>
> # Bits used in the mode field, values in octal.
> #---------------------------------------------------------
> S_IFLNK = 0o120000        # symbolic link
> ...
>
> are only used in
>
> filemode_table = (
>    ((S_IFLNK,      "l"),
>    ...
>
> which is only used in
>
> def filemode(mode): ...
>
> So all three can be cleanly extracted into another module.
>
> However 1) the bit definitions themselves should just be deleted as they
> *duplicate* those in stat.py. The S_Ixxx names are the same, the other names
> are variations of the other stat.S_Ixxxx names. So filemode_table (with '_'
> added?) could/should be re-written in stat.py to use the public, documented
> constants already defined there.
>
> However 2) stat.py lacks the nice comments explaining the constants in the
> file itself, so I *would* copy the comments to the appropriate lines.
>
>
> There only seems to be one use of the function in tarfile.py:
> Line 1998:                 print(filemode(tarinfo.mode), end=' ')
>
> All the other uses of 'filemode' are as a local name inside the open method,
> derived from its mode parameter:
>            filemode, comptype = mode.split(":", 1)
>
> +1 on moving the table (probably with private name, and using the existing,
> documented stat S_Ixxxx constants) and function (public) to stat.py.
>
> --
> Terry Jan Reedy

Agreed then. I'm going to submit a patch soon.
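
For reference, a simplified sketch of what such a table-driven function could look like when written against the public stat constants. It handles only three file types and ignores the setuid, setgid and sticky bits, so it is an illustration rather than the code from tarfile.py. (Later Python versions, 3.3 onwards, do provide `stat.filemode()`.)

```python
import stat

# One tuple of (bit mask, character) choices per output position.
# S_IFLNK must be tested before S_IFREG because its bits include S_IFREG's.
_FILEMODE_TABLE = (
    ((stat.S_IFLNK, "l"), (stat.S_IFREG, "-"), (stat.S_IFDIR, "d")),
    ((stat.S_IRUSR, "r"),),
    ((stat.S_IWUSR, "w"),),
    ((stat.S_IXUSR, "x"),),
    ((stat.S_IRGRP, "r"),),
    ((stat.S_IWGRP, "w"),),
    ((stat.S_IXGRP, "x"),),
    ((stat.S_IROTH, "r"),),
    ((stat.S_IWOTH, "w"),),
    ((stat.S_IXOTH, "x"),),
)


def filemode(mode):
    """Convert a file's mode bits to a string such as '-rwxr-xr-x'."""
    chars = []
    for choices in _FILEMODE_TABLE:
        for bit, char in choices:
            if mode & bit == bit:
                chars.append(char)
                break
        else:
            # No entry matched for this position.
            chars.append("-")
    return "".join(chars)


print(filemode(0o100644))
print(filemode(0o040755))
```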


--- Giampaolo

http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/


From g.rodola at gmail.com  Mon May 14 00:29:18 2012
From: g.rodola at gmail.com (Giampaolo Rodolà)
Date: Mon, 14 May 2012 00:29:18 +0200
Subject: [Python-ideas] Move tarfile.filemode() into stat module
In-Reply-To: 
References: 
	<20120512164110.27316aec@pitrou.net> 
Message-ID: 

2012/5/12 Terry Reedy :
> However 2) stat.py lacks the nice comments explaining the constants in the
> file itself, so I *would* copy the comments to the appropriate lines.

+1
If no one is opposed I'll do that tomorrow.

> Add me terry.reedy as nosy and I will help review by checking the
> S_Ixxx substitutions.

Thanks, will do.


--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/


From mikegraham at gmail.com  Mon May 14 19:35:13 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Mon, 14 May 2012 13:35:13 -0400
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
	
Message-ID: 

 On Fri, May 11, 2012 at 6:27 PM, li wang  wrote:
>
> I want to use python in my product because I like and familiar with
> python for many years, but I won't let the customer to read and modify
> my code. So the best way is to encrypt my module .py to .pye.

The scheme you describe only provides a false sense of security. That
would be very bad.

The only ways to protect your code are a) legally, which is the main
one, and b) by not giving it to anyone (and making them access things
by a remote interface).

A very strong -1 from me. Do not provide wrong-headed, insecure
features like this.

Mike


From guido at python.org  Mon May 14 19:46:29 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 14 May 2012 10:46:29 -0700
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
	
	
Message-ID: 

On Mon, May 14, 2012 at 10:35 AM, Mike Graham  wrote:
> On Fri, May 11, 2012 at 6:27 PM, li wang  wrote:
>>
>> I want to use python in my product because I like and familiar with
>> python for many years, but I won't let the customer to read and modify
>> my code. So the best way is to encrypt my module .py to .pye.
>
> The scheme you describe only provides a false sense of security. That
> would be very bad.

You seem to be assuming security by obscurity is worse than no
security. I disagree (although I am not defending it as the sole form
of security). Many security professionals are not happy unless
multiple levels of security are in place, some of which can only be
described as obscurity.

> The only ways to protect your code are a) legally, which is the main
> one,

If you look into legal ways of protecting physical property you'll
find that having locks, fences etc. is often necessary for legal
protection to apply. That's why so often you'll find "no trespassing"
signs (in Holland these even have a specific reference to the law on
them).

> and b) by not giving it to anyone (and making them access things
> by a remote interface).
>
> A very strong -1 from me. Do not provide wrong-headed, insecure
> features like this.

I am -1 on including any support for encrypting bytecode in the
standard library, for the same reasons that we *removed* Bastion and
rexec -- since it cannot be made perfect, we'd be forever open to
criticism and possible liability if someone misunderstood the level of
security provided. But I am defending the right of users to implement
a level of obscurity that they are comfortable with. At the same time
it is good to point out the limits of such schemes.

-- 
--Guido van Rossum (python.org/~guido)


From mikegraham at gmail.com  Mon May 14 20:00:11 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Mon, 14 May 2012 14:00:11 -0400
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
	
	
	
Message-ID: 

On Mon, May 14, 2012 at 1:46 PM, Guido van Rossum  wrote:
> You seem to be assuming security by obscurity is worse than no
> security. I disagree (although I am not defending it as the sole form
> of security). Many security professionals are not happy unless
> multiple levels of security are in place, some of which can only be
> described as obscurity.

I would point out: a) It can be worse than no security for the same
reason a cotton bulletproof jacket is worse than no bulletproof
jacket: it lures you into a false sense of security, and b) The
original post asked for a non-obscure, non-secure solution.

> If you look into legal ways of protecting physical property you'll
> find that having locks, fences etc. is often necessary for legal
> protection to apply. That's why so often you'll find "no trespassing"
> signs (in Holland these even have a specific reference to the law on
> them).

This is very true, but I think I might be missing something about your
point. Are there places where intellectual property has similar laws
or policies?

Thanks,
Mike


From bruce at leapyear.org  Mon May 14 20:10:27 2012
From: bruce at leapyear.org (Bruce Leban)
Date: Mon, 14 May 2012 11:10:27 -0700
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
	
	
	
	
Message-ID: 

On Mon, May 14, 2012 at 11:00 AM, Mike Graham  wrote:

> On Mon, May 14, 2012 at 1:46 PM, Guido van Rossum wrote:
> > If you look into legal ways of protecting physical property you'll
> > find that having locks, fences etc. is often necessary for legal
> > protection to apply. That's why so often you'll find "no trespassing"
> > signs (in Holland these even have a specific reference to the law on
> > them).
>
> This is very true, but I think I might be missing something about your
> point. Are there places where intellectual property has similar laws
> or policies?
>



Both patent and copyright law have the concept of 'willful infringement'
and 'proper notice'. Taking the right steps to make sure the person
receiving your IP is aware of your copyright and patent rights can make
them a willful infringer and subject to harsher penalties. Conversely,
failure to use proper notices means you have less protection. (It used to
be that the mere absence of a copyright notice would put your work in the
public domain but that is no longer the case.)

If you obfuscate the code, the reader of the code cannot claim that you
didn't mind if they read it. It makes your intent clear. While simply
compiling source to byte codes obfuscates it to some extent, it doesn't
send a clear message that you don't want them to read it. A notice at the
front of the file saying that you don't want them to read it might be just
as good as obfuscation from that standpoint.



--- Bruce
Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com

From mal at egenix.com  Mon May 14 21:41:19 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 14 May 2012 21:41:19 +0200
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: 
References: 
	
	
	
	
Message-ID: <4FB15FDF.9030806@egenix.com>

Mike Graham wrote:
> On Mon, May 14, 2012 at 1:46 PM, Guido van Rossum  wrote:
>> You seem to be assuming security by obscurity is worse than no
>> security. I disagree (although I am not defending it as the sole form
>> of security). Many security professionals are not happy unless
>> multiple levels of security are in place, some of which can only be
>> described as obscurity.
> 
> I would point out: a) It can be worse than no security for the same
> reason a cotton bulletproof jacket is worse than no bulletproof
> jacket: it lures you into a false sense of security, and b) The
> original post asked for a non-obscure, non-secure solution.
> 
>> If you look into legal ways of protecting physical property you'll
>> find that having locks, fences etc. is often necessary for legal
>> protection to apply. That's why so often you'll find "no trespassing"
>> signs (in Holland these even have a specific reference to the law on
>> them).
> 
> This is very true, but I think I might be missing something about your
> point. Are there places where intellectual property has similar laws
> or policies?

Yes, see http://en.wikipedia.org/wiki/Anti-circumvention

Take e.g. the EU directive text:

"...the expression 'technological measures' means any technology, device or component that, in the
normal course of its operation, is designed to prevent or restrict acts..."

"Technological measures shall be deemed 'effective' where the use of a protected work or other
subject-matter is controlled by the rightsholders through application of an access control or
protection process, such as encryption, scrambling or other transformation of the work or other
subject-matter or a copy control mechanism, which achieves the protection objective."

There's an important difference between "security by obscurity" and
"protection by obscurity". The first is very hard to achieve. The second
is made easy by laws and regulations (because the first doesn't work out
too well in practice).

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 14 2012)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2012-07-02: EuroPython 2012, Florence, Italy               49 days to go
2012-04-26: Released mxODBC 3.1.2                 http://egenix.com/go28
2012-04-25: Released eGenix mx Base 3.2.4         http://egenix.com/go27

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From ckaynor at zindagigames.com  Mon May 14 22:31:11 2012
From: ckaynor at zindagigames.com (Chris Kaynor)
Date: Mon, 14 May 2012 13:31:11 -0700
Subject: [Python-ideas] I have an encrypted python module format: .pye
In-Reply-To: <4FB15FDF.9030806@egenix.com>
References: 
	
	
	
	
	<4FB15FDF.9030806@egenix.com>
Message-ID: 

On Mon, May 14, 2012 at 12:41 PM, M.-A. Lemburg  wrote:
> Mike Graham wrote:
> > I would point out: a) It can be worse than no security for the same
> > reason a cotton bulletproof jacket is worse than no bulletproof
> > jacket: it lures you into a false sense of security, and b) The
> > original post asked for a non-obscure, non-secure solution.
> >
> > On Mon, May 14, 2012 at 1:46 PM, Guido van Rossum  wrote:
> >> If you look into legal ways of protecting physical property you'll
> >> find that having locks, fences etc. is often necessary for legal
> >> protection to apply. That's why so often you'll find "no trespassing"
> >> signs (in Holland these even have a specific reference to the law on
> >> them).
> >
> > This is very true, but I think I might be missing something about your
> > point. Are there places where intellectual property has similar laws
> > or policies?
>
> Yes, see http://en.wikipedia.org/wiki/Anti-circumvention
>
> Take e.g. the EU directive text:
>
> "...the expression 'technological measures' means any technology, device or component that, in the
> normal course of its operation, is designed to prevent or restrict acts..."
>
> "Technological measures shall be deemed 'effective' where the use of a protected work or other
> subject-matter is controlled by the rightsholders through application of an access control or
> protection process, such as encryption, scrambling or other transformation of the work or other
> subject-matter or a copy control mechanism, which achieves the protection objective."

As I read it, the text of the law quoted above would mean that just
releasing the pyc files would be enough, as would running the source
through an obfuscator.
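
To make concrete how thin the protection of a compiled file is, here is a small sketch using only the standard library. The module source and the names in it are invented for the example; `marshal` stands in for the payload of a .pyc file:

```python
import dis
import marshal

source = (
    "API_TOKEN = 'hunter2'\n"
    "def check(password):\n"
    "    return password == API_TOKEN\n"
)

# Compile to a code object and serialise it, roughly what ends up
# inside a .pyc file after the header.
code = compile(source, "<demo>", "exec")
payload = marshal.dumps(code)

# The recipient only has the payload, yet names and constants survive.
loaded = marshal.loads(payload)
print(loaded.co_names)
print(loaded.co_consts)

# dis shows the instructions in readable form.
dis.dis(loaded)
```

So shipping only compiled files deters casual browsing, as noted earlier in the thread, but identifiers, string constants and control flow all remain recoverable.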

>
> There's an important difference between "security by obscurity" and
> "protection by obscurity". The first is very hard to achieve. The second
> is made easy by laws and regulations (because the first doesn't work out
> too well in practice).

Chris


From fuzzyman at gmail.com  Tue May 15 01:28:02 2012
From: fuzzyman at gmail.com (Michael Foord)
Date: Tue, 15 May 2012 00:28:02 +0100
Subject: [Python-ideas] Unhelpful error message from sorted
Message-ID: 

Hello all,

It seems to me that the following error message, whilst technically
correct, is unhelpful:

>>> sorted([3, 2, 1], reverse=None)
Traceback (most recent call last):
  File "", line 1, in 
TypeError: an integer is required

Worth creating an issue for?

Michael

-- 

http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

From pyideas at rebertia.com  Tue May 15 02:04:04 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Mon, 14 May 2012 17:04:04 -0700
Subject: [Python-ideas] Unhelpful error message from sorted
In-Reply-To: 
References: 
Message-ID: 

On Mon, May 14, 2012 at 4:28 PM, Michael Foord  wrote:
> Hello all,
>
> It seems to me that the following error message, whilst technically correct,
> is unhelpful:
>
>>>> sorted([3, 2, 1], reverse=None)
> Traceback (most recent call last):
>   File "", line 1, in 
> TypeError: an integer is required
>
> Worth creating an issue for?

IMO, yes. Surely a *bool[ean]* value ought to be required.
(And mentioning the `reverse` parameter by name would of course also be nice.)

Cheers,
Chris


From tjreedy at udel.edu  Tue May 15 08:52:26 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 15 May 2012 02:52:26 -0400
Subject: [Python-ideas] Unhelpful error message from sorted
In-Reply-To: 
References: 
	
Message-ID: 

On 5/14/2012 8:04 PM, Chris Rebert wrote:
> On Mon, May 14, 2012 at 4:28 PM, Michael Foord  wrote:
>> Hello all,
>>
>> It seems to me that the following error message, whilst technically correct,
>> is unhelpful:
>>
>>>>> sorted([3, 2, 1], reverse=None)
>> Traceback (most recent call last):
>>    File "", line 1, in
>> TypeError: an integer is required
>>
>> Worth creating an issue for?
>
> IMO, yes. Surely a *bool[ean]* value ought to be required.
> (And mentioning the `reverse` parameter by name would of course also be nice.)

There are still overly cryptic error messages. I would like to see 
something more like
TypeError: 'reverse' argument must be bool, not NoneType

-- 
Terry Jan Reedy



From storchaka at gmail.com  Tue May 15 09:16:49 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 15 May 2012 10:16:49 +0300
Subject: [Python-ideas] Unhelpful error message from sorted
In-Reply-To: 
References: 
	
	
Message-ID: <4FB202E1.4060509@gmail.com>

On 15.05.12 09:52, Terry Reedy wrote:
> There are still overly cryptic error messages. I would like to see
> something more like
> TypeError: 'reverse' argument must be bool, not NoneType

With the patch from issue14705 [1], sort can accept reverse=None.

[1] http://bugs.python.org/issue14705



From stefan_ml at behnel.de  Tue May 15 09:58:27 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 15 May 2012 09:58:27 +0200
Subject: [Python-ideas] Unhelpful error message from sorted
In-Reply-To: <4FB202E1.4060509@gmail.com>
References: 
	
	 <4FB202E1.4060509@gmail.com>
Message-ID: 

Serhiy Storchaka, 15.05.2012 09:16:
> On 15.05.12 09:52, Terry Reedy wrote:
>> There are still overly cryptic error messages. I would like to see
>> something more like
>> TypeError: 'reverse' argument must be bool, not NoneType
> 
> With the patch from issue14705 [1], sort can accept reverse=None.
> 
> [1] http://bugs.python.org/issue14705

Looks like a side effect, though. It doesn't make any sense to me to pass
None for the "reverse" argument.

Stefan



From steve at pearwood.info  Tue May 15 10:02:08 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 15 May 2012 18:02:08 +1000
Subject: [Python-ideas] Unhelpful error message from sorted
In-Reply-To: 
References: 
Message-ID: <20120515080208.GA30938@ando>

On Tue, May 15, 2012 at 12:28:02AM +0100, Michael Foord wrote:
> Hello all,
> 
> It seems to me that the following error message, whilst technically
> correct, is unhelpful:
> 
> >>> sorted([3, 2, 1], reverse=None)
> Traceback (most recent call last):
>   File "", line 1, in 
> TypeError: an integer is required

I don't know what you mean by "technically correct". Surely the Pythonic 
idiom is to allow any value in a boolean context.

sorted() here is neither one thing nor the other, neither duck-typing, 
since it won't accept flags that quack like a bool, nor does it strictly 
insist on a bool, since it accepts ints:

>>> sorted([1,2,3], reverse=42)
[3, 2, 1]

I can't see any sense to this almost-but-not-quite type restriction.

+1 to allow any object that is truthy or falsey (i.e. anything).
+0 to allowing only True or False.
-1 to half-heartedly allowing ints but no other values.


-- 
Steven


From fuzzyman at gmail.com  Tue May 15 11:01:21 2012
From: fuzzyman at gmail.com (Michael Foord)
Date: Tue, 15 May 2012 10:01:21 +0100
Subject: [Python-ideas] Unhelpful error message from sorted
In-Reply-To: <20120515080208.GA30938@ando>
References: 
	<20120515080208.GA30938@ando>
Message-ID: 

On 15 May 2012 09:02, Steven D'Aprano  wrote:

> On Tue, May 15, 2012 at 12:28:02AM +0100, Michael Foord wrote:
> > Hello all,
> >
> > It seems to me that the following error message, whilst technically
> > correct, is unhelpful:
> >
> > >>> sorted([3, 2, 1], reverse=None)
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > TypeError: an integer is required
>
> I don't know what you mean by "technically correct". Surely the Pythonic
> idiom is to allow any value in a boolean context.
>
> sorted() here is neither one thing nor the other, neither duck-typing,
> since it won't accept flags that quack like a bool, nor does it strictly
> insist on a bool, since it accepts ints:
>
> >>> sorted([1,2,3], reverse=42)
> [3, 2, 1]
>
> I can't see any sense to this almost-but-not-quite type restriction.
>
> +1 to allow any object that is truthy or falsey (i.e. anything).
>

I would rather have "sorted(some_list, reverse=[1, 2, 3])" raise an error
(and preferably a helpful error message that tells you which argument is
faulty and why).
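
As a rough illustration of that behaviour, here is a pure-Python sketch.
The name strict_sorted is made up for this example, and the check is an
assumption about what "strict" could mean; it is not how the built-in
sorted() validates its arguments. The message follows the wording Terry
suggested.

```python
def strict_sorted(iterable, *, key=None, reverse=False):
    """Like sorted(), but insist that 'reverse' is a real bool.

    Hypothetical helper for illustration only.
    """
    if not isinstance(reverse, bool):
        # Name the faulty argument and say what was expected.
        raise TypeError(
            "'reverse' argument must be bool, not %s" % type(reverse).__name__
        )
    return sorted(iterable, key=key, reverse=reverse)
```

With this wrapper, reverse=None, reverse=42 and reverse=[1, 2, 3] are all
rejected with a message that names the argument, while True and False
behave as usual.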

Michael


> +0 to allowing only True or False.
> -1 to half-heartedly allowing ints but no other values.
>
>
> --
> Steven
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 

http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

From Suriaprakash.Mariappan at smsc.com  Tue May 15 12:46:20 2012
From: Suriaprakash.Mariappan at smsc.com (Suriaprakash.Mariappan at smsc.com)
Date: Tue, 15 May 2012 16:16:20 +0530
Subject: [Python-ideas] input function: built-in space between string and
	user-input
Message-ID: 



print function: built-in space between string and variable:

The below python code,

length = 5
print('Length is', length)

gives an output of

Length is 5

Even though we have not specified a space between 'Length is' and the variable length, Python puts it for us so that we
get a clean nice output and the program is much more readable this way (since we don't need to worry about spacing in
the strings we use for output). This is surely an example of how Python makes life easy for the programmer.

input function: built-in space between string and user-input:

However, the below python code,

guess =  int(input('Enter an integer'))

gives an output of

Enter an integer7

[Note: Assume 7 is entered by the user.]

Suggestion: Similar to the print function, it would be nice to have Python put a space between the prompt string and
the user input for the input function as well, so that the output in the above case is more readable, as below.

Enter an integer 7

Thanks and Regards,
Suriaprakash M,
Principal Engineer - Software,
Standard Microsystems India Pvt. Ltd.,
Module 1, 4th Floor, Block A, SP Infocity,
#40, MGR Salai, Perungudi,
Chennai - 600 096, Tamil Nadu, INDIA.
Email: Suriaprakash.Mariappan at smsc.com
Mobile :+919381453832
Skype ID: msuriaprakash

From ericsnowcurrently at gmail.com  Tue May 15 18:26:35 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 15 May 2012 10:26:35 -0600
Subject: [Python-ideas] [Python-Dev] sys.implementation
In-Reply-To: 
References: 
	
	
	<20120426103150.4898a678@limelight.wooz.org>
	
	<4FAA3FA7.5070808@v.loewis.de>
	
	<20120509165039.23c8bf56@pitrou.net>
	
	<20120509095311.3a2c25c2@resist>
	
	<20120510105749.7401f1d2@pitrou.net>
	
	
	
Message-ID: 

At this point I'm pretty comfortable with where PEP 421 is at.  Before
asking for pronouncement, I'd like to know if anyone has any
outstanding concerns that should be addressed first.

The only (relatively) substantial point of debate has been the type
for sys.implementation.  The PEP now limits the specification of the
type to the minimum (Big-Endian vs. Little...er...attribute-access vs
mapping).  If anyone objects to the decision there to go with
attribute-access, please make your case.

From my point of view either one would be fine for what we need
and attribute-access is more representative of the fixed namespace.
Unless there is a really good reason to use a mapping, I'd like to
stick with that.

Thanks!

-eric


From tjreedy at udel.edu  Tue May 15 23:19:49 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 15 May 2012 17:19:49 -0400
Subject: [Python-ideas] input function: built-in space between string
	and user-input
In-Reply-To: 
References: 
Message-ID: 

On 5/15/2012 6:46 AM, 
Suriaprakash.Mariappan at smsc.com wrote:
> *_print function: built-in space between string and variable:_*
>
> The below python code,
>
> */length = 5/*
> */print('Length is', length)/*
>
> gives an output of
>
> */Length is 5/*

The */.../* and *_..._* bracketing makes your post harder to read. 
Perhaps this is used in India, but not elsewhere. Omit next time.

> Even though we have not specified a space between 'Length is' and the
> variable length, Python puts it for us so that we get a clean nice
> output and the program is much more readable this way (since we don't
> need to worry about spacing in the strings we use for output). This is
> surely an example of how Python makes life easy for the programmer.
>
> *_input function: built-in space between string and user-input:_*
>
> However, the below python code,
>
> */guess = int(input('Enter an integer'))/*
>
> gives an output of
>
> */Enter an integer7/*
>
> [Note: Assume 7 is entered by the user.]
>
> *Suggestion: *Similar to the print function, for the input function
> also, it would be nice to have Python put a space between string and
> user-input, so that the output in the above case will be more readable
> as below.
>
> */Enter an integer 7/*

print() converts objects to strings and adds separators and a terminator 
before writing to outfile.write(). In 3.x, the separator, terminator, 
and outfile can all be changed from the default. The user is stuck with 
the fact that str(obj) is what it is, so it is handy to automatically 
tack something on.

input() directly writes a prompt string with sys.stdout.write.
There is no need to augment that as the user can make the prompt 
string be whatever they want. In any case, a change would break 
back-compatibility.
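
A small sketch of the difference, for anyone following along. The
prompt_with_space helper is a made-up name for this example, not an
existing function; it only shows that the spacing can live in the
prompt string itself.

```python
import io


def prompt_with_space(prompt):
    """Return the prompt with exactly one trailing space.

    Hypothetical helper: input() writes its prompt verbatim, so any
    spacing has to be part of the prompt string.
    """
    return prompt.rstrip() + " "


# print() joins its arguments with 'sep' and finishes with 'end';
# both can be overridden, as can the output file.
buffer = io.StringIO()
print("Length is", 5, file=buffer)          # default sep=' ' and end='\n'
print("Length is", 5, sep="", file=buffer)  # no separator at all
lines = buffer.getvalue().splitlines()

# Usage with input() would look like:
#     guess = int(input(prompt_with_space("Enter an integer")))
```

The first print call writes "Length is 5" and the second "Length is5",
which is the same choice the caller already has with the input() prompt.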

-- 
Terry Jan Reedy



From mwm at mired.org  Wed May 16 08:32:15 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 16 May 2012 02:32:15 -0400
Subject: [Python-ideas] get method for sets?
Message-ID: <20120516023215.4699c0b4@bhuda.mired.org>

Is there some reason that there isn't a straightforward way to get an
element from a set without removing it? Everything I find either
requires multiple statements or converting the set to another data
type.

It seems that some kind of get method would be useful. The argument
that "getting an arbitrary element from a set isn't useful" is refuted
by 1) the existence of the pop method, which does just that, and 2)
the fact that I (and a number of other people) have run into such a
need.

My search for such a reason kept finding people asking how
to get an element instead. Of course, my key words (set and get) are
heavily overloaded.

   thanks,
   		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From steve at pearwood.info  Wed May 16 08:58:43 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 16 May 2012 16:58:43 +1000
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516023215.4699c0b4@bhuda.mired.org>
References: <20120516023215.4699c0b4@bhuda.mired.org>
Message-ID: <20120516065843.GA2542@ando>

On Wed, May 16, 2012 at 02:32:15AM -0400, Mike Meyer wrote:
> Is there some reason that there isn't a straightforward way to get an
> element from a set without removing it? Everything I find either
> requires multiple statements or converting the set to another data
> type.
> 
> It seems that some kind of get method would be useful. The argument
> that "getting an arbitrary element from a set isn't useful" is refuted
> by 1) the existence of the pop method, which does just that,

pop returns an arbitrary element, and removes it. That's a very 
different operation to "get this element from the set".

The problem is, if there was a set.get(x) method, you have to pass x as 
argument, and it returns, what? x. So what's the point? You already have 
the return value before you call the function.

def get(s, x):
    """Return element x from set s."""
    if x in s: return x
    raise KeyError('not found')


As I see it, this is only remotely useful if:

- you care about identity, e.g. caching/interning

- you care about types, e.g. get(s, 42) may return 42.0 as the element 
of the set instead.

In either case, a dict is the more obvious data structure to use.

def intern(d, x):
    """Intern element x in dict d, and return the interned version."""
    return d.setdefault(x, x)
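
To make the "care about types" case concrete, here is the same function
with a short usage sketch. The names cache, first and second are only for
illustration.

```python
def intern(d, x):
    """Intern element x in dict d, and return the interned version."""
    return d.setdefault(x, x)


cache = {}
first = intern(cache, 42.0)  # 42.0 is stored as both key and value
second = intern(cache, 42)   # 42 == 42.0 with equal hashes, so the stored 42.0 comes back
```

After these calls the dict still holds a single entry, and second is the
very same float object as first rather than the int that was passed in.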
    

But with sets? Seems pretty pointless to me. I can't help but feel that 
set.get() is a poorly thought out operation, much requested but rarely 
useful.


> and 2)
> the fact that I (and a number of other people) have run into such a
> need.
> 
> My search for such a reason kept finding people asking how
> to get an element instead. Of course, my key words (set and get) are
> heavily overloaded.

What's your use-case?



-- 
Steven


From dirkjan at ochtman.nl  Wed May 16 09:01:04 2012
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Wed, 16 May 2012 09:01:04 +0200
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516065843.GA2542@ando>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	<20120516065843.GA2542@ando>
Message-ID: 

On Wed, May 16, 2012 at 8:58 AM, Steven D'Aprano  wrote:
> pop returns an arbitrary element, and removes it. That's a very
> different operation to "get this element from the set".
>
> The problem is, if there was a set.get(x) method, you have to pass x as
> argument, and it returns, what? x. So what's the point? You already have
> the return value before you call the function.

I understood Mike's message to be proposing an argument-less .get() method.

Cheers,

Dirkjan


From bruce at leapyear.org  Wed May 16 09:02:31 2012
From: bruce at leapyear.org (Bruce Leban)
Date: Wed, 16 May 2012 00:02:31 -0700
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516023215.4699c0b4@bhuda.mired.org>
References: <20120516023215.4699c0b4@bhuda.mired.org>
Message-ID: 

On Tue, May 15, 2012 at 11:32 PM, Mike Meyer  wrote:

> Is there some reason that there isn't a straightforward way to get an
> element from a set without removing it? Everything I find either
> requires multiple statements or converting the set to another data
> type.
>
> It seems that some kind of get method would be useful. The argument
> that "getting an arbitrary element from a set isn't useful" is refuted
> by 1) the existence of the pop method, which does just that, and 2)
> the fact that I (and a number of other people) have run into such a
> need.
>

Your request needs clarification.  What does set.get do? What is the actual
use case? I understand what pop does: it removes and returns an arbitrary
member of the set. Therefore, if I call pop repeatedly, I eventually get
all the members. That's useful.

Here's one definition of get:

def get_from_set1(s):
    """Return an arbitrary member of a set."""
    return min(s, key=hash)


How is this useful?

Or do you mean instead: checks to see if an element is in the set and
returns it otherwise returns a default value

def get_from_set2(s, v, d=None):
    """Returns v if v is in the set, otherwise returns d."""
    return v if v in s else d

I suppose this could be useful but it's a one liner and seems much less
obvious what it does than dict.get.

Or did you mean something else?

--- Bruce
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mwm at mired.org  Wed May 16 09:10:35 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 16 May 2012 03:10:35 -0400
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516065843.GA2542@ando>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	<20120516065843.GA2542@ando>
Message-ID: <20120516031035.5974b5b3@bhuda.mired.org>

On Wed, 16 May 2012 16:58:43 +1000
Steven D'Aprano  wrote:

> On Wed, May 16, 2012 at 02:32:15AM -0400, Mike Meyer wrote:
> > Is there some reason that there isn't a straightforward way to get an
> > element from a set without removing it? Everything I find either
> > requires multiple statements or converting the set to another data
> > type. 
> > It seems that some kind of get method would be useful. The argument
> > that "getting an arbitrary element from a set isn't useful" is refuted
> > by 1) the existence of the pop method, which does just that,
> pop returns an arbitrary element, and removes it. That's a very 
> different operation to "get this element from the set".
> The problem is, if there was a set.get(x) method, you have to pass x as 
> argument, and it returns, what? x. So what's the point? You already have 
> the return value before you call the function.

I guess I should have been explicit about what I was asking about.

I'm not asking for set.get(x) that returns "this element", I'm asking
for set.get() that returns an arbitrary element, like set.pop(), but
without removing it. It doesn't even need to be the same element that
set.pop() would return.

The name is probably a poor choice, but I'm not sure what else it
should be. pop_without_remove seems a bit verbose, and implies that it
might return the element a pop would.

       		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From ncoghlan at gmail.com  Wed May 16 09:11:28 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 May 2012 17:11:28 +1000
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516023215.4699c0b4@bhuda.mired.org>
References: <20120516023215.4699c0b4@bhuda.mired.org>
Message-ID: 

On Wed, May 16, 2012 at 4:32 PM, Mike Meyer  wrote:
> Is there some reason that there isn't a straightforward way to get an
> element from a set without removing it? Everything I find either
> requires multiple statements or converting the set to another data
> type.
>
> It seems that some kind of get method would be useful. The argument
> that "getting an arbitrary element from a set isn't useful" is refuted
> by 1) the existence of the pop method, which does just that, and 2)
> the fact that I (and a number of other people) have run into such a
> need.
>
> My search for such a reason kept finding people asking how
> to get an element instead. Of course, my key words (set and get) are
> heavily overloaded.

The two primary use cases handled by the current interface are:
1. Do something for all items in the set (iteration)
2. Do something for an arbitrary item in the set, and keep track of
which items remain (set.pop)

Now, at the iterator level, it is possible to turn "do something for
all items in an iterable" to "do something for the *first* item in the
iterable" via "next(iter(obj))".

Since this use case is already covered by the iterator protocol, the
question then becomes: Is there a specific reason a dedicated
set-specific solution is needed rather than better educating people
that "the first item" is an acceptable answer when the request is for
"an arbitrary item" (this is particularly true in a world where set
ordering is randomised by default)?
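
For reference, a minimal sketch of that idiom wrapped in a helper. The
name peek and the default handling are assumptions made for this example,
not an existing or proposed API.

```python
_MISSING = object()


def peek(iterable, default=_MISSING):
    """Return the first item produced by iterating over 'iterable'.

    A container such as a set is left unchanged. If nothing is
    produced, return 'default' when given, otherwise raise ValueError.
    'peek' is a hypothetical name.
    """
    for item in iterable:
        return item
    if default is _MISSING:
        raise ValueError("peek() of an empty iterable")
    return default
```

Calling peek(s) on a non-empty set returns some member without removing
it, and works equally well for lists, dicts and other containers. Passed
an iterator rather than a container, it would consume one item.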

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From steve at pearwood.info  Wed May 16 09:26:45 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 16 May 2012 17:26:45 +1000
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516031035.5974b5b3@bhuda.mired.org>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	<20120516065843.GA2542@ando>
	<20120516031035.5974b5b3@bhuda.mired.org>
Message-ID: <20120516072645.GB2542@ando>

On Wed, May 16, 2012 at 03:10:35AM -0400, Mike Meyer wrote:

> I guess I should have been explicit about what I was asking about.

:)
 
> I'm not asking for set.get(x) that returns "this element", I'm asking
> for set.get() that returns an arbitrary element, like set.pop(), but
> without removing it. It doesn't even need to be the same element that
> set.pop() would return.

Could this helper function not do the job?

def get(s):
    x = s.pop()
    s.add(x)
    return x


Of course, this does not guarantee that repeated calls to get() won't 
return the same result over and over again. If that's unacceptable, 
you'll need to specify what behaviour is acceptable -- i.e. what your 
functional requirements are. E.g.

"I need the element to be selected at random."

"I don't need randomness, returning the elements in some arbitrary 
but deterministic order will do, with no repeats or cycles."

"I don't care whether or not there are repeats, so long as the same 
element is not returned twice in a row."

"Once I've seen every element, I expect get() to raise an exception."

etc.

And I guarantee that whatever your requirements are, other people will 
want something different.

Once you have your requirements, you can start thinking about 
implementation (e.g. how does the set remember which elements have 
already been get'ed?).


> The name is probably a poor choice, but I'm not sure what else it
> should be. pop_without_remove seems a bit verbose, and implies that it
> might return the element a pop would.

Are you suggesting that get() and pop() should not return the same 
element?


-- 
Steven


From ben+python at benfinney.id.au  Wed May 16 09:39:10 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Wed, 16 May 2012 17:39:10 +1000
Subject: [Python-ideas] get method for sets?
References: <20120516023215.4699c0b4@bhuda.mired.org>
Message-ID: <87vcjw8ti9.fsf@benfinney.id.au>

Mike Meyer  writes:

> Is there some reason that there isn't a straightforward way to get an
> element from a set without removing it? Everything I find either
> requires multiple statements or converting the set to another data
> type.

With a mapping, you use a key to get an item.

With a sequence, you have an index and get an item.

Sets are unordered collections of items without indices or keys. What
does it mean to you to "get" an item from that?

If you mean "get the items one by one", a set is an iterable::

    for item in foo_set:
        do_something_with(item)

If you mean "test whether an item is in the set", the "in" operator
works::

    if item in foo_set:
        do_something()

If you mean "get a specific item from a set", the only way to do that is
to *already have* the specific item and test whether it's in the set.

> It seems that some kind of get method would be useful. The argument
> that "getting an arbitrary element from a set isn't useful" is refuted
> by 1) the existence of the pop method, which does just that, and 2)
> the fact that I (and a number of other people) have run into such a
> need.

If by "get" you mean to get an *arbitrary* item, not a specific item,
then what's the problem? You already have "set.pop", as you point out.

What need do you have that isn't being fulfilled by the existing methods
and operators? Can you show some actual code that would be improved by a
"get" operation on sets?

-- 
 \          "It's a terrible paradox that most charities are driven by |
  `\     religious belief.… if you think altruism without Jesus is not |
_o__)          altruism, then you're a dick." -Tim Minchin, 2010-11-28 |
Ben Finney



From jeanpierreda at gmail.com  Wed May 16 09:38:57 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Wed, 16 May 2012 03:38:57 -0400
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516072645.GB2542@ando>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	<20120516065843.GA2542@ando>
	<20120516031035.5974b5b3@bhuda.mired.org> <20120516072645.GB2542@ando>
Message-ID: 

On Wed, May 16, 2012 at 3:26 AM, Steven D'Aprano  wrote:
> Are you suggesting that get() and pop() should not return the same
> element?

He is suggesting that "It doesn't even need to be the same element
that set.pop() would return."

-- Devin


From mwm at mired.org  Wed May 16 09:40:34 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 16 May 2012 03:40:34 -0400
Subject: [Python-ideas] get method for sets?
In-Reply-To: 
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
Message-ID: <20120516034034.048f2eaa@bhuda.mired.org>

On Wed, 16 May 2012 00:02:31 -0700
Bruce Leban  wrote:
> On Tue, May 15, 2012 at 11:32 PM, Mike Meyer  wrote:
> > Is there some reason that there isn't a straightforward way to get an
> > element from a set without removing it? Everything I find either
> > requires multiple statements or converting the set to another data
> > type.
> >
> > It seems that some kind of get method would be useful. The argument
> > that "getting an arbitrary element from a set isn't useful" is refuted
> > by 1) the existence of the pop method, which does just that, and 2)
> > the fact that I (and a number of other people) have run into such a
> > need.
> Your request needs clarification.  What does set.get do? What is the actual
> use case? I understand what pop does: it removes and returns an arbitrary
> member of the set. Therefore, if I call pop repeatedly, I eventually get
> all the members. That's useful.

So is just getting a single member:

> Here's one definition of get:
> def get_from_set1(s):
>     """Return an arbitrary member of a set."""
>     return min(s, key=hash)

From poking around, at least at one time the fastest implementation
was the very confusing:

def get_from_set(s):
    for x in s:
        return x

> How is this useful?

Basically, anytime you want to examine an arbitrary element of a set,
and would use pop, except you need to preserve the set for future
use. In my case, I'm running a series of tests on the set, and some
tests need an element.

Again, looking for a reason for this not existing turned up other
cases where people were wondering how to do this.

Hmm. Maybe the name should be item?

	 		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From mwm at mired.org  Wed May 16 09:52:01 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 16 May 2012 03:52:01 -0400
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516072645.GB2542@ando>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	<20120516065843.GA2542@ando>
	<20120516031035.5974b5b3@bhuda.mired.org>
	<20120516072645.GB2542@ando>
Message-ID: <20120516035201.2fb0b3f6@bhuda.mired.org>

On Wed, 16 May 2012 17:26:45 +1000
Steven D'Aprano  wrote:

> On Wed, May 16, 2012 at 03:10:35AM -0400, Mike Meyer wrote:
> > I guess I should have been explicit about what I was asking about.
> 
> :)
>  
> > I'm not asking for set.get(x) that returns "this element", I'm asking
> > for set.get() that returns an arbitrary element, like set.pop(), but
> > without removing it. It doesn't even need to be the same element that
> > set.pop() would return.
> 
> Could this helper function not do the job?
> 
> def get(s):
>     x = s.pop()
>     s.add(x)
>     return x

Sure, if you don't mind munging the set unnecessarily. That's more
readable, but slower and longer than:

def get(s):
    for x in s:
        return x

> Of course, this does not guarantee that repeated calls to get() won't 
> return the same result over and over again. If that's unacceptable, 
> you'll need to specify what behaviour is acceptable -- i.e. what your 
> functional requirements are. E.g.
> "I need the element to be selected at random."
> "I don't need randomness, returning the elements in some arbitrary 
> but deterministic order will do, with no repeats or cycles."
> "I don't care whether or not there are repeats, so long as the same 
> element is not returned twice in a row."
> "Once I've seen every element, I expect get() to raise an exception."
> etc.

My requirements are "I need an element from the set". The behavior of
repeated calls is immaterial.

> And I guarantee that whatever your requirements are, other people will 
> want something different.

That's not what I found in my google results. They were all pretty
much asking for what I was asking for, and didn't care what happened
beyond the first call.

I believe you're assuming that the purpose of this method is to start
an iteration through the set. That's not the case at all, and a single
call to pop would be perfectly acceptable, except I would then need to
put the element back.

If that were the purpose, I'd agree with you - between iteration and
pop, we've covered most of the ways you might want to iterate through
the elements. But the point isn't to iterate through the elements,
it's to examine a single element.

> Once you have your requirements, you can start thinking about 
> implementation (e.g. how does the set remember which elements have 
> already been get'ed?).

Doesn't need to, because it doesn't matter.

> > The name is probably a poor choice, but I'm not sure what else it
> > should be. pop_without_remove seems a bit verbose, and implies that it
> > might return the element a pop would.
> Are you suggesting that get() and pop() should not return the same 
> element?

I'm suggesting there is no requirement that they return the same
element.

			http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From ncoghlan at gmail.com  Wed May 16 10:08:11 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 May 2012 18:08:11 +1000
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516034034.048f2eaa@bhuda.mired.org>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516034034.048f2eaa@bhuda.mired.org>
Message-ID: 

On Wed, May 16, 2012 at 5:40 PM, Mike Meyer  wrote:
> From poking around, at least at one time the fastest implementation
> was the very confusing:
>
> def get_from_set(s):
>     for x in s:
>         return x

Why is this confusing? The operation you want to perform is "give me
an object from this set, I don't care which one".

That's not an operation that applies just to sets, you can do it with
an iterable, therefore the spelling is one that works with any
iterable: next(iter(s))

This entire thread is like asking for s.length() when len(s) already works.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From bruce at leapyear.org  Wed May 16 10:08:09 2012
From: bruce at leapyear.org (Bruce Leban)
Date: Wed, 16 May 2012 01:08:09 -0700
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516034034.048f2eaa@bhuda.mired.org>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516034034.048f2eaa@bhuda.mired.org>
Message-ID: 

On Wed, May 16, 2012 at 12:40 AM, Mike Meyer  wrote:

> On Wed, 16 May 2012 00:02:31 -0700
> Bruce Leban  wrote:
>


> > Here's one definition of get:
> > def get_from_set1(s):
> >     """Return an arbitrary member of a set."""
> >     return min(s, key=hash)
>
> From poking around, at least at one time the fastest implementation
> was the very confusing:
>
> def get_from_set(s):
>    for x in s:
>        return x
>

I didn't claim it was fast. I actually wrote that version instead of the
in/return version for a very specific reason: it always returns the same
element. (The for/in/return version might return the same element every
time too but it's not guaranteed.)
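
A small sketch of what "the same element" means here. It assumes a single
process and no hash ties among the members; the sets a and b are made up
for the example.

```python
def get_from_set1(s):
    """Return an arbitrary member of a set."""
    return min(s, key=hash)


# Two equal sets built with different insertion histories.
a = set()
for n in (10, 20, 30):
    a.add(n)

b = set()
for n in (30, 20, 10):
    b.add(n)
b.add(99)
b.discard(99)

pick_a = get_from_set1(a)
pick_b = get_from_set1(b)
```

Both picks are the member with the smallest hash, regardless of how the
set was built, whereas the element produced first by iteration depends on
the set's internal layout.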

> > How is this useful?
>
> Basically, anytime you want to examine an arbitrary element of a set,
> and would use pop, except you need to preserve the set for future
> use. In my case, I'm running a series of tests on the set, and some
> tests need an element.
>

That's bordering on tautological. It's useful anytime you need it. I don't
think your test is very good if it uses the get I wrote above. Your test
will only operate on one element of the set and it's easy to write
functions which succeed for some elements of the set and fail for others.
I'd like to see an actual test that you think needs this that would not be
improved by iterating over the set.



> Again, looking for a reason for this not existing turned up other
> cases where people were wondering how to do this.
>

Things are added to APIs and libraries because they are useful, not because
people wonder why they aren't there. set.get as you propose is not
sufficiently analogous to dict.get or list.__getitem__.

--- Bruce
Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com

From mwm at mired.org  Wed May 16 10:09:41 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 16 May 2012 04:09:41 -0400
Subject: [Python-ideas] get method for sets?
In-Reply-To: 
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
Message-ID: <20120516040941.0330975d@bhuda.mired.org>

On Wed, 16 May 2012 17:11:28 +1000
Nick Coghlan  wrote:

> On Wed, May 16, 2012 at 4:32 PM, Mike Meyer  wrote:
> > Is there some reason that there isn't a straightforward way to get an
> > element from a set without removing it? Everything I find either
> > requires multiple statements or converting the set to another data
> > type.
> > It seems that some kind of get method would be useful. The argument
> > that "getting an arbitrary element from a set isn't useful" is refuted
> > by 1) the existence of the pop method, which does just that, and 2)
> > the fact that I (and a number of other people) have run into such a
> > need.
> > My search for such a reason kept finding people asking how
> > to get an element instead. Of course, my key words (set and get) are
> > heavily overloaded.
> The two primary use cases handled by the current interface are:
> 1. Do something for all items in the set (iteration)
> 2. Do something for an arbitrary item in the set, and keep track of
> which items remain (set.pop)

Neither of which fits my use case.

> Since this use case is already covered by the iterator protocol, the
> question then becomes: Is there a specific reason a dedicated
> set-specific solution is needed rather than better educating people
> that "the first item" is an acceptable answer when the request is for
> "an arbitrary item" (this is particularly true in a world where set
> ordering is randomised by default)?

Because next(iter(s)) makes the reader wonder "Why is this iterator
being created?" It's a less expensive form of writing list(s)[0]. It's
also sufficiently non-obvious that the closest I found on google for a
discussion of the issue was the "for x in s: break" variant. Which
makes me think that at the very least, this idiom ought to be
mentioned in the documentation. Or if it's already there, then a
pointer added to the set documentation.
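
For reference, a minimal sketch of the three spellings mentioned above.
The set contents are made up for illustration; nothing here is a new
API, only existing builtins:

```python
s = {3, 1, 4}

# 1. The iterator idiom: build an iterator and take its first item.
a = next(iter(s))

# 2. The "for x in s: break" variant: stop after the first item.
for b in s:
    break

# 3. Copy into a list first (builds a throwaway list of every element).
c = list(s)[0]

# All three read the first element in iteration order,
# and the set itself is left unchanged.
print(a == b == c, len(s))
```

Which element comes back is an implementation detail of the set; the
only thing the three spellings promise is that it is a member of s.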

But my question was actually whether or not there was a reason for it
not existing. Has there been a previous discussion of this?

   		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From mwm at mired.org  Wed May 16 10:12:38 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 16 May 2012 04:12:38 -0400
Subject: [Python-ideas] get method for sets?
In-Reply-To: <87vcjw8ti9.fsf@benfinney.id.au>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	<87vcjw8ti9.fsf@benfinney.id.au>
Message-ID: <20120516041238.2ef36576@bhuda.mired.org>

On Wed, 16 May 2012 17:39:10 +1000
Ben Finney  wrote:
> If by ‘get’ you mean to get an *arbitrary* item, not a specific item,
> then what's the problem? You already have ‘set.pop’, as you point out.

And, as I also pointed out, it's not useful in the case where you need
to preserve the set for future use.

   		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From p.f.moore at gmail.com  Wed May 16 10:34:52 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 16 May 2012 09:34:52 +0100
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516040941.0330975d@bhuda.mired.org>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516040941.0330975d@bhuda.mired.org>
Message-ID: 

On 16 May 2012 09:09, Mike Meyer  wrote:
> Because next(iter(s)) makes the reader wonder "Why is this iterator
> being created?" It's a less expensive form of writing list(s)[0]. It's
> also sufficiently non-obvious that the closest I found on google for a
> discussion of the issue was the "for x in s: break" variant. Which
> makes me think that at the very least, this idiom ought to be
> mentioned in the documentation. Or if it's already there, then a
> pointer added to the set documentation.

I guess a doc patch adding a comment in the documentation of set.pop
that if you want an arbitrary element of a set *without* removing it,
then next(iter(s)) will give it to you, would be reasonable. Maybe you
could write one?

But I don't think it's particularly difficult to understand. It's very
Python-specific, sure, but it feels idiomatic to me.

Paul.


From stephen at xemacs.org  Wed May 16 10:42:49 2012
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 16 May 2012 17:42:49 +0900
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516035201.2fb0b3f6@bhuda.mired.org>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	<20120516065843.GA2542@ando>
	<20120516031035.5974b5b3@bhuda.mired.org>
	<20120516072645.GB2542@ando>
	<20120516035201.2fb0b3f6@bhuda.mired.org>
Message-ID: <87k40cqzxy.fsf@uwakimon.sk.tsukuba.ac.jp>

Mike Meyer writes:
 > On Wed, 16 May 2012 17:26:45 +1000

 > > Could this helper function not do the job?
 > > 
 > > def get(s):
 > >     x = s.pop()
 > >     s.add(x)
 > >     return x
 > 
 > Sure, if you don't mind munging the set unnecessarily. That's more
 > readable, but slower and longer than:
 > 
 > def get(s):
 >     for x in s:
 >         return x

Why would you mind munging the set temporarily?  Why is speed (of
something that almost by definition is undefined if repeated)
important?  Your example use case of testing doesn't motivate these
parts of your requirements.

I'm -1 on adding a method that has no motivation in production that I
can see.  Just redefine your get() function as a function, with a more
appropriate name such as "get_item_nondeterministically".  It will
work on any iterable.  (Don't forget to document that it will "use up"
an item if the iterable is not a sequence, though.)
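
A rough sketch of such a function, to make the "use up" caveat
concrete. The behaviour on empty input (raising ValueError) is my
assumption; nothing in this thread specifies it:

```python
def get_item_nondeterministically(iterable):
    """Return some item of `iterable`; which one is unspecified."""
    for x in iterable:
        return x
    # Assumption: an empty input is treated as an error.
    raise ValueError("iterable is empty")

# A set is left untouched.
s = {10, 20, 30}
item = get_item_nondeterministically(s)

# A one-shot iterator loses the item that was returned.
it = iter([1, 2, 3])
first = get_item_nondeterministically(it)
rest = list(it)   # only the items that were not consumed

print(item in s, len(s), first, rest)
```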


From mwm at mired.org  Wed May 16 10:46:28 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 16 May 2012 04:46:28 -0400
Subject: [Python-ideas] get method for sets?
In-Reply-To: 
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516034034.048f2eaa@bhuda.mired.org>
	
Message-ID: <20120516044628.2aa6dff9@bhuda.mired.org>

On Wed, 16 May 2012 01:08:09 -0700
Bruce Leban  wrote:
> On Wed, May 16, 2012 at 12:40 AM, Mike Meyer  wrote:
> > On Wed, 16 May 2012 00:02:31 -0700
> > Bruce Leban  wrote:

> > > Here's one definition of get:
> > > def get_from_set1(s):
> > >     """Return an arbitrary member of a set."""
> > >     return min(s, key=hash)
> >
> > From poking around, at least at one time the fastest implementation
> > was the very confusing:
> >
> > def get_from_set(s):
> >    for x in s:
> >        return x
> I didn't claim it was fast. I actually wrote that version instead of the
> in/return version for a very specific reason: it always returns the same
> element. (The for/in/return version might return the same element every
> time too but it's not guaranteed.)

I didn't ask for a get that would always return the same element.

> > Basically, anytime you want to examine an arbitrary element of a set,
> > and would use pop, except you need to preserve the set for future
> > use. In my case, I'm running a series of tests on the set, and some
> > tests need an element.
> That's bordering on tautological. It's useful anytime you need it.

What do you expect? We've got a container type that has no way to
examine an element without modifying the container or wrapping it in
an object of another type. The general use case is that
you don't need the facilities of the wrapping type (except for their
ability to provide a single element) and you don't want to modify the
container.

Would you also complain that having int accept a string value in lieu
of using eval on untrusted input is a case of "it's useful anytime you
need it."?

> I don't think your test is very good if it uses the get I wrote
> above. Your test will only operate on one element of the set and
> it's easy to write functions which succeed for some elements of the
> set and fail for others.  I'd like to see an actual test that you
> think needs this that would not be improved by iterating over the
> list.

Talk about tautologies! Of course you can write tests that will fail
in some cases. You can also write tests that won't fail for your
cases. Especially if you know something about the set beforehand.

For instance, I happen to know I have a set of ElementTree elements
that all have the same tag. I want to check the tag.
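
A sketch of that situation, using the next(iter(s)) spelling suggested
elsewhere in the thread. The XML snippet and variable names are
invented for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: three children that share one tag.
root = ET.fromstring("<root><item/><item/><item/></root>")

# Element objects hash by identity, so they can be put in a set.
elements = set(root)

# Look at one arbitrary element without removing it from the set.
tag = next(iter(elements)).tag

print(tag, len(elements))
```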

One of the test cases starts by checking to see if the set is a
singleton. Do you really propose something like:

    if len(s) == 1:
        for i in s:
            res = process(i)

or:

    if len(s) == 1:
        res = process(list(s)[0])

Or, as suggested elsewhere:

    if len(s) == 1:
        res = process(next(iter(s)))

Or, just to be really obtuse:

    if len(s) == 1:
        res = process(set(s).pop())

All of these require creating an intermediate object for the sole
purpose of getting an item out of the container without destroying the
container. This leads the reader to wonder why it was created, which
clashes with pretty much everything else in python. The only really
palatable version is sufficiently obscure that nobody else who needed
this API found it, or had it suggested to them by those responding to
their question.

> > Again, looking for a reason for this not existing turned up other
> > cases where people were wondering how to do this.
> Things are added to APIs and libraries because they are useful, not because
> people wonder why they aren't there. set.get as you propose is not
> sufficiently analogous to dict.get or list.__getitem__.

People wonder why an API is not there because they need it and can't
find it. I have a use for this API, so clearly it's useful. When I
didn't find it, I figured there might be a reason for it not existing,
so I asked the google djinn. Rather than providing a reason (djinn
being noticeably untrustworthy), it turned up other people who had the
same need as I did. I then asked this list the same question. Instead
of getting an answer to that question, I get a bunch of claims that "I
don't really need that", making me wonder if I've invoked another
djinn.

			http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From stephen at xemacs.org  Wed May 16 11:23:37 2012
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 16 May 2012 18:23:37 +0900
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516044628.2aa6dff9@bhuda.mired.org>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516034034.048f2eaa@bhuda.mired.org>
	
	<20120516044628.2aa6dff9@bhuda.mired.org>
Message-ID: <87havgqy1y.fsf@uwakimon.sk.tsukuba.ac.jp>

Mike Meyer writes:

 > of getting an answer to that question, I get a bunch of claims that "I
 > don't really need that", making me wonder if I've invoked another
 > djinn.

Indeed, you asked whether there's a reason it's not in the stdlib, and
the bunch of answers you got is that the stdlib really doesn't need it, so
it's not there.

Maybe some of them were rash enough to also claim that *you* don't
really need it, but that's sort of off topic in this thread---just
ignore them and use the recipe you find most useful!


From fuzzyman at gmail.com  Wed May 16 11:28:54 2012
From: fuzzyman at gmail.com (Michael Foord)
Date: Wed, 16 May 2012 10:28:54 +0100
Subject: [Python-ideas] input function: built-in space between string
	and user-input
In-Reply-To: 
References: 
	
Message-ID: 

On 15 May 2012 22:19, Terry Reedy  wrote:

> On 5/15/2012 6:46 AM, Suriaprakash.Mariappan at smsc.com wrote:
>
>> *_print function: built-in space between string and variable:_*
>>
>> The below python code,
>>
>> */length = 5/*
>> */print('Length is', length)/*
>>
>> gives an output of
>>
>> */Length is 5/*
>>
>
> The */.../* and *_..._* bracketing makes your post harder to read. Perhaps
> this is used in India, but not elsewhere. Omit next time.
>


They weren't present in the version I read. Probably a consequence of your
mail client not being able to display formatted emails.

Michael


>
>  Even though we have not specified a space between 'Length is' and the
>> variable length, Python puts it for us so that we get a clean nice
>> output and the program is much more readable this way (since we don't
>> need to worry about spacing in the strings we use for output). This is
>> surely an example of how Python makes life easy for the programmer.
>>
>> *_input function: built-in space between string and user-input:_*
>>
>>
>> However, the below python code,
>>
>> */guess = int(input('Enter an integer'))/*
>>
>> gives an output of
>>
>> */Enter an integer7/*
>>
>>
>> [Note: Assume 7 is entered by the user.]
>>
>> *Suggestion: *Similar to the printf function, for the input function
>>
>> also, it will be nice to have the Python put a space between string and
>> user-input, so that the output in the above case will be more readable
>> as below.
>>
>> */Enter an integer 7/*
>>
>
> print() converts objects to strings and adds separators and a terminator
> before writing to outfile.write(). In 3.x, the separator, terminator, and
> outfile can all be changed from the default. The user is stuck with the
> fact that str(obj) is what it is, so it is handy to automatically tack
> something on.
>
> input() directly writes a prompt string with sys.stdout.write.
> There is no need to augment that as the user can make the prompt string
> be whatever they want. In any case, a change would break back-compatibility.
>
> --
> Terry Jan Reedy
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>



-- 

http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

From ned at nedbatchelder.com  Wed May 16 12:27:38 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 16 May 2012 06:27:38 -0400
Subject: [Python-ideas] input function: built-in space between string
 and user-input
In-Reply-To: 
References: 
	
Message-ID: <4FB3811A.4090601@nedbatchelder.com>

On 5/15/2012 5:19 PM, Terry Reedy wrote:
> On 5/15/2012 6:46 AM, Suriaprakash.Mariappan at smsc.com wrote:
>> *_print function: built-in space between string and variable:_*
>>
>> The below python code,
>>
>> */length = 5/*
>> */print('Length is', length)/*
>>
>> gives an output of
>>
>> */Length is 5/*
>
> The */.../* and *_..._* bracketing makes your post harder to read. 
> Perhaps this is used in India, but not elsewhere. Omit next time. 

That's your mail client's rendering of bold-italic and bold-underscored 
text from the HTML version of the original email.

--Ned.


From mikegraham at gmail.com  Wed May 16 16:44:20 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Wed, 16 May 2012 10:44:20 -0400
Subject: [Python-ideas] get method for sets?
In-Reply-To: 
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516034034.048f2eaa@bhuda.mired.org>
	
Message-ID: 

On Wed, May 16, 2012 at 4:08 AM, Nick Coghlan  wrote:
> That's not an operation that applies just to sets, you can do it with
> an iterable, therefore the spelling is one that works with any
> iterable: next(iter(s))

It sounds like you're re-implementing the venerable ,= operator. ,= is
one of my favorite operators in Python.

You know,
>>> s
set([42])
>>> item ,= s
>>> item
42




;^),
Mike


From masklinn at masklinn.net  Wed May 16 16:51:09 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 16 May 2012 16:51:09 +0200
Subject: [Python-ideas] get method for sets?
In-Reply-To: 
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516034034.048f2eaa@bhuda.mired.org>
	
	
Message-ID: <9B40FD38-A488-4B48-BDEC-11A8ED25E3E7@masklinn.net>


On 2012-05-16, at 16:44 , Mike Graham wrote:

> On Wed, May 16, 2012 at 4:08 AM, Nick Coghlan  wrote:
>> That's not an operation that applies just to sets, you can do it with
>> an iterable, therefore the spelling is one that works with any
>> iterable: next(iter(s))
> 
> It sounds like you're re-implementing the venerable ,= operator. ,= is
> one of my favorite operators in Python.
> 
> You know,
> >>> s
> set([42])
> >>> item ,= s
> >>> item
> 42

With the difference that ,= also asserts there is only one item in the
iterable, where `next . iter` only does `head`.

(but the formatting as a single operator is genius, I usually write it
as `item, = s` and the lack of clarity bothers me, thanks for that
visual trick)
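
To make the difference concrete, a small sketch (the set contents are
arbitrary examples):

```python
single = {42}
item ,= single                 # same as: (item,) = single

several = {1, 2, 3}
head = next(iter(several))     # takes one element, ignores the rest

unpack_failed = False
try:
    other ,= several           # unpacking insists on exactly one element
except ValueError:
    unpack_failed = True

print(item, head in several, unpack_failed)
```

So the unpacking form doubles as a cheap singleton assertion, while
next(iter(...)) is happy with any non-empty iterable.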


From mikegraham at gmail.com  Wed May 16 16:57:55 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Wed, 16 May 2012 10:57:55 -0400
Subject: [Python-ideas] get method for sets?
In-Reply-To: <9B40FD38-A488-4B48-BDEC-11A8ED25E3E7@masklinn.net>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516034034.048f2eaa@bhuda.mired.org>
	
	
	<9B40FD38-A488-4B48-BDEC-11A8ED25E3E7@masklinn.net>
Message-ID: 

On Wed, May 16, 2012 at 10:51 AM, Masklinn  wrote:
> With the difference that ,= also asserts there is only one item in the
> iterable, where `next . iter` only does `head`.
>
> (but the formatting as a single operator is genius, I usually write it
> as `item, = s` and the lack of clarity bothers me, thanks for that
> visual trick)

1. I should have quoted Mike Meyer's code "if len(s) == 1: res =
process(next(iter(s)))", which is what I had in mind. (In that case,
it's a feature. :) )

2. I grouped it this way as a joke. If you do this, everyone will
think you're crazy. I've been known to write it (item,) = s, which
makes it a little easier to see the comma.

If you want to be unambiguous AND confuse everyone, go with

>>> s
set([42])
>>> [item] = s
>>> item
42

Mike


From bruce at leapyear.org  Wed May 16 17:06:12 2012
From: bruce at leapyear.org (Bruce Leban)
Date: Wed, 16 May 2012 08:06:12 -0700
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516044628.2aa6dff9@bhuda.mired.org>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516034034.048f2eaa@bhuda.mired.org>
	
	<20120516044628.2aa6dff9@bhuda.mired.org>
Message-ID: 

On Wed, May 16, 2012 at 1:46 AM, Mike Meyer  wrote:

> On Wed, 16 May 2012 01:08:09 -0700
> Bruce Leban  wrote:
> > On Wed, May 16, 2012 at 12:40 AM, Mike Meyer  wrote:
>
> I didn't claim it was fast. I actually wrote that version instead of the
> > in/return version for a very specific reason: it always returns the same
> > element. (The for/in/return version might return the same element every
> > time too but it's not guaranteed.)
>
> I didn't ask for a get that would always return the same element.
>

You didn't ask for a get that *didn't* always return the same element. My
deterministic version is totally compatible with your ask here and


> Would you also complain that having int accept a string value in lieu
> of using eval on untrusted input is a case of "it's useful anytime you
> need it."?
>

Not at all. It's useful because it's very common to need to convert strings
to numbers and I can show you lots of code that does just that. So we need
a method that does that safely. Does it have to be int? No; it could be
atoi or parse_int or scanf. But we do need it.

>
> > I don't think your test is very good if it uses the get I wrote
> > above. Your test will only operate on one element of the set and
> > it's easy to write functions which succeed for some elements of the
> > set and fail for others.  I'd like to see an actual test that you
> > think needs this that would not be improved by iterating over the
> > list.
>
> Talk about tautologies! Of course you can write tests that will fail
> in some cases. You can also write tests that won't fail for your
> cases. Especially if you know something about the set beforehand.
>

Not what I said. It's easy to write a *function* that fails on some
elements and your *test* won't test it. Example: a function that fails when
operating on non-integer set elements or the largest element or .... Your
test only tests one case.

>
> For instance, I happen to know I have a set of ElementTree elements
> that all have the same tag. I want to check the tag.
>

Then maybe you should be using a different data structure than a set. Maybe
set_with_same_tag that declares that constraint and can enforce the
constraint if you want.


> One of the test cases starts by checking to see if the set is a
> singleton. Do you really propose something like:
>
>    if len(s) == 1:
>        for i in s:
>            res = process(i)
>

This is a legitimate use case. I don't think it's a big deal to have to
add a one line function to your code. I might even use EAFP:

res = process(set_singleton(s))


where

def set_singleton(s):
    [result] = s
    return result
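
Spelled out a little further, as a sketch only: process() here is a
hypothetical stand-in for the real work, and falling back to None is
just one possible way to handle the non-singleton case.

```python
def set_singleton(s):
    [result] = s          # raises ValueError unless s has exactly one item
    return result

def process(x):
    # Hypothetical placeholder for whatever the caller really does.
    return x * 2

results = []
for candidate in ({21}, set(), {1, 2}):
    try:
        res = process(set_singleton(candidate))
    except ValueError:
        res = None        # empty, or more than one element
    results.append(res)

print(results)
```

The EAFP version drops the explicit len(s) == 1 check: the unpacking
itself fails for both the empty set and sets with several elements.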


--- Bruce

From ben+python at benfinney.id.au  Wed May 16 17:43:06 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 17 May 2012 01:43:06 +1000
Subject: [Python-ideas] get method for sets?
References: <20120516023215.4699c0b4@bhuda.mired.org>
	<87vcjw8ti9.fsf@benfinney.id.au>
	<20120516041238.2ef36576@bhuda.mired.org>
Message-ID: <87r4uk873p.fsf@benfinney.id.au>

Mike Meyer  writes:

> On Wed, 16 May 2012 17:39:10 +1000
> Ben Finney  wrote:
> > If by ‘get’ you mean to get an *arbitrary* item, not a specific item,
> > then what's the problem? You already have ‘set.pop’, as you point out.
>
> And, as I also pointed out, it's not useful in the case where you need
> to preserve the set for future use.

Then, if ‘item = next(iter(foo_set))’ doesn't suit you, perhaps you'd
like ‘item = set(foo_set).pop()’.

Regardless, I think you have your answer: Like most things that can
already be done by composing the existing pieces, this corner case
hasn't met the deliberately-high bar for making a special method just to
do it.

I still haven't seen you describe the use case where the existing ways
of doing this aren't good enough.

-- 
 \     “Men never do evil so completely and cheerfully as when they do |
  `\        it from religious conviction.” -Blaise Pascal (1623–1662), |
_o__)                                                   Pensées, #894. |
Ben Finney



From steve at pearwood.info  Wed May 16 18:20:19 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 17 May 2012 02:20:19 +1000
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516040941.0330975d@bhuda.mired.org>
References: <20120516023215.4699c0b4@bhuda.mired.org>	
	<20120516040941.0330975d@bhuda.mired.org>
Message-ID: <4FB3D3C3.2070502@pearwood.info>

Mike Meyer wrote:

> But my question was actually whether or not there was a reason for it
> not existing. Has there been a previous discussion of this?


Aye yai yai, have there ever.

http://mail.python.org/pipermail/python-bugs-list/2005-August/030069.html

If you have an hour or two spare, read this thread:

http://mail.python.org/pipermail/python-dev/2009-October/093227.html

By the way, I suggest that a better name than "get" is pick(), which once was 
(but no longer is) suggested by Wikipedia as a fundamental set operation.

http://en.wikipedia.org/w/index.php?title=Set_%28abstract_data_type%29&oldid=461872038#Static_sets


It seems to me that it has been removed because:

- the actual semantics of what it means to get/pick a value from
   a set are unclear; and
- few, if any, set implementations actually provide this method.

I still think your best bet is a helper function:

def pick(s):
     return next(iter(s))
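
One detail the helper leaves open is the empty set. As written,
pick(set()) raises StopIteration. A sketch of a variant with a
fallback follows; the name pick_or and its default parameter are
hypothetical, though the two-argument form of next() is a real builtin
feature:

```python
def pick(s):
    return next(iter(s))

def pick_or(s, default=None):
    # next() accepts a second argument that is returned
    # when the iterator is already exhausted.
    return next(iter(s), default)

print(pick({7}), pick_or(set(), "empty"))
```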



-- 
Steven


From steve at pearwood.info  Wed May 16 18:28:22 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 17 May 2012 02:28:22 +1000
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516044628.2aa6dff9@bhuda.mired.org>
References: <20120516023215.4699c0b4@bhuda.mired.org>		<20120516034034.048f2eaa@bhuda.mired.org>	
	<20120516044628.2aa6dff9@bhuda.mired.org>
Message-ID: <4FB3D5A6.602@pearwood.info>

Mike Meyer wrote:

> I didn't ask for a get that would always return the same element.

It seems to me that you haven't exactly been clear about what you want this 
get() method to actually do. In an earlier email, you said:

[quote]
My requirements are "I need an element from the set". The behavior of
repeated calls is immaterial.
[end quote]

So a get() which always returns the same element fits your requirements, *as 
stated*. If you have other requirements, you haven't been forthcoming with them.

I've never come across a set implementation which includes something like your 
get() method. Does anyone know any language whose set implementation has this 
functionality? Wikipedia currently suggests it isn't a natural method of sets:

     Unlike most other collection types, rather than retrieving a
     specific element from a set, one typically tests a value for
     membership in a set.

http://en.wikipedia.org/wiki/Set_%28computer_science%29

For all you say it is a common request, I don't think it's a well-thought-out 
request. It's one thing to ask "give me any element without modifying the 
set", but what does that mean exactly? Which element should it return? "Any 
element, so long as it isn't always the same element twice in a row" perhaps? 
Would flip-flopping between the first and second elements meet your requirements?

The example you give below:


> For instance, I happen to know I have a set of ElementTree elements
> that all have the same tag. I want to check the tag.
> 
> One of the test cases starts by checking to see if the set is a
> singleton. Do you really propose something like:

is too much of a special case to really matter. A set with one item avoids all 
the hard questions, since there is only one item which could be picked. It's 
the sets with two or more items that are hard. A general case get/pick method 
has to deal with the hard cases, not just the easy one-element cases.


[...]
> All of these require creating an intermediate object for the sole
> purpose of getting an item out of the container without destroying the
> container. This leads the reader to wonder why it was created, 

You've just explained why it was created -- to get an item out of the set 
without destroying it. Why is this a problem? We do something similar 
frequently, often abstracted away inside a helper function:


first = list(iterable)[0]
num_digits = len(str(some_integer))

etc.




-- 
Steven



From masklinn at masklinn.net  Wed May 16 18:43:22 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 16 May 2012 18:43:22 +0200
Subject: [Python-ideas] get method for sets?
In-Reply-To: 
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516034034.048f2eaa@bhuda.mired.org>
	
	
	<9B40FD38-A488-4B48-BDEC-11A8ED25E3E7@masklinn.net>
	
Message-ID: 

On 2012-05-16, at 16:57 , Mike Graham wrote:

> 2. I grouped it this way as a joke. If you do this, everyone will
> think you're crazy.

I don't mind, it expresses the intent clearly and looks *weird* at first
glance which is fine by me: colleagues & readers are unlikely to miss it
if they don't know the idiom.

I genuinely like it.



From pyideas at rebertia.com  Wed May 16 20:19:05 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Wed, 16 May 2012 11:19:05 -0700
Subject: [Python-ideas] get method for sets?
In-Reply-To: <4FB3D3C3.2070502@pearwood.info>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516040941.0330975d@bhuda.mired.org>
	<4FB3D3C3.2070502@pearwood.info>
Message-ID: 

On Wed, May 16, 2012 at 9:20 AM, Steven D'Aprano  wrote:
> Mike Meyer wrote:
>
>> But my question was actually whether or not there was a reason for it
>> not existing. Has there been a previous discussion of this?
>
> Aye yai yai, have there ever.
>
> http://mail.python.org/pipermail/python-bugs-list/2005-August/030069.html
>
> If you have an hour or two spare, read this thread:
>
> http://mail.python.org/pipermail/python-dev/2009-October/093227.html
>
> By the way, I suggest that a better name than "get" is pick(), which once
> was (but no longer is) suggested by Wikipedia as a fundamental set
> operation.
>
> http://en.wikipedia.org/w/index.php?title=Set_%28abstract_data_type%29&oldid=461872038#Static_sets
>
>
> It seems to me that it has been removed because:
>
> - the actual semantics of what it means to get/pick a value from
>   a set are unclear; and
> - few, if any, set implementations actually provide this method.

Objective-C's NSSet calls it "anyObject" and doesn't specify much
about it (in particular, its behavior when called repeatedly), mainly
just that "the selection is not guaranteed to be random". I haven't
poked around to see how it actually behaves in practice.

C#'s ISet has First() and Last(), but merely as extension methods.

Java, Ruby, and Haskell don't seem to include any such operation in
their generic set interfaces.

Cheers,
Chris
--
http://rebertia.com


From asampson at cs.washington.edu  Wed May 16 20:43:05 2012
From: asampson at cs.washington.edu (Adrian Sampson)
Date: Wed, 16 May 2012 11:43:05 -0700
Subject: [Python-ideas] Composability and concurrent.futures
Message-ID: <445C226E-4C9E-4D9D-A641-7FC3BEE64185@cs.washington.edu>

The concurrent.futures module in the Python standard library has problems with composability. If I start a ThreadPoolExecutor to run some library functions that internally use ThreadPoolExecutor, I will end up with many more worker threads on my system than I expect. For example, if each parallel execution wants to take full advantage of an 8-core machine, I could end up with as many as 8*8=64 competing worker threads, which could significantly hurt performance.

This is because each instance of ThreadPoolExecutor (or ProcessPoolExecutor) maintains its own independent worker pool. Especially in situations where the goal is to exploit multiple CPUs, it's essential for any thread pool implementation to globally manage contention between multiple concurrent job schedulers.
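
A small sketch of the effect, scaled down to 2*2 instead of 8*8. The function names are invented for illustration, and the barrier is only there to force every leaf task to be alive at the same moment so the thread count is reproducible:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

OUTER, INNER = 2, 2

# All leaf tasks must reach the barrier before any may finish.
barrier = threading.Barrier(OUTER * INNER, timeout=10)
inner_threads = set()
lock = threading.Lock()

def leaf(_):
    with lock:
        inner_threads.add(threading.get_ident())
    barrier.wait()

def library_call(_):
    # Stand-in for a library function that privately
    # parallelises its own work with its own pool.
    with ThreadPoolExecutor(max_workers=INNER) as inner:
        list(inner.map(leaf, range(INNER)))

with ThreadPoolExecutor(max_workers=OUTER) as outer:
    list(outer.map(library_call, range(OUTER)))

# Each outer worker started its own inner pool.
print(len(inner_threads))
```

The caller asked for OUTER workers, but OUTER * INNER leaf threads ran at once, on top of the OUTER threads that were blocked waiting for them.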

I'm not sure about the best way to address this problem, but here's one proposal: Add additional executors to the futures library. ComposableThreadPoolExecutor and ComposableProcessPoolExecutor would each use a *shared* thread-pool model. When created, these composable executors will check to see if they are being created within a future worker thread/process initiated by another composable executor. If so, the "child" executor will forward all submitted jobs to the executor in the parent thread/process. Otherwise, it will behave normally, starting up its own worker pool.

Has anyone else dealt with composition problems in parallel programs? What do you think of this solution -- is there a better way to tackle this deficiency?

Adrian



From masklinn at masklinn.net  Wed May 16 21:21:08 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 16 May 2012 21:21:08 +0200
Subject: [Python-ideas] get method for sets?
In-Reply-To: 
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516040941.0330975d@bhuda.mired.org>
	<4FB3D3C3.2070502@pearwood.info>
	
Message-ID: <32F30142-0503-453A-BC55-90DC763E937C@masklinn.net>

On 2012-05-16, at 20:19 , Chris Rebert wrote:

> Objective-C's NSSet calls it "anyObject" and doesn't specify much
> about it (in particular, its behavior when called repeatedly), mainly
> just that "the selection is not guaranteed to be random". I haven't
> poked around to see how it actually behaves in practice.

It takes the first object it finds which is pretty much solely a
property of how it stores its items, its behavior translated into Python
code is precisely:

    next(iter(set))

or

    list(set)[0]

I just tested creating a few thousand sets and filling them with random
integer values (using arc4random(3)) and never once did [set anyObject]
differ from [[set allObjects] objectAtIndex:0] or from
[[set objectEnumerator] nextObject].


From mwm at mired.org  Wed May 16 22:00:33 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 16 May 2012 16:00:33 -0400
Subject: [Python-ideas] get method for sets?
In-Reply-To: <87r4uk873p.fsf@benfinney.id.au>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	<87vcjw8ti9.fsf@benfinney.id.au>
	<20120516041238.2ef36576@bhuda.mired.org>
	<87r4uk873p.fsf@benfinney.id.au>
Message-ID: <20120516160033.7245ce3f@bhuda.mired.org>

On Thu, 17 May 2012 01:43:06 +1000
Ben Finney  wrote:
> Mike Meyer  writes:
> > On Wed, 16 May 2012 17:39:10 +1000
> > Ben Finney  wrote:
> > > If by ‘get’ you mean to get an *arbitrary* item, not a specific item,
> > > then what's the problem? You already have ‘set.pop’, as you point out.
> > And, as I also pointed out, it's not useful in the case where you need
> > to preserve the set for future use.
> Then, if ‘item = next(iter(foo_set))’ doesn't suit you, perhaps you'd
> like ‘item = set(foo_set).pop()’.

This is precisely what bugs me about this case. There's not one
obvious way to do it. There's a collection of ways that are all in
some ways/cases problematical. They all involve creating a scratch
object from the set and using its API to (possibly destructively) get
the one value that's wanted. In a way, it reminds me of the
discussions that eventually led to the if else expression being added.

> Regardless, I think you have your answer: Like most things that can
> already be done by composing the existing pieces, this corner case
> hasn't met the deliberately-high bar for making a special method just to
> do it.

And it's already been discussed to death. *That's* what I was trying
to find out. If it hadn't been, I'd have put together a serious
proposal.

> I still haven't seen you describe the use case where the existing ways
> of doing this aren't good enough.

That, of course, is a subjective judgment.

      		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From yselivanov.ml at gmail.com  Wed May 16 22:10:00 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 16 May 2012 16:10:00 -0400
Subject: [Python-ideas] get method for sets?
In-Reply-To: <9B40FD38-A488-4B48-BDEC-11A8ED25E3E7@masklinn.net>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516034034.048f2eaa@bhuda.mired.org>
	
	
	<9B40FD38-A488-4B48-BDEC-11A8ED25E3E7@masklinn.net>
Message-ID: <8609D04B-A4CC-44D7-9420-350857523CDC@gmail.com>

On 2012-05-16, at 10:51 AM, Masklinn wrote:
> With the difference that ,= also asserts there is only one item in the
> iterable, where `next . iter` only does `head`.

For that, the ,*_= operator exists ;)

>>> a = '123'
>>> b ,*_= a
>>> b
'1'

I hope I'll never encounter this, though.

-
Yury
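
A minimal sketch of what the starred form costs, assuming Python 3 (the
names `rest` and `first` are made up for illustration): the starred
target collects every remaining item into a list, whereas next(iter(...))
stops after the first one.

```python
a = "123"

# Starred unpacking takes the first item and stores all the others in a list.
b, *rest = a
print(b, rest)

# next(iter(...)) looks only at the first item and builds nothing else.
first = next(iter(a))
print(first)
```

For a large set the starred form therefore copies nearly the whole
collection just to read one element.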


From grosser.meister.morti at gmx.net  Wed May 16 22:16:25 2012
From: grosser.meister.morti at gmx.net (=?UTF-8?B?TWF0aGlhcyBQYW56ZW5iw7Zjaw==?=)
Date: Wed, 16 May 2012 22:16:25 +0200
Subject: [Python-ideas] get method for sets?
In-Reply-To: 
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516040941.0330975d@bhuda.mired.org>
	<4FB3D3C3.2070502@pearwood.info>
	
Message-ID: <4FB40B19.3030604@gmx.net>

On 05/16/2012 08:19 PM, Chris Rebert wrote:
> On Wed, May 16, 2012 at 9:20 AM, Steven D'Aprano  wrote:
>> Mike Meyer wrote:
>>
>>> But my question was actually whether or not there was a reason for it
>>> not existing. Has there been a previous discussion of this?
>>
>> Aye yai yai, have there ever.
>>
>> http://mail.python.org/pipermail/python-bugs-list/2005-August/030069.html
>>
>> If you have an hour or two spare, read this thread:
>>
>> http://mail.python.org/pipermail/python-dev/2009-October/093227.html
>>
>> By the way, I suggest that a better name than "get" is pick(), which once
>> was (but no longer is) suggested by Wikipedia as a fundamental set
>> operation.
>>
>> http://en.wikipedia.org/w/index.php?title=Set_%28abstract_data_type%29&oldid=461872038#Static_sets
>>
>>
>> It seems to me that it has been removed because:
>>
>> - the actual semantics of what it means to get/pick a value from
>>   a set are unclear; and
>> - few, if any, set implementations actually provide this method.
>
> Objective-C's NSSet calls it "anyObject" and doesn't specify much
> about it (in particular, its behavior when called repeatedly), mainly
> just that "the selection is not guaranteed to be random". I haven't
> poked around to see how it actually behaves in practice.
>
> C#'s ISet has First() and Last(), but merely as extension methods.
>
> Java, Ruby, and Haskell don't seem to include any such operation in
> their generic set interfaces.
>

Ruby's Set has first() but not last(). That said, I'm -1 on a get/pick method for a set.

> Cheers,
> Chris
> --
> http://rebertia.com
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas



From grosser.meister.morti at gmx.net  Wed May 16 22:28:13 2012
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Wed, 16 May 2012 22:28:13 +0200
Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins`
In-Reply-To: <20120509184856.GC3133@bagheera>
References: <20120509184856.GC3133@bagheera>
Message-ID: <4FB40DDD.6090702@gmx.net>

If you look at the __future__ stuff one might get the idea to reverse it:


try:
	from __past__ import unicode_literals, future_builtins
except ImportError:
	pass

On 05/09/2012 08:48 PM, Sven Marnach wrote:
> With the reintroduction of u"Unicode literals", Python 3.3 will remove
> one of the major stumbling stones for supporting Python 2.x and 3.3
> within the same code base.  Another rather trivial stumbling stone
> could be removed by adding the alias `future_builtins` for the
> `builtins` module.  Currently, you need to use a try/except block,
> which isn't too bad, but I think it would be nicer if a line like
>
>      from future_builtins import map
>
> continues to work, just like __future__ imports continue to work.  I
> think the above actually *is* a kind of __future__ import which just
> happens to be in a regular module because it doesn't need any special
> compiler support.
>
> I know a few module names changed and some modules have been
> reorganised to packages, so you will still need try/except blocks for
> other imports.  However, I think `future_builtins` is special because
> its sole raison d'être is forward-compatibility and because of the
> analogy with `__future__`.
>
> Cheers,
>      Sven
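
The try/except block Sven refers to might look like the sketch below.
It assumes only that `future_builtins` exists on Python 2.6/2.7 and is
absent on Python 3; the `squares` example is invented for illustration.

```python
try:
    # Python 2.6/2.7: swap in the iterator-returning versions.
    from future_builtins import map, filter, zip
except ImportError:
    # Python 3: no such module, and the builtins already behave this way.
    pass

squares = map(lambda n: n * n, [1, 2, 3])
print(list(squares))
```

With the proposed alias, the bare import line would be enough on both
versions.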



From tjreedy at udel.edu  Thu May 17 00:41:46 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 16 May 2012 18:41:46 -0400
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120516040941.0330975d@bhuda.mired.org>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516040941.0330975d@bhuda.mired.org>
Message-ID: 

On 5/16/2012 4:09 AM, Mike Meyer wrote:

> Because next(iter(s)) makes the reader wonder "Why is this iterator
> being created?"

If s is a non-iterator iterable, to get at the contents of s 
non-destructively.

> makes me think that at the very least, this idiom ought to be
> mentioned in the documentation.

http://bugs.python.org/issue14836

> But my question was actually whether or not there was a reason for it
> not existing.

Because not all simple compositions need to be added to the stdlib and 
builtins.

-- 
Terry Jan Reedy



From tjreedy at udel.edu  Thu May 17 00:53:33 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 16 May 2012 18:53:33 -0400
Subject: [Python-ideas] input function: built-in space between string
	and user-input
In-Reply-To: <4FB3811A.4090601@nedbatchelder.com>
References: 
	 <4FB3811A.4090601@nedbatchelder.com>
Message-ID: 

On 5/16/2012 6:27 AM, Ned Batchelder wrote:
> On 5/15/2012 5:19 PM, Terry Reedy wrote:
>> On 5/15/2012 6:46 AM,
>> Suriaprakash.Mariappan at smsc.com wrote:
>>> *_print function: built-in space between string and variable:_*
>>>
>>> The below python code,
>>>
>>> */length = 5/*
>>> */print('Length is', length)/*
>>>
>>> gives an output of
>>>
>>> */Length is 5/*
>>
>> The */.../* and *_..._* bracketing makes your post harder to read.
>> Perhaps this is used in India, but not elsewhere. Omit next time.
>
> That's your mail client's rendering of bold-italic and bold-underscored
> text from the HTML version of the original email.

I am reading the Gmane newsgroup mirror with Thunderbird. I have not 
seen it do anything similar with other mixed text/plain and text/html 
messages. So let me re-phrase my advice.

"The Python mailing lists and newsgroups are, as usual, intended for 
plain text. Posting html or plaintext and html can have strange and 
unpredictable effects with various mail and news readers. So if you want 
people to see what you send, just use plain text, without tab characters."

-- 
Terry Jan Reedy



From ethan at stoneleaf.us  Thu May 17 01:08:59 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 16 May 2012 16:08:59 -0700
Subject: [Python-ideas] get method for sets?
In-Reply-To: <4FB3D3C3.2070502@pearwood.info>
References: <20120516023215.4699c0b4@bhuda.mired.org>		<20120516040941.0330975d@bhuda.mired.org>
	<4FB3D3C3.2070502@pearwood.info>
Message-ID: <4FB4338B.9070109@stoneleaf.us>

Steven D'Aprano wrote:
> Mike Meyer wrote:
> 
>> But my question was actually whether or not there was a reason for it
>> not existing. Has there been a previous discussion of this?
> 
> 
> Aye yai yai, have there ever.
> 
> http://mail.python.org/pipermail/python-bugs-list/2005-August/030069.html
> 
> If you have an hour or two spare, read this thread:
> 
> http://mail.python.org/pipermail/python-dev/2009-October/093227.html
> 
> By the way, I suggest that a better name than "get" is pick(), which 
> once was (but no longer is) suggested by Wikipedia as a fundamental set 
> operation.
> 
> http://en.wikipedia.org/w/index.php?title=Set_%28abstract_data_type%29&oldid=461872038#Static_sets 
> 
> 
> 
> It seems to me that it has been removed because:
> 
> - the actual semantics of what it means to get/pick a value from
>   a set are unclear; and
> - few, if any, set implementations actually provide this method.
> 
> I still think your best bet is a helper function:
> 
> def pick(s):
>     return next(iter(s))


Don't forget the doc string!

     "returns an arbitrary element from set s"

~Ethan~
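
Put together, the helper with its docstring could be sketched as
follows (the `colours` set is just sample data). Note that, as written,
it raises StopIteration rather than KeyError when the set is empty.

```python
def pick(s):
    """Return an arbitrary element from set s without modifying it."""
    return next(iter(s))


colours = {"red", "green", "blue"}
item = pick(colours)

# The element came from the set, and the set still has all three members.
print(item in colours, len(colours))
```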


From greg.ewing at canterbury.ac.nz  Thu May 17 01:51:49 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 17 May 2012 11:51:49 +1200
Subject: [Python-ideas] get method for sets?
In-Reply-To: 
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516034034.048f2eaa@bhuda.mired.org>
	
	<20120516044628.2aa6dff9@bhuda.mired.org>
	
Message-ID: <4FB43D95.4080801@canterbury.ac.nz>

Bruce Leban wrote:
> Then maybe you should be using a different data structure than a set. 
> Maybe set_with_same_tag that declares that constraint and can enforce 
> the constraint if you want.

Another class of use cases is where you know that the
set contains only one element, and you want to find out
what that element is.

I encountered one of these in my recent PyWeek game
entry. I have a set of selected units, and commands
that can be applied to them. Some commands can only
be used on a single unit at a time, so there are places
in the code where there can only be one element in the
set.

Using a separate SetWithOnlyOneElement type in that
case would be tedious and unnecessary. I don't need to
enforce the constraint; I know it's satisfied because
I wouldn't have ended up at that point in the code if
it wasn't.
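
For this single-element case, tuple unpacking already does the job and
checks the assumption for free. A small sketch, with `sole_unit` and
the unit names being hypothetical:

```python
def sole_unit(selected_units):
    """Return the only selected unit (hypothetical helper)."""
    # Unpacking raises ValueError unless there is exactly one element.
    (unit,) = selected_units
    return unit


print(sole_unit({"scout"}))
```

If the "can only be one element" invariant is ever violated, this fails
loudly instead of silently picking one.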

-- 
Greg


From greg.ewing at canterbury.ac.nz  Thu May 17 01:56:51 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 17 May 2012 11:56:51 +1200
Subject: [Python-ideas] get method for sets?
In-Reply-To: <4FB3D3C3.2070502@pearwood.info>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516040941.0330975d@bhuda.mired.org>
	<4FB3D3C3.2070502@pearwood.info>
Message-ID: <4FB43EC3.4090000@canterbury.ac.nz>

Steven D'Aprano wrote:

> By the way, I suggest that a better name than "get" is pick()

I was going to suggest peek(), which is more suggestive
of a non-modifying function.

-- 
Greg


From cs at zip.com.au  Thu May 17 02:56:29 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Thu, 17 May 2012 10:56:29 +1000
Subject: [Python-ideas] get method for sets?
In-Reply-To: <87k40cqzxy.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <87k40cqzxy.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20120517005629.GA28044@cskk.homeip.net>

On 16May2012 17:42, Stephen J. Turnbull  wrote:
| Mike Meyer writes:
|  > On Wed, 16 May 2012 17:26:45 +1000
| 
|  > > Could this helper function not do the job?
|  > > 
|  > > def get(s):
|  > >     x = s.pop()
|  > >     s.add(x)
|  > >     return x
|  > 
|  > Sure, if you don't mind munging the set unnecessarily. That's more
|  > readable, but slower and longer than:
|  > 
|  > def get(s):
|  >     for x in s:
|  >         return x

I was about to suggest Mike's implementation.

| Why would you mind munging the set temporarily?

Personally, I work with multiple threads quite often. Therefore I
habitually avoid data structure modifying operations unless they're
necessary. Any time I modify a data structure is a time I have to worry
about shared access.

| Why is speed (of
| something that almost by definition is undefined if repeated)
| important?

Besides, modifying a data structure _is_ slower than just looking,
usually. There may even be garbage collection:-(

| I'm -1 on adding a method that has no motivation in production that I
| can see.  Just redefine your get() function as a function, with a more
| appropriate name such as "get_item_nondeterministically".  It will
| work on any iterable.  (Don't forget to document that it will "use up"
| an item if the iterable is not a sequence, though.)

Yah:

  def an(s):
    for i in s:
      return i

I'm also -1 on a set _method_, though he can always subclass and add his
own for his use case.

Cheers,
-- 
Cameron Simpson  DoD#743
http://www.cskk.ezoshosting.com/cs/

If your new theorem can be stated with great simplicity, then there
will exist a pathological exception.    - Adrian Mathesis


From greg.ewing at canterbury.ac.nz  Thu May 17 03:51:04 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 17 May 2012 13:51:04 +1200
Subject: [Python-ideas] get method for sets?
In-Reply-To: <4FB3D5A6.602@pearwood.info>
References: <20120516023215.4699c0b4@bhuda.mired.org>
	
	<20120516034034.048f2eaa@bhuda.mired.org>
	
	<20120516044628.2aa6dff9@bhuda.mired.org> <4FB3D5A6.602@pearwood.info>
Message-ID: <4FB45988.704@canterbury.ac.nz>

On 17/05/12 04:28, Steven D'Aprano wrote:
> Which element should it return? "Any
> element, so long as it isn't always the same element twice in a row" perhaps?
> Would flip-flopping between the first and second elements meet your requirements?

It might be useful to have a method specified as returning the
same element that a subsequent pop() would return. Then it could
be used as a look-ahead for an algorithm involving a pop-loop,
or for anything with more liberal requirements.

-- 
Greg


From greg.ewing at canterbury.ac.nz  Thu May 17 04:00:39 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 17 May 2012 14:00:39 +1200
Subject: [Python-ideas] Add `future_builtins` as an alias for `builtins`
In-Reply-To: <4FB40DDD.6090702@gmx.net>
References: <20120509184856.GC3133@bagheera> <4FB40DDD.6090702@gmx.net>
Message-ID: <4FB45BC7.20509@canterbury.ac.nz>

On 17/05/12 08:28, Mathias Panzenböck wrote:
> from __past__ import unicode_literals, future_builtins

I seem to remember Guido declaring ages ago that there would never
be any imports from the past. So the past import feature would first
have to be imported from a reality where he hadn't made that decision.

    from __alternatetimeline__ import __past__
    from __past__ import unicode_literals, future_builtins

-- 
Greg


From paul.dubois at gmail.com  Thu May 17 04:19:50 2012
From: paul.dubois at gmail.com (Paul Du Bois)
Date: Wed, 16 May 2012 19:19:50 -0700
Subject: [Python-ideas] get method for sets?
In-Reply-To: <20120517005629.GA28044@cskk.homeip.net>
References: <87k40cqzxy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20120517005629.GA28044@cskk.homeip.net>
Message-ID: 

> | Mike Meyer writes:
> |  > def get(s):
> |  >     for x in s:
> |  >         return x

On Wed, May 16, 2012 at 5:56 PM, Cameron Simpson  wrote:
>  def an(s):
>    for i in s:
>      return i

Normally I'm content to lurk, but this thread has been going on for a
long time without anyone pointing out that the "for" loop idiom needs
an "else: raise KeyError" in order to act pythonically.

p


From greg.ewing at canterbury.ac.nz  Thu May 17 05:03:11 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 17 May 2012 15:03:11 +1200
Subject: [Python-ideas] get method for sets?
In-Reply-To: 
References: <87k40cqzxy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20120517005629.GA28044@cskk.homeip.net>
	
Message-ID: <4FB46A6F.4050200@canterbury.ac.nz>

On 17/05/12 14:19, Paul Du Bois wrote:

> On Wed, May 16, 2012 at 5:56 PM, Cameron Simpson  wrote:
>> >  def an(s):
>> >    for i in s:
>> >      return i

> Normally I'm content to lurk, but this thread has been going on for a
> long time without anyone pointing out that the "for" loop idiom needs
> an "else: raise KeyError" in order to act pythonically.

That depends on what result you want in the empty set case. If
returning None is okay, or you know the set can never be empty,
then it's fine as written.

-- 
Greg


From mwm at mired.org  Thu May 17 05:49:01 2012
From: mwm at mired.org (Mike Meyer)
Date: Wed, 16 May 2012 23:49:01 -0400
Subject: [Python-ideas] get method for sets?
In-Reply-To: <4FB46A6F.4050200@canterbury.ac.nz>
References: <87k40cqzxy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20120517005629.GA28044@cskk.homeip.net>
	
	<4FB46A6F.4050200@canterbury.ac.nz>
Message-ID: <20120516234901.6ffb3b1c@bhuda.mired.org>

On Thu, 17 May 2012 15:03:11 +1200
Greg Ewing  wrote:
> On 17/05/12 14:19, Paul Du Bois wrote:
> > On Wed, May 16, 2012 at 5:56 PM, Cameron Simpson  wrote:
> >> >  def an(s):
> >> >    for i in s:
> >> >      return i
> > Normally I'm content to lurk, but this thread has been going on for a
> > long time without anyone pointing out that the "for" loop idiom needs
> > an "else: raise KeyError" in order to act pythonically.
> That depends on what result you want in the empty set case. If
> returning None is okay, or you know the set can never be empty,
> then it's fine as written.

Raising KeyError is probably best, as that parallels "pop". In fact,
it would be required for the proposed "peek" method that returns what
"pop" would have returned at that point.

That method would not only have satisfied all 2.5 of the use cases I
had, but would probably be useful for algorithms that want a
conditional pop.
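
A sketch of the loop idiom with the KeyError behaviour discussed above.
The name `peek` follows the proposal, but this helper does NOT promise
to return the same element a later pop() would; that guarantee would
need support inside the set type itself.

```python
def peek(s):
    """Return an arbitrary element of s without removing it."""
    for item in s:
        return item
    else:
        # Reached only when the loop body never ran, i.e. s is empty.
        raise KeyError("peek from an empty set")


print(peek({5}))
```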

	    		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From ethan at stoneleaf.us  Thu May 17 17:10:40 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 17 May 2012 08:10:40 -0700
Subject: [Python-ideas] weakrefs
Message-ID: <4FB514F0.6000403@stoneleaf.us>

 From the manual [8.11]:

 > A weak reference to an object is not enough to keep the object alive:
 > when the only remaining references to a referent are weak references,
 > garbage collection is free to destroy the referent and reuse its
 > memory for something else.

This leads to a difference in behaviour between CPython and the other 
implementations:  CPython will (currently) immediately destroy any 
objects that only have weak references to them with the result that 
trying to access said object will require making a new one; other 
implementations (at least PyPy, and presumably the others that don't use 
ref-count gc's) can "reach into the grave" and pull back objects that 
don't have any strong references left.

I would like to have the guarantees for weakrefs strengthened such that 
any weakref'ed object that has no strong references left will return 
None instead of the object, even if the object has not yet been garbage 
collected.

Without this stronger guarantee programs that are relying on weakrefs to 
disappear when strong refs are gone end up relying on the gc method 
instead, with the result that the program behaves differently on 
different implementations.

~Ethan~


From solipsis at pitrou.net  Thu May 17 17:44:29 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 17 May 2012 17:44:29 +0200
Subject: [Python-ideas] weakrefs
References: <4FB514F0.6000403@stoneleaf.us>
Message-ID: <20120517174429.75965d06@pitrou.net>

On Thu, 17 May 2012 08:10:40 -0700
Ethan Furman  wrote:
>  From the manual [8.11]:
> 
>  > A weak reference to an object is not enough to keep the object alive:
>  > when the only remaining references to a referent are weak references,
>  > garbage collection is free to destroy the referent and reuse its
>  > memory for something else.
> 
> This leads to a difference in behaviour between CPython and the other 
> implementations:  CPython will (currently) immediately destroy any 
> objects that only have weak references to them with the result that 
> trying to access said object will require making a new one;

This is only true if the object isn't caught in a reference cycle.

> Without this stronger guarantee programs that are relying on weakrefs to 
> disappear when strong refs are gone end up relying on the gc method 
> instead, with the result that the program behaves differently on 
> different implementations.

Why would they "rely on weakrefs to disappear when strong refs are
gone"? What is the use case?

Regards

Antoine.




From ckaynor at zindagigames.com  Thu May 17 19:13:15 2012
From: ckaynor at zindagigames.com (Chris Kaynor)
Date: Thu, 17 May 2012 10:13:15 -0700
Subject: [Python-ideas] weakrefs
In-Reply-To: <20120517174429.75965d06@pitrou.net>
References: <4FB514F0.6000403@stoneleaf.us>
	<20120517174429.75965d06@pitrou.net>
Message-ID: 

On Thu, May 17, 2012 at 8:44 AM, Antoine Pitrou  wrote:

> On Thu, 17 May 2012 08:10:40 -0700
> Ethan Furman  wrote:
> >  From the manual [8.11]:
> >
> >  > A weak reference to an object is not enough to keep the object alive:
> >  > when the only remaining references to a referent are weak references,
> >  > garbage collection is free to destroy the referent and reuse its
> >  > memory for something else.
> >
> > This leads to a difference in behaviour between CPython and the other
> > implementations:  CPython will (currently) immediately destroy any
> > objects that only have weak references to them with the result that
> > trying to access said object will require making a new one;
>
> This is only true if the object isn't caught in a reference cycle.


To further this, consider the following example, run in CPython 2.6:

>>> import weakref
>>> import gc
>>>
>>> class O(object):
...     pass
...
>>> a = O()
>>> b = O()
>>> a.x = b
>>> b.x = a
>>>
>>> w = weakref.ref(a)
>>>
>>>
>>> del a, b
>>>
>>> print w()
<__main__.O object at 0x0000000003C78B38>
>>>
>>> gc.collect()
20
>>>
>>> print w()
None
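
The same experiment as a script, sketched for Python 3. The class name
`Node` is arbitrary, and automatic collection is switched off only so
that nothing runs between the two checks. The first result is a
CPython observation, not something the language guarantees.

```python
import gc
import weakref


class Node:
    pass


gc.disable()  # keep the cyclic collector from running on its own
try:
    a, b = Node(), Node()
    a.other, b.other = b, a      # reference cycle: a -> b -> a
    w = weakref.ref(a)
    del a, b                     # no strong names left, but the cycle remains

    alive_before = w() is not None   # on CPython the cycle keeps it alive
    gc.collect()                     # the cycle collector breaks the cycle
    alive_after = w() is not None
finally:
    gc.enable()

print(alive_before, alive_after)
```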



From greg.ewing at canterbury.ac.nz  Fri May 18 00:49:05 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 18 May 2012 10:49:05 +1200
Subject: [Python-ideas] weakrefs
In-Reply-To: <4FB514F0.6000403@stoneleaf.us>
References: <4FB514F0.6000403@stoneleaf.us>
Message-ID: <4FB58061.6050809@canterbury.ac.nz>

Ethan Furman wrote:
> I would like to have the guarantees for weakrefs strengthened such that 
> any weakref'ed object that has no strong references left will return 
> None instead of the object, even if the object has not yet been garbage 
> collected.

Why do you want this guarantee? It would complicate
implementations for which ref counting is not the
native method of managing memory.

-- 
Greg


From ethan at stoneleaf.us  Fri May 18 18:08:48 2012
From: ethan at stoneleaf.us (stoneleaf)
Date: Fri, 18 May 2012 09:08:48 -0700 (PDT)
Subject: [Python-ideas] weakrefs
In-Reply-To: <4FB514F0.6000403@stoneleaf.us>
References: <4FB514F0.6000403@stoneleaf.us>
Message-ID: <551680bb-b954-4355-82cd-a0e373991512@nl1g2000pbc.googlegroups.com>



On May 17, 8:10 am, Ethan Furman wrote:
> From the manual [8.11]:
>
>> A weak reference to an object is not enough to keep the object alive:
>> when the only remaining references to a referent are weak references,
>> garbage collection is free to destroy the referent and reuse its
>> memory for something else.
>
> This leads to a difference in behaviour between CPython and the other
> implementations:  CPython will (currently) immediately destroy any
> objects that only have weak references to them with the result that
> trying to access said object will require making a new one; other
> implementations (at least PyPy, and presumably the others that don't use
> ref-count gc's) can "reach into the grave" and pull back objects that
> don't have any strong references left.

Antoine Pitrou wrote:
> This is only true if the object isn't caught in a reference cycle.

Good point -- so I would also like the proposed change in CPython as
well.


Ethan Furman wrote:
> I would like to have the guarantees for weakrefs strengthened such that
> any weakref'ed object that has no strong references left will return
> None instead of the object, even if the object has not yet been garbage
> collected.
>
> Without this stronger guarantee programs that are relying on weakrefs to
> disappear when strong refs are gone end up relying on the gc method
> instead, with the result that the program behaves differently on
> different implementations.

Antoine Pitrou wrote:
> Why would they "rely on weakrefs to disappear when strong refs are
> gone"? What is the use case?

Greg Ewing wrote:
> Why do you want this guarantee? It would complicate
> implementations for which ref counting is not the
> native method of managing memory.

My dbf module provides direct access to dbf files.  A retrieved record
is a singleton object, and allows temporary changes that are not
written to disk.  Whether those changes are seen by the next
incarnation depends on (I had thought) whether or not the record with
the unwritten changes has gone out of scope.

I see two questions that determine whether this change should be made:

  1) How difficult it would be for the non-ref counting
     implementations to implement

  2) Whether it's appropriate to have objects be changed, but not
     saved, and then discarded when the strong references are gone so
     the next incarnation doesn't see the changes, even if the object
     hasn't been destroyed yet.

~Ethan~

FYI:  For dbf I am going to disallow temporary changes so this won't
be an immediate issue for me.


From masklinn at masklinn.net  Fri May 18 18:38:00 2012
From: masklinn at masklinn.net (Masklinn)
Date: Fri, 18 May 2012 18:38:00 +0200
Subject: [Python-ideas] weakrefs
In-Reply-To: <551680bb-b954-4355-82cd-a0e373991512@nl1g2000pbc.googlegroups.com>
References: <4FB514F0.6000403@stoneleaf.us>
	<551680bb-b954-4355-82cd-a0e373991512@nl1g2000pbc.googlegroups.com>
Message-ID: <0A5D8A72-14E3-4614-8720-8070FEBAD28F@masklinn.net>


On 2012-05-18, at 18:08 , stoneleaf wrote:
> 
> My dbf module provides direct access to dbf files.  A retrieved record
> is a singleton object, and allows temporary changes that are not
> written to disk.  Whether those changes are seen by the next
> incarnation depends on (I had thought) whether or not the record with
> the unwritten changes has gone out of scope.

If a record is a singleton, that singleton-ification would be handled
through weakrefs would it not?

In that case, until the GC is triggered (and the weakref is
invalidated), you will keep getting your initial singleton and there
will be no "next record", I fail to see why that would be an issue.
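
A rough sketch of that kind of weakref-backed singleton cache. The
names `Record`, `get_record` and `_live_records` are invented here and
are not the dbf module's actual API.

```python
import weakref


class Record:
    def __init__(self, number):
        self.number = number
        self.fields = {}


# Maps record number -> live Record; entries vanish once the Record is collected.
_live_records = weakref.WeakValueDictionary()


def get_record(number):
    """Return the one live Record for this number, creating it if needed."""
    record = _live_records.get(number)
    if record is None:
        record = Record(number)
        _live_records[number] = record
    return record


r = get_record(1)
r.fields["name"] = "unsaved edit"
print(get_record(1) is r)
```

While `r` is held, every lookup returns the same object, unsaved edit
included. What happens after the last strong reference is dropped is
exactly the implementation-dependent part under discussion.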


> I see two questions that determine whether this change should be made:
> 
>  1) How difficult it would be for the non-ref counting
>     implementations to implement
> 

Pretty much impossible I'd expect, the weakrefs can only be broken on GC
runs (at object deallocation) and that is generally non-deterministic
without specifying precisely which type of GC implementation is used.
You'd need a fully deterministic deallocation model to ensure a weakref
is broken as soon as the corresponding object has no outstanding strong
(and soft, in some VMs like the JVM) reference.

>  2) Whether it's appropriate to have objects be changed, but not
>     saved, and then discarded when the strong references are gone so
>     the next incarnation doesn't see the changes, even if the object
>     hasn't been destroyed yet.

If your saves are synchronized with the weakref being broken (the object
being *effectively* collected) and the singleton behavior is as well,
there will be no difference, I'm not sure what the issue would be, you
might just have a second change cycle using the same unsaved (but still
modified) object.

Although frankly speaking such reliance on non-deterministic events would
scare the shit out of me.


From ethan at stoneleaf.us  Sat May 19 04:54:08 2012
From: ethan at stoneleaf.us (stoneleaf)
Date: Fri, 18 May 2012 19:54:08 -0700 (PDT)
Subject: [Python-ideas] weakrefs
In-Reply-To: <0A5D8A72-14E3-4614-8720-8070FEBAD28F@masklinn.net>
References: <4FB514F0.6000403@stoneleaf.us>
	<551680bb-b954-4355-82cd-a0e373991512@nl1g2000pbc.googlegroups.com>
	<0A5D8A72-14E3-4614-8720-8070FEBAD28F@masklinn.net>
Message-ID: <1b227cd1-3c22-4661-9b24-3edc16a580dd@st3g2000pbc.googlegroups.com>



On May 18, 9:38 am, Masklinn wrote:
> On 2012-05-18, at 18:08 , stoneleaf wrote:
>> My dbf module provides direct access to dbf files.  A retrieved record
>> is a singleton object, and allows temporary changes that are not
>> written to disk.  Whether those changes are seen by the next
>> incarnation depends on (I had thought) whether or not the record with
>> the unwritten changes has gone out of scope.
>
> If a record is a singleton, that singleton-ification would be handled
> through weakrefs would it not?

Indeed, that is the current behavior.

> In that case, until the GC is triggered (and the weakref is
> invalidated), you will keep getting your initial singleton and there
> will be no "next record", I fail to see why that would be an issue.

Because, since I had only been using CPython, I was able to count on
records that had gone out of scope disappearing along with their
_temporary_ changes.  If I get that same record back the next time I
loop through the table -- well, then the changes weren't temporary,
were they?

>> I see two questions that determine whether this change should be made:
>
>>  1) How difficult it would be for the non-ref counting
>> implementations to implement
>
> Pretty much impossible I'd expect, the weakrefs can only be broken on GC
> runs (at object deallocation) and that is generally non-deterministic
> without specifying precisely which type of GC implementation is used.
> You'd need a fully deterministic deallocation model to ensure a weakref
> is broken as soon as the corresponding object has no outstanding strong
> (and soft, in some VMs like the JVM) reference.
>
>>  2) Whether it's appropriate to have objects be changed, but not
>> saved, and then discarded when the strong references are gone so the
>> next incarnation doesn't see the changes, even if the object hasn't
>> been destroyed yet.
>
> If your saves are synchronized with the weakref being broken (the object
> being *effectively* collected) and the singleton behavior is as well,
> there will be no difference, I'm not sure what the issue would be, you
> might just have a second change cycle using the same unsaved (but still
> modified) object.

And that's exactly the problem -- I don't want to see the modifications
the second time 'round, and if I can't count on weakrefs invalidating
as soon as the strong refs are gone I'll have to completely rethink how
I handle records from the table.

> Although frankly speaking such reliance on non-deterministic events would
> scare the shit out of me.

Indeed -- I hadn't realized that I was until somebody using PyPy
noticed the problem.

~Ethan~


From fuzzyman at gmail.com  Sat May 19 14:33:35 2012
From: fuzzyman at gmail.com (Michael Foord)
Date: Sat, 19 May 2012 13:33:35 +0100
Subject: [Python-ideas] weakrefs
In-Reply-To: <1b227cd1-3c22-4661-9b24-3edc16a580dd@st3g2000pbc.googlegroups.com>
References: <4FB514F0.6000403@stoneleaf.us>
	<551680bb-b954-4355-82cd-a0e373991512@nl1g2000pbc.googlegroups.com>
	<0A5D8A72-14E3-4614-8720-8070FEBAD28F@masklinn.net>
	<1b227cd1-3c22-4661-9b24-3edc16a580dd@st3g2000pbc.googlegroups.com>
Message-ID: 

On 19 May 2012 03:54, stoneleaf  wrote:

>
>
> On May 18, 9:38 am, Masklinn wrote:
> > On 2012-05-18, at 18:08 , stoneleaf wrote:
> >> My dbf module provides direct access to dbf files.  A retrieved record
> >> is a singleton object, and allows temporary changes that are not
> >> written to disk.  Whether those changes are seen by the next
> >> incarnation depends on (I had thought) whether or not the record with
> >> the unwritten changes has gone out of scope.
> >
> > If a record is a singleton, that singleton-ification would be handled
> > through weakrefs would it not?
>
> Indeed, that is the current behavior.
>
> > In that case, until the GC is triggered (and the weakref is
> > invalidated), you will keep getting your initial singleton and there
> > will be no "next record", I fail to see why that would be an issue.
>
> Because, since I had only been using CPython, I was able to count on
> records that had gone out of scope disappearing along with their
> _temporary_ changes.  If I get that same record back the next time I
> loop through the table -- well, then the changes weren't temporary,
> were they?
>


So you're taking a *dependence* on the reference counting garbage
collection of the CPython implementation, and when that doesn't work for
you with other implementations trying to force the same semantics on them.
Your proposal can't reasonably be implemented by other implementations as
working out whether there are any references to an object is an expensive
operation.

A much better technique would be for you to use explicit
life-cycle-management (like the with statement) for your objects.
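
As a rough illustration of that suggestion, here is a minimal sketch of
scoping temporary changes with a with block instead of relying on garbage
collection timing. The Record class and the temporary_changes() helper are
hypothetical names invented for this example; they are not the dbf
module's API.

```python
from contextlib import contextmanager


class Record:
    """Hypothetical stand-in for a table record (not the dbf module's API)."""

    def __init__(self, **fields):
        self._saved = dict(fields)   # values as they exist on disk
        self._pending = {}           # unsaved, temporary changes

    def __getitem__(self, name):
        # Pending changes shadow the saved values until discarded or saved.
        return self._pending.get(name, self._saved[name])

    def __setitem__(self, name, value):
        self._pending[name] = value

    def save(self):
        self._saved.update(self._pending)
        self._pending.clear()

    def discard(self):
        self._pending.clear()


@contextmanager
def temporary_changes(record):
    """Drop any unsaved changes when the block exits, on every implementation."""
    try:
        yield record
    finally:
        record.discard()


rec = Record(name="Ethan")
with temporary_changes(rec) as r:
    r["name"] = "someone else"
    inside = r["name"]
after = rec["name"]
print(inside, "->", after)
```

The point of the design is that the moment the changes disappear is decided
by the block structure, so it no longer matters when (or whether) the
collector breaks a weakref.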

Michael





-- 

http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

From ethan at stoneleaf.us  Sat May 19 17:29:02 2012
From: ethan at stoneleaf.us (stoneleaf)
Date: Sat, 19 May 2012 08:29:02 -0700 (PDT)
Subject: [Python-ideas] weakrefs
In-Reply-To: 
References: <4FB514F0.6000403@stoneleaf.us>
	<551680bb-b954-4355-82cd-a0e373991512@nl1g2000pbc.googlegroups.com>
	<0A5D8A72-14E3-4614-8720-8070FEBAD28F@masklinn.net>
	<1b227cd1-3c22-4661-9b24-3edc16a580dd@st3g2000pbc.googlegroups.com>
	
Message-ID: <4109f083-f8c7-4f58-84ef-2da278242934@ri8g2000pbc.googlegroups.com>

On May 19, 5:33 am, Michael Foord wrote:
> So you're taking a *dependence* on the reference counting garbage
> collection of the CPython implementation, and when that doesn't work for
> you with other implementations trying to force the same semantics on them.

I am not trying to force anything.  I stated what I would like, and
followed up with questions to further the discussion.


> Your proposal can't reasonably be implemented by other implementations as
> working out whether there are any references to an object is an expensive
> operation.

Then that nixes it.  The (debatable) advantages aren't worth a large
expenditure in programmer time, nor a large hit in performance.


> A much better technique would be for you to use explicit
> life-cycle-management (like the with statement) for your objects.

I'm leaning strongly towards just not allowing temporary changes, which
will also solve my problem.


Thanks everyone for the feedback.

~Ethan~


From bborcic at gmail.com  Mon May 21 16:27:35 2012
From: bborcic at gmail.com (Boris Borcic)
Date: Mon, 21 May 2012 16:27:35 +0200
Subject: [Python-ideas] [...].join(sep)
In-Reply-To: 
References: 
Message-ID: 

anatoly techtonik wrote:
> I am certain this was proposed many times, but still - why it is rejected?
>
> "real man don't use spaces".split().join('+').upper()
>      instead of
> '+'.join("real man don't use spaces".split()).upper()

IMO this should really be :

'+'.join(' '.split("real man don't use spaces")).upper()



From anacrolix at gmail.com  Mon May 21 18:17:06 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Tue, 22 May 2012 02:17:06 +1000
Subject: [Python-ideas] Composability and concurrent.futures
In-Reply-To: <445C226E-4C9E-4D9D-A641-7FC3BEE64185@cs.washington.edu>
References: <445C226E-4C9E-4D9D-A641-7FC3BEE64185@cs.washington.edu>
Message-ID: 

On Thu, May 17, 2012 at 4:43 AM, Adrian Sampson
wrote:

> The concurrent.futures module in the Python standard library has problems
> with composability. If I start a ThreadPoolExecutor to run some library
> functions that internally use ThreadPoolExecutor, I will end up with many
> more worker threads on my system than I expect. For example, if each parallel
> execution wants to take full advantage of an 8-core machine, I could end up
> with as many as 8*8=64 competing worker threads, which could significantly
> hurt performance.
>
> This is because each instance of ThreadPoolExecutor (or
> ProcessPoolExecutor) maintains its own independent worker pool. Especially
> in situations where the goal is to exploit multiple CPUs, it's essential
> for any thread pool implementation to globally manage contention between
> multiple concurrent job schedulers.
>
> I'm not sure about the best way to address this problem, but here's one
> proposal: Add additional executors to the futures library.
> ComposableThreadPoolExecutor and ComposableProcessPoolExecutor would each
> use a *shared* thread-pool model. When created, these composable executors
> will check to see if they are being created within a future worker
> thread/process initiated by another composable executor. If so, the "child"
> executor will forward all submitted jobs to the executor in the parent
> thread/process. Otherwise, it will behave normally, starting up its own
> worker pool.
>
> Has anyone else dealt with composition problems in parallel programs? What
> do you think of this solution -- is there a better way to tackle this
> deficiency?


It's my understanding this is a known flaw with concurrency *in general*.
Currently most multi-{threaded,process} applications assume they're the
only ones running on the system. As does the likely implementation of the
proposed composable pools problem you've posed. A proper interprocess
scheduler is required to handle this ideally. (See GCD, and runtime
implementations that provide at least some userspace scheduling such as Go,
however poor it may be).

Secondly, composable pools don't handle recursive relationships well. If a
thread in one pool depends on the completion of all the tasks in its own
pool to complete before it can itself complete, you'll have deadlock.
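
A small sketch of that failure mode, under the assumption of a single
shared pool with one worker: the outer task occupies the only worker and
then waits on a task it submitted to the same pool. A timeout is used here
purely so the demonstration terminates; without it the wait would never
return.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def outer(pool):
    # The inner task is queued behind the outer one, which holds the only worker.
    inner = pool.submit(lambda: "inner done")
    try:
        return inner.result(timeout=0.5)
    except TimeoutError:
        # Without the timeout this wait would block forever.
        return "would deadlock"


with ThreadPoolExecutor(max_workers=1) as pool:
    outcome = pool.submit(outer, pool).result()

print(outcome)
```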

Personally if I implemented a composable thread pool I'd have it global,
creation and submission of tasks would be proxied to it via some composable
executor class.

As it stands, thread pools are best for task-oriented concurrency rather
than parallelism anyway, especially in CPython.

In short, I think composable thread pools are a hack at best and won't gain
you anything except a slightly reduced threading overhead. If you want
optimal utilization, threading isn't the right place to be looking.

From asampson at cs.washington.edu  Mon May 21 19:21:01 2012
From: asampson at cs.washington.edu (Adrian Sampson)
Date: Mon, 21 May 2012 10:21:01 -0700
Subject: [Python-ideas] Composability and concurrent.futures
In-Reply-To: 
References: <445C226E-4C9E-4D9D-A641-7FC3BEE64185@cs.washington.edu>
	
Message-ID: <951AE63A-2AE9-4314-8B05-F80EC90D3314@cs.washington.edu>

On May 21, 2012, at 9:17 AM, Matt Joiner wrote:

> Personally if I implemented a composable thread pool I'd have it
> global, creation and submission of tasks would be proxied to it via
> some composable executor class.

I agree completely. Maybe the implementation I described was overly
hacky for the sake of transparent compatibility with the existing
(non-composable) executors in concurrent.futures. Ideally, the system
would have one global pool which many concurrency APIs -- not just
concurrent.futures -- could potentially share.

(In a *really* ideal world, the OS would provide thread pool management
-- like GCD, which you mentioned, or scheduler activations. But a
cross-platform library currently requires a less ambitious solution.)

> In short, I think composable thread pools are a hack at best and won't
> gain you anything except a slightly reduced threading overhead. If you
> want optimal utilization, threading isn't the right place to be
> looking.

To be clear, I meant to refer to processes *or* threads when discussing
the problem originally. The ProcessPoolExecutor is pretty useful (in my
experience) for easily getting speedup even on pure-Python CPU-bound
workloads.

Adrian



From tjreedy at udel.edu  Mon May 21 20:29:34 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 May 2012 14:29:34 -0400
Subject: [Python-ideas] [...].join(sep)
In-Reply-To: 
References: 
	
Message-ID: 

On 5/21/2012 10:27 AM, Boris Borcic wrote:
> anatoly techtonik wrote:
>> I am certain this was proposed many times, but still - why it is
>> rejected?
>>
>> "real man don't use spaces".split().join('+').upper()
>> instead of
>> '+'.join("real man don't use spaces".split()).upper()
>
> IMO this should really be :
>
> '+'.join(' '.split("real man don't use spaces")).upper()

If the separator were a mandatory argument for .split, then that would
be possible, but not with it being optional, and therefore the second
argument.

 >>> ' real  men  usE SPAces   and 	 tabs'.split()
['real', 'men', 'usE', 'SPAces', 'and', 'tabs']
 >>> ' real  men  usE SPAces   and 	 tabs'.split(' ')
['', 'real', '', 'men', '', 'usE', 'SPAces', '', '', 'and', '\t', 'tabs']

 >>> ' '.join(' real  men  usE SPAces   and 	 tabs'.split())
'real men usE SPAces and tabs'

is a handy way to clean up whitespace

-- 
Terry Jan Reedy



From techtonik at gmail.com  Tue May 22 17:39:16 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 22 May 2012 18:39:16 +0300
Subject: [Python-ideas] shutil.run (Was:  shutil.runret and shutil.runout)
Message-ID: 

Hello again,

I've finally found some time to partially process the replies and
came up with a better solution than subprocess.* and
shutil.runret/runout

Disclaimer: I don't say that subprocess suxx - it is powerful and very
awesome under the hood. What I want to say is that its final user
interface is awful - for such a complex thing as this it should have
been passed through several iteration cycles before settling down.


Therefore, inspired by Fabric API, I've finally found the solution -
shutil.run() function:
https://bitbucket.org/techtonik/shutil-run/src

run(command, combine_stderr=True):

    Run command through a system shell, return output string with
    additional properties:

        output.succeeded    - result of the operation True/False
        output.return_code  - specific return code
        output.stderr       - stderr contents if combine_stderr=False

     `combine_stderr` if set, makes stderr merged into output string,
     otherwise it will be available  as `output.stderr` attribute.

Example:

    from shellrun import run

    output = run('ls -la')
    if output.succeeded:
        print(output)
    else:
        print("Error %s" % output.return_code)
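
For readers who want to see the shape of such a function, here is a
minimal sketch built on subprocess. It is only a guess at how the
described behaviour could be implemented, not the code in the repository
linked above; the Output class name is invented for this example.

```python
import subprocess


class Output(str):
    """Command output as a string that can also carry extra attributes."""


def run(command, combine_stderr=True):
    # Runs through the system shell, as the proposal describes.
    proc = subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT if combine_stderr else subprocess.PIPE,
        universal_newlines=True,
    )
    out, err = proc.communicate()
    result = Output(out)
    result.return_code = proc.returncode
    result.succeeded = proc.returncode == 0
    result.stderr = err  # None when stderr was merged into the output
    return result


output = run("echo hello")
print(output.succeeded, output.return_code, output.strip())
```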


That's the most intuitive way I found so far. Objective advantages:

1. Better than
       subprocess.call(cmd, shell=True)
       subprocess.check_call(cmd, shell=True)
       subprocess.check_output(cmd, shell=True)
     because it is just
       shutil.run(cmd)
     i.e. short, simple and _easy to remember_

2. With shutil.run() you don't need to rewrite your check_call() or
check_output() with Popen() if you need to get return_code in addition
to stderr contents on error

3. shutil.run() is predictable and consistent - its arguments are not
dependent on each other, their combination doesn't change the function
behavior over and over requiring you iterate over the documentation
and warnings again and again

4. shutil.run() is the correct next level API over subprocess base
level. subprocess executes external process - that is its role, but
automatic ability to execute external process inside another external
process (shell) looks like a hack to me. Practical, but still a hack.

5. No required exception catching, which doesn't work for shell=True anyway

6. No need to learn subprocess.PIPE routing magic (not an argument for
hackers, I know)


Subjective advantages:
1. More beautiful
2. More simple
3. More readable
4. Practical
5. Obvious
6. Easy to explain


Hopefully, it can find its way in stdlib instead of
http://shell-command.readthedocs.org/
-- 
anatoly t.


From ericsnowcurrently at gmail.com  Tue May 22 18:26:20 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 22 May 2012 10:26:20 -0600
Subject: [Python-ideas] a simple namespace type
Message-ID: 

Below I've included a pure Python implementation of a type that I wish
was a builtin.  I know others have considered similar classes in the
past without any resulting change to Python, but I'd like to consider
it afresh[1][2].

  class SimpleNamespace:
      """A simple attribute-based namespace."""
      def __init__(self, **kwargs):
          self.__dict__.update(kwargs)  # or self.__dict__ = kwargs
      def __repr__(self):
          keys = sorted(k for k in self.__dict__ if not k.startswith('_'))
          content = ("{}={!r}".format(k, self.__dict__[k]) for k in keys)
          return "{}({})".format(type(self).__name__, ", ".join(content))

This is the sort of class that people implement all the time.  There's
even a similar one in the argparse module, which inspired the second
class below[3].  If the builtin object type were dict-based rather
than slot based then this sort of namespace type would be mostly
superfluous.  However, I also understand how that would add an
unnecessary resource burden on _all_ objects.  So why not a new type?

Nick Coghlan had this objection recently to a similar proposal[4]:

    Please, no. No new
    just-like-a-namedtuple-except-you-can't-iterate-over-it type, and
    definitely not one exposed in the collections module.

    We've been over this before: collections.namedtuple *is* the standard
    library's answer for structured records. TOOWTDI, and the way we have
    already chosen includes iterability as one of its expected properties.

As you can see he's referring to "structured records", but I expect
that his objections could be extended somewhat to this proposal.  I
see where he's coming from and agree relative to structured records.
However, I also think that a simple namespace type would be a benefit
to different use cases, namely where you want a simple dynamic
namespace.
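
A short usage sketch of that "simple dynamic namespace" case. The class is
restated compactly so the example runs on its own; the config name and
its attributes are made up for illustration.

```python
class SimpleNamespace:
    """Compact restatement of the proposed type, for illustration only."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __repr__(self):
        keys = sorted(k for k in self.__dict__ if not k.startswith('_'))
        content = ("{}={!r}".format(k, self.__dict__[k]) for k in keys)
        return "{}({})".format(type(self).__name__, ", ".join(content))


# Attributes can be added and removed freely, unlike a namedtuple's fields.
config = SimpleNamespace(host="localhost", port=8080)
config.debug = True
del config.port
print(config)
```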

Making a simple namespace class is trivial and likely just about
everyone has written one:  "class Namespace: pass" or even
"type('Namespace', (), {})".  Obviously the type in this proposal has
more meat, but that's certainly not necessary.  So why a new type?

The main reason is that as a builtin type the simple namespace type
could be used in builtin modules[5][6][7].

Thoughts?

-eric


[1] http://mail.python.org/pipermail/python-dev/2012-May/119387.html
[2] http://mail.python.org/pipermail/python-dev/2012-May/119393.html
[3] http://hg.python.org/cpython/file/dff6c506c2f1/Lib/argparse.py#l1177
[4] http://mail.python.org/pipermail/python-dev/2012-May/119412.html
[5] http://mail.python.org/pipermail/python-dev/2012-May/119395.html
[6] http://mail.python.org/pipermail/python-dev/2012-May/119399.html
[7] http://mail.python.org/pipermail/python-dev/2012-May/119402.html

--------------------------

class Namespace(SimpleNamespace):
    def __dir__(self):
        return sorted(k for k in self.__dict__ if not k.startswith('_'))
    def __eq__(self, other):
        return self.__dict__ == other.__dict__
    def __ne__(self, other):
        return self.__dict__ != other.__dict__
    def __contains__(self, name):
        return name in self.__dict__


From mwm at mired.org  Tue May 22 22:30:53 2012
From: mwm at mired.org (Mike Meyer)
Date: Tue, 22 May 2012 16:30:53 -0400
Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout)
In-Reply-To: 
References: 
Message-ID: <20120522163053.684b43d0@bhuda.mired.org>

On Tue, 22 May 2012 18:39:16 +0300
anatoly techtonik  wrote:

> Therefore, inspired by Fabric API, I've finally found the solution -
> shutil.run() function:
> https://bitbucket.org/techtonik/shutil-run/src
> 
> run(command, combine_stderr=True):
> 
>     Run command through a system shell, return output string with
>     additional properties:
> 
>         output.succeeded    - result of the operation True/False
>         output.return_code  - specific return code
>         output.stderr       - stderr contents if combine_stderr=False
> 
>      `combine_stderr` if set, makes stderr merged into output string,
>      otherwise it will be available  as `output.stderr` attribute.
[...]
> That's the most intuitive way I found so far. Objective advantages:
> 
> 1. Better than
>        subprocess.call(cmd, shell=True)
>        subprocess.check_call(cmd, shell=True)
>        subprocess.check_output(cmd, shell=True)
>      because it is just
>        shutil.run(cmd)
>      i.e. short, simple and _easy to remember_

-2

Unless there's some way to turn off shell processing (better yet, have
no shell processing be the default, and require that it be turned on),
it can't be used securely with tainted strings, so it should *not* be
used with tainted strings, which means it's pretty much useless in any
environment where security matters. With everything being networked,
there may no longer be any such environments.
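
To illustrate the difference, here is a minimal sketch of passing an
untrusted string without any shell processing. The child process is just
the current Python interpreter echoing its first argument, used as a
portable stand-in for a real command; shlex.quote() is shown only for the
case where a POSIX shell really is required.

```python
import shlex
import subprocess
import sys

tainted = "harmless; echo INJECTED"

# Stand-in child: prints its first argument exactly as received.
child = [sys.executable, "-c", "import sys; print(sys.argv[1])"]

# Argument list, no shell: the whole string arrives as one argument,
# so the ';' is never interpreted as a command separator.
seen = subprocess.check_output(child + [tainted],
                               universal_newlines=True).strip()
print(seen)

# If a POSIX shell must be involved, quote each untrusted piece first.
quoted = shlex.quote(tainted)
print(quoted)
```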

> 3. shutil.run() is predictable and consistent - its arguments are not
> dependent on each other, their combination doesn't change the function
> behavior over and over requiring you iterate over the documentation
> and warnings again and again

As proposed, it certainly provides a predictable and consistent
vulnerability to code injection attacks.

> 4. shutil.run() is the correct next level API over subprocess base
> level. subprocess executes external process - that is its role, but
> automatic ability to execute external process inside another external
> process (shell) looks like a hack to me. Practical, but still a hack.

It's only correct if you are in an environment where you don't care
about security. If you care about security, you can't use it. If we're
going to add yet another system() replacement, let's at least try and
make it secure.

     		http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From ncoghlan at gmail.com  Tue May 22 23:41:28 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 23 May 2012 07:41:28 +1000
Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout)
In-Reply-To: <20120522163053.684b43d0@bhuda.mired.org>
References: 
	<20120522163053.684b43d0@bhuda.mired.org>
Message-ID: 

Right, security implications are one of the reasons why I've held back from
proposing Shell Command. The lack of cross platform support is also a pain.
This suggestion shares both of those problems.

Having dealt with long running child processes lately, I can also say that
producing output line-by-line would be on my personal list of requirements.

So, yeah, interesting idea, but this is still an area that needs a lot of
exploration on PyPI before we select an answer for the stdlib.

--
Sent from my phone, thus the relative brevity :)

From jeanpierreda at gmail.com  Wed May 23 08:49:46 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Wed, 23 May 2012 02:49:46 -0400
Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout)
In-Reply-To: 
References: 
	<20120522163053.684b43d0@bhuda.mired.org>
	
Message-ID: 

On Tue, May 22, 2012 at 5:41 PM, Nick Coghlan  wrote:
> Having dealt with long running child processes lately, I can also say that
> producing output line-by-line would be on my personal list of requirements.

You can do that with subprocess, right? Just have to be sure to close
stdin/stderr and read p.stdout with readline() repeatedly...
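
Something along these lines, as a rough sketch: the child here is the
current Python interpreter printing three lines, standing in for a real
long-running command, and stderr is merged into stdout rather than closed
so that only one pipe has to be drained.

```python
import subprocess
import sys

# Stand-in for a long-running command; -u keeps the child's output unbuffered.
child = [sys.executable, "-u", "-c",
         "for i in range(3): print('line', i)"]

proc = subprocess.Popen(
    child,
    stdin=subprocess.DEVNULL,    # the child gets no input
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,    # a single pipe to read avoids blocking on stderr
    universal_newlines=True,
)

lines = []
# readline() returns '' only at end of file, so this handles each line as it arrives.
for line in iter(proc.stdout.readline, ""):
    lines.append(line.rstrip("\n"))

proc.stdout.close()
return_code = proc.wait()
print(lines, return_code)
```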

I think you might be able to even have the other file descriptors be
inputting/outputting if you use threads, but I'm scared of
experimenting with these things -- experiments don't tell you that it
doesn't work on an OS you don't have.

-- Devin


From ncoghlan at gmail.com  Wed May 23 09:09:12 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 23 May 2012 17:09:12 +1000
Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout)
In-Reply-To: 
References: 
	<20120522163053.684b43d0@bhuda.mired.org>
	
	
Message-ID: 

On Wed, May 23, 2012 at 4:49 PM, Devin Jeanpierre
 wrote:
> On Tue, May 22, 2012 at 5:41 PM, Nick Coghlan  wrote:
>> Having dealt with long running child processes lately, I can also say that
>> producing output line-by-line would be on my personal list of requirements.
>
> You can do that with subprocess, right? Just have to be sure to close
> stdin/stderr and read p.stdout with readline() repeatedly...

Yep, subprocess is a swiss army knife - you can do pretty much
anything with it. That's the complaint, though - *because* it's so
configurable, even the existing convenience APIs aren't always that
convenient for simple operations.

Thus the current spate of efforts to provide a "friendlier" API for
performing shell operations from Python. The dust may settle well
enough in the 3.4 time frame for us to declare a "winner" and add
something to the standard library, but that's far from certain.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From techtonik at gmail.com  Wed May 23 10:38:36 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 23 May 2012 11:38:36 +0300
Subject: [Python-ideas] Run attached Python tests in browser
Message-ID: 

I am not sure if it belongs here, to python-dev or infrastructure.

Lately I've been looking at http://repl.it/ and found it to be pretty
convenient to code stuff that otherwise require a Python editor to be
installed. So, I thought that it might be actually convenient to use
for automatically testing patches in Python bugtracker without going
through the hassle to download, patch and run everything locally. Of
course, not everything will work, but at least some parts of it could
be.
--
anatoly t.


From techtonik at gmail.com  Wed May 23 10:47:06 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 23 May 2012 11:47:06 +0300
Subject: [Python-ideas] shutil.run no security thread
Message-ID: 

Ok, let's separately discuss shutil.run() added value without touching
security at all (subj changed).

Is it ok? Is it a nice idea? Would it be included in stdlib in an ideal
world where security implications don't matter?
--
anatoly t.


On Tue, May 22, 2012 at 11:30 PM, Mike Meyer  wrote:
> On Tue, 22 May 2012 18:39:16 +0300
> anatoly techtonik  wrote:
>
>> Therefore, inspired by Fabric API, I've finally found the solution -
>> shutil.run() function:
>> https://bitbucket.org/techtonik/shutil-run/src
>>
>> run(command, combine_stderr=True):
>>
>>     Run command through a system shell, return output string with
>>     additional properties:
>>
>>         output.succeeded    - result of the operation True/False
>>         output.return_code  - specific return code
>>         output.stderr       - stderr contents if combine_stderr=False
>>
>>      `combine_stderr` if set, makes stderr merged into output string,
>>      otherwise it will be available  as `output.stderr` attribute.
> [...]
>> That's the most intuitive way I found so far. Objective advantages:
>>
>> 1. Better than
>>        subprocess.call(cmd, shell=True)
>>        subprocess.check_call(cmd, shell=True)
>>        subprocess.check_output(cmd, shell=True)
>>      because it is just
>>        shutil.run(cmd)
>>      i.e. short, simple and _easy to remember_
>
> -2
>
> Unless there's some way to turn off shell processing (better yet, have
> no shell processing be the default, and require that it be turned on),
> it can't be used securely with tainted strings, so it should *not* be
> used with tainted strings, which means it's pretty much useless in any
> environment where security matters. With everything being networked,
> there may no longer be any such environments.
>
>> 3. shutil.run() is predictable and consistent - its arguments are not
>> dependent on each other, their combination doesn't change the function
>> behavior over and over requiring you iterate over the documentation
>> and warnings again and again
>
> As proposed, it certainly provides a predictable and consistent
> vulnerability to code injection attacks.
>
>> 4. shutil.run() is the correct next level API over subprocess base
>> level. subprocess executes external process - that is its role, but
>> automatic ability to execute external process inside another external
>> process (shell) looks like a hack to me. Practical, but still a hack.
>
> It's only correct if you are in an environment where you don't care
> about security. If you care about security, you can't use it. If we're
> going to add yet another system() replacement, let's at least try and
> make it secure.
>
> --
> Mike Meyer              http://www.mired.org/
> Independent Software developer/SCM consultant, email for more information.
>
> O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From techtonik at gmail.com  Wed May 23 11:04:13 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 23 May 2012 12:04:13 +0300
Subject: [Python-ideas] Important fixes not getting to releases
Message-ID: 

I know why important usability features are not getting into releases -
  they are taken care too late in release cycle
    and it all starts with bug tracker
      which imposes the workflow
        where usability bugs
          are always an enhancement.

http://bugs.python.org/issue14872
--
anatoly t.


From techtonik at gmail.com  Wed May 23 11:07:55 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 23 May 2012 12:07:55 +0300
Subject: [Python-ideas] processing subprocess output line-by-line (Was:
	shutil.run)
Message-ID: 

On Wed, May 23, 2012 at 10:09 AM, Nick Coghlan  wrote:
> On Wed, May 23, 2012 at 4:49 PM, Devin Jeanpierre
>  wrote:
>> On Tue, May 22, 2012 at 5:41 PM, Nick Coghlan  wrote:
>>> Having dealt with long running child processes lately, I can also say that
>>> producing output line-by-line would be on my personal list of requirements.
>>
>> You can do that with subprocess, right? Just have to be sure to close
>> stdin/stderr and read p.stdout with readline() repeatedly...
>
> Yep, subprocess is a swiss army knife - you can do pretty much
> anything with it. That's the complaint, though - *because* it's so
> configurable, even the existing convenience APIs aren't always that
> convenient for simple operations.

It is quite likely that there are use cases where subprocess fails,
because they require async control.
http://bugs.python.org/issue14872

And a line-by-line recipe is here:
http://stackoverflow.com/questions/5582933/need-to-avoid-subprocess-deadlock-without-communicate
--
anatoly t.


From techtonik at gmail.com  Wed May 23 11:12:51 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 23 May 2012 12:12:51 +0300
Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout)
In-Reply-To: 
References: 
	<20120522163053.684b43d0@bhuda.mired.org>
	
Message-ID: 

On Wed, May 23, 2012 at 12:41 AM, Nick Coghlan  wrote:
> Right, security implications are one of the reasons why I've held back from
> proposing Shell Command. The lack of cross platform support is also a pain.
> This suggestion shares both of those problems.

Why is shutil.run() not cross-platform?
Is it technically feasible to make shutil.run() (or subprocess.* for
that matter) cross-platform?

> Sent from my phone, thus the relative brevity :)

That actually lowers the bounce rate for discussion. =)


From pyideas at rebertia.com  Wed May 23 11:26:53 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Wed, 23 May 2012 02:26:53 -0700
Subject: [Python-ideas] shutil.run no security thread
In-Reply-To: 
References: 
Message-ID: 

> On Tue, May 22, 2012 at 11:30 PM, Mike Meyer  wrote:
>> On Tue, 22 May 2012 18:39:16 +0300
>> anatoly techtonik  wrote:
>>
>>> Therefore, inspired by Fabric API, I've finally found the solution -
>>> shutil.run() function:
>>> https://bitbucket.org/techtonik/shutil-run/src

>> -2
>>
>> Unless there's some way to turn off shell processing (better yet, have
>> no shell processing be the default, and require that it be turned on),
>> it can't be used securely with tainted strings, so it should *not* be
>> used with tainted strings, which means it's pretty much useless in any
>> environment where security matters. With everything being networked,
>> there may no longer be any such environments.
>>
>>> 3. shutil.run() is predictable and consistent - its arguments are not

>> As proposed, it certainly provides a predictable and consistent
>> vulnerability to code injection attacks.
>>
>>> 4. shutil.run() is the correct next level API over subprocess base
>>> level. subprocess executes external process - that is its role, but
>>> automatic ability to execute external process inside another external
>>> process (shell) looks like a hack to me. Practical, but still a hack.
>>
>> It's only correct if you are in an environment where you don't care
>> about security. If you care about security, you can't use it. If we're
>> going to add yet another system() replacement, let's at least try and
>> make it secure.

On Wed, May 23, 2012 at 1:47 AM, anatoly techtonik  wrote:
> Ok, let's separately discuss shutil.run() added value without touching
> security at all (subj changed).
>
> Is it ok? Is it a nice idea? Would it be included in stdlib in an ideal
> world where security implications don't matter?

I hope not, because it'd still have all the /usability/ pitfalls
associated with shell interpolation (and the consequent need to escape
command arguments).

Consider:
chris at MBP ~ $ mkdir foo && cd foo
chris at MBP foo $ ls
chris at MBP foo $ touch '~'  # the horror
chris at MBP foo $ touch '$EDITOR'  # you have a sick mind
chris at MBP foo $ ls -l  # verify the devious plot
total 0
-rw-r--r--  1 chris  staff  0 May 23 02:11 $EDITOR
-rw-r--r--  1 chris  staff  0 May 23 02:11 ~
chris at MBP foo $ python
Python 2.7.1 (r271:86832, Jul 31 2011, 19:30:53)
>>> from os import listdir
>>> from subprocess import call
>>> for entry in listdir('.'):
...     ret = call('ls '+entry, shell=True) # à la your wrapper
...
ls: ed: No such file or directory

>>> # that's not what I wanted at all!

(Less contrived examples left as an exercise for the reader.)

Also, this isn't shell-specific, but it still should be made easier to
handle properly: What about a file named "--help"?
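
For comparison, here is a minimal sketch of the same kind of call made without a shell. The child process is a stand-in (the Python interpreter echoing its own arguments) so that the example runs anywhere, and `received_args` is a hypothetical helper, not part of subprocess. Because the arguments are passed as a list, no tilde or variable expansion takes place:

```python
import subprocess
import sys

# Stand-in child process: prints the arguments it received, one per line.
ECHO_ARGS = "import sys; print('\\n'.join(sys.argv[1:]))"


def received_args(*args):
    """Run a child without a shell and return the arguments it actually saw."""
    out = subprocess.check_output([sys.executable, "-c", ECHO_ARGS] + list(args))
    return out.decode().splitlines()


# The file names from the session above, plus the "--help" case.
tricky = ["~", "$EDITOR", "--help"]
seen = received_args(*tricky)
print(seen)
```

With a real `ls`, putting a conventional `--` before the file names (for example `['ls', '--', entry]`) would additionally stop `--help` from being read as an option. That is a convention of most POSIX utilities, not something subprocess does for you.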

Cheers,
Chris
--
Sadly, no, `ed` isn't really my editor.
http://rebertia.com

P.S. Please avoid top-posting in the future.


From ncoghlan at gmail.com  Wed May 23 14:22:56 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 23 May 2012 22:22:56 +1000
Subject: [Python-ideas] shutil.run no security thread
In-Reply-To: 
References: 
Message-ID: 

On Wed, May 23, 2012 at 6:47 PM, anatoly techtonik  wrote:
> Ok, let's separately discuss shutil.run() added value without touching
> security at all (subj changed).
>
> Is it ok? Is it nice idea? Would it be included in stdlib in an ideal
> world where security implications doesn't matter?

Sure. That world is called PHP (or C, for that matter).

We *care* about security implications, and trying to be secure by
default is part of that. Usability isn't everything, and it's OK if
software development is sometimes hard.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From techtonik at gmail.com  Wed May 23 15:30:32 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 23 May 2012 16:30:32 +0300
Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout)
In-Reply-To: <20120522163053.684b43d0@bhuda.mired.org>
References: 
	<20120522163053.684b43d0@bhuda.mired.org>
Message-ID: 

About security.

On Tue, May 22, 2012 at 11:30 PM, Mike Meyer  wrote:
> On Tue, 22 May 2012 18:39:16 +0300
> anatoly techtonik  wrote:
>
>> Therefore, inspired by Fabric API, I've finally found the solution -
>> shutil.run() function:
>> https://bitbucket.org/techtonik/shutil-run/src
>>
>> run(command, combine_stderr=True):
>>
>>     Run command through a system shell, return output string with
>>     additional properties:
>>
>>         output.succeeded    - result of the operation True/False
>>         output.return_code  - specific return code
>>         output.stderr       - stderr contents if combine_stderr=False
>>
>>      `combine_stderr` if set, makes stderr merged into output string,
>>      otherwise it will be available as `output.stderr` attribute.
> [...]
>> That's the most intuitive way I found so far. Objective advantages:
>>
>> 1. Better than
>>        subprocess.call(cmd, shell=True)
>>        subprocess.check_call(cmd, shell=True)
>>        subprocess.check_output(cmd, shell=True)
>>      because it is just
>>        shutil.run(cmd)
>>      i.e. short, simple and _easy to remember_
>
> -2
>
> Unless there's some way to turn off shell processing (better yet, have
> no shell processing be the default, and require that it be turned on),
> it can't be used securely with tainted strings, so it should *not* be
> used with tainted strings, which means it's pretty much useless in any
> environment where security matters. With everything being networked,
> there may no longer be any such environments.

What does this "shell processing" involve? I need to understand that to know
what to turn off.
Why is there no way to turn off "shell processing"?
What is the primary reason that it cannot be turned off?

>> 3. shutil.run() is predictable and consistent - its arguments are not
>> dependent on each other, their combination doesn't change the function
>> behavior over and over requiring you iterate over the documentation
>> and warnings again and again
>
> As proposed, it certainly provides a predictable and consistent
> vulnerability to code injection attacks.

subprocess.* with shell=True provides the same entry point for injection
attacks, and security through obscurity doesn't help here. People
still use shell=True, because that's sometimes the only way to execute
external utilities properly. Even my synapses were silent when I
reviewed and used shell=True for the Rietveld upload script and Spyder
IDE.

What will help is a better, simpler explanation in a prominent place,
with an example that people can really remember, instead of frightening
them with warnings. People will eventually ignore the warnings and, after
endless experiments with the mess of subprocess.* params, will just leave
shell=True because it works (I did so).

No sane web developer will use subprocess calls on the server side at all,
regardless of shell=True or not. For example, how can I be sure that
Graphviz is safe from exploits through malicious input? No sane
developer will run a shell script on the web side either. For those who
still want to, there will be this simple explanation right on the
shutil.run() page, with a link to a proper vulnerability analysis instead
of an uncertainty-inducing warning.

shutil.run() is aimed for local operations.

>> 4. shutil.run() is the correct next level API over subprocess base
>> level. subprocess executes external process - that is its role, but
>> automatic ability to execute external process inside another external
>> process (shell) looks like a hack to me. Practical, but still a hack.
>
> It's only correct if you are in an environment where you don't care
> about security. If you care about security, you can't use it. If we're
> going to add yet another system() replacement, let's at least try and
> make it secure.

I am all ears on how to make shutil.run() more secure. Right now I must
confess that I don't even realize how serious this problem is, so if
anyone can come up with a real-world example, with an explanation of the
security concern, that could be copied "as-is" into the documentation, it
will surely be appreciated, and not only by me.


From bborcic at gmail.com  Wed May 23 16:50:18 2012
From: bborcic at gmail.com (Boris Borcic)
Date: Wed, 23 May 2012 16:50:18 +0200
Subject: [Python-ideas] [...].join(sep)
In-Reply-To: 
References: 
	 
Message-ID: 

Terry Reedy wrote:
> On 5/21/2012 10:27 AM, Boris Borcic wrote:
>> anatoly techtonik wrote:
>>> I am certain this was proposed many times, but still - why it is
>>> rejected?
>>>
>>> "real man don't use spaces".split().join('+').upper()
>>> instead of
>>> '+'.join("real man don't use spaces".split()).upper()
>>
>> IMO this should really be :
>>
>> '+'.join(' '.split("real man don't use spaces")).upper()
>
> If the separator were a mandatory argument for .split, then that would be
> possible, but not with it being optional, and therefore the second argument.
>
>  >>> ' real  men  usE SPAces   and \t tabs'.split()
> ['real', 'men', 'usE', 'SPAces', 'and', 'tabs']
>  >>> ' real  men  usE SPAces   and \t tabs'.split(' ')
> ['', 'real', '', 'men', '', 'usE', 'SPAces', '', '', 'and', '\t', 'tabs']
>
>  >>> ' '.join(' real  men  usE SPAces   and \t tabs'.split())
> 'real men usE SPAces and tabs'
>
> is a handy way to clean up whitespace
>

Kind of beside the point, which is that the desire to repair the inconsistency 
between split and join has a better prospect at the split side of things than at 
the join side of things. The problems at the split side of things are 
comparatively minor.



From bruce at leapyear.org  Wed May 23 17:29:28 2012
From: bruce at leapyear.org (Bruce Leban)
Date: Wed, 23 May 2012 08:29:28 -0700
Subject: [Python-ideas] [...].join(sep)
In-Reply-To: 
References: 
	 
	
Message-ID: 

On Wed, May 23, 2012 at 7:50 AM, Boris Borcic  wrote:

> Kind of beside the point, which is that the desire to repair the
> inconsistency between split and join has a better prospect at the split
> side of things than at the join side of things. The problems at the split
> side of things are comparatively minor.
>


The inconsistency that bugs me is the difference in split behavior between
languages. Switching between languages means I have to constantly double
check this. What is consistent is that you call string.split(separator)
rather than separator.split(string) so changing that doesn't seem at all
beneficial.

Python split has an optional *maxsplit* parameter:

If maxsplit is given, at most maxsplit splits are done (thus, the list will
have *at most maxsplit+1* elements).

The remainder of the string after the last matched separator is included in
the last part.
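
A short illustration of the Python rule just quoted, using a made-up input string:

```python
text = "a,b,c,d"

# No limit: every separator splits.
parts_all = text.split(",")

# maxsplit=2: at most two splits, so at most three parts;
# the unsplit remainder stays in the last part.
parts_limited = text.split(",", 2)

print(parts_all)      # ['a', 'b', 'c', 'd']
print(parts_limited)  # ['a', 'b', 'c,d']
```

So the Python argument counts splits, whereas the other limits described here count returned parts.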


Java split has an optional integer *limit* parameter:

... the pattern will be applied at most limit - 1 times, the array's length
will be *no greater than limit* ...

The remainder of the string after the last matched separator is included in
the last part.


C# split has an optional *count* parameter:

The maximum number of substrings to return.
The remainder of the string after the last matched separator is included in
the last part.


Ruby split has an optional limit parameter:

If limit is a positive number, *at most that number of fields* will be
returned.
The remainder of the string after the last matched separator is included in
the last part.


Javascript has an optional limit parameter:

It returns *at most limit* parts.

The remainder of the string after the last matched separator is *discarded*.


And I'm not mentioning the differences in how the separator parameter is
interpreted. :-)

--- Bruce
Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com

From g.rodola at gmail.com  Thu May 24 03:32:42 2012
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Thu, 24 May 2012 03:32:42 +0200
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
Message-ID: 

Including an established async IO framework such as Twisted, gevent or
Tornado in the Python stdlib has always been a controversial subject.
PEP-3153 (http://www.python.org/dev/peps/pep-3153/) tried to face this
problem in the most agnostic way possible, and it's a good starting
point IMO.
Nevertheless, it's still vague about what the actual API should look
like and AFAIK it has remained stagnant so far.

There's one thing in the whole async stack which is basically the same
for all implementations though: the poller/reactor.
Could it make sense to add something similar to select module?
Compared with PEP-3153, providing such a layer on top of select(),
poll() & co. is easier and could possibly be an incentive to avoid
such code duplication.

I'm coming up with this because I recently did something similar in
pyftpdlib as a hack on top of asyncore to add support for epoll() and
kqueue(), using Tornado's excellent io loop as a source of
inspiration:
http://code.google.com/p/pyftpdlib/issues/detail?id=203
http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py


The way I imagine it:

>>> import select
>>> dir(select)
[..., 'EpollPoller', 'PollPoller', 'SelectPoller', 'KqueuePoller']
>>> poller = select.EpollPoller()
>>> poller.register(fd, handler, poller.READ | poller.WRITE)
>>> poller.socket_map
{2 : }
>>> poller.modify(fd, poller.READ)
>>> poller.poll()      # will call handler.handle_read_event() if/when it's the case
^C
KeyboardInterrupt
>>> poller.remove(fd)
>>> poller.close()

The handler is supposed to provide 3 methods:
- handle_read_event
- handle_write_event
- handle_error_event

Users willing to support multiple event loops such as wx, gtk etc can do:

>>> while 1:
...       poller.poll(timeout=0.1, blocking=False)
...       otherpoller.poll()


Basically, this would be the whole API.
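
To make the shape of that API concrete, here is a minimal sketch built on plain select.select(). The class name `SelectPoller` and the method names come from the session above; everything else (the internals, the `Recorder` handler, the use of socket.socketpair() for the demo) is an assumption made for illustration, not the pyftpdlib code:

```python
import select
import socket


class SelectPoller:
    """Hypothetical sketch of the proposed poller, built on select.select()."""

    READ = 1
    WRITE = 2

    def __init__(self):
        self.socket_map = {}  # fd -> (handler, event mask)

    def register(self, fd, handler, events):
        self.socket_map[fd] = (handler, events)

    def modify(self, fd, events):
        handler, _ = self.socket_map[fd]
        self.socket_map[fd] = (handler, events)

    def remove(self, fd):
        del self.socket_map[fd]

    def poll(self, timeout=None):
        """Wait once for events and dispatch them to the handlers."""
        snapshot = dict(self.socket_map)
        readers = [fd for fd, (_, ev) in snapshot.items() if ev & self.READ]
        writers = [fd for fd, (_, ev) in snapshot.items() if ev & self.WRITE]
        if not (readers or writers):
            return
        watched = list(set(readers) | set(writers))
        r, w, x = select.select(readers, writers, watched, timeout)
        for fd in r:
            snapshot[fd][0].handle_read_event()
        for fd in w:
            snapshot[fd][0].handle_write_event()
        for fd in x:
            snapshot[fd][0].handle_error_event()


class Recorder:
    """Demo handler that stores whatever it reads."""

    def __init__(self, sock):
        self.sock = sock
        self.received = []

    def handle_read_event(self):
        self.received.append(self.sock.recv(1024))

    def handle_write_event(self):
        pass

    def handle_error_event(self):
        pass


a, b = socket.socketpair()
poller = SelectPoller()
handler = Recorder(b)
poller.register(b.fileno(), handler, SelectPoller.READ)
a.sendall(b"ping")
poller.poll(timeout=1.0)   # dispatches handler.handle_read_event()
poller.remove(b.fileno())
a.close()
b.close()
print(handler.received)
```

An epoll()- or kqueue()-based variant would keep the same four methods and only change what poll() does internally.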

Thoughts?


--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/


From g.rodola at gmail.com  Thu May 24 03:43:33 2012
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Thu, 24 May 2012 03:43:33 +0200
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: 
References: 
Message-ID: 

2012/5/24 Giampaolo Rodolà :
> The handler is supposed to provide 3 methods:
> - handle_read_event
> - handle_write_event
> - handle_error_event

Further note: this is the approach I used in pyftpdlib.
An even more abstract approach would be having poller.poll() return
a dict of {fd: events, fd: events, ...}, similarly to what Tornado
currently does.
This way we wouldn't be forcing the user to provide a handler class
with the 3 methods described above.
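
A rough sketch of that handler-free variant, again on top of select.select(). The function name `poll`, the `READ`/`WRITE` constants and the `interest` argument are assumptions for illustration; this is not Tornado's API:

```python
import select
import socket

READ, WRITE = 1, 2


def poll(interest, timeout=None):
    """interest maps fd -> mask of READ/WRITE; returns {fd: ready mask}."""
    readers = [fd for fd, mask in interest.items() if mask & READ]
    writers = [fd for fd, mask in interest.items() if mask & WRITE]
    r, w, _ = select.select(readers, writers, [], timeout)
    ready = {}
    for fd in r:
        ready[fd] = ready.get(fd, 0) | READ
    for fd in w:
        ready[fd] = ready.get(fd, 0) | WRITE
    return ready


a, b = socket.socketpair()
fd = b.fileno()
a.sendall(b"x")
events = poll({fd: READ}, timeout=1.0)
a.close()
b.close()
print(events)
```

The caller then decides what to do with each fd, which is what removes the need for a handler class.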


--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/


From steve at pearwood.info  Thu May 24 04:00:58 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 24 May 2012 12:00:58 +1000
Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout)
In-Reply-To: 
References: 	<20120522163053.684b43d0@bhuda.mired.org>
	
Message-ID: <4FBD965A.4040801@pearwood.info>

anatoly techtonik wrote:

> I am all ears how to make shutil.run() more secure. Right now I must
> confess that I don't even realize.how serious is this problems, so if
> anyone can came up with a real-world example with explanation of
> security concern that could be copied "as-is" into documentation, it
> will surely be appreciated not only by me.

Start here:

http://cwe.mitre.org/top25/index.html

Code injection attacks account for two of the top three security vulnerabilities, 
ranking above even buffer overflows.

One sub-category of code injection:

OS Command Injection
http://cwe.mitre.org/data/definitions/78.html
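
A minimal, harmless sketch of what OS command injection looks like from Python. It assumes a POSIX system with an `echo` executable on the PATH; the injected command is just another `echo`, standing in for whatever an attacker would really run:

```python
import subprocess

# Imagine this string arrived from a user, a file name, or the network.
user_input = "hello; echo INJECTED"

# Unsafe: the shell parses the whole string, so ';' starts a second command.
unsafe = subprocess.check_output("echo " + user_input, shell=True)

# Safer: no shell is involved; the string is a single argument to echo.
safe = subprocess.check_output(["echo", user_input])

unsafe_lines = unsafe.decode().splitlines()
safe_lines = safe.decode().splitlines()
print(unsafe_lines)
print(safe_lines)
```

In the first call the data is interpreted as code; in the second it stays data.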



-- 
Steven


From debatem1 at gmail.com  Thu May 24 05:24:39 2012
From: debatem1 at gmail.com (geremy condra)
Date: Wed, 23 May 2012 20:24:39 -0700
Subject: [Python-ideas] shutil.run (Was: shutil.runret and shutil.runout)
In-Reply-To: <4FBD965A.4040801@pearwood.info>
References: 
	<20120522163053.684b43d0@bhuda.mired.org>
	
	<4FBD965A.4040801@pearwood.info>
Message-ID: 

On Wed, May 23, 2012 at 7:00 PM, Steven D'Aprano wrote:

> anatoly techtonik wrote:
>
>  I am all ears how to make shutil.run() more secure. Right now I must
>> confess that I don't even realize.how serious is this problems, so if
>> anyone can came up with a real-world example with explanation of
>> security concern that could be copied "as-is" into documentation, it
>> will surely be appreciated not only by me.
>>
>
> Start here:
>
> http://cwe.mitre.org/top25/index.html
>
> Code injection attacks include two of the top three security
> vulnerabilities, over even buffer overflows.
>
> One sub-category of code injection:
>
> OS Command Injection
> http://cwe.mitre.org/data/definitions/78.html


I talked about this in my pycon talk this year. It's easy to avoid and
disastrous to get wrong. Please don't do it this way.

Geremy Condra



From ronaldoussoren at mac.com  Thu May 24 08:47:35 2012
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 24 May 2012 08:47:35 +0200
Subject: [Python-ideas] Add a generic async IO poller/reactor to
	select	module
In-Reply-To: 
References: 
Message-ID: 


On 24 May, 2012, at 3:32, Giampaolo Rodolà wrote:
>>>> 
> 
> The handler is supposed to provide 3 methods:
> - handle_read_event
> - handle_write_event
> - handle_error_event
> 
> Users willing to support multiple event loops such as wx, gtk etc can do:
> 
>>>> while 1:
> ...       poller.poll(timeout=0.1, blocking=False)
> ...       otherpoller.poll()
> 
> 
> Basically, this would be the whole API.

Isn't this a limited version of asyncore? (poller.poll == asyncore.loop, the handler is a subset of asyncore.dispatcher).

Ronald

From lyricconch at gmail.com  Thu May 24 12:47:48 2012
From: lyricconch at gmail.com (=?UTF-8?B?5rW36Z+1?=)
Date: Thu, 24 May 2012 18:47:48 +0800
Subject: [Python-ideas] Extended "break" "continue" in "for ... in" block.
Message-ID: 

Hi all...
I'd like to propose a syntax extension of "break" and "continue"
to let them work together with "yield".

Syntax:
continue_stmt: "continue" [ test ]
break_stmt: "break" [ test ]
it is only valid in "for ... in" block.

When we write "for  in : ",
we can say there is a generator ("__g = iter()") providing values
(" = next(__g)") and the values are processed by .
 implements the computation. (our focus)
The code inside the generator ("__iter__" of ) implements the iteration.
(sealed logic)
---- as current ----
"continue" is "next(__g)" (which equals "__g.send(None)"),
"break" leaves the block and __g is garbage collected (which implies a
__g.close()).
"return" or "raise" inside  leaves the block and __g is garbage collected.

Let's reverse things. Consider that we are writing __g's code:
the generator function implements the computation. (our focus)
The outside code ( of "for ... in") implements the continuation. (sealed logic)
---- as proposal ----
"continue " is equivalent to "__g.send()".
"continue" is an alias of "continue None".
"break " is equivalent to "__g.throw()".
"break" is an alias of "break GeneratorExit".
"return" or "raise" inside  implies a "break" to __g.

With communication between "yield" and "continue"/"break",
this works just like Ruby's blocks, except that the return value of __g is lost
(may we use an "as" after "for ... in" to fetch the return value?).


-- 
= =!


From simon.sapin at kozea.fr  Thu May 24 13:05:15 2012
From: simon.sapin at kozea.fr (Simon Sapin)
Date: Thu, 24 May 2012 13:05:15 +0200
Subject: [Python-ideas] Extended "break" "continue" in "for ... in"
	block.
In-Reply-To: 
References: 
Message-ID: <4FBE15EB.4080706@kozea.fr>

On 24/05/2012 12:47, 海韵 wrote:
> we can say there is a generator ("__g = iter()")
> [...]
> "continue " is equiv to "__g.send()".

Hi,

iter() returns an iterator, not a generator.
All generators are iterators, but not all iterators are generators: an 
iterator may not have a .send() method.

How would "continue something_that_is_not_None" behave with an iterator 
without a .send() method?
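
The distinction is easy to check in a small sketch (the names below are made up for the example):

```python
def gen():
    yield 1


list_iterator = iter([1, 2, 3])   # a plain iterator
generator = gen()                 # a generator, which is also an iterator

has_send_iter = hasattr(list_iterator, "send")
has_send_gen = hasattr(generator, "send")

print(has_send_iter, has_send_gen)
```

Both objects support next(), but only the generator could receive a value from the proposed "continue" form.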

Regards,
-- 
Simon Sapin


From steve at pearwood.info  Thu May 24 13:26:25 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 24 May 2012 21:26:25 +1000
Subject: [Python-ideas] Extended "break" "continue" in "for ... in"
	block.
In-Reply-To: 
References: 
Message-ID: <4FBE1AE1.9030800@pearwood.info>

海韵 wrote:
> Hi all...
> i'd like to propose a syntax extend of "break" and  "continue"
> to let them work together with "yield".

Can you give an example of how you would use them, and why?



-- 
Steven


From g.rodola at gmail.com  Thu May 24 13:50:27 2012
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Thu, 24 May 2012 13:50:27 +0200
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: 
References: 
	
Message-ID: 

2012/5/24 Ronald Oussoren :
>
> On 24 May, 2012, at 3:32, Giampaolo Rodolà wrote:
>>>>>
>>
>> The handler is supposed to provide 3 methods:
>> - handle_read_event
>> - handle_write_event
>> - handle_error_event
>>
>> Users willing to support multiple event loops such as wx, gtk etc can do:
>>
>>>>> while 1:
>> ...       poller.poll(timeout=0.1, blocking=False)
>> ...       otherpoller.poll()
>>
>>
>> Basically, this would be the whole API.
>
> Isn't this a limited version of asyncore? (poller.poll == asyncore.loop, the handler is a subset of asyncore.dispatcher).
>
> Ronald

poller.poll serves the same purpose as asyncore.loop, yes, but this is
supposed to be independent from asyncore.

--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/


From ubershmekel at gmail.com  Thu May 24 13:59:15 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Thu, 24 May 2012 14:59:15 +0300
Subject: [Python-ideas] a simple namespace type
In-Reply-To: 
References: 
Message-ID: 

On Tue, May 22, 2012 at 7:26 PM, Eric Snow wrote:

> Below I've included a pure Python implementation of a type that I wish
> was a builtin.  I know others have considered similar classes in the
> past without any resulting change to Python, but I'd like to consider
> it afresh[1][2].
>
>  class SimpleNamespace:
>      """A simple attribute-based namespace."""
>      def __init__(self, **kwargs):
>          self.__dict__.update(kwargs)  # or self.__dict__ = kwargs
>      def __repr__(self):
>          keys = sorted(k for k in self.__dict__ if not k.startswith('_'))
>          content = ("{}={!r}".format(k, self.__dict__[k]) for k in keys)
>          return "{}({})".format(type(self).__name__, ", ".join(content))
>
> This is the sort of class that people implement all the time.  There's
> even a similar one in the argparse module, which inspired the second
> class below[3].  If the builtin object type were dict-based rather
> than slot based then this sort of namespace type would be mostly
> superfluous.  However, I also understand how that would add an
> unnecessary resource burden on _all_ objects.  So why not a new type?
>
> Nick Coghlan had this objection recently to a similar proposal[4]:
>
>    Please, no. No new
>    just-like-a-namedtuple-except-you-can't-iterate-over-it type, and
>    definitely not one exposed in the collections module.
>
>    We've been over this before: collections.namedtuple *is* the standard
>    library's answer for structured records. TOOWTDI, and the way we have
>    already chosen includes iterability as one of its expected properties.
> [...]


I've implemented this a few times as well. I called it "AttributeDict" or
"Record".


I think adding an __iter__ method would be beneficial. E.g.

class SimpleNamespace:
     def __init__(self, **kwargs):
         self.__dict__.update(kwargs)  # or self.__dict__ = kwargs
     def __iter__(self):
         # defined on the class: special methods are looked up on the type
         return iter(self.__dict__)
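
One detail worth checking with any `__iter__` on such a type: in Python 3, iter() looks up `__iter__` on the class, not in the instance dict. The sketch below uses two hypothetical classes (`InstanceIter`, `ClassIter`) to show the difference:

```python
class InstanceIter:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
        # Stored in the instance dict: iter() never looks here.
        self.__iter__ = lambda: iter(kwargs)


class ClassIter:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __iter__(self):
        # Defined on the class: iter() finds it.
        return iter(self.__dict__)


try:
    iter(InstanceIter(x=1))
    instance_level_works = True
except TypeError:
    instance_level_works = False

names = sorted(ClassIter(x=1, y=2))
print(instance_level_works, names)
```

Storing the function on the instance also has the side effect of adding an `__iter__` entry to the namespace's own attributes.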


Why do we need this imo:

* sometimes x.something feels better than x['something']
* to ease duck-typing, making mocks, etc.
* Named tuple feels clunky for certain dynamic cases (why do I need to
create the type for a one-off?)


I wonder if SimpleNamespace should allow __getitem__ as well...


Yuval

From solipsis at pitrou.net  Thu May 24 14:03:14 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 24 May 2012 14:03:14 +0200
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
References: 
	
	
Message-ID: <20120524140314.303d3bcd@pitrou.net>

On Thu, 24 May 2012 13:50:27 +0200
Giampaolo Rodolà
wrote:
> 2012/5/24 Ronald Oussoren :
> >
> > On 24 May, 2012, at 3:32, Giampaolo Rodolà wrote:
> >>>>>
> >>
> >> The handler is supposed to provide 3 methods:
> >> - handle_read_event
> >> - handle_write_event
> >> - handle_error_event
> >>
> >> Users willing to support multiple event loops such as wx, gtk etc can do:
> >>
> >>>>> while 1:
> >> ...       poller.poll(timeout=0.1, blocking=False)
> >> ...       otherpoller.poll()
> >>
> >>
> >> Basically, this would be the whole API.
> >
> > Isn't this a limited version of asyncore? (poller.poll == asyncore.loop, the handler is a subset of asyncore.dispatcher).
> >
> > Ronald
> 
> poller.poll serves the same purpose of asyncore.loop, yes, but this is
> supposed to be independent from asyncore.

I agree with Ronald that it looks like a less-braindead version of
asyncore. I don't think the select module is the right place.

Also, I don't know why you would specify poller.READ or poller.WRITE
explicitly. Usually you are interested in all events, no?

Regards

Antoine.




From ncoghlan at gmail.com  Thu May 24 14:21:43 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 24 May 2012 22:21:43 +1000
Subject: [Python-ideas] Extended "break" "continue" in "for ... in"
	block.
In-Reply-To: <4FBE1AE1.9030800@pearwood.info>
References: 
	<4FBE1AE1.9030800@pearwood.info>
Message-ID: 

On Thu, May 24, 2012 at 9:26 PM, Steven D'Aprano  wrote:
> 海韵 wrote:
>>
>> Hi all...
>> i'd like to propose a syntax extend of "break" and "continue"
>> to let them work together with "yield".
>
> Can you give an example of how you would use them, and why?

It's an approach to driving a coroutine (and one that was discussed
back when the coroutine methods were added to generators). Currently,
if you're using a generator as a coroutine, you largely *avoid* using
it directly as an iterator. Aside from the initial priming of
coroutines, most generator based code will either treat them as
iterators (via for loops, comprehensions and next() calls), or as
coroutines (via send() and throw() calls).
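
As a small illustration of the coroutine style described above (priming with next(), then driving with send()), here is a sketch with a made-up `running_average` generator:

```python
def running_average():
    """Coroutine: receives numbers via send(), yields the average so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)  # prime the coroutine: run it up to the first yield

results = [avg.send(v) for v in (10, 20, 30)]
avg.close()  # raises GeneratorExit inside the generator
print(results)
```

Note that the driving code never uses `avg` in a for loop, which is the separation between iterator use and coroutine use mentioned above.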

The main reason tinkering with for loops has been resisted is that
native support for even "continue " (the least controversial
part of the suggestion) would likely result in slowing down all for
loops to cover the relatively niche coroutine use case.

Also, if anything was going to map to throw() it would be "continue
raise", not "break":

continue -> next(itr)
continue  -> itr.send()
continue raise  -> itr.throw()

So yeah, this isn't a new proposal, but what's still lacking is a
clear justification of what code will actually *gain* from the
increase in the language complexity. How often are generator based
coroutines actually used outside the context of a larger framework
that already takes care of the next/send/throw details?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ncoghlan at gmail.com  Thu May 24 14:37:03 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 24 May 2012 22:37:03 +1000
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: 
References: 
	
	
Message-ID: 

On Thu, May 24, 2012 at 9:50 PM, Giampaolo Rodolà  wrote:
> poller.poll serves the same purpose of asyncore.loop, yes, but this is
> supposed to be independent from asyncore.

I'd actually like to see something like this pitched as a
"concurrent.eventloop" PEP. PEP 3153 really wasn't what I was
expecting after the discussions at the PyCon US 2011 language summit -
I was expecting "here's a common event loop all the async frameworks
can hook into", but instead we got something a *lot* more ambitious
that tried to merge the entire IO stack for the async frameworks,
rather than just provide a standard way for their event loops to
cooperate.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From g.rodola at gmail.com  Thu May 24 14:45:01 2012
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Thu, 24 May 2012 14:45:01 +0200
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: <20120524140314.303d3bcd@pitrou.net>
References: 
	
	
	<20120524140314.303d3bcd@pitrou.net>
Message-ID: 

2012/5/24 Antoine Pitrou :
> On Thu, 24 May 2012 13:50:27 +0200
> Giampaolo Rodolà
> wrote:
>> 2012/5/24 Ronald Oussoren :
>> >
>> > On 24 May, 2012, at 3:32, Giampaolo Rodolà wrote:
>> >>>>>
>> >>
>> >> The handler is supposed to provide 3 methods:
>> >> - handle_read_event
>> >> - handle_write_event
>> >> - handle_error_event
>> >>
>> >> Users willing to support multiple event loops such as wx, gtk etc can do:
>> >>
>> >>>>> while 1:
>> >> ...       poller.poll(timeout=0.1, blocking=False)
>> >> ...       otherpoller.poll()
>> >>
>> >>
>> >> Basically, this would be the whole API.
>> >
>> > Isn't this a limited version of asyncore? (poller.poll == asyncore.loop, the handler is a subset of asyncore.dispatcher).
>> >
>> > Ronald
>>
>> poller.poll serves the same purpose of asyncore.loop, yes, but this is
>> supposed to be independent from asyncore.
>
> I agree with Ronald that it looks like a less-braindead version of
> asyncore. I don't think the select module is the right place.

Yeah, probably. Usually when I post here I'm the first one not being
sure whether what I propose is a good idea or not. =)
Anyway, it must be clear that what I have in mind is not related to
asyncore per-se.
The proposal is to add a *generic* poller/reactor to select module as
an abstraction layer on top of select(), poll(), epoll() and kqueue(),
that's all.

> Also, I don't know why you would specify poller.READ or poller.WRITE
> explicitly. Usually you are interested in all events, no?

Nope, that's what asyncore does and that's why it is significantly
slower compared to more modern and clever async loops (independently
of the lack of epoll() / kqueue() support in asyncore).
You should only be interested in reading for accepting sockets
(servers) or when you want to receive data.
You should only be interested in writing for connecting sockets
(clients) or when you want to send data.
Being interested in both when, say, you only intend to receive data is
a considerable waste of time, especially when there are many
concurrent connections.
The performance degradation if you wildly look for both read and write
events is *huge*, see benchmarks referring to old vs. new select()
implementation here (~8.5x slowdown with 200 concurrent clients):
http://code.google.com/p/pyftpdlib/issues/detail?id=203#c6
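
A small sketch of why the distinction matters, using socket.socketpair() as a stand-in for a real connection (the variable names are made up for the example). An idle connected socket is almost always writable, so a loop that asks for write events on every socket is woken up immediately even when there is nothing to do:

```python
import select
import socket

a, b = socket.socketpair()

# Nothing has been sent to b, so it is not readable...
readable, _, _ = select.select([b], [], [], 0)

# ...but its send buffer is empty, so it is reported writable right away.
_, writable, _ = select.select([], [b], [], 0)

idle_is_readable = bool(readable)
idle_is_writable = bool(writable)
a.close()
b.close()
print(idle_is_readable, idle_is_writable)
```

This says nothing about the size of the slowdown, which is what the benchmark linked above measures.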


--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/


From ncoghlan at gmail.com  Thu May 24 14:51:36 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 24 May 2012 22:51:36 +1000
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: 
References: 
	
	
	
Message-ID: 

On Thu, May 24, 2012 at 10:37 PM, Nick Coghlan  wrote:
> On Thu, May 24, 2012 at 9:50 PM, Giampaolo Rodolà  wrote:
>> poller.poll serves the same purpose of asyncore.loop, yes, but this is
>> supposed to be independent from asyncore.
>
> I'd actually like to see something like this pitched as a
> "concurrent.eventloop" PEP. PEP 3153 really wasn't what I was
> expecting after the discussions at the PyCon US 2011 language summit -
> I was expecting "here's a common event loop all the async frameworks
> can hook into", but instead we got something a *lot* more ambitious
> that tried to merge the entire IO stack for the async frameworks,
> rather than just provide a standard way for their event loops to
> cooperate.

See the final section of my notes here:
http://www.boredomandlaziness.org/2011/03/python-language-summit-rough-notes.html

Turns out the idea of a PEP 3153 level API *was* raised at the summit,
but I'd still like to see a competing PEP that targets the reactor
level API directly.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ronaldoussoren at mac.com  Thu May 24 14:52:59 2012
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 24 May 2012 14:52:59 +0200
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
 module
In-Reply-To: <20120524140314.303d3bcd@pitrou.net>
References: 
	
	
	<20120524140314.303d3bcd@pitrou.net>
Message-ID: 


On 24 May, 2012, at 14:03, Antoine Pitrou wrote:

> On Thu, 24 May 2012 13:50:27 +0200
> Giampaolo Rodolà
> wrote:
>> 2012/5/24 Ronald Oussoren :
>>> 
>>> On 24 May, 2012, at 3:32, Giampaolo Rodolà wrote:
>>>>>>> 
>>>> 
>>>> The handler is supposed to provide 3 methods:
>>>> - handle_read_event
>>>> - handle_write_event
>>>> - handle_error_event
>>>> 
>>>> Users willing to support multiple event loops such as wx, gtk etc can do:
>>>> 
>>>>>>> while 1:
>>>> ...       poller.poll(timeout=0.1, blocking=False)
>>>> ...       otherpoller.poll()
>>>> 
>>>> 
>>>> Basically, this would be the whole API.
>>> 
>>> Isn't this a limited version of asyncore? (poller.poll == asyncore.loop, the handler is a subset of asyncore.dispatcher).
>>> 
>>> Ronald
>> 
>> poller.poll serves the same purpose of asyncore.loop, yes, but this is
>> supposed to be independent from asyncore.
> 
> I agree with Ronald that it looks like a less-braindead version of
> asyncore. I don't think the select module is the right place.

What worries me most is that it might only look like a better version of asyncore. I'd much rather see something based on the event-handling core of Twisted because that code base is used in production and is hence more likely to be correct w.r.t. odd real-world conditions. IIRC doing this was discussed at the language summit in 2011, but as Nick mentions that doesn't seem to be the focus of PEP 3153.

By the way, I am not using Twisted myself; at this time I'm still using homebrew select loops and asyncore.

> 
> Also, I don't know why you would specify poller.READ or poller.WRITE
> explicitly. Usually you are interested in all events, no?

You're not always interested in write events; those are only interesting when you have data that must be written to a socket.

Ronald


From anacrolix at gmail.com  Thu May 24 15:05:12 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Thu, 24 May 2012 21:05:12 +0800
Subject: [Python-ideas] Composability and concurrent.futures
In-Reply-To: <951AE63A-2AE9-4314-8B05-F80EC90D3314@cs.washington.edu>
References: <445C226E-4C9E-4D9D-A641-7FC3BEE64185@cs.washington.edu>
	
	<951AE63A-2AE9-4314-8B05-F80EC90D3314@cs.washington.edu>
Message-ID: 

>
> To be clear, I meant to refer to processes *or* threads when discussing
> the problem originally. The ProcessPoolExecutor is pretty useful (in my
> experience) for easily getting speedup even on pure-Python CPU-bound
> workloads.
>

FWIW that wasn't the default "use processes" spike. In my experience toying
with concurrency in Python, trying to manage the load threads put on the
system always ends badly. The 2 best supported concurrency mechanisms,
threads and processes, are constantly tête-à-tête; neither is adequate when
you start to consider extreme concurrency scenarios. I suggest this because
if you're considering composing executors, you're already trying to reduce
the overhead (wastage) that processes and threads are incurring on your
system for these purposes.

From g.rodola at gmail.com  Thu May 24 15:06:18 2012
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Thu, 24 May 2012 15:06:18 +0200
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: 
References: 
	
	
	<20120524140314.303d3bcd@pitrou.net>
	
Message-ID: 

2012/5/24 Ronald Oussoren :
>
> On 24 May, 2012, at 14:03, Antoine Pitrou wrote:
>
>> On Thu, 24 May 2012 13:50:27 +0200
>> Giampaolo Rodolà
>> wrote:
>>> 2012/5/24 Ronald Oussoren :
>>>>
>>>> On 24 May, 2012, at 3:32, Giampaolo Rodolà wrote:
>>>>>>>>
>>>>>
>>>>> The handler is supposed to provide 3 methods:
>>>>> - handle_read_event
>>>>> - handle_write_event
>>>>> - handle_error_event
>>>>>
>>>>> Users willing to support multiple event loops such as wx, gtk etc can do:
>>>>>
>>>>>>>> while 1:
>>>>> ...       poller.poll(timeout=0.1, blocking=False)
>>>>> ...       otherpoller.poll()
>>>>>
>>>>>
>>>>> Basically, this would be the whole API.
>>>>
>>>> Isn't this a limited version of asyncore? (poller.poll == asyncore.loop, the handler is a subset of asyncore.dispatcher).
>>>>
>>>> Ronald
>>>
>>> poller.poll serves the same purpose of asyncore.loop, yes, but this is
>>> supposed to be independent from asyncore.
>>
>> I agree with Ronald that it looks like a less-braindead version of
>> asyncore. I don't think the select module is the right place.
>
> What worries me most is that it might only look like a better version of asyncore.

Please, forget about asyncore: this has nothing to do with it per se,
as it's just a reactor - it doesn't aim to provide any connection
handling.
Given the poor asyncore API, I doubt it could even be integrated with
it without breaking backward compatibility.

--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/


From solipsis at pitrou.net  Thu May 24 15:06:59 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 24 May 2012 15:06:59 +0200
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
References: 
	
	
	
Message-ID: <20120524150659.49361158@pitrou.net>

On Thu, 24 May 2012 22:37:03 +1000
Nick Coghlan  wrote:
> On Thu, May 24, 2012 at 9:50 PM, Giampaolo Rodolà  wrote:
> > poller.poll serves the same purpose of asyncore.loop, yes, but this is
> > supposed to be independent from asyncore.
> 
> I'd actually like to see something like this pitched as a
> "concurrent.eventloop" PEP.

Sounds like a good idea to me. By the way, it should also have some
support for delayed calls to be actually useful (something that
asyncore *still* doesn't have, AFAIK).
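
A minimal sketch of how delayed calls could sit next to a poller,
assuming a heap of deadlines that is consulted before each poll().
The class name DelayedCalls and its methods are hypothetical, not
part of any proposed API; the injectable clock is only there to make
the behaviour easy to check.

```python
import heapq
import itertools
import time


class DelayedCalls:
    """Hypothetical helper: a heap of (deadline, seq, callback) entries."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal deadlines

    def call_later(self, delay, callback):
        """Schedule callback to run once `delay` seconds from now."""
        deadline = self._clock() + delay
        heapq.heappush(self._heap, (deadline, next(self._counter), callback))

    def timeout(self, default=None):
        """Seconds until the next deadline, suitable as a poll() timeout."""
        if not self._heap:
            return default
        return max(0.0, self._heap[0][0] - self._clock())

    def run_due(self):
        """Run every callback whose deadline has passed."""
        now = self._clock()
        while self._heap and self._heap[0][0] <= now:
            _, _, callback = heapq.heappop(self._heap)
            callback()
```

An event loop built this way would call timeout() to bound how long
it blocks in poll(), then run_due() after each wakeup.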

Regards

Antoine.




From ncoghlan at gmail.com  Thu May 24 15:23:42 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 24 May 2012 23:23:42 +1000
Subject: [Python-ideas] Composability and concurrent.futures
In-Reply-To: 
References: <445C226E-4C9E-4D9D-A641-7FC3BEE64185@cs.washington.edu>
	
	<951AE63A-2AE9-4314-8B05-F80EC90D3314@cs.washington.edu>
	
Message-ID: 

It's really up to individual libraries to make it possible for
applications to provide the executor explicitly, rather than the
library assuming it's OK to just create its own.
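
As a rough illustration of that convention, here is a sketch of a
library function that accepts an executor from the application and
only falls back to a private one when none is given. The function
name fetch_lengths and its parameters are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_lengths(items, executor=None):
    """Return len() of each item, computed on an executor.

    Hypothetical library function: if the caller passes an executor we
    use it and leave its lifetime alone; only when none is given do we
    create (and shut down) a private one.
    """
    if executor is not None:
        return list(executor.map(len, items))
    with ThreadPoolExecutor(max_workers=2) as private:
        return list(private.map(len, items))


# The application owns the pool and can share it across libraries.
with ThreadPoolExecutor(max_workers=4) as shared:
    result = fetch_lengths(["a", "bb", "ccc"], executor=shared)
```

The point of the design is that the application, not the library,
decides how many workers exist in total.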

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From ericsnowcurrently at gmail.com  Thu May 24 19:17:31 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 24 May 2012 11:17:31 -0600
Subject: [Python-ideas] a simple namespace type
In-Reply-To: 
References: 
	
Message-ID: 

On Thu, May 24, 2012 at 5:59 AM, Yuval Greenfield  wrote:
> On Tue, May 22, 2012 at 7:26 PM, Eric Snow 
> wrote:
>>
>> Below I've included a pure Python implementation of a type that I wish
>> was a builtin.  I know others have considered similar classes in the
>> past without any resulting change to Python, but I'd like to consider
>> it afresh[1][2].
>> [...]
>
> I've implemented this a few times as well. I called it "AttributeDict" or
> "Record".
>
>
> I think adding an __iter__ method would be beneficial. E.g.
>
> class SimpleNamespace:
>     def __init__(self, **kwargs):
>         self.__dict__.update(kwargs)  # or self.__dict__ = kwargs
>         self.__iter__ = lambda: iter(kwargs.keys())

I'd like to limit the syntactic overlap with dict as much as possible.
 Effectively this is just a simple but distinct facade around dict to
give a namespace with attribute access.  I suppose part of the
question is how much of the Mapping interface would belong instead to
a hypothetical Namespace interface. (I'm definitely _not_ proposing
such an unnecessary extra level of abstraction).

Regardless, if you want to do dict things then you can get the
underlying dict using vars(ns) or ns.__dict__ on your instance.
Alternately you can subclass the SimpleNamespace type to get all the
extra goodies you want, as I showed with the Namespace class at the
bottom of my first message.
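
A small sketch of both options, assuming a pure Python namespace
along the lines discussed in this thread. IterableNamespace is a
hypothetical subclass name, not the Namespace class from the first
message. (The standard library has since gained
types.SimpleNamespace, which serves the same purpose.)

```python
class SimpleNamespace:
    """Minimal attribute-based namespace (illustrative sketch)."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __repr__(self):
        items = ", ".join(f"{k}={v!r}" for k, v in sorted(vars(self).items()))
        return f"{type(self).__name__}({items})"


class IterableNamespace(SimpleNamespace):
    """Hypothetical subclass adding iteration over attribute names."""

    def __iter__(self):
        # Defined on the class: iter() looks up __iter__ on the type,
        # so assigning it on the instance would have no effect.
        return iter(vars(self))


ns = SimpleNamespace(x=1, y=2)
ns.z = 3
mapping = vars(ns)  # the underlying dict, for "dict things"
names = sorted(IterableNamespace(a=1, b=2))
```

Keeping the base type free of mapping methods means anything that
wants dict behaviour has to ask for it explicitly.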

> Why do we need this imo:
>
> * sometimes x.something feels better than x['something']
> * to ease duck-typing, making mocks, etc.
> * Named tuple feels clunky for certain dynamic cases (why do I need to
> create the type for a one-off?)

Yup.

> I wonder if SimpleNameSpace should allow __getitem__ as well...

Same thing: just use vars(ns) or a subclass of SimpleNamespace.

-eric


From guido at python.org  Thu May 24 20:14:28 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 24 May 2012 11:14:28 -0700
Subject: [Python-ideas] a simple namespace type
In-Reply-To: 
References: 
	
	
Message-ID: 

On Thu, May 24, 2012 at 10:17 AM, Eric Snow  wrote:
> On Thu, May 24, 2012 at 5:59 AM, Yuval Greenfield  wrote:
>> On Tue, May 22, 2012 at 7:26 PM, Eric Snow 
>> wrote:
>>>
>>> Below I've included a pure Python implementation of a type that I wish
>>> was a builtin.  I know others have considered similar classes in the
>>> past without any resulting change to Python, but I'd like to consider
>>> it afresh[1][2].
>>> [...]
>>
>> I've implemented this a few times as well. I called it "AttributeDict" or
>> "Record".

I tend to call it "Struct(ure)" -- I guess I like C better than Pascal. :-)

>> I think adding an __iter__ method would be beneficial. E.g.
>>
>> class SimpleNamespace:
>>     def __init__(self, **kwargs):
>>         self.__dict__.update(kwargs)  # or self.__dict__ = kwargs
>>         self.__iter__ = lambda: iter(kwargs.keys())
>
> I'd like to limit the syntactic overlap with dict as much as possible.

+1

>  Effectively this is just a simple but distinct facade around dict to
> give a namespace with attribute access.  I suppose part of the
> question is how much of the Mapping interface would belong instead to
> a hypothetical Namespace interface. (I'm definitely _not_ proposing
> such an unnecessary extra level of abstraction).

Possibly there is a (weird?) parallel with namedtuple. The end result
is somewhat similar: you get to use attribute names instead of the
accessor syntax (x[y]) of the underlying type. But the "feel" of the
type is different, and inherits more of the underlying type
(namedtuple is immutable and has a fixed set of keys, whereas the type
proposed here is mutable and allows arbitrary keys as long as they
look like Python names).

> Regardless, if you want to do dict things then you can get the
> underlying dict using vars(ns) or ns.__dict__ on your instance.
> Alternately you can subclass the SimpleNamespace type to get all the
> extra goodies you want, as I showed with the Namespace class at the
> bottom of my first message.
>
>> Why do we need this imo:
>>
>> * sometimes x.something feels better than x['something']
>> * to ease duck-typing, making mocks, etc.
>> * Named tuple feels clunky for certain dynamic cases (why do I need to
>> create the type for a one-off?)
>
> Yup.
>
>> I wonder if SimpleNameSpace should allow __getitem__ as well...
>
> Same thing: just use vars(ns) or a subclass of SimpleNamespace.

-- 
--Guido van Rossum (python.org/~guido)


From g.rodola at gmail.com  Thu May 24 20:40:31 2012
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Thu, 24 May 2012 20:40:31 +0200
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: 
References: 
	
	
	
	
Message-ID: 

2012/5/24 Nick Coghlan :
> On Thu, May 24, 2012 at 10:37 PM, Nick Coghlan  wrote:
>> On Thu, May 24, 2012 at 9:50 PM, Giampaolo Rodolà  wrote:
>>> poller.poll serves the same purpose of asyncore.loop, yes, but this is
>>> supposed to be independent from asyncore.
>>
>> I'd actually like to see something like this pitched as a
>> "concurrent.eventloop" PEP. PEP 3153 really wasn't what I was
>> expecting after the discussions at the PyCon US 2011 language summit -
>> I was expecting "here's a common event loop all the async frameworks
>> can hook into", but instead we got something a *lot* more ambitious
>> that tried to merge the entire IO stack for the async frameworks,
>> rather than just provide a standard way for their event loops to
>> cooperate.
>
> See the final section of my notes here:
> http://www.boredomandlaziness.org/2011/03/python-language-summit-rough-notes.html
>
> Turns out the idea of a PEP 3153 level API *was* raised at the summit,
> but I'd still like to see a competing PEP that targets the reactor
> level API directly.
>
> Cheers,
> Nick.


It's not clear to me what such a PEP should address in particular,
anyway here's a bunch of semi-random ideas.


=== Idea #1 ===

4 classes (SelectPoller, PollPoller, EpollPoller, KqueuePoller) within
concurrent.eventloop namespace all sharing the same API:

- register(fd, events, callback)  # callback gets called with events as arg
- modify(fd, events)
- unregister(fd)
- call_later(timeout, callback, errback=None)
- call_every(timeout, callback, errback=None)
- poll(timeout=1.0, blocking=True)
- close()

call_later() and call_every() can return an object having cancel() and
reset() methods.
The user willing to register a new handler will do:

>>> poller.register(sock.fileno(), poller.READ | poller.WRITE, callback)

...then, in the callback:

def callback(events):
    if events & poller.ERROR and not events & poller.READ:
          disconnect()
    else:
         if events & poller.READ:
             read()
         if events & poller.WRITE:
             write()


pros: highly customizable
cons: too low level, requires manual handling
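
A minimal sketch of Idea #1 on top of select.select(), to make the
shape of the API concrete. The flag values, the internal dict and the
simplified poll(timeout) signature (no "blocking" argument, no
call_later()/call_every()/close()) are assumptions made for the
example, not a settled design.

```python
import select
import socket


class SelectPoller:
    """Sketch of Idea #1 using select.select() (names are assumed)."""

    READ, WRITE, ERROR = 1, 2, 4

    def __init__(self):
        self._handlers = {}  # fd -> (events, callback)

    def register(self, fd, events, callback):
        self._handlers[fd] = (events, callback)

    def modify(self, fd, events):
        _, callback = self._handlers[fd]
        self._handlers[fd] = (events, callback)

    def unregister(self, fd):
        del self._handlers[fd]

    def poll(self, timeout=1.0):
        """Wait up to `timeout` seconds, then dispatch ready callbacks once."""
        readers = [fd for fd, (ev, _) in self._handlers.items() if ev & self.READ]
        writers = [fd for fd, (ev, _) in self._handlers.items() if ev & self.WRITE]
        r, w, x = select.select(readers, writers, list(self._handlers), timeout)
        ready = {}
        for fds, flag in ((r, self.READ), (w, self.WRITE), (x, self.ERROR)):
            for fd in fds:
                ready[fd] = ready.get(fd, 0) | flag
        for fd, events in ready.items():
            handler = self._handlers.get(fd)  # may have been unregistered
            if handler is not None:
                handler[1](events)


# Usage: one end of a socket pair becomes readable after the other sends.
a, b = socket.socketpair()
poller = SelectPoller()
seen = []
poller.register(a.fileno(), poller.READ, seen.append)
b.send(b"ping")
poller.poll(timeout=1.0)
```

The callback receives the combined event mask, so the manual
READ/WRITE/ERROR handling shown above stays in user code.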

=== Idea #2 ===

same as #1 except:

- register(fd, events)
- poll(timeout=1.0)  # doesn't block, return {fd:events, fd:events, ...}


=== Idea #3 ===

same as #1 except:

- register(fd, events, handler)
- poll(timeout=1.0, blocking=True)

...poll() will call handler.handle_X_event() depending on the current
event (READ, WRITE or ERROR).
A map such as {fd:handler, fd:handler} will be maintained internally.

- pros: easier to use
- cons: more rigid, requires a "contract" with the handler
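
A sketch of the dispatch step that Idea #3 implies, assuming the
three handle_X_event method names from the original proposal. The
dispatch() function, the LoggingHandler class and the flag values are
hypothetical; the "ready" list stands in for whatever the underlying
poll call reports.

```python
READ, WRITE, ERROR = 1, 2, 4


class LoggingHandler:
    """Hypothetical handler honouring the three-method contract."""

    def __init__(self):
        self.log = []

    def handle_read_event(self):
        self.log.append("read")

    def handle_write_event(self):
        self.log.append("write")

    def handle_error_event(self):
        self.log.append("error")


def dispatch(socket_map, ready):
    """Call handler methods for each (fd, events) pair reported ready.

    `socket_map` is the {fd: handler} map; `ready` is what a poll()
    call would have produced.
    """
    for fd, events in ready:
        handler = socket_map.get(fd)
        if handler is None:
            continue  # unregistered while events were pending
        if events & ERROR and not events & READ:
            handler.handle_error_event()
            continue
        if events & READ:
            handler.handle_read_event()
        if events & WRITE:
            handler.handle_write_event()


handler = LoggingHandler()
dispatch({7: handler}, [(7, READ | WRITE), (7, ERROR), (9, READ)])
```

The "contract" is exactly the set of method names dispatch() relies
on, which is what makes this variant more rigid than Idea #1.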


--- Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/


From ericsnowcurrently at gmail.com  Thu May 24 21:34:07 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 24 May 2012 13:34:07 -0600
Subject: [Python-ideas] a simple namespace type
In-Reply-To: 
References: 
	
	
	
Message-ID: 

On Thu, May 24, 2012 at 12:14 PM, Guido van Rossum  wrote:
> On Thu, May 24, 2012 at 10:17 AM, Eric Snow  wrote:
>>  Effectively this is just a simple but distinct facade around dict to
>> give a namespace with attribute access.  I suppose part of the
>> question is how much of the Mapping interface would belong instead to
>> a hypothetical Namespace interface. (I'm definitely _not_ proposing
>> such an unnecessary extra level of abstraction).
>
> Possibly there is a (weird?) parallel with namedtuple. The end result
> is somewhat similar: you get to use attribute names instead of the
> accessor syntax (x[y]) of the underlying type. But the "feel" of the
> type is different, and inherits more of the underlying type
> (namedtuple is immutable and has a fixed set of keys, whereas the type
> proposed here is mutable and allows arbitrary keys as long as they
> look like Python names).

Yeah, the feel is definitely different.  I've been thinking about this
because of the code for sys.implementation.  Using a structseq would
probably have been the simplest approach there, but a named tuple doesn't
feel right.  In contrast, a SimpleNamespace would fit much better.

As far as this goes generally, the pattern of a simple, dynamic
attribute-based namespace has been implemented a zillion times (and
it's easy to do).  This is because people find a simple dynamic
namespace really handy and they want the attribute-access interface
rather than a mapping.

In contrast, a namedtuple is, as Nick said, "the standard library's
answer for structured records".  It's an immutable (attribute-based)
namespace implementing the Sequence interface.  It's a tuple and
directly reflects the underlying concept of tuples in Python by giving
the values names.

SimpleNamespace (and the like) isn't a structured record.  Its only
job is to be an attribute-based namespace with as simple an interface
as possible.

So why isn't a type like SimpleNamespace in the stdlib? Because it's
trivial to implement.  There's a certain trivial-ness threshold a
function/type must pass before it gets canonized, and rightly so.

Anyway, while many would use something like SimpleNamespace out of
the standard library, my impetus was having it as a builtin type so I
could use it for sys.implementation.  :)

FWIW, I have an implementation (pure Python + C extension) of
SimpleNamespace on PyPI:

  http://pypi.python.org/pypi/simple_namespace

-eric


From tjreedy at udel.edu  Thu May 24 22:04:40 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 24 May 2012 16:04:40 -0400
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: 
References: 
	
	
	
	
	
Message-ID: 

On 5/24/2012 2:40 PM, Giampaolo Rodolà wrote:

> It's not clear to me what such a PEP should address in particular,
> anyway here's a bunch of semi-random ideas.

I have been reading for perhaps a decade how bad asyncore is. So I hope
you stick with trying to thrash out something different, even if the
discussion gets tedious or contentious.

> === Idea #1 ===
>
> 4 classes (SelectPoller, PollPoller, EpollPoller, KqueuePoller) within
> concurrent.eventloop namespace all sharing the same API:

For new classes, the first question is what concept (and data/function 
grouping) they and their instances represent. As a naive event loop 
user, I might think in terms of event sources (or sets of sources) and 
corresponding handlers. For events generated by 'file' polling, the 
particular method would seem like a secondary issue.

Your proposed classes are named after methods and you give no 
initialization api. This suggests to me that you mean for all files 
being polled by the same method to be grouped together. If so, there 
would only need to be 0 or 1 instance of each 'class', in which case they
could just as well be modules.

In other words, I am unsure what concept these classes would represent. 
I am perhaps thinking at too high a level.

> - register(fd, events, callback)  # callback gets called with events as arg
> - modify(fd, events)
> - unregister(fd)
> - call_later(timeout, callback, errback=None)
> - call_every(timeout, callback, errback=None)
> - poll(timeout=1.0, blocking=True)
> - close()
>
> call_later() and call_every() can return an object having cancel() and
> reset() methods.

-- 
Terry Jan Reedy




From cs at zip.com.au  Fri May 25 00:37:52 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Fri, 25 May 2012 08:37:52 +1000
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
 module
In-Reply-To: <20120524140314.303d3bcd@pitrou.net>
References: <20120524140314.303d3bcd@pitrou.net>
Message-ID: <20120524223752.GA7468@cskk.homeip.net>

On 24May2012 14:03, Antoine Pitrou  wrote:
| Also, I don't know why you would specify poller.READ or poller.WRITE
| explicitly. Usually you are interested in all events, no?

Personally, I would want specificity. If I only care about write (eg I'm
only sending), I would only specify poller.WRITE and have my handler
only know and care about that. Possibly it would be good to be able to
raise an exception for events I hadn't handled, but I'd be half inclined
to have my handler do that, were it wanted (yes, there is some tension
in this sentence).

Unless I'm missing something here.

Just my 2c,
-- 
Cameron Simpson  DoD#743
http://www.cskk.ezoshosting.com/cs/

I just didn't give up, not riding it out wasn't an option. You don't
crash, until you do. The longer you ride it out the more likely you
are to ride it out. Throwing it away, saves nothing.    - J. Pridmore


From ronaldoussoren at mac.com  Fri May 25 08:39:23 2012
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Fri, 25 May 2012 08:39:23 +0200
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
 module
In-Reply-To: 
References: 
	
	
	
	
	
Message-ID: 


On 24 May, 2012, at 20:40, Giampaolo Rodolà wrote:

> 2012/5/24 Nick Coghlan :
>> On Thu, May 24, 2012 at 10:37 PM, Nick Coghlan  wrote:
>>> On Thu, May 24, 2012 at 9:50 PM, Giampaolo Rodolà  wrote:
>>>> poller.poll serves the same purpose of asyncore.loop, yes, but this is
>>>> supposed to be independent from asyncore.
>>> 
>>> I'd actually like to see something like this pitched as a
>>> "concurrent.eventloop" PEP. PEP 3153 really wasn't what I was
>>> expecting after the discussions at the PyCon US 2011 language summit -
>>> I was expecting "here's a common event loop all the async frameworks
>>> can hook into", but instead we got something a *lot* more ambitious
>>> that tried to merge the entire IO stack for the async frameworks,
>>> rather than just provide a standard way for their event loops to
>>> cooperate.
>> 
>> See the final section of my notes here:
>> http://www.boredomandlaziness.org/2011/03/python-language-summit-rough-notes.html
>> 
>> Turns out the idea of a PEP 3153 level API *was* raised at the summit,
>> but I'd still like to see a competing PEP that targets the reactor
>> level API directly.
>> 
>> Cheers,
>> Nick.
> 
> 
> It's not clear to me what such a PEP should address in particular,
> anyway here's a bunch of semi-random ideas.


All of these are probably too low level to be the only API because they don't encapsulate error handling. 

A slightly higher level API would have a callback with received data and a buffered API for sending data. That way the networking library can deal with low-level socket API errors and translate them to useful abstract errors. It would also handle some errors, like an EAGAIN error, itself.
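
A minimal sketch of the sending half of such an API, assuming a
wrapper that buffers outgoing data and absorbs EAGAIN (surfaced in
Python 3 as BlockingIOError) on its own. BufferedSender and its
method names are hypothetical and are not Twisted's transport API.

```python
import socket


class BufferedSender:
    """Hypothetical transport-like wrapper around a non-blocking socket."""

    def __init__(self, sock):
        sock.setblocking(False)
        self._sock = sock
        self._buffer = bytearray()

    def write(self, data):
        """Queue data; nothing is sent until the socket is writable."""
        self._buffer += data

    def wants_write(self):
        """True while there is pending data, i.e. WRITE events matter."""
        return bool(self._buffer)

    def handle_write_event(self):
        """Send as much as the kernel accepts, keeping the remainder."""
        try:
            sent = self._sock.send(self._buffer)
        except BlockingIOError:  # EAGAIN / EWOULDBLOCK: retry on next event
            return
        del self._buffer[:sent]
```

A reactor would register for write events only while wants_write()
is true, so user code never sees a partial send or an EAGAIN.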

Also: how would you use SSL with these APIs? 

The API would probably end up with functionality similar to Twisted's reactor and transport APIs (and possibly endpoints, but I don't know how stable that API is).

Ronald

From list at qtrac.plus.com  Fri May 25 09:53:47 2012
From: list at qtrac.plus.com (Mark Summerfield)
Date: Fri, 25 May 2012 08:53:47 +0100
Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion)
Message-ID: <20120525085347.31215c94@dino>

Hi,

Built-ins:

In an effort to keep the core language as small as possible (to keep it
"brain sized":-) would it be reasonable to deprecate filter() and map()
and to move them to the standard library as happened with reduce()?
After all, don't people mostly use list comprehensions and generator
expressions for these nowadays?

Docs:

The Python Module Index http://docs.python.org/dev/py-modindex.html
Shows _ | a | b | ...
This is prettier than _ | A | B | ...
but also harder to click because the letters are smaller; so I would
prefer the use of capitals.

-- 
Mark Summerfield, Qtrac Ltd, www.qtrac.eu
    C++, Python, Qt, PyQt - training and consultancy
        "Programming in Go" - ISBN 0321774639
            http://www.qtrac.eu/gobook.html


From ncoghlan at gmail.com  Fri May 25 10:28:28 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 25 May 2012 18:28:28 +1000
Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion)
In-Reply-To: <20120525085347.31215c94@dino>
References: <20120525085347.31215c94@dino>
Message-ID: 

On Fri, May 25, 2012 at 5:53 PM, Mark Summerfield  wrote:
> Hi,
>
> Built-ins:
>
> In an effort to keep the core language as small as possible (to keep it
> "brain sized":-) would it be reasonable to deprecate filter() and map()
> and to move them to the standard library as happened with reduce()?
> After all, don't people mostly use list comprehensions and generator
> expressions for these nowadays?

I'd personally agree with filter() moving, but "map(str, seq)" still
beats "(str(x) for x in seq)" by a substantial margin for me when it
comes to quickly and cleanly encapsulating a common idiom such that it
is easier both to read *and* write.

The basic problem is that the answer to your question is "no" - for
preexisting functions, a lot of people still use filter() and map(),
with the comprehension forms reigning supreme only when someone would
have had to otherwise use a lambda expression.

We won the argument for moving reduce() to functools because it's such
a pain to use correctly that it clearly qualified as an attractive
nuisance.
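
One commonly cited pitfall, sketched here as an illustration (it is
not necessarily the specific problem meant above): reduce() without
an initial value fails on empty input, which an explicit loop never
does.

```python
from functools import reduce
import operator

values = [1, 2, 3, 4]

# reduce() with an explicit initial value also works for empty input.
product = reduce(operator.mul, values, 1)
empty_product = reduce(operator.mul, [], 1)

# Without the initial value, an empty sequence raises TypeError.
try:
    reduce(operator.mul, [])
except TypeError:
    failed_on_empty = True
else:
    failed_on_empty = False

# The explicit loop says the same thing with no such trap.
total = 1
for value in values:
    total *= value
```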

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From simon.sapin at kozea.fr  Fri May 25 10:25:29 2012
From: simon.sapin at kozea.fr (Simon Sapin)
Date: Fri, 25 May 2012 10:25:29 +0200
Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion)
In-Reply-To: <20120525085347.31215c94@dino>
References: <20120525085347.31215c94@dino>
Message-ID: <4FBF41F9.90609@kozea.fr>

Hi,

On 25/05/2012 09:53, Mark Summerfield wrote:
> Built-ins:
>
> In an effort to keep the core language as small as possible (to keep it
> "brain sized":-) would it be reasonable to deprecate filter() and map()
> and to move them to the standard library as happened with reduce()?
> After all, don't people mostly use list comprehensions and generator
> expressions for these nowadays?


Aside from the pain of porting existing code, what would this achieve? 
How do filter() and map() bother you if you can just ignore them and not 
use them?

The only upside I can imagine in having fewer builtins is that using 
variables with the same names is a kind-of bad practice. But it cannot 
cause a bug if you don't use the builtin at all.


> Docs:
>
> The Python Module Index http://docs.python.org/dev/py-modindex.html
> Shows _ | a | b | ...
> This is prettier than _ | A | B | ...
> but also harder to click because the letters are smaller; so I would
> prefer the use of capitals.


Adding CSS padding on links can make the clickable area bigger, so the 
choice of using capitals or not can be independent of that.


Regards,
-- 
Simon Sapin


From steve at pearwood.info  Fri May 25 11:07:03 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 25 May 2012 19:07:03 +1000
Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion)
In-Reply-To: <20120525085347.31215c94@dino>
References: <20120525085347.31215c94@dino>
Message-ID: <4FBF4BB7.1070308@pearwood.info>

Mark Summerfield wrote:
> Hi,
> 
> Built-ins:
> 
> In an effort to keep the core language as small as possible (to keep it
> "brain sized":-) would it be reasonable to deprecate filter() and map()
> and to move them to the standard library as happened with reduce()?
> After all, don't people mostly use list comprehensions and generator
> expressions for these nowadays?

So you would put people through the pain of dealing with broken code and 
deprecation just so that people don't have to remember functions which you 
think they don't remember anyway?

-1

Keeping the core language small is a benefit to core developers. It is not so 
much a benefit to users of the language -- if a programmer is only using the 
builtins, they are surely reinventing the wheel (and probably badly). To be an 
effective programmer, you surely are using functions and classes in the std 
lib as well as the builtins, which means you have to memorise both what the 
function is, *and* where it is. Shrinking the builtins while increasing the 
size of the std lib is not much of a human-memory optimization, and may very 
well be a pessimation.

If you need a memory-jog, it is much easier to find builtins because they are 
always available to a quick call to dir(), while finding something in a module 
means searching the docs or the file system.


-- 
Steven


From steve at pearwood.info  Fri May 25 11:09:29 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 25 May 2012 19:09:29 +1000
Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion)
In-Reply-To: 
References: <20120525085347.31215c94@dino>
	
Message-ID: <4FBF4C49.30007@pearwood.info>

Nick Coghlan wrote:

> I'd personally agree with filter() moving, but "map(str, seq)" still
> beats "(str(x) for x in seq)" by a substantial margin for me when it
> comes to quickly and cleanly encapsulating a common idiom such that it
> is easier both to read *and* write.

filter(None, seq)
[obj for obj in seq if obj]

I think the version with filter is *much* better than the second.
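
For reference, a short sketch showing that the two spellings under
discussion produce the same results. The sample data is made up; in
Python 3 filter() and map() return iterators, hence the list() calls.

```python
seq = [0, 1, "", "a", None, [], [2]]

# filter(None, seq) keeps the items that are truthy ...
kept_by_filter = list(filter(None, seq))
# ... which is what the comprehension spells out explicitly.
kept_by_comprehension = [obj for obj in seq if obj]

# map() with an existing function versus the generator expression.
numbers = [1, 2, 3]
mapped = list(map(str, numbers))
generated = list(str(x) for x in numbers)
```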


-- 
Steven



From ironfroggy at gmail.com  Fri May 25 11:32:23 2012
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Fri, 25 May 2012 05:32:23 -0400
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: 
References: 
Message-ID: 

On Wed, May 23, 2012 at 9:32 PM, Giampaolo Rodolà  wrote:
> Including an established async IO framework such as Twisted, gevent or
> Tornado in the Python stdlib has always been a controversial subject.
> PEP-3153 (http://www.python.org/dev/peps/pep-3153/) tried to face this
> problem in the most agnostic way possible, and it's a good starting
> point IMO.
> Nevertheless, it's still vague about what the actual API should look
> like and AFAIK it remained stagnant so far.
>
> There's one thing in the whole async stack which is basically the same
> for all implementations though: the poller/reactor.
> Could it make sense to add something similar to select module?
> Differently from PEP-3153, providing such a layer on top of select(),
> poll() & co. is easier and could possibly be an incentive to avoid
> such code duplication.
>
> I'm coming up with this because I recently did something similar in
> pyftpdlib as a hack on top of asyncore to add support for epoll() and
> kqueue(), using the excellent Tornado's io loop as source of
> inspiration:
> http://code.google.com/p/pyftpdlib/issues/detail?id=203
> http://code.google.com/p/pyftpdlib/source/browse/trunk/pyftpdlib/lib/ioloop.py
>
>
> The way I imagine it:
>
>>>> import select
>>>> dir(select)
> [..., 'EpollPoller', 'PollPoller', 'SelectPoller', 'KqueuePoller']
>>>> poller = select.EpollPoller()
>>>> poller.register(fd, handler, poller.READ | poller.WRITE)
>>>> poller.socket_map
> {2 : }
>>>> poller.modify(fd, poller.READ)
>>>> poller.poll()      # will call handler.handle_read_event() if/when it's the case
> ^C
> KeyboardInterrupt
>>>> poller.remove(fd)
>>>> poller.close()
>
> The handler is supposed to provide 3 methods:
> - handle_read_event
> - handle_write_event
> - handle_error_event
>
> Users willing to support multiple event loops such as wx, gtk etc can do:
>
>>>> while 1:
> ...       poller.poll(timeout=0.1, blocking=False)
> ...       otherpoller.poll()
>
>
> Basically, this would be the whole API.
>
> Thoughts?
>

Frankly, I don't think this deserves a PEP at all, or even to consider
one *yet*.

Building a new API and a new library from scratch seems a frail
comparison to testing
a library in the real world, it having real uses, and then being
incorporated into the
stdlib. The problem here, of course, is that all the real-world
solutions (ie, Twisted)
include far more than the reactor.

>
> --- Giampaolo
> http://code.google.com/p/pyftpdlib/
> http://code.google.com/p/psutil/
> http://code.google.com/p/pysendfile/
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas



-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy


From solipsis at pitrou.net  Fri May 25 11:52:06 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 25 May 2012 11:52:06 +0200
Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion)
References: <20120525085347.31215c94@dino>
	
	<4FBF4C49.30007@pearwood.info>
Message-ID: <20120525115206.1f098cbb@pitrou.net>

On Fri, 25 May 2012 19:09:29 +1000
Steven D'Aprano  wrote:
> Nick Coghlan wrote:
> 
> > I'd personally agree with filter() moving, but "map(str, seq)" still
> > beats "(str(x) for x in seq)" by a substantial margin for me when it
> > comes to quickly and cleanly encapsulating a common idiom such that it
> > is easier both to read *and* write.
> 
> filter(None, seq)
> [obj for obj in seq if obj]
> 
> I think the version with filter is *much* better than the second.

Only if you remember what the special value None does when passed to
filter. The cognitive burden is higher.

That said, the idea of moving filter() and map() away won't fly before
at least Python 4.

Regards

Antoine.




From jeanpierreda at gmail.com  Fri May 25 12:33:22 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Fri, 25 May 2012 06:33:22 -0400
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: 
References: 
	
Message-ID: 

On Fri, May 25, 2012 at 5:32 AM, Calvin Spealman  wrote:
> Frankly, I don't think this deserves a PEP at all, or even to consider
> one *yet*.
>
> Building a new API and a new library from scratch seems a frail
> comparison to testing
> a library in the real world, it having real uses, and then being
> incorporated into the
> stdlib. The problem here, of course, is that all the real-world
> solutions (ie, Twisted)
> include far more than the reactor.

To be fair, PEP-3153 was built based largely on experience from the
Twisted project and input from Twisted developers, who know what they
are talking about and how to build a useful system. The entire
transport/protocol separation is lifted directly out of it.

-- Devin


From ironfroggy at gmail.com  Fri May 25 13:54:13 2012
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Fri, 25 May 2012 07:54:13 -0400
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: 
References: 
	
	
Message-ID: 

On Fri, May 25, 2012 at 6:33 AM, Devin Jeanpierre
 wrote:
> On Fri, May 25, 2012 at 5:32 AM, Calvin Spealman  wrote:
>> Frankly, I don't think this deserves a PEP at all, or even to consider
>> one *yet*.
>>
>> Building a new API and a new library from scratch seems a frail
>> comparison to testing
>> a library in the real world, it having real uses, and then being
>> incorporated into the
>> stdlib. The problem here, of course, is that all the real-world
>> solutions (ie, Twisted)
>> include far more than the reactor.
>
> To be fair, PEP-3153 was built based largely on experience from the
> Twisted project and input from Twisted developers, who know what they
> are talking about and how to build a useful system. The entire
> transport/protocol separation is lifted directly out of it.
>
> -- Devin

My comments were in response to this post, not PEP-3153



From ncoghlan at gmail.com  Fri May 25 15:53:04 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 25 May 2012 23:53:04 +1000
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: 
References: 
	
Message-ID: 

On May 25, 2012 7:33 PM, "Calvin Spealman"  wrote:

> On Wed, May 23, 2012 at 9:32 PM, Giampaolo Rodolà
> wrote:
> > Users willing to support multiple event loops such as wx, gtk etc can do:
> >
> >>>> while 1:
> > ...       poller.poll(timeout=0.1, blocking=False)
> > ...       otherpoller.poll()
> >
> >
> > Basically, this would be the whole API.
> >
> > Thoughts?
> >
>
> Frankly, I don't think this deserves a PEP at all, or even to consider
> one *yet*.
>
> Building a new API and a new library from scratch seems a frail
> comparison to testing
> a library in the real world, it having real uses, and then being
> incorporated into the
> stdlib. The problem here, of course, is that all the real-world
> solutions (ie, Twisted)
> include far more than the reactor.
>

No, the specific call at the PyCon US 2011 language summit was for a PEP
that proposed a *new* event loop for the standard library that:
1. Provides simple event loop functionality in the standard library, as an
improved alternative to asyncore for small apps that don't require the full
power of a framework like Twisted (think things like little IRC bots, TCP
echo servers, or testing of async components)
2. Provides a clean migration path to a production grade reactor like
Twisted's
3. Makes it easier for multiple event loop based frameworks (e.g. tkinter,
wxPython, PySide, Twisted) to all cooperate within the same process

What we're after is something for the stdlib that is to event
loops/reactors as wsgiref is to production grade WSGI servers like mod_wsgi
and nginx. asyncore isn't it, because the migration path isn't clean.

PEP 3153 currently spends a lot of time talking about transports and
protocols, but doesn't answer those 3 core questions:

1. How do I write a simple IRC bot or TCP echo server?
2. How do I migrate my simple app to a production grade reactor like
Twisted's?
3. How do I run two different concurrent.eventloop compatible reactors in
the same process?

As far as I can tell, PEP 3153 wants to handle all that by merging the I/O
stacks of all the frameworks first, which strikes me as being *way* too
ambitious for a first step. If we can't even figure out a common
abstraction for the reactor level (ala WSGI), how are we ever going to
agree on a standard async I/O abstraction?
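As a rough illustration of the first question, and only as a sketch: the
hypothetical `Poller` class below is not the API proposed in this thread
or in PEP 3153. It merely wraps the real `select.select` call to show how
small a readiness loop with callbacks can be. The names `Poller`,
`register` and `echo` are assumptions made for the example.

```python
import select
import socket


class Poller:
    """Hypothetical minimal reactor: maps readable sockets to callbacks."""

    def __init__(self):
        self._readers = {}

    def register(self, sock, callback):
        # Remember which callback to run when `sock` becomes readable.
        self._readers[sock] = callback

    def unregister(self, sock):
        self._readers.pop(sock, None)

    def poll(self, timeout=0.0):
        """Run one pass: wait up to `timeout` seconds, dispatch callbacks."""
        if not self._readers:
            return 0
        ready, _, _ = select.select(list(self._readers), [], [], timeout)
        for sock in ready:
            self._readers[sock](sock)
        return len(ready)


def echo(sock):
    # Echo handler: send back whatever arrived on the socket.
    data = sock.recv(4096)
    if data:
        sock.sendall(data)


# A connected socket pair stands in for a real client/server connection.
server_end, client_end = socket.socketpair()
poller = Poller()
poller.register(server_end, echo)

client_end.sendall(b"ping")
handled = poller.poll(timeout=1.0)
reply = client_end.recv(4096)

server_end.close()
client_end.close()
```

Because `poll()` returns after a single pass, two such pollers could be
driven in turn from one `while` loop, which is the cooperation pattern
quoted above.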

Cheers,
Nick.

From ncoghlan at gmail.com  Fri May 25 15:54:51 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 25 May 2012 23:54:51 +1000
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: 
References: 
	
	
Message-ID: 

On Fri, May 25, 2012 at 11:53 PM, Nick Coghlan  wrote:
> as wsgiref is to production grade WSGI servers like mod_wsgi and nginx.

s/nginx/gunicorn/

Confusing-my-software-stack-levels'ly,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From mwm at mired.org  Fri May 25 18:29:12 2012
From: mwm at mired.org (Mike Meyer)
Date: Fri, 25 May 2012 12:29:12 -0400
Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion)
In-Reply-To: 
References: <20120525085347.31215c94@dino>
	
Message-ID: <20120525122912.46096701@bhuda.mired.org>

On Fri, 25 May 2012 18:28:28 +1000
Nick Coghlan  wrote:
> On Fri, May 25, 2012 at 5:53 PM, Mark Summerfield  wrote:
> > In an effort to keep the core language as small as possible (to keep it
> > "brain sized":-) would it be reasonable to deprecate filter() and map()
> > and to move them to the standard library as happened with reduce()?
> > After all, don't people mostly use list comprehensions and generator
> > expressions for these nowadays?

Wasn't this change discussed for that very reason as part of the move
to 3.x?

Which makes me wonder why reduce moved but not map and filter, when
map and filter have obvious rewrites as list comprehensions, but
reduce doesn't? Seems backwards to me.

> The basic problem is that the answer to your question is "no" - for
> preexisting functions, a lot of people still use filter() and map(),
> with the comprehension forms reigning supreme only when someone would
> have had to otherwise use a lambda expression.

Personally, I tend to favor list comprehensions most of the time (and
I was a pretty heavy user of map and filter in the day), because it's
just one less idiom to deal with. The exception is when they'd nest -
I use [map(f, l) for l in list-of-lists] rather than nesting the
comprehensions, because I then don't have to worry about untangling
the nest.
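A small sketch of the two spellings, with made-up sample data in `rows`
and `str` standing in for `f`. One assumption worth flagging: under
Python 3, map() returns a lazy iterator, so the mixed form needs an
explicit list() to produce the same nested lists.

```python
rows = [[1, 2], [3, 4, 5]]  # sample data, chosen only for illustration

# Mixed form: map() inside a comprehension (list() needed on Python 3).
nested_map = [list(map(str, row)) for row in rows]

# Fully nested comprehensions, for comparison.
nested_comp = [[str(x) for x in row] for row in rows]

print(nested_map == nested_comp)
```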

But I do agree that since they survived into 3.x, they need to stay
put until 4.x.

			http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


From guido at python.org  Fri May 25 18:47:00 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 25 May 2012 09:47:00 -0700
Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion)
In-Reply-To: <20120525122912.46096701@bhuda.mired.org>
References: <20120525085347.31215c94@dino>
	
	<20120525122912.46096701@bhuda.mired.org>
Message-ID: 

On Fri, May 25, 2012 at 9:29 AM, Mike Meyer  wrote:
> On Fri, 25 May 2012 18:28:28 +1000
> Nick Coghlan  wrote:
>> On Fri, May 25, 2012 at 5:53 PM, Mark Summerfield  wrote:
>> > In an effort to keep the core language as small as possible (to keep it
>> > "brain sized":-) would it be reasonable to deprecate filter() and map()
>> > and to move them to the standard library as happened with reduce()?
>> > After all, don't people mostly use list comprehensions and generator
>> > expressions for these nowadays?
>
> Wasn't this change discussed for that very reason as part of the move
> to 3.x?
>
> Which makes me wonder why reduce moved but not map and filter, when
> map and filter have obvious rewrites as list comprehensions, but
> reduce doesn't? Seems backwards to me.

How quickly we forget.

The point wasn't sparsity of constructs. The point was readability.
Code written using map() or filter() is usually quite readable --
excesses are possible, but not more so than using list comprehensions.
However code that uses reduce() has a high likelihood of being
unreadable, and is almost always rewritten more easily using a
traditional for loop and some variables that are updated in the loop.
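To make the comparison concrete, here is one made-up task (finding the
longest word) written both ways. The data and variable names are
assumptions for the sketch; which version reads better is of course the
judgment call being discussed.

```python
from functools import reduce

words = ["python", "ideas", "list"]  # sample data for illustration

# reduce() version: the accumulator logic is packed into a lambda.
longest_reduce = reduce(
    lambda best, w: w if len(w) > len(best) else best, words, ""
)

# Loop version: the same logic with an explicitly updated variable.
longest_loop = ""
for w in words:
    if len(w) > len(longest_loop):
        longest_loop = w

print(longest_reduce, longest_loop)
```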

>> The basic problem is that the answer to your question is "no" - for
>> preexisting functions, a lot of people still use filter() and map(),
>> with the comprehension forms reigning supreme only when someone would
>> have had to otherwise use a lambda expression.
>
> Personally, I tend to favor list comprehensions most of the time (and
> I was a pretty heavy user of map and filter in the day), because it's
> just one less idiom to deal with. The exception is when they'd nest -
> I use [map(f, l) for l in list-of-lists] rather than nesting the
> comprehensions, because I then don't have to worry about untangling
> the nest.

There are interesting considerations of readability either way. If you
have to write a lambda to use map() or filter(), it is *always* better
to use a list comprehension, because of the overhead in creating the
stack frame for the lambda. But if you are mapping or filtering using
an already-existing function, map()/filter() is more concise and I
usually find it more readable, because you don't have to invent a loop
control variable. My claim is that for the human reader (who is
familiar with map/filter), it is less work for the brain to understand
map(f, xs) than [f(x) for x in xs] -- there are more words to parse in
the latter, and you have to check that it is the same 'x' in both
places. The advantage of map/filter increases when f is a built-in
function, since the loop implied by map/filter executes more quickly
than the explicit loop (implemented using standard looping byte codes)
used by list comprehensions.

(I hesitate to emphasize the performance too much, since some
hypothetical future Python implementation could make the performance
the same in all cases. But with today's CPython, Jython and
IronPython, it is important to know about relative performance of
different constructs; and even PyPy doesn't alter the equation too
much here. Still, the readability arguments align pretty much with
the performance arguments, so they just strengthen each other.)

> But I do agree that since they survived into 3.x, they need to stay
> put until 4.x.

And beyond.

-- 
--Guido van Rossum (python.org/~guido)


From mwm at mired.org  Fri May 25 23:37:29 2012
From: mwm at mired.org (Mike Meyer)
Date: Fri, 25 May 2012 17:37:29 -0400
Subject: [Python-ideas] pmap, preduce, pmapreduce?
Message-ID: <20120525173729.5a42a558@bhuda.mired.org>

Another crazy idea that may not be possible, based on my finally
getting around to watching Guy Steele's talks about what he's up to
these days (http://vimeo.com/6624203).

Consider a function that takes a list (or a container class which len
doesn't consume) and a function, and then applies that function to the
list in some way, either element-wise or in pairs of elements/results,
but does it in parallel. It will hold the GIL, but run the function
calls in distinct threads, meaning two applications of the function
could interfere with each other.

Is it possible to place limitations on the function such that this
kind of controlled concurrent operation is safe?



From pyideas at rebertia.com  Sat May 26 00:52:54 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Fri, 25 May 2012 15:52:54 -0700
Subject: [Python-ideas] Minimal built-ins (+ tiny doc suggestion)
In-Reply-To: <4FBF4C49.30007@pearwood.info>
References: <20120525085347.31215c94@dino>
	
	<4FBF4C49.30007@pearwood.info>
Message-ID: 

On Fri, May 25, 2012 at 2:09 AM, Steven D'Aprano  wrote:
> Nick Coghlan wrote:
>> I'd personally agree with filter() moving, but "map(str, seq)" still
>> beats "(str(x) for x in seq)" by a substantial margin for me when it
>> comes to quickly and cleanly encapsulating a common idiom such that it
>> is easier both to read *and* write.
>
> filter(None, seq)
> [obj for obj in seq if obj]
>
> I think the version with filter is *much* better than the second.

And I think filter(bool, seq) beats the first. Exact same length, more
explicit, one less key to press (Shift).
The consistency of using comprehensions all the time has a certain
attraction though.
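For reference, a quick sketch showing that the three spellings select
the same elements. The sample `seq` is made up, and list() is wrapped
around filter() because it returns an iterator on Python 3.

```python
seq = [0, 1, "", "a", None, [], [2]]  # mix of falsy and truthy values

with_none = list(filter(None, seq))        # None means "keep truthy items"
with_bool = list(filter(bool, seq))        # same test, spelled explicitly
with_comp = [obj for obj in seq if obj]    # comprehension form

print(with_none == with_bool == with_comp)
```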

Cheers,
Chris


From jeanpierreda at gmail.com  Sat May 26 00:54:36 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Fri, 25 May 2012 18:54:36 -0400
Subject: [Python-ideas] from foo import bar.baz
Message-ID: 

Has it irritated anyone else that this syntax is invalid? I've wanted
it a couple of times, to be equivalent to:

    import foo.bar.baz
    from foo import bar
    del foo # but only if we didn't import foo already before

The idea being that one wants access to foo.bar.baz under the name
bar.baz , for readability purposes or what have you.
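For what it's worth, the intended effect can be approximated today with
`importlib`. The helper name `import_from` is made up for this sketch,
and `xml.etree.ElementTree` merely stands in for `foo.bar.baz`. Unlike
the three statements above, nothing gets bound to `foo` in the caller's
namespace, so no `del` is needed.

```python
import importlib


def import_from(package, dotted):
    """Hypothetical helper emulating `from package import a.b`.

    Imports package.a.b and returns the module package.a, so the caller
    can reach the leaf as a.b.
    """
    # Importing the full dotted path also sets `b` as an attribute of `a`.
    importlib.import_module(f"{package}.{dotted}")
    head = dotted.split(".")[0]
    return importlib.import_module(f"{package}.{head}")


# Stand-in for: from xml import etree.ElementTree
etree = import_from("xml", "etree.ElementTree")
print(etree.ElementTree.__name__)
```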

I played around with adding this, but I seem to have really bad luck
with extending CPython...

-- Devin


From grosser.meister.morti at gmx.net  Sat May 26 01:53:26 2012
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Sat, 26 May 2012 01:53:26 +0200
Subject: [Python-ideas] from foo import bar.baz
In-Reply-To: 
References: 
Message-ID: <4FC01B76.4030104@gmx.net>

+1 Indeed, I would have expected that "from foo import bar.baz" would work.

On 05/26/2012 12:54 AM, Devin Jeanpierre wrote:
> Has it irritated anyone else that this syntax is invalid? I've wanted
> it a couple of times, to be equivalent to:
>
>      import foo.bar.baz
>      from foo import bar
>      del foo # but only if we didn't import foo already before
>
> The idea being that one wants access to foo.bar.baz under the name
> bar.baz , for readability purposes or what have you.
>
> I played around with adding this, but I seem to have really bad luck
> with extending CPython...
>
> -- Devin




From jsbueno at python.org.br  Sat May 26 06:17:29 2012
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Sat, 26 May 2012 01:17:29 -0300
Subject: [Python-ideas] pmap, preduce, pmapreduce?
In-Reply-To: <20120525173729.5a42a558@bhuda.mired.org>
References: <20120525173729.5a42a558@bhuda.mired.org>
Message-ID: 

On 25 May 2012 18:37, Mike Meyer  wrote:
> Another crazy idea that may not be possible, based on my finally
> getting around to watching Guy Steele's talks about what he's up to
> these days (http://vimeo.com/6624203).
>
> Given a function that takes a list (or a container class which len
> doesn't consume) and a function, and then applies that function to the
> list in some way: either element wise, or in pairs of elements/results,
> but does it in parallel. It will hold the GIL, but run the function
> calls in distinct threads, meaning two applications of the function
> could interfere with each other.


Just like the already existing "map" method in concurrent.futures.Executor? *

  js
 -><-

* all praise the Python time machine
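A minimal sketch of that method in use, via ThreadPoolExecutor (one of
the concrete Executor classes). The `square` function and the worker
count are arbitrary choices for the example; note that nothing here
protects a function that mutates shared state, which is the safety
question raised in the original post.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    # A pure function: no shared state, so concurrent calls cannot interfere.
    return n * n


with ThreadPoolExecutor(max_workers=4) as executor:
    # Executor.map yields results in the order of the inputs.
    results = list(executor.map(square, range(8)))

print(results)
```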

> --
> Mike Meyer  http://www.mired.org/


From jeanpierreda at gmail.com  Sat May 26 14:17:35 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Sat, 26 May 2012 08:17:35 -0400
Subject: [Python-ideas] pmap, preduce, pmapreduce?
In-Reply-To: <20120525173729.5a42a558@bhuda.mired.org>
References: <20120525173729.5a42a558@bhuda.mired.org>
Message-ID: 

On Fri, May 25, 2012 at 5:37 PM, Mike Meyer  wrote:
> Is it possible to place limitations on the function such that this
> kind of controlled concurrent operation is safe?

I'm not sure what you mean. Tentative answer: restrict it to pure functions.

-- Devin


From masklinn at masklinn.net  Sat May 26 17:34:52 2012
From: masklinn at masklinn.net (Masklinn)
Date: Sat, 26 May 2012 17:34:52 +0200
Subject: [Python-ideas] pmap, preduce, pmapreduce?
In-Reply-To: <20120525173729.5a42a558@bhuda.mired.org>
References: <20120525173729.5a42a558@bhuda.mired.org>
Message-ID: 

On 2012-05-25, at 23:37 , Mike Meyer wrote:
> 
> Is it possible to place limitations on the function such that this
> kind of controlled concurrent operation is safe?

This would mean ideally only having pure functions, and at the very
least having functions which can't share state (not easily anyway).

Python, as a language, has no such provision that I know of beyond "be
careful" and "you're on your own".

A possible option, though, would be to use `multiprocessing` rather than
threads: multiprocessing.Pool already provides a `map` operation, and
processes can't share state by default (doing so is quite an explicit,
and some would say involved, operation). Going through
multiprocessing puts other limitations/complexities on the function
implementations, but at the very least it wouldn't be possible to
*unknowingly* share state.

From ericsnowcurrently at gmail.com  Sat May 26 20:53:11 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sat, 26 May 2012 12:53:11 -0600
Subject: [Python-ideas] a simple namespace type
In-Reply-To: 
References: 
	
	
	
	
Message-ID: 

On Thu, May 24, 2012 at 1:34 PM, Eric Snow  wrote:
> On Thu, May 24, 2012 at 12:14 PM, Guido van Rossum  wrote:
>> On Thu, May 24, 2012 at 10:17 AM, Eric Snow  wrote:
>>> Effectively this is just a simple but distinct facade around dict to
>>> give a namespace with attribute access.  I suppose part of the
>>> question is how much of the Mapping interface would belong instead to
>>> a hypothetical Namespace interface. (I'm definitely _not_ proposing
>>> such an unnecessary extra level of abstraction).
>>
>> Possibly there is a (weird?) parallel with namedtuple. The end result
>> is somewhat similar: you get to use attribute names instead of the
>> accessor syntax (x[y]) of the underlying type. But the "feel" of the
>> type is different, and inherits more of the underlying type
>> (namedtuple is immutable and has a fixed set of keys, whereas the type
>> proposed here is mutable and allows arbitrary keys as long as they
>> look like Python names).
>
> Yeah, the feel is definitely different.  I've been thinking about this
> because of the code for sys.implementation.  Using a structseq would
> probably have been the simplest approach there, but a named tuple doesn't
> feel right.  In contrast, a SimpleNamespace would fit much better.
>
> As far as this goes generally, the pattern of a simple, dynamic
> attribute-based namespace has been implemented a zillion times (and
> it's easy to do).  This is because people find a simple dynamic
> namespace really handy and they want the attribute-access interface
> rather than a mapping.
>
> In contrast, a namedtuple is, as Nick said, "the standard library's
> answer for structured records".  It's an immutable (attribute-based)
> namespace implementing the Sequence interface.  It's a tuple and
> directly reflects the underlying concept of tuples in Python by giving
> the values names.
>
> SimpleNamespace (and the like) isn't a structured record.  Its only
> job is to be an attribute-based namespace with as simple an interface
> as possible.
>
> So why isn't a type like SimpleNamespace in the stdlib? Because it's
> trivial to implement.  There's a certain trivial-ness threshold a
> function/type must pass before it gets canonized, and rightly so.
>
> Anyway, while many would use something like SimpleNamespace out of
> the standard library, my impetus was having it as a builtin type so I
> could use it for sys.implementation.  :)
>
> FWIW, I have an implementation (pure Python + C extension) of
> SimpleNamespace on PyPI:
>
>  http://pypi.python.org/pypi/simple_namespace
>
> -eric

Any further thoughts on this?  Unless anyone is strongly opposed, I'd
like to push this forward.

-eric


From ironfroggy at gmail.com  Sat May 26 23:02:31 2012
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sat, 26 May 2012 17:02:31 -0400
Subject: [Python-ideas] a simple namespace type
In-Reply-To: 
References: 
	
	
	
	
	
Message-ID: 

On Sat, May 26, 2012 at 2:53 PM, Eric Snow  wrote:
> On Thu, May 24, 2012 at 1:34 PM, Eric Snow  wrote:
>> On Thu, May 24, 2012 at 12:14 PM, Guido van Rossum  wrote:
>>> On Thu, May 24, 2012 at 10:17 AM, Eric Snow  wrote:
>>>> Effectively this is just a simple but distinct facade around dict to
>>>> give a namespace with attribute access.  I suppose part of the
>>>> question is how much of the Mapping interface would belong instead to
>>>> a hypothetical Namespace interface. (I'm definitely _not_ proposing
>>>> such an unnecessary extra level of abstraction).
>>>
>>> Possibly there is a (weird?) parallel with namedtuple. The end result
>>> is somewhat similar: you get to use attribute names instead of the
>>> accessor syntax (x[y]) of the underlying type. But the "feel" of the
>>> type is different, and inherits more of the underlying type
>>> (namedtuple is immutable and has a fixed set of keys, whereas the type
>>> proposed here is mutable and allows arbitrary keys as long as they
>>> look like Python names).
>>
>> Yeah, the feel is definitely different.  I've been thinking about this
>> because of the code for sys.implementation.  Using a structseq would
>> probably have been the simplest approach there, but a named tuple doesn't
>> feel right.  In contrast, a SimpleNamespace would fit much better.
>>
>> As far as this goes generally, the pattern of a simple, dynamic
>> attribute-based namespace has been implemented a zillion times (and
>> it's easy to do).  This is because people find a simple dynamic
>> namespace really handy and they want the attribute-access interface
>> rather than a mapping.
>>
>> In contrast, a namedtuple is, as Nick said, "the standard library's
>> answer for structured records".  It's an immutable (attribute-based)
>> namespace implementing the Sequence interface.  It's a tuple and
>> directly reflects the underlying concept of tuples in Python by giving
>> the values names.
>>
>> SimpleNamespace (and the like) isn't a structured record.  Its only
>> job is to be an attribute-based namespace with as simple an interface
>> as possible.
>>
>> So why isn't a type like SimpleNamespace in the stdlib? Because it's
>> trivial to implement.  There's a certain trivial-ness threshold a
>> function/type must pass before it gets canonized, and rightly so.
>>
>> Anyway, while many would use something like SimpleNamespace out of
>> the standard library, my impetus was having it as a builtin type so I
>> could use it for sys.implementation.  :)
>>
>> FWIW, I have an implementation (pure Python + C extension) of
>> SimpleNamespace on PyPI:
>>
>>  http://pypi.python.org/pypi/simple_namespace
>>
>> -eric
>
> Any further thoughts on this?  Unless anyone is strongly opposed, I'd
> like to push this forward.

There is no good name for such a type. "Namespace" is a bad name, because
the term "namespace" is already a general term that describes a lot of
things in Python (and outside it) and shouldn't share a name with one
specific thing, this type. That this specific type would also fall within
the more general namespace concept only makes that worse.

So, what do you call it?

Also, is this here because you don't like typing the square brackets and
quotes? If so, it only saves you three characters; is that worth the
increase to the language size?

A final complaint against: would the existence of this fragment
python-learners' education to the point that they would defer learning
and practicing to use classes properly?

Sorry to complain, but someone needs to in python-ideas! ;-)
Calvin

> -eric





From ironfroggy at gmail.com  Sat May 26 23:05:57 2012
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sat, 26 May 2012 17:05:57 -0400
Subject: [Python-ideas] Add a generic async IO poller/reactor to select
	module
In-Reply-To: 
References: 
	
	
Message-ID: 

On Fri, May 25, 2012 at 9:53 AM, Nick Coghlan  wrote:
> On May 25, 2012 7:33 PM, "Calvin Spealman"  wrote:
>>
>> On Wed, May 23, 2012 at 9:32 PM, Giampaolo Rodolà
>> wrote:
>> > Users willing to support multiple event loops such as wx, gtk etc can
>> > do:
>> >
>> >>>> while 1:
>> > ...       poller.poll(timeout=0.1, blocking=False)
>> > ...       otherpoller.poll()
>> >
>> >
>> > Basically, this would be the whole API.
>> >
>> > Thoughts?
>> >
>>
>> Frankly, I don't think this deserves a PEP at all, or even to consider
>> one *yet*.
>>
>> Building a new API and a new library from scratch seems a frail
>> comparison to testing
>> a library in the real world, it having real uses, and then being
>> incorporated into the
>> stdlib. The problem here, of course, is that all the real-world
>> solutions (ie, Twisted)
>> include far more than the reactor.
>
>
> No, the specific call at the PyCon US 2011 language summit was for a PEP
> that proposed a *new* event loop for the standard library that:
> 1. Provides simple event loop functionality in the standard library, as an
> improved alternative to asyncore for small apps that don't require the full
> power of a framework like Twisted (think things like little IRC bots, TCP
> echo servers, or testing of async components)
> 2. Provides a clean migration path to a production grade reactor like
> Twisted's
> 3. Makes it easier for multiple event loop based frameworks (e.g. tkinter,
> wxPython, PySide, Twisted) to all cooperate within the same process
>
> What we're after is something for the stdlib that is to event loops/reactors
> as wsgiref is to production grade WSGI servers like mod_wsgi and nginx.
> asyncore isn't it, because the migration path isn't clean.
>
> PEP 3153 currently spends a lot of time talking about transports and
> protocols, but doesn't answer those 3 core questions:
>
> 1. How do I write a simple IRC bot or TCP echo server?
> 2. How do I migrate my simple app to a production grade reactor like
> Twisted's?
> 3. How do I run two different concurrent.eventloop compatible reactors in
> the same process?
>
> As far as I can tell, PEP 3153 wants to handle all that by merging the I/O
> stacks of all the frameworks first, which strikes me as being *way* too
> ambitious for a first step. If we can't even figure out a common abstraction
> for the reactor level (ala WSGI), how are we ever going to agree on a
> standard async I/O abstraction?

Obviously, for a man with many opinions, I miss out on too many
conversations and too many potential actions. I should take steps to
correct this in the future.
Thanks for clearing this up.

> Cheers,
> Nick.
>





From mwm at mired.org  Sun May 27 00:04:18 2012
From: mwm at mired.org (Mike Meyer)
Date: Sat, 26 May 2012 18:04:18 -0400
Subject: [Python-ideas] pmap, preduce, pmapreduce?
In-Reply-To: 
References: <20120525173729.5a42a558@bhuda.mired.org>
	
Message-ID: <20120526180418.56871823@bhuda.mired.org>

On Sat, 26 May 2012 17:34:52 +0200
Masklinn  wrote:
> On 2012-05-25, at 23:37 , Mike Meyer wrote:
> > Is it possible to place limitations on the function such that this
> > kind of controlled concurrent operation is safe?
> This would mean ideally only having pure functions, and at the very
> least having functions which can't share state (not easily anyway).

I'm not sure pure functions are good enough for CPython. If the
function involves looking through a tree of state (shared via the
arguments, even), then the changing reference counts as the code goes
through the key will hose you, unless the function evaluations are
serialized via the GIL.

> Python, as a language, has no such provision that I know of beyond "be
> careful" and "you're on your own".

Generally true for your code, but I think it tries to keep the
interpreter from tripping over its own feet (via the GIL, etc.).

> A possible option, though, would be to use `multiprocessing` rather than
> threads: multiprocessing.Pool already provides a `map` operation, and
> processes can't share state by default (doing so is quite an explicit,
> and some would say involved, operation). Going through
> multiprocessing puts other limitations/complexities on the function
> implementations, but at the very least it wouldn't be possible to
> *unknowingly* share state.

I'm familiar with that option, but was hoping to avoid it.

Though adding a reduce (and maybe a mapreduce?) method to something like
concurrent.futures might be nice.
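A rough sketch of what such a reduce might look like when built on the
existing Executor.map. The name `preduce` and its `chunk_size` and
`max_workers` parameters are invented here, not an existing or proposed
API. It assumes `function` is associative and free of shared state, and
with CPython threads it illustrates the shape of the API rather than any
real speedup for CPU-bound work.

```python
import operator
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def preduce(function, items, chunk_size=4, max_workers=4):
    """Hypothetical parallel reduce; `function` must be associative."""
    items = list(items)
    if not items:
        raise ValueError("preduce() of empty sequence")
    # Split the input into consecutive chunks, preserving order.
    chunks = [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Reduce each chunk in its own task; map() keeps chunk order.
        partials = list(
            executor.map(lambda chunk: reduce(function, chunk), chunks)
        )
    # Combine the per-chunk results sequentially.
    return reduce(function, partials)


total = preduce(operator.add, range(1, 101))
joined = preduce(operator.add, ["a", "b", "c", "d", "e"], chunk_size=2)
print(total, joined)
```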

     Thanks


From ericsnowcurrently at gmail.com  Sun May 27 01:33:49 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sat, 26 May 2012 17:33:49 -0600
Subject: [Python-ideas] a simple namespace type
In-Reply-To: 
References: 
	
	
	
	
	
	
Message-ID: 

On Sat, May 26, 2012 at 3:02 PM, Calvin Spealman  wrote:
> On Sat, May 26, 2012 at 2:53 PM, Eric Snow  wrote:
>> Any further thoughts on this?  Unless anyone is strongly opposed, I'd
>> like to push this forward.
>
> There is no good name for such a type. "Namespace" is a bad name, because
> the term "namespace" is already a general term that describes a lot of things in
> Python (and outside it) and shouldn't share a name with a specific
> thing, this type.
> That this specific type would also be within the more general namespace-concept
> only makes that worse.
>
> So, what do you call it?

Yeah, I've seen it called at least 10 different things.  I'm certainly
open to whatever works best.  I've called it "namespace" because it is
one of the two kinds of namespace in Python: mapping ([]-access) and
object (dotted-access).  The builtin dict fills the one role and the
builtin object type almost fills the other.  I guess
"dotted_namespace" or "attribute_namespace" would work if "namespace"
is too confusing.

> Also, is this here because you don't like typing the square brackets and quotes? If
> so, does it only save you three characters and is that worth the increase to the
> language size?

This is definitely the stick against which to measure!

It boils down to this: for me dotted-access communicates a different,
more stable sort of namespace than does []-access (a la dicts).
Certainly it is less typing, but that isn't really a draw for me.
Dotted access is a little easier to read, which is nice but not the
big deal for me.  No, the big deal is the conceptual difference
inherent to access via string vs. access via identifier.

Though Python does not currently have a basic, dynamic,
attribute-based namespace type, it's trivial to make one: "class
Namespace: pass" or "type('Namespace', (), {})".  While this has been
done countless times, it's so simple that no one has felt like it
belonged in the language.  And I think that's fine, though it wouldn't
hurt to have something a little more than that (see my original
message).

So if it's so easy, why bother adding it?  Well, "class Namespace:
pass" is not so simple to do using the C API.  That's about it.  (I
*do* think people would be glad to have a basic attribute-based
namespace type in the language.)
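
For illustration, a minimal sketch of the two spellings mentioned above, next to the equivalent dict. The attribute names `host` and `port` are made up for the example and are not part of the proposal.

```python
# Two trivial spellings of a dynamic, attribute-based namespace.
class Namespace:
    pass

# Same idea via the three-argument form of type().
AltNamespace = type("Namespace", (), {})

ns = Namespace()
ns.host = "localhost"   # hypothetical attribute names, for illustration only
ns.port = 8080

as_dict = {"host": "localhost", "port": 8080}

print(ns.port)              # access via identifier
print(as_dict["port"])      # access via string key
print(vars(ns) == as_dict)  # both hold the same data underneath
```

The data is identical in both forms; the difference is only in how the access reads at the call site.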

> A final complaint against: would the existence of this fragment
> python-learners education
> to the point that they would defer learning and practicing to use
> classes properly?

This is an excellent point.  I suppose it depends on who was teaching,
and how a new simple "namespace" type were exposed and documented.  It
certainly is not a replacement for classes, which have much more
machinery surrounding state/methods/class-ness.  If it made it harder
to learn Python then it would definitely have to bring *a lot* to the
table.

> Sorry to complain, but someone needs to in python-ideas! ;-)

Hey, I was more worried about the crickets I was hearing.  :)

-eric


From ironfroggy at gmail.com  Sun May 27 15:42:26 2012
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sun, 27 May 2012 09:42:26 -0400
Subject: [Python-ideas] a simple namespace type
In-Reply-To: 
References: 
	
	
	
	
	
	
	
Message-ID: 

On Sat, May 26, 2012 at 7:33 PM, Eric Snow  wrote:
> On Sat, May 26, 2012 at 3:02 PM, Calvin Spealman  wrote:
>> On Sat, May 26, 2012 at 2:53 PM, Eric Snow  wrote:
>>> Any further thoughts on this?  Unless anyone is strongly opposed, I'd
>>> like to push this forward.
>>
>> There is no good name for such a type. "Namespace" is a bad name, because
>> the term "namespace" is already a general term that describes a lot of things in
>> Python (and outside it) and shouldn't share a name with a specific
>> thing, this type.
>> That this specific type would also be within the more general namespace-concept
>> only makes that worse.
>>
>> So, what do you call it?
>
> Yeah, I've seen it called at least 10 different things.  I'm certainly
> open to whatever works best.  I've called it "namespace" because it is
> one of the two kinds of namespace in Python: mapping ([]-access) and
> object (dotted-access).  The builtin dict fills the one role and the
> builtin object type almost fills the other.  I guess
> "dotted_namespace" or "attribute_namespace" would work if "namespace"
> is too confusing.
>
>> Also, is this here because you don't like typing the square brackets and quotes? If
>> so, does it only save you three characters and is that worth the increase to the
>> language size?
>
> This is definitely the stick against which to measure!
>
> It boils down to this: for me dotted-access communicates a different,
> more stable sort of namespace than does []-access (a la dicts).

This is probably the best case I've heard for such a type. Intent expression
is important!

> Certainly it is less typing, but that isn't really a draw for me.
> Dotted access is a little easier to read, which is nice but not the
> big deal for me.  No, the big deal is the conceptual difference
> inherent to access via string vs. access via identifier.
>
> Though Python does not currently have a basic, dynamic,
> attribute-based namespace type, it's trivial to make one: "class
> Namespace: pass" or "type('Namespace', (), {})".  While this has been
> done countless times, it's so simple that no one has felt like it
> belonged in the language.  And I think that's fine, though it wouldn't
> hurt to have something a little more than that (see my original
> message).
>
> So if it's so easy, why bother adding it?  Well, "class Namespace:
> pass" is not so simple to do using the C API.  That's about it.  (I
> *do* think people would be glad to have a basic attribute-based
> namespace type in the language.)
>
>> A final complaint against: would the existence of this fragment
>> python-learners education
>> to the point that they would defer learning and practicing to use
>> classes properly?
>
> This is an excellent point.  I suppose it depends on who was teaching,
> and how a new simple "namespace" type were exposed and documented.  It
> certainly is not a replacement for classes, which have much more
> machinery surrounding state/methods/class-ness.  If it made it harder
> to learn Python then it would definitely have to bring *a lot* to the
> table.
>
>> Sorry to complain, but someone needs to in python-ideas! ;-)
>
> Hey, I was more worried about the crickets I was hearing.  :)
>
> -eric

The best names I was able to get crowdsourced from #python this morning
are:

- record
- flexobject
- attrobject
- attrdict
- nameddict
- namedobject

and the absolute worst name:
- Object

-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy


From sven at marnach.net  Sun May 27 18:08:26 2012
From: sven at marnach.net (Sven Marnach)
Date: Sun, 27 May 2012 17:08:26 +0100
Subject: [Python-ideas] a simple namespace type
In-Reply-To: 
References: 
	
	
	
	
	
	
	
	
Message-ID: <20120527160826.GT14830@bagheera>

Calvin Spealman schrieb am Sun, 27. May 2012, um 09:42:26 -0400:
> - record
> - flexobject
> - attrobject
> - attrdict
> - nameddict
> - namedobject

Since the proposed type is basically an `object` allowing attributes,
another option would be `attrobject`.

Adding an `__iter__()` method, as proposed earlier in this thread,
seems unnecessary; you can simply iterate over `vars(x)` for an
`attrobject` instance `x`.
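
A small sketch of what iterating over `vars(x)` looks like. The `attrobject` class here is a hypothetical stand-in for the proposed type, not an existing builtin.

```python
class attrobject:
    """Hypothetical stand-in for the proposed type."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


x = attrobject(a=1, b=2)

# vars(x) returns the instance __dict__, so ordinary dict iteration applies.
names = list(vars(x))
pairs = sorted(vars(x).items())

for name in names:
    print(name, getattr(x, name))
```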

Cheers,
    Sven


From bauertomer at gmail.com  Sun May 27 19:58:45 2012
From: bauertomer at gmail.com (T.B.)
Date: Sun, 27 May 2012 20:58:45 +0300
Subject: [Python-ideas] a simple namespace type
In-Reply-To: <20120527160826.GT14830@bagheera>
References: 
	
	
	
	
	
	
	
	
	<20120527160826.GT14830@bagheera>
Message-ID: <4FC26B55.7080000@gmail.com>

On 2012-05-27 19:08, Sven Marnach wrote:
 > Calvin Spealman schrieb am Sun, 27. May 2012, um 09:42:26 -0400:
 >> - record
 >> - flexobject
 >> - attrobject
 >> - attrdict
 >> - nameddict
 >> - namedobject
 >
 > Since the proposed type is basically an `object` allowing attributes,
 > another option would be `attrobject`.
 >
 > Adding an `__iter__()` method, as proposed earlier in this thread,
 > seems unnecessary; you can simply iterate over `vars(x)` for an
 > `attrobject` instance `x`.
 >

Is this whole class really necessary? As said before, this type is 
implemented numerous times:
* empty class (included in the Python Tutorial) [1]
* argparse.Namespace [2]
* multiprocessing.managers.Namespace [3]
* bunch (PyPI) that inherits from dict, instead of wrapping __dict__ [4]
* many more...

Each of them has different semantics. Each is suited for a slightly 
different use case and they are so easy to implement. So you can 
customize to your liking - fields can or can't begin with "_", the later 
__repr__ comment or the color of the shed. Still, it seems they do not 
have a "killer feature" like namedtuple's efficiency.

Noticeable is how much they resemble a dict. Some let you iterate over 
the keys, test for equality and even all of the builtin dict methods 
(bunch). If you already use vars() for iteration, you might want a dict.

Funny that except for the easy "class Namespace: pass", the rest fail 
repr for recursive/self-referential objects:

 >>> from argparse/multiprocessing.managers/simplenamespace import Namespace
 >>> ns = Namespace()
 >>> ns.a = ns
 >>> repr(ns)
...
RuntimeError: maximum recursion depth exceeded

The next snippet uses the fact that dict's __repr__ knows how to handle 
recursion to solve the RuntimeError problem:
def __repr__(self):
     return "{}({!r})".format(self.__class__.__name__, self.__dict__)
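
Placed in a complete class, the snippet above can be tried on a self-referential object. This is a sketch only: the `Namespace` class below is a local illustration, not the argparse or multiprocessing one.

```python
class Namespace:
    """Local illustration; not argparse.Namespace."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __repr__(self):
        # Delegating to dict's repr lets dict detect the cycle:
        # a dict that is already being printed shows up as {...}.
        return "{}({!r})".format(self.__class__.__name__, self.__dict__)


ns = Namespace()
ns.a = ns          # self-reference
text = repr(ns)    # terminates instead of recursing without bound
print(text)
```

The cycle is cut at the second visit to the same `__dict__`, so the inner namespace is abbreviated rather than expanded again.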


TB

[1] http://docs.python.org/dev/tutorial/classes.html#odds-and-ends
[2] http://hg.python.org/cpython/file/c1eab1ef9c0b/Lib/argparse.py#l1177
[3] 
http://hg.python.org/cpython/file/c1eab1ef9c0b/Lib/multiprocessing/managers.py#l913
[4] http://pypi.python.org/pypi/bunch


From ironfroggy at gmail.com  Sun May 27 21:31:53 2012
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sun, 27 May 2012 15:31:53 -0400
Subject: [Python-ideas] a simple namespace type
In-Reply-To: <4FC26B55.7080000@gmail.com>
References: 
	
	
	
	
	
	
	
	
	<20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com>
Message-ID: 

On Sun, May 27, 2012 at 1:58 PM, T.B.  wrote:
> On 2012-05-27 19:08, Sven Marnach wrote:
>> Calvin Spealman schrieb am Sun, 27. May 2012, um 09:42:26 -0400:
>>> - record
>>> - flexobject
>>> - attrobject
>>> - attrdict
>>> - nameddict
>>> - namedobject
>>
>> Since the proposed type is basically an `object` allowing attributes,
>> another option would be `attrobject`.
>>
>> Adding an `__iter__()` method, as proposed earlier in this thread,
>> seems unnecessary; you can simply iterate over `vars(x)` for an
>> `attrobject` instance `x`.
>>
>
> Is this whole class really necessary? As said before, this type is
> implemented numerous times:
> * empty class (included in the Python Tutorial) [1]
> * argparse.Namespace [2]
> * multiprocessing.managers.Namespace [3]
> * bunch (PyPI) that inherits from dict, instead of wrapping __dict__ [4]
> * many more...

All of the re-implementations of essentially the same thing is exactly why a
standard version is constantly suggested.

That said, it is so simple that it easily has many variants, because it is only
the base of the different ideas all these things implement.

> Each of them has different semantics. Each is suited for a slightly
> different use case and they are so easy to implement. So you can customize
> to your liking - fields can or can't begin with "_", the later __repr__
> comment or the color of the shed. Still, it seems they do not have a "killer
> feature" like namedtuple's efficiency.
>
> Noticeable is how much they resemble a dict. Some let you iterate over the
> keys, test for equality and even all of the builtin dict methods (bunch). If
> you already use vars() for iteration, you might want a dict.
>
> Funny that except for the easy "class Namespace: pass", the rest fail repr
> for recursive/self-referential objects:
>
>>>> from argparse/multiprocessing.managers/simplenamespace import Namespace
>>>> ns = Namespace()
>>>> ns.a = ns
>>>> repr(ns)
> ...
> RuntimeError: maximum recursion depth exceeded
>
> The next snippet uses the fact that dict's __repr__ knows how to handle
> recursion to solve the RuntimeError problem:
> def __repr__(self):
>     return "{}({!r})".format(self.__class__.__name__, self.__dict__)
>
>
> TB
>
> [1] http://docs.python.org/dev/tutorial/classes.html#odds-and-ends
> [2] http://hg.python.org/cpython/file/c1eab1ef9c0b/Lib/argparse.py#l1177
> [3]
> http://hg.python.org/cpython/file/c1eab1ef9c0b/Lib/multiprocessing/managers.py#l913
> [4] http://pypi.python.org/pypi/bunch
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy


From eric at trueblade.com  Sun May 27 21:35:42 2012
From: eric at trueblade.com (Eric V. Smith)
Date: Sun, 27 May 2012 15:35:42 -0400
Subject: [Python-ideas] a simple namespace type
In-Reply-To: 
References: 
	
	
	
	
	
	
	
	
	<20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com>
	
Message-ID: <4FC2820E.2010205@trueblade.com>

On 5/27/2012 3:31 PM, Calvin Spealman wrote:
> On Sun, May 27, 2012 at 1:58 PM, T.B.  wrote:
>> On 2012-05-27 19:08, Sven Marnach wrote:
>>> Calvin Spealman schrieb am Sun, 27. May 2012, um 09:42:26 -0400:
>>>> - record
>>>> - flexobject
>>>> - attrobject
>>>> - attrdict
>>>> - nameddict
>>>> - namedobject
>>>
>>> Since the proposed type is basically an `object` allowing attributes,
>>> another option would be `attrobject`.
>>>
>>> Adding an `__iter__()` method, as proposed earlier in this thread,
>>> seems unnecessary; you can simply iterate over `vars(x)` for an
>>> `attrobject` instance `x`.
>>>
>>
>> Is this whole class really necessary? As said before, this type is
>> implemented numerous times:
>> * empty class (included in the Python Tutorial) [1]
>> * argparse.Namespace [2]
>> * multiprocessing.managers.Namespace [3]
>> * bunch (PyPI) that inherits from dict, instead of wrapping __dict__ [4]
>> * many more...
> 
> All of the re-implementations of essentially the same thing is exactly why a
> standard version is constantly suggested.
> 
> That said, it is so simple that it easily has many variants, because it is only
> the base of the different ideas all these things implement.

A test of the concept would be: could the uses of the similar classes in
the standard library be replaced with the proposed new implementation?

Eric.



From oscar.j.benjamin at gmail.com  Sun May 27 22:05:48 2012
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Sun, 27 May 2012 21:05:48 +0100
Subject: [Python-ideas] a simple namespace type
In-Reply-To: <4FC2820E.2010205@trueblade.com>
References: 
	
	
	
	
	
	
	
	
	<20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com>
	
	<4FC2820E.2010205@trueblade.com>
Message-ID: 

On 27 May 2012 20:35, Eric V. Smith  wrote:

> On 5/27/2012 3:31 PM, Calvin Spealman wrote:
> > On Sun, May 27, 2012 at 1:58 PM, T.B.  wrote:
> >> On 2012-05-27 19:08, Sven Marnach wrote:
> >>> Calvin Spealman schrieb am Sun, 27. May 2012, um 09:42:26 -0400:
> >>>> - record
> >>>> - flexobject
> >>>> - attrobject
> >>>> - attrdict
> >>>> - nameddict
> >>>> - namedobject
> >>>
> >>> Since the proposed type is basically an `object` allowing attributes,
> >>> another option would be `attrobject`.
> >>>
> >>> Adding an `__iter__()` method, as proposed earlier in this thread,
> >>> seems unnecessary; you can simply iterate over `vars(x)` for an
> >>> `attrobject` instance `x`.
>


What about an `__iter__()` method that works like `dict.items()`? Then you
can do a round trip with
    ns = attrobject(**d)
and
    d = dict(ns)
allowing you to quickly convert between attribute-based and item-based
access in either direction.
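
A sketch of that round trip, assuming a hypothetical `attrobject` whose `__iter__()` yields `(name, value)` pairs. One caveat of this approach: `dict()` treats any argument that has a `keys` attribute as a mapping, so an instance with an attribute named `keys` would not convert this way.

```python
class attrobject:
    """Hypothetical type; __iter__ yields (name, value) pairs like dict.items()."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __iter__(self):
        return iter(vars(self).items())


d = {"x": 1, "y": 2}
ns = attrobject(**d)       # dict -> attribute access
round_tripped = dict(ns)   # attribute access -> dict, via the pairs from __iter__

print(ns.x, round_tripped)
```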



> >>>
> >>
> >> Is this whole class really necessary? As said before, this type is
> >> implemented numerous times:
> >> * empty class (included in the Python Tutorial) [1]
> >> * argparse.Namespace [2]
> >> * multiprocessing.managers.Namespace [3]
> >> * bunch (PyPI) that inherits from dict, instead of wrapping __dict__ [4]
> >> * many more...
> >
> > All of the re-implementations of essentially the same thing is exactly
> why a
> > standard version is constantly suggested.
> >
> > That said, it is so simple that it easily has many variants, because it
> is only
> > the base of the different ideas all these things implement.
>
> A test of the concept would be: could the uses of the similar classes in
> the standard library be replaced with the proposed new implementation?
>
> Eric.
>

From ncoghlan at gmail.com  Sun May 27 22:09:06 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 28 May 2012 06:09:06 +1000
Subject: [Python-ideas] a simple namespace type
In-Reply-To: <4FC2820E.2010205@trueblade.com>
References: 
	
	
	
	
	
	
	
	
	<20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com>
	
	<4FC2820E.2010205@trueblade.com>
Message-ID: 

Slightly easier bar to reach: could the various incarnations be improved by
using a new varobject type as a base class (e.g. I know I often use
namedtuple as a base class rather than instantiating them directly,
although I do the latter, too).

There's also a potentially less controversial alternative: just add an easy
spelling for "type(name, (), {})" to the C API.
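
As a sketch of the base-class usage described above: the first class subclasses a `namedtuple`, which works today; the second shows how a subclass might build on a shared attribute-container base. `VarObject` and `Options` are hypothetical names, not existing types.

```python
from collections import namedtuple


class Point(namedtuple("Point", "x y")):
    """namedtuple used as a base class rather than instantiated directly."""
    __slots__ = ()

    def norm1(self):
        return abs(self.x) + abs(self.y)


class VarObject:
    """Hypothetical shared base: keyword init plus a dict-backed repr."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def __repr__(self):
        return "{}({!r})".format(type(self).__name__, self.__dict__)


class Options(VarObject):
    """A subclass only adds its own behaviour on top of the base."""

    def enabled(self, name):
        return bool(getattr(self, name, False))


opts = Options(debug=True)
print(Point(3, -4).norm1(), opts, opts.enabled("verbose"))
```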

--
Sent from my phone, thus the relative brevity :)
On May 28, 2012 5:54 AM, "Eric V. Smith"  wrote:

> On 5/27/2012 3:31 PM, Calvin Spealman wrote:
> > On Sun, May 27, 2012 at 1:58 PM, T.B.  wrote:
> >> On 2012-05-27 19:08, Sven Marnach wrote:
> >>> Calvin Spealman schrieb am Sun, 27. May 2012, um 09:42:26 -0400:
> >>>> - record
> >>>> - flexobject
> >>>> - attrobject
> >>>> - attrdict
> >>>> - nameddict
> >>>> - namedobject
> >>>
> >>> Since the proposed type is basically an `object` allowing attributes,
> >>> another option would be `attrobject`.
> >>>
> >>> Adding an `__iter__()` method, as proposed earlier in this thread,
> >>> seems unnecessary; you can simply iterate over `vars(x)` for an
> >>> `attrobject` instance `x`.
> >>>
> >>
> >> Is this whole class really necessary? As said before, this type is
> >> implemented numerous times:
> >> * empty class (included in the Python Tutorial) [1]
> >> * argparse.Namespace [2]
> >> * multiprocessing.managers.Namespace [3]
> >> * bunch (PyPI) that inherits from dict, instead of wrapping __dict__ [4]
> >> * many more...
> >
> > All of the re-implementations of essentially the same thing is exactly
> why a
> > standard version is constantly suggested.
> >
> > That said, it is so simple that it easily has many variants, because it
> is only
> > the base of the different ideas all these things implement.
>
> A test of the concept would be: could the uses of the similar classes in
> the standard library be replaced with the proposed new implementation?
>
> Eric.
>

From ericsnowcurrently at gmail.com  Mon May 28 18:34:38 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 28 May 2012 10:34:38 -0600
Subject: [Python-ideas] a simple namespace type
In-Reply-To: 
References: 
	
	
	
	
	
	
	
	
	<20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com>
	
	<4FC2820E.2010205@trueblade.com>
	
Message-ID: 

On Sun, May 27, 2012 at 2:09 PM, Nick Coghlan  wrote:
> Slightly easier bar to reach: could the various incarnations be improved by
> using a new varobject type as a base class (e.g. I know I often use
> namedtuple as a base class rather than instantiating them directly, although
> I do the latter, too).

Good point.  I do the same.

> There's also a potentially less controversial alternative: just add an easy
> spelling for "type(name, (), {})" to the C API.

I really like this.  There's a lot of boilerplate to create just a
simple type like this in the C API.  I'll see what I can come up with.
 :)

As a namespace, it would be good to have a nice repr, but that's not a
show stopper.

-eric


From ericsnowcurrently at gmail.com  Tue May 29 02:00:24 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 28 May 2012 18:00:24 -0600
Subject: [Python-ideas] a simple namespace type
In-Reply-To: 
References: 
	
	
	
	
	
	
	
	
	<20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com>
	
	<4FC2820E.2010205@trueblade.com>
	
	
Message-ID: 

On Mon, May 28, 2012 at 10:34 AM, Eric Snow  wrote:
> On Sun, May 27, 2012 at 2:09 PM, Nick Coghlan  wrote:
>> There's also a potentially less controversial alternative: just add an easy
>> spelling for "type(name, (), {})" to the C API.
>
> I really like this.  There's a lot of boilerplate to create just a
> simple type like this in the C API.  I'll see what I can come up with.
>  :)

http://bugs.python.org/issue14942

-eric


From techtonik at gmail.com  Tue May 29 06:00:51 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 29 May 2012 07:00:51 +0300
Subject: [Python-ideas] from foo import bar.baz
In-Reply-To: 
References: 
Message-ID: 

On Sat, May 26, 2012 at 1:54 AM, Devin Jeanpierre
 wrote:
> Has it irritated anyone else that this syntax is invalid? I've wanted
> it a couple of times, to be equivalent to:
>
>     import foo.bar.baz
>     from foo import bar
>     del foo # but only if we didn't import foo already before"
>
> The idea being that one wants access to foo.bar.baz under the name
> bar.baz , for readability purposes or what have you.

+1
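
Until such syntax exists, the effect can be approximated with a helper. This is a rough sketch using `importlib.import_module`; the helper name `import_from` is made up, and `xml.etree.ElementTree` merely stands in for `foo.bar.baz`.

```python
import importlib


def import_from(package, dotted_name):
    """Import package.dotted_name; return the module for its first component.

    Roughly 'from package import a.b': the full chain is imported, but only
    the object for 'a' is handed back, and 'package' itself is never bound
    in the caller's namespace.
    """
    importlib.import_module(package + "." + dotted_name)
    first = dotted_name.split(".", 1)[0]
    return importlib.import_module(package + "." + first)


# Stand-in for "from xml import etree.ElementTree".
etree = import_from("xml", "etree.ElementTree")
root = etree.ElementTree.fromstring("<a><b/></a>")
print(root.tag)
```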

> I played around with adding this, but I seem to have really bad luck
> with extending CPython...

Try PyPy? =)


From julian at grayvines.com  Tue May 29 06:46:08 2012
From: julian at grayvines.com (Julian Berman)
Date: Tue, 29 May 2012 00:46:08 -0400
Subject: [Python-ideas] Reimplementing collections.deque as a dynamic array
Message-ID: <1251639012979975459@unknownmsgid>

I've occasionally had a need for a container with constant-time append
to both ends without sacrificing constant-time indexing in the middle.
collections.deque will in these cases narrowly miss the target due to
linear indexing (with the current use case being for two deques
storing the lines of text surrounding the cursor in a text editor
while still being randomly indexed occasionally).

Wikipedia lists at least two common deque implementations:

http://en.wikipedia.org/wiki/Double-ended_queue#Implementations

where switching to a dynamic array would seemingly satisfy my requirements.

I know from a bit of experience (and a quick SO perusal) that "How do
I index a deque" does occasionally come up. Any thoughts on the value
of such a change?

JB


From techtonik at gmail.com  Tue May 29 07:05:27 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 29 May 2012 08:05:27 +0300
Subject: [Python-ideas] stdlib crowdsourcing
Message-ID: 

The problem with stdlib - it is all damn subjective. There is no
process to add functions and modules if you're not well-behaved and
skilled in public debates and don't have really a lot of time to be a
champion of your module/function. In other words - it is hard (if not
impossible for 80% of Python Earth population). So, many people and
projects decide to opt-out. Take a look at Twisted - a lot of useful
stuff, but not in Python stdlib. So..

Provide a way for people to opt-out from core stuff, but still allow
to share the changes and update code if necessary.

This will require:
- a local stdlib Python path convention
- snippet normalization function and AST hash dumper
- web site with stats
- source code crawler

How it works:
1. Every project maintains its own stdlib directory with functions
that they feel are good to have in standard library
2. Functions are placed so that they are imported as if from standard
library, but this time with stdlib prefix
3. The license for this directory is public domain to remove all legal
barriers (credits are welcome, but optional)
4. Crawler (probably PyPI) scans this stdlib dir, finds functions,
normalizes them, calculates hash and submits to web site
  4.1 Normalization is required to find the shared function
copy/pasted across different projects with different
        indentation level, docstrings, parameters/variable names etc.
  4.2 Hash is calculated upon AST. There are at least three hashes for
each entry:
       4.2.1 Full hash - all docstrings and variable names are
preserved, whitespace normalized
       4.2.2 Stripped hash - docstrings are stripped, variable names
are normalized
       4.2.3 Signature hash - a mark placed in a comment above
function name, either calculated from function
                signature or generated randomly, used for manual
tracking of copy/paste e.g. pd:ac546df6b8340a92
5. Web site maintains usage and popularity stats, accepts votes on
inclusion of snippets
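
The full and stripped hashes from step 4.2 could be sketched with the standard `ast` and `hashlib` modules. Everything here is an assumption for illustration: the function names, the choice of SHA-256 truncated to 16 hex digits, and the renaming scheme (only arguments and assigned names are renamed; the function name is left alone).

```python
import ast
import hashlib
import textwrap


def _digest(tree):
    # ast.dump() leaves out line/column info by default, so layout,
    # indentation and comments do not affect the result.
    return hashlib.sha256(ast.dump(tree).encode("utf-8")).hexdigest()[:16]


def full_hash(source):
    """Docstrings and names preserved, whitespace normalized."""
    return _digest(ast.parse(textwrap.dedent(source)))


def _is_docstring(stmt):
    return (isinstance(stmt, ast.Expr)
            and isinstance(stmt.value, ast.Constant)
            and isinstance(stmt.value.value, str))


def stripped_hash(source):
    """Docstrings removed, argument and local variable names normalized."""
    tree = ast.parse(textwrap.dedent(source))
    funcs = [n for n in ast.walk(tree)
             if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
    for func in funcs:
        if func.body and _is_docstring(func.body[0]):
            func.body = func.body[1:] or [ast.Pass()]
    # Assign v0, v1, ... to arguments and assigned names in traversal order.
    mapping = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.arg):
            mapping.setdefault(node.arg, "v%d" % len(mapping))
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            mapping.setdefault(node.id, "v%d" % len(mapping))
    for node in ast.walk(tree):
        if isinstance(node, ast.arg):
            node.arg = mapping[node.arg]
        elif isinstance(node, ast.Name) and node.id in mapping:
            node.id = mapping[node.id]
    return _digest(tree)


original = '''
def add(x, y):
    """Add two numbers."""
    total = x + y
    return total
'''

pasted = '''
    def add(a, b):
        result = a + b
        return result
'''

print(full_hash(original) == full_hash(pasted))          # differ: names, docstring
print(stripped_hash(original) == stripped_hash(pasted))  # same underlying function
```

Names that are only read, such as builtins and globals, are deliberately left untouched, so two functions calling different builtins still hash differently.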


User stories:
1. "I want to find if there is a better/updated version of my function
available"
   1.1  I enter hash into web site search form
   1.2  Site gives me a link to my snippet
   1.3  I can see what people proposed to replace this function with
   1.4  I can choose the function with most votes
   1.5  I can flag the functions I may find irrelevant or
   1.6  I can tag the functions that diverge in a different direction
than I need, to filter them out

2. "I want to reuse code snippets without additional dependencies on
3rd party projects"
   2.1  Just place them into my own stdlib directory

3. "I want to update code snippets when there is an update for them"
   3.1  I run the scanner; it extracts signature hashes and stripped hashes
and checks whether the web-site version of each signature matches the normalized hash

4. "I want to see what people want to include in the next Python version"
   4.1  A call for proposals is made
   4.2  People place wannabe's into their stdlib dirs
   4.3  Crawl generates new functions on a web site
   4.4  Functions are categorized
   4.5  Optionally included / declined with a short one-liner reason - why
   4.6  Optionally provided with more detailed info why

--- feature creep cut ---
5. "I want to see what functions are popular in other languages"
   5.1  A separate crawler for Ruby, PHP etc. stdlib converts their
AST into compatible format where possible
   5.2  Submit to site stats

6. "I want to download the function in Ruby format"
   6.1  AST converter tries to do the job automatically where possible
   6.2  If it fails - you are encouraged to fix the converter rules or
write the replacement for this signature manually


Just an idea.
--
anatoly t.


From ncoghlan at gmail.com  Tue May 29 08:02:25 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 29 May 2012 16:02:25 +1000
Subject: [Python-ideas] stdlib crowdsourcing
In-Reply-To: 
References: 
Message-ID: 

Once again, you're completely ignoring all existing knowledge and
expertise on open collaboration and trying to reinvent the world. It's
*not going to happen*.

The standard library is just the curated core, and *yes*, it's damn
hard to get anything added to it (deliberately so). There's a place
where anyone can post anything they want, and see if others find it
useful: PyPI.

The standard library provides tools to upload to PyPI, and, as of 3.3,
will even include tools to download and install from it.

If you don't like our ecosystem (it's hard to tell whether or not you
do: everything you post is about how utterly awful and unusable
everything is, yet you're still here years later).

If you think the PyPI UI is awful or inadequate, follow the example of
crate.io or pythonpackage.com and *create your own*. There's far more
to the Python universe than just core development, stop trying to
shoehorn everything into a place where it doesn't belong.

Finally-giving-up'ly,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From alexandre at peadrop.com  Tue May 29 08:50:32 2012
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 29 May 2012 02:50:32 -0400
Subject: [Python-ideas] Reimplementing collections.deque as a dynamic
	array
In-Reply-To: <1251639012979975459@unknownmsgid>
References: <1251639012979975459@unknownmsgid>
Message-ID: 

The current implementation of deque is a doubly linked list of arrays.
Indexing is indeed linear, but still very efficient. It takes 1 ms to index
a deque with a million items.

If that's not good enough, you should try to implement your own container
using lists (which are dynamic arrays in Python). That should be easy to
implement though this approach will likely be slower for everything but
very large datasets.
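
To check how much middle indexing costs on a given machine, a quick `timeit` comparison along these lines may help. It is a sketch only; the container size and repeat count are arbitrary and the timings vary by system.

```python
from collections import deque
import timeit

n = 1_000_000
d = deque(range(n))
lst = list(range(n))
mid = n // 2   # the middle is the worst case for a deque

# Time 1000 lookups of the middle element in each container.
deque_time = timeit.timeit(lambda: d[mid], number=1000)
list_time = timeit.timeit(lambda: lst[mid], number=1000)

print("deque: %.6f s per 1000 lookups" % deque_time)
print("list:  %.6f s per 1000 lookups" % list_time)
```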

From alexandre at peadrop.com  Tue May 29 09:00:32 2012
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 29 May 2012 03:00:32 -0400
Subject: [Python-ideas] stdlib crowdsourcing
In-Reply-To: 
References: 
	
Message-ID: 

On Tue, May 29, 2012 at 2:02 AM, Nick Coghlan  wrote:
>
> If you don't like our ecosystem (it's hard to tell whether or not you
> do: everything you post is about how utterly awful and unusable
> everything is, yet you're still here years later).
>

I understand the discouragement with regard to repeating yourself over and
over again. But, let's keep the discussion friendly here, okay? This is
Python-ideas: crazy proposals are fine. We can simply ignore those and move
on.

From cs at zip.com.au  Tue May 29 09:04:44 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Tue, 29 May 2012 17:04:44 +1000
Subject: [Python-ideas] Reimplementing collections.deque as a dynamic
	array
In-Reply-To: <1251639012979975459@unknownmsgid>
References: <1251639012979975459@unknownmsgid>
Message-ID: <20120529070444.GA31399@cskk.homeip.net>

On 29May2012 00:46, Julian Berman  wrote:
| I've occasionally had a need for a container with constant-time append
| to both ends without sacrificing constant-time indexing in the middle.
| collections.deque will in these cases narrowly miss the target due to
| linear indexing (with the current use case being for two deques
| storing the lines of text surrounding the cursor in a text editor
| while still being randomly indexed occasionally).
| 
| Wikipedia lists at least two common deque implementations:
| 
| http://en.wikipedia.org/wiki/Double-ended_queue#Implementations
| 
| where switching to a dynamic array would seemingly satisfy my requirements.
| 
| I know from a bit of experience (and a quick SO perusal) that "How do
| I index a deque" does occasionally come up. Any thoughts on the value
| of such a change?

It was pointed out to me recently that Python's list.append() is constant
time overall.

Use two lists, one for append-forward and one for append backward. Keep
track of the bound. Access to item "i" is trivially computed from the
backward and forward list sizes and ends.
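
A minimal sketch of the two-list idea, with a made-up class name. It covers only appends at both ends and indexing; popping from either end would need extra care once one of the lists runs empty.

```python
class IndexableDeque:
    """Two lists back to back: appends at both ends, O(1) indexing."""

    def __init__(self, iterable=()):
        self._front = []              # items left of the split, in reverse order
        self._back = list(iterable)   # items right of the split, in order

    def append(self, item):
        self._back.append(item)

    def appendleft(self, item):
        self._front.append(item)

    def __len__(self):
        return len(self._front) + len(self._back)

    def __getitem__(self, i):
        n = len(self)
        if i < 0:
            i += n
        if not 0 <= i < n:
            raise IndexError("index out of range")
        nf = len(self._front)
        if i < nf:
            # The front list is stored reversed, so mirror the index.
            return self._front[nf - 1 - i]
        return self._back[i - nf]


dq = IndexableDeque([1, 2])
dq.appendleft(0)
dq.appendleft(-1)
dq.append(3)
print([dq[i] for i in range(len(dq))])
```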

Cheers,
-- 
Cameron Simpson  DoD#743
http://www.cskk.ezoshosting.com/cs/

The mere existence of a problem is no proof of the existence of a solution.
        - Yiddish Proverb


From zuo at chopin.edu.pl  Tue May 29 19:34:06 2012
From: zuo at chopin.edu.pl (Jan Kaliszewski)
Date: Tue, 29 May 2012 19:34:06 +0200
Subject: [Python-ideas] a simple namespace type
In-Reply-To: 
References: 
	
	
	
	<20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com>
	
	<4FC2820E.2010205@trueblade.com>
	
	
Message-ID: <20120529173406.GA1869@chopin.edu.pl>

Eric Snow dixit (2012-05-28, 10:34):

> On Sun, May 27, 2012 at 2:09 PM, Nick Coghlan  wrote:
> > Slightly easier bar to reach: could the various incarnations be improved by
> > using a new varobject type as a base class (e.g. I know I often use
> > namedtuple as a base class rather than instantiating them directly, although
> > I do the latter, too).
> 
> Good point.  I do the same.
> 
> > There's also a potentially less controversial alternative: just add an easy
> > spelling for "type(name, (), {})" to the C API.
> 
> I really like this.  There's a lot of boilerplate to create just a
> simple type like this in the C API.  I'll see what I can come up with.
>  :)
> 
> As a namespace, it would be good to have a nice repr, but that's not a
> show stopper.

Using classes as 'attribute containers' is suboptimal (which means that
in performance-critical parts of code you would have to implement a
namespace-like type anyway -- if you wanted to have attr-based syntax,
of course).

There should be one obvious way to do it. Right now there isn't one.

Cheers.
*j



From ericsnowcurrently at gmail.com  Wed May 30 06:52:21 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 29 May 2012 22:52:21 -0600
Subject: [Python-ideas] a simple namespace type
In-Reply-To: <20120529173406.GA1869@chopin.edu.pl>
References: 
	
	
	
	<20120527160826.GT14830@bagheera> <4FC26B55.7080000@gmail.com>
	
	<4FC2820E.2010205@trueblade.com>
	
	
	<20120529173406.GA1869@chopin.edu.pl>
Message-ID: 

On Tue, May 29, 2012 at 11:34 AM, Jan Kaliszewski  wrote:
> Using classes as 'attribute containers' is suboptimal (which means that
> in performance-critical parts of code you would have to implement a
> namespace-like type anyway -- if you wanted to have attr-based syntax,
> of course).

What are the performance problems of using a type object in this way?

>
> There should be one obvious way to do it. Right now there is none.

Yeah, I feel the same way.

-eric


From g.brandl at gmx.net  Wed May 30 08:47:38 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 30 May 2012 08:47:38 +0200
Subject: [Python-ideas] stdlib crowdsourcing
In-Reply-To: 
References: 
	
	
Message-ID: 

On 29.05.2012 09:00, Alexandre Vassalotti wrote:
> On Tue, May 29, 2012 at 2:02 AM, Nick Coghlan
>  > wrote:
> 
>     If you don't like our ecosystem (it's hard to tell whether or not you
>     do: everything you post is about how utterly awful and unusable
>     everything is, yet you're still here years later).
> 
> 
> I understand the discouragement with regard to repeating yourself over and over
> again. But, let's keep the discussion friendly here, okay? This is Python-ideas:
> crazy proposals are fine. We can simply ignore those and move on.

I don't see what's unfriendly about that paragraph: it's a quite accurate
matter-of-fact statement...

Georg



From armin.wieser at gmail.com  Wed May 30 10:45:31 2012
From: armin.wieser at gmail.com (Armin Wieser)
Date: Wed, 30 May 2012 10:45:31 +0200
Subject: [Python-ideas] PEP for Python folder structure
Message-ID: <4FC5DE2B.5020503@gmail.com>

Hi,

I would like to write a PEP about folder structure in Python projects.

You may think that there is no need for that, because everything is
documented (package, module, setuptools). But it should contain
something like [0].

If you aren't familiar with those concepts, have never pushed a package to
PyPI, and have only written some scripts, it's hard to find out how to
structure your folders.

Therefore I think a PEP would be a great way to show how you can do it.

What do you think about it?

[0] http://jcalderone.livejournal.com/39794.html


From littleq0903 at gmail.com  Wed May 30 11:05:41 2012
From: littleq0903 at gmail.com (LittleQ)
Date: Wed, 30 May 2012 17:05:41 +0800
Subject: [Python-ideas] PEP for Python folder structure
In-Reply-To: <4FC5DE2B.5020503@gmail.com>
References: <4FC5DE2B.5020503@gmail.com>
Message-ID: <4D11166B36784105A9CCC360A05049A9@gmail.com>

I think one of the good things about Python is that it imposes no project structure, which makes Python easy to learn and easy to use.

Could you show an example of your own Python project structure? I'm curious why you think Python needs a basic project structure : )

Personally I dislike mandated project structures: Erlang requires a structure for each project, and that often got me into a mess.

Best Regards,
Colin Su (LittleQ)
NCCU Computer Science Dept. / PLSM Lab.

About.me: http://about.me/littleq


On Wednesday, May 30, 2012 at 4:45 PM, Armin Wieser wrote:

> Hi,
> 
> I would like to write a PEP about folder structure in Python projects.
> 
> You may think that there is no need for that, because everything is
> documented (package, module, setuptools). But it should contain
> something like [0].
> 
> If you aren't familiar with those concepts, have never pushed a package to
> PyPI, and have only written some scripts, it's hard to find out how to
> structure your folders.
> 
> Therefore I think a PEP would be a great way to show how you can do it.
> 
> What do you think about it?
> 
> [0] http://jcalderone.livejournal.com/39794.html
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org (mailto:Python-ideas at python.org)
> http://mail.python.org/mailman/listinfo/python-ideas
> 
> 



From steve at pearwood.info  Wed May 30 11:19:07 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 30 May 2012 19:19:07 +1000
Subject: [Python-ideas] PEP for Python folder structure
In-Reply-To: <4FC5DE2B.5020503@gmail.com>
References: <4FC5DE2B.5020503@gmail.com>
Message-ID: <20120530091907.GB27475@ando>

On Wed, May 30, 2012 at 10:45:31AM +0200, Armin Wieser wrote:
> Hi,
> 
> I would like to write a PEP about folder structure in python projects.

Why?

PEP stands for Python Enhancement Proposal, and PEPs relate to suggested 
changes to the Python language and standard library. Your blog post 
about folder structure:

> [0] http://jcalderone.livejournal.com/39794.html

is interesting, but it has nothing to do with either Python the language 
or the standard library, as far as I can tell. In fact, some of your 
project suggestions go against best practice, or at least common 
practice:

"Don't put your source in a directory called src"

Really? I think you'll find many people disagree with that.

I think your blog post is a good blog post, and deserves to have people 
read it and discuss it. With feedback from others, I think it might even 
become a good how-to on laying out projects. But I think it would be a poor 
PEP.

Of course, you can write a post in the format of a PEP. Just don't call 
it a PEP unless it is a proposal for an enhancement to Python, or at 
least related to development of Python, e.g. PEP 8.


-- 
Steven


From mal at egenix.com  Wed May 30 13:01:42 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 30 May 2012 13:01:42 +0200
Subject: [Python-ideas] PEP for Python folder structure
In-Reply-To: <20120530091907.GB27475@ando>
References: <4FC5DE2B.5020503@gmail.com> <20120530091907.GB27475@ando>
Message-ID: <4FC5FE16.5000200@egenix.com>

Steven D'Aprano wrote:
> On Wed, May 30, 2012 at 10:45:31AM +0200, Armin Wieser wrote:
>> Hi,
>>
>> I would like to write a PEP about folder structure in python projects.
> 
> Why?
> 
> PEP stands for Python Enhancement Proposal, and PEPs relate to suggested 
> changes to the Python language and standard library. Your blog post 
> about folder structure:
> 
>> [0] http://jcalderone.livejournal.com/39794.html
> 
> is interesting, but it has nothing to do with either Python the language 
> or the standard library, as far as I can tell.

We do have informational PEPs for the purpose Armin is describing, but
we usually only try to use those for standardization of things.

I don't think a standard project dir layout is really needed. Helping new
package authors find the right structure for their project does
help, though.

Perhaps the idea could be turned into a section of the (distutils)
documentation, a how-to or a page on the wiki ?!

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, May 30 2012)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2012-07-17: Python Meeting Duesseldorf ...                 48 days to go
2012-07-02: EuroPython 2012, Florence, Italy ...           33 days to go
2012-05-16: Released eGenix pyOpenSSL 0.13 ...    http://egenix.com/go29

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/


From ncoghlan at gmail.com  Wed May 30 13:48:39 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 30 May 2012 21:48:39 +1000
Subject: [Python-ideas] PEP for Python folder structure
In-Reply-To: <4FC5FE16.5000200@egenix.com>
References: <4FC5DE2B.5020503@gmail.com> <20120530091907.GB27475@ando>
	<4FC5FE16.5000200@egenix.com>
Message-ID: 

On Wed, May 30, 2012 at 9:01 PM, M.-A. Lemburg  wrote:
> I don't think a standard project dir layout is really needed. Helping new
> package authors find the right structure for their project does
> help, though.

The basic problem is that it's a matter of "it depends what you're
building and whether or not there are any other constraints on your
layout". Kenneth Reitz has a decent guide that he posted recently
([1]), but see the comments below the post for some useful caveats and
discussion.

Ultimately though, offering a home for opinionated advice on
exactly this kind of question is why the Hitchhiker's Guide to Python
[2] was created.

[1] http://kennethreitz.com/repository-structure-and-python.html
[2] http://docs.python-guide.org/en/latest/index.html

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


From Ronny.Pfannschmidt at gmx.de  Wed May 30 17:03:31 2012
From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt)
Date: Wed, 30 May 2012 17:03:31 +0200
Subject: [Python-ideas] FormatRepr in reprlib for declaring simple repr
	functions easily
Message-ID: <4FC636C3.5080004@gmx.de>

Hi,

I consider my utility class FormatRepr finished;
it's currently available in
( http://pypi.python.org/pypi/reprtools/0.1 )

It supplies a descriptor that lets you simply declare __repr__ methods 
based on object attributes.

I think it greatly enhances readability for those things,
as it's DRY and focuses on the parts *I* consider important
(e.g. which accessible attribute gets formatted how).

There is no need to repeat attribute names or
care whether something is a property, class attribute or object attribute
(one of the reasons why a simple .format(**vars(self)) will not always work).
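
To illustrate the vars() point, here is a minimal sketch of the general descriptor idea. It is an assumption-laden stand-in, not the reprtools code: `AttrRepr`, its `{self.attr}` template style and the `is_admin` attribute are made up for this example.

```python
class AttrRepr(object):
    """Hypothetical descriptor; NOT the reprtools implementation."""

    def __init__(self, template):
        self.template = template

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        template = self.template
        # repr() expects a zero-argument callable once __repr__ is bound.
        # str.format resolves "{self.attr}" via getattr, so properties and
        # class attributes are found just like instance attributes.
        return lambda: template.format(self=instance)


class User(object):
    __repr__ = AttrRepr(
        "<{self.__class__.__name__} {self.name} admin={self.is_admin}>")
    is_admin = False  # class attribute: absent from vars(self)

    def __init__(self, name):
        self._name = name

    @property
    def name(self):  # property: also absent from vars(self)
        return self._name


user = User("test")
print(repr(user))  # <User test admin=False>
print(vars(user))  # {'_name': 'test'}
```

Here vars(user) only contains `_name`, so `"{name}".format(**vars(user))` raises KeyError, while attribute lookup finds both the property and the class attribute.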

oversimplified example:


.. code-block:: python

    from reprtools import FormatRepr

    class User(object):
        __repr__ = FormatRepr("<User {name}>")

        def __init__(self, name):
            self.name = name



 >>> User('test')
 <User test>



-- Ronny


From mikegraham at gmail.com  Thu May 31 16:38:37 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Thu, 31 May 2012 10:38:37 -0400
Subject: [Python-ideas] FormatRepr in reprlib for declaring simple repr
 functions easily
In-Reply-To: <4FC636C3.5080004@gmx.de>
References: <4FC636C3.5080004@gmx.de>
Message-ID: 

On Wed, May 30, 2012 at 11:03 AM, Ronny Pfannschmidt
 wrote:
> Hi,
>
> I consider my utility class FormatRepr finished;
> it's currently available in
> ( http://pypi.python.org/pypi/reprtools/0.1 )
>
> It supplies a descriptor that lets you simply declare __repr__ methods
> based on object attributes.
>
> I think it greatly enhances readability for those things,
> as it's DRY and focuses on the parts *I* consider important
> (e.g. which accessible attribute gets formatted how).
>
> There is no need to repeat attribute names or
> care whether something is a property, class attribute or object attribute
> (one of the reasons why a simple .format(**vars(self)) will not always work).
>
> oversimplified example:
>
>
> .. code-block:: python
>
>    from reprtools import FormatRepr
>
>    class User(object):
>        __repr__ = FormatRepr("")
>
>        def __init__(self, name):
>            self.name = name
>
>
>
>>>> User('test')
> 

If we introduce something like this, I think I'd prefer an approach
that didn't encourage hardcoding "User". In my __repr__s, I usually
make the class's name dynamic so it does not make for confusing reprs
in the event of subclassing.
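
A minimal sketch of what I mean by a dynamic class name; the `User` and `Admin` classes are only illustrative:

```python
class User(object):
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # Look the class name up at call time instead of hardcoding "User".
        return "<%s %s>" % (type(self).__name__, self.name)


class Admin(User):
    pass


print(repr(User("test")))   # <User test>
print(repr(Admin("root")))  # <Admin root>
```

With a hardcoded "User" in the string, the Admin instance would misleadingly show up as a User.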

You really don't end up implementing __repr__ all that often and if
you do, writing a simple one isn't hard. I'm -0 on having this in
the stdlib.

Mike


From Ronny.Pfannschmidt at gmx.de  Thu May 31 16:43:23 2012
From: Ronny.Pfannschmidt at gmx.de (Ronny Pfannschmidt)
Date: Thu, 31 May 2012 16:43:23 +0200
Subject: [Python-ideas] FormatRepr in reprlib for declaring simple repr
 functions easily
In-Reply-To: 
References: <4FC636C3.5080004@gmx.de>
	
Message-ID: <4FC7838B.5050709@gmx.de>

On 05/31/2012 04:38 PM, Mike Graham wrote:
> On Wed, May 30, 2012 at 11:03 AM, Ronny Pfannschmidt
>   wrote:
>> Hi,
>>
>> I consider my utility class FormatRepr finished;
>> it's currently available in
>> ( http://pypi.python.org/pypi/reprtools/0.1 )
>>
>> It supplies a descriptor that lets you simply declare __repr__ methods
>> based on object attributes.
>>
>> I think it greatly enhances readability for those things,
>> as it's DRY and focuses on the parts *I* consider important
>> (e.g. which accessible attribute gets formatted how).
>>
>> There is no need to repeat attribute names or
>> care whether something is a property, class attribute or object attribute
>> (one of the reasons why a simple .format(**vars(self)) will not always work).
>>
>> oversimplified example:
>>
>>
>> .. code-block:: python
>>
>>    from reprtools import FormatRepr
>>
>>    class User(object):
>>        __repr__ = FormatRepr("")
>>
>>        def __init__(self, name):
>>            self.name = name
>>
>>
>>
>>>>> User('test')
>> 
>
> If we introduce something like this, I think I'd prefer an approach
> that didn't encourage hardcoding "User". In my __repr__s, I usually
> make the class's name dynamic so it does not make for confusing reprs
> in the event of subclassing.

you can just use {__class__.__name__} to have it "softcoded"

>
> You really don't end up implementing __repr__ all that often and if
> you do, writing a simple one isn't hard. I'm -0 on having this in
> the stdlib.
>
> Mike



From alexandre.zani at gmail.com  Thu May 31 16:51:45 2012
From: alexandre.zani at gmail.com (Alexandre Zani)
Date: Thu, 31 May 2012 07:51:45 -0700
Subject: [Python-ideas] FormatRepr in reprlib for declaring simple repr
 functions easily
In-Reply-To: <4FC7838B.5050709@gmx.de>
References: <4FC636C3.5080004@gmx.de>
	
	<4FC7838B.5050709@gmx.de>
Message-ID: 

I would prefer an interface where I just pass a list of attribute
names and the utility class figures everything else out.
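
Roughly, I am picturing something like the following sketch. The `make_repr` helper and the `email` attribute are hypothetical, made up for illustration; nothing like this exists in the stdlib or, as far as I know, in reprtools:

```python
def make_repr(*attr_names):
    """Hypothetical helper: build a __repr__ from a list of attribute names."""

    def __repr__(self):
        # getattr() works for properties and class attributes as well.
        parts = ", ".join(
            "%s=%r" % (name, getattr(self, name)) for name in attr_names)
        return "%s(%s)" % (type(self).__name__, parts)

    return __repr__


class User(object):
    __repr__ = make_repr("username", "email")

    def __init__(self, username, email):
        self.username = username
        self.email = email


print(repr(User("test", "test@example.com")))
# User(username='test', email='test@example.com')
```

Since the helper returns a plain function, it binds like any other method and needs no descriptor machinery.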

That said, I'm not sure this does enough to warrant inclusion in the
stdlib. It's easy enough to write a __repr__ with just a few more
characters. I'm not sure that:

__repr__ = FormatRepr("")

is actually more readable than

def __repr__(self):
  return "" % self.username

In fact, the second option might be better because I don't have to
learn anything new to understand it. If I see your version, I have to
google it and then use brain-space to hold that feature in memory. If
I know python, I already know what the second option means.

Alexandre Zani

On Thu, May 31, 2012 at 7:43 AM, Ronny Pfannschmidt
 wrote:
> On 05/31/2012 04:38 PM, Mike Graham wrote:
>>
>> On Wed, May 30, 2012 at 11:03 AM, Ronny Pfannschmidt
>>  wrote:
>>>
>>> Hi,
>>>
>>> I consider my utility class FormatRepr finished;
>>> it's currently available in
>>> ( http://pypi.python.org/pypi/reprtools/0.1 )
>>>
>>> It supplies a descriptor that lets you simply declare __repr__ methods
>>> based on object attributes.
>>>
>>> I think it greatly enhances readability for those things,
>>> as it's DRY and focuses on the parts *I* consider important
>>> (e.g. which accessible attribute gets formatted how).
>>>
>>> There is no need to repeat attribute names or
>>> care whether something is a property, class attribute or object attribute
>>> (one of the reasons why a simple .format(**vars(self)) will not always
>>> work).
>>>
>>> oversimplified example:
>>>
>>>
>>> .. code-block:: python
>>>
>>>    from reprtools import FormatRepr
>>>
>>>    class User(object):
>>>        __repr__ = FormatRepr("")
>>>
>>>        def __init__(self, name):
>>>            self.name = name
>>>
>>>
>>>
>>>>>> User('test')
>>>
>>> 
>>
>>
>> If we introduce something like this, I think I'd prefer an approach
>> that didn't encourage hardcoding "User". In my __repr__s, I usually
>> make the class's name dynamic so it does not make for confusing reprs
>> in the event of subclassing.
>
>
> you can just use {__class__.__name__} to have it "softcoded"
>
>
>>
>> You really don't end up implementing __repr__ all that often and if
>> you do, writing a simple one isn't hard. I'm -0 on having this in
>> the stdlib.
>>
>> Mike
>
>