From tjreedy at udel.edu  Wed Aug  1 01:09:54 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 31 Jul 2012 19:09:54 -0400
Subject: [Python-ideas] only raise ImportError out of imports
In-Reply-To:
References:
Message-ID:

On 7/31/2012 1:49 AM, Eric Snow wrote:
> Currently, if module 'eggs' has bad syntax and you import it from
> module 'spam', then you will get a SyntaxError in module spam:
>
> ---------------------------------------------------------
> $ cat > eggs.py << EOF
> a +
> EOF
> $ cat > spam.py << EOF
> import eggs
> EOF
> $ python -c 'import spam'
> Traceback (most recent call last):
>   File "spam.py", line 2, in <module>
>     import eggs
>   File "/tmp/eggs.py", line 1
>     a +
>       ^
> SyntaxError: invalid syntax
> ---------------------------------------------------------

This is really clear to me. SyntaxError in eggs exposed during the
import call, which obviously failed, as does everything in a traceback.

> ---------------------------------------------------------
> $ cat > spam.py << EOF
> try:
>     import eggs
> except SyntaxError as e:
>     raise ImportError("failed to import eggs") from e
> EOF
> $ python -c 'import spam'
> Traceback (most recent call last):
>   File "spam.py", line 2, in <module>
>     import eggs
>   File "/tmp/eggs.py", line 1
>     a +
>       ^
> SyntaxError: invalid syntax
>
> The above exception was the direct cause of the following exception:
>
> Traceback (most recent call last):
>   File "spam.py", line 4, in <module>
>     raise ImportError("failed to import eggs") from e
> ImportError: failed to import eggs
> ---------------------------------------------------------

Much worse. Lots of extra noise with the important part buried in the
middle.

I don't see the point of privileging import calls over everything else
in the call chain. An ImportError should be related to an
import-specific cause, and SyntaxError is not such. It would be there
anyway if eggs.py were run directly.
--
Terry Jan Reedy

From brett at python.org  Wed Aug  1 17:25:37 2012
From: brett at python.org (Brett Cannon)
Date: Wed, 1 Aug 2012 11:25:37 -0400
Subject: [Python-ideas] Better support for finalization with weakrefs
In-Reply-To:
References:
Message-ID:

On Tue, Jul 31, 2012 at 9:28 AM, Calvin Spealman wrote:

> On Mon, Jul 30, 2012 at 12:44 PM, Richard Oudkerk wrote:
> > I would like to see better support for the use of weakrefs callbacks
> > for object finalization.
> >
> > The current issues with weakref callbacks include:
> >
> > 1. They are rather low level, and working out how to use them
> >    correctly requires a bit of head scratching. One must find
> >    somewhere to store the weakref till after the referent is dead, and
> >    without accidentally keeping the referent alive. Then one must
> >    ensure that the callback frees the weakref (without leaving any
> >    remnant ref-cycles).
> >
> >    When it is an option, using a __del__ method is far less hassle.
> >
> > 2. Callbacks invoked during interpreter shutdown are troublesome. For
> >    instance they can throw exceptions because module globals have been
> >    replaced by None.
> >
> > 3. Sometimes you want the callback to be called at program exit even
> >    if the referent is still alive.
> >
> > 4. Exceptions thrown in callbacks do not produce a traceback. This
> >    can make debugging rather awkward. (Maybe this is because printing
> >    tracebacks is problematic during interpreter shutdown?)
> >
> > (Note that problems 2-4 also apply to using __del__ methods.)
> >
> > If possible I would like to see the weakref module provide a finalize
> > class to address these issues. Trivial usage might be
> >
> >     >>> class Kenny: pass
> >     ...
> >     >>> kenny = Kenny()
> >     >>> finalize(kenny, print, "you killed kenny!")
> >
> >     >>> del kenny
> >     you killed kenny!
> >
> > Prototype at https://gist.github.com/3208245
>
> I like how simple it is, but it might be too simple. That is,
> shouldn't we have a way to unregister the callback?
You can't unregister a __del__ method, so why should that stop this
from moving forward? And if you really need that then you just have the
cleanup code check something to see if it should run or not or just use
some form of delegation.

From matthew.lefavor at nasa.gov  Wed Aug  1 17:41:15 2012
From: matthew.lefavor at nasa.gov (Lefavor, Matthew (GSFC-582.0)[MICROTEL LLC])
Date: Wed, 1 Aug 2012 10:41:15 -0500
Subject: [Python-ideas] Better support for finalization with weakrefs
In-Reply-To:
Message-ID:

If finalizations are objects, is there any reason you can't make the
finalization object have an unregister method, like below? Or does this
not address the problem?

    >>> class Kenny: pass
    ...
    >>> kenny = Kenny()
    >>> f = finalize(kenny, print, "you killed kenny!")
    >>> f

    >>> f.unregister()
    >>> del kenny
    >>>

Matthew Lefavor

From: Brett Cannon
Date: Wednesday, August 1, 2012 11:25 AM
To: "ironfroggy at gmail.com"
Cc: Richard Oudkerk, "python-ideas at python.org"
Subject: Re: [Python-ideas] Better support for finalization with weakrefs

On Tue, Jul 31, 2012 at 9:28 AM, Calvin Spealman wrote:
On Mon, Jul 30, 2012 at 12:44 PM, Richard Oudkerk wrote:
> I would like to see better support for the use of weakrefs callbacks
> for object finalization.
>
> The current issues with weakref callbacks include:
>
> 1. They are rather low level, and working out how to use them
>    correctly requires a bit of head scratching. One must find
>    somewhere to store the weakref till after the referent is dead, and
>    without accidentally keeping the referent alive. Then one must
>    ensure that the callback frees the weakref (without leaving any
>    remnant ref-cycles).
>
>    When it is an option, using a __del__ method is far less hassle.
>
> 2. Callbacks invoked during interpreter shutdown are troublesome. For
>    instance they can throw exceptions because module globals have been
>    replaced by None.
>
> 3. Sometimes you want the callback to be called at program exit even
>    if the referent is still alive.
>
> 4. Exceptions thrown in callbacks do not produce a traceback. This
>    can make debugging rather awkward. (Maybe this is because printing
>    tracebacks is problematic during interpreter shutdown?)
>
> (Note that problems 2-4 also apply to using __del__ methods.)
>
> If possible I would like to see the weakref module provide a finalize
> class to address these issues. Trivial usage might be
>
>     >>> class Kenny: pass
>     ...
>     >>> kenny = Kenny()
>     >>> finalize(kenny, print, "you killed kenny!")
>
>     >>> del kenny
>     you killed kenny!
>
> Prototype at https://gist.github.com/3208245

I like how simple it is, but it might be too simple. That is,
shouldn't we have a way to unregister the callback?

You can't unregister a __del__ method, so why should that stop this
from moving forward? And if you really need that then you just have the
cleanup code check something to see if it should run or not or just use
some form of delegation.

From shibturn at gmail.com  Wed Aug  1 18:27:23 2012
From: shibturn at gmail.com (Richard Oudkerk)
Date: Wed, 01 Aug 2012 17:27:23 +0100
Subject: [Python-ideas] Better support for finalization with weakrefs
In-Reply-To:
References:
Message-ID: <501958EB.9030306@gmail.com>

On 01/08/2012 4:41pm, Lefavor, Matthew (GSFC-582.0)[MICROTEL LLC] wrote:
> If finalizations are objects, is there any reason you can't make the
> finalization object have an unregister method, like below? Or does this
> not address the problem?
>
>     >>> class Kenny: pass
>     ...
>     >>> kenny = Kenny()
>     >>> f = finalize(kenny, print, "you killed kenny!")
>     >>> f
>
>     >>> f.unregister()
>     >>> del kenny
>     >>>
>
> Matthew Lefavor

I have a patch (with tests and docs) at

    http://bugs.python.org/issue15528

It has a pop() method for unregistering the callback.
So if f was created using

    f = finalize(obj, func, *args, **kwds)

then f.pop() returns a tuple

    (wr, func, args, kwds)

where wr is a weakref to obj.

There is also a get() method which returns the same info, but does
not unregister the callback.

Once the finalizer is dead, f(), f.get(), f.pop() all return None.

Richard

From ironfroggy at gmail.com  Wed Aug  1 22:56:56 2012
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Wed, 1 Aug 2012 16:56:56 -0400
Subject: [Python-ideas] Better support for finalization with weakrefs
In-Reply-To: <501958EB.9030306@gmail.com>
References: <501958EB.9030306@gmail.com>
Message-ID:

On Wed, Aug 1, 2012 at 12:27 PM, Richard Oudkerk wrote:
> On 01/08/2012 4:41pm, Lefavor, Matthew (GSFC-582.0)[MICROTEL LLC] wrote:
>>
>> If finalizations are objects, is there any reason you can't make the
>> finalization object have an unregister method, like below? Or does this
>> not address the problem?
>>
>>     >>> class Kenny: pass
>>     ...
>>     >>> kenny = Kenny()
>>     >>> f = finalize(kenny, print, "you killed kenny!")
>>     >>> f
>>
>>     >>> f.unregister()
>>     >>> del kenny
>>     >>>
>>
>> Matthew Lefavor
>
>
> I have a patch (with tests and docs) at
>
>     http://bugs.python.org/issue15528
>
> It has a pop() method for unregistering the callback.
> So if f was created using
>
>     f = finalize(obj, func, *args, **kwds)
>
> then f.pop() returns a tuple
>
>     (wr, func, args, kwds)
>
> where wr is a weakref to obj.
>
> There is also a get() method which returns the same info, but does
> not unregister the callback.
>
> Once the finalizer is dead, f(), f.get(), f.pop() all return None.

I don't like reusing the method names of containers, which this isn't.
I liked "peek" and "detach".

Also, why not fully hide the weakref and return the actual object when
you call one of them, but have them fail if the object no longer
exists? Or, what if they were just properties that looked up the
weakref'ed object on access but you could just get at the callback and
args directly?
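For comparison, the `weakref.finalize` class that later shipped in the standard library (Python 3.4 and later) uses the `peek` and `detach` names mentioned here, and both return the referent itself rather than a weakref. The following is only a minimal sketch of that later API, not the patch under discussion; the `messages` list is an assumption added so the effect of the callback can be inspected.

```python
import gc
import weakref


class Kenny:
    pass


messages = []  # assumption: collects callback output instead of printing

# Register a callback to run when `kenny` is garbage collected.
kenny = Kenny()
f = weakref.finalize(kenny, messages.append, "you killed kenny!")

# peek() returns (obj, func, args, kwargs) without unregistering anything.
obj, func, args, kwargs = f.peek()
peeked_args = args
del obj  # drop the extra strong reference so the object can die

del kenny
gc.collect()  # not needed on CPython, harmless elsewhere
first_run = list(messages)  # snapshot after the callback has fired
dead_peek = f.peek()        # peek() on a dead finalizer

# detach() unregisters the callback, so collection triggers nothing.
kenny = Kenny()
g = weakref.finalize(kenny, messages.append, "never recorded")
detached = g.detach()
del detached, kenny  # the detached tuple held a strong reference too
gc.collect()
```

Note that both `peek()` and `detach()` hand back a strong reference to the object, so the caller has to drop it before the object can be collected.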
>
> Richard

--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy

From shibturn at gmail.com  Thu Aug  2 13:49:33 2012
From: shibturn at gmail.com (Richard Oudkerk)
Date: Thu, 02 Aug 2012 12:49:33 +0100
Subject: [Python-ideas] Better support for finalization with weakrefs
In-Reply-To:
References: <501958EB.9030306@gmail.com>
Message-ID: <501A694D.60504@gmail.com>

On 01/08/2012 9:56pm, Calvin Spealman wrote:
> I don't like reusing the method names of containers, which this isn't.
> I liked "peek" and "detach".

Yeah, maybe so.

> Also, why not fully hide the weakref and return the actual object when
> you call one of them, but have them fail if the object no longer
> exists?

I did consider that but worried about the "race" where the object is
garbage collected just after you pop from the registry. Thinking about
it again, you can just try resolving the weakref before popping from
the registry.

> Or, what if they were just properties that looked up the
> weakref'ed object on access but you could just get at the callback and
> args directly?

The finalizer itself is just a lightweight dictionary key. It should
not own references to anything non-trivial. Otherwise storing it could
keep alive big objects (or cause ref-cycles) even after the finalizer
is dead.

But all the information could be available from properties which
return None when the finalizer is dead.

Richard

From ericsnowcurrently at gmail.com  Thu Aug  2 17:26:59 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 2 Aug 2012 09:26:59 -0600
Subject: [Python-ideas] abc.optionalabstractmethod
Message-ID:

Sometimes you want to specify an optional method in an abstract base class.
Currently we don't have a consistent way of doing so, instead having to
mix in the old way of defining "abstract" methods:

    class MyABC(metaclass=ABCMeta):
        ...

        def do_something_optional(self):
            """An optional method for doing something."""
            raise NotImplementedError

This came up in issue 15502[1], where we are splitting
importlib.abc.Finder into MetaPathFinder and PathEntryFinder. These
have a method, invalidate_caches(), which is optional. It would be
nice to have a new decorator akin to abstractmethod that would allow
defining an optional interface in a consistent way. Something like
optionalabstractmethod (or some better name). Then the above example
would be like this:

    class MyABC(metaclass=ABCMeta):
        ...

        @optionalabstractmethod
        def do_something_optional(self):
            """An optional method for doing something."""

Thoughts?

-eric

[1] http://bugs.python.org/issue15502; Barry Warsaw voiced what I've
considered on multiple occasions.

From barry at python.org  Thu Aug  2 17:31:04 2012
From: barry at python.org (Barry Warsaw)
Date: Thu, 2 Aug 2012 11:31:04 -0400
Subject: [Python-ideas] abc.optionalabstractmethod
References:
Message-ID: <20120802113104.4bb2d99f@resist.wooz.org>

On Aug 02, 2012, at 09:26 AM, Eric Snow wrote:

>Sometimes you want to specify an optional method in an abstract base
>class. Currently we don't have a consistent way of doing so, instead
>having to mix in the old way of defining "abstract" methods:
>
>class MyABC(metaclass=ABCMeta):
>    ...
>
>    def do_something_optional(self):
>        """An optional method for doing something."""
>        raise NotImplementedError
>
>This came up in issue 15502[1], where we are splitting
>importlib.abc.Finder into MetaPathFinder and PathEntryFinder. These
>have a method, invalidate_caches(), which is optional. It would be
>nice to have a new decorator akin to abstractmethod that would allow
>defining an optional interface in a consistent way. Something like
>optionalabstractmethod (or some better name).
>Then the above example
>would be like this:
>
>class MyABC(metaclass=ABCMeta):
>    ...
>
>    @optionalabstractmethod
>    def do_something_optional(self):
>        """An optional method for doing something."""
>
>Thoughts?

+1

-Barry

From masklinn at masklinn.net  Thu Aug  2 17:42:59 2012
From: masklinn at masklinn.net (Masklinn)
Date: Thu, 2 Aug 2012 17:42:59 +0200
Subject: [Python-ideas] abc.optionalabstractmethod
In-Reply-To:
References:
Message-ID: <52A5A20A-43C3-4F3F-9AE5-EB7CB69221B7@masklinn.net>

On 2012-08-02, at 17:26 , Eric Snow wrote:
> Sometimes you want to specify an optional method in an abstract base
> class. Currently we don't have a consistent way of doing so, instead
> having to mix in the old way of defining "abstract" methods:
>
> class MyABC(metaclass=ABCMeta):
>     ...
>
>     def do_something_optional(self):
>         """An optional method for doing something."""
>         raise NotImplementedError
>
> This came up in issue 15502[1], where we are splitting
> importlib.abc.Finder into MetaPathFinder and PathEntryFinder. These
> have a method, invalidate_caches(), which is optional. It would be
> nice to have a new decorator akin to abstractmethod that would allow
> defining an optional interface in a consistent way. Something like
> optionalabstractmethod (or some better name). Then the above example
> would be like this:
>
> class MyABC(metaclass=ABCMeta):
>     ...
>
>     @optionalabstractmethod
>     def do_something_optional(self):
>         """An optional method for doing something."""
>
> Thoughts?

Wouldn't it be better to have a generic @optional applying to all
@abstract*? e.g.

    class MyABC(metaclass=ABCMeta):
        @optional
        @abstractmethod
        def do_something_optional(self):
            """ an optional method for doing something """

(yeah I know all of them apart from @abstractmethod are deprecated,
still it feels ugly and unreadable to have @optionalabstractmethod next
to @abstractmethod)

From simon.sapin at kozea.fr  Thu Aug  2 17:37:50 2012
From: simon.sapin at kozea.fr (Simon Sapin)
Date: Thu, 02 Aug 2012 17:37:50 +0200
Subject: [Python-ideas] abc.optionalabstractmethod
In-Reply-To:
References:
Message-ID: <501A9ECE.5020708@kozea.fr>

On 02/08/2012 17:26, Eric Snow wrote:
> Sometimes you want to specify an optional method in an abstract base
> class. Currently we don't have a consistent way of doing so, instead
> having to mix in the old way of defining "abstract" methods:
>
> class MyABC(metaclass=ABCMeta):
>     ...
>
>     def do_something_optional(self):
>         """An optional method for doing something."""
>         raise NotImplementedError

Hi,

What's wrong with this?

I think that the only thing making methods with @abstractmethod special
is that they prevent instantiation if they are not overridden. If we
remove that, the only remaining difference is in terms of
documentation. Maybe a new Sphinx directive could help?

What would an optional abstract method do when it is called? Raise
NotImplementedError?

I'm not against this idea but I just don't see how
@optionalabstractmethod is better than raise NotImplementedError

Regards,
--
Simon Sapin

From solipsis at pitrou.net  Thu Aug  2 17:46:04 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 2 Aug 2012 17:46:04 +0200
Subject: [Python-ideas] abc.optionalabstractmethod
References:
Message-ID: <20120802174604.3be467f3@pitrou.net>

On Thu, 2 Aug 2012 09:26:59 -0600
Eric Snow wrote:
> Sometimes you want to specify an optional method in an abstract base
> class.
> Currently we don't have a consistent way of doing so, instead
> having to mix in the old way of defining "abstract" methods:
>
> class MyABC(metaclass=ABCMeta):
>     ...
>
>     def do_something_optional(self):
>         """An optional method for doing something."""
>         raise NotImplementedError

What's the problem with it exactly?

Regards

Antoine.

--
Software development and contracting: http://pro.pitrou.net

From ericsnowcurrently at gmail.com  Fri Aug  3 05:19:14 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 2 Aug 2012 21:19:14 -0600
Subject: [Python-ideas] abc.optionalabstractmethod
In-Reply-To: <501A9ECE.5020708@kozea.fr>
References: <501A9ECE.5020708@kozea.fr>
Message-ID:

On Thu, Aug 2, 2012 at 9:37 AM, Simon Sapin wrote:
> On 02/08/2012 17:26, Eric Snow wrote:
>
>> Sometimes you want to specify an optional method in an abstract base
>> class. Currently we don't have a consistent way of doing so, instead
>> having to mix in the old way of defining "abstract" methods:
>>
>> class MyABC(metaclass=ABCMeta):
>>     ...
>>
>>     def do_something_optional(self):
>>         """An optional method for doing something."""
>>         raise NotImplementedError
>
>
> Hi,
>
> What's wrong with this?
>
> I think that the only thing making methods with @abstractmethod special is
> that they prevent instantiation if they are not overridden. If we remove
> that, the only remaining difference is in terms of documentation. Maybe a
> new Sphinx directive could help?
>
> What would an optional abstract method do when it is called? Raise
> NotImplementedError?
>
> I'm not against this idea but I just don't see how @optionalabstractmethod
> is better than raise NotImplementedError

Yeah, there isn't a huge win (particularly considering the
implementation would tread on perilous ground*). The main motivation
is consistency in the definition of abstract base classes.

How would it work? Yeah, probably a lot like the NotImplementedError
technique works.
Or perhaps it would simply be removed by the ABCMeta when the class is
built.

-eric

* If you have an hour or two to spare, manually trace through the abc
module and Objects/typeobject.c to see how abstract base classes work
under the hood. Totally worth it!

From ncoghlan at gmail.com  Fri Aug  3 06:45:02 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 3 Aug 2012 14:45:02 +1000
Subject: [Python-ideas] abc.optionalabstractmethod
In-Reply-To:
References:
Message-ID:

On Fri, Aug 3, 2012 at 1:26 AM, Eric Snow wrote:
> Sometimes you want to specify an optional method in an abstract base
> class. Currently we don't have a consistent way of doing so, instead
> having to mix in the old way of defining "abstract" methods:
>
> class MyABC(metaclass=ABCMeta):
>     ...
>
>     def do_something_optional(self):
>         """An optional method for doing something."""
>         raise NotImplementedError

Really, the phrase "optional abstract method" is a bit of an oxymoron.
The point of an abstract method is that subclasses *must* implement
it, either so that concrete method implementations provided by the
base class can rely on it, or so that users of the class can rely on
it.

For interface methods that aren't mandatory, there are already a
few different signalling methods currently in use within the language
core and standard library:

1. Attribute checks
   Call method X if it exists, call method Y otherwise (e.g. the
   iterator protocol, which falls back to the sequence iterator if
   __iter__ doesn't exist, but __getitem__ does, and the backwards
   compatibility in the path entry finder protocol, which tries
   find_loader first, then falls back to find_module)

2. Setting attributes to None
   Call method X if it is not None (e.g. the hash protocol, where we
   added support for __hash__ = None to cope with the fact that object
   instances are hashable, but instances of subclasses that override
   __eq__ without also overriding __hash__ are not)

3. Implementing the method, but returning NotImplemented
   Call method if defined, treat a result of NotImplemented as if the
   method did not exist at all. (e.g. the binary operator protocols)

4. Throwing a particular exception (typically NotImplementedError)
   Call method if defined, treat the designated exception as if the
   method does not exist at all (e.g. optional methods in the IO stack)

Now, the interesting point here is that these are all things that
can't easily be defined in a way that is open to *programmatic*
introspection. Here's how the four of them would currently look in a
base class:

    # optional_method(appropriate_signature) can be defined by X instances
    # See the docs for details (users should handle the case where this
    # method is not defined)

    # optional_method(appropriate_signature) can be defined by X instances
    # See the docs for details (users should handle the case where this
    # method is set to None)
    optional_method = None

    def optional_method(appropriate_signature):
        """Details on the significance of optional_method

        Users should handle the case where this method returns
        NotImplemented
        """
        return NotImplemented

    def optional_method(appropriate_signature):
        """Details on the significance of optional_method

        Users should handle the case where this method raises
        NotImplementedError
        """
        raise NotImplementedError

Expanding the ABC descriptor lexicon to accurately describe those 4
protocol variants (or at least some of them) in a base class may be
worthwhile, but -1 on merely adding yet another variant.

For example (this is rather ugly and I don't actually like it, but I
wanted to illustrate the general point that the descriptor protocol
allows all 4 cases to be distinguished):

    @optional(raise_on_instance_get=AttributeError)
    def optional_method(appropriate_signature):
        ...

    @optional(value_on_instance=None)
    def optional_method(appropriate_signature):
        ...

    @optional(result_on_method_call=NotImplemented)
    def optional_method(appropriate_signature):
        ...

    @optional(raise_on_method_call=NotImplementedError)
    def optional_method(appropriate_signature):
        ...

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ericsnowcurrently at gmail.com  Fri Aug  3 07:40:37 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 2 Aug 2012 23:40:37 -0600
Subject: [Python-ideas] abc.optionalabstractmethod
In-Reply-To:
References:
Message-ID:

On Thu, Aug 2, 2012 at 10:45 PM, Nick Coghlan wrote:
> Now, the interesting point here is that these are all things that
> can't easily be defined in a way that is open to *programmatic*
> introspection.

I hadn't thought of it in these terms, but deep down this is exactly
the itch that was making me squirm.

> Expanding the ABC descriptor lexicon to accurately describe those 4
> protocol variants (or at least some of them) in a base class may be
> worthwhile, but -1 on merely adding yet another variant.

Agreed on both counts. I think your example is along the right lines.
To be honest, I'll probably let this idea ruminate a while, but your
explanation is meaningful right now. Thanks.

-eric

From jimjjewett at gmail.com  Tue Aug  7 00:55:00 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 6 Aug 2012 18:55:00 -0400
Subject: [Python-ideas] abc.optionalabstractmethod
In-Reply-To:
References:
Message-ID:

On 8/3/12, Nick Coghlan wrote:

> Really, the phrase "optional abstract method" is a bit of an oxymoron.

Only if you place too much faith in the details of the current
implementation.

One major use of an abstract class is to document an interface. The
abstract class can only define (or even document, really, if you value
docstrings) methods that it defines.

> For interface methods that aren't mandatory, there are already a
> few different signalling methods currently in use within the language
> core and standard library:

> 1. Attribute checks
>    Call method X if it exists, call method Y otherwise (e.g. the
>    iterator protocol, which falls back to the sequence iterator if
>    __iter__ doesn't exist, but __getitem__ does, and the backwards
>    compatibility in the path entry finder protocol, which tries
>    find_loader first, then falls back to find_module)

Assuming that another method will be called is a bit too restrictive,
but this otherwise works well -- until you inherit from the abstract
class, in which case the method will exist, even if it wasn't
implemented.

The least bad solution I've found is to use co-operative super, and
write an identity function for the base class. Then if you need to
know whether or not it was actually implemented, you can at least do
an identity check on the method, if you're careful enough.

-jJ

From steve at pearwood.info  Tue Aug  7 02:35:16 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 07 Aug 2012 10:35:16 +1000
Subject: [Python-ideas] abc.optionalabstractmethod
In-Reply-To:
References:
Message-ID: <502062C4.6050801@pearwood.info>

On 07/08/12 08:55, Jim Jewett wrote:
> On 8/3/12, Nick Coghlan wrote:
>
>> Really, the phrase "optional abstract method" is a bit of an oxymoron.
>
> Only if you place too much faith in the details of the current
> implementation.

I don't see that this has anything to do with implementation.

> One major use of an abstract class is to document an interface. The
> abstract class can only define (or even document, really, if you value
> docstrings) methods that it defines.

I don't know that I accept that abstract classes are documentation. It
seems to me that to be documentation, it has to be, well,
documentation. Not a class hierarchy. The documentation might include
the fact that something is an abstract class, but merely making
something an abstract class is not in and of itself documentation.

But even putting that aside, the interface that it (implicitly?)
documents is surely *required* interface. If you can neglect to
override an abstract method with impunity, then it shouldn't have been
an abstract method in the first place.

As I see it, "optional abstract method" means "You *must* implement
this method, but if you don't, that's okay too".

--
Steven

From songofacandy at gmail.com  Tue Aug  7 02:51:54 2012
From: songofacandy at gmail.com (INADA Naoki)
Date: Tue, 7 Aug 2012 09:51:54 +0900
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To: <87bokn7gnk.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <87fwa7a945.fsf@uwakimon.sk.tsukuba.ac.jp>
 <1339102104.17621.YahooMailClassic@web161503.mail.bf1.yahoo.com>
 <87k3zb7qq3.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87bokn7gnk.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On Wed, Jun 13, 2012 at 5:35 PM, Stephen J. Turnbull wrote:
> Guido van Rossum writes:
>
>  > I'm not sure about a name, but it might well be called set_encoding().
>
> I would still prefer "initialize_encoding" or something like that, but
> the main thing I was worried about was a "consenting adults" function
> that shouldn't be called after I/O, but *could* be.

I still don't understand why Python can't support using it after I/O.
Is this code wrong?
https://gist.github.com/3280063

--
INADA Naoki

From wuwei23 at gmail.com  Tue Aug  7 04:09:49 2012
From: wuwei23 at gmail.com (alex23)
Date: Mon, 6 Aug 2012 19:09:49 -0700 (PDT)
Subject: [Python-ideas] abc.optionalabstractmethod
In-Reply-To: <502062C4.6050801@pearwood.info>
References: <502062C4.6050801@pearwood.info>
Message-ID: <6da9beeb-260f-4940-add6-0cb8d8dd33de@nj2g2000pbc.googlegroups.com>

On Aug 7, 10:35 am, Steven D'Aprano wrote:
> I don't know that I accept that abstract classes are documentation. It seems
> to me that to be documentation, it has to be, well, documentation. Not a class
> hierarchy. The documentation might include the fact that something is an
> abstract class, but merely making something an abstract class is not in and of
> itself documentation.

Why not? What could I say here in textual documentation that isn't
made obvious by declaring the interface?

    import abc

    class XBuilder(object):
        """Interface for setting up, updating & deleting objects for X"""
        __metaclass__ = abc.ABCMeta

        @abc.abstractmethod
        def setup(self):
            """Get the object ready for use"""

        @abc.abstractmethod
        def update(self):
            """Modify the object to reflect recent changes"""

        @abc.abstractmethod
        def delete(self):
            """Remove the object"""

Good code _is_ documentation.

From wuwei23 at gmail.com  Tue Aug  7 04:18:33 2012
From: wuwei23 at gmail.com (alex23)
Date: Mon, 6 Aug 2012 19:18:33 -0700 (PDT)
Subject: [Python-ideas] abc.optionalabstractmethod
In-Reply-To: <502062C4.6050801@pearwood.info>
References: <502062C4.6050801@pearwood.info>
Message-ID:

On Aug 7, 10:35 am, Steven D'Aprano wrote:
> But even putting that aside, the interface that it (implicitly?) documents is
> surely *required* interface. If you can neglect to override an abstract method
> with impunity, then it shouldn't have been an abstract method in the first place.

This I completely agree with. I don't understand the point of
declaring an abstract method that you cannot guarantee will be on an
implementation.
Wouldn't it make more sense to define the optional aspect as a
secondary interface?

    import abc

    class IStarter(object):
        __metaclass__ = abc.ABCMeta

        @abc.abstractmethod
        def start(self):
            """start it up"""

    class IStopper(object):
        __metaclass__ = abc.ABCMeta

        @abc.abstractmethod
        def stop(self):
            """shut it down"""

    class StartOnly(object):
        def start(self):
            print "starting!"

    class StartAndStop(object):
        def start(self):
            print "starting!!"

        def stop(self):
            print "stopping!!"

    IStarter.register(StartOnly)
    IStarter.register(StartAndStop)
    IStopper.register(StartAndStop)

From ncoghlan at gmail.com Tue Aug 7 07:02:49 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 7 Aug 2012 15:02:49 +1000
Subject: [Python-ideas] abc.optionalabstractmethod
In-Reply-To:
References: <502062C4.6050801@pearwood.info>
Message-ID:

On Tue, Aug 7, 2012 at 12:18 PM, alex23 wrote:
> On Aug 7, 10:35 am, Steven D'Aprano wrote:
>> But even putting that aside, the interface that it (implicitly?) documents is
>> surely *required* interface. If you can neglect to override an abstract method
>> with impunity, then it shouldn't have been an abstract method in the first place.
>
> This I completely agree with. I don't understand the point of
> declaring an abstract method that you cannot guarantee will be on an
> implementation.
>
> Wouldn't it make more sense to define the optional aspect as a
> secondary interface?

No. That leads to a combinatorial explosion of interface classes that
only makes sense in languages which don't readily support attribute
level introspection (*cough*such-as-Java*cough*).

Look at the IO stack for an example - think how much more complicated
it would need to be if implementations weren't allowed to raise
NotImplementedError for unsupported methods and each of those methods
thus had an associated ABC.

As I wrote in my earlier message, there are several ways to define a
runtime protocol to check for optional methods in Python today.
The problem is that *NONE* of them are particularly open to
programmatic introspection, thus it is difficult for automatic
documentation tools to flag them correctly the way they can flag
abstract methods.

To avoid people having to dig up the alternatives (all of which are
used in the core or standard library in various situations):

- the method/attribute may be missing entirely
- the method/attribute is always present, but may be None
- the method is always present, but returns NotImplemented by default
- the method is always present, but raises NotImplementedError by default

The first case you can't indicate in an ABC at all (except in a comment).

The second case makes it difficult to add a docstring (although you
can use a property to get around that).

The latter two cases mean you have to actually *call* the method to
find out if it is implemented or not.

So, here's a concrete suggestion that would be suitable for many cases
where this is desired (although obviously not all such cases, due to
backwards compatibility concerns, as well as cases where it is
desirable that the base implementation be suitable for termination of
a chain of cooperative super calls):

1. Add a __call__ definition to type(NotImplemented) that accepts
   arbitrary arguments and always raises NotImplementedError

2. Add an @optional decorator that functions roughly as follows

    class optional:
        def __init__(self, f):
            functools.update_wrapper(self, f)
        def __get__(self, *args):
            return NotImplemented

A demo (using a custom "not implemented" marker and property rather
than a custom descriptor):

>>> def _not_implemented(*args, **kwds):
...     raise NotImplementedError
...
>>> def optional(f):
...     return property((lambda self: _not_implemented), doc=f.__doc__)
...
>>> class C:
...     @optional
...     def may_not_be_implemented(signature_details):
...         """Documentation of this method"""
...
>>> C.may_not_be_implemented.__doc__
'Documentation of this method'
>>> C.may_not_be_implemented
>>> C().may_not_be_implemented
>>> C().may_not_be_implemented is _not_implemented
True
>>> C().may_not_be_implemented()
Traceback (most recent call last):
  File "", line 1, in
  File "", line 2, in _not_implemented
NotImplementedError

That specific approach does have the problem that you lose the
signature details, so you'd probably want a custom descriptor that is
recognised by inspect.signature() rather than reusing the property
descriptor as I have done here. A custom descriptor would also be
easier to pick out on the class object without needing an instance.

Such an approach would improve the expressiveness of the ABC dialect,
while remaining broadly consistent with current practices for marking
optional methods. Most importantly, it would provide a fairly obvious
way to flag optional methods in APIs in a way that is open to static
introspection.

However, it's not a universal solution. As noted above, it's not
usable in any cases where you want the optional method to be usable as
a terminal for multiple inheritance. It may be that, with some API
tweaks, you could convert optional to a decorator factory that also
handles the "suitable foundation for multiple inheritance" case.

Perhaps the answer is even simpler than what I have above: perhaps the
decorator could just set "f.__implemented__ = False", allowing
introspection via "getattr(method, '__implemented__', True)".

This is something that really needs to be explored before making a
concrete proposal. Looking at the various ways optional methods are
used in the standard library and interpreter core would be a good
place to start.

Cheers,
Nick.
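[Editorial sketch] The simpler idea floated at the end of the message above, a decorator that only sets `__implemented__ = False`, might look roughly like the following. This is a minimal illustration written for Python 3, not anything that exists in `abc`; the names `optional`, `is_implemented`, `Stream`, `PipeStream` and `FileStream` are all hypothetical.

```python
import abc


def optional(func):
    """Flag func as an optional method that this class does not implement."""
    func.__implemented__ = False  # hypothetical marker attribute
    return func


def is_implemented(obj, name):
    """Return True if obj has an attribute `name` not flagged as unimplemented."""
    method = getattr(obj, name, None)
    if method is None:
        return False
    # Bound methods forward unknown attribute lookups to the wrapped
    # function, so the marker is visible on instances as well as classes.
    return getattr(method, "__implemented__", True)


class Stream(abc.ABC):
    @abc.abstractmethod
    def read(self):
        """Return the next chunk of data."""

    @optional
    def seek(self, offset):
        """Move to offset (optional; subclasses may leave this out)."""
        raise NotImplementedError


class PipeStream(Stream):
    def read(self):
        return b""


class FileStream(Stream):
    def read(self):
        return b""

    def seek(self, offset):
        # Overriding replaces the flagged function, so the marker disappears.
        return offset
```

With this sketch, `is_implemented(PipeStream(), "seek")` is false and `is_implemented(FileStream(), "seek")` is true, and neither check has to call `seek()`. The docstring and signature of the optional method also stay visible to documentation tools.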
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From x at jwp.io Tue Aug 7 08:51:07 2012 From: x at jwp.io (James William Pye) Date: Mon, 6 Aug 2012 23:51:07 -0700 Subject: [Python-ideas] abc.optionalabstractmethod In-Reply-To: References: <502062C4.6050801@pearwood.info> Message-ID: <45664F19-C5F4-4EA8-91AC-59828CA013FB@jwp.io> On Aug 6, 2012, at 10:02 PM, Nick Coghlan wrote: > No. That leads to a combinatorial explosion of interface classes that > only makes sense in languages which don't readily support attribute > level introspection (*cough*such-as-Java*cough*). Yes, so don't do that. =) What's the matter with having a set of *distinct* interface classes and having a given *implementation* cherry pick the appropriate ones? From ncoghlan at gmail.com Tue Aug 7 09:42:10 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 7 Aug 2012 17:42:10 +1000 Subject: [Python-ideas] abc.optionalabstractmethod In-Reply-To: <45664F19-C5F4-4EA8-91AC-59828CA013FB@jwp.io> References: <502062C4.6050801@pearwood.info> <45664F19-C5F4-4EA8-91AC-59828CA013FB@jwp.io> Message-ID: On Tue, Aug 7, 2012 at 4:51 PM, James William Pye wrote: > On Aug 6, 2012, at 10:02 PM, Nick Coghlan wrote: >> No. That leads to a combinatorial explosion of interface classes that >> only makes sense in languages which don't readily support attribute >> level introspection (*cough*such-as-Java*cough*). > > Yes, so don't do that. =) > > What's the matter with having a set of *distinct* interface classes and having > a given *implementation* cherry pick the appropriate ones? If the additional ABCs are defined with an appropriate fallback to a method existence check (ala collections.Hashable) then it may be OK. However, the long "implements" lists that Java is prone to is precisely the boilerplate I consider unacceptable (and antithetical to ducktyping). 
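[Editorial sketch] The "fallback to a method existence check (ala collections.Hashable)" mentioned above is normally done with `__subclasshook__`. The following is a minimal Python 3 illustration; `Stopper`, `StartOnly`, `StartAndStop` and `NoStop` are hypothetical names chosen to echo the earlier IStarter/IStopper example, and the "set it to None to opt out" convention is borrowed from `__hash__`.

```python
import abc


class Stopper(abc.ABC):
    """Hypothetical secondary interface: anything with a usable stop()."""

    @abc.abstractmethod
    def stop(self):
        """Shut it down."""

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Stopper:
            for base in C.__mro__:
                if "stop" in base.__dict__:
                    # An explicit `stop = None` opts out, like `__hash__ = None`.
                    return base.__dict__["stop"] is not None
        # No decision here: fall back to normal inheritance and registration.
        return NotImplemented


class StartOnly:
    def start(self):
        return "starting"


class StartAndStop(StartOnly):
    def stop(self):
        return "stopping"


class NoStop(StartAndStop):
    stop = None  # explicitly withdraws the inherited method
```

No class has to list `Stopper` as a base or call `register()`: `isinstance(StartAndStop(), Stopper)` is true purely because a `stop` method exists, while `StartOnly` and `NoStop` are not considered Stoppers.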
The point of this discussion is that there really isn't an obvious
general purpose mechanism that easily lets you document the existence
of an optional method in an ABC (such that tools like pydoc will see
it), without inadvertently making it look like that method is
implemented.

The "x.__hash__ is None" trick that is used to get around
object.__hash__ existing by default certainly works, but is definitely
not the typical behaviour. Other approaches, like returning
NotImplemented or raising NotImplementedError, have the problem that
the only way to find out if they're implemented or not is to call
them, which may have unwanted side effects if they *are* implemented.

That's why the IO stack has methods like "seekable()" - to ask the
question "is the seek() method implemented?".

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From victor.stinner at gmail.com Tue Aug 7 12:20:30 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 7 Aug 2012 12:20:30 +0200
Subject: [Python-ideas] changing sys.stdout encoding
In-Reply-To:
References: <87fwa7a945.fsf@uwakimon.sk.tsukuba.ac.jp>
 <1339102104.17621.YahooMailClassic@web161503.mail.bf1.yahoo.com>
 <87k3zb7qq3.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87bokn7gnk.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

The write buffer can be flushed, so I don't see the problem of changing the
encoding of stdout and stderr (except potential mojibake).

For stdin, TextIOWrapper has a readahead algorithm, so changing the
encoding may seek backward. It cannot be done if stdin is not seekable
(e.g. if stdin is a pipe).

I wrote a Python implementation of set_encoding, see my patch attached to
the issue.
http://bugs.python.org/15216

Victor

On 7 August 2012 02:52, "INADA Naoki" wrote:

> On Wed, Jun 13, 2012 at 5:35 PM, Stephen J. Turnbull
> wrote:
> > Guido van Rossum writes:
> >
> > > I'm not sure about a name, but it might well be called set_encoding().
> > > > I would still prefer "initialize_encoding" or something like that, but > > the main thing I was worried about was a "consenting adults" function > > that shouldn't be called after I/O, but *could* be. > > I still don't understand why Python can't support using it after I/O. > Is this code wrong? > https://gist.github.com/3280063 > > -- > INADA Naoki > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Tue Aug 7 12:35:38 2012 From: masklinn at masklinn.net (Masklinn) Date: Tue, 7 Aug 2012 12:35:38 +0200 Subject: [Python-ideas] changing sys.stdout encoding In-Reply-To: References: <1338916801.8871.YahooMailClassic@web161506.mail.bf1.yahoo.com> <87zk8ha6tv.fsf@uwakimon.sk.tsukuba.ac.jp> <87wr3l9kzq.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 2012-06-06, at 07:49 , Nick Coghlan wrote: > On Wed, Jun 6, 2012 at 1:28 PM, Stephen J. Turnbull wrote: >> For both of these cases a command-line option to initialize the >> encoding would be convenient. > > Before adding yet-another-command-line-option, the cases where the > existing environment variable support can't be used from the command > line, but a new option could be, should be clearly enumerated. > > $ python3 > Python 3.2.1 (default, Jul 11 2011, 18:54:42) > [GCC 4.6.1 20110627 (Red Hat 4.6.1-1)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import sys >>>> sys.stdout.encoding > 'UTF-8' >>>> > $ PYTHONIOENCODING=latin-1 python3 > Python 3.2.1 (default, Jul 11 2011, 18:54:42) > [GCC 4.6.1 20110627 (Red Hat 4.6.1-1)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import sys >>>> sys.stdout.encoding > 'latin-1' >>>> LC_CTYPE also works, and is not specific to Python. 
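[Editorial sketch] The environment-variable behaviour shown in the transcript above can also be checked from a script. The helper below is a minimal illustration, assuming a Python 3 interpreter is available as `sys.executable`; the function name `child_stdout_encoding` is made up for this example.

```python
import os
import subprocess
import sys


def child_stdout_encoding(value):
    """Start a child interpreter with PYTHONIOENCODING=value and
    return what that child reports as sys.stdout.encoding."""
    env = dict(os.environ, PYTHONIOENCODING=value)
    output = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    # Encoding names are plain ASCII, so decoding the bytes is safe.
    return output.decode("ascii").strip()
```

Calling `child_stdout_encoding("latin-1")` should report the same encoding as the second session in the transcript. Note that this only sets the encoding at interpreter start-up, which is the limitation the thread is about.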
From x at jwp.io Tue Aug 7 12:44:57 2012 From: x at jwp.io (James William Pye) Date: Tue, 7 Aug 2012 03:44:57 -0700 Subject: [Python-ideas] abc.optionalabstractmethod In-Reply-To: References: <502062C4.6050801@pearwood.info> <45664F19-C5F4-4EA8-91AC-59828CA013FB@jwp.io> Message-ID: <9D8F56A8-99A4-47DA-AA6B-5C77499B5CE0@jwp.io> On Aug 7, 2012, at 12:42 AM, Nick Coghlan wrote: > However, the long "implements" lists that Java is prone to is > precisely the boilerplate I consider unacceptable (and antithetical to > ducktyping). *shrug* I figure if you run into a case where you see a long combination in common use, go ahead and make another ABC consisting of that combination. Or just throw them in a sequence and map a methodcaller. > there really isn't an obvious > general purpose mechanism that easily lets you document the existence > of an optional method in an ABC ISTM the proposed mechanism is a bit of a contradiction, and the value provided to documentation extractors does not appear to be particularly significant. However, annotations on the implementation may be a reasonable solution for documentation extractors: def foo() -> NotImplemented: ... > That's why the IO stack has methods like "seekable()" - to ask the > question "is the seek() method implemented?". Well, hopefully for ESPIPE?Which doesn't appear to be the case.. =( From matt at whoosh.ca Tue Aug 7 21:03:27 2012 From: matt at whoosh.ca (Matt Chaput) Date: Tue, 07 Aug 2012 15:03:27 -0400 Subject: [Python-ideas] compatibility triples for binary distributions In-Reply-To: <5010A405.10704@pearwood.info> References: <4a1e9d73-5778-44a2-a934-cfca092fb976@googlegroups.com> <92fbab11-ce4e-4c8a-b7cb-3f79ca1e0bc2@googlegroups.com> <5010A405.10704@pearwood.info> Message-ID: <5021667F.5020302@whoosh.ca> On 25/07/2012 9:57 PM, Steven D'Aprano wrote: > Some of these may be dead projects; others may be working > but not maintained; some are actively maintained. 
Some self-impose obscurity by choosing un-google-able names... > HoPe From steve at pearwood.info Tue Aug 7 21:09:42 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 08 Aug 2012 05:09:42 +1000 Subject: [Python-ideas] compatibility triples for binary distributions In-Reply-To: <5021667F.5020302@whoosh.ca> References: <4a1e9d73-5778-44a2-a934-cfca092fb976@googlegroups.com> <92fbab11-ce4e-4c8a-b7cb-3f79ca1e0bc2@googlegroups.com> <5010A405.10704@pearwood.info> <5021667F.5020302@whoosh.ca> Message-ID: <502167F6.5050108@pearwood.info> On 08/08/12 05:03, Matt Chaput wrote: > On 25/07/2012 9:57 PM, Steven D'Aprano wrote: >> Some of these may be dead projects; others may be working >> but not maintained; some are actively maintained. > > Some self-impose obscurity by choosing un-google-able names... > >> HoPe Second link on Google for "hope python": http://kenai.com/projects/hope http://www.google.com.au/search?q=hope+python -- Steven From matt at whoosh.ca Tue Aug 7 21:11:39 2012 From: matt at whoosh.ca (Matt Chaput) Date: Tue, 07 Aug 2012 15:11:39 -0400 Subject: [Python-ideas] compatibility triples for binary distributions In-Reply-To: <502167F6.5050108@pearwood.info> References: <4a1e9d73-5778-44a2-a934-cfca092fb976@googlegroups.com> <92fbab11-ce4e-4c8a-b7cb-3f79ca1e0bc2@googlegroups.com> <5010A405.10704@pearwood.info> <5021667F.5020302@whoosh.ca> <502167F6.5050108@pearwood.info> Message-ID: <5021686B.9070608@whoosh.ca> On 07/08/2012 3:09 PM, Steven D'Aprano wrote: > Second link on Google for "hope python": Huh, weird, for "python hope" it's below the fold :) From jimjjewett at gmail.com Tue Aug 7 21:12:30 2012 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 7 Aug 2012 15:12:30 -0400 Subject: [Python-ideas] abc.optionalabstractmethod In-Reply-To: <502062C4.6050801@pearwood.info> References: <502062C4.6050801@pearwood.info> Message-ID: On 8/6/12, Steven D'Aprano wrote: > On 07/08/12 08:55, Jim Jewett wrote: >> One major use of an abstract class is to 
document an interface. The
>> abstract class can only define (or even document, really, if you value
>> docstrings) methods that it defines.

> I don't know that I accept that abstract classes are documentation. It seems
> to me that to be documentation, it has to be, well, documentation.

If the code is short and seems to be clear, many people will never
look at the external documentation. In that case, leaving surprises
out of the docstring is almost as bad as not documenting them at all.

> ... merely making something an abstract class is not in and of
> itself documentation.

Agreed, but that abstract class itself should clearly document the
contract that concrete implementations (and their users) must follow.

> But even putting that aside, the interface that it (implicitly?) documents
> is surely *required* interface.

Separating, for example, pen.draw from gunfighter.draw was one of the
primary use cases of abstract base classes.

_abcoll.py is dealing with infrastructure where python already defines
the double-underscore methods; for now, pretend it were an ordinary
domain module, using normal unreserved names.

A Container need not implement __len__. Since we're pretending that
the name wouldn't already be reserved by python, a concrete Container
could use the name for some domain-specific attribute, such as
LoudEngineNoise.

Every specialized container in the module (Sequence, Set, Mapping,
even MappingView) does need to have a __len__ method (ruling out some
factory objects). They even all define it with the same meaning, by
also inheriting from Sized.

I would prefer to see the name reserved and defined at the Container
level. It would still be OK to create a Container that could not tell
you its length, but it would not be OK to create a Container using
__len__ for some other purpose.

For reasons that are no longer obvious, a MappingView needs to be
Sized, but does not need to be an Iterable or a Container.
I haven't yet seen a use for MappingView except to group the derived classes KeysView, ItemsView, and ValuesView -- all of which are Iterable Containers. KeysView and ItemsView also contain identical _from_iterable methods.* Reusing the names associated with Iterable or Container (remember we're pretending they weren't reserved by the double-underscore) for some other use would be very confusing. So would reusing _from_iterable, if it weren't semi-private. MappingView should be able to reserve these names and forbid such unrelated use. I'm not sure it needs to actually enforce the prohibition, but it should be able to at least express the prohibition in a manner visible to introspection. * Well, identical code. I'm assuming that the expected input type differs. -jJ From python at mrabarnett.plus.com Tue Aug 7 22:25:20 2012 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 07 Aug 2012 21:25:20 +0100 Subject: [Python-ideas] Why no dict.discard? Message-ID: <502179B0.5020801@mrabarnett.plus.com> The 'set' class has a 'discard' method, but dict doesn't. I know that I can just delete the key and catch the exception if it's absent, but I can also 'remove' a member from a set and catch the exception if it's absent, OR just 'discard' it to avoid an exception. How much interest would there be in adding 'discard' to dict too? From _ at lvh.cc Tue Aug 7 22:30:35 2012 From: _ at lvh.cc (Laurens Van Houtven) Date: Tue, 7 Aug 2012 22:30:35 +0200 Subject: [Python-ideas] Why no dict.discard? In-Reply-To: <502179B0.5020801@mrabarnett.plus.com> References: <502179B0.5020801@mrabarnett.plus.com> Message-ID: <98776C9C-A06D-4A42-AB21-6A8AC9735A1C@lvh.cc> FWIW I use d.pop(k, None) to mean "just remove this already regardless of whether or not it's present now". cheers lvh On 07 Aug 2012, at 22:25, MRAB wrote: > The 'set' class has a 'discard' method, but dict doesn't. 
> > I know that I can just delete the key and catch the exception if it's > absent, but I can also 'remove' a member from a set and catch the > exception if it's absent, OR just 'discard' it to avoid an exception. > > How much interest would there be in adding 'discard' to dict too? > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From python at mrabarnett.plus.com Tue Aug 7 22:42:38 2012 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 07 Aug 2012 21:42:38 +0100 Subject: [Python-ideas] Why no dict.discard? In-Reply-To: <98776C9C-A06D-4A42-AB21-6A8AC9735A1C@lvh.cc> References: <502179B0.5020801@mrabarnett.plus.com> <98776C9C-A06D-4A42-AB21-6A8AC9735A1C@lvh.cc> Message-ID: <50217DBE.5040505@mrabarnett.plus.com> On 07/08/2012 21:30, Laurens Van Houtven wrote: > On 07 Aug 2012, at 22:25, MRAB wrote: > >> The 'set' class has a 'discard' method, but dict doesn't. >> >> I know that I can just delete the key and catch the exception if it's >> absent, but I can also 'remove' a member from a set and catch the >> exception if it's absent, OR just 'discard' it to avoid an exception. >> >> How much interest would there be in adding 'discard' to dict too? >> > FWIW I use d.pop(k, None) to mean "just remove this already regardless > of whether or not it's present now". > Thanks for that. From dholth at gmail.com Wed Aug 8 13:14:39 2012 From: dholth at gmail.com (Daniel Holth) Date: Wed, 8 Aug 2012 07:14:39 -0400 Subject: [Python-ideas] bdist naming scheme (compatibility tags) PEP Message-ID: Today pypy and CPython's "setup.py bdist" generate the same filename but incompatible bdists. This makes it difficult to share both bdists in the same folder or index. Instead, they should generate different bdist filenames because one won't work with the other implementation. 
This PEP specifies a tagging system that includes enough information
to decide whether a particular bdist is expected to work on a
particular Python.

Also at https://bitbucket.org/dholth/python-peps/raw/98cd36228c2e/pep-CTAG.txt

Thanks for your feedback,

Daniel Holth
-------------- next part --------------
PEP: CTAG
Title: Compatibility Tags for Built Distributions
Version: $Revision$
Last-Modified: 07-Aug-2012
Author: Daniel Holth
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Jul-2012


Abstract
========

This PEP specifies a tagging system to indicate with which versions of
Python a built or binary distribution is compatible. A set of three
tags indicate which Python implementation and language version, ABI,
and platform a built distribution requires. The tags are terse because
they will be included in filenames.


Rationale
=========

Today "python setup.py bdist" generates the same filename on PyPy
and CPython, but an incompatible archive, making it inconvenient to
share built distributions in the same folder or index. Instead, built
distributions should have a file naming convention that includes enough
information to decide whether or not a particular archive is compatible
with a particular implementation.

Previous efforts come from a time where CPython was the only important
implementation and the ABI was the same as the Python language release.
This specification improves upon the older schemes by including the Python
implementation, language version, ABI, and platform as a set of tags.

By comparing the tags it supports with the tags listed by the
distribution, an installer can make an educated decision about whether
to download a particular built distribution without having to read its
full metadata.


Overview
========

The tag format is {python tag}-{abi tag}-{platform tag}

python tag
    ‘py27’, ‘cp33’
abi tag
    ‘cp33dmu’, ‘none’
platform tag
    ‘linux_x86_64’, ‘any’
For example, the tag py27-none-any indicates compatibility with Python 2.7
(any Python 2.7 implementation) with no ABI requirement, on any platform.


Details
=======

Python Tag
----------

The Python tag indicates both the implementation and the language version
required by a distribution. Major implementations have abbreviated
codes, initially:

* py: Generic Python (does not require implementation-specific features)
* cp: CPython
* ip: IronPython
* pp: PyPy
* jy: Jython

Other Python implementations should use `sys.implementation.name`.

The language version is `py_version_nodot`, or just the major version
`2` or `3` for many pure-Python distributions. CPython gets away with
no dot, but if one is needed the underscore `_` is used instead.

A single-source Python 2/3 compatible distribution can use the compound
tag `py2.py3`. See `Compressed Tag Sets`, below.

ABI Tag
-------

The ABI tag indicates which Python ABI is required by any included
extension modules. For implementation-specific ABIs, the implementation
is abbreviated in the same way as the Python Tag, e.g. `cp33d` would be
the CPython 3.3 ABI with debugging.

As a special case, the CPython stable ABI starts with `py`; `py32` is that
ABI with only the operations available from Python 3.2 onward.

Implementations with a very unstable ABI may use the first 6 bytes (as
8 base64-encoded characters) of the SHA-256 hash of their source code
revision and compiler flags, etc, but will probably not have a great need
to distribute binary distributions. Each implementation's community may
decide how to best use the ABI tag.

Platform Tag
------------

The platform tag is simply `distutils.util.get_platform()` with all
hyphens `-` and periods `.` replaced with underscore `_`.


Use
===

The tags are used by installers to decide which built distribution
(if any) to download from a list of potential built distributions.
Installers will have a list of (python, abi, plat) tuples that the current
Python installation can run, sorted by order of preference. Each
built distribution receives a score based on its tag's position in the
list, and the most-preferred distribution is the one that is installed.
If no built distribution matches the list of supported tag tuples then
the installer falls back to installing from the source distribution.
Tags are only compared for equality; they are never greater or less than
another tag, and a tag that 'startswith' another tag is not a subset of
the shorter tag.

For example, an installer running under CPython 3.3 on an imaginary MMIX
system might prefer, in order:

0. (cp33, cp33, mmix)  # built for this specific version of Python
1. (cp33, py32, mmix)  # using the stable ABI as defined by Python 3.2
2. (cp33, none, mmix)  # using no ABI, but still depending on the specific
   platform (e.g. through ctypes or os.system)
3. (cp33, none, any)   # pure-Python distribution for the current Python
4. (py33, none, any)   # pure-Python distribution for the current (generic)
   Python
5. (py32, none, any)   # pure-Python distributions for older versions of
   Python
6. (py31, none, any)   # ""
7. (py30, none, any)   # ""
8. (py3, none, any)    # ""

A distribution that requires CPython 3.3 or CPython 2.7 and has an
optional extension module could distribute built distributions tagged
`cp33-cp3-mmix`, `cp33-none-any`, and `cp27-none-any`. (Our imaginary
program is using 2to3, so the built distribution is not compatible across
major releases.) `cp33-cp3-mmix` gets a score of 1, `cp33-none-any`
gets a score of 3, and `cp27-none-any` is not in the list at all. Since
`cp33-cp3-mmix` has the best score, that built distribution is installed.

A user could instruct their installer to fall back to building from an
sdist more or less often by configuring this list of tags.
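[Editorial sketch] The selection rule described in this section (exact-equality matching against an ordered preference list, lowest position wins) could be sketched as below. This is only an illustration of the rule as worded in the draft, not code from any installer; `SUPPORTED` and `best_distribution` are hypothetical names, the tag triples are taken from the ordered list above, and compressed tag sets are not handled.

```python
# Preference list for the imaginary CPython 3.3 / MMIX installer,
# most preferred first (copied from the ordered list in the draft).
SUPPORTED = [
    ("cp33", "cp33", "mmix"),
    ("cp33", "py32", "mmix"),
    ("cp33", "none", "mmix"),
    ("cp33", "none", "any"),
    ("py33", "none", "any"),
    ("py32", "none", "any"),
    ("py31", "none", "any"),
    ("py30", "none", "any"),
    ("py3", "none", "any"),
]


def best_distribution(supported, available):
    """Return the tag string from `available` that appears earliest in
    `supported`, or None if nothing matches (fall back to the sdist)."""
    rank = {triple: index for index, triple in enumerate(supported)}
    best = None
    for tag in available:
        triple = tuple(tag.split("-"))
        score = rank.get(triple)  # equality only, never prefix matching
        if score is None:
            continue
        if best is None or score < best[0]:
            best = (score, tag)
    return None if best is None else best[1]
```

For example, given the candidates `cp33-py32-mmix`, `cp33-none-any` and `cp27-none-any`, the first one is chosen because it sits at position 1 in the list, and a lone `cp27-none-any` yields `None`.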
Compressed Tag Sets
===================

To allow for compact filenames of bdists that work with more than
one compatibility tag triple, each tag in a filename can instead be a
'.'-separated, sorted, set of tags. For example, pip, a pure-Python
package that is written to run under Python 2 and 3 with the same source
code, could distribute a bdist with the tag `py2.py3-none-any`.
The full list of simple tags is::

    for x in pytag.split('.'):
        for y in abitag.split('.'):
            for z in archtag.split('.'):
                yield '-'.join((x, y, z))

A bdist format that implements this scheme should include the expanded
tags in bdist-specific metadata. This compression scheme can generate
large numbers of unsupported tags and "impossible" tags that are supported
by no Python implementation e.g. "cp33-cp31u-win64", so use it sparingly.


FAQ
===

Can I have a tag `py32+` to indicate a minimum Python minor release?
    No. Inspect the Trove classifiers to determine this level of
    cross-release compatibility. Similar to the announcements "beaglevote
    versions 3.2 and above no longer support Python 1.52", you will
    have to manually keep track of the maximum (PEP-386) release that
    still supports your version of Python.

Why isn't there a `.` in the Python version number?
    CPython has lasted 20+ years without a 3-digit major release. This
    should continue for some time. Other implementations may use _ as
    a delimiter, since both - and . delimit the surrounding filename.

Who will maintain the registry of abbreviated implementations?
    New two-letter abbreviations can be requested on the python-dev
    mailing list. As a rule of thumb, abbreviations are reserved for
    the current 4 most prominent implementations.

Does the compatibility tag go into METADATA or PKG-INFO?
    No. The compatibility tag is part of the built distribution's
    metadata. METADATA / PKG-INFO should be valid for an entire
    distribution, not a single build of that distribution.

Why didn't you mention my favorite Python implementation?
    The abbreviated tags facilitate sharing compiled Python code in a
    public index. Your Python implementation can use this specification
    too, but with longer tags.
    Recall that all "pure Python" built distributions just use 'py'.


References
==========

.. [1] Egg Filename-Embedded Metadata
   (http://peak.telecommunity.com/DevCenter/EggFormats#filename-embedded-metadata)

.. [2] Creating Built Distributions
   (http://docs.python.org/distutils/builtdist.html)

.. [3] PEP 3147 -- PYC Repository Directories
   (http://www.python.org/dev/peps/pep-3147/)


Acknowledgements
================

The author thanks Paul Moore, Nick Coghlan, Mark Abramowitz, and
Mr. Michele Lacchia for their valuable advice and help with this effort.


Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From mal at egenix.com Wed Aug 8 13:56:17 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 08 Aug 2012 13:56:17 +0200
Subject: [Python-ideas] bdist naming scheme (compatibility tags) PEP
In-Reply-To:
References:
Message-ID: <502253E1.6050703@egenix.com>

Daniel Holth wrote:
> Today pypy and CPython's "setup.py bdist" generate the same filename
> but incompatible bdists.

The distutils "bdist" command is just a generic command which then
runs one of the more specific bdist_* commands (via the --formats
option; defaulting to bdist_dumb).

Since each of these produces different output files (installers,
packages, eggs, etc), you should be more specific about which
command you are referring to.

Reading the PEP, I assume you'd like to change the bdist_dumb output
file name only.

> This makes it difficult to share both bdists
> in the same folder or index. Instead, they should generate different
> bdist filenames because one won't work with the other implementation.
> This PEP specifies a tagging system that includes enough information > to decide whether a particular bdist is expected to work on a > particular Python. > > Also at https://bitbucket.org/dholth/python-peps/raw/98cd36228c2e/pep-CTAG.txt > > Thanks for your feedback, > > Daniel Holth > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 08 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-08-25: FrOSCon, St. Augustin, Germany ... 17 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From dholth at gmail.com Wed Aug 8 14:33:09 2012 From: dholth at gmail.com (Daniel Holth) Date: Wed, 8 Aug 2012 08:33:09 -0400 Subject: [Python-ideas] bdist naming scheme (compatibility tags) PEP In-Reply-To: <502253E1.6050703@egenix.com> References: <502253E1.6050703@egenix.com> Message-ID: On Wed, Aug 8, 2012 at 7:56 AM, M.-A. Lemburg wrote: > Daniel Holth wrote: >> Today pypy and CPython's "setup.py bdist" generate the same filename >> but incompatible bdists. > > The distutils "bdist" command is just a generic command which then > runs one of the more specific bdist_* commands (via the --formats > option; defaulting to bdist_dumb). > Reading the PEP, I assume you'd like to change the bdist_dumb output > file name only. Yes, I do mean bdist_dumb in the Rationale. 
The PEP doesn't propose changing any file names. It is just a naming scheme. There is a new format "wheel" that needs this, but the naming scheme should be useful elsewhere, and I need feedback from the implementation communities to get it right. From ericsnowcurrently at gmail.com Wed Aug 8 15:31:35 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 8 Aug 2012 07:31:35 -0600 Subject: [Python-ideas] bdist naming scheme (compatibility tags) PEP In-Reply-To: References: Message-ID: On Aug 8, 2012 5:15 AM, "Daniel Holth" wrote: > > Today pypy and CPython's "setup.py bdist" generate the same filename > but incompatible bdists. This makes it difficult to share both bdists > in the same folder or index. Instead, they should generate different > bdist filenames because one won't work with the other implementation. > This PEP specifies a tagging system that includes enough information > to decide whether a particular bdist is expected to work on a > particular Python Consider using sys.implementation to get name/version. The cache_tag should be particularly helpful. The 2-character approach for implementation names requires unnecessary curating. -eric > > Also at https://bitbucket.org/dholth/python-peps/raw/98cd36228c2e/pep-CTAG.txt > > Thanks for your feedback, > > Daniel Holth > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Wed Aug 8 15:42:14 2012 From: dholth at gmail.com (Daniel Holth) Date: Wed, 8 Aug 2012 09:42:14 -0400 Subject: [Python-ideas] bdist naming scheme (compatibility tags) PEP In-Reply-To: References: Message-ID: On Wed, Aug 8, 2012 at 9:31 AM, Eric Snow wrote: > Consider using sys.implementation to get name/version. The cache_tag should > be particularly helpful. 
The 2-character approach for implementation names > requires unnecessary curating. It will use that for the implementations not mentioned in the initial PEP. From fuzzyman at gmail.com Wed Aug 8 16:11:43 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Wed, 8 Aug 2012 15:11:43 +0100 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values Message-ID: Hey all, True and False are singletons, so if you want to check precisely for True and False then an identity check seems appropriate. However the "Programming Recommendations" section of PEP 8 has this to say on the topic: Don't compare boolean values to True or False using ==. Yes: if greeting: No: if greeting == True: Worse: if greeting is True: http://www.python.org/dev/peps/pep-0008/#programming-recommendations It seems to me that there is an important distinction between testing that an object is either the boolean True or False and merely checking the "truthiness" of an object. Many a bug has been caused by an empty container object (or some other falsey object) falling into an "if not value" clause that was actually meant to check for the presence of False or None. Why does PEP 8 recommend not testing "boolean values" (which to me implies "values you expect to be a bool") using equality or identity, and more specifically why does it say that using an identity check is worse than an equality check? As this is Python-ideas and not python-list, my specific suggestion is to modify the wording of that section - or just removing it altogether as I don't think it can be adequately clarified without using lots of words. All the best, Michael Foord -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dholth at gmail.com Wed Aug 8 16:20:13 2012 From: dholth at gmail.com (Daniel Holth) Date: Wed, 8 Aug 2012 10:20:13 -0400 Subject: [Python-ideas] bdist naming scheme (compatibility tags) PEP In-Reply-To: References: Message-ID: I want to implement this all the way back to Python 2.5... On Wed, Aug 8, 2012 at 9:42 AM, Daniel Holth wrote: > On Wed, Aug 8, 2012 at 9:31 AM, Eric Snow wrote: > >> Consider using sys.implementation to get name/version. The cache_tag should >> be particularly helpful. The 2-character approach for implementation names >> requires unnecessary curating. > > It will use that for the implementations not mentioned in the initial PEP. From guido at python.org Wed Aug 8 16:28:39 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 8 Aug 2012 07:28:39 -0700 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: References: Message-ID: I'd be strongly against changing that rule. If you don't want other types than bool, use an isinstance check. Code using "is True" is most likely a bug. (It's different for None, since that has no second value that is assumed.) --Guido On Wednesday, August 8, 2012, Michael Foord wrote: > Hey all, > > True and False are singletons, so if you want to check precisely for True > and False then an identity check seems appropriate. > > However the "Programming Recommendations" section of PEP 8 has this to say > on the topic: > > Don't compare boolean values to True or False using ==. > > Yes: if greeting: > No: if greeting == True: > Worse: if greeting is True: > > http://www.python.org/dev/peps/pep-0008/#programming-recommendations > > It seems to me that there is an important distinction between testing that > an object is either the boolean True or False and merely checking the > "truthiness" of an object. 
> Many a bug has been caused by an empty container
> object (or some other falsey object) falling into an "if not value" clause
> that was actually meant to check for the presence of False or None.
>
> Why does PEP 8 recommend not testing "boolean values" (which to me implies
> "values you expect to be a bool") using equality or identity, and more
> specifically why does it say that using an identity check is worse than an
> equality check?
>
> As this is Python-ideas and not python-list, my specific suggestion is to
> modify the wording of that section - or just removing it altogether as I
> don't think it can be adequately clarified without using lots of words.
>
> All the best,
>
> Michael Foord
>
> --
>
> http://www.voidspace.org.uk/
>
> May you do good and not evil
>
> May you find forgiveness for yourself and forgive others
> May you share freely, never taking more than you give.
> -- the sqlite blessing http://www.sqlite.org/different.html
>
>
>

-- 
--Guido van Rossum (python.org/~guido)

From ben+python at benfinney.id.au Wed Aug 8 18:18:53 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 09 Aug 2012 02:18:53 +1000
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
References: 
Message-ID: <87628ticgi.fsf@benfinney.id.au>

Michael Foord writes:

> True and False are singletons, so if you want to check precisely for
> True and False then an identity check seems appropriate.

What is a compelling use case for checking precisely for True or False?
Why is a simple boolean expression ("if foo") not good enough?

I can't think of any code checking precisely for True or False that
wouldn't be much improved by using different values altogether. But
maybe you have code examples that would be convincing.
> It seems to me that there is an important distinction between testing
> that an object is either the boolean True or False and merely checking
> the "truthiness" of an object.

The distinction is real and important, but that doesn't mean both sides
are useful, or that one shouldn't be deprecated.

> Many a bug has been caused by an empty container object (or some other
> falsey object) falling into an "if not value" clause that was actually
> meant to check for the presence of False or None.

What bugs do you have in mind?

-- 
 \     "I see little commercial potential for the Internet for at |
  `\                    least ten years." -- Bill Gates, 1994 |
_o__)                                                              |
Ben Finney

From tjreedy at udel.edu Wed Aug 8 18:51:13 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 08 Aug 2012 12:51:13 -0400
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To: 
References: 
Message-ID: 

On 8/8/2012 10:11 AM, Michael Foord wrote:
> Hey all,
>
> True and False are singletons, so if you want to check precisely for
> True and False then an identity check seems appropriate.
>
> However the "Programming Recommendations" section of PEP 8 has this to
> say on the topic:
>
> Don't compare boolean values to True or False using ==.
>
> Yes: if greeting:
> No: if greeting == True:
> Worse: if greeting is True:
>
> http://www.python.org/dev/peps/pep-0008/#programming-recommendations

I understand 'boolean values' to mean cases where you KNOW that the value
is a boolean. For instance 'isinstance(x, tuple)'. In that situation,
'if isinstance(x, tuple) is True' is a stupid redundancy, since the
actual test, with the internal comparison, becomes
'if (isinstance(x, tuple) is True) is True'. And newbies not really
understanding 'if' have done the latter, as seen on python-list.

On the other hand, when the tested value might not be boolean

for item in [1, None, 'a', True, []]:
    if item is True:
        do_True()
    else:
        do_all_else()

'is True' is necessary.
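
As an illustration of the point above (a sketch that is not part of the original message; the helper name `classify` is made up for the example), the same mixed list can be run through an identity test and a truthiness test to see which items each one selects:

```python
def classify(items):
    """Return (identity_hits, truthy_hits) for a list of arbitrary objects."""
    # "is True" selects only the bool object True itself.
    identity_hits = [item for item in items if item is True]
    # A bare truth test selects every object that evaluates as true.
    truthy_hits = [item for item in items if item]
    return identity_hits, truthy_hits


identity_hits, truthy_hits = classify([1, None, 'a', True, []])
print(len(identity_hits), len(truthy_hits))
```

The identity test picks out a single item while the truth test picks out three, which is the situation where the two spellings are not interchangeable.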
> It seems to me that there is an important distinction between testing > that an object is either the boolean True or False and merely checking > the "truthiness" of an object. That is the distinction drawn above. > Many a bug has been caused by an empty > container object (or some other falsey object) falling into an "if not > value" clause that was actually meant to check for the presence of False > or None. One should generally know what group of classes one is testing in any if. -- Terry Jan Reedy From phd at phdru.name Wed Aug 8 18:59:04 2012 From: phd at phdru.name (Oleg Broytman) Date: Wed, 8 Aug 2012 20:59:04 +0400 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: <87628ticgi.fsf@benfinney.id.au> References: <87628ticgi.fsf@benfinney.id.au> Message-ID: <20120808165904.GA19350@iskra.aviel.ru> On Thu, Aug 09, 2012 at 02:18:53AM +1000, Ben Finney wrote: > What is a compelling use case for checking precisely for True or False? To distinguish False and None for a tri-state variable that can have 3 values - "yes", "no" and "unspecified" - True, False and None in Python-speak. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From bruce at leapyear.org Wed Aug 8 19:27:19 2012 From: bruce at leapyear.org (Bruce Leban) Date: Wed, 8 Aug 2012 10:27:19 -0700 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: <20120808165904.GA19350@iskra.aviel.ru> References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> Message-ID: On Wed, Aug 8, 2012 at 9:59 AM, Oleg Broytman wrote: > On Thu, Aug 09, 2012 at 02:18:53AM +1000, Ben Finney < > ben+python at benfinney.id.au> wrote: > > What is a compelling use case for checking precisely for True or False? > > To distinguish False and None for a tri-state variable that can have > 3 values - "yes", "no" and "unspecified" - True, False and None in > Python-speak. 
> I'd probably write that this way: if t is None: # not specified elif t: # true else: # false on the other hand, if I had a variable that could be either a number or True/False, I would probably write: if t is True: # elif t is False: # else: # use t as a number just as I would for any other singleton value. Why does the PEP say that == True is preferred to is True? --- Bruce Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Aug 8 19:33:23 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 09 Aug 2012 03:33:23 +1000 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: References: Message-ID: <5022A2E3.2050402@pearwood.info> On 09/08/12 00:28, Guido van Rossum wrote: > I'd be strongly against changing that rule. If you don't want other types > than bool, use an isinstance check. Code using "is True" is most likely a > bug. (It's different for None, since that has no second value that is > assumed.) Furthermore, the problem with testing flags for their truth-value (truthiness) using "flag is True" is that you never know when to stop, since the comparison also returns a flag which needs to be checked for its truth-value: if (flag is True) is True is True is True is ... # and so on ad infinitum which of course is absurd. If you can trust the truthiness of "flag is True", you can also trust the truthiness of flag on its own. If you truly need to check that flag is not only a true-value, but is specifically the bool True, then "flag is True" might be acceptable, provided it was clear in context why you were doing that. (But why would you need to? Presumably that should be a rare occurrence.) An isinstance check makes it obvious that the type matters. "type(flag) is bool" makes it even more obvious that you want a bool and nothing but a bool, not even a subclass. 
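
A minimal sketch of the two type checks just mentioned, assuming the goal is to reject anything that is not a bool. The function names `require_bool` and `exactly_bool` are hypothetical and only serve the illustration:

```python
def require_bool(flag):
    """Return flag unchanged if it is a bool; otherwise raise TypeError."""
    if not isinstance(flag, bool):
        raise TypeError("expected a bool, got %s" % type(flag).__name__)
    return flag


def exactly_bool(flag):
    """Stricter spelling: the object's type must be bool itself."""
    return type(flag) is bool


accepted = require_bool(True)

try:
    require_bool(1)  # truthy, and 1 == True, but not a bool
    rejected_int = False
except TypeError:
    rejected_int = True
```

Either spelling states in the code itself that the type matters, which a bare "flag is True" does not.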
I note, however, that all three of CPython, IronPython and Jython prohibit you from subclassing bool, but this seems to be an implementation detail. At least, it isn't documented as a language requirement here: http://docs.python.org/py3k/library/stdtypes.html#boolean-values Coming back to Paul's request to modify the wording, I think that it is fine as it stands. We know that PEP 8 allows breaking the rules when necessary. Not every exception to PEP 8 needs to be explicitly documented, and we shouldn't dilute the message that "if flag" is the idiomatic Pythonic way to test a flag by talking about the rare case where it isn't. -- Steven From steve at pearwood.info Wed Aug 8 19:44:12 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 09 Aug 2012 03:44:12 +1000 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> Message-ID: <5022A56C.3050708@pearwood.info> On 09/08/12 03:27, Bruce Leban wrote: > On Wed, Aug 8, 2012 at 9:59 AM, Oleg Broytman wrote: > >> On Thu, Aug 09, 2012 at 02:18:53AM +1000, Ben Finney< >> ben+python at benfinney.id.au> wrote: >>> What is a compelling use case for checking precisely for True or False? >> >> To distinguish False and None for a tri-state variable that can have >> 3 values - "yes", "no" and "unspecified" - True, False and None in >> Python-speak. >> > > I'd probably write that this way: > > if t is None: > # not specified > elif t: > # true > else: > # false Which treats None specially, otherwise falls back to the standard Python concept of truth-value or truthiness. Which is just a form of duck-typing. > on the other hand, if I had a variable that could be either a number or > True/False, I would probably write: This sounds like poor design to me, but okay, for the sake of the argument, let's accept that you do. 
> if t is True: > # > elif t is False: > # > else: > # use t as a number > > just as I would for any other singleton value. Keeping the duck-typing of truth-values: if isinstance(t, numbers.Number) and not isinstance(t, bool): ... elif t: # True, or any other true-like value ... else: # False, or any other false-like value pass There are many subtle variations on this. We surely don't need, or want, to have to cover them in PEP 8. > Why does the PEP say that == True is preferred to is True? Because there surely are still libraries out there that return 0 and 1 as true/false values. The bool type was introduced in (by memory) Python 2.2 or 2.3, so very old code may fail, or worse, silently do the wrong thing if you use "is True". -- Steven From rob.cliffe at btinternet.com Wed Aug 8 19:53:29 2012 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Wed, 08 Aug 2012 18:53:29 +0100 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: <20120808165904.GA19350@iskra.aviel.ru> References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> Message-ID: <5022A799.9060801@btinternet.com> On 08/08/2012 17:59, Oleg Broytman wrote: > On Thu, Aug 09, 2012 at 02:18:53AM +1000, Ben Finney wrote: >> What is a compelling use case for checking precisely for True or False? > To distinguish False and None for a tri-state variable that can have > 3 values - "yes", "no" and "unspecified" - True, False and None in > Python-speak. > > Oleg. Other cases: 1) I have written a function that returns True, False or a string error message. (No doubt an Exception-subclass object might be better style.) So sometimes I really want to test if the result "is True". 2) Say you are writing a function to process an arbitrary object, e.g. a custom version of repr. You might well choose to write if obj is True: # processing elif obj is False: # processing elif type(obj) is int: # processing # etc. etc. I am sure the examples could be multiplied. 
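
The second case could be sketched roughly as follows. This is an illustrative guess at such a function, not code from the message; the name `describe` and the returned strings are invented for the example:

```python
def describe(obj):
    """Toy custom repr that treats the two bool objects specially."""
    if obj is True:
        return "yes"
    elif obj is False:
        return "no"
    elif type(obj) is int:
        # 1 and 0 land here, because they are not the bool objects.
        return "int %d" % obj
    return repr(obj)


labels = [describe(x) for x in (True, False, 1, 0, "x")]
```

Writing the first two branches with "== True" and "== False" would send 1 and 0 down the bool branches instead, so in a dispatcher like this the identity test carries real meaning.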
I can see no reason why we should be discouraged from writing if x is True: if that is really what we mean (and need) and not just a verbose way of spelling "if x:". Also I find the advice that if x is True: is worse than if x==True: baffling. I have been taught that the former executes faster, and indeed when I test it I find it is (significantly faster). What is the rationale? Rob Cliffe From g.brandl at gmx.net Wed Aug 8 22:29:15 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 08 Aug 2012 22:29:15 +0200 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: <5022A799.9060801@btinternet.com> References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <5022A799.9060801@btinternet.com> Message-ID: On 08.08.2012 19:53, Rob Cliffe wrote: > > On 08/08/2012 17:59, Oleg Broytman wrote: >> On Thu, Aug 09, 2012 at 02:18:53AM +1000, Ben Finney wrote: >>> What is a compelling use case for checking precisely for True or False? >> To distinguish False and None for a tri-state variable that can have >> 3 values - "yes", "no" and "unspecified" - True, False and None in >> Python-speak. >> >> Oleg. > Other cases: > 1) I have written a function that returns True, False or a string > error message. (No doubt an Exception-subclass object might be better > style.) Using the (only) return value for either a result or an error indication is ugly and error-prone. It is what makes many APIs painful in languages that don't have the comfort of e.g. returning a tuple (result, error or None). Not to mention that there's nothing wrong with raising exceptions. > So sometimes I really want to test if the result "is True". > 2) Say you are writing a function to process an arbitrary object, > e.g. a custom version of repr. You might well choose to write > if obj is True: > # processing > elif obj is False: > # processing > elif type(obj) is int: > # processing > # etc. etc. > I am sure the examples could be multiplied. 
> I can see no reason why we should be discouraged from writing
> if x is True:
> if that is really what we mean (and need) and not just a verbose way of
> spelling "if x:".

Yeah, just that in most cases that is not really what "we" mean. And if
it is, why do you feel discouraged anyway?

> Also I find the advice that
> if x is True:
> is worse than
> if x==True:
> baffling. I have been taught that the former executes faster, and
> indeed when I test it I find it is (significantly faster).
> What is the rationale?

Because in most cases you want to accept 1 and 0 too for True and False.

For None, "==" and "is" are equivalent, because no other object is equal
to None. For True and False, this is different, and using "is" here is
a very stealthy bug.

cheers,
Georg

From ronan.lamy at gmail.com Wed Aug 8 23:38:28 2012
From: ronan.lamy at gmail.com (Ronan Lamy)
Date: Wed, 08 Aug 2012 22:38:28 +0100
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To: 
References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <5022A799.9060801@btinternet.com>
Message-ID: <1344461908.2223.185.camel@ronan-desktop>

On Wednesday 08 August 2012 at 22:29 +0200, Georg Brandl wrote:
> On 08.08.2012 19:53, Rob Cliffe wrote:
> >
> > On 08/08/2012 17:59, Oleg Broytman wrote:
> > Also I find the advice that
> > if x is True:
> > is worse than
> > if x==True:
> > baffling. I have been taught that the former executes faster, and
> > indeed when I test it I find it is (significantly faster).
> > What is the rationale?
>
> Because in most cases you want to accept 1 and 0 too for True and False.
>
> For None, "==" and "is" are equivalent, because no other object is equal
> to None. For True and False, this is different, and using "is" here is
> a very stealthy bug.

But do you really want to accept 1.0 but reject 1.0001? I would say that
using "x == True" is the stealthier bug:

>>> def do_stuff(n):
...     a = 1; a /= n; a *= n
...     if a == True:
...         return 0
...     else:
...         return "BOOM!"
...
>>> [do_stuff(n) for n in range(42, 54)]
[0, 0, 0, 0, 0, 0, 0, 'BOOM!', 0, 0, 0, 0]

-- 
Ronan Lamy

From ned at nedbatchelder.com Wed Aug 8 23:56:46 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Wed, 08 Aug 2012 17:56:46 -0400
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To: 
References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <5022A799.9060801@btinternet.com>
Message-ID: <5022E09E.6050505@nedbatchelder.com>

On 8/8/2012 4:29 PM, Georg Brandl wrote:
> For None, "==" and "is" are equivalent, because no other object is equal
> to None. For True and False, this is different, and using "is" here is
> a very stealthy bug.

It's easy to make other objects == None, by writing buggy __eq__
methods. That doesn't happen much, but it's easier with __ne__, where
the negated logic is easy to screw up. I've seen it happen. Use "is
None", not "== None".

--Ned.

From fuzzyman at gmail.com Thu Aug 9 00:04:22 2012
From: fuzzyman at gmail.com (Michael Foord)
Date: Wed, 8 Aug 2012 23:04:22 +0100
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To: <20120808165904.GA19350@iskra.aviel.ru>
References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru>
Message-ID: 

On 8 August 2012 17:59, Oleg Broytman wrote:

> On Thu, Aug 09, 2012 at 02:18:53AM +1000, Ben Finney <
> ben+python at benfinney.id.au> wrote:
> > What is a compelling use case for checking precisely for True or False?
>
> To distinguish False and None for a tri-state variable that can have
> 3 values - "yes", "no" and "unspecified" - True, False and None in
> Python-speak.
>
>
Or when testing, I often want to check that a method *really* returns True
or False rather than some object that happens to evaluate to True or False.

Michael

> Oleg.
> --
> Oleg Broytman http://phdru.name/ phd at phdru.name
> Programmers don't die, they just GOSUB without RETURN.
>

-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

From alexander.belopolsky at gmail.com Thu Aug 9 00:37:15 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 8 Aug 2012 18:37:15 -0400
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To: 
References: 
Message-ID: 

On Wed, Aug 8, 2012 at 10:11 AM, Michael Foord wrote:
> True and False are singletons, so if you want to check precisely for True
> and False then an identity check seems appropriate.

Technically speaking, True and False are not singletons - they are
each an instance of the bool class and a singleton is the *only*
instance of a class. I am not sure what the right term is for the
behavior of bool instances, but the key feature here is that if x is
known to be bool, x == True is equivalent to x is True. This property
is not unique to bool and singletons like None. For example, in CPython
if x is an int, x == 0 is equivalent to x is 0 because a few small
integers are preallocated. However, in the case of int, this property
is clearly an implementation detail and alternative implementations
should be free to vary the number of preallocated integers or not
preallocate at all.

With respect to bool, an argument can be made that it having only two
preallocated instances is a language feature rather than an
implementation detail and users can rely on the fact that True, bool(1)
and say not [] all return the same instance.
If this is so, I think
this fact should be mentioned somewhere, but probably not in PEP 8.

From ncoghlan at gmail.com Thu Aug 9 01:07:16 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 9 Aug 2012 09:07:16 +1000
Subject: [Python-ideas] bdist naming scheme (compatibility tags) PEP
In-Reply-To: 
References: 
Message-ID: 

A bit of background: Daniel's wheel project aims to provide the
oft-requested feature of a cross-platform binary distribution format that
can be cleanly and automatically mapped to the platform-specific formats.

In reviewing the draft format spec for wheels, I noted that it was worth
getting broader agreement on the basic binary compatibility identification
scheme early, since it doesn't need to be specific to the wheel format and
will play a critical role in letting installers find the right binaries
efficiently regardless of any other format details.

For wheel in particular, aspects of this PEP will show up in various places
in metadata, filenames and installer configuration settings.

The PEP could probably use a "Background" section with some of the above
info.

Cheers,
Nick.

-- 
Sent from my phone, thus the relative brevity :)

From matthew.lefavor at nasa.gov Thu Aug 9 00:39:42 2012
From: matthew.lefavor at nasa.gov (Lefavor, Matthew (GSFC-582.0)[MICROTEL LLC])
Date: Wed, 8 Aug 2012 17:39:42 -0500
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To: 
Message-ID: 

I don't think PEP 8 is meant to cover specialized cases like testing.
Keep in mind the recommendation in the beginning of the style guide: "But
most importantly: know when to be inconsistent -- sometimes the style
guide just doesn't apply."

For example, a similar argument to yours could be made about the
recommendation to "use the fact that empty sequences are false" instead of
checking len(sequence).
E.g.:

Yes: if not seq:
     if seq
No:  if len(seq)
     if not len(seq)

If you want to test that your __len__ method is working, then of course
the latter would be acceptable--because you aren't concerned about
checking that the list is full; you're concerned about the functionality
of the __len__ method. Common sense should indicate that the style guide
doesn't apply here because the use case is different.

The same goes for truth or falsity. In situations in which you are
concerned about the truth or falsity of an object, then you should never
use "== True" or "is True", because of the possibility of bugs that others
have already pointed out. There are clearly cases in which the style guide
just doesn't apply--testing to ensure that a returned value actually "is"
a given singleton is one of them.

The style guide need not list every case in which the recommendation
possibly might not apply. People with advanced needs are advanced enough
to know when to break the style guide. =)

Matthew Lefavor

From: Michael Foord
Date: Wednesday, August 8, 2012 6:04 PM
To: "python-ideas at python.org"
Subject: Re: [Python-ideas] Programming recommendations (PEP 8) and boolean values

On 8 August 2012 17:59, Oleg Broytman wrote:
On Thu, Aug 09, 2012 at 02:18:53AM +1000, Ben Finney wrote:
> What is a compelling use case for checking precisely for True or False?

To distinguish False and None for a tri-state variable that can have
3 values - "yes", "no" and "unspecified" - True, False and None in
Python-speak.

Or when testing, I often want to check that a method *really* returns True
or False rather than some object that happens to evaluate to True or False.

Michael

Oleg.
--
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.
_______________________________________________ Python-ideas mailing list Python-ideas at python.org http://mail.python.org/mailman/listinfo/python-ideas -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From dholth at gmail.com Thu Aug 9 02:20:46 2012 From: dholth at gmail.com (Daniel Holth) Date: Wed, 8 Aug 2012 20:20:46 -0400 Subject: [Python-ideas] bdist naming scheme (compatibility tags) PEP In-Reply-To: References: Message-ID: Wheel also has a story at http://wheel.readthedocs.org/en/latest/story.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Aug 9 03:20:24 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 09 Aug 2012 11:20:24 +1000 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: References: Message-ID: <50231058.6030103@pearwood.info> On 09/08/12 08:37, Alexander Belopolsky wrote: > With respect to bool, an argument can be made that it having only two > preallocated instances is a language feature rather than an > implementation detail and user can rely on the fact that True, bool(1) > and say not [] all return the same instance. If this is so, I think > this fact should be mentioned somewhere, but probably not in PEP 8. Yes, this is documented: http://docs.python.org/py3k/library/functions.html#bool However, the question is not whether if x is True: and if isinstance(x, bool) and x: are equivalent, or whether there are obscure edge-cases where one genuinely needs to test whether or not an object is identical to True. The question is what is the idiomatic Python code for testing a flag. That is simply if x: which duck-types truthiness and accepts any true value, not just True. 
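
To make the duck-typing point concrete, here is a small sketch. The `Basket` class and `flag_state` helper are hypothetical names invented for the example; the only language feature assumed is that an object's truth value falls back to `__len__` when it defines no `__bool__`:

```python
class Basket:
    """Hypothetical container whose truth value follows its length."""

    def __init__(self, items=()):
        self._items = list(items)

    def __len__(self):
        return len(self._items)


def flag_state(x):
    """Idiomatic flag test: any true value counts as set."""
    return "set" if x else "unset"


samples = [True, 1, "text", Basket(["egg"]), False, 0, "", Basket(), None]
states = [flag_state(x) for x in samples]
# The identity test accepts only the first sample.
strict = [x is True for x in samples]
```

The first four samples are reported as set by the plain truth test, whereas "x is True" admits only the bool object itself.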
Like almost all identity tests, testing "x is True" puts the emphasis on the wrong thing: object identity instead of object value. It restricts the value's type instead of duck-typing, and that's usually unpythonic. If you do have a special need to do so, okay, that's fine, but there's no need for PEP 8 to cover your special use-case. PEP 8 is for conventions for idiomatic code, not every exception. PEP 8 already gives its blessing to break the rules when you need to. Besides, even when it is not, "x is True" looks like a rookie mistake. Beginners often write "if flag == True" even in statically-typed languages where flags cannot be other than a bool. -- Steven From steve at pearwood.info Thu Aug 9 03:22:45 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 09 Aug 2012 11:22:45 +1000 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <5022A56C.3050708@pearwood.info> Message-ID: <502310E5.3010006@pearwood.info> Taking the liberty to return this to the mailing list. On 09/08/12 08:05, Michael Foord wrote: >> > Why does the PEP say that == True is preferred to is True? >>> >> >> > >> > Because there surely are still libraries out there that return 0 and 1 >> > as true/false values. The bool type was introduced in (by memory) >> > Python 2.2 or 2.3, so very old code may fail, or worse, silently do the >> > wrong thing if you use "is True". >> > >> > > Surely? I haven't seen one in*years*. Nevertheless, "return 1" is still legal Python and was once the idiomatic way of returning a flag. If you don't have a good reason to break such old code, why break it? 
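
A short sketch of the failure mode described above, using a stand-in function (`legacy_is_ready` is hypothetical, not taken from any real library) that returns 1 in the old style:

```python
def legacy_is_ready():
    """Stand-in for an old library function that returns 1 rather than True."""
    return 1


result = legacy_is_ready()

truthy = bool(result)          # what "if result:" relies on
equal = (result == True)       # equality still holds, since 1 == True
identical = (result is True)   # identity does not: 1 is not the bool object
```

Code written as "if result is True:" would silently take the wrong branch here, while "if result:" and even "if result == True:" keep working.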
-- Steven From ncoghlan at gmail.com Thu Aug 9 03:27:34 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 9 Aug 2012 11:27:34 +1000 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: References: Message-ID: On Thu, Aug 9, 2012 at 8:39 AM, Lefavor, Matthew (GSFC-582.0)[MICROTEL LLC] wrote: > I don't think PEP 8 is meant to cover a specialized cases like testing. > Keep in mind the recommendation in the beginning of the style guide: "But > most importantly: know when to be inconsistent -- sometimes the style > guide just doesn't apply." Right, there are broad swathes of programming guidelines that simply *don't apply* when writing tests. In order to properly isolate the behaviours you want to test, you often end up doing things that would be legitimately considered downright evil if you ever did them in production code. The recommendations in PEP 8 are for production code that is meant to *do things*. Test code, which is checking whether or not *other* code is doing the expected things, isn't typical production code. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Thu Aug 9 03:31:01 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 09 Aug 2012 11:31:01 +1000 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> Message-ID: <502312D5.4010905@pearwood.info> On 09/08/12 08:04, Michael Foord wrote: > Or when testing, I often want to check that a method *really* returns True > or False rather than some object that happens to evaluate to True or False. 
I consider that three separate unit tests:

def testGivesTrue(self):
    for x in self.args_giving_true:
        self.assertTrue(method(x))

def testGivesFalse(self):
    for x in self.args_giving_false:
        self.assertFalse(method(x))

def testIsBool(self):
    for x in self.args_giving_true + self.args_giving_false:
        self.assertTrue(isinstance(method(x), bool))

-- Steven

From alexander.belopolsky at gmail.com Thu Aug 9 03:43:30 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 8 Aug 2012 21:43:30 -0400
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To: <50231058.6030103@pearwood.info>
References: <50231058.6030103@pearwood.info>
Message-ID:

On Wed, Aug 8, 2012 at 9:20 PM, Steven D'Aprano wrote:
> Yes, this is documented:
>
> http://docs.python.org/py3k/library/functions.html#bool
>

Indeed. I was actually looking under "Built-in Types", rather than "Built-in Functions." I now remember that I noted this issue before. This part of the manual is not optimally subdivided into sections:

2. Built-in Functions (covers bool() and other builtin type constructors)
3. Built-in Constants (covers True and False)
4. Built-in Types
4.1. Truth Value Testing
4.2. Boolean Operations - and, or, not
4.3. Comparisons
4.4. Numeric Types - int, float, complex
...

As a result, there is no one place where one can find information about bool. I wonder if it should go somewhere under section 4.4. After all,

>>> isinstance(True, int)
True

From ben+python at benfinney.id.au Thu Aug 9 05:15:38 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 09 Aug 2012 13:15:38 +1000
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru>
Message-ID: <87vcgshi1x.fsf@benfinney.id.au>

Oleg Broytman writes:

> On Thu, Aug 09, 2012 at 02:18:53AM +1000, Ben Finney wrote:
> > What is a compelling use case for checking precisely for True or False?
> > To distinguish False and None for a tri-state variable that can > have 3 values - "yes", "no" and "unspecified" - True, False and None > in Python-speak. Since True and False are strongly coupled with a *two*-state type, that just seems needlessly confusing to the reader of the code. Better to use a non-bool type which makes it explicit that there are three valid values. -- \ ?In case you haven't noticed, [the USA] are now almost as | `\ feared and hated all over the world as the Nazis were.? ?Kurt | _o__) Vonnegut, 2004 | Ben Finney From ben+python at benfinney.id.au Thu Aug 9 05:23:40 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 09 Aug 2012 13:23:40 +1000 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <5022A799.9060801@btinternet.com> Message-ID: <87r4rghhoj.fsf@benfinney.id.au> Rob Cliffe writes: > On 08/08/2012 17:59, Oleg Broytman wrote: > > On Thu, Aug 09, 2012 at 02:18:53AM +1000, Ben Finney wrote: > >> What is a compelling use case for checking precisely for True or > >> False? By ?compelling use case for?, I mean ?a use case which is best satisfied by?. So use cases which *could* use this are not compelling if they are better satisfied by some other, existing way. > 1) I have written a function that returns True, False or a string > error message. (No doubt an Exception-subclass object might be better > style.) You get to the nub of it: there are clearly better ways to meet the use case (for this one, use an exception), so this is not a compelling use case for precise checking of True and False. > 2) Say you are writing a function to process an arbitrary object, e.g. > a custom version of repr. 
You might well choose to write
>     if obj is True:
>         # processing
>     elif obj is False:
>         # processing
>     elif type(obj) is int:
>         # processing

Again, this is much clearer written as:

    if type(obj) is bool:
        # processing
    if type(obj) is int:
        # processing

So not a compelling use case for checking specific values.

> I am sure the examples could be multiplied.
> I can see no reason why we should be discouraged from writing
>     if x is True:
> if that is really what we mean (and need) and not just a verbose way of
> spelling "if x:".

I don't see any of the cases presented so far (other than unit testing, which is already special enough to break most of the rules) as best satisfied by precise checking for specific values of bool.

On the other hand, "if foo is True" is a common beginner mistake that we meet frequently in "comp.lang.python", and is almost always better written as "if foo". It's that kind of advice that belongs in PEP 8.

--
 \    "The difference between religions and cults is determined by |
  `\      how much real estate is owned." - Frank Zappa             |
_o__)                                                               |
Ben Finney

From greg.ewing at canterbury.ac.nz Thu Aug 9 05:24:59 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 09 Aug 2012 15:24:59 +1200
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To: <20120808165904.GA19350@iskra.aviel.ru>
References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru>
Message-ID: <50232D8B.8060501@canterbury.ac.nz>

On 09/08/12 04:59, Oleg Broytman wrote:
> To distinguish False and None for a tri-state variable that can have
> 3 values - "yes", "no" and "unspecified" - True, False and None in
> Python-speak.

Er, not quite best practice. As any DailyWTFer knows, the correct third value for a boolean is FileNotFound.
http://thedailywtf.com/Articles/What_Is_Truth_0x3f_.aspx -- Greg From ncoghlan at gmail.com Thu Aug 9 05:59:26 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 9 Aug 2012 13:59:26 +1000 Subject: [Python-ideas] Standard types documentation (was Re: Programming recommendations (PEP 8) and boolean values) Message-ID: On Thu, Aug 9, 2012 at 11:43 AM, Alexander Belopolsky wrote: > As a result, there is no one place where one can find information > about bool. I wonder if it should go somewhere under section 4.4. > After all, > > >>> isinstance(True, int) > True I'll be working on the builtin types documentation at the PyConAU sprints in a couple of weeks. They haven't aged well :( Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From g.brandl at gmx.net Thu Aug 9 07:48:57 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 09 Aug 2012 07:48:57 +0200 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: <5022E09E.6050505@nedbatchelder.com> References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <5022A799.9060801@btinternet.com> <5022E09E.6050505@nedbatchelder.com> Message-ID: On 08.08.2012 23:56, Ned Batchelder wrote: > > On 8/8/2012 4:29 PM, Georg Brandl wrote: >> For None, "==" and "is" are equivalent, because no other object is equal >> to None. For True and False, this is different, and using "is" here is >> a very stealthy bug. > It's easy to make other objects == None, by writing buggy __eq__ > methods. That doesn't happen much, but it's easier with __ne__, where > the negated logic is easy to screw up. I've seen it happen. Use "is > None", not "== None". That's true indeed. 
Georg

From solipsis at pitrou.net Thu Aug 9 09:15:15 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 09 Aug 2012 09:15:15 +0200
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To:
References:
Message-ID:

On 08/08/2012 16:28, Guido van Rossum wrote:
> I'd be strongly against changing that rule. If you don't want other
> types than bool, use an isinstance check. Code using "is True" is most
> likely a bug. (It's different for None, since that has no second value
> that is assumed.)

That said, I'm also curious about the answer to Michael's following question: "why does it say that using an identity check is worse than an equality check?"

Regards

Antoine.

From stephen at xemacs.org Thu Aug 9 09:33:36 2012
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 09 Aug 2012 16:33:36 +0900
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To: <87vcgshi1x.fsf@benfinney.id.au>
References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <87vcgshi1x.fsf@benfinney.id.au>
Message-ID: <87fw7w5xkf.fsf@uwakimon.sk.tsukuba.ac.jp>

Ben Finney writes:
> Oleg Broytman writes:
>
> > On Thu, Aug 09, 2012 at 02:18:53AM +1000, Ben Finney wrote:
> > > What is a compelling use case for checking precisely for True or False?
> >
> > To distinguish False and None for a tri-state variable that can
> > have 3 values - "yes", "no" and "unspecified" - True, False and None
> > in Python-speak.
>
> Since True and False are strongly coupled with a *two*-state type, that
> just seems needlessly confusing to the reader of the code. Better to use
> a non-bool type which makes it explicit that there are three valid
> values.

I occasionally use an abstract base class that initializes a Boolean member to None, and assert "is not None" in testing or validation code.
I wouldn't call it "compelling," but I find this convention very useful in validating my own code, as if such variables weren't properly initialized in a derived class, I probably screwed up something else, too. From ubershmekel at gmail.com Thu Aug 9 09:35:29 2012 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Thu, 9 Aug 2012 10:35:29 +0300 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: References: Message-ID: On Thu, Aug 9, 2012 at 10:15 AM, Antoine Pitrou wrote: > Le 08/08/2012 16:28, Guido van Rossum a ?crit : > > I'd be strongly against changing that rule. If you don't want other >> types than bool, use an isinstance check. Code using "is True" is most >> likely a bug. (It's different for None, since that has no second value >> that is assumed.) >> > > That said, I'm also curious about the answer to Michael's following > question: > ?why does it say that using an identity check is worse than an equality > check?? > > In python 3.2.3: >>> 1 == True True * >>> 13 == True* * False* >>> bool(1) True >>> bool(13) True >>> 1 is True False >>> 13 is True False To my surprise identity is actually less confusing than equality. So I agree with Antoine and Michael on that point. -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Thu Aug 9 09:52:40 2012 From: songofacandy at gmail.com (INADA Naoki) Date: Thu, 9 Aug 2012 16:52:40 +0900 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: References: Message-ID: In [1]: True + True Out[1]: 2 I think Python 4 should raise ValueError. On Thu, Aug 9, 2012 at 4:35 PM, Yuval Greenfield wrote: > On Thu, Aug 9, 2012 at 10:15 AM, Antoine Pitrou wrote: >> >> Le 08/08/2012 16:28, Guido van Rossum a ?crit : >> >>> I'd be strongly against changing that rule. If you don't want other >>> types than bool, use an isinstance check. Code using "is True" is most >>> likely a bug. 
(It's different for None, since that has no second value >>> that is assumed.) >> >> >> That said, I'm also curious about the answer to Michael's following >> question: >> ?why does it say that using an identity check is worse than an equality >> check?? >> > > In python 3.2.3: > > >>> 1 == True > True > >>> 13 == True > False > >>> bool(1) > True > >>> bool(13) > True > >>> 1 is True > False > >>> 13 is True > False > > To my surprise identity is actually less confusing than equality. So I agree > with Antoine and Michael on that point. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- INADA Naoki From mal at egenix.com Thu Aug 9 10:07:49 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 09 Aug 2012 10:07:49 +0200 Subject: [Python-ideas] bdist naming scheme (compatibility tags) PEP In-Reply-To: References: Message-ID: <50236FD5.2060404@egenix.com> Daniel Holth wrote: > Platform Tag > ------------ > > The platform tag is simply `distutils.util.get_platform()` with all > hyphens `-` and periods `.` replaced with underscore `_`. This part is going to cause problems. distutils is good at identifying Linux and Windows and giving them sensible platform names, but it doesn't do a good job for other OSes. For e.g. FreeBSD it adds too much detail, for Mac OS X it doesn't have enough detail and it also has a tendency to change even for Python dot releases (esp. for Mac OS X which constantly causes problems). I think your naming scheme ought to focus more on the platform part, as the other parts (Python version and implementation) are well understood. For the platform, the installer would have to detect whether a package is compatible with the platform. This often requires intimate knowledge about the platform. 
Things to consider: * OS name * OS version, if that matters for compatibility * C lib version, if that matters for compatibility * ABI version, if that matters for compatibility * architecture (Intel, PowerPC, Sparc, ARM, etc) * bits (32, 64, 128, etc.) * fat builds which include multiple variants in a single archive and probably some more depending on OS. In some cases, a package will also have external requirements such as specific versions of a library (e.g. 0.9.8 vs. 1.0.0 OpenSSL library, or 2.2 vs. 2.3 unixODBC). These quickly get complicated up to the point where you need to run a script in order to determine whether a platform is compatible with the package or not. Putting all that information into a tag is going to be difficult, so an installer will either have to access more meta information about the package from some other resource than the file name (which is what PyPI is heading at), or download all variants that fit the target platform and then look inside the files for more meta information. So the tag name format will have to provide a set of basic "dimensions" for the platform (e.g. OS name, architecture, bits), but also needs to provide user defined additions that can be used to differentiate between all the extra variants which may be needed, and which can easily be parsed by a human with more background knowledge about the target system and his/her needs to select the right file. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 09 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-08-25: FrOSCon, St. Augustin, Germany ... 16 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! 
:::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From rob.cliffe at btinternet.com Thu Aug 9 10:09:35 2012 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Thu, 09 Aug 2012 09:09:35 +0100 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <5022A799.9060801@btinternet.com> <5022E09E.6050505@nedbatchelder.com> Message-ID: <5023703F.8060500@btinternet.com> On 09/08/2012 06:48, Georg Brandl wrote: > On 08.08.2012 23:56, Ned Batchelder wrote: >> >> On 8/8/2012 4:29 PM, Georg Brandl wrote: >>> For None, "==" and "is" are equivalent, because no other object is >>> equal >>> to None. For True and False, this is different, and using "is" here is >>> a very stealthy bug. >> It's easy to make other objects == None, by writing buggy __eq__ >> methods. That doesn't happen much, but it's easier with __ne__, where >> the negated logic is easy to screw up. I've seen it happen. Use "is >> None", not "== None". > > That's true indeed. > > Georg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > Surely that's a reason to fix the buggy __eq__ and __ne__ methods, not to avoid using them. (Sorry Georg I accidentally replied to you not to the list.) Rob Cliffe From stephen at xemacs.org Thu Aug 9 10:29:53 2012 From: stephen at xemacs.org (Stephen J. 
Turnbull)
Date: Thu, 09 Aug 2012 17:29:53 +0900
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To:
References:
Message-ID: <87a9y45uym.fsf@uwakimon.sk.tsukuba.ac.jp>

Yuval Greenfield writes:

> In python 3.2.3:
>
> >>> 1 == True
> True
> >>> 13 == True
> False
> >>> bool(1)
> True
> >>> bool(13)
> True
> >>> 1 is True
> False
> >>> 13 is True
> False
>
> To my surprise identity is actually less confusing than equality. So I
> agree with Antoine and Michael on that point.

FWIW, I don't find any of the above confusing. "1 == True" => True is unexpected in some sense, but I don't find it counter-intuitive, and I find the "don't break old code" rationale satisfactory.

Inada-san's example of addition is a little disconcerting in that particular spelling. But

    sum(predicate(x) for x in seq)

seems preferable to a boolean-specific count function, and it generalizes nicely to the creation of dummy variables in statistical applications.

Steve

From mal at egenix.com Thu Aug 9 10:33:44 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 09 Aug 2012 10:33:44 +0200
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To:
References:
Message-ID: <502375E8.9010802@egenix.com>

I think you're reading too much into the "is". Sure, "x is True" reads like "x has the property of being true", but that's not what the "is" operator is all about. It's an identity check that works fine when you want to check for identity, but not otherwise.

For "None" and other singletons, the "is" operator is the perfect choice, since it avoids programming errors and is a very fast way to check a value against the singleton.

For non-singletons, "is" rarely makes sense and can indeed introduce programming errors.

Now, True and False are singletons in their own right, but they are also special integers 1 and 0, nothing more, nothing less.
So if you are interested in checking whether a function does indeed use these special integers, you can use the "is" operator. Apart from testing, where this may sometimes be needed, the much more common use case is not to check for the special integers, but instead to check for a property of the return value, i.e. whether the return value is true or not, and you usually apply that check in a different way:

    if x:
        print('true assumption')
    else:
        print('false assumption')

By checking just for the True singleton, it is quite possible that you'll miss a lot of other true values, e.g. 42 is true, but a different integer, so "42 is True" will return false. Likewise, checking for False won't catch the 0.0 float value, an empty string or an empty list.

INADA Naoki wrote:
> In [1]: True + True
> Out[1]: 2
>
> I think Python 4 should raise ValueError.
>
> On Thu, Aug 9, 2012 at 4:35 PM, Yuval Greenfield wrote:
>> On Thu, Aug 9, 2012 at 10:15 AM, Antoine Pitrou wrote:
>>>
>>> On 08/08/2012 16:28, Guido van Rossum wrote:
>>>
>>>> I'd be strongly against changing that rule. If you don't want other
>>>> types than bool, use an isinstance check. Code using "is True" is most
>>>> likely a bug. (It's different for None, since that has no second value
>>>> that is assumed.)
>>>
>>>
>>> That said, I'm also curious about the answer to Michael's following
>>> question:
>>> "why does it say that using an identity check is worse than an equality
>>> check?"
>>>
>>
>> In python 3.2.3:
>>
>> >>> 1 == True
>> True
>> >>> 13 == True
>> False
>> >>> bool(1)
>> True
>> >>> bool(13)
>> True
>> >>> 1 is True
>> False
>> >>> 13 is True
>> False
>>
>> To my surprise identity is actually less confusing than equality. So I agree
>> with Antoine and Michael on that point.
>> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > > > -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 09 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-08-25: FrOSCon, St. Augustin, Germany ... 16 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From dholth at gmail.com Thu Aug 9 15:14:41 2012 From: dholth at gmail.com (Daniel Holth) Date: Thu, 9 Aug 2012 09:14:41 -0400 Subject: [Python-ideas] bdist naming scheme (compatibility tags) PEP In-Reply-To: <50236FD5.2060404@egenix.com> References: <50236FD5.2060404@egenix.com> Message-ID: On Aug 9, 2012 4:07 AM, "M.-A. Lemburg" wrote: > Daniel Holth wrote: > > Platform Tag > > ------------ > > > > The platform tag is simply `distutils.util.get_platform()` with all > > hyphens `-` and periods `.` replaced with underscore `_`. > > This part is going to cause problems. distutils is good at identifying > Linux and Windows and giving them sensible platform names, but it > doesn't do a good job for other OSes. > > For e.g. FreeBSD it adds too much detail, for Mac OS X it doesn't have > enough detail and it also has a tendency to change even for Python > dot releases (esp. for Mac OS X which constantly causes problems). > egg does something a little more specific for OS X. I should probably copy that. egg is obviously a big influence on this work. 
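The platform tag rule quoted above can be sketched in a few lines. This is only an illustration: the helper name `platform_tag` is made up, and `sysconfig.get_platform()` is used here as a stand-in for `distutils.util.get_platform()` on the assumption that the two report the same kind of string.

```python
import sysconfig


def platform_tag(platform=None):
    # Assumption: sysconfig.get_platform() stands in for the
    # distutils.util.get_platform() call named in the draft.
    if platform is None:
        platform = sysconfig.get_platform()
    # Replace every hyphen and period with an underscore.
    return platform.replace("-", "_").replace(".", "_")


# Platform strings of the kind seen in egg filenames.
examples = ["win32", "linux-x86_64", "macosx-10.6-fat"]
tags = [platform_tag(p) for p in examples]
local_tag = platform_tag()
```

For the three sample strings this gives `win32`, `linux_x86_64` and `macosx_10_6_fat`; `local_tag` depends on the machine running it.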
I downloaded 862 eggs in May, 72 of which were platform-specific: Counter({'win32': 48, 'linux-x86_64': 9, 'macosx-10.6-fat': 8, 'linux-i686': 2, 'macosx-10.7-intel': 2, 'macosx-10.4-x86_64': 1, 'macosx-10.5-intel': 1, 'macosx-10.5-i386': 1}) I think your naming scheme ought to focus more on the platform > part, as the other parts (Python version and implementation) are > well understood. > OK > For the platform, the installer would have to detect whether a > package is compatible with the platform. This often requires intimate > knowledge about the platform. > > Things to consider: > > * OS name > * OS version, if that matters for compatibility > * C lib version, if that matters for compatibility > * ABI version, if that matters for compatibility > * architecture (Intel, PowerPC, Sparc, ARM, etc) > * bits (32, 64, 128, etc.) > * fat builds which include multiple variants in a single > archive > > and probably some more depending on OS. > > In some cases, a package will also have external requirements such > as specific versions of a library (e.g. 0.9.8 vs. 1.0.0 OpenSSL > library, or 2.2 vs. 2.3 unixODBC). These quickly get complicated > up to the point where you need to run a script in order to determine whether a platform is compatible with the package or not. > The external library requirements are out of scope for these tags. There is a suitable Metadata 1.2 tag for external requirements. Putting all that information into a tag is going to be difficult, > so an installer will either have to access more meta information > about the package from some other resource than the file name > (which is what PyPI is heading at), or download all variants > that fit the target platform and then look inside the files > for more meta information. > > So the tag name format will have to provide a set of basic > "dimensions" for the platform (e.g. 
OS name, architecture, bits), > but also needs to provide user defined additions that can > be used to differentiate between all the extra variants which > may be needed, and which can easily be parsed by a human with > more background knowledge about the target system and his/her > needs to select the right file. > I don't want anyone to manually download packages. It just doesn't work when you have a lot of dependencies. I am interested in an 80% solution to this problem. Like the people who have uploaded eggs to pypi, I use Windows, Mac, and Linux. If someone can provide a good get_platform() for other platforms, great. I don't have that knowledge. Why don't I add the platform tag "local". Pre-built binary packages on pypi are most-necessary for Windows where it is hard to install the compiler, then Mac, and then Linux where you usually do have a compiler. If you are on a less common platform that always compiles everything from source anyway then you might compile a local cache of -local tagged binary packages. The tools will know not to upload these to pypi. -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phdru.name Thu Aug 9 11:40:03 2012 From: phd at phdru.name (Oleg Broytman) Date: Thu, 9 Aug 2012 13:40:03 +0400 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: <50232D8B.8060501@canterbury.ac.nz> References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <50232D8B.8060501@canterbury.ac.nz> Message-ID: <20120809094003.GB10904@iskra.aviel.ru> On Thu, Aug 09, 2012 at 03:24:59PM +1200, Greg Ewing wrote: > On 09/08/12 04:59, Oleg Broytman wrote: > > To distinguish False and None for a tri-state variable that can have > >3 values - "yes", "no" and "unspecified" - True, False and None in > >Python-speak. > > Er, not quite best practice. Unavoidable. E.g. in SQL a boolean column stores True, False and None (SQL NULL). Oleg. 
-- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From phd at phdru.name Thu Aug 9 11:36:48 2012 From: phd at phdru.name (Oleg Broytman) Date: Thu, 9 Aug 2012 13:36:48 +0400 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: <87vcgshi1x.fsf@benfinney.id.au> References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <87vcgshi1x.fsf@benfinney.id.au> Message-ID: <20120809093648.GA10904@iskra.aviel.ru> On Thu, Aug 09, 2012 at 01:15:38PM +1000, Ben Finney wrote: > Oleg Broytman writes: > > On Thu, Aug 09, 2012 at 02:18:53AM +1000, Ben Finney wrote: > > > What is a compelling use case for checking precisely for True or False? > > > > To distinguish False and None for a tri-state variable that can > > have 3 values - "yes", "no" and "unspecified" - True, False and None > > in Python-speak. > > Since True and False are strongly coupled with a *two*-state type, that > just seems needlessly confusing to the reader of the code. Better to use > a non-bool type which makes it explicit that there are three valid > values. I am not always free to use my own classes and values. Sometimes (quite often, actually) I receive values from 3rd-party libraries. Example - DB API drivers: cursor.execute("SELECT boolean FROM mytable") row = cursor.fetchone() Now row[0] is that tri-state variable with possible values 1/0/None or True/False/None. None is of course for SQL NULL. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. 
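A minimal sketch of the DB API situation described above, using the standard library's `sqlite3` module as a stand-in for whatever driver is in use. The table name `mytable` follows the example in the message; the column name `flag` and the helper `describe` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (flag BOOLEAN)")
# 1, 0 and NULL play the roles of "yes", "no" and "unspecified".
conn.executemany("INSERT INTO mytable (flag) VALUES (?)",
                 [(1,), (0,), (None,)])


def describe(value):
    # Test for None first, by identity; only then fall back to truthiness,
    # so that 0/False and NULL are not lumped together.
    if value is None:
        return "unspecified"
    return "yes" if value else "no"


rows = conn.execute("SELECT flag FROM mytable ORDER BY rowid").fetchall()
labels = [describe(row[0]) for row in rows]
conn.close()
```

Note that this handles the tri-state value without ever writing "is True" or "is False": the only identity test needed is the one against None.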
From ned at nedbatchelder.com Thu Aug 9 16:23:40 2012 From: ned at nedbatchelder.com (Ned Batchelder) Date: Thu, 09 Aug 2012 10:23:40 -0400 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: <5023703F.8060500@btinternet.com> References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <5022A799.9060801@btinternet.com> <5022E09E.6050505@nedbatchelder.com> <5023703F.8060500@btinternet.com> Message-ID: <5023C7EC.60403@nedbatchelder.com> On 8/9/2012 4:09 AM, Rob Cliffe wrote: > > On 09/08/2012 06:48, Georg Brandl wrote: >> On 08.08.2012 23:56, Ned Batchelder wrote: >>> >>> On 8/8/2012 4:29 PM, Georg Brandl wrote: >>>> For None, "==" and "is" are equivalent, because no other object is >>>> equal >>>> to None. For True and False, this is different, and using "is" >>>> here is >>>> a very stealthy bug. >>> It's easy to make other objects == None, by writing buggy __eq__ >>> methods. That doesn't happen much, but it's easier with __ne__, where >>> the negated logic is easy to screw up. I've seen it happen. Use "is >>> None", not "== None". >> >> That's true indeed. >> >> Georg >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> >> > Surely that's a reason to fix the buggy __eq__ and __ne__ methods, not > to avoid using them. > (Sorry Georg I accidentally replied to you not to the list.) Of course you should fix those. But if the two methods are "equivalent" except one isn't susceptible to this sort of error, why not recommend and use the more robust technique? Use "is None", not "== None". --Ned. 
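A small sketch of the failure mode being described. The class `Sloppy` is hypothetical and its `__eq__` is deliberately buggy:

```python
class Sloppy:
    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        # Bug: objects without a .key attribute are treated as having
        # key None, so an instance with key None compares equal to None.
        return self.key == getattr(other, "key", None)


obj = Sloppy(None)

equal_to_none = (obj == None)       # goes through the buggy __eq__
identical_to_none = (obj is None)   # identity cannot be overridden
```

The equality test reports a match even though `obj` is clearly not None, whereas the identity test cannot be fooled by any `__eq__` or `__ne__` method.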
> Rob Cliffe > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From guido at python.org Thu Aug 9 16:33:13 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 9 Aug 2012 07:33:13 -0700 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: References: Message-ID: They're both equally bad IMO. On Thursday, August 9, 2012, Antoine Pitrou wrote: > Le 08/08/2012 16:28, Guido van Rossum a ?crit : > >> I'd be strongly against changing that rule. If you don't want other >> types than bool, use an isinstance check. Code using "is True" is most >> likely a bug. (It's different for None, since that has no second value >> that is assumed.) >> > > That said, I'm also curious about the answer to Michael's following > question: > ?why does it say that using an identity check is worse than an equality > check?? > > Regards > > Antoine. > > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rurpy at yahoo.com Thu Aug 9 16:58:41 2012 From: rurpy at yahoo.com (rurpy at yahoo.com) Date: Thu, 9 Aug 2012 07:58:41 -0700 (PDT) Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: <87r4rghhoj.fsf@benfinney.id.au> References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <5022A799.9060801@btinternet.com> <87r4rghhoj.fsf@benfinney.id.au> Message-ID: On 08/08/2012 09:23 PM, Ben Finney wrote:> Rob Cliffe writes: >[...] >> 2) Say you are writing a function to process an arbitrary object, e.g. >> a custom version of repr. 
>> You might well choose to write
>>     if obj is True:
>>         # processing
>>     elif obj is False:
>>         # processing
>>     elif type(obj) is int:
>>         # processing
>
> Again, this is much clearer written as:
>
>     if type(obj) is bool:
>         # processing
>     if type(obj) is int:
>         # processing
>
> So not a compelling use case for checking specific values.

Your replacement is not equivalent. I think what you meant was,

    if type(obj) is bool:
        if obj:
            # processing
        else:
            # processing
    elif type(obj) is int:
        # processing

I don't find that clearer at all, let alone "much clearer" than,

    if obj is True:
        # processing
    elif obj is False:
        # processing
    elif type(obj) is int:
        # processing

(On the other hand I wouldn't claim the latter is "much clearer" than
the former either, but your claim is wrong IMO.)

>[...]
> On the other hand, 'if foo is True' is a common beginner mistake that we
> meet frequently in 'comp.lang.python', and is almost always better
> written as 'if foo'. It's that kind of advice that belongs in PEP 8.

Perhaps correcting beginners' mistakes should be done in the tutorial,
while PEP-8 should be aimed at a broader and on average more advanced
audience.

I realize that one purpose of style conventions is to avoid mistakes,
including beginners' ones, but I have several times found the simplest,
most intuitive way to implement tri-value logic was to use {True, False,
None}, and so think that if "is True" is to be officially discouraged
because of misuse by some people in the first .5% of their Python
careers, the reason for that discouragement should at least be explained
in PEP-8.
From steve at pearwood.info Thu Aug 9 16:59:55 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 10 Aug 2012 00:59:55 +1000
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To: <5023703F.8060500@btinternet.com>
References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <5022A799.9060801@btinternet.com> <5022E09E.6050505@nedbatchelder.com> <5023703F.8060500@btinternet.com>
Message-ID: <5023D06B.30402@pearwood.info>

On 09/08/12 18:09, Rob Cliffe wrote:
>
> On 09/08/2012 06:48, Georg Brandl wrote:
>> On 08.08.2012 23:56, Ned Batchelder wrote:
>>>
>>> On 8/8/2012 4:29 PM, Georg Brandl wrote:
>>>> For None, "==" and "is" are equivalent, because no other object is equal
>>>> to None. For True and False, this is different, and using "is" here is
>>>> a very stealthy bug.
>>> It's easy to make other objects == None, by writing buggy __eq__
>>> methods. That doesn't happen much, but it's easier with __ne__, where
>>> the negated logic is easy to screw up. I've seen it happen. Use "is
>>> None", not "== None".
[...]
> Surely that's a reason to fix the buggy __eq__ and __ne__ methods, not to avoid using them.
> (Sorry Georg I accidentally replied to you not to the list.)

It may not be in your power to fix the buggy method, because it may be in
the caller's code, not yours. Or it may not even be a bug -- maybe the
caller has some good reason for wanting his object to compare equal to
None.

If I write a function like this:

def spam(n=None):
    """Print spam n times. If n is not given or is None,
    print a nasty message."""
    if n == None:
        print("Nobody expects the SPANISH INQUISITION!!! ***ominous chords***")
    else:
        for i in range(n-1):
            print("Spam! ", end='')
        print("Glorious SPAM!!!")

my intention, and the documented behaviour, is that n=None, and nothing
but n=None, should trigger the nasty message. So my function has a bug,
not the caller, because my code fails to live up to my promise.
Blaming the caller for passing a "buggy" object is just a cop-out. I made a promise that None and only None is special, and my code breaks that promise. My bug, not the caller's. The fix is, of course, to use "is None" unless I genuinely want to match None by equality for some strange reason. -- Steven From ron3200 at gmail.com Thu Aug 9 18:12:15 2012 From: ron3200 at gmail.com (Ron Adam) Date: Thu, 09 Aug 2012 11:12:15 -0500 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: References: Message-ID: On 08/08/2012 09:11 AM, Michael Foord wrote: > > True and False are singletons, so if you want to check precisely for True > and False then an identity check seems appropriate. > > However the "Programming Recommendations" section of PEP 8 has this to say > on the topic: > > Don't compare boolean values to True or False using ==. > > Yes: if greeting: > No: if greeting == True: > Worse: if greeting is True: > > http://www.python.org/dev/peps/pep-0008/#programming-recommendations > > It seems to me that there is an important distinction between testing that > an object is either the boolean True or False and merely checking the > "truthiness" of an object. Many a bug has been caused by an empty container > object (or some other falsey object) falling into an "if not value" clause > that was actually meant to check for the presence of False or None. > > Why does PEP 8 recommend not testing "boolean values" (which to me implies > "values you expect to be a bool") using equality or identity, and more > specifically why does it say that using an identity check is worse than an > equality check? Near the top of the web page you referred is ... > This document gives coding conventions for the Python code comprising > the standard library in the main Python distribution. I think in the context of coding Python's library, the "if greeting:" case makes the function more general and easier to use in more situations. 
Cheers,
   Ron

From steve at pearwood.info Thu Aug 9 18:14:43 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 10 Aug 2012 02:14:43 +1000
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To:
References:
Message-ID: <5023E1F3.4070008@pearwood.info>

On 09/08/12 17:15, Antoine Pitrou wrote:
> Le 08/08/2012 16:28, Guido van Rossum a écrit :
>> I'd be strongly against changing that rule. If you don't want other
>> types than bool, use an isinstance check. Code using "is True" is most
>> likely a bug. (It's different for None, since that has no second value
>> that is assumed.)
>
> That said, I'm also curious about the answer to Michael's following question:
> "why does it say that using an identity check is worse than an equality check?"

1) Identity checks put the emphasis on the wrong thing: object identity
rather than value. Why do you care that True and False are singletons?
Your algorithm almost certainly does not depend on that fact. And who
knows? Perhaps some day your code will be running on some Python
implementation which fails to enforce True and False being singletons.

2) In old code, 0 and 1 were the idiomatic flags. 0 == False and
1 == True, but if you use identity checks, your code will unnecessarily
break old code.

Postel's Law (the Robustness Principle) tells us that we should be strict
in what we send and liberal in what we accept. This rule has greater
applicability than just networking. It tells us that when returning a
boolean flag, we should strictly always return True or False. But when
accepting a boolean flag as argument to our function, we should not
unnecessarily limit what counts as a valid argument.

So in order of preference:

1) under most circumstances, we should accept duck-typed truthy values (e.g.
use "if x") as the most liberal way to test a flag in Python; 2) if we have a good reason not to duck-type a flag, then next best is to compare by value, not identity ("if x == True"); 3) least preferred (worst) is to be a Boolean Fascist and only accept True and False by identity ("if x is True"). There may be some cases where you rightly wish to insist on an actual bool rather than any truthy or falsey value, but that should be the exception rather than the rule. -- Steven From ethan at stoneleaf.us Thu Aug 9 21:00:08 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 09 Aug 2012 12:00:08 -0700 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: <5022E09E.6050505@nedbatchelder.com> References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <5022A799.9060801@btinternet.com> <5022E09E.6050505@nedbatchelder.com> Message-ID: <502408B8.20608@stoneleaf.us> Ned Batchelder wrote: > On 8/8/2012 4:29 PM, Georg Brandl wrote: >> For None, "==" and "is" are equivalent, because no other object is equal >> to None. For True and False, this is different, and using "is" here is >> a very stealthy bug. > > It's easy to make other objects == None, by writing buggy __eq__ > methods. That doesn't happen much, but it's easier with __ne__, where > the negated logic is easy to screw up. I've seen it happen. Use "is > None", not "== None". It's also easy to make objects == None because that's the behavior you want. Like most things, there are exceptions to the rules, and you need to know the package you are dealing with to use it properly. 
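To make both sides of this exchange concrete, here is a minimal sketch. The class name `AlwaysEqual` and the two helper functions are hypothetical and not taken from the thread; the class stands in for either a buggy `__eq__` or one that deliberately compares equal to None. It only shows how the two tests diverge for such an object.

```python
class AlwaysEqual:
    """Hypothetical object whose __eq__ (by design or by mistake) matches anything."""

    def __eq__(self, other):
        return True

    # Defining __eq__ sets __hash__ to None; restore identity hashing so the
    # object still behaves like an ordinary instance elsewhere.
    __hash__ = object.__hash__


def describe_eq(n=None):
    # Equality test: fooled by any object that compares equal to None.
    return "default" if n == None else "given"


def describe_is(n=None):
    # Identity test: only the None singleton takes this branch.
    return "default" if n is None else "given"


obj = AlwaysEqual()
results = (describe_eq(obj), describe_is(obj), describe_is(None))
print(results)
```

Which behaviour is wanted depends on the promise the function makes: "None and only None" calls for the identity test, while a package that intentionally defines None-like objects may want the equality test.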
~Ethan~ From tjreedy at udel.edu Thu Aug 9 23:07:02 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 09 Aug 2012 17:07:02 -0400 Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values In-Reply-To: References: <87628ticgi.fsf@benfinney.id.au> <20120808165904.GA19350@iskra.aviel.ru> <5022A799.9060801@btinternet.com> <87r4rghhoj.fsf@benfinney.id.au> Message-ID: On 8/9/2012 10:58 AM, rurpy at yahoo.com wrote: > I realize that one purpose of style conventions are > to avoid mistakes, including beginner one's but I > and several times found the simplest, most intuitive > way to implement tri-value logic was to use {True, > False, None} and so think that if "is True" is to be > officially discouraged because misuse by some people > in the first .5% of their Python careers, the reason > for that discouragement should at least be explained > in PEP-8. To repeat what I said at the beginning of the thread. if x is True: is redundant and unnecessary and wasteful EXCEPT in those rare cases where one specifically wants to know that x *is* True AND x could be another true value, so that 'is True' is necessary to differentiate True from other true values. Tri-value logic with {True, False, None} is not such an exceptional case for True. 'if tri_value:' is equivalent to 'if tri_value is True:'. If you prefer the latter anyway, then ignore PEP 8 on that point. The same principle applies to False: if x is False: is equivalent to if not x: EXCEPT when one wants to know if x is False AND it could be another false value. The same tri-value logic is such an exception for False and one would need 'x is False' to deal with False before None. 
Of course, one *could* always write

if x:
    Trueproc()
elif x is None:
    Noneproc()
else:
    Falseproc()

--
Terry Jan Reedy

From mwm at mired.org Sat Aug 11 05:47:54 2012
From: mwm at mired.org (Mike Meyer)
Date: Fri, 10 Aug 2012 22:47:54 -0500
Subject: [Python-ideas] Programming recommendations (PEP 8) and boolean values
In-Reply-To:
References:
Message-ID:

Am I the only one that noticed this?

PEP 8 says: "Don't compare boolean values to True or False using ==."

To me, all the talk about wanting to test variables that might not
hold True or False (i.e. - True/False/None or True/False/String or
...) is off topic, as those values *aren't* boolean values. So this
bit of the PEP doesn't really apply to those cases. Some of the
responses have touched on this by saying you should check that the
values are/aren't boolean before testing them, which would mean the
above would apply to the tests on the boolean branch. But the
arguments are all couched in terms of better style, as opposed to
whether or not this rule in the PEP actually applies.

Am I wrong here? Has my exposure to Hindley-Milner type systems
tainted me to the point where I just don't get it any more?

Thanks,

References:
Message-ID: <5027AC39.2090205@pearwood.info>

On 11/08/12 13:47, Mike Meyer wrote:
> Am I the only one that noticed this?
>
> PEP 8 says: "Don't compare boolean values to True or False using ==."
>
> To me, all the talk about wanting to test variables that might not
> hold True or False (i.e. - True/False/None or True/False/String or
> ...) are off topic, as those values *aren't* boolean values. So this
> bit of the PEP doesn't really apply to those cases.

That's debatable. If I have an object which could be True/False/Other, or
even many different Others, once I've eliminated the Other case(s), I'm
left with a binary choice between two bools. Hence PEP 8 apparently
applies even when dealing with tri-state logic values.
> Some of the
> responses have touched on this by saying you should check that the
> values are/aren't boolean before testing them, which would mean the
> above would apply to the tests on the boolean branch.

Exactly.

> But the
> arguments are all couched in terms of better style, as opposed to whether
> or not this rule in the PEP actually applies.

I don't understand. The rule is a style rule, and programming style is
intended to result in better code. So what's your point?

> Am I wrong here? Has my exposure to Hindley-Milner type systems
> tainted me to the point where I just don't get it any more?

I don't know anything about Hindley-Milner, but I do know that duck-typing
means that any object can quack like a bool. I also know that even in
languages with strict True/False bools, like Pascal, writing
"if flag == True" is a common rookie mistake. In both cases, a
recommendation to avoid both "flag == True" and "flag is True" is good
advice.

PEP 8 already gives us permission to break the rules when necessary. What
else do we need?

--
Steven

From dholth at gmail.com Fri Aug 24 19:26:49 2012
From: dholth at gmail.com (Daniel Holth)
Date: Fri, 24 Aug 2012 13:26:49 -0400
Subject: [Python-ideas] format specifier for "not bytes"
Message-ID:

While I was implementing JSON-JWS (JSON web signatures), a format
which in Python 3 has to go from bytes > unicode > bytes > unicode
several times in its construction, I noticed I wrote a lot of bugs:

"sha256=b'abcdef1234'"

When I meant to say:

"sha256=abcdef1234"

Everything worked perfectly on Python 3 because the verifying code
also generated the sha256=b'abcdef1234' as a comparison. I would have
never noticed at all unless I had tried to verify the Python 3 output
with Python 2.
I know I'm a bad person for not having unit tests capable enough to
catch this bug, a bug I wrote repeatedly in each layer of the bytes >
unicode > bytes > unicode dance, and that there is no excuse for being
confused at any time about the type of a variable, but I'm not willing
to reform.

Instead, I would like a new string formatting operator tentatively
called 'notbytes': "sha256=%notbytes" % (b'abcdef1234'). It gives the
same error as 'sha256='+b'abc1234' would: TypeError: Can't convert
'bytes' object to str implicitly

From solipsis at pitrou.net Fri Aug 24 19:55:31 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 24 Aug 2012 19:55:31 +0200
Subject: [Python-ideas] format specifier for "not bytes"
References:
Message-ID: <20120824195531.68d6f18b@pitrou.net>

On Fri, 24 Aug 2012 13:26:49 -0400
Daniel Holth wrote:
> While I was implementing JSON-JWS (JSON web signatures), a format
> which in Python 3 has to go from bytes > unicode > bytes > unicode
> several times in its construction, I notice I wrote a lot of bugs:
>
> "sha256=b'abcdef1234'"
>
> When I meant to say:
>
> "sha256=abcdef1234"
>
> Everything worked perfectly on Python 3 because the verifying code
> also generated the sha256=b'abcdef1234' as a comparison. I would have
> never noticed at all unless I had tried to verify the Python 3 output
> with Python 2.

You can use the -bb flag to raise BytesWarnings in such cases:

$ python3 -bb
Python 3.2.2+ (3.2:9ef20fbd340f, Oct 15 2011, 21:22:07)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> str(b'foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BytesWarning: str() on a bytes instance
>>> "%s" % (b'foo',)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BytesWarning: str() on a bytes instance
>>> "{}".format(b'foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BytesWarning: str() on a bytes instance

Regards

Antoine.
-- Software development and contracting: http://pro.pitrou.net From python at mrabarnett.plus.com Fri Aug 24 20:03:42 2012 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 24 Aug 2012 19:03:42 +0100 Subject: [Python-ideas] format specifier for "not bytes" In-Reply-To: References: Message-ID: <5037C1FE.9020509@mrabarnett.plus.com> On 24/08/2012 18:26, Daniel Holth wrote: > While I was implementing JSON-JWS (JSON web signatures), a format > which in Python 3 has to go from bytes > unicode > bytes > unicode > several times in its construction, I notice I wrote a lot of bugs: > > "sha256=b'abcdef1234'" > > When I meant to say: > > "sha256=abcdef1234" > > Everything worked perfectly on Python 3 because the verifying code > also generated the sha256=b'abcdef1234' as a comparison. I would have > never noticed at all unless I had tried to verify the Python 3 output > with Python 2. > > I know I'm a bad person for not having unit tests capable enough to > catch this bug, a bug I wrote repeatedly in each layer of the bytes > > unicode > bytes > unicode dance, and that there is no excuse for being > confused at any time about the type of a variable, but I'm not willing > to reform. > > Instead, I would like a new string formatting operator tentatively > called 'notbytes': "sha256=%notbytes" % (b'abcdef1234'). It gives the > same error as 'sha256='+b'abc1234' would: TypeError: Can't convert > 'bytes' object to str implictly > Why are you singling out 'bytes'? The "%s" format specifier (or "{:s}" with the .format method) will accept a whole range of values, including ints and lists, which, when concatenated, will raise a TypeError. Why should 'bytes' be different? There _are_ certain number-only formats, so perhaps what you should be asking for is a string-only format. 
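As a small illustration of the contrast drawn above, the sketch below runs the same bytes value through "%s", "{}" and plain concatenation. The variable names are made up for the example, and it assumes Python 3 run without the -b/-bb flags (with -bb the two formatting lines would raise BytesWarning instead of producing output).

```python
digest = b"abcdef1234"

# "%s" and "{}" both fall back to str(), which for bytes gives the
# repr-like form including the b prefix and the quotes.
percent = "sha256=%s" % (digest,)
braces = "sha256={}".format(digest)

# Concatenation refuses anything that is not already a str.
try:
    "sha256=" + digest
    concat_error = None
except TypeError as exc:
    concat_error = type(exc).__name__

# Decoding explicitly produces the text that was actually intended.
intended = "sha256=" + digest.decode("ascii")

print(percent, braces, concat_error, intended)
```

The formatting operators coerce silently, concatenation does not; that asymmetry is what the request for a stricter format code is about.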
From dholth at gmail.com Fri Aug 24 20:33:48 2012
From: dholth at gmail.com (Daniel Holth)
Date: Fri, 24 Aug 2012 14:33:48 -0400
Subject: [Python-ideas] format specifier for "not bytes"
In-Reply-To: <5037C1FE.9020509@mrabarnett.plus.com>
References: <5037C1FE.9020509@mrabarnett.plus.com>
Message-ID:

String only would be perfect. I only single out bytes because they are more
like strings than any other type.

From solipsis at pitrou.net Fri Aug 24 20:40:43 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 24 Aug 2012 20:40:43 +0200
Subject: [Python-ideas] format specifier for "not bytes"
References: <5037C1FE.9020509@mrabarnett.plus.com>
Message-ID: <20120824204043.3c4c4524@pitrou.net>

On Fri, 24 Aug 2012 14:33:48 -0400
Daniel Holth wrote:
> String only would be perfect. I only single out bytes because they are more
> like strings than any other type.

You can use concatenation instead of (or in addition to) formatting:

>>> "" + "foo"
'foo'
>>> "" + b"foo"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't convert 'bytes' object to str implicitly
>>> "" + 42
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't convert 'int' object to str implicitly

Regards

Antoine.

--
Software development and contracting: http://pro.pitrou.net

From dholth at gmail.com Fri Aug 24 20:57:08 2012
From: dholth at gmail.com (Daniel Holth)
Date: Fri, 24 Aug 2012 14:57:08 -0400
Subject: [Python-ideas] format specifier for "not bytes"
In-Reply-To: <20120824204043.3c4c4524@pitrou.net>
References: <5037C1FE.9020509@mrabarnett.plus.com> <20120824204043.3c4c4524@pitrou.net>
Message-ID:

Yes, if I wanted to pretend I was using JavaScript.

A string-only formatter might cause problems with translation string /
gettext type objects?
On Fri, Aug 24, 2012 at 2:40 PM, Antoine Pitrou wrote: > On Fri, 24 Aug 2012 14:33:48 -0400 > Daniel Holth wrote: >> String only would be perfect. I only single out bytes because they are more >> like strings than any other type. > > You can use concatenation instead of (or in addition to) formatting: > >>>> "" + "foo" > 'foo' >>>> "" + b"foo" > Traceback (most recent call last): > File "", line 1, in > TypeError: Can't convert 'bytes' object to str implicitly >>>> "" + 42 > Traceback (most recent call last): > File "", line 1, in > TypeError: Can't convert 'int' object to str implicitly > > Regards > > Antoine. > > > -- > Software development and contracting: http://pro.pitrou.net > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From solipsis at pitrou.net Fri Aug 24 21:01:25 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 24 Aug 2012 21:01:25 +0200 Subject: [Python-ideas] format specifier for "not bytes" References: <5037C1FE.9020509@mrabarnett.plus.com> <20120824204043.3c4c4524@pitrou.net> Message-ID: <20120824210125.758f4b9a@pitrou.net> On Fri, 24 Aug 2012 14:57:08 -0400 Daniel Holth wrote: > Yes, if I wanted to pretend I was using JavaScript. ??? > A string-only formatter might cause problems with translation string / > gettext type objects? The question is rather: is it worth it? We certainly don't want to create formatters for every existing use case. Regards Antoine. 
-- Software development and contracting: http://pro.pitrou.net From steve at pearwood.info Fri Aug 24 21:07:31 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 25 Aug 2012 05:07:31 +1000 Subject: [Python-ideas] format specifier for "not bytes" In-Reply-To: References: <5037C1FE.9020509@mrabarnett.plus.com> <20120824204043.3c4c4524@pitrou.net> Message-ID: <5037D0F3.70108@pearwood.info> On 25/08/12 04:57, Daniel Holth wrote: > Yes, if I wanted to pretend I was using JavaScript. I'm not entirely sure what you are responding to here -- the context is lost when you top post like that. I'm *guessing* that you are responding to Antoine's advice to use concatenation. If so, why do you think that concatenation is "pretending" to be using Javascript? It is a perfectly valid operation in Python, and many languages which predate Javascript include concatenation. > A string-only formatter might cause problems with translation string / > gettext type objects? > > On Fri, Aug 24, 2012 at 2:40 PM, Antoine Pitrou wrote: >> On Fri, 24 Aug 2012 14:33:48 -0400 >> Daniel Holth wrote: >>> String only would be perfect. I only single out bytes because they are more >>> like strings than any other type. >> >> You can use concatenation instead of (or in addition to) formatting: >> >>>>> "" + "foo" >> 'foo' >>>>> "" + b"foo" >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: Can't convert 'bytes' object to str implicitly >>>>> "" + 42 >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: Can't convert 'int' object to str implicitly >> >> Regards >> >> Antoine. 
-- Steven From dholth at gmail.com Fri Aug 24 21:21:15 2012 From: dholth at gmail.com (Daniel Holth) Date: Fri, 24 Aug 2012 15:21:15 -0400 Subject: [Python-ideas] format specifier for "not bytes" In-Reply-To: <5037D0F3.70108@pearwood.info> References: <5037C1FE.9020509@mrabarnett.plus.com> <20120824204043.3c4c4524@pitrou.net> <5037D0F3.70108@pearwood.info> Message-ID: On Fri, Aug 24, 2012 at 3:07 PM, Steven D'Aprano wrote: > On 25/08/12 04:57, Daniel Holth wrote: >> >> Yes, if I wanted to pretend I was using JavaScript. > > > I'm not entirely sure what you are responding to here -- the context is > lost when you top post like that. I'm *guessing* that you are responding > to Antoine's advice to use concatenation. I am only trying to say that I like using the string formatting operations and I think I am justified in using them instead of concatenation. I was merely surprised by the implicit bytes to "b'string'" conversion, and would like to be able to turn it off. I do rather enjoy programming in JavaScript, even though its strings do not have a .format() method. > If so, why do you think that concatenation is "pretending" to be using > Javascript? It is a perfectly valid operation in Python, and many languages > which predate Javascript include concatenation. From p.f.moore at gmail.com Fri Aug 24 22:03:25 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 24 Aug 2012 21:03:25 +0100 Subject: [Python-ideas] format specifier for "not bytes" In-Reply-To: References: <5037C1FE.9020509@mrabarnett.plus.com> <20120824204043.3c4c4524@pitrou.net> <5037D0F3.70108@pearwood.info> Message-ID: On 24 August 2012 20:21, Daniel Holth wrote: > I was merely surprised by the implicit bytes to > "b'string'" conversion, and would like to be able to turn it off. The conversion is not really "implicit". It's precisely what the %s (or {!s}) conversion format *explicitly* requests - insert the str() of the supplied argument at this point in the output string. 
See library reference 6.1.3 "Format String Syntax" (I don't know if
there's an equivalent description for % formatting).

If you want to force an argument to be a string, you could always do
something like this:

def must_be_str(s):
    if isinstance(s, str):
        return s
    raise ValueError

x = "The value is {}".format(must_be_str(s))

There's no "only insert a string here, raise an error for other types"
format specifier, largely because formatting is in principle about
*formatting* - converting other types to strings. In practice, most of
my uses of formatting (and I suspect many other people's) are more
about interpolation - inserting chunks of text into templates. For
that application, a stricter form could be more useful, I guess.

I could see value in a {!S} conversion specifier (in the terminology
of library reference 6.1.3 "Format String Syntax") which overrode
__format__ with a conversion function equivalent to must_be_str above.
But I don't know if it would get much use (anyone careful enough to
use it is probably careful enough of their types to not need it).

Also, is it *really* what you want? Did your code accidentally pass
bytes to a {!s} formatter, and yet *never* pass a number and get the
right result? Or conversely, would you be willing to audit all your
conversions to be sure that numbers were never passed, and yet *still*
not be willing to ensure you have no bytes/str confusion? (Although as
your use case was encode/decode dances, maybe bytes really are
sufficiently special in your code - but I'd argue that needing to
address this issue implies that you have some fairly subtle bugs in
your encoding process that you should be fixing before worrying about
this).
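The {!S} idea can be approximated today without changing the language. The sketch below is only an illustration under stated assumptions: `StrictFormatter` is a made-up name, the "S" conversion exists only inside this subclass of the standard `string.Formatter`, and it raises TypeError where must_be_str above raises ValueError. Built-in `str.format` does not accept `!S`.

```python
import string


class StrictFormatter(string.Formatter):
    """Formatter with a hypothetical 'S' conversion: accept str, reject the rest."""

    def convert_field(self, value, conversion):
        if conversion == "S":
            if not isinstance(value, str):
                raise TypeError(
                    "expected str, got {}".format(type(value).__name__))
            return value
        # Defer to the standard handling of None, 's', 'r' and 'a'.
        return super().convert_field(value, conversion)


fmt = StrictFormatter()
ok = fmt.format("sha256={!S}", "abcdef1234")

try:
    fmt.format("sha256={!S}", b"abcdef1234")
    rejected = False
except TypeError:
    rejected = True

print(ok, rejected)
```

Because the check lives in a Formatter subclass, it only protects templates that are rendered through that object, so it is closer to a project-local convention than to the new format code being proposed.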
Paul From dholth at gmail.com Fri Aug 24 22:21:48 2012 From: dholth at gmail.com (Daniel Holth) Date: Fri, 24 Aug 2012 16:21:48 -0400 Subject: [Python-ideas] format specifier for "not bytes" In-Reply-To: References: <5037C1FE.9020509@mrabarnett.plus.com> <20120824204043.3c4c4524@pitrou.net> <5037D0F3.70108@pearwood.info> Message-ID: On Fri, Aug 24, 2012 at 4:03 PM, Paul Moore wrote: > On 24 August 2012 20:21, Daniel Holth wrote: >> I was merely surprised by the implicit bytes to >> "b'string'" conversion, and would like to be able to turn it off. > > The conversion is not really "implicit". It's precisely what the %s > (or {!s}) conversion format *explicitly* requests - insert the str() > of the supplied argument at this point in the output string. See > library reference 6.1.3 "Format String Syntax" (I don't know if > there's an equivalent description for % formatting). > > If you want to force an argument to be a string, you could always do > something like this: > > def must_be_str(s): > if isinstance(s, str): > return s > raise ValueError > > x = "The value is {}".format(must_be_str(s)) > > There's no "only insert a string here, raise an error for other types" > format specifier, largely because formatting is in principle about > *formatting* - converting other types to strings. In practice, most of > my uses of formatting (and I suspect many other people's) is more > about interpolation - inserting chunks of text into templates. For > that application, a stricter form could be more useful, I guess. > > I could see value in a {!S} conversion specifier (in the terminology > of library reference 6.1.3 "Format String Syntax") which overrode > __format__ with a conversion function equivalent to must_be_str above. > But I don't know if it would get much use (anyone careful enough to > use it is probably careful enough of their types to not need it). > > Also, is it *really* what you want? 
Did your code accidentally pass > bytes to a {!s} formatter, and yet *never* pass a number and get the > right result? Or conversely, would you be willing to audit all your > conversions to be sure that numbers were never passed, and yet *still* > not be willing to ensure you have no bytes/str confusion? (Although as > your use case was encode/decode dances, maybe bytes really are > sufficiently special in your code - but I'd argue that needing to > address this issue implies that you have some fairly subtle bugs in > your encoding process that you should be fixing before worrying about > this). Hi Paul! You could probably guess that this is the wheel digital signatures package. All the string formatting arguments (I hope) are now passed to binary() or native() string conversion functions that do less on Python 2.7 than on Python 3. Yes, I would be willing to audit my code to ensure that numbers were never passed. I am already calling .encode() and .decode() on most objects in this pipeline. In my opinion int-when-usually-str is in most cases as likely to be a bug as getting bytes() when you expect str(). Python even has the -bb argument to help with this thing that is almost never the right thing to do. How often does anyone who is not writing a REPL ever expect "%s" % bytes() to produce b''? In this particular case I could also make my life a lot easier by extending the JSON serializer to accept bytes(), but I suppose I would lose the string formatting operations. From p.f.moore at gmail.com Sat Aug 25 00:17:27 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 24 Aug 2012 23:17:27 +0100 Subject: [Python-ideas] format specifier for "not bytes" In-Reply-To: References: <5037C1FE.9020509@mrabarnett.plus.com> <20120824204043.3c4c4524@pitrou.net> <5037D0F3.70108@pearwood.info> Message-ID: On 24 August 2012 21:21, Daniel Holth wrote: > Hi Paul! You could probably guess that this is the wheel digital > signatures package. 
All the string formatting arguments (I hope) are > now passed to binary() or native() string conversion functions that do > less on Python 2.7 than on Python 3. One point that this raises. Any such "string-only" format spec would only be available in Python 3.4+, and almost certainly only in format(). So if you're interested in something that works across Python 2 and 3, you wouldn't be able to use it anyway (and something like the must_be_str function is probably your best bet). On the other hand, if you're targeting 3.4+ only, the bytes/string code is probably cleaner (that being a lot of the point of the Python 3 exercise :-)) and so the need for a string-only spec may be a lot less. I dunno. I haven't hit a lot of encoding type issues myself, so I don't have much background in what might help. OTOH, what I *have* found is that the change in thinking that Python 3's approach pushes onto me (encode/decode at the edges and use str consistently internally, plus never gloss over the fact that you have to know an encoding to convert bytes <-> str) fixes a lot of "problems" I thought I was having... Paul. From dholth at gmail.com Sat Aug 25 01:12:09 2012 From: dholth at gmail.com (Daniel Holth) Date: Fri, 24 Aug 2012 19:12:09 -0400 Subject: [Python-ideas] format specifier for "not bytes" In-Reply-To: References: <5037C1FE.9020509@mrabarnett.plus.com> <20120824204043.3c4c4524@pitrou.net> <5037D0F3.70108@pearwood.info> Message-ID: On Aug 24, 2012 6:17 PM, "Paul Moore" wrote: > > On 24 August 2012 21:21, Daniel Holth wrote: > > Hi Paul! You could probably guess that this is the wheel digital > > signatures package. All the string formatting arguments (I hope) are > > now passed to binary() or native() string conversion functions that do > > less on Python 2.7 than on Python 3. > > One point that this raises. Any such "string-only" format spec would > only be available in Python 3.4+, and almost certainly only in > format(). 
> So if you're interested in something that works across
> Python 2 and 3, you wouldn't be able to use it anyway (and something
> like the must_be_str function is probably your best bet). On the other
> hand, if you're targeting 3.4+ only, the bytes/string code is probably
> cleaner (that being a lot of the point of the Python 3 exercise :-))
> and so the need for a string-only spec may be a lot less.
>
> I dunno. I haven't hit a lot of encoding type issues myself, so I
> don't have much background in what might help. OTOH, what I *have*
> found is that the change in thinking that Python 3's approach pushes
> onto me (encode/decode at the edges and use str consistently
> internally, plus never gloss over the fact that you have to know an
> encoding to convert bytes <-> str) fixes a lot of "problems" I thought
> I was having...

That's the core of it. You can convert bytes to string without knowing
the encoding. "%s" % bytes. But instead of failing or converting from
ascii it does something totally useless. I argue that this is a bug, and
an alternative 'anything except bytes' should be available. Not so hot
on the competing only-str idea.

From ncoghlan at gmail.com Sat Aug 25 01:16:18 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 25 Aug 2012 09:16:18 +1000
Subject: [Python-ideas] format specifier for "not bytes"
In-Reply-To:
References: <5037C1FE.9020509@mrabarnett.plus.com> <20120824204043.3c4c4524@pitrou.net> <5037D0F3.70108@pearwood.info>
Message-ID:

A couple of people at PyCon Au mentioned running into this kind of issue
with Python 3. It relates to the fact that:

1. String formatting is *coercive* by default
2. Absolutely everything, including bytes objects can be coerced to a
   string, due to the repr() fallback

So it's relatively easy to miss a decode or encode operation, and end up
interpolating an unwanted "b" prefix and some quotes.

For existing versions, I think the easiest answer is to craft a regex that
matches bytes object repr's and advise people to check that it *doesn't*
match their formatted strings in their unit tests.

For 3.4+ a non-coercive string interpolation format code may be desirable.

Cheers,
Nick.

--
Sent from my phone, thus the relative brevity :)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dreamingforward at gmail.com Sat Aug 25 02:10:23 2012
From: dreamingforward at gmail.com (Mark Adam)
Date: Fri, 24 Aug 2012 19:10:23 -0500
Subject: [Python-ideas] inheriting docstrings and mutable docstings for classes
In-Reply-To:
References:
Message-ID:

On Thu, Jun 9, 2011 at 8:05 PM, Nick Coghlan wrote:
> On Fri, Jun 10, 2011 at 9:54 AM, Eric Snow wrote:
> > I'm +1 on having __doc__ be inherited.
>
> -1. Subclasses are not the same thing as the original class so
> docstring inheritance should be requested explicitly.

Hmm, subclasses are supposed to represent an IS-A relationship in my
oldschool OOP books, typically a specialization of a more general (i.e.
abstract) outer class. The multiple-inheritance case does make things a
bit more sloppy, but then this problem has already been resolved by the
BDFL via MRO; the same could probably apply with docstrings, with the
user updating or using a blank docstring when that general rule doesn't
work.

In any case, I found myself wanting this auto-inheritance for easier
testing with doctest. I don't want my subclasses to mess up invariants in
my parent classes, and if the doctests were inherited this would be easy
to check.

Just my (late) 2 cents worth after examining the current python issues
list.

mark

Sorry for any formatting problems, this is a forward after accidentally
replying only to ncoghlan.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From python at mrabarnett.plus.com Sat Aug 25 02:44:17 2012 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 25 Aug 2012 01:44:17 +0100 Subject: [Python-ideas] format specifier for "not bytes" In-Reply-To: References: <5037C1FE.9020509@mrabarnett.plus.com> <20120824204043.3c4c4524@pitrou.net> <5037D0F3.70108@pearwood.info> Message-ID: <50381FE1.8030308@mrabarnett.plus.com> On 25/08/2012 00:12, Daniel Holth wrote: > > On Aug 24, 2012 6:17 PM, "Paul Moore" > wrote: > > > > On 24 August 2012 21:21, Daniel Holth > wrote: > > > Hi Paul! You could probably guess that this is the wheel digital > > > signatures package. All the string formatting arguments (I hope) are > > > now passed to binary() or native() string conversion functions that do > > > less on Python 2.7 than on Python 3. > > > > One point that this raises. Any such "string-only" format spec would > > only be available in Python 3.4+, and almost certainly only in > > format(). So if you're interested in something that works across > > Python 2 and 3, you wouldn't be able to use it anyway (and something > > like the must_be_str function is probably your best bet). On the other > > hand, if you're targeting 3.4+ only, the bytes/string code is probably > > cleaner (that being a lot of the point of the Python 3 exercise :-)) > > and so the need for a string-only spec may be a lot less. > > > > I dunno. I haven't hit a lot of encoding type issues myself, so I > > don't have much background in what might help. OTOH, what I *have* > > found is that the change in thinking that Python 3's approach pushes > > onto me (encode/decode at the edges and use str consistently > > internally, plus never gloss over the fact that you have to know an > > encoding to convert bytes <-> str) fixes a lot of "problems" I thought > > I was having... > > That's the core of it. You can convert bytes to string without knowing > the encoding. "%s" % bytes. 
But instead of failing or converting from > ascii it does something totally useless. I argue that this is a bug, and > an alternative 'anything except bytes' should be available. Not so hot > on the competing only-str idea. > "Totally useless"? Is it any more "useless" than what happens to lists, dicts, sets, etc? From steve at pearwood.info Sat Aug 25 02:54:15 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 25 Aug 2012 10:54:15 +1000 Subject: [Python-ideas] inheriting docstrings and mutable docstings for classes In-Reply-To: References: Message-ID: <50382237.3030601@pearwood.info> On 25/08/12 10:10, Mark Adam wrote: > In any case, I found myself wanting this auto-inheritance [of docstrings] > for easier testing with doctest. I don't want my subclasses to mess up > invariants in my parent classes, and if the doctests were inherited this > would be easy to check. > > Just my (late) 2 cents worth after examining the current python issues list. I have run into exactly that issue myself. But then I realised that this may not work in practice, and in fact could be outright misleading. The problem is that my class docstrings probably refer to the class by name: class Spam: def get(self, n): """Get n chunks of lovely spam. >>> Spam().get(4) 'Spam spam spam LOVELY SPAM!!!' """ class Ham(Spam): pass If Ham.get inherits the docstring, I may be fooled into thinking that I've tested Ham.get when all I've done is test Spam.get twice. A better solution to this use-case might be a class decorator which copies docstrings from superclasses (if they aren't explicitly set in the subclass). 
The decorator could optionally apply a bunch of string transformations to the docstrings: @copy_docstrings({'spam': 'ham'}, case_mangling=True) class Ham(Spam): pass -- Steven From ericsnowcurrently at gmail.com Sat Aug 25 03:34:19 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 24 Aug 2012 19:34:19 -0600 Subject: [Python-ideas] inheriting docstrings and mutable docstings for classes In-Reply-To: <50382237.3030601@pearwood.info> References: <50382237.3030601@pearwood.info> Message-ID: On Fri, Aug 24, 2012 at 6:54 PM, Steven D'Aprano wrote: > On 25/08/12 10:10, Mark Adam wrote: > >> In any case, I found myself wanting this auto-inheritance [of docstrings] >> >> for easier testing with doctest. I don't want my subclasses to mess up >> invariants in my parent classes, and if the doctests were inherited this >> would be easy to check. >> >> Just my (late) 2 cents worth after examining the current python issues >> list. > > > > I have run into exactly that issue myself. But then I realised that this may > not work in practice, and in fact could be outright misleading. > > The problem is that my class docstrings probably refer to the class by name: > > class Spam: > def get(self, n): > """Get n chunks of lovely spam. > > >>> Spam().get(4) > 'Spam spam spam LOVELY SPAM!!!' > """ > > class Ham(Spam): > pass > > > If Ham.get inherits the docstring, I may be fooled into thinking that I've > tested Ham.get when all I've done is test Spam.get twice. > > A better solution to this use-case might be a class decorator which copies > docstrings from superclasses (if they aren't explicitly set in the > subclass). > The decorator could optionally apply a bunch of string transformations to > the docstrings: > > @copy_docstrings({'spam': 'ham'}, case_mangling=True) > class Ham(Spam): > pass Yeah, see http://bugs.python.org/issue15731. 
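A minimal sketch of the class-decorator idea described above, for readers who want something concrete. It is only an illustration under stated assumptions: the name copy_docstrings and its replacements argument follow the example in the quoted message, the case_mangling option is left out, and nothing here is taken from the linked issue. It copies a docstring only onto methods that the subclass defines without one.

```python
import inspect


def copy_docstrings(replacements=None):
    """Hypothetical decorator factory: fill in missing method docstrings
    from the first base class (in MRO order) that documents the same name,
    applying simple substring replacements to the copied text."""
    replacements = replacements or {}

    def decorate(cls):
        for name, member in vars(cls).items():
            # Only plain functions defined on this class, and only when
            # they have no docstring of their own.
            if not inspect.isfunction(member) or member.__doc__:
                continue
            for base in cls.__mro__[1:]:
                inherited = getattr(base, name, None)
                if inherited is None:
                    continue
                doc = getattr(inherited, "__doc__", None)
                if doc:
                    for old, new in replacements.items():
                        doc = doc.replace(old, new)
                    member.__doc__ = doc
                    break
        return cls

    return decorate


class Spam:
    def get(self, n):
        """Get n chunks of lovely spam."""
        return "spam " * n


@copy_docstrings({"spam": "ham"})
class Ham(Spam):
    def get(self, n):  # overrides Spam.get without its own docstring
        return "ham " * n


print(Ham.get.__doc__)
```

Note that a subclass which does not override a method at all already exposes the parent's function object, docstring included, so the decorator only matters for overridden methods.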
-eric

From barry at python.org Mon Aug 27 16:34:38 2012
From: barry at python.org (Barry Warsaw)
Date: Mon, 27 Aug 2012 10:34:38 -0400
Subject: [Python-ideas] format specifier for "not bytes"
References: <5037C1FE.9020509@mrabarnett.plus.com> <20120824204043.3c4c4524@pitrou.net> <5037D0F3.70108@pearwood.info>
Message-ID: <20120827103438.093aa3f4@resist.wooz.org>

On Aug 25, 2012, at 09:16 AM, Nick Coghlan wrote:

>A couple of people at PyCon Au mentioned running into this kind of issue
>with Python 3. It relates to the fact that:
>1. String formatting is *coercive* by default
>2. Absolutely everything, including bytes objects can be coerced to a
>string, due to the repr() fallback
>
>So it's relatively easy to miss a decode or encode operation, and end up
>interpolating an unwanted "b" prefix and some quotes.
>
>For existing versions, I think the easiest answer is to craft a regex that
>matches bytes object repr's and advise people to check that it *doesn't*
>match their formatted strings in their unit tests.
>
>For 3.4+ a non-coercive string interpolation format code may be desirable.

Or maybe just one that calls __str__ without a __repr__ fallback?

FWIW, the representation of bytes with the leading b'' does cause problems
when trying to write doctests that work in both Python 2 and 3.

http://www.wefearchange.org/2012/01/python-3-porting-fun-redux.html

It might be a bit nicer to be able to write:

    >>> print('{:S}'.format(somebytes))

Of course, in the bytes case, its __str__() would have to be rewritten to
not call its __repr__() explicitly.

It's probably not worth it just to save from writing a small helper
function, but it would be useful in eliminating a surprising gotcha. The
other option is of course just to make doctests smarter[*].

Cheers,
-Barry

[*] Doctest haters need not respond snarkily. :)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From mikegraham at gmail.com Wed Aug 29 00:26:04 2012 From: mikegraham at gmail.com (Mike Graham) Date: Tue, 28 Aug 2012 18:26:04 -0400 Subject: [Python-ideas] Verbose traceback formatting Message-ID: It's possible to give a lot more on error than the default traceback gives you. I propose that Python should ship a more verbose formatter and a command line switch to use it. Here's an example of IPython's verbose formatter. I wrote a buggy program: > def f(a): > x = a * 4 > y = a - 4 > return x / y > > def main(): > for i in xrange(100): > f(i) > > main() and then ran it in IPython with verbose tracebacks and got the following output: > ZeroDivisionError Traceback (most recent call last) > > /home/mike/foo.py in () > 8 f(i) > 9 > ---> 10 main() > global main = > 11 > 12 > > /home/mike/foo.py in main() > 6 def main(): > 7 for i in xrange(100): > ----> 8 f(i) > global f = > i = 4 > 9 > 10 main() > > /home/mike/foo.py in f(a=4) > 2 x = a * 4 > 3 y = a - 4 > ----> 4 return x / y > x = 16 > y = 0 > 5 > 6 def main(): > > ZeroDivisionError: integer division or modulo by zero This is very handy! The reprs of all locals are input so I can see what the values of a, x, and y were when I had my error and there are a few lines of code on either side of the line that matters to help me get oriented. The former feature is the more powerful one, although enabling this by default is a bad idea; (first and foremost, this can be a security hazard). I can't count how many trips into pdb this would have saved me. I think having this feature be part of Python itself would be very helpful to new learners and to those helping them. 
I constantly deal with learners seeking help who are unable to clearly provide the actual values and types of the objects in the code they're having trouble with; it would be nice simply to say, "Show me a verbose traceback" and might even help them to debug their code without assistance. Mike From phd at phdru.name Wed Aug 29 02:05:38 2012 From: phd at phdru.name (Oleg Broytman) Date: Wed, 29 Aug 2012 04:05:38 +0400 Subject: [Python-ideas] Verbose traceback formatting In-Reply-To: References: Message-ID: <20120829000538.GA19926@iskra.aviel.ru> Hi! On Tue, Aug 28, 2012 at 06:26:04PM -0400, Mike Graham wrote: > It's possible to give a lot more on error than the default traceback > gives you. I propose that Python should ship a more verbose formatter Good idea! > and a command line switch to use it. And an environment variable, as usual: PYTHONTRACEBACK=verbose. > Here's an example of IPython's verbose formatter. I wrote a buggy program: > > > def f(a): > > x = a * 4 > > y = a - 4 > > return x / y > > > > def main(): > > for i in xrange(100): > > f(i) > > > > main() > > > and then ran it in IPython with verbose tracebacks and got the following output: > > > ZeroDivisionError Traceback (most recent call last) > > > > /home/mike/foo.py in () > > 8 f(i) > > 9 > > ---> 10 main() > > global main = > > 11 > > 12 > > > > /home/mike/foo.py in main() > > 6 def main(): > > 7 for i in xrange(100): > > ----> 8 f(i) > > global f = > > i = 4 > > 9 > > 10 main() > > > > /home/mike/foo.py in f(a=4) > > 2 x = a * 4 > > 3 y = a - 4 > > ----> 4 return x / y > > x = 16 > > y = 0 > > 5 > > 6 def main(): > > > > ZeroDivisionError: integer division or modulo by zero > > > > This is very handy! 100% agree! py.test produces even more verbose tracebacks and I found them very helpful in debugging. Here is a short example and below is much bigger one. 
___________________________ test_transaction_delete ____________________________ close = False def test_transaction_delete(close=False): if not supports('transactions'): return setupClass(TestSOTrans) trans = TestSOTrans._connection.transaction() try: TestSOTrans(name='bob') bIn = TestSOTrans.byName('bob', connection=trans) bIn.destroySelf() bOut = TestSOTrans.select(TestSOTrans.q.name=='bob') > assert bOut.count() == 1 E assert 0 == 1 E + where 0 = .count() test_transactions.py:65: AssertionError Longer and more verbose: _______________________________ test_transaction _______________________________ def test_transaction(): if not supports('transactions'): return > setupClass(TestSOTrans) test_transactions.py:17: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ soClasses = [] force = False def setupClass(soClasses, force=False): """ Makes sure the classes have a corresponding and correct table. This won't recreate the table if it already exists. It will check that the table is properly defined (in case you change your table definition). You can provide a single class or a list of classes; if a list then classes will be created in the order you provide, and destroyed in the opposite order. So if class A depends on class B, then do setupClass([B, A]) and B won't be destroyed or cleared until after A is destroyed or cleared. If force is true, then the database will be recreated no matter what. """ global hub if not isinstance(soClasses, (list, tuple)): soClasses = [soClasses] connection = getConnection() for soClass in soClasses: ## This would be an alternate way to register connections... 
#try: # hub #except NameError: # hub = sqlobject.dbconnection.ConnectionHub() #soClass._connection = hub #hub.threadConnection = connection #hub.processConnection = connection soClass._connection = connection > installOrClear(soClasses, force=force) dbtest.py:83: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ cls = soClasses = [] force = False @classmethod def installOrClear(cls, soClasses, force=False): cls.setup() reversed = list(soClasses)[:] reversed.reverse() # If anything needs to be dropped, they all must be dropped # But if we're forcing it, then we'll always drop if force: any_drops = True else: any_drops = False for soClass in reversed: table = soClass.sqlmeta.table > if not soClass._connection.tableExists(table): dbtest.py:140: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = tableName = 'test_so_trans' def tableExists(self, tableName): result = self.queryOne("SELECT COUNT(relname) FROM pg_class WHERE relname = %s" > % self.sqlrepr(tableName)) ../postgres/pgconnection.py:235: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = s = "SELECT COUNT(relname) FROM pg_class WHERE relname = 'test_so_trans'" def queryOne(self, s): > return self._runWithConnection(self._queryOne, s) ../dbconnection.py:457: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = meth = > def _runWithConnection(self, meth, *args): > conn = self.getConnection() ../dbconnection.py:325: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = def getConnection(self): self._poolLock.acquire() try: if not self._pool: > conn = self.makeConnection() ../dbconnection.py:336: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = def makeConnection(self): try: if self.use_dsn: conn = self.module.connect(self.dsn) else: conn = self.module.connect(**self.dsn_dict) except 
self.module.OperationalError, e: > raise OperationalError("%s; used connection string %r" % (e, self.dsn)) E OperationalError: could not connect to server: No such file or directory E Is the server running locally and accepting E connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"? E ; used connection string 'dbname=test' ../postgres/pgconnection.py:142: OperationalError Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From masklinn at masklinn.net Wed Aug 29 07:15:26 2012 From: masklinn at masklinn.net (Masklinn) Date: Wed, 29 Aug 2012 07:15:26 +0200 Subject: [Python-ideas] Verbose traceback formatting In-Reply-To: References: Message-ID: On 2012-08-29, at 00:26 , Mike Graham wrote: > It's possible to give a lot more on error than the default traceback > gives you. I propose that Python should ship a more verbose formatter It already does: http://docs.python.org/py3k/library/cgitb.html > cat > test.py import cgitb cgitb.enable(format='text') def a(): b() def b(): c() def c(): assert False, "blow up" a() ^C > python3 test.py AssertionError Python 3.2.3: python Wed Aug 29 07:08:14 2012 A problem occurred in a Python script. Here is the sequence of function calls leading up to the error, in the order they occurred. test.py in () 7 c() 8 def c(): 9 assert False, "blow up" 10 11 a() a = test.py in a() 3 4 def a(): 5 b() 6 def b(): 7 c() global b = test.py in b() 5 b() 6 def b(): 7 c() 8 def c(): 9 assert False, "blow up" global c = test.py in c() 7 c() 8 def c(): 9 assert False, "blow up" 10 11 a() AssertionError: blow up __cause__ = None __class__ = __context__ = None __delattr__ = __dict__ = {} __doc__ = 'Assertion failed.' 
__eq__ = __format__ = __ge__ = __getattribute__ = __gt__ = __hash__ = __init__ = __le__ = __lt__ = __ne__ = __new__ = __reduce__ = __reduce_ex__ = __repr__ = __setattr__ = __setstate__ = __sizeof__ = __str__ = __subclasshook__ = __traceback__ = args = ('blow up',) with_traceback = The above is a description of an error in a Python program. Here is the original traceback: Traceback (most recent call last): File "test.py", line 11, in a() File "test.py", line 5, in a b() File "test.py", line 7, in b c() File "test.py", line 9, in c assert False, "blow up" AssertionError: blow up > and a command line switch to use it. Adding the hook on `python -mcgitb script`? In the style of -mpdb? From masklinn at masklinn.net Wed Aug 29 08:11:58 2012 From: masklinn at masklinn.net (Masklinn) Date: Wed, 29 Aug 2012 08:11:58 +0200 Subject: [Python-ideas] Verbose traceback formatting In-Reply-To: References: Message-ID: <0B1157CC-B744-4AC6-8910-760D5F174ACF@masklinn.net> On 2012-08-29, at 07:15 , Masklinn wrote: > >> and a command line switch to use it. > > Adding the hook on `python -mcgitb script`? In the style of -mpdb? Thinking about it more, this could be a nice starting point for a `traceback2` project: * Add stack frame formatting (by name, format string and function) to traceback's functions * Add formatting exechook handling (and -m switch) to traceback * Move/reimplement the meat of cgitb using traceback's stack frame formats * Maybe move the `html` formatter to wsgiref and add a trace-formatting middleware which could be dropped in about any WSGI stack From ned at nedbatchelder.com Wed Aug 29 14:53:43 2012 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 29 Aug 2012 08:53:43 -0400 Subject: [Python-ideas] Verbose traceback formatting In-Reply-To: References: Message-ID: <503E10D7.5000502@nedbatchelder.com> On 8/29/2012 1:15 AM, Masklinn wrote: >> and a command line switch to use it. > Adding the hook on `python -mcgitb script`? In the style of -mpdb? 
This could also be a use for the proposed PYTHON_RUN_FIRST mechanism (http://bugs.python.org/issue14803). Instead of expecting cgitb to know how to be the main and run Python files, let the user specify a few lines of Python to run before their actual program. This has other uses as well, as outlined in the ticket. --Ned. From mikegraham at gmail.com Wed Aug 29 15:13:45 2012 From: mikegraham at gmail.com (Mike Graham) Date: Wed, 29 Aug 2012 09:13:45 -0400 Subject: [Python-ideas] Verbose traceback formatting In-Reply-To: <0B1157CC-B744-4AC6-8910-760D5F174ACF@masklinn.net> References: <0B1157CC-B744-4AC6-8910-760D5F174ACF@masklinn.net> Message-ID: On Wed, Aug 29, 2012 at 1:15 AM, Masklinn wrote: > It already does: http://docs.python.org/py3k/library/cgitb.html Wow, nice! I vaguely knew cgitb existed as an HTML formatter, but I didn't realize how much information it showed. On Wed, Aug 29, 2012 at 2:11 AM, Masklinn wrote: > * Maybe move the `html` formatter to wsgiref and add a trace-formatting > middleware which could be dropped in about any WSGI stack On an orthogonal note, I think it may be a bad idea to take steps that seem to encourage this sort of thing in a web app. Although there is some tradition of displaying stacktraces on errors on the web, this a) provides information the user shouldn't worry about and b) can introduce security holes (and has many times). Printing out locals, the problem only gets worse; it's easy to imagine a password or private data getting displayed on screen or transmitted via plaintext. It's of course possible to use this sort of tooling and turn it off in production, but it's not really necessary and I think it is a bad idea to make it too easy. 
Mike

From masklinn at masklinn.net Wed Aug 29 15:34:35 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 29 Aug 2012 15:34:35 +0200
Subject: [Python-ideas] Verbose traceback formatting
In-Reply-To:
References: <0B1157CC-B744-4AC6-8910-760D5F174ACF@masklinn.net>
Message-ID:

On 29 août 2012, at 15:13, Mike Graham wrote:

> On Wed, Aug 29, 2012 at 1:15 AM, Masklinn wrote:
>> It already does: http://docs.python.org/py3k/library/cgitb.html
>
> Wow, nice! I vaguely knew cgitb existed as an HTML formatter, but I
> didn't realize how much information it showed.
>
> On Wed, Aug 29, 2012 at 2:11 AM, Masklinn wrote:
>> * Maybe move the `html` formatter to wsgiref and add a trace-formatting
>> middleware which could be dropped in about any WSGI stack
>
> On an orthogonal note, I think it may be a bad idea to take steps that
> seem to encourage this sort of thing in a web app. Although there is
> some tradition of displaying stacktraces on errors on the web, this a)
> provides information the user shouldn't worry about and b) can
> introduce security holes (and has many times). Printing out locals,
> the problem only gets worse; it's easy to imagine a password or
> private data getting displayed on screen or transmitted via plaintext.
> It's of course possible to use this sort of tooling and turn it off in
> production, but it's not really necessary and I think it is a bad idea
> to make it too easy.

I don't think having middleware which needs to be added to the stack and
configured makes things "too easy". Most frameworks make it way easier
via a simple flag (in a settings file for django, and passed to .run for
flask).

In fact, once you know of the feature's existence I'd argue a wsgi
middleware is still way harder than "cgitb.enable()", and way easier
*not* to enable in production.
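
For readers who want to try the "verbose traceback" idea from this thread without cgitb or IPython, here is a minimal sketch. It assumes a later Python than the ones discussed above (traceback.TracebackException and its capture_locals option arrived in Python 3.5), and the helper names verbose_traceback and demo are made up for illustration. The buggy program is the one from the original post, with range in place of xrange.

```python
import traceback


def f(a):
    x = a * 4
    y = a - 4
    return x / y          # fails when a == 4, because y == 0


def main():
    for i in range(100):
        f(i)


def verbose_traceback(exc):
    """Format exc with the repr of each frame's locals included."""
    te = traceback.TracebackException.from_exception(exc, capture_locals=True)
    return "".join(te.format())


def demo():
    # Catch inside a function so that only small frames are reported,
    # not the whole module namespace.
    try:
        main()
    except ZeroDivisionError as exc:
        return verbose_traceback(exc)


report = demo()
print(report)
```

The security concern raised in the thread applies here as well: the report contains the repr of every local in every frame, so it is better suited to a developer's terminal than to anything shown to end users.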
From carlopires at gmail.com Wed Aug 29 15:50:18 2012
From: carlopires at gmail.com (Carlo Pires)
Date: Wed, 29 Aug 2012 10:50:18 -0300
Subject: [Python-ideas] Unpack of sequences
Message-ID:

Hi,

I was just wondering why unpack of sequences did not follow same behavior
of functions parameters. I mean:

    first, *rest = 'a b c'.split()

should work in python, why doesn't it?

--
Carlo Pires

From masklinn at masklinn.net Wed Aug 29 15:51:48 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 29 Aug 2012 15:51:48 +0200
Subject: [Python-ideas] Verbose traceback formatting
In-Reply-To: <503E10D7.5000502@nedbatchelder.com>
References: <503E10D7.5000502@nedbatchelder.com>
Message-ID: <6CE7C2AF-E61C-4200-97D9-B9BC3424267B@masklinn.net>

On 29 août 2012, at 14:53, Ned Batchelder wrote:

> On 8/29/2012 1:15 AM, Masklinn wrote:
>>> and a command line switch to use it.
>> Adding the hook on `python -mcgitb script`? In the style of -mpdb?
> This could also be a use for the proposed PYTHON_RUN_FIRST mechanism (http://bugs.python.org/issue14803). Instead of expecting cgitb to know how to be the main and run Python files, let the user specify a few lines of Python to run before their actual program. This has other uses as well, as outlined in the ticket.

Indeed, it could be.

From masklinn at masklinn.net Wed Aug 29 15:54:39 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 29 Aug 2012 15:54:39 +0200
Subject: [Python-ideas] Unpack of sequences
In-Reply-To:
References:
Message-ID:

On 29 août 2012, at 15:50, Carlo Pires wrote:

> Hi,
>
> I was just wondering why unpack of sequences did not follow same behavior of functions parameters. I mean:
>
> first, *rest = 'a b c'.split()
>
> should work in python, why doesn't it?

It does work in Python 3, following Python 3's improved unpacking semantics.
Demo: http://pythonic.pocoo.org/2008/2/17/new-in-python-3:-extended-unpacking Pep: http://www.python.org/dev/peps/pep-3132/ From ned at nedbatchelder.com Wed Aug 29 15:57:30 2012 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 29 Aug 2012 09:57:30 -0400 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: Message-ID: <503E1FCA.7050309@nedbatchelder.com> On 8/29/2012 9:50 AM, Carlo Pires wrote: > Hi, > > I was just wondering why unpack of sequences did not follow same > behavior of functions parameters. I mean: > > first, *rest = 'a b c'.split() > > should work in python, why doesn't it? > It does in Python 3: Python 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> first, *rest = 'a b c'.split() >>> first 'a' >>> rest ['b', 'c'] >>> --Ned. From guido at python.org Wed Aug 29 17:03:49 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 29 Aug 2012 08:03:49 -0700 Subject: [Python-ideas] Unpack of sequences In-Reply-To: <503E1FCA.7050309@nedbatchelder.com> References: <503E1FCA.7050309@nedbatchelder.com> Message-ID: On Wed, Aug 29, 2012 at 6:57 AM, Ned Batchelder wrote: > On 8/29/2012 9:50 AM, Carlo Pires wrote: >> >> Hi, >> >> I was just wondering why unpack of sequences did not follow same behavior >> of functions parameters. I mean: >> >> first, *rest = 'a b c'.split() >> >> should work in python, why doesn't it? >> > > It does in Python 3: > > Python 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)] on > win32 > Type "help", "copyright", "credits" or "license" for more information. > >>>> first, *rest = 'a b c'.split() >>>> first > 'a' >>>> rest > ['b', 'c'] >>>> Note, however, that putting too much faith in the analogy between assignment and function calls leads to disappointment. E.g. 
there's a big difference between

    a = 42

and

    a = 42,

while there is no difference between

    f(42)

and

    f(42,)

Also of course assignment has no equivalent to keyword parameters, nor
does it (currently) allow a "lone star" -- although it would be handy
to be able to say

    a, b, * = xs

as a shorthand for

    a, b, *_ = xs
    del _

Also of note: after

    a, b, *z = xs

z is a list, whereas in

    def foo(a, b, *z): ...

z will be a tuple. (The latter is arguably a design mistake.) Plus in
Python 3 you can't say

    def foo(a, (b, c), d): ...

whereas the tuple unpacking even supports nested stars:

    a, (b, c, *d), e, *f = [1, range(10), 2, 3, 4, 5]

PS. Asking "why does Python not do X" is generally considered a leading
question.

--
--Guido van Rossum (python.org/~guido)

From masklinn at masklinn.net Wed Aug 29 17:45:31 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 29 Aug 2012 17:45:31 +0200
Subject: [Python-ideas] Unpack of sequences
In-Reply-To:
References: <503E1FCA.7050309@nedbatchelder.com>
Message-ID: <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net>

On 2012-08-29, at 17:03 , Guido van Rossum wrote:
>
> Also of course assignment has no equivalent to keyword parameters

I've always thought it would be a rather neat way to unpack
dictionaries, instead of doing it by hand or abusing `itemgetter` to get
values in a known order.

From dholth at gmail.com Wed Aug 29 18:08:11 2012
From: dholth at gmail.com (Daniel Holth)
Date: Wed, 29 Aug 2012 12:08:11 -0400
Subject: [Python-ideas] hook zipfile.ZipExtFile to check secure hash sums
Message-ID:

I am checking the sha256 sums of all the files in a zip archive as it
is being extracted by overriding ZipExtFile._update_crc, but it is
inconvenient. It would be nice to have a hook, for example the
ZipExtFile constructor could be a property of ZipFile and conveniently
replaced with a ZipExtFile subclass.
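
As a point of comparison for the request above, here is a minimal sketch of hashing archive members using only public zipfile calls. It does not provide the hook being asked for, and it does not touch the private _update_crc method; it simply streams each member through hashlib while reading it. The function name sha256_of_members and the in-memory test archive are assumptions made for the example.

```python
import hashlib
import io
import zipfile


def sha256_of_members(zf, chunk_size=65536):
    """Return {member name: hex sha256}, reading each member in chunks."""
    digests = {}
    for info in zf.infolist():
        h = hashlib.sha256()
        with zf.open(info) as member:
            for chunk in iter(lambda: member.read(chunk_size), b""):
                h.update(chunk)
        digests[info.filename] = h.hexdigest()
    return digests


# Build a tiny archive in memory so the example is self-contained.
payload = b"hello world\n"
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", payload)

with zipfile.ZipFile(buf) as zf:
    digests = sha256_of_members(zf)

expected = hashlib.sha256(payload).hexdigest()
print(digests["hello.txt"] == expected)
```

The cost of this approach is a second pass when the data also has to be written to disk; writing each chunk to the destination file inside the same loop would avoid that, which is roughly what a subclass hook would make convenient.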
From mikegraham at gmail.com Wed Aug 29 18:10:22 2012 From: mikegraham at gmail.com (Mike Graham) Date: Wed, 29 Aug 2012 12:10:22 -0400 Subject: [Python-ideas] Unpack of sequences In-Reply-To: <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> Message-ID: On Wed, Aug 29, 2012 at 11:45 AM, Masklinn wrote: > On 2012-08-29, at 17:03 , Guido van Rossum wrote: >> >> Also of course assignment has no equivalent to keyword parameters > > I've always thought it would be a rather neat way to unpack > dictionaries, instead of doing it by hand or abusing `itemgetter` to get > values in a known order. Do you have a suggestion of a nice syntax for a thing to unpack mappings (or to unpack things by attributes)? Mike From cesare.di.mauro at gmail.com Wed Aug 29 18:17:30 2012 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Wed, 29 Aug 2012 18:17:30 +0200 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> Message-ID: 2012/8/29 Mike Graham > On Wed, Aug 29, 2012 at 11:45 AM, Masklinn wrote: > > On 2012-08-29, at 17:03 , Guido van Rossum wrote: > >> > >> Also of course assignment has no equivalent to keyword parameters > > > > I've always thought it would be a rather neat way to unpack > > dictionaries, instead of doing it by hand or abusing `itemgetter` to get > > values in a known order. > > Do you have a suggestion of a nice syntax for a thing to unpack > mappings (or to unpack things by attributes)? > > Mike > a: x, b: y, c: z = {'a': 'x', 'b': 'y', 'c': 'z'} Cesare -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mikegraham at gmail.com Wed Aug 29 18:24:05 2012 From: mikegraham at gmail.com (Mike Graham) Date: Wed, 29 Aug 2012 12:24:05 -0400 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> Message-ID: On Wed, Aug 29, 2012 at 12:17 PM, Cesare Di Mauro wrote: > a: x, b: y, c: z = {'a': 'x', 'b': 'y', 'c': 'z'} > > Cesare "a": x, "b": y, "c": z = {'a': 'x', 'b': 'y', 'c': 'z'} or {"a": x, "b": y, "c": z} = {'a': 'x', 'b': 'y', 'c': 'z'} would admit non-string keys. IMO, a more useful thing would be attribute-based unpacking--I feel like I do that a ton more often. Mike From mal at egenix.com Wed Aug 29 18:41:02 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 29 Aug 2012 18:41:02 +0200 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> Message-ID: <503E461E.4070801@egenix.com> Cesare Di Mauro wrote: > 2012/8/29 Mike Graham > >> On Wed, Aug 29, 2012 at 11:45 AM, Masklinn wrote: >>> On 2012-08-29, at 17:03 , Guido van Rossum wrote: >>>> >>>> Also of course assignment has no equivalent to keyword parameters >>> >>> I've always thought it would be a rather neat way to unpack >>> dictionaries, instead of doing it by hand or abusing `itemgetter` to get >>> values in a known order. >> >> Do you have a suggestion of a nice syntax for a thing to unpack >> mappings (or to unpack things by attributes)? >> >> Mike >> > > a: x, b: y, c: z = {'a': 'x', 'b': 'y', 'c': 'z'} Would this assign 'a' to a or just use a as key for the lookup ? If the former, where would you take the lookup order from ? If the latter, what about keys that are not valid Python identifiers ? mxTools has a function extract() to extract values from a mapping or sequence object: extract(object,indices[,defaults]) Builds a list with entries object[index] for each index in the sequence indices. 
(see http://www.egenix.com/products/python/mxBase/mxTools/doc/) >>> mx.Tools.extract(d, ('a', 'b', 'c')) ['x', 'y', 'z'] IMO, that's a much cleaner way to express what you'd like Python to do. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 29 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-10-23: Python Meeting Duesseldorf ... 55 days to go 2012-08-28: Released mxODBC 3.2.0 ... http://egenix.com/go31 2012-08-20: Released mxODBC.Connect 2.0.0 ... http://egenix.com/go30 ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From cesare.di.mauro at gmail.com Wed Aug 29 18:56:58 2012 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Wed, 29 Aug 2012 18:56:58 +0200 Subject: [Python-ideas] Unpack of sequences In-Reply-To: <503E461E.4070801@egenix.com> References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E461E.4070801@egenix.com> Message-ID: 2012/8/29 M.-A. Lemburg > Cesare Di Mauro wrote: > > 2012/8/29 Mike Graham > > > >> On Wed, Aug 29, 2012 at 11:45 AM, Masklinn > wrote: > >>> On 2012-08-29, at 17:03 , Guido van Rossum wrote: > >>>> > >>>> Also of course assignment has no equivalent to keyword parameters > >>> > >>> I've always thought it would be a rather neat way to unpack > >>> dictionaries, instead of doing it by hand or abusing `itemgetter` to > get > >>> values in a known order. 
> >> > >> Do you have a suggestion of a nice syntax for a thing to unpack > >> mappings (or to unpack things by attributes)? > >> > >> Mike > >> > > > > a: x, b: y, c: z = {'a': 'x', 'b': 'y', 'c': 'z'} > > Would this assign 'a' to a or just use a as key for the lookup ? > If the former, where would you take the lookup order from ? > If the latter, what about keys that are not valid Python identifiers ? > > The latter. But we already have problems with Python identifiers: >>> def f(**Keywords): pass >>> f(**{1 : 'BAD!'}) Traceback (most recent call last): File "", line 1, in f(**{1 : 'BAD!'}) TypeError: f() keywords must be strings It wasn't a feature blocker... mxTools has a function extract() to extract values from a mapping > or sequence object: > > extract(object,indices[,defaults]) > Builds a list with entries object[index] for each index in the sequence > indices. > (see http://www.egenix.com/products/python/mxBase/mxTools/doc/) > > >>> mx.Tools.extract(d, ('a', 'b', 'c')) > ['x', 'y', 'z'] > > IMO, that's a much cleaner way to express what you'd like Python > to do. > > -- > Marc-Andre Lemburg > Yes, may be an extract method for mappings will be finer. Cesare -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Aug 29 19:01:09 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 30 Aug 2012 03:01:09 +1000 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> Message-ID: <503E4AD5.6040102@pearwood.info> On 30/08/12 02:10, Mike Graham wrote: > On Wed, Aug 29, 2012 at 11:45 AM, Masklinn wrote: >> On 2012-08-29, at 17:03 , Guido van Rossum wrote: >>> >>> Also of course assignment has no equivalent to keyword parameters >> >> I've always thought it would be a rather neat way to unpack >> dictionaries, instead of doing it by hand or abusing `itemgetter` to get >> values in a known order. 
>
> Do you have a suggestion of a nice syntax for a thing to unpack
> mappings (or to unpack things by attributes)?

    a, b, x, y = **mapping

could be equivalent to:

    a, b, x, y = mapping['a'], mapping['b'], mapping['x'], mapping['y']

I don't have good syntax for the attribute equivalent, but I do have
bad syntax for it:

    a, b, x, y = **.obj
    # like a, b, x, y = obj.a, obj.b, obj.x, obj.y

Or you could wrap your object in a helper class:

    class AttrGetter:
        def __init__(self, obj):
            self.obj = obj
        def __getitem__(self, key):
            return getattr(self.obj, key)

    a, b, x, y = **AttrGetter(obj)

--
Steven

From guido at python.org Wed Aug 29 19:10:33 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 29 Aug 2012 10:10:33 -0700
Subject: [Python-ideas] hook zipfile.ZipExtFile to check secure hash sums
In-Reply-To:
References:
Message-ID:

On Wed, Aug 29, 2012 at 9:08 AM, Daniel Holth wrote:
> I am checking the sha256 sums of all the files in a zip archive as it
> is being extracted by overriding ZipExtFile._update_crc, but it is
> inconvenient.
>
> It would be nice to have a hook, for example the ZipExtFile
> constructor could be a property of ZipFile and conveniently replaced
> with a ZipExtFile subclass.

Sounds to me like you should just whip up a patch and submit it to the
issue tracker.

--
--Guido van Rossum (python.org/~guido)

From steve at pearwood.info Wed Aug 29 19:18:12 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 30 Aug 2012 03:18:12 +1000
Subject: [Python-ideas] Unpack of sequences
In-Reply-To: <503E4AD5.6040102@pearwood.info>
References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info>
Message-ID: <503E4ED4.9020006@pearwood.info>

On 30/08/12 03:01, Steven D'Aprano wrote:

> a, b, x, y = **mapping
>
> could be equivalent to:
>
> a, b, x, y = mapping['a'], mapping['b'], mapping['x'], mapping['y']

Oh, I forgot to mention... extra keys are ignored.
That's because in my experience, you are likely to have a mapping with many different keys (say, a set of config options in a dict) and you only want to extract a few at a time. Unlike unpacking a tuple, you're unlikely to need *every* field at once. A potentially useful extension to the idea is to capture the extra items in a dict: a, b, x, y, **extras = **mapping which is like: a, b, x, y = [mapping[name] for name in ('a', 'b', 'x', 'y')] extras = dict((k, v) for k,v in mapping.items() if k not in ('a', 'b', 'x', 'y')) -- Steven From guido at python.org Wed Aug 29 19:28:12 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 29 Aug 2012 10:28:12 -0700 Subject: [Python-ideas] Unpack of sequences In-Reply-To: <503E4ED4.9020006@pearwood.info> References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> Message-ID: On Wed, Aug 29, 2012 at 10:18 AM, Steven D'Aprano wrote: > On 30/08/12 03:01, Steven D'Aprano wrote: > >> a, b, x, y = **mapping >> >> could be equivalent to: >> >> a, b, x, y = mapping['a'], mapping['b'], mapping['x'], mapping['y'] > > > > Oh, I forgot to mention... extra keys are ignored. That's because in > my experience, you are likely to have a mapping with many different keys > (say, a set of config options in a dict) and you only want to extract a > few at a time. Unlike unpacking a tuple, you're unlikely to need *every* > field at once. > > > A potentially useful extension to the idea is to capture the extra > items in a dict: > > a, b, x, y, **extras = **mapping > > which is like: > > a, b, x, y = [mapping[name] for name in ('a', 'b', 'x', 'y')] > extras = dict((k, v) for k,v in mapping.items() if k not in ('a', 'b', 'x', 'y')) Sounds to me like there are so many potential variations here that it's probably better not to add a language feature and let users write what they want. 
(My personal variation would be to use m.get(k) instead of m[k].) I think if I encountered the line a, b, c = **foo I would have only the vaguest intuition for its meaning -- and I'd be even less sure about something syntactically similarly plausible like self.a, self.b, self.c = **foo -- --Guido van Rossum (python.org/~guido) From alexander.belopolsky at gmail.com Wed Aug 29 20:07:41 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 29 Aug 2012 14:07:41 -0400 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> Message-ID: On Wed, Aug 29, 2012 at 1:28 PM, Guido van Rossum wrote: > it's probably better not to add a language feature and let users write > what they want. How can users write a solution that does not require repetition of variable names? Of course I can write something like a, b, c = [m.get(x) for x in ('a', 'b', 'c')] but when I have more and longer names, this gets tedious. As far as syntax goes, I also find a, b, c = **m somewhat unintuitive. I would prefer {a, b, c} = m and {a, b, c, **rest} = m. > I'd be even less sure about something syntactically similarly plausible like > > self.a, self.b, self.c = **foo I don't think unpacking into attributes is as useful as unpacking into locals. Object attribute lists are often available programmatically and it is a simple matter to supply an _update() function that can be used as self._update(locals()) after values have been assigned to locals or simply use self._update(m) directly. (I recall that something like {'a': x, 'b': y} = m has been suggested and rejected in the past. That syntax also required explicit specification of the keys to be unpacked.) From mal at egenix.com Wed Aug 29 20:16:10 2012 From: mal at egenix.com (M.-A. 
Lemburg)
Date: Wed, 29 Aug 2012 20:16:10 +0200
Subject: [Python-ideas] Unpack of sequences
In-Reply-To:
References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info>
Message-ID: <503E5C6A.4030307@egenix.com>

Alexander Belopolsky wrote:
> On Wed, Aug 29, 2012 at 1:28 PM, Guido van Rossum wrote:
>> it's probably better not to add a language feature and let users write
>> what they want.
>
> How can users write a solution that does not require repetition of
> variable names? Of course I can write something like
>
> a, b, c = [m.get(x) for x in ('a', 'b', 'c')]
>
> but when I have more and longer names, this gets tedious.

>>> d = dict(a=1, b=2, c=3)
>>> locals().update(d)
>>> a
1
>>> b
2
>>> c
3

Not that I'd recommend doing this, but it's possible :-)

--
Marc-Andre Lemburg
eGenix.com

From steve at pearwood.info Wed Aug 29 20:20:01 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 30 Aug 2012 04:20:01 +1000
Subject: [Python-ideas] Unpack of sequences
In-Reply-To:
References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info>
Message-ID: <503E5D51.6030804@pearwood.info>

On 30/08/12 03:28, Guido van Rossum wrote:

> Sounds to me like there are so many potential variations here that
> it's probably better not to add a language feature and let users write
> what they want. (My personal variation would be to use m.get(k)
> instead of m[k].)

The obvious problem is that without language support, they have to repeat
the key names. We've been here before with decorators, where you had to
repeat the function name three times before the @ syntax was introduced.

It's not too bad for namedtuple, because you only have to repeat one name:

    spam = namedtuple('spam', field_names)

although that's still a mild code smell. But once you have multiple targets,
it gets ugly soon, e.g. the typical enumeration (anti-)pattern:

    RED, YELLOW, BLUE, GREEN = 'RED', 'YELLOW', 'BLUE', 'GREN'

Maybe it's time to tackle this once and for all and find a way to access
the left-hand side names from the right-hand side. Assuming that's even
possible at all, here's a wild suggestion: expand @ on the right-hand
side to a quoted list of the names from the left:

    # Enumerations
    RED, YELLOW, BLUE, GREEN = @

    # Dict unpacking
    a, b, x, y = *[mapping.get(name) for name in @]

    # namedtuple
    spam = namedtuple(@, field_names)

Obviously this isn't a fully-fleshed out idea. I'm not sure what the target
@ should do here:

    dict['key'], list[0], obj.attr, x = [something(name) for name in @]

1) Expand to ['key', '0', 'attr', 'x'] ?
2) Expand to ["dict['key']", "list[0]", "obj.attr", "x"] ? Or what happens if you have multiple = signs? a, b, c = x, y, z = @ But the easy way out is to only allow @ for the simple case and raise an exception for anything else. We could always add support for the more complicated cases in the future. -- Steven From guido at python.org Wed Aug 29 20:21:55 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 29 Aug 2012 11:21:55 -0700 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> Message-ID: On Wed, Aug 29, 2012 at 11:07 AM, Alexander Belopolsky wrote: > On Wed, Aug 29, 2012 at 1:28 PM, Guido van Rossum wrote: >> it's probably better not to add a language feature and let users write >> what they want. > > How can users write a solution that does not require repetition of > variable names? Maybe they can't. So what? There's a limit to the applicability of DRY. Most solutions I've seen for this particular set of issues are worse than the problem they are trying to solve. Another antipattern IMO is sometimes seen in constructors: class C: def __init__(self, **kwds): self.__dict__.update(kwds) This has nothing to recommend it. > Of course I can write something like > > a, b, c = [m.get(x) for x in ('a', 'b', 'c')] > > but when I have more and longer names, this gets tedious. > > As far as syntax goes, I also find a, b, c = **m somewhat unintuitive. > I would prefer > > {a, b, c} = m But {a, b, c} is already a set. I'd expect set-like semantics, perhaps assigning the keys of a 3-element set in arbitrary order to the variables a, b and c. > and > > {a, b, c, **rest} = m. > >> I'd be even less sure about something syntactically similarly plausible like >> >> self.a, self.b, self.c = **foo > > I don't think unpacking into attributes is as useful as unpacking into > locals. 
Object attribute lists are often available programmatically > and it is a simple matter to supply an _update() function that can be > used as self._update(locals()) after values have been assigned to > locals or simply use self._update(m) directly. Another antipattern. > (I recall that something like {'a': x, 'b': y} = m has been suggested > and rejected in the past. That syntax also required explicit > specification of the keys to be unpacked.) I don't believe a valid syntax proposal can come out of this thread. -- --Guido van Rossum (python.org/~guido) From steve at pearwood.info Wed Aug 29 20:22:39 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 30 Aug 2012 04:22:39 +1000 Subject: [Python-ideas] Unpack of sequences In-Reply-To: <503E5C6A.4030307@egenix.com> References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <503E5C6A.4030307@egenix.com> Message-ID: <503E5DEF.8000401@pearwood.info> On 30/08/12 04:16, M.-A. Lemburg wrote: >>>> d = dict(a=1, b=2, c=3) >>>> locals().update(d) >>>> a > 1 >>>> b > 2 >>>> c > 3 > > Not that I'd recommend doing this, but it's possible :-) Try it inside a function. py> def test(): ... d = dict(a=1, b=2, c=3) ... locals().update(d) ... print a ... py> test() Traceback (most recent call last): File "", line 1, in File "", line 4, in test NameError: global name 'a' is not defined -- Steven From mal at egenix.com Wed Aug 29 20:50:47 2012 From: mal at egenix.com (M.-A. 
Lemburg)
Date: Wed, 29 Aug 2012 20:50:47 +0200
Subject: [Python-ideas] Unpack of sequences
In-Reply-To: <503E5DEF.8000401@pearwood.info>
References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <503E5C6A.4030307@egenix.com> <503E5DEF.8000401@pearwood.info>
Message-ID: <503E6487.8010609@egenix.com>

Steven D'Aprano wrote:
> On 30/08/12 04:16, M.-A. Lemburg wrote:
>
>>>>> d = dict(a=1, b=2, c=3)
>>>>> locals().update(d)
>>>>> a
>> 1
>>>>> b
>> 2
>>>>> c
>> 3
>>
>> Not that I'd recommend doing this, but it's possible :-)
>
>
> Try it inside a function.
>
>
> py> def test():
> ...     d = dict(a=1, b=2, c=3)
> ...     locals().update(d)
> ...     print a
> ...
> py> test()
> Traceback (most recent call last):
>   File "", line 1, in
>   File "", line 4, in test
> NameError: global name 'a' is not defined

Yeah, that's because functions use fast locals, which locals()
only mirrors as a dictionary.

You have to play some tricks to make it work...

    def f(d):
        from test import *
        locals().update(d)
        print a,b,c

    d = dict(a=1, b=2, c=3)
    f(d)

...and because you're not supposed to do this, you get a SyntaxWarning :-)

--
Marc-Andre Lemburg
eGenix.com

From guido at python.org Wed Aug 29 21:12:33 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 29 Aug 2012 12:12:33 -0700
Subject: [Python-ideas] Unpack of sequences
In-Reply-To: <503E6487.8010609@egenix.com>
References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <503E5C6A.4030307@egenix.com> <503E5DEF.8000401@pearwood.info> <503E6487.8010609@egenix.com>
Message-ID:

Also it won't work in Python 3.

On Wed, Aug 29, 2012 at 11:50 AM, M.-A. Lemburg wrote:
> Steven D'Aprano wrote:
>> On 30/08/12 04:16, M.-A. Lemburg wrote:
>>
>>>>>> d = dict(a=1, b=2, c=3)
>>>>>> locals().update(d)
>>>>>> a
>>> 1
>>>>>> b
>>> 2
>>>>>> c
>>> 3
>>>
>>> Not that I'd recommend doing this, but it's possible :-)
>>
>>
>> Try it inside a function.
>>
>>
>> py> def test():
>> ...     d = dict(a=1, b=2, c=3)
>> ...     locals().update(d)
>> ...     print a
>> ...
>> py> test()
>> Traceback (most recent call last):
>>   File "", line 1, in
>>   File "", line 4, in test
>> NameError: global name 'a' is not defined
>
> Yeah, that's because functions use fast locals, which locals()
> only mirrors as a dictionary.
>
> You have to play some tricks to make it work...
>
> def f(d):
>     from test import *
>     locals().update(d)
>     print a,b,c
>
> d = dict(a=1, b=2, c=3)
> f(d)
>
> ...and because you're not supposed to do this, you get a SyntaxWarning :-)
>
> --
> Marc-Andre Lemburg
> eGenix.com

--
--Guido van Rossum (python.org/~guido)

From alexander.belopolsky at gmail.com Wed Aug 29 21:13:29 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 29 Aug 2012 15:13:29 -0400
Subject: [Python-ideas] Unpack of sequences
In-Reply-To: <503E5C6A.4030307@egenix.com>
References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <503E5C6A.4030307@egenix.com>
Message-ID:

On Wed, Aug 29, 2012 at 2:16 PM, M.-A. Lemburg wrote:
>>>> locals().update(d)
> ..
> Not that I'd recommend doing this, but it's possible :-)

I don't think this will work inside a function.

From alexander.belopolsky at gmail.com Wed Aug 29 21:24:30 2012
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 29 Aug 2012 15:24:30 -0400
Subject: [Python-ideas] Unpack of sequences
In-Reply-To:
References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info>
Message-ID:

On Wed, Aug 29, 2012 at 2:21 PM, Guido van Rossum wrote:
>> {a, b, c} = m
>
> But {a, b, c} is already a set. I'd expect set-like semantics, perhaps
> assigning the keys of a 3-element set in arbitrary order to the
> variables a, b and c.

I considered this before posting and I think the potential for such
confusion is rather low. We already allow [a,b,c] = and (a,b,c) =
even though tuples are supposed to be immutable and (a,b,c) = may look
like a syntax error to someone unfamiliar with unpacking. I don't see
much of a problem with having the same syntactic elements have
different meaning when they appear in an assignment target.

From mal at egenix.com Wed Aug 29 21:30:41 2012
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 29 Aug 2012 21:30:41 +0200
Subject: [Python-ideas] Unpack of sequences
In-Reply-To:
References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <503E5C6A.4030307@egenix.com> <503E5DEF.8000401@pearwood.info> <503E6487.8010609@egenix.com>
Message-ID: <503E6DE1.2060208@egenix.com>

Guido van Rossum wrote:
> Also it won't work in Python 3.

The star import is only used to trigger a call to PyFrame_LocalsToFast().

In Python 3, the only way to trigger such a call is by using
a call level trace function... or by exposing the C function
in Python.

> On Wed, Aug 29, 2012 at 11:50 AM, M.-A. Lemburg wrote:
>> Steven D'Aprano wrote:
>>> On 30/08/12 04:16, M.-A. Lemburg wrote:
>>>
>>>>>>> d = dict(a=1, b=2, c=3)
>>>>>>> locals().update(d)
>>>>>>> a
>>>> 1
>>>>>>> b
>>>> 2
>>>>>>> c
>>>> 3
>>>>
>>>> Not that I'd recommend doing this, but it's possible :-)
>>>
>>>
>>> Try it inside a function.
>>>
>>>
>>> py> def test():
>>> ...     d = dict(a=1, b=2, c=3)
>>> ...     locals().update(d)
>>> ...     print a
>>> ...
>>> py> test()
>>> Traceback (most recent call last):
>>>   File "", line 1, in
>>>   File "", line 4, in test
>>> NameError: global name 'a' is not defined
>>
>> Yeah, that's because functions use fast locals, which locals()
>> only mirrors as a dictionary.
>>
>> You have to play some tricks to make it work...
>>
>> def f(d):
>>     from test import *
>>     locals().update(d)
>>     print a,b,c
>>
>> d = dict(a=1, b=2, c=3)
>> f(d)
>>
>> ...and because you're not supposed to do this, you get a SyntaxWarning :-)
>>
>> --
>> Marc-Andre Lemburg
>> eGenix.com

--
Marc-Andre Lemburg
eGenix.com

From python at mrabarnett.plus.com Wed Aug 29 21:47:52 2012
From: python at mrabarnett.plus.com (MRAB)
Date: Wed, 29 Aug 2012 20:47:52 +0100
Subject: [Python-ideas] Unpack of sequences
In-Reply-To:
References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info>
Message-ID: <503E71E8.90908@mrabarnett.plus.com>

On 29/08/2012 20:24, Alexander Belopolsky wrote:
> On Wed, Aug 29, 2012 at 2:21 PM, Guido van Rossum wrote:
>>> {a, b, c} = m
>>
>> But {a, b, c} is already a set. I'd expect set-like semantics, perhaps
>> assigning the keys of a 3-element set in arbitrary order to the
>> variables a, b and c.
>
> I considered this before posting and I think the potential for such
> confusion is rather low. We already allow [a,b,c] = and (a,b,c) =
> even though tuples are supposed to be immutable and (a,b,c) = may look
> like a syntax error to someone unfamiliar with unpacking. I don't see
> much of a problem with having the same syntactic elements have
> different meaning when they appear in an assignment target.
>
If you did want to unpack a set, that would be:

    (a, b, c) = m

{...} are used for both dicts and sets, after all.

I must admit it's starting to look a bit better to me.

From masklinn at masklinn.net Wed Aug 29 22:43:43 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 29 Aug 2012 22:43:43 +0200
Subject: [Python-ideas] Unpack of sequences
In-Reply-To:
References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info>
Message-ID: <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net>

On 2012-08-29, at 20:21 , Guido van Rossum wrote:
>
> But {a, b, c} is already a set.
On the rhs, the lhs is expected to have different semantics: (a, b, c) =
? matches any 3-items iterable (including sets), not necessarily a
tuple. If there really is a need for disambiguation, colons could be
added after the keys ({a:, b:, c:}), but I don't think the original
syntax is confusing (let alone enough to warrant this).

After all, {} is for dictionaries before it's for sets, I'm guessing
most Python users still associate braces with dictionaries more than
with sets.

From guido at python.org  Wed Aug 29 22:44:40 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 29 Aug 2012 13:44:40 -0700
Subject: [Python-ideas] Unpack of sequences
In-Reply-To: <503E6DE1.2060208@egenix.com>
References: <503E1FCA.7050309@nedbatchelder.com>
 <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net>
 <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info>
 <503E5C6A.4030307@egenix.com> <503E5DEF.8000401@pearwood.info>
 <503E6487.8010609@egenix.com> <503E6DE1.2060208@egenix.com>
Message-ID:

On Wed, Aug 29, 2012 at 12:30 PM, M.-A. Lemburg wrote:
> Guido van Rossum wrote:
>> Also it won't work in Python 3.
>
> The star import is only used to trigger a call to PyFrame_LocalsToFast().
>
> In Python 3, the only way to trigger such a call is by using
> a call level trace function... or by exposing the C function
> in Python.

I don't believe that's the whole story. In Python 2, the import *
changes the semantics of locals. (So does 'exec' BTW.) Example:

>>> def f():
...     if 0: from test import *
...     locals()['x'] = 1
...     print(x)
...
<stdin>:1: SyntaxWarning: import * only allowed at module level
>>> def g():
...     locals()['x'] = 1
...     print(x)
...
>>> f()
1
>>> g()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in g
NameError: global name 'x' is not defined
>>>

Note the difference in generated bytecode:

>>> dis.dis(f)
  3           0 LOAD_CONST               1 (1)
              3 LOAD_NAME                0 (locals)
              6 CALL_FUNCTION            0
              9 LOAD_CONST               2 ('x')
             12 STORE_SUBSCR

  4          13 LOAD_NAME                1 (x)
             16 PRINT_ITEM
             17 PRINT_NEWLINE
             18 LOAD_CONST               0 (None)
             21 RETURN_VALUE
>>> dis.dis(g)
  2           0 LOAD_CONST               1 (1)
              3 LOAD_GLOBAL              0 (locals)
              6 CALL_FUNCTION            0
              9 LOAD_CONST               2 ('x')
             12 STORE_SUBSCR

  3          13 LOAD_GLOBAL              1 (x)
             16 PRINT_ITEM
             17 PRINT_NEWLINE
             18 LOAD_CONST               0 (None)
             21 RETURN_VALUE
>>>

Compare line 13 in both: LOAD_NAME vs. LOAD_GLOBAL. Effectively, in
f(), the locals are dynamic (this is how they were implemented in
Python 0.0). In g() they are not, so the compiler decides that x can't
be a local, and generates a LOAD_GLOBAL.

In Python 3, all functions behave like g(): import * is no longer
allowed, and exec() is no longer treated special (it's no longer a
reserved keyword).

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Wed Aug 29 22:45:19 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 29 Aug 2012 13:45:19 -0700
Subject: [Python-ideas] Unpack of sequences
In-Reply-To: <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net>
References: <503E1FCA.7050309@nedbatchelder.com>
 <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net>
 <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info>
 <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net>
Message-ID:

On Wed, Aug 29, 2012 at 1:43 PM, Masklinn wrote:
> On 2012-08-29, at 20:21 , Guido van Rossum wrote:
>>
>> But {a, b, c} is already a set.
>
> On the rhs, the lhs is expected to have different semantics: (a, b, c) =
> ? matches any 3-items iterable (including sets), not necessarily a
> tuple. If there really is a need for disambiguation, colons could be added
> after the keys ({a:, b:, c:}), but I don't think the original syntax
> is confusing (let alone enough to warrant this).
> > After all, {} is for dictionaries before it's for sets, I'm guessing > most Python users still associate braces with dictionaries more than > with sets. But {a, b, c} doesn't look like a dict. Please give it up. -- --Guido van Rossum (python.org/~guido) From grosser.meister.morti at gmx.net Wed Aug 29 23:12:15 2012 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Wed, 29 Aug 2012 23:12:15 +0200 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> Message-ID: <503E85AF.1080809@gmx.net> On 08/29/2012 08:07 PM, Alexander Belopolsky wrote: > On Wed, Aug 29, 2012 at 1:28 PM, Guido van Rossum wrote: >> it's probably better not to add a language feature and let users write >> what they want. > > How can users write a solution that does not require repetition of > variable names? Of course I can write something like > > a, b, c = [m.get(x) for x in ('a', 'b', 'c')] > > but when I have more and longer names, this gets tedious. > > As far as syntax goes, I also find a, b, c = **m somewhat unintuitive. > I would prefer > > {a, b, c} = m > > and > > {a, b, c, **rest} = m. > Note: Mozilla's JavaScript 1.8 has this: var {a, b, c} = m; While such a feature would sometimes be handy, I understand that it is a collision with the set syntax. >> I'd be even less sure about something syntactically similarly plausible like >> >> self.a, self.b, self.c = **foo > > I don't think unpacking into attributes is as useful as unpacking into > locals. Object attribute lists are often available programmatically > and it is a simple matter to supply an _update() function that can be > used as self._update(locals()) after values have been assigned to > locals or simply use self._update(m) directly. 
> > (I recall that something like {'a': x, 'b': y} = m has been suggested > and rejected in the past. That syntax also required explicit > specification of the keys to be unpacked.) From alexander.belopolsky at gmail.com Wed Aug 29 23:19:11 2012 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 29 Aug 2012 17:19:11 -0400 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net> Message-ID: On Wed, Aug 29, 2012 at 4:45 PM, Guido van Rossum wrote: > But {a, b, c} doesn't look like a dict. locals().update(m) does not look like a dict either, but many people expect it to work and get bitten when it does not. > Please give it up. Do you ask to give up the {a, b, c} = idea or any ideas that import mappings into local variables? For example some abuse of the with statement would allow with m in a, b, c: # a, b, and c are local here I am not really proposing this, just giving an example for the range of possibilities. I am sure an acceptable syntax can be found if DRY is considered important enough, but if any automation of a = m['a'] b = m['b'] c = m['c'] is deemed to be an anti-pattern, then this thread is better to stop. From tjreedy at udel.edu Wed Aug 29 23:24:20 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 29 Aug 2012 17:24:20 -0400 Subject: [Python-ideas] Verbose traceback formatting In-Reply-To: References: Message-ID: On 8/28/2012 6:26 PM, Mike Graham wrote: > It's possible to give a lot more on error than the default traceback > gives you. I propose that Python should ship a more verbose formatter > and a command line switch to use it. Part of the problem is in the overly skimpy exception instances themselves. They should contain the needed runtime info that one cannot find in the code. 
I would rather you push for more such changes. >> ... >> x = 16 >> y = 0 >> ... >>ZeroDivisionError: integer division or modulo by zero This could and, imo, should be changed to include the numerator, which is the main extra info included the the verbose traceback. Most of the rest strikes me as noise. ZeroDivisionError: integer division or modulo of 16 by 0 http://bugs.python.org/issue15815 -- Terry Jan Reedy From guido at python.org Wed Aug 29 23:30:42 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 29 Aug 2012 14:30:42 -0700 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net> Message-ID: On Wed, Aug 29, 2012 at 2:19 PM, Alexander Belopolsky wrote: > On Wed, Aug 29, 2012 at 4:45 PM, Guido van Rossum wrote: >> But {a, b, c} doesn't look like a dict. > > locals().update(m) does not look like a dict either, but many people > expect it to work and get bitten when it does not. > >> Please give it up. > > Do you ask to give up the {a, b, c} = idea or any ideas that import > mappings into local variables? The latter. (Or attributes to locals.) > For example some abuse of the with > statement would allow > > with m in a, b, c: > # a, b, and c are local here > > I am not really proposing this, just giving an example for the range > of possibilities. I am sure an acceptable syntax can be found if DRY > is considered important enough, but if any automation of > > a = m['a'] > b = m['b'] > c = m['c'] > > is deemed to be an anti-pattern, then this thread is better to stop. 
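For reference, the explicit form that the thread keeps returning to can be written on one line with the standard library. This is only a sketch of the existing idiom, not one of the syntaxes proposed above; the mapping `m` and its keys are made up for illustration, and the names still have to be written out on both sides.

```python
from operator import itemgetter

# Hypothetical mapping standing in for "m" in the discussion.
m = {"a": 1, "b": 2, "c": 3, "d": 4}

# Equivalent to a = m['a']; b = m['b']; c = m['c'].
# A missing key raises KeyError, just as the three separate lookups would.
a, b, c = itemgetter("a", "b", "c")(m)
```

Because the targets are ordinary assignment targets, the compiler still sees every local name at compile time, which is the property the replies above want to preserve.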
-- --Guido van Rossum (python.org/~guido) From mikegraham at gmail.com Thu Aug 30 00:10:31 2012 From: mikegraham at gmail.com (Mike Graham) Date: Wed, 29 Aug 2012 18:10:31 -0400 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> Message-ID: On Wed, Aug 29, 2012 at 11:03 AM, Guido van Rossum wrote: > Also of course assignment has no equivalent to keyword parameters, nor > does it (currently) allow a "lone star" -- although it would be handy > to be able to say > > a, b, * = xs > > as a shorthand for > > a, b, *_ = xs > del _ Is there any good reason not to introduce this syntax? Mike From ncoghlan at gmail.com Thu Aug 30 00:18:30 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 Aug 2012 08:18:30 +1000 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net> Message-ID: Just backing Guido up here (not that he needs it): we've put a lot of work into making sure that the Python 3 compiler can see all local name definitions at compile time. Ideas that involve reverting that simply aren't going to be accepted. I'm personally less opposed to ideas for a new assignment statement that explicitly treats the lhs as a parameter definition and the rhs as an argument list and binds values to names accordingly, but even that would be a hard sell. -- Sent from my phone, thus the relative brevity :) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Thu Aug 30 00:21:21 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 29 Aug 2012 15:21:21 -0700 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> Message-ID: On Wed, Aug 29, 2012 at 3:10 PM, Mike Graham wrote: > On Wed, Aug 29, 2012 at 11:03 AM, Guido van Rossum wrote: >> Also of course assignment has no equivalent to keyword parameters, nor >> does it (currently) allow a "lone star" -- although it would be handy >> to be able to say >> >> a, b, * = xs >> >> as a shorthand for >> >> a, b, *_ = xs >> del _ > > Is there any good reason not to introduce this syntax? I should apologize for bringing this up, because the analogy is actually backwards. (Or maybe I could claim that this backwardness is a good warning against hypergeneralization. :-) In function definitions, it actually means *don't allow more positional arguments*. The equivalent already exists for unpacking assignment: a, b = xs The reason separate syntax is needed in function definitions is that we occasionally wish to say "and there are no more positional parameters" but also "but there are some additional keyword-only parameters". Until unpacking assignment support an equivalent to keyword parameters with default values we won't need * there to mean "there should be no more values". But giving it the *opposite* meaning of "and ignore subsequent values" would be just perverse given what it means in function declarations. 
-- 
--Guido van Rossum (python.org/~guido)

From ethan at stoneleaf.us  Thu Aug 30 00:31:12 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 29 Aug 2012 15:31:12 -0700
Subject: [Python-ideas] Unpack of sequences
In-Reply-To:
References: <503E1FCA.7050309@nedbatchelder.com>
Message-ID: <503E9830.9060208@stoneleaf.us>

Mike Graham wrote:
> On Wed, Aug 29, 2012 at 11:03 AM, Guido van Rossum wrote:
>> Also of course assignment has no equivalent to keyword parameters, nor
>> does it (currently) allow a "lone star" -- although it would be handy
>> to be able to say
>>
>> a, b, * = xs
>>
>> as a shorthand for
>>
>> a, b, *_ = xs
>> del _
>
> Is there any good reason not to introduce this syntax?

I'm guessing because

    a, b = xs[:2]

is not difficult.

Mind you, I'm not opposed to the idea.

~Ethan~

From grosser.meister.morti at gmx.net  Thu Aug 30 00:30:58 2012
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Thu, 30 Aug 2012 00:30:58 +0200
Subject: [Python-ideas] format specifier for "not bytes"
In-Reply-To: <20120827103438.093aa3f4@resist.wooz.org>
References: <5037C1FE.9020509@mrabarnett.plus.com>
 <20120824204043.3c4c4524@pitrou.net> <5037D0F3.70108@pearwood.info>
 <20120827103438.093aa3f4@resist.wooz.org>
Message-ID: <503E9822.10709@gmx.net>

On 08/27/2012 04:34 PM, Barry Warsaw wrote:
> On Aug 25, 2012, at 09:16 AM, Nick Coghlan wrote:
>
>> A couple of people at PyCon Au mentioned running into this kind of issue
>> with Python 3. It relates to the fact that:
>> 1. String formatting is *coercive* by default
>> 2. Absolutely everything, including bytes objects can be coerced to a
>> string, due to the repr() fallback
>>
>> So it's relatively easy to miss a decode or encode operation, and end up
>> interpolating an unwanted "b" prefix and some quotes.
>>
>> For existing versions, I think the easiest answer is to craft a regex that
>> matches bytes object repr's and advise people to check that it *doesn't*
>> match their formatted strings in their unit tests.
>>
>> For 3.4+ a non-coercive string interpolation format code may be desirable.
>
> Or maybe just one that calls __str__ without a __repr__ fallback?
>

>>> b'a'.__str__()
"b'a'"

__str__ still returns the bytes literal representation.

From guido at python.org  Thu Aug 30 00:48:01 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 29 Aug 2012 15:48:01 -0700
Subject: [Python-ideas] Add a "hasmethod()" builtin?
Message-ID:

There's a concept that's sometimes useful when explaining behavior of
certain Python operations in terms of simpler ones, and it is "does
the class of x define a method m?".

This currently cannot be expressed using hasattr(): hasattr(x, 'm')
might find an instance variable named m, but that is not a method (and
there are contexts where they are not interchangeable);
hasattr(type(x), 'm') might find a *metaclass* method.

Example of the former (assume Python 3):

class C:
    def __add__(self, other): return 42

c = C()
c.__add__ = lambda *args: 0
c + 1  # prints 42, not 0

Example of the latter:

class C: pass
c = C()
hasattr(C, 'mro')  # prints True, since mro() is a method of the
                   # standard metaclass ('type').
c.mro()  # raises AttributeError

The use case I am mainly thinking of is the formal explanation of the
semantics of binary (and other) operators, e.g.

def __add__(a, b):
    r = NotImplemented
    if hasmethod(a, '__add__'):
        r = a.__add__(b)
    if r is NotImplemented and hasmethod(b, '__radd__'):
        r = b.__radd__(a)
    if r is NotImplemented:
        raise TypeError
    return r

(Caveat: it's even more complicated if type(b) is a subclass of type(a).)

I'm not sure if it would be better if the first argument of
hasmethod() was a type instead of an instance (so the example would
use hasattr(type(a), '__add__') etc.).
It's also interesting to figure out what should happen for proxy types. (Proxy types in general need a better foundation.) Thoughts? -- --Guido van Rossum (python.org/~guido) From sven at marnach.net Thu Aug 30 01:13:21 2012 From: sven at marnach.net (Sven Marnach) Date: Thu, 30 Aug 2012 00:13:21 +0100 Subject: [Python-ideas] Add a "hasmethod()" builtin? In-Reply-To: References: Message-ID: <20120829231321.GG5137@bagheera> On Wed, Aug 29, 2012 at 03:48:01PM -0700, Guido van Rossum wrote: > There's a concept that's sometimes useful when explaining behavior of > certain Python operations in terms of simpler ones, and it is "does > the class of x define a method m?". > > This currently cannot expressed using hasattr(): hasattr(x, 'm') might > find an instance variable named m, but that is not a method (and there > are contexts where they are not interchangeable); hasattr(type(x), > 'm') might find a *metaclass* method. > > Example of the former (assume Python 3): > > class C: > def __add__(self, other): return 42 > > c = C() > c.__add__ = lambda *args: 0 > c + 1 # prints 42, not 0 > Example of the latter: > > class C: pass > c = C() > hasattr(C, 'mro') # prints True, since mro() is a method of the > standard metaclass ('type'). > c.mro() # raises AttributeError > > The use case I am mainly thinking of is the formal explanation of the > semantics of binary (and other) operators, e.g. > > def __add__(a, b): > r = NotImplemented > if hasmethod(a, '__add__'): > r = a.__add__(b) > if r is NotImplemented and hasmethod(b, '__radd__'): > r = b.__radd__(a) > if r is NotImplemented: > raise TypeError > return r I'd usually simply use ``hasattr(a, "__add__")``. This is also what e.g. MutableMapping.update() currently does. The chances that someone accidentally passes in an object that has a callable instance variable called "__add__" seem pretty low, and we don't need to protect against people intentionally trying to break things. 
That said, ``hasmethod()`` can be implemented in a quite straight-forward way in pure Python: def hasmethod(obj, name): return inspect.ismethod(getattr(obj, name, None)) Cheers, Sven From ncoghlan at gmail.com Thu Aug 30 03:07:56 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 Aug 2012 11:07:56 +1000 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net> Message-ID: On Thu, Aug 30, 2012 at 8:18 AM, Nick Coghlan wrote: > I'm personally less opposed to ideas for a new assignment statement that > explicitly treats the lhs as a parameter definition and the rhs as an > argument list and binds values to names accordingly, but even that would be > a hard sell. Expanding on this, now that I'm back at a real computer. Argument-parameter binding is actually a fairly powerful name binding operation, quite distinct from ordinary assignment and tuple unpacking. It consists of two separate (often confused) components: - the parameter specification - the argument list The parameter spec is what appears in the function *definition*, and is really the main item of interest reported by the new inspect.Signature objects in 3.3. 
A parameter spec allows you to do several things:
- define the names that will be bound locally
- define default values to be assigned to those names
- provide named holders for excess positional and keyword arguments
- indicate that certain values can *only* be supplied by name, and not
  by position

The argument list is what appears in a function *call*, and also has
several interesting features:
- can provide explicit positional arguments
- can provide explicit keyword arguments
- can provide an iterable of positional arguments
- can provide a mapping of additional keyword arguments

It *may* make sense to decouple this name binding operation from
function calls (and Signature.bind) and make it available as a
language primitive in the form of a statement.

However, it's *not* the same thing as an ordinary assignment statement
or tuple unpacking, and we shouldn't try to jam it implicitly into the
existing assignment statement.

There are certainly valid use cases for such a construct. One example
is support for positional only arguments. Currently, handling
positional only arguments with reasonable error messages requires
something like the following:

def _bind_args(a=None, b=None):
    return a, b

def f(*args):
    a, b = _bind_args(*args)

Note that the "_bind_args" function exists solely to get access to the
argument/parameter binding behaviour. Another example is unpacking
values from a dictionary, which can be done using a similar technique:

def _bind_kwds(a=None, b=None, **other):
    return a, b, other

def f(mapping):
    a, b, other = _bind_kwds(**mapping)

The obvious form for such a statement is "LHS OP RHS", however
syntactic ambiguity in the evaluation of both the LHS and RHS
(relative to normal assignment) would likely prevent that.
As a sketch, I'll present a notation inspired by Ruby's block parameter
syntax and emphasising the link to def statements:

def f(*args):
    |a=None, b=None| def= *args

def f(mapping):
    |a=None, b=None, **other| def= **mapping

This is fairly magical, but hopefully the intent is clear: the LHS is
enclosed in "|" characters to resolve the syntactic ambiguity problem
for the LHS, while a form of augmented assignment "def=" is used to
prevent assignment chaining and to resolve any syntactic ambiguity for
the RHS. It may be that there's no solution to this problem that
*doesn't* look magical.

However, if people are interested in an enhanced form of name binding
that's more powerful than the current assignment statement and tuple
unpacking, then liberating parameter binding from function calls is
the way to go. Unlike other proposals, it doesn't make the language
more complicated, because parameter binding is something people
learning Python already need to understand. Indeed, in some ways it
would make the language *simpler*, since it would allow the
explanation of parameter binding to be clearly decoupled from the
explanation of function calls and definitions.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com  Thu Aug 30 03:30:27 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 30 Aug 2012 11:30:27 +1000
Subject: [Python-ideas] Add a "hasmethod()" builtin?
In-Reply-To: <20120829231321.GG5137@bagheera>
References: <20120829231321.GG5137@bagheera>
Message-ID:

On Thu, Aug 30, 2012 at 9:13 AM, Sven Marnach wrote:
> I'd usually simply use ``hasattr(a, "__add__")``. This is also what
> e.g. MutableMapping.update() currently does. The chances that someone
> accidentally passes in an object that has a callable instance variable
> called "__add__" seem pretty low, and we don't need to protect against
> people intentionally trying to break things.
>
> That said, ``hasmethod()`` can be implemented in a quite
> straight-forward way in pure Python:
>
> def hasmethod(obj, name):
>     return inspect.ismethod(getattr(obj, name, None))

I don't think that's *quite* what Guido is getting at: it's more about
exposing the semantics of _PyType_Lookup at the Python level.

There are currently two different ways of approximating that. The one
which mimics _PyType_Lookup most closely (by treating proxy types as
instances of the proxy type, rather than the target type) is to do
"type(x).attr", "getattr(type(x), 'attr')" or "hasattr(type(x),
'attr')" rather than performing those operations directly on 'x'.

There's an alternative which treats proxy objects as an instance of
the *target* type, which is to respect the claimed value of __class__:
"x.__class__.attr", "getattr(x.__class__, 'attr')" or
"hasattr(x.__class__, 'attr')"

Yet *both* of those alternatives have the problem Guido noted, where
they can find metaclass methods that the descriptor protocol would
ignore:

>>> class C: pass
...
>>> c = C()
>>> type(c)
<class '__main__.C'>
>>> c.__class__
<class '__main__.C'>
>>> hasattr(c, "mro")
False
>>> hasattr(type(c), "mro")
True

_PyType_Lookup is essentially "ordinary attribute lookup, but ignore
the instance variables and don't supply the instance to descriptors",
or, equivalently, "class attribute lookup, but don't fall back to the
metaclass". It's a necessary part of the language semantics, but it's
not currently accessible from Python code.

I believe the last time this came up, the main idea being kicked
around was a new function in the operator module (e.g.
"operator.gettypeattr()") rather than a new builtin.

Cheers,
Nick.
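The lookup rule described above can be approximated in pure Python by walking the MRO of the instance's type and looking only in each class's own namespace. This is a rough sketch under stated assumptions, not the C implementation: the name `hasmethod` is taken from the thread, but this particular body is hypothetical, and it ignores details such as the type attribute cache and proxy objects.

```python
def hasmethod(obj, name):
    """Return True if some class in type(obj)'s MRO defines `name`.

    Approximates the type-level lookup: instance variables are ignored
    and there is no fallback to the metaclass.
    """
    return any(name in vars(klass) for klass in type(obj).__mro__)


class C:
    def __add__(self, other):
        return 42


class D:
    pass


c = C()
d = D()
d.__add__ = lambda *args: 0  # instance attribute, not used by "d + 1"
```

With these definitions, `hasmethod(c, '__add__')` is true while `hasmethod(d, '__add__')` is false even though `hasattr(d, '__add__')` succeeds, and `hasmethod(d, 'mro')` is false even though `hasattr(type(d), 'mro')` finds the metaclass method.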
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tjreedy at udel.edu Thu Aug 30 03:52:47 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 29 Aug 2012 21:52:47 -0400 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net> Message-ID: On 8/29/2012 9:07 PM, Nick Coghlan wrote: > On Thu, Aug 30, 2012 at 8:18 AM, Nick Coghlan wrote: >> I'm personally less opposed to ideas for a new assignment statement that >> explicitly treats the lhs as a parameter definition and the rhs as an >> argument list and binds values to names accordingly, but even that would be >> a hard sell. > > Expanding on this, now that I'm back at a real computer. > > Argument-parameter binding is actually a fairly powerful name binding > operation, I have described this as cross-namespace assignment. The important point is that aliasing between and within namespaces has same effect vis-a-vis mutable objects. > quite distinct from ordinary assignment and tuple unpacking. > It consists of two separate (often confused) components: > - the parameter specification > - the argument list Analogous to left and right sides of assignment. They were once more similar than they are now. > The parameter spec is what appears in the function *definition*, and > is really the main item of interest reported by the new > inspect.Signature objects in 3.3. 
> > A parameter spec allows you to do several things: > - define the names that will be bound locally > - define default values to be assigned to those names > - provide named holders for excess positional and keyword arguments > - indicate that certain values can *only* be supplied by name, and not > by position > > The argument list is what appears in a function *call*, and also has > several interesting features: > - can provide explicit positional arguments > - can provide explicit keyword arguments > - can provide an iterable of positional arguments > - can provide a mapping of additional keyword arguments > > It *may* make sense to decouple this name binding operation from > function calls (and Signature.bind) and make it available as a > language primitive in the form of a statement. Interesting idea. > However, it's *not* the same thing as an ordinary assignment statement > or tuple unpacking, and we shouldn't try to jam it implicitly into the > existing assignment statement. There seems to be too much to jam very well. > There are certainly valid use cases for such a construct. One example > is support for positional only arguments. Currently, handling > positional only arguments with reasonable error messages requires > something like the following: > > def _bind_args(a=None, b=None): > return a, b > > def f(*args): > a, b = _bind_args(*args) > > Note that the "_bind_args" function exists solely to get access to the > argument/parameter binding behaviour. Another example is unpacking > values from a dictionary, which can be done using a similar technique: > > def _bind_kwds(a=None, b=None, **other): > return a, b, other > > def f(mapping): > a, b, other = _bind_kwds(**mapping) > > The obvious form for such a statement is "LHS OP RHS", however > syntactic ambiguity in the evaluation of both the LHS and RHS > (relative to normal assigment) would likely prevent that. 
As a sketch, > I'll present a notation inspired by Ruby's block parameter syntax and > emphasising the link to def statements: > > def f(*args): > |a=None, b=None| def= *args > > def f(mapping): > |a=None, b=None, **other| def= **mapping > > This is fairly magical, but hopefully the intent is clear: the LHS is > enclosed in "|" characters to resolve the syntactic ambiguity problem > for the LHS, while a form of augmented assignment "def=" is used to I would call it 'call assignment' or 'extended assignment' to clearly differentiate it from current augmented assignment, which it is not. (=) would be a possible syntax I have no idea whether the parser could handle this > prevent assignment chaining and to resolve any syntactic ambiguity for > the RHS. It may be that there's no solution to this problem that > *doesn't* look magical. > > However, if people are interested in an enhanced form of name binding > that's more powerful than the current assignment statement and tuple > unpacking, then liberating parameter binding from function calls is > the way to go. Unlike other proposals, it doesn't make the language > more complicated, because parameter binding is something people > learning Python already need to understand. Indeed, in some ways it > would make the language *simpler*, since it would allow the > explanation of parameter binding to be clearly decoupled from the > explanation of function calls and definitions. -- Terry Jan Reedy From steve at pearwood.info Thu Aug 30 06:28:25 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 30 Aug 2012 14:28:25 +1000 Subject: [Python-ideas] Add a "hasmethod()" builtin? In-Reply-To: References: Message-ID: <20120830042825.GA13314@ando> On Wed, Aug 29, 2012 at 03:48:01PM -0700, Guido van Rossum wrote: > There's a concept that's sometimes useful when explaining behavior of > certain Python operations in terms of simpler ones, and it is "does > the class of x define a method m?". 
It's not just methods where this is useful. For example, the help() quasi-builtin ignores instance attribute x.__doc__ and instead uses type(x).__doc__. I'm not sure that needing this is common enough to justify builtins, but I think it would be useful to have hastypeattr and friends (get*, set* and del*) in the operator module. -- Steven From greg.ewing at canterbury.ac.nz Thu Aug 30 04:25:50 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 30 Aug 2012 14:25:50 +1200 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> Message-ID: <503ECF2E.6060400@canterbury.ac.nz> On 30/08/12 10:21, Guido van Rossum wrote: > Until unpacking assignment support an equivalent to keyword parameters > with default values we won't need * there to mean "there should be no > more values". But giving it the *opposite* meaning of "and ignore > subsequent values" would be just perverse given what it means in > function declarations. How about a, b, c, ... = d -- Greg From storchaka at gmail.com Thu Aug 30 08:55:00 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 30 Aug 2012 09:55:00 +0300 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> Message-ID: On 29.08.12 21:07, Alexander Belopolsky wrote: > How can users write a solution that does not require repetition of > variable names? def interior(a=None, b=None, c=None): ... 
# work here interior(m) From storchaka at gmail.com Thu Aug 30 08:57:41 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 30 Aug 2012 09:57:41 +0300 Subject: [Python-ideas] Unpack of sequences In-Reply-To: <503E5D51.6030804@pearwood.info> References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <503E5D51.6030804@pearwood.info> Message-ID: On 29.08.12 21:20, Steven D'Aprano wrote: > But once you have multiple > targets, > it gets ugly soon, e.g. the typical enumeration (anti-)pattern: > > RED, YELLOW, BLUE, GREEN = 'RED', 'YELLOW', 'BLUE', 'GREN' for n in 'RED', 'YELLOW', 'BLUE', 'GREN': globals()[n] = n From mal at egenix.com Thu Aug 30 09:51:21 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 30 Aug 2012 09:51:21 +0200 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <503E5C6A.4030307@egenix.com> <503E5DEF.8000401@pearwood.info> <503E6487.8010609@egenix.com> <503E6DE1.2060208@egenix.com> Message-ID: <503F1B79.4020502@egenix.com> Guido van Rossum wrote: > On Wed, Aug 29, 2012 at 12:30 PM, M.-A. Lemburg wrote: >> Guido van Rossum wrote: >>> Also it won't work in Python 3. >> >> The star import is only used to trigger a call to PyFrame_LocalsToFast(). >> >> In Python 3, the only way to trigger such a call is by using >> a call level trace function... or by exposing the C function >> in Python. > > I don't believe that's the whole story. In Python 2, the import * > changes the semantics of locals. (So does 'exec' BTW.) Example: > >>>> def f(): > ... if 0: from test import * > ... locals()['x'] = 1 > ... print(x) > ... > :1: SyntaxWarning: import * only allowed at module level >>>> def g(): > ... locals()['x'] = 1 > ... print(x) > ... 
>>>> f() > 1 >>>> g() > Traceback (most recent call last): > File "", line 1, in > File "", line 3, in g > NameError: global name 'x' is not defined >>>> > > Note the difference in generated bytecode: > >>>> dis.dis(f) > 3 0 LOAD_CONST 1 (1) > 3 LOAD_NAME 0 (locals) > 6 CALL_FUNCTION 0 > 9 LOAD_CONST 2 ('x') > 12 STORE_SUBSCR > > 4 13 LOAD_NAME 1 (x) > 16 PRINT_ITEM > 17 PRINT_NEWLINE > 18 LOAD_CONST 0 (None) > 21 RETURN_VALUE >>>> dis.dis(g) > 2 0 LOAD_CONST 1 (1) > 3 LOAD_GLOBAL 0 (locals) > 6 CALL_FUNCTION 0 > 9 LOAD_CONST 2 ('x') > 12 STORE_SUBSCR > > 3 13 LOAD_GLOBAL 1 (x) > 16 PRINT_ITEM > 17 PRINT_NEWLINE > 18 LOAD_CONST 0 (None) > 21 RETURN_VALUE >>>> > > Compare line 13 in both: LOAD_NAME vs. LOAD_GLOBAL. Effectively, in > f(), the locals are dynamic (this is how they were implemented in > Python 0.0). In g() they are not, so the compiler decides that x can't > be a local, and generates a LOAD_GLOBAL. You're right. The effect is not the calling of the PyFrame function, but that of the compiler generating different bytecode with the star import. In fact, the way the star import calls the PyFrame API overrides the locals update. It first copies the fast locals to the locals dictionary (overriding the updates applied before the import), then adds the symbols from the import and then copies the locals from the dictionary back to the fast slots. Here's a version that uses fast locals a,b,c: def h(d): a = 0 b = 0 c = 0 locals().update(d) from test import * print a,b,c It prints 0 0 0. 
The dis output: 20 0 LOAD_CONST 1 (0) 3 STORE_FAST 1 (a) 21 6 LOAD_CONST 1 (0) 9 STORE_FAST 2 (b) 22 12 LOAD_CONST 1 (0) 15 STORE_FAST 3 (c) 23 18 LOAD_NAME 0 (locals) 21 CALL_FUNCTION 0 24 LOAD_ATTR 1 (update) 27 LOAD_FAST 0 (d) 30 CALL_FUNCTION 1 33 POP_TOP 24 34 LOAD_CONST 2 (-1) 37 LOAD_CONST 3 (('*',)) 40 IMPORT_NAME 2 (test) 43 IMPORT_STAR 25 44 LOAD_FAST 1 (a) 47 PRINT_ITEM 48 LOAD_FAST 2 (b) 51 PRINT_ITEM 52 LOAD_FAST 3 (c) 55 PRINT_ITEM 56 PRINT_NEWLINE 57 LOAD_CONST 0 (None) 60 RETURN_VALUE > In Python 3, all functions behave like g(): import * is no longer > allowed, and exec() is no longer treated special (it's no longer a > reserved keyword). That's good. I don't think that any of this is really needed in Python. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 30 2012) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-10-23: Python Meeting Duesseldorf ... 54 days to go 2012-08-28: Released mxODBC 3.2.0 ... http://egenix.com/go31 2012-08-20: Released mxODBC.Connect 2.0.0 ... http://egenix.com/go30 ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From p.f.moore at gmail.com Thu Aug 30 09:57:51 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 30 Aug 2012 08:57:51 +0100 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net> Message-ID: On 30 August 2012 02:07, Nick Coghlan wrote: > The obvious form for such a statement is "LHS OP RHS", however > syntactic ambiguity in the evaluation of both the LHS and RHS > (relative to normal assigment) would likely prevent that. As a sketch, > I'll present a notation inspired by Ruby's block parameter syntax and > emphasising the link to def statements: > > def f(*args): > |a=None, b=None| def= *args > > def f(mapping): > |a=None, b=None, **other| def= **mapping I had a similar thought. My initial idea was to introduce a new keyword "bind" to introduce a binding assignment: bind a=None, b=None: *args bind a=None, b=None, **other: **mapping You *might* be able to reuse the def keyword, but I suspect the ambiguity would be too difficult to resolve. Rereading the above, the only major issue I see with this is that the comma/colon distinction is visually a bit too light. Adding parens might help, but probably gets a bit too punctuation-heavy: bind (a=None, b=None): *args bind (a=None, b=None, **other): **mapping Paul. 
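For comparison, the binding that the hypothetical "bind" statement above is after can be approximated in current Python with a throwaway inner function. This is only a sketch: `bind_example` and `_bind` are made-up names for illustration, and no `bind` keyword exists in Python.

```python
def bind_example(mapping):
    # Rough stand-in for the proposed
    #     bind a=None, b=None, **other: **mapping
    # The inner function's parameter spec does the name binding.
    def _bind(a=None, b=None, **other):
        return a, b, other

    a, b, other = _bind(**mapping)
    return a, b, other


# Keys missing from the mapping fall back to the defaults; extra keys
# are collected in `other`.
print(bind_example({"a": 1, "x": 9}))
```

The cost of this workaround is that the names a, b and other are spelled twice, which is part of what the proposed statement is trying to avoid.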
From ncoghlan at gmail.com Thu Aug 30 10:38:57 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 Aug 2012 18:38:57 +1000 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net> Message-ID: On Thu, Aug 30, 2012 at 5:57 PM, Paul Moore wrote: > Rereading the above, the only major issue I see with this is that the > comma/colon distinction is visually a bit too light. Adding parens > might help, but probably gets a bit too punctuation-heavy: > > bind (a=None, b=None): *args > bind (a=None, b=None, **other): **mapping We use "bind" as a method name in the standard library. Other possible keywords are likely to have the same problem - it's not a space rich in suitable terminology. Opening the statement with "|" would suffice though - it's otherwise invalid at the start of an expression, so it gives a lot of freedom in the grammar details for the rest of the statement, including the spelling of the assignment details (you could even reuse a single "=", something like ":=" or "()=" or else Terry's TIE fighter notation "(=)") Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ironfroggy at gmail.com Thu Aug 30 15:56:29 2012 From: ironfroggy at gmail.com (Calvin Spealman) Date: Thu, 30 Aug 2012 09:56:29 -0400 Subject: [Python-ideas] Verbose traceback formatting In-Reply-To: References: Message-ID: On Tue, Aug 28, 2012 at 6:26 PM, Mike Graham wrote: > It's possible to give a lot more on error than the default traceback > gives you. I propose that Python should ship a more verbose formatter > and a command line switch to use it. > > Here's an example of IPython's verbose formatter. 
I wrote a buggy program: > >> def f(a): >> x = a * 4 >> y = a - 4 >> return x / y >> >> def main(): >> for i in xrange(100): >> f(i) >> >> main() > > > and then ran it in IPython with verbose tracebacks and got the following output: > >> ZeroDivisionError Traceback (most recent call last) >> >> /home/mike/foo.py in () >> 8 f(i) >> 9 >> ---> 10 main() >> global main = >> 11 >> 12 >> >> /home/mike/foo.py in main() >> 6 def main(): >> 7 for i in xrange(100): >> ----> 8 f(i) >> global f = >> i = 4 >> 9 >> 10 main() >> >> /home/mike/foo.py in f(a=4) >> 2 x = a * 4 >> 3 y = a - 4 >> ----> 4 return x / y >> x = 16 >> y = 0 >> 5 >> 6 def main(): >> >> ZeroDivisionError: integer division or modulo by zero > > > > This is very handy! The reprs of all locals are input so I can see > what the values of a, x, and y were when I had my error and there are > a few lines of code on either side of the line that matters to help me > get oriented. The former feature is the more powerful one, although > enabling this by default is a bad idea; (first and foremost, this can > be a security hazard). I can't count how many trips into pdb this > would have saved me. > > I think having this feature be part of Python itself would be very > helpful to new learners and to those helping them. I constantly deal > with learners seeking help who are unable to clearly provide the > actual values and types of the objects in the code they're having > trouble with; it would be nice simply to say, "Show me a verbose > traceback" and might even help them to debug their code without > assistance. +1 on the more verbose formatter in stdlib. Rather than a flag, I'd like to see the path to a formatter, so you can use more than just the two built-in ones. It would be great to easily define my own and have that used in all cases.
python -t traceback.VerboseFormatter my_broken_script.py > Mike > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From steve at pearwood.info Thu Aug 30 16:18:22 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 31 Aug 2012 00:18:22 +1000 Subject: [Python-ideas] Verbose traceback formatting In-Reply-To: References: Message-ID: <503F762E.1050106@pearwood.info> On 29/08/12 08:26, Mike Graham wrote: > It's possible to give a lot more on error than the default traceback > gives you. I propose that Python should ship a more verbose formatter > and a command line switch to use it. There's no command line switch, but the Time Machine strikes again. Save your buggy script in a file "foo.py", then start up the interactive interpreter and run this: import cgitb cgitb.enable(format='text') import foo Personally, I find the IPython verbose formatter more readable and useful. To revert to ordinary tracebacks: sys.excepthook = sys.__excepthook__ -- Steven From guido at python.org Thu Aug 30 16:50:45 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 30 Aug 2012 07:50:45 -0700 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <503E5D51.6030804@pearwood.info> Message-ID: On Wed, Aug 29, 2012 at 11:57 PM, Serhiy Storchaka wrote: > On 29.08.12 21:20, Steven D'Aprano wrote: >> >> But once you have multiple >> targets, >> it gets ugly soon, e.g. 
the typical enumeration (anti-)pattern: >> >> RED, YELLOW, BLUE, GREEN = 'RED', 'YELLOW', 'BLUE', 'GREN' > > > for n in 'RED', 'YELLOW', 'BLUE', 'GREN': > globals()[n] = n Please, please, please. I much prefer to repeat myself than to use such an ugly hack. -- --Guido van Rossum (python.org/~guido) From guido at python.org Thu Aug 30 16:56:29 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 30 Aug 2012 07:56:29 -0700 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> Message-ID: On Wed, Aug 29, 2012 at 11:55 PM, Serhiy Storchaka wrote: > On 29.08.12 21:07, Alexander Belopolsky wrote: >> >> How can users write a solution that does not require repetition of >> variable names? > > def interior(a=None, b=None, c=None): > ... # work here > > interior(m) Bingo. Very nice solution! -- --Guido van Rossum (python.org/~guido) From steve at pearwood.info Thu Aug 30 17:14:07 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 31 Aug 2012 01:14:07 +1000 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net> Message-ID: <503F833F.8040904@pearwood.info> On 30/08/12 11:07, Nick Coghlan wrote: > Argument-parameter binding is actually a fairly powerful name binding > operation, quite distinct from ordinary assignment and tuple > unpacking. It consists of two separate (often confused) components: > - the parameter specification > - the argument list > > The parameter spec is what appears in the function *definition*, and > is really the main item of interest reported by the new > inspect.Signature objects in 3.3. 
> > A parameter spec allows you to do several things: > - define the names that will be bound locally > - define default values to be assigned to those names > - provide named holders for excess positional and keyword arguments > - indicate that certain values can *only* be supplied by name, and not > by position > > The argument list is what appears in a function *call*, and also has > several interesting features: > - can provide explicit positional arguments > - can provide explicit keyword arguments > - can provide an iterable of positional arguments > - can provide a mapping of additional keyword arguments > > It *may* make sense to decouple this name binding operation from > function calls (and Signature.bind) and make it available as a > language primitive in the form of a statement. You're talking about the *argument list* binding operations, yes? The stuff that happens when the function is called, not the parameter spec. > However, it's *not* the same thing as an ordinary assignment statement > or tuple unpacking, and we shouldn't try to jam it implicitly into the > existing assignment statement. Not only shouldn't we, but we *can't*, due to backward compatibility. This is probably obvious to Nick, but for the benefit of anyone else reading this who struggled as much as I did to see why we couldn't just "upgrade" the assignment statement to work like function call binding, here's a sketch of why we can't. Suppose the assignment statement was upgraded. Then we could do things like this: a, b, c=DEFAULT, **kwargs = 1, 2, 3, d="extra" and it would bind: a = 1 b = 2 c = 3 kwargs = {"d": "extra"} but DEFAULT would necessarily be unchanged. Just as if you called a function def spam(a, b, c=DEFAULT, **kwargs). Similarly if you did this: a, b, c=DEFAULT = 1, 2, 3 you would expect to get the bindings: a = 1 b = 2 c = 3 also with the default value DEFAULT unchanged. 
But that second example is already legal Python, and it works like this: py> DEFAULT = 42 py> a, b, c=DEFAULT = 1, 2, 3 py> a, b, c (1, 2, 3) py> DEFAULT (1, 2, 3) Somebody is relying on this behaviour, and so we cannot change assignment to work like function argument binding without breaking backwards compatibility. There may be other problems too, but for me, this backwards compatibility issue convinced me that regular assignment cannot be made to match function argument binding. I'm not convinced that we *should* expose function argument binding as a language primitive, but if we do, we can't use the regular assignment statement, we would need something new. > The obvious form for such a statement is "LHS OP RHS", however > syntactic ambiguity in the evaluation of both the LHS and RHS > (relative to normal assignment) would likely prevent that. I think you need to explain that in further detail. Suppose we used (making something up here) "::=" as the operator. Then e.g.: a, b, c=DEFAULT, d=None ::= 1, 2, c=3 seems unambiguous to me, provided: 1) you can't chain multiple ::= operators; 2) tuples on either side have to be delimited by parentheses, exactly the same as is already the case for function parameter lists and function calls, e.g.: py> def f(a, b=1,2,3): File "", line 1 def f(a, b=1,2,3): ^ SyntaxError: invalid syntax Can you explain where the ambiguity in things like this would lie, because I'm just not seeing it.
-- Steven From steve at pearwood.info Thu Aug 30 17:16:09 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 31 Aug 2012 01:16:09 +1000 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net> Message-ID: <503F83B9.6060203@pearwood.info> On 30/08/12 18:38, Nick Coghlan wrote: > On Thu, Aug 30, 2012 at 5:57 PM, Paul Moore wrote: >> Rereading the above, the only major issue I see with this is that the >> comma/colon distinction is visually a bit too light. Adding parens >> might help, but probably gets a bit too punctuation-heavy: >> >> bind (a=None, b=None): *args >> bind (a=None, b=None, **other): **mapping > > We use "bind" as a method name in the standard library. Other possible > keywords are likely to have the same problem - it's not a space rich > in suitable terminology. The Moby thesaurus lists 269 synonyms for "bind", and in my opinion not a single one other than bind itself is appropriate. There are 125 synonyms for "glue" (with some significant overlap), including "yoke" which I think holds promise. It's unlikely to exist in many people's code, and it has the right meaning of joining together. But I'm not sure that the associations with oxen yoked together is good or helpful. -- Steven From storchaka at gmail.com Thu Aug 30 17:46:38 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 30 Aug 2012 18:46:38 +0300 Subject: [Python-ideas] Unpack of sequences In-Reply-To: <503ECF2E.6060400@canterbury.ac.nz> References: <503E1FCA.7050309@nedbatchelder.com> <503ECF2E.6060400@canterbury.ac.nz> Message-ID: On 30.08.12 05:25, Greg Ewing wrote: > How about > > a, b, c, ... 
= d SyntaxError: can't assign to Ellipsis From mikegraham at gmail.com Thu Aug 30 17:56:45 2012 From: mikegraham at gmail.com (Mike Graham) Date: Thu, 30 Aug 2012 11:56:45 -0400 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <503ECF2E.6060400@canterbury.ac.nz> Message-ID: On Thu, Aug 30, 2012 at 11:46 AM, Serhiy Storchaka wrote: > On 30.08.12 05:25, Greg Ewing wrote: >> >> How about >> >> a, b, c, ... = d > > > SyntaxError: can't assign to Ellipsis The fact that it's currently a syntax error speaks in favour of the suggestion: if it were currently valid Python, you wouldn't want to make it do anything else. That being said, `a, b, c, *_ = d` or similar is probably better than introducing a new way. Mike From storchaka at gmail.com Thu Aug 30 19:05:50 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 30 Aug 2012 20:05:50 +0300 Subject: [Python-ideas] hook zipfile.ZipExtFile to check secure hash sums In-Reply-To: References: Message-ID: On 29.08.12 19:08, Daniel Holth wrote: > I am checking the sha256 sums of all the files in a zip archive as it > is being extracted by overriding ZipExtFile._update_crc, but it is > inconvenient. > > It would be nice to have a hook, for example the ZipExtFile > constructor could be a property of ZipFile and conveniently replaced > with a ZipExtFile subclass. Where do you get the reference checksums from? The ZIP file format specifies its own checksum algorithm; a file that stores different checksums is not a ZIP file (or is an invalid ZIP file). From storchaka at gmail.com Thu Aug 30 19:11:18 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 30 Aug 2012 20:11:18 +0300 Subject: [Python-ideas] hook zipfile.ZipExtFile to check secure hash sums In-Reply-To: References: Message-ID: On 29.08.12 19:08, Daniel Holth wrote: > I am checking the sha256 sums of all the files in a zip archive as it > is being extracted by overriding ZipExtFile._update_crc, but it is > inconvenient.
> > It would be nice to have a hook, for example the ZipExtFile > constructor could be a property of ZipFile and conveniently replaced > with a ZipExtFile subclass. And ZipExtFile is not public, it is an implementation detail. You should not use this class directly. In fact, I plan to offer a patch that removes ZipExtFile in 3.4. From mastahyeti at gmail.com Thu Aug 30 21:03:49 2012 From: mastahyeti at gmail.com (Ben Toews) Date: Thu, 30 Aug 2012 14:03:49 -0500 Subject: [Python-ideas] issue15824 Message-ID: Hello, We have been discussing the value of having namedtuple as the return type for urlparse.urlparse and urlparse.urlsplit. See that thread here: http://bugs.python.org/issue15824 . I jumped the gun and submitted a patch without seeing if anyone else thought different behavior was desirable. My argument is that it would be a major usability improvement if the return type supported item assignment. Currently, something like the following is necessary in order to parse, make changes, and unparse: import urlparse url = list(urlparse.urlparse('http://www.example.com/foo/bar?hehe=haha')) url[1] = 'python.com' new_url = urlparse.urlunparse(url) I think this is really clunky. I don't see any reason why we should be using a type that doesn't support item assignment and needs to be cast to another type in order to make changes. I think an interface like this is more useful: import urlparse url = urlparse.urlparse('http://www.example.com/foo/bar?hehe=haha') url.netloc = 'www.python.com' urlparse.urlunparse(url) What do other people think? -- -Ben Toews From storchaka at gmail.com Thu Aug 30 21:51:29 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 30 Aug 2012 22:51:29 +0300 Subject: [Python-ideas] issue15824 In-Reply-To: References: Message-ID: On 30.08.12 22:03, Ben Toews wrote: > I think this is really clunky.
I don't see any reason why we should be > using a type that doesn't support item assignment and needs to be > casted to a another type in order to make changes. Mutable urlparse result is backward incompatible. For now this result can be used as dict key. From greg.ewing at canterbury.ac.nz Fri Aug 31 02:17:00 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 31 Aug 2012 12:17:00 +1200 Subject: [Python-ideas] Unpack of sequences In-Reply-To: References: <503E1FCA.7050309@nedbatchelder.com> <503ECF2E.6060400@canterbury.ac.nz> Message-ID: <5040027C.2000001@canterbury.ac.nz> Mike Graham wrote: > That being said, `a, b, c, *_ = d` or similar is probably better than > introducing a new way. It's inefficient, though, because it results in iterating over the remainder of the sequence and building a new sequence that will never be used. There's currently no way to spell the most efficient way of doing this, which is simply to stop iterating and ignore the rest of the sequence. There's also no way to unpack the head of an infinite sequence without slicing it first. -- Greg From steve at pearwood.info Fri Aug 31 03:03:45 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 31 Aug 2012 11:03:45 +1000 Subject: [Python-ideas] Unpack of sequences In-Reply-To: <5040027C.2000001@canterbury.ac.nz> References: <503E1FCA.7050309@nedbatchelder.com> <503ECF2E.6060400@canterbury.ac.nz> <5040027C.2000001@canterbury.ac.nz> Message-ID: <50400D71.7000705@pearwood.info> On 31/08/12 10:17, Greg Ewing wrote: > Mike Graham wrote: >> That being said, `a, b, c, *_ = d` or similar is probably better than >> introducing a new way. > > It's inefficient, though, because it results in iterating > over the remainder of the sequence and building a new sequence > that will never be used. And if the sequence is large, that might even matter. I'm not being quite as dismissive as it might sound, but in practice, this inefficiency generally does not matter. 
It's unlikely to be a bottleneck. > There's currently no way to spell the most efficient way of > doing this, which is simply to stop iterating and ignore the > rest of the sequence. For sequences: a, b, c, d = really_long_sequence[:4] which might not be as efficient as possible, but by estimation it is within a factor of two of the most efficient as possible. For iterators, use itertools.islice. >There's also no way to unpack the > head of an infinite sequence without slicing it first. Surely this is the obvious way? head = next(sequence) # actually an iterator -- Steven From dholth at gmail.com Fri Aug 31 03:14:30 2012 From: dholth at gmail.com (Daniel Holth) Date: Thu, 30 Aug 2012 21:14:30 -0400 Subject: [Python-ideas] hook zipfile.ZipExtFile to check secure hash sums In-Reply-To: References: Message-ID: On Thu, Aug 30, 2012 at 1:05 PM, Serhiy Storchaka wrote: > On 29.08.12 19:08, Daniel Holth wrote: >> >> I am checking the sha256 sums of all the files in a zip archive as it >> is being extracted by overriding ZipExtFile._update_crc, but it is >> inconvenient. >> >> It would be nice to have a hook, for example the ZipExtFile >> constructor could be a property of ZipFile and conveniently replaced >> with a ZipExtFile subclass. > > > Where do you get the checksums for control? ZIP file format specifies the > checksum algorithm, file with another saved checksums is not ZIP file (or is > invalid ZIP file). The secure checksums are just listed in a different digitally signed file. It is nice to verify them during extraction instead of having to re-read the file. 
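A rough sketch of that single-pass verification, using only the public zipfile API (ZipFile.open) rather than the private ZipExtFile._update_crc hook. The function name `extract_verified` and the name-to-digest dict are assumptions made for this illustration; in the setup described above the digests would come from the separately signed file.

```python
import hashlib
import io
import zipfile


def extract_verified(zf, expected_sha256):
    """Read every member of `zf`, checking it against `expected_sha256`.

    `expected_sha256` maps member name -> hex digest (hypothetical layout).
    Returns a dict of member name -> bytes; raises ValueError on mismatch.
    """
    contents = {}
    for info in zf.infolist():
        digest = hashlib.sha256()
        chunks = []
        with zf.open(info) as member:
            # Hash while streaming, so each member is only read once.
            for chunk in iter(lambda: member.read(65536), b""):
                digest.update(chunk)
                chunks.append(chunk)
        if digest.hexdigest() != expected_sha256.get(info.filename):
            raise ValueError("sha256 mismatch for %s" % info.filename)
        contents[info.filename] = b"".join(chunks)
    return contents


# Usage with a small in-memory archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", b"hello")

expected = {"hello.txt": hashlib.sha256(b"hello").hexdigest()}
with zipfile.ZipFile(buf) as zf:
    files = extract_verified(zf, expected)
```

This keeps everything in memory for brevity; a real extractor would write each chunk to disk as it goes and remove the file if the digest does not match.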
From tjreedy at udel.edu Fri Aug 31 03:48:48 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 30 Aug 2012 21:48:48 -0400 Subject: [Python-ideas] Unpack of sequences In-Reply-To: <5040027C.2000001@canterbury.ac.nz> References: <503E1FCA.7050309@nedbatchelder.com> <503ECF2E.6060400@canterbury.ac.nz> <5040027C.2000001@canterbury.ac.nz> Message-ID: On 8/30/2012 8:17 PM, Greg Ewing wrote: > Mike Graham wrote: >> That being said, `a, b, c, *_ = d` or similar is probably better than >> introducing a new way. > > It's inefficient, though, because it results in iterating > over the remainder of the sequence and building a new sequence > that will never be used. > > There's currently no way to spell the most efficient way of > doing this, which is simply to stop iterating and ignore the > rest of the sequence. There's also no way to unpack the > head of an infinite sequence without slicing it first. dit = iter(d) a, b, c = next(dit), next(dit), next(dit) or, most efficiently a = next(dit) b = next(dit) c = next(dit) or a, b, c = [next(dit) for i in range(3)] -- Terry Jan Reedy From ncoghlan at gmail.com Fri Aug 31 09:42:44 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 31 Aug 2012 17:42:44 +1000 Subject: [Python-ideas] Unpack of sequences In-Reply-To: <503F833F.8040904@pearwood.info> References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net> <503F833F.8040904@pearwood.info> Message-ID: On Fri, Aug 31, 2012 at 1:14 AM, Steven D'Aprano wrote: > On 30/08/12 11:07, Nick Coghlan wrote: >> The obvious form for such a statement is "LHS OP RHS", however >> syntactic ambiguity in the evaluation of both the LHS and RHS >> (relative to normal assignment) would likely prevent that. > > > I think you need to explain that in further detail. Suppose we > used (making something up here) "::=" as the operator.
Then e.g.: > > a, b, c=DEFAULT, d=None :== 1, 2, c=3 > > seems unambiguous to me You're looking too far ahead - the (deliberately dumb) parser would choke because: a, b, c = DEFAULT, d = None is a valid assignment statement. It's one that can't work (due to the length mismatch in the tuple unpacking), but it's syntactically valid. Thus, by the time it reached the "::=", the parser wouldn't be able to backtrack and revise its opinion from "ordinary assignment statement" to "parameter binding operation". Instead, it would bail out, whinging about an unexpected token (in reality, you'd hit problems long before that stage - the parser generator would have quit on you, complaining about ambiguous syntax, probably with an entirely unhelpful error message). In some cases (such as augmented assignment) we can deal with that kind of ambiguity by making the parser permissive, and catching syntax errors at a later stage in the compilation pipeline (generally either symbol analysis or bytecode generation - I can't think of any reason error detection would ever be delayed until the peephole optimisation step). However, delayed validation would be a very poor approach in this case, since there are valid parameter specifications that are also completely valid assignment statements (as shown above). You'd have to maintain a complex set of "maybe this, maybe that" constructs in order to pull it off, which would be very, very, ugly (and make a mess of the AST). Furthermore, if something is hard for the *computer* to parse, odds are pretty good that humans are also going to struggle with it (natural language notwithstanding - in that case, computers are playing catchup with millions of years of evolutionary development). Fortunately, the entire ambiguity problem can go away *if* you can find a suitable prefixed or delimited syntax. 
Once you manage that, then the prefix or opening delimiter serves as a marker for both the compiler and the human reader that something different is happening. That's why I stole the |parameter-spec| notation directly from Ruby's block syntax for my syntactic sketch - it's a delimited syntax that doesn't conflict with other Python syntax ("|" can't start an expression - it can only appear as part of an operator or an augmented assignment statement). The choice of operator for the name binding operation itself would then be fairly arbitrary, since the parser would already know it was dealing with a parameter binding operation. "|param-spec| ::= arglist", "|param-spec| <- arglist", "|param-spec| def= arglist", "|param-spec| ()= arglist", "|param-spec| (=) arglist", "|param-spec|(arglist)" "|param-spec| from arglist" would all be viable options from a syntactic point of view (YMMV wildly from a readability point of view, of course) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri Aug 31 09:48:24 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 31 Aug 2012 17:48:24 +1000 Subject: [Python-ideas] Unpack of sequences In-Reply-To: <503F833F.8040904@pearwood.info> References: <503E1FCA.7050309@nedbatchelder.com> <1FF2B6B8-1160-44D6-BF39-771326A833FF@masklinn.net> <503E4AD5.6040102@pearwood.info> <503E4ED4.9020006@pearwood.info> <65E3F232-C8EA-4D58-8910-465BDB6FDC40@masklinn.net> <503F833F.8040904@pearwood.info> Message-ID: On Fri, Aug 31, 2012 at 1:14 AM, Steven D'Aprano wrote: > You're talking about the *argument list* binding operations, yes? The > stuff that happens when the function is called, not the parameter spec. A function call binds parameter names to argument values or, equivalently, argument values to parameter names, so it doesn't really matter whether you call it "parameter (name) binding" or "argument (value) binding". 
It's the same operation, just emphasising either the source of the names or the source of the values. I normally talk about binding names to values, since a name can only refer to one value at a time, but there may be multiple names bound to any given value. Hence, I prefer to call this process "parameter binding" - it's the step of assigning a value to each of the parameter names before the body of the function starts executing. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
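The parameter binding step described in this thread is already reachable from Python code through inspect.Signature.bind, which was mentioned earlier as new in 3.3. A small sketch follows; the function `spec` is a made-up placeholder that exists only to carry a parameter spec, and apply_defaults() assumes Python 3.5 or later.

```python
import inspect


def spec(a, b, c=None, **other):
    """Never called; it only carries the parameter spec."""


sig = inspect.signature(spec)

# The "argument list": two positional values and one extra keyword.
bound = sig.bind(1, 2, d="extra")

# Fill in defaults for parameters that received no argument (c=None here).
bound.apply_defaults()

result = dict(bound.arguments)
print(result)
```

Here a, b and c end up bound to 1, 2 and None, and the extra keyword argument is collected under `other`, which is the same outcome as calling a function with that signature. Passing arguments that do not fit the spec makes bind() raise TypeError, again mirroring an ordinary call.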