PEP czar for PEP 3144?
Does anyone object to me naming myself PEP czar for PEP 3144?

I've collated the objections to the original proposal on a few different occasions throughout the (long!) PEP review process, and as noted in the Background section, the latest version of the PEP [1] has addressed the key concerns that were raised:

- the "strict" flag for Network objects is gone (instead, the validation differences between IP Network and IP Interface definitions are handled as different classes with otherwise similar interfaces)
- the factory function naming scheme follows PEP 8
- some properties have been given new names that make it clearer what kind of object they produce
- the module itself has been given a new name (ipaddress) to avoid clashing with the existing ipaddr module on PyPI

There's also basic-but-usable module documentation available (http://code.google.com/p/ipaddr-py/wiki/Using3144).

So, unless there are any new objections, I'd like to:

- approve ipaddress for inclusion in Python 3.3
- grant Peter Moody push access as the module maintainer
- create a tracker issue to cover incorporating the new module into the standard library, documentation and test suite

(There are still a few places in both the PEP and the preliminary documentation that say "ipaddr" instead of "ipaddress", but those can be cleaned up as the module gets integrated.)

I don't personally think the module API needs the provisional disclaimer as the core functionality has been tested for years in ipaddr and the API changes in ipaddress are just cosmetic ones either for PEP 8 conformance, or to make the API map more cleanly to the underlying networking concepts. However, I'd be willing to include that proviso if anyone else has lingering concerns.

Regards,
Nick.

[1] http://www.python.org/dev/peps/pep-3144/

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Mon, 20 Feb 2012 23:23:13 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
Does anyone object to me naming myself PEP czar for PEP 3144?
“Tsar is a title used to designate certain European Slavic monarchs or supreme rulers.” Is this our official word?
There's also basic-but-usable module documentation available (http://code.google.com/p/ipaddr-py/wiki/Using3144).
Mmmh, some comments: - a network can be "in" another network? Sounds strange. Compare with sets, which can be ordered, but not contained one within another. The idea of an address or network being "in" an interface sounds even stranger. - iterhosts()? Why not simply hosts()? - “A TypeError exception is raised if you try to compare objects of different versions or different types.”: I hope equality still works? Regards Antoine.
On Mon, Feb 20, 2012 at 11:55 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Mon, 20 Feb 2012 23:23:13 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
Does anyone object to me naming myself PEP czar for PEP 3144?
“Tsar is a title used to designate certain European Slavic monarchs or supreme rulers.”
Is this our official word?
PEP czar/tsar and BDFOP (Benevolent Dictator for One PEP) are the two names I've seen for the role. I don't have a strong preference either way (just a mild preference for 'czar').
There's also basic-but-usable module documentation available (http://code.google.com/p/ipaddr-py/wiki/Using3144).
Mmmh, some comments: - a network can be "in" another network? Sounds strange. Compare with sets, which can be ordered, but not contained one within another. The idea of an address or network being "in" an interface sounds even stranger.
Ah, I'd missed that one. Yes, I think this is a holdover from the main ipaddr module which plays fast and loose with type correctness by implicitly converting between networks and addresses in all sorts of places. It doesn't have Network and Interface as separate types (calling them both "Networks") and it appears the current incarnation of the Interface API still retains a few too many Network-specific behaviours. I agree the "container" behaviour should be reserved for the actual Network API, with Interface objects behaving more like Addresses in that respect. I also agree Network subset and superset checks should follow a set-style API rather than overloading the containment checks. There are actually a few other behaviours (like compare_networks()) that should probably be moved to the Network objects, and accessed via the "network" property for Interface objects.
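For readers following along, the split described here (containment for addresses, explicit set-style checks between networks) can be sketched with the ipaddress module as it later shipped in the standard library. This is only an illustration of the idea, not the state of the 3144 branch at the time of this message; subnet_of() and supernet_of() were added to the standard library in Python 3.7, and the example addresses are arbitrary.

```python
import ipaddress

net = ipaddress.ip_network("192.0.2.0/24")
subnet = ipaddress.ip_network("192.0.2.128/25")
addr = ipaddress.ip_address("192.0.2.200")

# Containment is used for "is this address inside that network?"
addr_in_net = addr in net

# Network-to-network relationships use explicit, set-style methods
# rather than overloading the "in" operator.
is_subnet = subnet.subnet_of(net)
is_supernet = net.supernet_of(subnet)

print(addr_in_net, is_subnet, is_supernet)
```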
- iterhosts()? Why not simply hosts()?
And I missed that one, too. Perhaps that provisional marker wouldn't be such a bad idea after all... One requirement for integration would be fleshing out the standard library version of the documentation to include a full public API reference for the module and public classes, which will also help highlight any lingering naming problems, as well as areas where APIs that currently return realised lists should probably be returning iterators instead (there's currently iter_subnets() and subnet(), which should just be a single subnets() iterator).
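As a rough sketch of the "single iterator-producing method" shape being proposed, this is how hosts() and subnets() behave in the ipaddress module that eventually shipped in the standard library. The /30 network is an arbitrary example, and callers wrap the result in list() only when they actually need a concrete list.

```python
import ipaddress

net = ipaddress.ip_network("192.0.2.0/30")

# hosts() yields the usable host addresses; realise them only on demand.
hosts = list(net.hosts())

# subnets() with no arguments splits the network into two halves.
subnets = list(net.subnets())

print([str(h) for h in hosts])
print([str(s) for s in subnets])
```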
- “A TypeError exception is raised if you try to compare objects of different versions or different types.”: I hope equality still works?
It looks like it's supposed to (and does for Address objects), but there's currently a bug in the _BaseInterface.__eq__ impl that makes it return None instead of False (the method impl *should* be returning NotImplemented, just as _BaseAddress does, with the interpreter then reporting False if both sides return NotImplemented). There's currently an implicit promotion of Address objects to Interface objects, such that "network_or_interface == address" is the same as "network_or_interface.ip == address". So yes, with the appropriate boundaries between the different types of objects still being a little blurred, I think a "provisional" marker is definitely warranted. Some of the APIs that are currently available directly on Interface objects should really be accessed via their .network property instead. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
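The NotImplemented protocol mentioned above can be shown with a minimal sketch. The Address and Interface classes below are hypothetical stand-ins, not the real _BaseAddress and _BaseInterface; they only demonstrate that when both operands return NotImplemented from __eq__, the interpreter falls back to an identity comparison and "==" evaluates to False instead of raising or returning None.

```python
class Address:
    """Hypothetical stand-in for an address type."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Address):
            # Let the interpreter try the reflected comparison.
            return NotImplemented
        return self.value == other.value

    def __hash__(self):
        return hash(self.value)


class Interface:
    """Hypothetical stand-in for an interface type (deliberately unrelated)."""

    def __init__(self, value, prefixlen):
        self.value = value
        self.prefixlen = prefixlen

    def __eq__(self, other):
        if not isinstance(other, Interface):
            return NotImplemented
        return (self.value, self.prefixlen) == (other.value, other.prefixlen)

    def __hash__(self):
        return hash((self.value, self.prefixlen))


# Both __eq__ methods return NotImplemented, so the result is plain False.
mixed_equal = Address(1) == Interface(1, 24)
mixed_not_equal = Address(1) != Interface(1, 24)
print(mixed_equal, mixed_not_equal)
```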
Nick Coghlan wrote:
On Mon, Feb 20, 2012 at 11:55 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Mon, 20 Feb 2012 23:23:13 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
Does anyone object to me naming myself PEP czar for PEP 3144? “Tsar is a title used to designate certain European Slavic monarchs or supreme rulers.”
Is this our official word?
PEP czar/tsar and BDFOP (Benevolent Dictator for One PEP) are the two names I've seen for the role. I don't have a strong preference either way (just a mild preference for 'czar').
Also, "Czar" is commonly used in US politics as an informal term for the top official responsible for an area. "Drug Czar" is only the most familiar: http://en.wikipedia.org/wiki/List_of_U.S._executive_branch_%27czars%27 -- Steven
Steven D'Aprano writes:
Also, "Czar" is commonly used in US politics as an informal term for the top official responsible for an area.
I think here the most important connotation is that in US parlance a "czar" does not report to a committee, and with the exception of a case where Sybil is appointed czar, cannot bikeshed. Decisions get made (what a concept!)
On Mon, Feb 20, 2012 at 4:53 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Steven D'Aprano writes:
Also, "Czar" is commonly used in US politics as an informal term for the top official responsible for an area.
I think here the most important connotation is that in US parlance a "czar" does not report to a committee, and with the exception of a case where Sybil is appointed czar, cannot bikeshed. Decisions get made (what a concept!)
I'm curious how old that usage is. I first encountered it around '88 when I interned for a summer at DEC SRC (long since subsumed into HP Labs); the person in charge of deciding a particular aspect of their software or organization was called a czar, e.g. the documentation czar. -- --Guido van Rossum (python.org/~guido)
On 2/20/2012 11:52 PM, Guido van Rossum wrote:
On Mon, Feb 20, 2012 at 4:53 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Steven D'Aprano writes:
Also, "Czar" is commonly used in US politics as an informal term for the top official responsible for an area.
I think here the most important connotation is that in US parlance a "czar" does not report to a committee, and with the exception of a case where Sybil is appointed czar, cannot bikeshed. Decisions get made (what a concept!)
I'm curious how old that usage is. I first encountered it around '88 when I interned for a summer at DEC SRC (long since subsumed into HP Labs); the person in charge of deciding a particular aspect of their software or organization was called a czar, e.g. the documentation czar.
In US politics, the first I remember was the Drug Czar about that time. It really came into currency during Clinton's admin. -- Terry Jan Reedy
On 2/21/12 4:52 AM, Guido van Rossum wrote:
On Mon, Feb 20, 2012 at 4:53 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Steven D'Aprano writes:
Also, "Czar" is commonly used in US politics as an informal term for the top official responsible for an area.
I think here the most important connotation is that in US parlance a "czar" does not report to a committee, and with the exception of a case where Sybil is appointed czar, cannot bikeshed. Decisions get made (what a concept!)
I'm curious how old that usage is. I first encountered it around '88 when I interned for a summer at DEC SRC (long since subsumed into HP Labs); the person in charge of deciding a particular aspect of their software or organization was called a czar, e.g. the documentation czar.
From the Wikipedia article Steven cited: """ The earliest known use of the term for a U.S. government official was in the administration of Franklin Roosevelt (1933–1945), during which eleven unique positions (or twelve if one were to count "Economic Czar" and "Economic Czar of World War II" as separate) were so described. The term was revived, mostly by the press, to describe officials in the Nixon and Ford administrations and continues today. """ http://en.wikipedia.org/wiki/List_of_U.S._executive_branch_%27czars%27 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
Does anyone object to me naming myself PEP czar for PEP 3144?
“Tsar is a title used to designate certain European Slavic monarchs or supreme rulers.”
Is this our official word?
"supreme ruler" sounds good to me. I could go for "inquisitor" instead of "czar" as well... Regards, Martin
On 20/02/2012 16:28, Senthil Kumaran wrote:
On Tue, Feb 21, 2012 at 12:07 AM, <martin@v.loewis.de> wrote:
"supreme ruler" sounds good to me. I could go for "inquisitor" instead of "czar" as well...
But that would be bad for developers from Spain as nobody would expect a Spanish Inquisition.
:-)
How about Big Brother then? Has anyone worked in room 101? -- Cheers. Mark Lawrence.
I like 'PEP czar' On Mon, Feb 20, 2012 at 6:50 PM, Mark Lawrence <breamoreboy@yahoo.co.uk> wrote:
On 20/02/2012 16:28, Senthil Kumaran wrote:
On Tue, Feb 21, 2012 at 12:07 AM, <martin@v.loewis.de> wrote:
"supreme ruler" sounds good to me. I could go for "inquisitor" instead of "czar" as well...
But that would be bad for developers from Spain as nobody would expect a Spanish Inquisition.
:-)
How about Big Brother then? Has anyone worked in room 101?
-- Cheers.
Mark Lawrence.
_______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com
On Mon, Feb 20, 2012 at 14:23, Nick Coghlan <ncoghlan@gmail.com> wrote:
I don't personally think the module API needs the provisional disclaimer as the core functionality has been tested for years in ipaddr and the API changes in ipaddress are just cosmetic ones either for PEP 8 conformance, or to make the API map more cleanly to the underlying networking concepts. However, I'd be willing to include that proviso if anyone else has lingering concerns.
Should it be net.ipaddress instead of just ipaddress? Somewhat nested is better than fully flat. Cheers, Dirkjan
On Mon, 20 Feb 2012 16:20:15 +0100 Dirkjan Ochtman <dirkjan@ochtman.nl> wrote:
On Mon, Feb 20, 2012 at 14:23, Nick Coghlan <ncoghlan@gmail.com> wrote:
I don't personally think the module API needs the provisional disclaimer as the core functionality has been tested for years in ipaddr and the API changes in ipaddress are just cosmetic ones either for PEP 8 conformance, or to make the API map more cleanly to the underlying networking concepts. However, I'd be willing to include that proviso if anyone else has lingering concerns.
Should it be net.ipaddress instead of just ipaddress?
Somewhat nested is better than fully flat.
IMHO, nesting without a good, consistent, systematic categorization leads to very unpleasant results (e.g. "from urllib.request import urlopen"). Historically, our stdlib has been flat and I think it should stay so, short of redoing the whole hierarchy. (note this has nothing to do with the possible implementation of modules as packages, such as unittest or importlib) Regards Antoine.
On Mon, Feb 20, 2012 at 16:27, Antoine Pitrou <solipsis@pitrou.net> wrote:
Should it be net.ipaddress instead of just ipaddress?
Somewhat nested is better than fully flat.
IMHO, nesting without a good, consistent, systematic categorization leads to very unpleasant results (e.g. "from urllib.request import urlopen").
Historically, our stdlib has been flat and I think it should stay so, short of redoing the whole hierarchy.
(note this has nothing to do with the possible implementation of modules as packages, such as unittest or importlib)
I thought Python 3 already came with a net package, but apparently that plan has long been discarded. So I retract my suggestion. Cheers, Dirkjan
On Mon, Feb 20, 2012 at 11:27 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
IMHO, nesting without a good, consistent, systematic categorization leads to very unpleasant results (e.g. "from urllib.request import urlopen").
Historically, our stdlib has been flat and I think it should stay so, short of redoing the whole hierarchy.
I concur. Arbitrary nesting should be avoided.
On 2/20/2012 8:23 AM, Nick Coghlan wrote:
Does anyone object to me naming myself PEP czar for PEP 3144?
I think it great that you volunteer to be the PEP czar and hope Guido appoints you -- especially after your response to Antoine. Since this is a Python 3 module, let us start off with a modern Python 3 interface. That includes returning iterators instead of lists unless there is a really good reason. I can see how an outside developer could have difficulty getting integrated into our collective PEP process ;-). -- Terry Jan Reedy
Approved. Nick is PEP czar for PEP 3144. Thanks Nick! On Mon, Feb 20, 2012 at 11:13 AM, Terry Reedy <tjreedy@udel.edu> wrote:
On 2/20/2012 8:23 AM, Nick Coghlan wrote:
Does anyone object to me naming myself PEP czar for PEP 3144?
I think it great that you volunteer to be the PEP czar and hope Guido appoints you -- especially after your response to Antoine. Since this is a Python 3 module, let us start off with a modern Python 3 interface. That includes returning iterators instead of lists unless there is a really good reason.
I can see how an outside developer could have difficulty getting integrated into our collective PEP process ;-).
-- Terry Jan Reedy
-- --Guido van Rossum (python.org/~guido)
On Tue, Feb 21, 2012 at 7:09 AM, Guido van Rossum <guido@python.org> wrote:
Approved. Nick is PEP czar for PEP 3144. Thanks Nick!
In that case the addition of the "ipaddress" module is approved for 3.3, with a provisional caveat on the API details. I'm doing it that way because I think those remaining details can be better flushed out by the integration process (in particular, populating full module API reference documentation) than they could by another round of updates on the PEP and the ipaddr 3144 branch.

At the very least:

- the IP Interface API needs to move to a point where it more clearly *is* an IP Address and *has* an associated IP Network (rather than being the other way around)
- IP Network needs to behave more like an ordered set of sequential IP Addresses (without sometimes behaving like an Address in its own right)
- iterable APIs should consistently produce iterators (leaving users free to wrap list() around the calls if they want the concrete realisation)

Initial maintainers will be me (for the semantically cleaner incarnation of the module API) and Peter (for the IPv4 and IPv6 correctness heavy lifting and ensuring any API updates only change the spelling of particular operations, such as adding a ".network." to some current operations on Interface objects, rather than reducing overall module functionality).

This approach means we will still gain the key benefits of using the PyPI-tested ipaddr as a base (i.e. correct IP address parsing and generation, full coverage of the same set of supported operations) while exposing a simpler semantic model for new users that first encounter these concepts through the standard library module documentation:

- IP Address as the core abstraction
- IP Network as a container for IP Addresses
- IP Interface as an IP Address with an associated IP Network

Cheers,
Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
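The three-part model above (Address as the core abstraction, Network as a container, Interface as an Address with an associated Network) can be illustrated with the ipaddress module as it later shipped in the standard library. This is a sketch of the end result rather than of the code under review in this thread, and the example address is arbitrary.

```python
import ipaddress

iface = ipaddress.ip_interface("192.0.2.5/24")

# An Interface *is* an Address...
is_address = isinstance(iface, ipaddress.IPv4Address)

# ...and *has* an associated Network, reached via the .network property.
network = iface.network

# The bare address, without the prefix information.
plain_ip = iface.ip

# The Network is the container; membership is tested against it.
contained = plain_ip in network

print(is_address, network, plain_ip, contained)
```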
Just checking in: On Mon, Feb 20, 2012 at 5:48 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
At the very least: - the IP Interface API needs to move to a point where it more clearly *is* an IP Address and *has* an associated IP Network (rather than being the other way around)
This is done [1]. There's cleanup that needs to happen here, but the interface classes are now subclasses of the respective address classes. Now I need to apply some consistency and then move on to the remaining points:
- IP Network needs to behave more like an ordered set of sequential IP Addresses (without sometimes behaving like an Address in its own right) - iterable APIs should consistently produce iterators (leaving users free to wrap list() around the calls if they want the concrete realisation)
Cheers, peter [1] http://code.google.com/p/ipaddress-py/source/detail?r=10dd6a68139fb991162198... (the date is munged b/c I rebased to my original commit before submitting). -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038
On Thu, Mar 1, 2012 at 3:13 PM, Peter Moody <pmoody@google.com> wrote:
Just checking in:
On Mon, Feb 20, 2012 at 5:48 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
At the very least: - the IP Interface API needs to move to a point where it more clearly *is* an IP Address and *has* an associated IP Network (rather than being the other way around)
This is done [1]. There's cleanup that needs to happen here, but the interface classes are now subclasses of the respective address classes.
Thanks for the update! I'll be moving house this month, which may disrupt things a bit, but I'll still be trying to keep up with email, etc. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Wed, Feb 29, 2012 at 9:13 PM, Peter Moody <pmoody@google.com> wrote:
Just checking in:
On Mon, Feb 20, 2012 at 5:48 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
At the very least: - the IP Interface API needs to move to a point where it more clearly *is* an IP Address and *has* an associated IP Network (rather than being the other way around)
This is done [1]. There's cleanup that needs to happen here, but the interface classes are now subclasses of the respective address classes.
Now I need to apply some consistency and then move on to the remaining points:
- IP Network needs to behave more like an ordered set of sequential IP Addresses (without sometimes behaving like an Address in its own right)
This is done [2]. Consistent iterable apis and polish left to do. Cheers, peter [2] http://code.google.com/p/ipaddress-py/source/detail?r=578ef1777018211f536cac...
- iterable APIs should consistently produce iterators (leaving users free to wrap list() around the calls if they want the concrete realisation)
Cheers, peter
[1] http://code.google.com/p/ipaddress-py/source/detail?r=10dd6a68139fb991162198... (the date is munged b/c I rebased to my original commit before submitting).
-- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038
On Mon, Mar 12, 2012 at 9:15 AM, Peter Moody <pmoody@google.com> wrote:
- iterable APIs should consistently produce iterators (leaving users free to wrap list() around the calls if they want the concrete realisation)
I might've missed earlier discussion somewhere, but can someone point me at an example of an iterable API in ipaddr/ipaddress where an iterator isn't consistently produced? Cheers, peter -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038
On Mon, Mar 19, 2012 at 12:44 PM, Peter Moody <pmoody@google.com> wrote:
On Mon, Mar 12, 2012 at 9:15 AM, Peter Moody <pmoody@google.com> wrote:
- iterable APIs should consistently produce iterators (leaving users free to wrap list() around the calls if they want the concrete realisation)
I might've missed earlier discussion somewhere, but can someone point me at an example of an iterable API in ipaddr/ipaddress where an iterator isn't consistently produced?
There was at least one that I recall, now to find it again...

And searching for "list" in the PEP 3144 branch source highlights subnet() vs iter_subnets() as the main problem child: https://code.google.com/p/ipaddr-py/source/browse/branches/3144/ipaddress.py...

A single "subnets()" method that produced the iterator would seem to make more sense (with a "list()" call wrapped around it when the consumer really wants a concrete list).

There are a few other cases that produce a list that are less clearcut. I *think* summarising the address range could be converted to an iterator, since the "networks" accumulation list doesn't get referenced by the summarising algorithm. Similarly, there doesn't appear to be a compelling reason for "address_exclude" to produce a concrete list. (I also noticed a couple of "assert True == False" statements in that method for "this should never happen" code branches. An explicit "raise AssertionError" is a better way to handle such cases, so the code remains present even under -O and -OO.)

Collapsing the address list has to build the result list anyway to actually handle the deduplication part of its job, so returning a concrete list makes sense in that case.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
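The point about "assert True == False" can be shown with a small sketch. The pick_half() helper below is hypothetical and not taken from ipaddress; it only illustrates a "this should never happen" branch written as an explicit raise, which still executes when Python runs with -O or -OO, whereas an assert statement would be compiled away.

```python
def pick_half(address, lower, upper):
    """Return whichever of the two containers holds address.

    Hypothetical helper for illustration only.
    """
    if address in lower:
        return lower
    elif address in upper:
        return upper
    else:
        # Unlike an assert statement, this raise is kept under -O and -OO.
        raise AssertionError(
            "%r is in neither %r nor %r" % (address, lower, upper)
        )


# range objects stand in for the two halves of a network here.
print(pick_half(3, range(0, 4), range(4, 8)))
```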
Nick Coghlan wrote:
Collapsing the address list has to build the result list anyway to actually handle the deduplication part of its job, so returning a concrete list makes sense in that case.
Having only one function return a list instead of an iterator seems questionable. Depending on the code it could either keep track of what it has returned so far in a set and avoid duplication that way; or, just return an `iter(listobject)` instead of `listobject`. ~Ethan~
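The two options suggested here can be sketched generically. Both helpers below are hypothetical and deliberately simplified: they deduplicate plain values and do not implement the real address-collapsing algorithm, which also has to merge adjacent networks. The first tracks what it has already yielded in a set; the second builds the list internally but hands back only an iterator over it.

```python
def unique(items):
    """Yield each item once, in first-seen order (hypothetical helper)."""
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


def as_iterator(items):
    """Build the full result internally, but return only an iterator."""
    result = sorted(set(items))
    return iter(result)


print(list(unique([3, 1, 3, 2, 1])))
print(list(as_iterator([3, 1, 3, 2, 1])))
```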
On Mon, Mar 19, 2012 at 12:37 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
Nick Coghlan wrote:
Collapsing the address list has to build the result list anyway to actually handle the deduplication part of its job, so returning a concrete list makes sense in that case.
Having only one function return a list instead of an iterator seems questionable.
Depending on the code it could either keep track of what it has returned so far in a set and avoid duplication that way; or, just return an `iter(listobject)` instead of `listobject`.
I know I'm lacking context, but is the list ever expected to be huge? If not, what's wrong with always returning a list? -- --Guido van Rossum (python.org/~guido)
Guido van Rossum wrote:
On Mon, Mar 19, 2012 at 12:37 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
Nick Coghlan wrote:
Collapsing the address list has to build the result list anyway to actually handle the deduplication part of its job, so returning a concrete list makes sense in that case.
Having only one function return a list instead of an iterator seems questionable.
Depending on the code it could either keep track of what it has returned so far in a set and avoid duplication that way; or, just return an `iter(listobject)` instead of `listobject`.
I know I'm lacking context, but is the list ever expected to be huge? If not, what's wrong with always returning a list?
Nothing wrong in and of itself. It just seems to me that if we have several functions that deal with ip addresses/networks/etc, and all but one return iterators, that one is going to be a pain... 'Which one returns a list again? Oh yeah, that one.' Granted it's mostly a stylistic preference for consistency. ~Ethan~
On Mon, Mar 19, 2012 at 1:13 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
Guido van Rossum wrote:
On Mon, Mar 19, 2012 at 12:37 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
Nick Coghlan wrote:
Collapsing the address list has to build the result list anyway to actually handle the deduplication part of its job, so returning a concrete list makes sense in that case.
Having only one function return a list instead of an iterator seems questionable.
Depending on the code it could either keep track of what it has returned so far in a set and avoid duplication that way; or, just return an `iter(listobject)` instead of `listobject`.
I know I'm lacking context, but is the list ever expected to be huge? If not, what's wrong with always returning a list?
Nothing wrong in and of itself. It just seems to me that if we have several functions that deal with ip addresses/networks/etc, and all but one return iterators, that one is going to be a pain... 'Which one returns a list again? Oh yeah, that one.'
It depends on whether they really are easy to confuse. If they are, indeed that feels like poor API design. But sometimes the only time two things seem confusingly similar is when you have not actually tried to use them. A naming convention often helps too.
Granted it's mostly a stylistic preference for consistency.
And remember that consistency is good in moderation, but if it becomes a goal in itself you may have a problem. -- --Guido van Rossum (python.org/~guido)
Guido van Rossum wrote:
On Mon, Mar 19, 2012 at 1:13 PM, Ethan Furman wrote:
Nothing wrong in and of itself. It just seems to me that if we have several functions that deal with ip addresses/networks/etc, and all but one return iterators, that one is going to be a pain... 'Which one returns a list again? Oh yeah, that one.'
It depends on whether they really are easy to confuse. If they are, indeed that feels like poor API design. But sometimes the only time two things seem confusingly similar is when you have not actually tried to use them.
Heh -- true, I have not tried to use them (yet) -- just offering another viewpoint. ;)
Granted it's mostly a stylistic preference for consistency.
And remember that consistency is good in moderation, but if it becomes a goal in itself you may have a problem.
While I agree that consistency as a goal in and of itself is not good, I consider it more important than 'moderation' implies; in my own code I try to only be inconsistent when there is a good reason to be. To me, "it's already a list" isn't a good reason -- yes, that's easier for the library author, but is it easier for the library user? What is the library user gaining by having a list returned instead of an iterator? Of course, the flip-side also holds: what is the library user losing by getting an iterator when a list was available? When we weigh the pros and cons, and it comes down to a smidgeon of performance in trade for consistency [1], I would vote for consistency. ~Ethan~ [1] I'm assuming that 'iter(some_list)' is a quick operation.
On Mon, Mar 19, 2012 at 02:50:22PM -0700, Ethan Furman wrote:
Guido van Rossum wrote: [...]
And remember that consistency is good in moderation, but if it becomes a goal in itself you may have a problem.
While I agree that consistency as a goal in and of itself is not good, I consider it more important than 'moderation' implies; in my own code I try to only be inconsistent when there is a good reason to be.
I think we're probably in violent agreement, but I would put it this way: Consistency for its own sake *is* good, since consistency makes it easier for people to reason about the behaviour of functions on the basis that they are similar to other functions. But it is not the *only* good, and it is legitimate to trade-off one good for another good as needed.
To me, "it's already a list" isn't a good reason -- yes, that's easier for the library author, but is it easier for the library user? What is the library user gaining by having a list returned instead of an iterator?
I guess this discussion really hinges on which of these two positions you take: 1. The function naturally returns a list, should we compromise that simplicity by returning an iterator to be consistent with the other related/similar functions in the library? 2. These related/similar functions naturally return iterators, should we compromise that consistency by allowing one of them to return a list as it simplifies the implementation?
Of course, the flip-side also holds: what is the library user losing by getting an iterator when a list was available?
When we weigh the pros and cons, and it comes down to a smidgeon of performance in trade for consistency [1], I would vote for consistency.
I lean that way as well.
~Ethan~
[1] I'm assuming that 'iter(some_list)' is a quick operation.
For very small lists, it's about half as expensive as creating the list in the first place:

steve@runes:~$ python3.2 -m timeit -s "x = (1,2,3)" "list(x)"
1000000 loops, best of 3: 0.396 usec per loop
steve@runes:~$ python3.2 -m timeit -s "x = (1,2,3)" "iter(list(x))"
1000000 loops, best of 3: 0.614 usec per loop

For large lists, it's approximately free:

steve@runes:~$ python3.2 -m timeit -s "x = (1,2,3)*10000" "list(x)"
10000 loops, best of 3: 111 usec per loop
steve@runes:~$ python3.2 -m timeit -s "x = (1,2,3)*10000" "iter(list(x))"
10000 loops, best of 3: 111 usec per loop

On the other hand, turning the list iterator into a list again is probably not quite so cheap.

-- Steven
On Mon, Mar 19, 2012 at 2:50 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
[1] I'm assuming that 'iter(some_list)' is a quick operation.
This seems to be the case so I've just gone ahead and renamed collapse_address_list to collapse_addresses and added 'return iter(...)' to the end. The rest of the list-returning methods all return iterators now too. There should only be a few minor outstanding issues to work out. Cheers, peter -- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038
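For reference, this is roughly how the renamed function is used in the ipaddress module that later shipped in the standard library. It is a sketch against the released module, not the branch revision described in this message; the input networks are arbitrary and include a deliberate duplicate. The result is consumed as an iterator and wrapped in list() only for display.

```python
import ipaddress

halves = [
    ipaddress.ip_network("192.0.2.0/25"),
    ipaddress.ip_network("192.0.2.128/25"),
    ipaddress.ip_network("192.0.2.0/25"),  # duplicate entry
]

# collapse_addresses() merges adjacent networks and drops duplicates.
collapsed = ipaddress.collapse_addresses(halves)
collapsed_list = list(collapsed)

print([str(n) for n in collapsed_list])
```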
On Mon, Mar 19, 2012 at 12:55 PM, Guido van Rossum <guido@python.org> wrote:
On Mon, Mar 19, 2012 at 12:37 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
Nick Coghlan wrote:
Collapsing the address list has to build the result list anyway to actually handle the deduplication part of its job, so returning a concrete list makes sense in that case.
Having only one function return a list instead of an iterator seems questionable.
Depending on the code it could either keep track of what it has returned so far in a set and avoid duplication that way; or, just return an `iter(listobject)` instead of `listobject`.
I know I'm lacking context, but is the list ever expected to be huge? If not, what's wrong with always returning a list?
It's possible to return massive lists, (eg, returning the 4+ billion /128 subnets in /96 or something even larger, but I don't think that's very common). I've generally tried to avoid confusion by having 'iter' in the iterating methods, but if more of the methods return iterators, maybe I need to rethink that?
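To make the size concern concrete, here is a minimal sketch using the ipaddress module as it later shipped in the standard library. The 2001:db8::/96 prefix is an arbitrary documentation-range example. Because subnets() produces its results lazily, itertools.islice can take the first few /128 subnets without realising all of them.

```python
import ipaddress
from itertools import islice

net = ipaddress.ip_network("2001:db8::/96")

# A /96 spans 2**32 individual /128 addresses.
total = net.num_addresses

# Take only the first three /128 subnets instead of building a huge list.
first_three = list(islice(net.subnets(new_prefix=128), 3))

print(total)
print([str(n) for n in first_three])
```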
-- --Guido van Rossum (python.org/~guido)
-- Peter Moody Google 1.650.253.7306 Security Engineer pgp:0xC3410038
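To make the "massive list" case concrete, here is a sketch using the ipaddress module as it ships in the standard library today. The 2001:db8::/96 prefix is just a documentation-range example I picked; the point is that a /96 holds 2**32 single-address /128 subnets, and a lazy result lets a caller take a few without building them all.

```python
import ipaddress
from itertools import islice

net = ipaddress.ip_network("2001:db8::/96")

# subnets() yields networks lazily; nothing close to 4 billion objects
# is created here.
all_subnets = net.subnets(new_prefix=128)

# Pull only the first three /128 subnets from the iterator.
first_three = list(islice(all_subnets, 3))

print(net.num_addresses)  # 4294967296
print(first_three)
```

Calling list(net.subnets(new_prefix=128)) instead would try to allocate every one of those networks, which is the situation an iterator-returning API avoids.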
On Mon, Mar 19, 2012 at 2:58 PM, Peter Moody <pmoody@google.com> wrote:
On Mon, Mar 19, 2012 at 12:55 PM, Guido van Rossum <guido@python.org> wrote:
On Mon, Mar 19, 2012 at 12:37 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
Nick Coghlan wrote:
Collapsing the address list has to build the result list anyway to actually handle the deduplication part of its job, so returning a concrete list makes sense in that case.
Having only one function return a list instead of an iterator seems questionable.
Depending on the code it could either keep track of what it has returned so far in a set and avoid duplication that way; or, just return an `iter(listobject)` instead of `listobject`.
I know I'm lacking context, but is the list ever expected to be huge? If not, what's wrong with always returning a list?
It's possible to return massive lists, (eg, returning the 4+ billion /128 subnets in /96 or something even larger, but I don't think that's very common). I've generally tried to avoid confusion by having 'iter' in the iterating methods, but if more of the methods return iterators, maybe I need to rethink that?
I personally like having 'iter' in the name (e.g. iterkeys() -- note that we dropped this in Py3k because it's no longer an iterator, it's a dict view now). But I don't want to promote that style for ipaddr.py. -- --Guido van Rossum (python.org/~guido)
On 3/19/2012 6:04 PM, Guido van Rossum wrote:
On Mon, Mar 19, 2012 at 2:58 PM, Peter Moody<pmoody@google.com> wrote:
On Mon, Mar 19, 2012 at 12:55 PM, Guido van Rossum<guido@python.org> wrote:
On Mon, Mar 19, 2012 at 12:37 PM, Ethan Furman<ethan@stoneleaf.us> wrote:
Nick Coghlan wrote:
Collapsing the address list has to build the result list anyway to actually handle the deduplication part of its job, so returning a concrete list makes sense in that case.
Having only one function return a list instead of an iterator seems questionable.
Depending on the code it could either keep track of what it has returned so far in a set and avoid duplication that way; or, just return an `iter(listobject)` instead of `listobject`.
I know I'm lacking context, but is the list ever expected to be huge? If not, what's wrong with always returning a list?
It's possible to return massive lists, (eg, returning the 4+ billion /128 subnets in /96 or something even larger, but I don't think that's very common). I've generally tried to avoid confusion by having 'iter' in the iterating methods, but if more of the methods return iterators, maybe I need to rethink that?
I personally like having 'iter' in the name (e.g. iterkeys() -- note that we dropped this in Py3k because it's no longer an iterator, it's a dict view now). But I don't want to promote that style for ipaddr.py.
I am not sure which way you are pointing, but the general default in 3.x is to return iterators: range, zip, enumerate, map, filter, reversed, open (file objects), as well as the dict methods. I am quite happy to be rid of the 'iter' prefix on the latter. This is aside from itertools. The main exceptions I can think of are str.split and sorted. For sorted, a list *must* be constructed anyway, so might as well return it. This apparently matches the case under consideration. If name differentiation is wanted, call it xxxlist. -- Terry Jan Reedy
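A quick way to check which of the builtins named above really are iterators is to test them against the abstract base classes in collections.abc. This is only a sketch; the sample arguments and the kinds dictionary are mine, and the three labels are a deliberately coarse classification.

```python
from collections.abc import Iterator

candidates = {
    "zip": zip("ab", "cd"),
    "enumerate": enumerate("ab"),
    "map": map(str, [1, 2]),
    "filter": filter(None, [0, 1]),
    "reversed": reversed([1, 2]),
    "range": range(3),
    "sorted": sorted([2, 1]),
    "str.split": "a b".split(),
}

kinds = {}
for name, obj in candidates.items():
    if isinstance(obj, Iterator):
        kinds[name] = "iterator"
    elif isinstance(obj, list):
        kinds[name] = "list"
    else:
        kinds[name] = "iterable, not an iterator"
    print(f"{name}: {kinds[name]}")
```

Strictly speaking, range comes out as a lazy sequence rather than an iterator, while sorted and str.split return real lists, which fits the exceptions listed above.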
On Mon, Mar 19, 2012 at 3:44 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I am not sure which way you are pointing, but the general default in 3.x is to return iterators: range, zip, enumerate, map, filter, reversed, open (file objects), as well as the dict methods.
Actually as I tried to say, the dict methods (keys() etc.) DON'T return iterators. They return "views" which are iterable. Anyway, I also tried to imply that it matters if the number of list items would ever be huge. It seems that is indeed possible (even if not likely) so I think iterators are useful.
I am quite happy to be rid of the 'iter' prefix on the latter. This is aside from itertools. The main exceptions I can think of are str.split and sorted. For sorted, a list *must* be constructed anyway, so might as well return it. This apparently matches the case under consideration. If name differentiation is wanted, call it xxxlist.
Agreed, ideally you don't need to know or it'll be obvious from the name without an explicit 'list' or 'iter'. -- --Guido van Rossum (python.org/~guido)
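The difference between a dict view and an iterator, as described in the message above, can be seen in a few lines. This is a minimal sketch with a made-up dictionary; it relies on dicts preserving insertion order (Python 3.7 and later) for the printed key order.

```python
from collections.abc import Iterator

d = {"a": 1, "b": 2}
keys = d.keys()

# A view is iterable but is not itself an iterator.
view_is_iterator = isinstance(keys, Iterator)

# Unlike an iterator, a view can be walked more than once.
first_pass = list(keys)
second_pass = list(keys)

# A view is live: it reflects later changes to the dict.
d["c"] = 3
after_insert = list(keys)

# Calling iter() on the view is what produces a one-shot iterator.
one_shot = iter(keys)
iterator_is_iterator = isinstance(one_shot, Iterator)

print(view_is_iterator, first_pass, second_pass, after_insert, iterator_is_iterator)
```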
On Tue, Mar 20, 2012 at 8:34 AM, Guido van Rossum <guido@python.org> wrote:
Anyway, I also tried to imply that it matters if the number of list items would ever be huge. It seems that is indeed possible (even if not likely) so I think iterators are useful.
But according to Nick's post, there's some sort of uniquification that is done, and the algorithm currently used computes the whole list anyway. I suppose that one could do the uniquification lazily, or find some other way to avoid that computation. Is it worth it to optimize an unlikely case?
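For reference, lazy uniquification in its simplest form looks something like the sketch below: a generator that remembers what it has already yielded in a set. The name unique_lazily is hypothetical, and this is not the module's collapsing algorithm, which also has to merge adjacent networks.

```python
def unique_lazily(items):
    """Yield each distinct item once, in first-seen order."""
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


deduplicated = list(unique_lazily([3, 1, 3, 2, 1]))
print(deduplicated)
```

Note that the seen set still grows to hold every distinct item, so the lazy version saves on latency to the first result rather than on peak memory.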
On Tue, Mar 20, 2012 at 10:43 AM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
But according to Nick's post, there's some sort of uniquification that is done, and the algorithm currently used computes the whole list anyway.
I suppose that one could do the uniquification lazily, or find some other way to avoid that computation. Is it worth it to optimize an unlikely case?
Yeah, the only place where I thought retaining the list output made particular sense was "collapse_address_list". I have no problem with that operation expecting a real sequence as input and producing an actual list as output, since the entire (deduplicated) sequence will eventually end up in memory for checking purposes anyway, even if the result were an iterator rather than a list, and it already has "list" in its name. The other instances I noticed should all just be a matter of replacing "output.append(value)" calls with "yield value" instead, so it seems sensible to standardise on a Py3k style iterators-instead-of-lists API for the standard library version of the module. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
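The append-to-yield change mentioned above is mechanical. Here is a toy before-and-after, with hypothetical function names and a trivial loop standing in for the module's real methods:

```python
def evens_as_list(limit):
    """List-building style: collect every value, then return the list."""
    output = []
    for value in range(0, limit, 2):
        output.append(value)
    return output


def evens_as_iterator(limit):
    """Generator style: the same loop, yielding each value as it is produced."""
    for value in range(0, limit, 2):
        yield value


eager = evens_as_list(10)
lazy = list(evens_as_iterator(10))
print(eager, lazy)
```

Both produce the same values; the generator version simply defers the work until the caller iterates, and a caller who wants a list can still wrap the call in list().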
Guido van Rossum wrote:
I personally like having 'iter' in the name (e.g. iterkeys() -- note that we dropped this in Py3k because it's no longer an iterator, it's a dict view now). But I don't want to promote that style for ipaddr.py.
+1 from me too on having all methods that return iterators clearly indicating so. It's an important distinction, and it can be very confusing if some methods of an API return iterators and others don't with no easy way of remembering which is which. -- Greg
Greg Ewing wrote:
Guido van Rossum wrote:
I personally like having 'iter' in the name (e.g. iterkeys() -- note that we dropped this in Py3k because it's no longer an iterator, it's a dict view now). But I don't want to promote that style for ipaddr.py.
+1 from me too on having all methods that return iterators clearly indicating so. It's an important distinction, and it can be very confusing if some methods of an API return iterators and others don't with no easy way of remembering which is which.
With the prevalence of iterators in Python 3 [1], the easy way is to have the API default to iterators, drop 'iter' from the names, and use 'list' in the names to signal the oddball cases where a list is returned instead. ~Ethan~ [1] http://mail.python.org/pipermail/python-dev/2012-March/117815.html
participants (16)
- Andrew Svetlov
- Antoine Pitrou
- Dirkjan Ochtman
- Ethan Furman
- Greg Ewing
- Guido van Rossum
- Mark Lawrence
- martin@v.loewis.de
- Matt Joiner
- Nick Coghlan
- Peter Moody
- Robert Kern
- Senthil Kumaran
- Stephen J. Turnbull
- Steven D'Aprano
- Terry Reedy