
Ok, I'm going to try to summarize this a bit so we don't go around in circles on details that are adjacent to the issue I'm trying to address.

+ Adding the methods "copyitems", "seteach", and "delitems" to do partial group operations on dictionaries in C, rather than iterating in Python, could give as much as a 500% performance increase over the equivalent Python loop.

  - It needs to be shown that these situations occur often enough to result in a meaningful benefit. (It doesn't replace the need to iterate over dictionaries, as there are many cases where that's exactly what you need.)

+ The methods add some improvements to readability over the iterator form.

  - There is not a significant reduction in lines of code, so again it needs to be shown that this would be useful often enough to be a significant benefit.

Provided there are enough use cases to demonstrate a significant benefit, we will then need to address the following issues.

+ What to call them.
+ The details of the implementation.

Most of the arguments against fit into the following categories...

- Changes the status quo
- It's premature optimization
- Adds additional complexity to dictionaries
- Personal preference

These are subjective but still important issues. Once it is demonstrated that there are sufficient use cases for these features, we will need to address whether each of these is relevant, and to what degree.

Some examples:

    # Combine two dictionaries. (works already)
    dd = dict(d1)
    dd.update(d2)

    # Split dictionary d using a key list.
    keys_rest = set(d.keys()) - set(keys)
    d1, d2 = d.getitems(keys), d.getitems(keys_rest)

    # Remove a subdict of d with keys.
    dd = d.getitems(keys)
    d.delitems(keys)

    # Copy items from dictionary d1 to d2.
    #
    # The getitems method returns a dictionary so it will
    # work directly with the update method.
    #
    d2.update(d1.getitems(keys))

    # Move items from dictionary d1 to d2.
    d2.update(d1.getitems(keys))
    d1.delitems(keys)

    # Setting items to a specified value with a list of keys.
    d.seteach(keys, None)

Use cases:

    ### TODO
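
For anyone who wants to try the examples above today, here is a minimal pure-Python sketch of the three methods. The class name BulkDict is hypothetical, and the behavior on a missing key (raising KeyError) is an assumption, since the implementation details are still open. It shows the intended semantics only and has none of the speed a C version might offer.

```python
class BulkDict(dict):
    """Hypothetical dict subclass with pure-Python versions of the proposed methods."""

    def getitems(self, keys):
        # Return a new dict holding only the requested keys.
        # Assumption: a missing key raises KeyError.
        return BulkDict((k, self[k]) for k in keys)

    def delitems(self, keys):
        # Delete every listed key. Assumption: a missing key raises KeyError.
        for k in keys:
            del self[k]

    def seteach(self, keys, value):
        # Bind every listed key to the same value.
        for k in keys:
            self[k] = value


d = BulkDict(a=1, b=2, c=3, d=4)
keys = ["a", "c"]

# Split d using a key list.
keys_rest = set(d) - set(keys)
d1, d2 = d.getitems(keys), d.getitems(keys_rest)

# Move items from d into another dictionary.
target = BulkDict()
target.update(d.getitems(keys))
d.delitems(keys)

# Set the remaining keys to a single value.
d.seteach(["b", "d"], None)
```

Because getitems returns a dictionary, its result feeds straight into update, which is why no separate setitems method appears here.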
Josiah Carlson wrote:
Ron Adam <rrr@ronadam.com> wrote:
Is 12 cases out of about 315,000 Python files a big enough need to keep the current behavior? 315,000 is the number Google Code returns for all Python files, 'lang:python'. (I'm sure there are some duplicates.)
Is this more convincing? ;-)
Not to me, as I use dict.fromkeys(), and going from a simple expression to an assignment followed by a mutation is unnecessary cognitive load. It would have been more convincing had you offered...
dict((i, v) for i in keys)
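
As a quick check of the two spellings being compared here, the following sketch builds the same dictionary both ways. The sample keys and the value 0 are made up for illustration.

```python
keys = ["a", "b", "c"]  # illustrative key list
v = 0                   # illustrative shared value

# The existing class method.
from_method = dict.fromkeys(keys, v)

# The generator-expression form suggested above.
from_genexp = dict((i, v) for i in keys)

same = from_method == from_genexp
```

Both forms bind every key to the same value object, so they are interchangeable for this purpose.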
Well, there you go. :-)
But then again, basically every one of your additions is a one-line expression. I would also consider the above myself, if it weren't for the fact that I'm supporting a Python 2.3 codebase. Please see my discussion below of *removing* functionality.
This is probably better suited to Python 3000, though it could possibly be backported to 2.6. It would have no effect on Python 2.5 and earlier, and probably minimal effect on 2.x with regard to 2.3 compatibility. I don't see .fromkeys() being removed in 2.x.
Until you can show significant use-cases in the wild, and show that the slowdown of these functions in Python compared to C is sufficient to render the addition of the functions in your own personal library useless, I'm going to stick with my -1.
Your own tests show a maximum speedup of 620%. My testing shows it is 300% to 500% over a range of sizes. I would still call that sufficient. And before you point it out... yes, only if it can be shown to be useful in a wide range of situations. I fully intend to find use cases. If I can't find any, then none of this will matter.
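
For readers who want to reproduce this kind of measurement, here is a rough timing sketch. It is not the test either of us ran: since the proposed methods do not exist in C, it uses dict.fromkeys plus update as a stand-in for a C-level bulk operation, and the key count and repeat count are arbitrary assumptions. The ratio will vary with machine, Python version, and dictionary size.

```python
import timeit

keys = list(range(1000))  # arbitrary size, chosen for illustration


def set_with_loop():
    # Python-level iteration: one bytecode-driven assignment per key.
    d = {}
    for k in keys:
        d[k] = None
    return d


def set_with_builtin():
    # Stand-in for a C-level bulk operation: the per-key work
    # happens inside the interpreter's C code.
    d = {}
    d.update(dict.fromkeys(keys))
    return d


loop_time = timeit.timeit(set_with_loop, number=1000)
builtin_time = timeit.timeit(set_with_builtin, number=1000)
print("loop: %.4fs  builtin: %.4fs  ratio: %.1fx"
      % (loop_time, builtin_time, loop_time / builtin_time))
```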
I was pointing out how you would duplicate exactly the functionality you were proposing for dict.set_keys(). It is very difficult for me to offer you alternate implementations for your own use, or as reasons why I don't believe they should be added, if you move the target ;).
But programming is full of moving targets. ;-) In any case, look at the overall picture and try not to prematurely shoot this down based on implementation details that can be changed as needed. And I'll attempt to do a use case study from the Python library.
Until you can show significant use-cases
Usually we find substantial use-cases ....
... that I don't remember anyone having ever asked for before
... but again, use-cases ...
I've never needed to do this.
Please find me real-world use-cases ...
Show me code that is easier to understand.
Ok, I get it. :-)
Also consider this from a larger view. List has __getslice__, __setslice__, and __delslice__. Set has numerous methods that operate on more than one element.
Lists are ordered sequences, dictionaries are not. Sets are not mappings, they are sets (which is why they have set operations). Dictionaries are a mapping from keys to values, used as both an arbitrary data store as well as data and method member lookups on objects. The most common use-cases of dictionaries *don't* call for any of the additional functionality that you have offered.
If they did, then it would have already been added.
This statement isn't true. It only shows that the resistance to these changes is greater than the efforts of those who have tried to introduce them (and not without good cause). To be clear, I in no way want the bar lowered for what does or does not get added to Python. I accept that sufficient benefit needs to be demonstrated, and will try to do that. Quality is more important than quantity in this case.
Dictionaries are supposed to be highly efficient, but they only have limited methods that can operate on more than one item at a time, so you end up iterating over the keys to do nearly everything.
Iteration is a fundamental building block in Python. That's why for loops, iterators, generators, generator expressions, list comprehensions, etc., all use iteration over an iterator to do their work. Building more functionality into dictionaries won't make them easier to use, it will merely add more methods that you think will help. Is there anyone else who likes this idea? Please speak up.
Let's rephrase this to be less subjective... Does anyone think an approximately 500% improvement in some dictionary operations would be good, if it can be done in a way that is easier to read and use, and has enough use cases to be worthwhile?
getkeys/setkeys/delkeys seem to me like they should be named getitems/setitems/delitems, because they are getting/setting/deleting the entire key->value association, not merely the keys.
Sounds good. How about getitems, delitems, and seteach?

The update method corresponds to setitems, where setitems is the inverse operation of getitems. I don't see any reason to change update.

    d1.update(d2.getitems(keys))

So seteach is a better name for a method that sets each key to a value.

Cheers,
   Ron