dict.fromkeys() better as dict().setkeys() ? (and other suggestions)

The dictionary fromkeys method seems out of place as well as misnamed. IMHO.

In the current 2.5 branch it occurs only twice (excluding tests):

    $ grep -r "fromkeys" Lib/*.py
    Lib/httplib.py:    header_names = dict.fromkeys([k.lower() for k in headers])
    Lib/UserDict.py:   def fromkeys(cls, iterable, value=None):

In httplib.py, it is used as a set to remove duplicates. There are enough correct uses of it in the wild to keep the behavior, but it can be done in a better way.

I feel it really should be called set_keys and implemented as a method that operates on the current dictionary instead of being a constructor for a new dictionary. That would allow you to add keys with a default value to an already existing dictionary, or to create a new one with a dictionary constructor:

    dict().set_keys(s, v=None)    # The current fromkeys behavior.

I think this reads better and can be used in a wider variety of situations. It could be useful for setting an existing dictionary to a default state:

    # Reset status of items.
    status.set_keys(status.keys(), v=0)

Or more likely, resetting a partial subset of the keys to some initial state.

The reason I started looking at this is that I wanted to split a dictionary into smaller dictionaries, and my first thought was that fromkeys would do that. But of course it doesn't. What I wanted was to be able to specify the keys and get the values from the existing dictionary into the new dictionary, without using a for loop to iterate over the keys:

    d = {1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}
    d_odds = d.from_keys([1, 3, 5])    # new dict of items 1, 3, 5
    d_evens = d.from_keys([2, 4])      # new dict of items 2, 4

There currently isn't a way to split a dictionary without iterating its contents, even if you know the keys you need beforehand. A from_keys method would be the inverse complement of the update method.

A del_keys method could replace the clear method. del_keys would be more useful as it could operate on a partial set of keys.
    d.del_keys(d.keys())    # The current clear method behavior.

Some potentially *very common* uses:

    # This first one works now, but I included it for completeness. ;-)
    def mergedicts(d1, d2):
        """ Combine two dictionaries. """
        dd = dict(d1)
        dd.update(d2)
        return dd

    def splitdict(d, keys):
        """ Split dictionary d using keys. """
        keys_rest = set(d.keys()) - set(keys)
        return d.from_keys(keys), d.from_keys(keys_rest)

    def split_from_dict(d, keys):
        """ Remove and return a subdict of d with keys. """
        dd = d.from_keys(keys)
        d.del_keys(keys)
        return dd

    def copy_items(d1, d2, keys):
        """ Copy items from dictionary d1 to d2. """
        d2.update(d1.from_keys(keys))    # I really like this!

    def move_items(d1, d2, keys):
        """ Move items from dictionary d1 to d2. """
        d2.update(d1.from_keys(keys))
        d1.del_keys(keys)

I think the set_keys, from_keys, and del_keys methods could add both performance and clarity benefits to Python.

So to summarize...

1. Replace the existing fromkeys method with a set_keys method.
2. Add a partial-copy from_keys method.
3. Replace the clear method with a del_keys method.

So this replaces two methods and adds one more. Overall I think the usefulness of these would be very good. I also think it will work very well with the Python 3000 keys method returning an iterator. (And still be two fewer methods than we currently have.)

Any thoughts?

Cheers,
Ron
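For concreteness, here is a minimal sketch of the three proposed methods as a dict subclass. The names and semantics are my reading of the proposal above, not an existing API:

```python
class KeyedDict(dict):
    """Sketch of the proposed set_keys / from_keys / del_keys methods."""

    def set_keys(self, keys, v=None):
        # The proposed replacement for fromkeys: mutate in place,
        # setting every key in keys to the value v.
        for k in keys:
            self[k] = v

    def from_keys(self, keys):
        # Partial copy: a new dict holding only the given keys,
        # with values taken from self (the "dict split" use case).
        return KeyedDict((k, self[k]) for k in keys)

    def del_keys(self, keys):
        # The proposed generalization of clear: remove only these keys.
        for k in keys:
            del self[k]

d = KeyedDict({1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'})
d_odds = d.from_keys([1, 3, 5])   # {1: 'a', 3: 'c', 5: 'e'}
d_evens = d.from_keys([2, 4])     # {2: 'b', 4: 'd'}
```

A real implementation would live in C, of course; this just pins down the behavior being discussed.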

Ron Adam <rrr@ronadam.com> wrote:
The problem with that is that when a method mutates an object, it shouldn't return the object. Your new .set_keys() method violates this behavior that is used in lists, sets, dicts, deques, arrays, etc. I don't have time to comment on the rest at the moment, will do when I get a chance. - Josiah

Ron Adam <rrr@ronadam.com> wrote:
The dictionary fromkeys method seems out of place as well as misnamed. IMHO
It is perfectly named (IMNSHO ;): create a dictionary from the keys provided; dict.fromkeys().
There are enough correct uses of it in the wild to keep the behavior, but it can be done in a better way.
I wasn't terribly convinced by your later arguments, so I'm -1.
This can be done today:

    status.update((i, 0) for i in status.keys())
    # or
    status.update(dict.fromkeys(status, 0))
Changing the behavior of dict.fromkeys() is not going to happen. We can remove it, we can add a new method, but changing it will lead to not-so-subtle breakage as people who were used to the old behavior try to use the updated method. Note that this isn't a matter of "it's ok to break in 3.0", because dict.fromkeys() is not seen as being a design mistake by any of the 'heavy hitters' in python-dev or python-3000 that I have heard (note that I am certainly not a 'heavy hitter').
Um...

    def from_keys(d, iterator):
        return dict((i, d[i]) for i in iterator)
I can't remember ever needing something like this that wasn't handled by d.clear().
    dict((i, d2.get(i, d1.get(i))) for i in itertools.chain(d1, d2))
I can't think of a simple one-liner for this one that wouldn't duplicate work.
    dict((i, d.pop(i, None)) for i in keys)

    d2.update((i, d1[i]) for i in keys)

    d2.update((i, d1.pop(i, None)) for i in keys)
I think the set_keys, from_keys, and del_keys methods could add both performance and clarity benefits to python.
Performance, sometimes, for some use-cases. Clarity? Maybe. Your split* functions are a bit confusing to me, and I've never really needed any of the functions that you list.
Not all X line functions should be builtins. If you find that you are doing the above more often than you think you should, create a module with all of the related functionality that automatically patches the builtins on import and place it in the Python cheeseshop. If people find that the functionality helps them, then we should consider it for inclusion. As it stands, most of the methods you offer have a very simple one-line version that is already very efficient.
So this replaces two methods and adds one more. Overall I think the usefulness of these would be very good.
I don't find the current dictionary API to be lacking in any way other than "what do I really need to override to get functionality X", but that is a documentation issue more than anything.
I'm sorry, but I can't really see how your changes would add to Python's flexibility without cluttering up interfaces and confusing current users. - Josiah

Josiah Carlson wrote:
That's ok, I won't hold it against you. ;-)

What about its being out of place? Is this case like 'sorted' vs 'sort' for lists?

I'm ok with leaving it named as-is if that's a real problem. Another name for the mutate-with-keys method can be found. That may reduce possible confusion as well.
Yes, I'm not the most influential writer. I'm not sure I can convince you it's better if you already think it's not; that has more to do with your personal preference. So let's look at how much it's actually needed in the current (and correct) form. (These are rough estimates; I can try to come up with more accurate statistics if that is desired.)

Doing a search on Google Code turns up 300 hits for "lang:python \.fromkeys\(". Looking at a sample of those, it looks like about 80% use it as a set() constructor to remove duplicates. (For compatibility reasons with Python 2.3 code, or for Python 2.3 and earlier code.) Is there a way to narrow this down to Python 2.4 and later? (anyone?)

A bit more sampling, and it looks like about 8 of 10 of those remaining 20% can be easily converted to the following form without any trouble:

    d = dict()
    d.set_keys(keys, v=value)

That would leave about 12 cases (YMMV) that need the inline functionality. For those a simple function can do it:

    def dict_from_keys(keys, v=value):
        d = dict()
        d.set_keys(keys, v)
        return d

Is 12 cases out of about 315,000 Python files a big enough need to keep the current behavior? (315,000 is the number returned from Google Code for all Python files, 'lang:python'. I'm sure there are some duplicates.)

Is this more convincing? ;-)

(If anyone can come up with better numbers, that would be cool.)
Of course all of the examples I gave can be done today. But nearly all of them require iterating in Python in some form.
The first example requires iterating over the keys. The second example works if you want to initialize all the keys, in which case there is no reason to use the update method; dict.fromkeys(status, 0) is enough.
Then lets find a different name.
(iterating) Yep, as I said just above this: """There currently isn't a way to split a dictionary without iterating its contents ..."""

Lists have __getslice__, __setslice__, and __delslice__. It could be argued that those can be handled just as well with iterators and loops too. Of course we see them as seq[s:s+x], on both lists and strings. So why not have an equivalent for dictionaries? We can't slice them, but we do have key lists to use in the same way.
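To illustrate the parallel being drawn here (my example, not proposed syntax): a list slice selects a sub-sequence in one built-in step, while the dict equivalent today needs an explicit generator expression over a key list:

```python
# List slicing: one built-in operation selects a sub-sequence.
seq = ['a', 'b', 'c', 'd', 'e']
sub_seq = seq[1:4]                       # ['b', 'c', 'd']

# The dict "slice" today: iterate a key list in Python.
d = {1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}
keys = [1, 3, 5]
sub_d = dict((k, d[k]) for k in keys)    # {1: 'a', 3: 'c', 5: 'e'}
```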
All or nothing. d = dict() works just as well.

BTW, Google Code gives 500 hits for "\.clear\(". But it is very unclear how many of those are false positives due to other objects having a clear method. It's probably a significant percentage in this case.
(iterating) And I'd prefer to define the function in this case for readability reasons.
:-) This is one of the main motivators.
(iterating)
(iterating)
(iterating)
I think sometimes our need is determined by what is available for use. So if it's not available, our minds filter it out from the solutions we consider. That way, we don't need the things we don't have or can't get. My mind's "need filter" seems to be broken in that respect. I often need things I don't have. But sometimes that works out to be good. ;-)
Of course I knew someone would point this out. I'm not requesting the above example functions be builtins, only that the changes to the dict methods be considered. They would allow those functions to work in a more efficient way, and I'd be happy to add those functions to my own library. With these methods, in most cases the functions wouldn't even be needed. You would just use the methods in combination with each other directly, and the result would still be readable without a lot of 'code' overhead.

Also consider this from a larger view. List has __getslice__, __setslice__, and __delslice__. Set has numerous methods that operate on more than one element. Dictionaries are supposed to be highly efficient, but they have only limited methods that can operate on more than one item at a time, so you end up iterating over the keys to do nearly everything.

So as an alternative, leave fromkeys and clear alone and add...

    getkeys(keys)  -> dict
    setkeys(keys, v=None)
    delkeys(keys)

Where these offer the equivalent of list slice functionality for dictionaries.
Iterators and for loops are fairly efficient for small dictionaries, but iterating can still be considerably slower than the equivalent C code for large dictionaries.
I think it cleans up the API more than it clutters it up. It converts two limited-use methods to be more general, and adds one more that works nicely with the already existing update method.

In the case of the two existing methods, fromkeys and clear, your argument that there already exist easy one-line functions to do this would be enough of a reason to not have them in the first place. So do you feel they should be removed?

I plan on doing a search of places where these things can make a difference in making the code more readable and/or faster.

Cheers,
Ron

Ron Adam <rrr@ronadam.com> wrote:
sorted() returns a list because the list is the only mutable ordered sequence in Python, hence the only object that makes sense to return from a sorted function.
Not to me, as I use dict.fromkeys(), and going from a simple expression to an assignment-then-mutate is unnecessary cognitive load. It would have been more convincing had you offered...

    dict((i, v) for i in keys)

But then again, basically every one of your additions is a one-line expression. I would also consider the above myself, if it weren't for the fact that I'm supporting a Python 2.3 codebase. Please see my discussion below of *removing* functionality.
Premature optimization... Note that you don't know where you are getting your data, so the overhead of looping and setting data may be inconsequential to the overall running of the update. Since you basically use the "but it iterates in Python rather than C" for the rest of your arguments, I'm going to stick with my belief that you are prematurely optimizing. Until you can show significant use-cases in the wild, and show that the slowdown of these functions in Python compared to C is sufficient to render the addition of the functions in your own personal library useless, I'm going to stick with my -1.
I was pointing out how you would duplicate exactly the functionality you were proposing for dict.set_keys(). It is very difficult for me to offer you alternate implementations for your own use, or as reasons why I don't believe they should be added, if you move the target ;).
Usually we find substantial use-cases for which this new functionality would be useful, _then_ we argue about names (usually for months ;). The only exception to this is in 3rd party modules posted in the cheeseshop, but then we don't usually hash out the details of it here, as it is a 3rd party module.
You aren't splitting the dictionary. You are fetching certain values from the dictionary based on the contents of a provided iterator. The *only* thing you gain from the iterator vs. built-in method is a bit of speed. But if speed is your only argument, for a group of functions that I don't remember anyone having ever asked for before, then you better check your rationale.

In the standard library there exists the deque type in collections. Why does Python have a deque? Because it was discovered over 10+ years of Python use that pretty much everyone needs a queue, with a large portion of those needing a double ended queue (put the just fetched item back at the front). Because there were so many users, and because it was used in *many* performance critical applications, it was implemented in C by Raymond Hettinger and became the first member of the collections module. A similar thing happened with default dictionaries: they were faked many times by many different people, then implemented and tossed into the collections module again.

As for iteration over a sequence to generate a new sequence, you need to do this regardless of whether it is in C or Python. The *only* difference between the C and Python versions of this is a difference in speed, but again, use-cases before naming and optimization. I like to see things "in the wild".
Your function examples are a bit like adding set manipulation functionality through functional programming-like functions. Take your merge operations as an example. With sets, it's spelled s1 | s2. It is a bit round-about, but your from_keys functionality is a bit like s1 - (s1 - s2), or really set(s2) because sets have no associated values. Anyways.
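Spelling out the set analogy above (my illustration): for sets, s1 - (s1 - s2) is just the intersection, which is what a keys-based partial copy reduces to once values are ignored:

```python
s1 = set([1, 2, 3, 4, 5])
s2 = set([2, 4, 6])

# "Keep only the members of s1 that are also in s2":
kept = s1 - (s1 - s2)

# ...which is exactly the intersection of the two sets.
assert kept == (s1 & s2)
```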
Not when you want to mutate a dictionary.
I've never needed to do this. And I've never seen source that needed to do this either. So whether this is a main motivator for you doesn't sway me. [snip your pointing out that iteration happens in Python and not C]
Yeah, I don't buy your 'need filter' reasoning. Typically people resist doing things that are difficult or inconvenient to do. Take decoration for example. Before decorator syntax, decoration was a pain in the butt. Yeah, you wrote the same number of lines of code, but there was such a disconnect from the signature of the function/method (and class in 2.6) that it was just too inconvenient to write, maintain, and understand.

In the case of dictionaries, all but two or three of the things you would like to offer are available via a very simple dict(generator expression). If people aren't thinking of ways to use generator expressions to make their lives easier (this is the case in multiple threads daily in comp.lang.python), is that Python's fault, or is it the developer's?

I like to think of Python's syntax and semantics as just rich enough for people to write what they want and to understand it quickly, but not so rich that you need to spend time thinking what something means (the Perl argument).

Adding functionality to existing objects needs to do a few things, not the least of which is solving a problem that happens in the wild, but also not overly burdening those who implement similar functionality. Remember, dictionaries are *the* canonical mapping interface, and anyone who implements a complete mapping interface would necessarily need to implement the 3 methods you propose. For what? To clean up the interface? I'm sorry, but to add 3 methods, even with the assumption that two previous methods were going to be removed, in order to "clean up" the interface doesn't convince me. Please find me real-world use-cases where your new methods would improve readability.

... Also, I develop software for fun and profit.
Since basically everyone else here probably does some selection of the same, I'm sure that they will tell you pretty much the same thing: if we restricted our needs to what we already have, software wouldn't get written, or would only be proposed by marketing.
I'm usually the one to invoke it. Maybe I have less tolerance to arguably trivial additions to Python than others.
My single expression replacements were to show that the functions aren't needed now, as most are *easily* implemented in Python 2.5 in a straightforward manner.
Lists are ordered sequences, dictionaries are not. Sets are not mappings, they are sets (which is why they have set operations). Dictionaries are a mapping from keys to values, used as both an arbitrary data store as well as data and method member lookups on objects. The most common use-cases of dictionaries *don't* call for any of the additional functionality that you have offered. If they did, then it would have already been added.
Iteration is a fundamental building block in Python. That's why for loops, iterators, generators, generator expressions, list comprehensions, etc., all use iteration over an iterator to do their work. Building more functionality into dictionaries won't make them easier to use, it will merely add more methods that you think will help. Is there anyone else who likes this idea? Please speak up.
getkeys/setkeys/delkeys seem to me like they should be named getitems/setitems/delitems, because they are getting/setting/deleting the entire key->value association, not merely the keys.
Let's find out.

    >>> d = dict.fromkeys(xrange(10000000))
    >>> import time
    >>> if 1:
    ...     t = time.time()
    ...     e = dict(d)
    ...     print time.time()-t
    ...
    1.21899986267
    >>> del e
    >>> if 1:
    ...     t = time.time()
    ...     e = dict(d.iteritems())
    ...     print time.time()-t
    ...
    2.75
    >>> del e
    >>> if 1:
    ...     t = time.time()
    ...     e = dict((i,j) for i,j in d.iteritems())
    ...     print time.time()-t
    ...
    6.95399999619
    >>> del e
    >>> if 1:
    ...     t = time.time()
    ...     e = dict((i, d[i]) for i in d)
    ...     print time.time()-t
    ...
    7.54699993134
    >>>

Those all seem to be pretty reasonable timings to me. In the best case you are talking about 6.2 times faster to use the C rather than the Python version.
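The same comparison can be made with the standard library's timeit module on a smaller dictionary (a sketch in modern Python; absolute numbers vary by machine, but the C-level dict(d) copy should stay well ahead of the Python-level rebuild):

```python
import timeit

d = dict.fromkeys(range(100000))

# C-level rebuild: the dict constructor copying another dict.
t_copy = timeit.timeit(lambda: dict(d), number=20)

# Python-level rebuild: a generator expression feeding the constructor.
t_genexp = timeit.timeit(lambda: dict((i, d[i]) for i in d), number=20)

print('dict(d):        %.4fs' % t_copy)
print('genexp rebuild: %.4fs' % t_genexp)
```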
But you propose a further half dozen functions. If you aren't proposing them for inclusion, why bother including them in your proposal, especially when they have very simple replacements that are, arguably, easier to understand than the function bodies you provided.
We don't remove functionality in Python unless there is a good reason. Typically that reason is because the functionality is broken, the old functionality is not considered "Pythonic", or generally because a group of people believe there is a better way. Guido is more or less happy with dictionaries as-is (except for the keys(), values(), and items() methods, which are changing), and no one in python-dev has complained about dictionary functionality that I can remember. As such, even if you think that your changes would clean up dictionary methods, it is unlikely to happen precisely because *others* aren't mentioning, "dictionaries need to be cleaned up".
I plan on doing a search of places where these things can make a difference in making the code more readable and/or faster.
I don't care about faster. Show me code that is easier to understand.

I will mention that all of your functionality smells very much like a functional programming approach to Python. This makes a difference because some functional programming tools (reduce, map, filter, ...) are slated for removal in Python 3.0, so adding functional programming tools (when we are removing others) is unlikely to gain much traction. - Josiah

On 5/29/07, Josiah Carlson <jcarlson@uci.edu> wrote:
Ron Adam <rrr@ronadam.com> wrote:
Josiah Carlson wrote:
dict.fromkeys() is not seen as being a design mistake
ermm.... I think at least a few see it as something of a leftover, for using dicts as sets. The docs also (weakly) support this view. This isn't quite the only use case, but I doubt fromkeys would be added today (given both sets and subclasses of defaultdict); there just isn't quite enough discomfort to remove it.
I can't think of a simple one-liner for this one that wouldn't duplicate work.
*this* is the core of a useful idea. List (and set and generator) comprehensions can't partition very well, because they have only a single output. There isn't a good way to say:

    list_a = [x for x in src if pred_a(x)]
    src = [x for x in src if not pred_a(x)]

    list_b = [x for x in src if pred_b(x)]
    src = [x for x in src if not pred_b(x)]

    list_c = [x for x in src if pred_c(x)]
    list_other = [x for x in src if not pred_c(x)]

On the other hand, you can do it (inefficiently) as above, or you can write an (ugly) version using a custom function, so the solution would have to be pretty good before it justified complicating the comprehension APIs.
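A single-pass alternative (my sketch): route each item to the bucket of the first predicate it satisfies, so src is scanned once instead of once per predicate:

```python
def partition(src, *preds):
    """Split src into one list per predicate plus a final 'rest' list.

    Each item lands in the list for the first predicate it satisfies;
    items matching no predicate go into rest.  src is scanned once.
    """
    buckets = [[] for _ in preds]
    rest = []
    for x in src:
        for bucket, pred in zip(buckets, preds):
            if pred(x):
                bucket.append(x)
                break
        else:
            rest.append(x)
    return buckets + [rest]

list_a, list_b, list_other = partition(
    range(10),
    lambda x: x % 2 == 0,   # pred_a: evens
    lambda x: x < 5,        # pred_b: small values (evens already taken)
)
# list_a == [0, 2, 4, 6, 8]; list_b == [1, 3]; list_other == [5, 7, 9]
```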
unlikely to happen precisely because *others* aren't mentioning, "dictionaries need to be cleaned up".
Not in so many words; Raymond is very reluctant to add anything, because the API is already fairly large. Guido's ABC for mappings (http://svn.python.org/view/sandbox/trunk/abc/) is explicitly a small subset of what dict offers (and doesn't include fromkeys). That said, saying "too large, this isn't needed" is still a far cry from "so we'll remove it", let alone "and add this stuff to replace it". -jJ

Jim Jewett wrote:
I can't see how it could be done without, as you say, complicating the comprehension APIs.

However, I do think there could be very useful uses for a standard sorting structure of some sort. That's sorting as in mail sorters, or category sorters, that produce several streams of output instead of just one. Would that be called a de-comprehension?

Maybe something like the following as a starting point?

    # Generate some random data.
    import random
    import string

    def random_pnum(length):
        ok_digits = string.letters + string.digits
        digits = [random.choice(ok_digits) for n in range(length)]
        return ''.join(digits)

    src = []
    for x in range(10):
        src.append(random_pnum(10))

    # A de-comprehension generator.
    def decomp(seq, *cmps):
        results = dict((c.__name__, []) for c in cmps)
        rest = []
        for x in seq:
            for c in cmps:
                if c(x):
                    results[c.__name__].append(x)
                    break
            else:
                rest.append(x)
        for c in cmps:
            yield results[c.__name__]
        yield rest

    # Tests
    def a_g(s):
        return s[0].lower() in "abcdefg"

    def h_m(s):
        return s[0].lower() in "hijklm"

    def n_z(s):
        return s[0].lower() in "nopqrstuvwxyz"

    decmps = [a_g, h_m, n_z]
    ag, hm, nz, other = decomp(src, *decmps)

    print 'ag =', ag
    print 'hm =', hm
    print 'nz =', nz
    print 'other =', other

    -------------------
    ag = ['c8WQe60G6J', 'EMY7O8qzTg']
    hm = ['lDunyeOM98', 'LJuPg8ncZd']
    nz = ['uhhuhd9YdO', 'qAuQvfTc6N', 'vpJz47pkP5', 'YOq6m4IXBn']
    other = ['8JE6PuXxBz', '4ttyMdpuQY']

--- Ron Adam <rrr@ronadam.com> wrote:
Would that be called a de-comprehension?
LOL

--- Ron Adam <rrr@ronadam.com> wrote:
Am I misunderstanding (de-comprehensing) something here? How does the code above return those result sets? Or, more specifically, why does ag include 'T' in its results set?

Steve Howell wrote:
The data in this case simulates 10-digit part numbers which can include a-z, A-Z, and 0-9. It doesn't alter the data; it just sorts it into smaller groups according to some predefined tests. In this case it's only testing the first letter of each item. What is tested is entirely up to you. You could have lists of records as your data and test fields and divide the data according to that.

Cheers,
Ron

--- Ron Adam <rrr@ronadam.com> wrote:
Ok, apologies for quoting away the parts of your code that probably answer my own question.

But to your bigger question--I think you can set up a list comprehension that does partitioning by having the list comprehension or generator expression simply return a list of tuples where the first element in the tuple is a value that suggests where it fits in the partition, then feed that tuple to dict() or whatever. But I don't have a specific code example to prove it.

For simple binary partitions, there is the bool function.

Steve Howell wrote:
That would depend on what level of abstraction you want. I find python already handles the simple things fairly well, so I tend to look for the next level up now. That makes it a bit harder to find the balance between being too specific and too general.
For simple binary partitions, there is the bool function.
Or more likely you may have a method in a class that tests for a particular condition:

    passed, failed = decomp(list_of_classes, lambda x: x.test())

Cheers,
Ron

--- Ron Adam <rrr@ronadam.com> wrote:
Here is some category-sorting code, FWIW, where every employee, Fred or not, gets a 50% raise, and employees are partitioned according to their Fredness. It doesn't use a general iterator, so maybe I'm missing your point.

    def partitions(lst):
        dct = {}
        for k, value in lst:
            dct.setdefault(k, []).append(value)
        return dct.items()

    def is_fred(emp):
        return 'Fred' in emp[0]

    emps = [
        ('Fred Smith', 50),
        ('Fred Jones', 40),
        ('Joe Blow', 30),
    ]

    def pay_increase(salary):
        return salary * 0.5

    emp_groups = partitions([(is_fred(emp), (emp[0], pay_increase(emp[1])))
                             for emp in emps])

    for fredness, emps in emp_groups:
        print
        print 'is Fred?', fredness
        for name, pay_increase in emps:
            print name, pay_increase

    ----
    is Fred? False
    Joe Blow 15.0

    is Fred? True
    Fred Smith 25.0
    Fred Jones 20.0

On 5/29/07, Steve Howell <showell30@yahoo.com> wrote:
Or maybe you skipped homework on the itertools.groupby thread of c.l.py. ;-) George -- "If I have been able to see further, it was only because I stood on the shoulders of million monkeys."
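For reference, the itertools.groupby approach George alludes to (my sketch; note that groupby only groups *consecutive* items, so the input must be sorted by the same key first):

```python
from itertools import groupby

emps = [('Fred Smith', 50), ('Fred Jones', 40), ('Joe Blow', 30)]

def is_fred(emp):
    return 'Fred' in emp[0]

# Sort by the grouping key first -- groupby only merges adjacent runs.
groups = dict((key, list(group))
              for key, group in groupby(sorted(emps, key=is_fred), key=is_fred))

freds = groups.get(True, [])
notfreds = groups.get(False, [])
# freds == [('Fred Smith', 50), ('Fred Jones', 40)]
# notfreds == [('Joe Blow', 30)]
```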

Steve Howell wrote:
Since you aren't really creating two lists, the problem below doesn't really fit this particular solution. But maybe we can make it work from a different point of view.

    emps = {
        'Fred Smith': 50.0,
        'Fred Jones': 40.0,
        'Joe Blow': 30,
    }

    def pay_increase(salary):
        return salary * 0.5

    def is_fred(name):
        return 'Fred' in name

    # Give all Freds a raise.
    freds, notfreds = decomp(emps.keys(), is_fred)
    for name in freds:
        emps[name] = pay_increase(emps[name])

    # Then we can use the freds list again to generate a report.

    # Of course in this case the following would work just as well...
    freds = []
    for name in emps:
        if is_fred(name):
            emps[name] = pay_increase(emps[name])
            freds.append(name)

One reason to generate more than one list is if each list is going to be handled in batches, or in different ways, or at different times than you otherwise would by just iterating it.

Cheers,
Ron

Ok, I'm going to try to summarize this a bit so we don't go around in circles on details that are adjacent to the issue I'm trying to address.

+ Adding methods "copyitems", "seteach", and "delitems" to do partial group operations on dictionaries in C rather than iterating in Python can have as much as a 500% performance increase over iterating in Python to do the same thing.

  - It needs to be shown that these situations occur often enough to result in a meaningful benefit. (It doesn't replace the need to iterate dictionaries, as there are many cases where that's exactly what you need.)

+ The methods add some improvements to readability over the iterator form.

  - There is not a significant reduction in lines of code, so again it needs to be shown that this would be useful often enough to be a significant benefit.

Provided there are enough use cases to demonstrate a significant benefit, we will then need to address the following issues:

+ What to call them.
+ The details of the implementation.

Most of the arguments against fit into the following categories...

- Changes the status quo
- It's premature optimization
- Adds additional complexity to dictionaries
- Personal preference

These are subjective but still important issues, and they will need to be addressed after it is demonstrated that there are sufficient use cases for these features, if each of these is relevant and to what degree.

Some examples:

    # Combine two dictionaries. (works already)
    dd = dict(d1)
    dd.update(d2)

    # Split dictionary d using a key list.
    keys_rest = set(d.keys()) - set(keys)
    d1, d2 = d.getitems(keys), d.getitems(keys_rest)

    # Remove a subdict of d with keys.
    dd = d.getitems(keys)
    d.delitems(keys)

    # Copy items from dictionary d1 to d2.
    #
    # The getitems method returns a dictionary, so it will
    # work directly with the update method.
    #
    d2.update(d1.getitems(keys))

    # Move items from dictionary d1 to d2.
    d2.update(d1.getitems(keys))
    d1.delitems(keys)

    # Setting items to a specified value with a list of keys.
    d.seteach(keys, None)

Use cases:

    ### TODO
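The summary's examples can be exercised today with plain helper functions standing in for the proposed methods (the names getitems/delitems/seteach are hypothetical, taken from the summary; a real implementation would be in C on the dict type):

```python
def getitems(d, keys):
    # Partial copy: a new dict containing just these keys.
    return dict((k, d[k]) for k in keys)

def delitems(d, keys):
    # Remove each key in keys from d, in place.
    for k in keys:
        del d[k]

def seteach(d, keys, value=None):
    # Set each key in keys to the same value, in place.
    for k in keys:
        d[k] = value

# "Move items from dictionary d1 to d2" from the summary:
d1 = {1: 'a', 2: 'b', 3: 'c'}
d2 = {}
d2.update(getitems(d1, [1, 3]))
delitems(d1, [1, 3])
# d1 == {2: 'b'}; d2 == {1: 'a', 3: 'c'}

# "Setting items to a specified value with a list of keys":
seteach(d2, [1, 3], None)
# d2 == {1: None, 3: None}
```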
Josiah Carlson wrote:
Ron Adam <rrr@ronadam.com> wrote:
Well, there you go. :-)
This is probably something that is better suited for Python 3000, but it's possible it could be backported to 2.6. It would have no effect on Python 2.5 and earlier, and probably minimal effect on 2.x in regards to 2.3 compatibility. I don't see .fromkeys() being removed in 2.x.
Your own tests show a maximum speedup of 620%. My testing shows it is 300% to 500% over a range of sizes. I would still call that sufficient. And before you point it out... yes, only if it can be shown to be useful in a wide range of situations. I fully intend to find use cases. If I can't find any, then none of this will matter.
But programming is full of moving targets. ;-) In any case, look at the overall picture and try not to prematurely shoot this down based on implementation details that can be changed as needed. And I'll attempt to do a use case study from the python library.
Until you can show significant use-cases
Usually we find substantial use-cases ....
... that I don't remember anyone having ever asked for before
... but again, use-cases ...
I've never needed to do this.
Please find me real-world use-cases ...
Show me code that is easier to understand.
Ok, I get it. :-)
If they did, then it would have already been added.
This statement isn't true. It only shows that the resistance to these changes is greater than the efforts of those who have tried to introduce them (not without good cause). To be clear, I in no way want the bar lowered as to what is added to Python or not. I accept that sufficient benefit needs to be demonstrated, and will try to do that. Quality is more important than quantity in this case.
Let's rephrase this to be less subjective... Does anyone think an approximately 500% improvement in some dictionary operations would be good, if it can be done in a way that is easier to read and use, and has enough use cases to be worthwhile?
Sounds good, how about... getitems, delitems, and seteach? The update method corresponds to setitems, where setitems is the inverse operation to getitems. I don't see any reason to change update. d1.update(d2.getitems(keys)) So seteach is a better name for a method that sets each key to a value. Cheers, Ron
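For experimenting with these names before any C implementation exists, they can be prototyped as a dict subclass (a pure-Python sketch; the real proposal is for C methods on the builtin):

```python
class ProposedDict(dict):
    """dict plus the proposed getitems/delitems/seteach methods."""

    def getitems(self, keys):
        # the inverse of update: pull the given keys out into a new dict
        return self.__class__((k, self[k]) for k in keys)

    def delitems(self, keys):
        for k in keys:
            del self[k]

    def seteach(self, keys, value=None):
        for k in keys:
            self[k] = value

d1 = ProposedDict(a=1, b=2)
d2 = ProposedDict(b=20, c=3)
d1.update(d2.getitems(['c']))   # copy just 'c' across
# d1 == {'a': 1, 'b': 2, 'c': 3}
```

Since getitems returns a dictionary, it composes directly with the existing update method, as in the d1.update(d2.getitems(keys)) example above.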

Ron Adam <rrr@ronadam.com> wrote:
The problem with that is that when a method mutates an object, it shouldn't return the object. Your new .set_keys() method violates this behavior that is used in lists, sets, dicts, deques, arrays, etc. I don't have time to comment on the rest at the moment, will do when I get a chance. - Josiah

Ron Adam <rrr@ronadam.com> wrote:
The dictionary fromkeys method seems out of place as well as misnamed. IMHO
It is perfectly named (IMNSHO ;), create a dictionary from the keys provided; dict.fromkeys() .
There are enough correct uses of it in the wild to keep the behavior, but it can be done in a better way.
I wasn't terribly convinced by your later arguments, so I'm -1.
This can be done today: status.update((i, 0) for i in status.keys()) #or status.update(dict.fromkeys(status, 0))
Changing the behavior of dict.fromkeys() is not going to happen. We can remove it, we can add a new method, but changing it will lead to not so subtle breakage as people who were used to the old behavior try to use the updated method. Note that this isn't a matter of "it's ok to break in 3.0", because dict.fromkeys() is not seen as being a design mistake by any of the 'heavy hitters' in python-dev or python-3000 that I have heard (note that I am certainly not a 'heavy hitter').
Um... def from_keys(d, iterator): return dict((i, d[i]) for i in iterator)
I can't remember ever needing something like this that wasn't handled by d.clear() .
dict((i, d2.get(i, d1.get(i))) for i in itertools.chain(d1,d2))
I can't think of a simple one-liner for this one that wouldn't duplicate work.
dict((i, d.pop(i, None)) for i in keys)
d2.update((i, d1[i]) for i in keys)
d2.update((i, d1.pop(i, None)) for i in keys)
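Collected in one place, the one-line generator-expression spellings above behave like this (a quick demonstration with made-up data, not a proposal):

```python
import itertools

d1 = {1: 'a', 2: 'b', 3: 'c'}
d2 = {3: 'z', 4: 'd'}
keys = [1, 3]

# fetch a sub-dict by key list
sub = dict((i, d1[i]) for i in keys)                # {1: 'a', 3: 'c'}

# merge, with d2 winning on overlapping keys
merged = dict((i, d2.get(i, d1.get(i)))
              for i in itertools.chain(d1, d2))     # {1: 'a', 2: 'b', 3: 'z', 4: 'd'}

# move items out of d1 (pops as it goes)
moved = dict((i, d1.pop(i, None)) for i in keys)    # d1 is left as {2: 'b'}
```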
I think the set_keys, from_keys, and del_keys methods could add both performance and clarity benefits to python.
Performance, sometimes, for some use-cases. Clarity? Maybe. Your split* functions are a bit confusing to me, and I've never really needed any of the functions that you list.
Not all X line functions should be builtins. If you find that you are doing the above more often than you think you should, create a module with all of the related functionality that automatically patches the builtins on import and place it in the Python cheeseshop. If people find that the functionality helps them, then we should consider it for inclusion. As it stands, most of the methods you offer have a very simple one-line version that is already very efficient.
So this replaces two methods and adds one more. Overall I think the usefulness of these would be very good.
I don't find the current dictionary API to be lacking in any way other than "what do I really need to override to get functionality X", but that is a documentation issue more than anything.
I'm sorry, but I can't really see how your changes would add to Python's flexibility without cluttering up interfaces and confusing current users. - Josiah

Josiah Carlson wrote:
That's ok, I won't hold it against you. ;-) What about its being out of place? Is this case like 'sorted' vs 'sort' for lists? I'm ok with leaving it named as-is if that's a real problem. Another name for the mutate-with-keys method can be found. That may reduce possible confusion as well.
Yes, I'm not the most influential writer. I'm not sure I can convince you it's better if you already think it's not. That has more to do with your personal preference. So let's look at how much it's actually needed in the current (and correct) form. (These are rough estimates; I can try to come up with more accurate statistics if that is desired.) Doing a search on google code turns up 300 hits for "lang:python \.fromkeys\(". Looking at a sample of those, it looks like about 80% use it as a set() constructor to remove duplicates. (For compatibility reasons with Python 2.3 and earlier code.) Is there a way to narrow this down to Python 2.4 and later? (anyone?) A bit more sampling, and it looks like about 8 of 10 of the remaining 20% can be easily converted to the following form without any trouble.

d = dict()
d.set_keys(keys, v=value)

That would leave about 12 cases (YMMV) that need the inline functionality. For those, a simple function can do it.

def dict_from_keys(keys, v=None):
    d = dict()
    d.set_keys(keys, v)
    return d

Is 12 cases out of about 315,000 Python files a big enough need to keep the current behavior? 315,000 is the number returned from google code for all Python files, 'lang:python'. (I'm sure there are some duplicates.) Is this more convincing? ;-) (If anyone can come up with better numbers, that would be cool.)
Of course all of the examples I gave can be done today. But they nearly all require iterating in python in some form.
The first example requires iterating over the keys. The second example works if you want to initialize all the keys. In which case, there is no reason to use the update method. dict.fromkeys(status, 0) is enough.
Then lets find a different name.
(iterating) Yep, as I said just above this: """There currently isn't a way to split a dictionary without iterating its contents ...""" Lists have __getslice__, __setslice__, and __delslice__. It could be argued that those can be handled just as well with iterators and loops. Of course we see them as seq[s:s+x], on both lists and strings. So why not have an equivalent for dictionaries? We can't slice them, but we do have key lists to use in the same way.
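The slice analogy can be made concrete; for dictionaries an explicit key list plays the role the index range plays for lists (a small sketch, runnable on modern Python):

```python
lst = ['a', 'b', 'c', 'd', 'e']
part = lst[1:4]                          # lists: a contiguous slice -> ['b', 'c', 'd']

d = {1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}
keys = [1, 3, 5]
part_d = dict((k, d[k]) for k in keys)   # dicts: "slice" by an explicit key list
# part_d == {1: 'a', 3: 'c', 5: 'e'}
```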
All or nothing. d = dict() works just as well. BTW, google code gives 500 hits for "\.clear\(". But it's very unclear how many of those are false positives due to other objects having a clear method. It's probably a significant percentage in this case.
(iterating) And I'd prefer to define the function in this case for readability reasons.
:-) This is one of the main motivators.
(iterating)
(iterating)
(iterating)
I think sometimes our need is determined by what is available for use. So if it's not available, our minds filter it out from the solutions we consider. That way, we don't need the things we don't have or can't get. My mind's "need filter" seems to be broken in that respect. I often need things I don't have. But sometimes that works out to be good. ;-)
Of course I knew someone would point this out. I'm not requesting the above example functions be builtins, only that the changes to the dict methods be considered. They would allow those functions to work in a more efficient way, and I'd be happy to add them to my own library. With these methods, in most cases the functions wouldn't even be needed. You would just use the methods in combination with each other directly, and the result would still be readable without a lot of 'code' overhead. Also consider this from a larger view. List has __getslice__, __setslice__, and __delslice__. Set has numerous methods that operate on more than one element. Dictionaries are supposed to be highly efficient, but they have only limited methods that can operate on more than one item at a time, so you end up iterating over the keys to do nearly everything. So as an alternative, leave fromkeys and clear alone and add...

getkeys(keys) -> dict
setkeys(keys, v=None)
delkeys(keys)

Where these offer the equivalent of list slice functionality for dictionaries.
Iterators and for loops are fairly efficient for small dictionaries, but iterating can still be considerably slower than the equivalent C code for large dictionaries.
I think it cleans up the API more than it clutters it up. It converts two limited-use methods into more general ones, and adds one more that works nicely with the already existing update method. In the case of the two existing methods, fromkeys and clear, your argument that there already exist easy one-line functions to do this would be enough of a reason not to have them in the first place. So do you feel they should be removed? I plan on doing a search for places where these things can make a difference in making the code more readable and/or faster. Cheers, Ron

Ron Adam <rrr@ronadam.com> wrote:
Sorted returns a list because it is the only mutable ordered sequence in Python, hence the only object that makes sense to return from a sorted function.
Not to me, as I use dict.fromkeys(), and going from a simple expression to an assignment then mutate is unnecessary cognitive load. It would have been more convincing had you offered... dict((i, v) for i in keys) But then again, basically every one of your additions is a one line expression. I would also consider the above myself, if it weren't for the fact that I'm supporting a Python 2.3 codebase. Please see my discussion below of *removing* functionality.
Premature optimization... Note that you don't know where you are getting your data, so the overhead of looping and setting data may be inconsequential to the overall running of the update. Since you basically use the "but it iterates in Python rather than C" for the rest of your arguments, I'm going to stick with my belief that you are prematurely optimizing. Until you can show significant use-cases in the wild, and show that the slowdown of these functions in Python compared to C is sufficient to render the addition of the functions in your own personal library useless, I'm going to stick with my -1.
I was pointing out how you would duplicate exactly the functionality you were proposing for dict.set_keys(). It is very difficult for me to offer you alternate implementations for your own use, or as reasons why I don't believe they should be added, if you move the target ;).
Usually we find substantial use-cases for which this new functionality would be useful, _then_ we argue about names (usually for months ;). The only exception to this is in 3rd party modules posted in the cheeseshop, but then we don't usually hash out the details of it here, as it is a 3rd party module.
You aren't splitting the dictionary. You are fetching certain values from the dictionary based on the contents of a provided iterator. The *only* thing you gain from the iterator vs. built-in method is a bit of speed. But if speed is your only argument, for a group of functions that I don't remember anyone having ever asked for before, then you better check your rationale. In the standard library there exists the deque type in collections. Why does Python have a deque? Because it was discovered over 10+ years of Python use that pretty much everyone needs a queue, with a large portion of those needing a double ended queue (put the just fetched item back at the front). Because there were so many users, and because it was used in *many* performance critical applications, it was implemented in C by Raymond Hettinger and became the first member of the collections module. A similar thing happened with default dictionaries and it being faked many times by many different people, implemented and tossed into the collections module again. As for iteration over a sequence to generate a new sequence, you need to do this regardless of whether it is in C or Python. The *only* difference between the C and Python versions of this is a difference in speed, but again, use-cases before naming and optimization. I like to see things "in the wild".
Your function examples are a bit like adding set manipulation functionality through functional programming-like functions. Take your merge operations as an example. With sets, it's spelled s1 | s2. It is a bit round-about, but your from_keys functionality is a bit like s1 - (s1 - s2), or really set(s2) because sets have no associated values. Anyways.
Not when you want to mutate a dictionary.
I've never needed to do this. And I've never seen source that needed to do this either. So whether this is a main motivator for you doesn't sway me. [snip your pointing out that iteration happens in Python and not C]
Yeah, I don't buy your 'need filter' reasoning. Typically people resist doing things that are difficult or inconvenient to do. Take decoration for example. Before decorator syntax, decoration was a pain in the butt. Yeah, you wrote the same number of lines of code, but there was such a disconnect from the signature of the function/method (and class in 2.6) that it was just too inconvenient to write, maintain, and understand.

In the case of dictionaries, all but two or three of the things you would like to offer are available via a very simple dict(generator expression). If people aren't thinking of ways to use generator expressions to make their lives easier (this is the case in multiple threads daily in comp.lang.python), is that Python's fault, or is it the developer's? I like to think of Python's syntax and semantics as just rich enough for people to write what they want and to understand it quickly, but not so rich that you need to spend time thinking about what something means (the Perl argument).

Adding functionality to existing objects needs to do a few things, not the least of which is solving a problem that happens in the wild, but also not overly burdening those who implement similar functionality. Remember, dictionaries are *the* canonical mapping interface, and anyone who implements a complete mapping interface would necessarily need to implement the 3 methods you propose. For what? To clean up the interface? I'm sorry, but adding 3 methods, even with the assumption that two previous methods were going to be removed, in order to "clean up" the interface doesn't convince me. Please find me real-world use-cases where your new methods would improve readability. ... Also, I develop software for fun and profit.
Since basically everyone else here probably does some selection of the same, I'm sure that they will tell you pretty much the same thing: if we restricted our needs to what we already have, software wouldn't get written, or would only be proposed by marketing.
I'm usually the one to invoke it. Maybe I have less tolerance to arguably trivial additions to Python than others.
My single expression replacements were to show that the functions aren't needed now, as most are *easily* implemented in Python 2.5 in a straightforward manner.
Lists are ordered sequences, dictionaries are not. Sets are not mappings, they are sets (which is why they have set operations). Dictionaries are a mapping from keys to values, used as both an arbitrary data store as well as data and method member lookups on objects. The most common use-cases of dictionaries *don't* call for any of the additional functionality that you have offered. If they did, then it would have already been added.
Iteration is a fundamental building block in Python. That's why for loops, iterators, generators, generator expressions, list comprehensions, etc., all use iteration over an iterator to do their work. Building more functionality into dictionaries won't make them easier to use, it will merely add more methods that you think will help. Is there anyone else who likes this idea? Please speak up.
getkeys/setkeys/delkeys seem to me like they should be named getitems/setitems/delitems, because they are getting/setting/deleting the entire key->value association, not merely the keys.
Let's find out.

>>> d = dict.fromkeys(xrange(10000000))
>>> import time
>>> if 1:
...     t = time.time()
...     e = dict(d)
...     print time.time()-t
...
1.21899986267
>>> del e
>>> if 1:
...     t = time.time()
...     e = dict(d.iteritems())
...     print time.time()-t
...
2.75
>>> del e
>>> if 1:
...     t = time.time()
...     e = dict((i,j) for i,j in d.iteritems())
...     print time.time()-t
...
6.95399999619
>>> del e
>>> if 1:
...     t = time.time()
...     e = dict((i, d[i]) for i in d)
...     print time.time()-t
...
7.54699993134
>>>

Those all seem to be pretty reasonable timings to me. In the best case you are talking about 6.2 times faster to use the C rather than the Python version.
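The same comparison can be made repeatable with the timeit module (a sketch: the original session used Python 2's iteritems, while this version uses items() so it runs on modern Python; absolute numbers vary by machine and dict size, only the ratios matter):

```python
import timeit

d = dict.fromkeys(range(1000))

# All four spellings build an equal dict; only the speed differs.
builds = {
    "dict(d)": lambda: dict(d),
    "dict(d.items())": lambda: dict(d.items()),
    "genexp over items": lambda: dict((i, j) for i, j in d.items()),
    "genexp by key": lambda: dict((i, d[i]) for i in d),
}
for label, build in builds.items():
    assert build() == d                 # sanity check: same result
    secs = min(timeit.repeat(build, number=100, repeat=3))
    print("%-18s %.4f s" % (label, secs))
```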
But you propose a further half dozen functions. If you aren't proposing them for inclusion, why bother including them in your proposal, especially when they have very simple replacements that are, arguably, easier to understand than the function bodies you provided.
We don't remove functionality in Python unless there is a good reason. Typically that reason is because the functionality is broken, the old functionality is not considered "Pythonic", or generally because a group of people believe there is a better way. Guido is more or less happy with dictionaries as-is (except for the keys(), values(), and items() methods, which are changing), and no one in python-dev has complained about dictionary functionality that I can remember. As such, even if you think that your changes would clean up dictionary methods, it is unlikely to happen precisely because *others* aren't mentioning, "dictionaries need to be cleaned up".
I plan on doing a search of places where these things can make a difference in making the code more readable and/or faster.
I don't care about faster. Show me code that is easier to understand. I will mention that all of your functionality smells very much like a functional programming approach to Python. This makes a difference because some functional programming tools (reduce, map, filter, ...) are slated for removal in Python 3.0, so adding functional programming tools (when we are removing others), is unlikely to gain much traction. - Josiah

On 5/29/07, Josiah Carlson <jcarlson@uci.edu> wrote:
Ron Adam <rrr@ronadam.com> wrote:
Josiah Carlson wrote:
dict.fromkeys() is not seen as being a design mistake
ermm.... I think at least a few see it as something of a leftover, for using dicts as sets. The docs also (weakly) support this view. This isn't quite the only use case, but I doubt fromkeys would be added today (given both sets and subclasses of defaultdict); there just isn't quite enough discomfort to remove it.
I can't think of a simple one-liner for this one that wouldn't duplicate work.
*this* is the core of a useful idea. list (and set and generator) comprehensions can't partition very well, because they have only a single output. There isn't a good way to say:

list_a = [x for x in src if pred_a(x)]
src = [x for x in src if not pred_a(x)]
list_b = [x for x in src if pred_b(x)]
src = [x for x in src if not pred_b(x)]
list_c = [x for x in src if pred_c(x)]
list_other = [x for x in src if not pred_c(x)]

On the other hand, you can do it (inefficiently) as above, or you can write an (ugly) version using a custom function, so the solution would have to be pretty good before it justified complicating the comprehension APIs.
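The custom-function version can be a single pass over src (a sketch; pred_a etc. stand in for the predicates, and each item lands in the bin of the first predicate it satisfies):

```python
def partition(src, *preds):
    """Split src into one list per predicate plus a final 'rest' list,
    assigning each item to the first predicate it satisfies."""
    bins = [[] for _ in preds] + [[]]
    for x in src:
        for i, pred in enumerate(preds):
            if pred(x):
                bins[i].append(x)
                break
        else:
            bins[-1].append(x)
    return bins

list_a, list_b, list_c, rest = partition(
    range(10),
    lambda x: x % 3 == 0,   # pred_a
    lambda x: x % 3 == 1,   # pred_b
    lambda x: x % 2 == 0,   # pred_c (only sees what a and b rejected)
)
# list_a == [0, 3, 6, 9], list_b == [1, 4, 7], list_c == [2, 8], rest == [5]
```

Unlike the chained comprehensions above, this walks src once and never rebuilds it.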
unlikely to happen precisely because *others* aren't mentioning, "dictionaries need to be cleaned up".
Not in so many words; Raymond is very reluctant to add anything, because the API is already fairly large. Guido's ABC for mappings (http://svn.python.org/view/sandbox/trunk/abc/) is explicitly a small subset of what dict offers (and doesn't include fromkeys). That said, saying "too large, this isn't needed" is still a far cry from "so we'll remove it", let alone "and add this stuff to replace it". -jJ

Jim Jewett wrote:
I can't see how it could be done without, as you say, complicating the comprehension APIs. However, I do think there could be very useful uses for a standard sorting structure of some sort. That's sorting as in mail sorters, or category sorters, that produce several streams of output instead of just one. Would that be called a de-comprehension? Maybe something like the following as a starting point?

# generate some random data
import random
import string

def random_pnum(length):
    ok_digits = string.letters + string.digits
    digits = [random.choice(ok_digits) for n in range(length)]
    return ''.join(digits)

src = []
for x in range(10):
    src.append(random_pnum(10))

# A de-comprehension generator
def decomp(seq, *cmps):
    results = dict((c.__name__, []) for c in cmps)
    rest = []
    for x in seq:
        for c in cmps:
            if c(x):
                results[c.__name__].append(x)
                break
        else:
            rest.append(x)
    for c in cmps:
        yield results[c.__name__]
    yield rest

# Tests
def a_g(s):
    return s[0].lower() in "abcdefg"

def h_m(s):
    return s[0].lower() in "hijklm"

def n_z(s):
    return s[0].lower() in "nopqrstuvwxyz"

decmps = [a_g, h_m, n_z]
ag, hm, nz, other = decomp(src, *decmps)

print 'ag =', ag
print 'hm =', hm
print 'nz =', nz
print 'other =', other

-------------------
ag = ['c8WQe60G6J', 'EMY7O8qzTg']
hm = ['lDunyeOM98', 'LJuPg8ncZd']
nz = ['uhhuhd9YdO', 'qAuQvfTc6N', 'vpJz47pkP5', 'YOq6m4IXBn']
other = ['8JE6PuXxBz', '4ttyMdpuQY']

--- Ron Adam <rrr@ronadam.com> wrote:
Would that be called a de-comprehension?
LOL

--- Ron Adam <rrr@ronadam.com> wrote:
Am I misunderstanding (de-comprehensing) something here? How does the code above return those result sets? Or, more specifically, why does ag include 'T' in its results set?

Steve Howell wrote:
The data in this case simulates 10-digit part numbers which can include a-z, A-Z, and 0-9. It doesn't alter the data, it just sorts it into smaller groups according to some predefined tests. In this case, it's only testing the first letter of each item. What is tested is entirely up to you. You could have lists of records as your data and test fields and divide the data according to that. Cheers, Ron

--- Ron Adam <rrr@ronadam.com> wrote:
Ok, apologies for quoting away the parts of your code that probably answer my own question. But to your bigger question--I think you can set up a list comprehension that does partitioning by having the list comprehension or generator expression simply return a list of tuples where the first element in the tuple is a value that suggests where it fits in the partition, then feed those tuples to dict() or whatever. But I don't have a specific code example to prove it. For simple binary partitions, there is the bool function.
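The tag-and-collect idea above, written out (a sketch; note that feeding the tagged tuples straight to dict() would keep only the last item per key, so a collecting step is still needed):

```python
src = [3, 1, 4, 1, 5, 9, 2, 6]

# Tag each item with its partition key, then collect by key.
tagged = [(x % 2 == 0, x) for x in src]
groups = {}
for key, x in tagged:
    groups.setdefault(key, []).append(x)

evens, odds = groups[True], groups[False]
# evens == [4, 2, 6], odds == [3, 1, 1, 5, 9]
```

For a binary partition the bool-valued tag gives exactly two groups, as suggested above.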

Steve Howell wrote:
That would depend on what level of abstraction you want. I find python already handles the simple things fairly well, so I tend to look for the next level up now. That makes it a bit harder to find the balance between being too specific and too general.
For simple binary partitions, there is the bool function.
Or more likely you may have a method in a class that tests for a particular condition. passed, failed = decomp(list_of_classes, lambda x: x.test()) Cheers, Ron

--- Ron Adam <rrr@ronadam.com> wrote:
Here is some category-sorting code, FWIW, where every employee, Fred or not, gets a 50% raise, and employees are partitioned according to their Fredness. It doesn't use a general iterator, so maybe I'm missing your point.

def partitions(lst):
    dct = {}
    for k, value in lst:
        dct.setdefault(k, []).append(value)
    return dct.items()

def is_fred(emp):
    return 'Fred' in emp[0]

emps = [
    ('Fred Smith', 50),
    ('Fred Jones', 40),
    ('Joe Blow', 30),
]

def pay_increase(salary):
    return salary * 0.5

emp_groups = partitions([(is_fred(emp), (emp[0], pay_increase(emp[1])))
                         for emp in emps])

for fredness, emps in emp_groups:
    print
    print 'is Fred?', fredness
    for name, pay_increase in emps:
        print name, pay_increase

----
is Fred? False
Joe Blow 15.0

is Fred? True
Fred Smith 25.0
Fred Jones 20.0
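For what it's worth, the partitions helper above can also be written with collections.defaultdict (new in Python 2.5), which drops the setdefault call (a sketch; the items are sorted here only to make the output order deterministic):

```python
from collections import defaultdict

def partitions(pairs):
    """Group (key, value) pairs into a list of (key, [values]) entries."""
    dct = defaultdict(list)
    for key, value in pairs:
        dct[key].append(value)
    return sorted(dct.items())

groups = partitions([(x % 2 == 0, x) for x in [1, 2, 3, 4]])
# groups == [(False, [1, 3]), (True, [2, 4])]
```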

On 5/29/07, Steve Howell <showell30@yahoo.com> wrote:
Or maybe you skipped homework on the itertools.groupby thread of c.l.py. ;-) George -- "If I have been able to see further, it was only because I stood on the shoulders of million monkeys."
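For reference, the itertools.groupby approach alluded to here needs the data sorted by the grouping key first, since groupby only merges consecutive items (a sketch reusing the employee data from earlier in the thread):

```python
from itertools import groupby

emps = [('Fred Smith', 50), ('Joe Blow', 30), ('Fred Jones', 40)]

def is_fred(emp):
    return 'Fred' in emp[0]

# Sort by the key first; groupby only groups adjacent equal keys.
grouped = dict((k, list(g))
               for k, g in groupby(sorted(emps, key=is_fred), key=is_fred))
# grouped[True]  == [('Fred Smith', 50), ('Fred Jones', 40)]
# grouped[False] == [('Joe Blow', 30)]
```

The sort is stable, so items within each group keep their original relative order.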

Steve Howell wrote:
Since you aren't really creating two lists, the problem below doesn't really fit this particular solution. But maybe we can make it work from a different point of view.

emps = {
    'Fred Smith': 50.0,
    'Fred Jones': 40.0,
    'Joe Blow': 30,
}

def pay_increase(salary):
    return salary * 0.5

def is_fred(name):
    return 'Fred' in name

# give all Freds a raise
freds, notfreds = decomp(emps.keys(), is_fred)
for name in freds:
    emps[name] = pay_increase(emps[name])

#
# Then we can use the freds list again to generate a report.
#

# Of course in this case the following would work just as well...
freds = []
for name in emps:
    if is_fred(name):
        emps[name] = pay_increase(emps[name])
        freds.append(name)

One reason to generate more than one list is if each list is going to be handled in batches, or in different ways, or at different times, than it otherwise would be by just iterating. Cheers, Ron

participants (5)
- George Sakkis
- Jim Jewett
- Josiah Carlson
- Ron Adam
- Steve Howell