sorteddict PEP proposal [started off as orderedict]

Steven Bethard steven.bethard at gmail.com
Tue Sep 25 12:47:48 EDT 2007


Mark Summerfield wrote:
> PEP: XXX
> Title: Sorted Dictionary
[snip]
>     In addition, the keys() method has two optional arguments:
> 
>     keys(firstindex : int = None, secondindex : int = None) -> list of keys
> 
>     The parameter names aren't nice, but using say "start" and "end" would
>     be misleading since the intention is for the parameters to work like
>     they do in range(), e.g.
> 
>         sd.keys()       # returns a list of all the sorteddict's keys
>         sd.keys(10)     # returns a list of the first 10 keys
>         sd.keys(19, 35) # returns a list of the 19th-34th keys inclusive

You should use the range() names, "start" and "stop":

     >>> help(range)
     Help on built-in function range in module __builtin__:

     range(...)
         range([start,] stop[, step]) -> list of integers

But I also agree that this is probably not necessary given the existence 
of slicing and itertools.islice().

>     Since the sorteddict's data is always kept in key order, indexes
>     (integer offsets) into the sorteddict make sense.  Five additional
>     methods are proposed to take advantage of this:
> 
>     key(index : int) -> value
> 
>     item(index : int) -> (key, value)
> 
>     value(index : int) -> key
> 
>     set_value(index : int, value)
> 
>     delete(index : int)
> 
>     Items and values can still be accessed using the key, e.g., sd[key],
>     since all the dict's methods are available.

I agree with others that these APIs are really not necessary. Simply 
call keys(), values() or items() and then use that list if you really 
want indexing. If you're dealing with a lot of these kind of indexing 
behaviors, working with the plain list object will be faster anyway.

> Examples
> 
>     To keep a collection of filenames on a case-insensitive file system in
>     sorted order, we could use code like this:
> 
>         files = collections.sorteddict.sorteddict()

The type should probably just be in collections directly. No need for a 
sub-package. So that would look like::

     files = collections.sorteddict()

I also strongly support Paul Hankin's proposal for the constructor API::

     sorteddict((mapping|sequence|..), cmp=None, key=None, reverse=False)

Being able to sort by a key= function would make sorteddict() 
dramatically more useful. I almost never use the builtin heap() because 
it doesn't support a key= function and I always need one. I suspect use 
cases for sorteddict() will be similar. For example::

     files = collections.sorteddict(key=lambda s: s.lower())
     for name in os.listdir("."):
         files[name] = ... some calculated value ...

Now the file names are in lower-case sorted order, but if you actually 
retrieve the keys, e.g. through .items(), you'll see the real filename, 
not the lower-cased ones.

STeVe



More information about the Python-list mailing list