sorted unique elements from a list; using 2.3 features
Delaney, Timothy
tdelaney at avaya.com
Mon Jan 6 00:49:47 EST 2003
> From: Andrew Dalke [mailto:adalke at mindspring.com]
>
> Python 2.3 offers at least two new ways to do this. The first is
> with the new 'Set' class
>
> # Let 'data' be a list or iterable object
> import sets
> subset = list(sets.Set(data))
> subset.sort()
> # Use 'subset' as needed
Using sets is definitely the Right Way (TM) to do it. This is one of the
primary use cases for sets (*everyone* wants to do this).
> (The 'list()' is needed because that's the only way to get elements
> out from a list. It provides an __iter__ but no 'tolist()' method.)
And this is the canonical way to transform any iterable to a list. Why
should every class that you want to transform to a list have to supply a
`tolist` method? Why not a `totuple` method?
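To illustrate the point, here is a minimal sketch. The `Countdown` class is a hypothetical example, not anything from the standard library. It supplies only `__iter__`, and the built-in `list()` and `tuple()` constructors do the rest.

```python
class Countdown:
    """Hypothetical iterable: defines only __iter__, no tolist()/totuple()."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Yield start, start-1, ..., 1
        return iter(range(self.start, 0, -1))


# Any iterable can be handed to the built-in constructors directly.
as_list = list(Countdown(3))
as_tuple = tuple(Countdown(3))
```

The class never needs to know which container the caller wants.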
> The other is with the new 'fromkeys' class, which constructs
Actually, it's a class method of the dictionary type (a classmethod, not a
staticmethod).
> # Let 'data' be a list or iterable object
> subset = dict.fromkeys(data).keys()
> subset.sort()
> # Use 'subset' as needed
This, whilst slightly shorter (due to no import - which in future versions
will be going away anyway), is definitely *not* the Right Way (TM) to do it.
It is likely to confuse people.
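A small sketch of why the dictionary version can mislead a reader. It assumes a later Python than 2.3, one where `set` is a built-in and `sorted()` exists, and the sample `data` is made up for illustration. `dict.fromkeys` builds a full mapping whose values are all `None`, even though only the keys are wanted.

```python
data = ['b', 'a', 'b']  # made-up sample input

# dict.fromkeys creates a mapping; every value defaults to None.
as_dict = dict.fromkeys(data)

# Both routes give the same sorted unique elements...
via_dict = sorted(as_dict)
via_set = sorted(set(data))
# ...but only the set version says "unique values" on its face.
```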
> For a real-life example, suppose you want to get unique lines
> from the stdin input stream, sort them, and dump the results
> to stdout. Here's how to do it in Python 2.3
>
> import sys
> unique_lines = dict.fromkeys(sys.stdin).keys()
> unique_lines.sort()
> sys.stdout.writelines(unique_lines)
Nope - this is better done as:
import sets
import sys
unique_lines = list(sets.Set(sys.stdin))
unique_lines.sort()
sys.stdout.writelines(unique_lines)
It says explicitly what you are doing - creating a set of unique *values*
(since that is the definition of a set), then sorting the result.
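For comparison, a rough sketch of the same idea once the import is no longer needed. It assumes a later Python where `set` is a built-in and `sorted()` is available (neither is in 2.3). The names `unique_sorted_lines` and `fake_stdin` are hypothetical, and an `io.StringIO` stands in for `sys.stdin` so the example is self-contained.

```python
import io


def unique_sorted_lines(stream):
    """Return the distinct lines of `stream` in sorted order."""
    # set() drops duplicate lines; sorted() returns a new sorted list.
    return sorted(set(stream))


# Stand-in for sys.stdin; iterating a text stream yields its lines.
fake_stdin = io.StringIO("pear\napple\npear\nbanana\napple\n")
result = unique_sorted_lines(fake_stdin)
```

With real input, the result could be passed to `sys.stdout.writelines()` exactly as above.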
Tim Delaney