More elegant solution for diffing two sequences
Lie Ryan
lie.1296 at gmail.com
Fri Dec 4 14:20:05 EST 2009
On 12/5/2009 4:20 AM, Ulrich Eckhardt wrote:
> Thinking about it, I perhaps should store the glyphs in a set from the
> beginning. Question is, can I (perhaps by providing the right hash function)
> sort them by their codepoint? I'll have to look at the docs...
Python does not guarantee that a particular characteristic of the hash
function will lead to a particular characteristic of the ordering of the
set. AFAICT, the current set's ordering is determined by the hash modulo
the real size of the set's hash table, but if you rely on this you're on
your own. It's better to call sorted() on them when you want a sorted
view (or to convert to a set just before finding the differences).
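As a minimal sketch of the sorted-view idea, assuming the glyphs are single-character strings (the sample data and variable names here are made up for illustration):

```python
# Hypothetical glyph data; a set keeps no defined order.
glyphs = {"c", "a", "\u00e9", "b"}

# Build a sorted view on demand, keyed explicitly by codepoint.
by_codepoint = sorted(glyphs, key=ord)
```

The set itself stays unordered; sorted() just returns a new list each time you ask for one.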
You can reduce the penalty of creating new data structures with something
like:
a = [...]
b = [...]
s_a = set(a)
s_a -= set(b)
that only creates two new sets (instead of three) and might be
faster too (though you'd need to profile to be sure).
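If the extra set for b still bothers you, one possible variation is set.difference_update(), which accepts any iterable, so only one new set gets built. This is just a sketch; the function name diff_sorted is made up here, and whether it is actually faster is something you'd have to profile:

```python
def diff_sorted(a, b):
    """Return the items of a that are not in b, as a sorted list."""
    remaining = set(a)               # the only new set created
    remaining.difference_update(b)   # b can be any iterable, no set(b) needed
    return sorted(remaining)         # sort only at the end, when order matters


only_in_a = diff_sorted([3, 1, 2, 5], [2, 5, 7])
```

Calling it the other way round, diff_sorted(b, a), gives the items that are only in b.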