More elegant solution for diffing two sequences

MRAB python at mrabarnett.plus.com
Fri Dec 4 19:31:39 CET 2009


Ulrich Eckhardt wrote:
> Lie Ryan wrote:
>> On 12/4/2009 8:28 AM, Ulrich Eckhardt wrote:
>>> I'm trying to write some code to diff two fonts. What I have is every
>>> character (glyph) of the two fonts in a list. I know that the list is
>>> sorted by the codepoints of the characters. What I'd like to ask is
>>> whether there is a more elegant solution to the loop below or whether
>>> there are any rough corners in my code (see below). Note that I'm
>>> targeting Python 2, not 3 yet.
>>>
>> Use sets:
>>
>> glyph_1 = set(font1.glyphs)
>> glyph_2 = set(font2.glyphs)
>> only_in_1 = glyph_1 - glyph_2
>> only_in_2 = glyph_2 - glyph_1
>> in_both = glyph_1 & glyph_2
>>
>> That assumes the values in font1.glyphs are hashable.
> 
> Thinking about it, I perhaps should store the glyphs in a set from the 
> beginning. Question is, can I (perhaps by providing the right hash function) 
> sort them by their codepoint? I'll have to look at the docs...
> 
> Thank you for this nudge in the right direction!
> 
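For sets you need __hash__ and __eq__, and for sorting you need __lt__.
A set has no order of its own, so the hash function can't sort it for
you; call sorted() on the set whenever you need codepoint order.
Here's a simple example: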

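class Glyph(object):
    def __init__(self, codepoint):
        self.codepoint = codepoint
    def __hash__(self):
        # Equal glyphs must hash equally, so hash on the key __eq__ uses.
        return self.codepoint
    def __eq__(self, other):
        if not isinstance(other, Glyph):
            return NotImplemented
        return self.codepoint == other.codepoint
    def __ne__(self, other):
        # Python 2 does not derive != from ==.
        return not self == other
    def __lt__(self, other):
        # sorted() and list.sort() only need __lt__.
        return self.codepoint < other.codepoint
    def __repr__(self):
        return "Glyph(%s)" % self.codepoint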
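
As a rough sketch of how this combines with Lie Ryan's set suggestion:
the codepoints below are made up, and font1_glyphs / font2_glyphs are
hypothetical stand-ins for the real glyph lists. The class is a
condensed copy of the one above so that the snippet runs on its own.

```python
class Glyph(object):
    def __init__(self, codepoint):
        self.codepoint = codepoint
    def __hash__(self):
        return hash(self.codepoint)
    def __eq__(self, other):
        return self.codepoint == other.codepoint
    def __ne__(self, other):
        # Python 2 does not derive != from ==.
        return not self == other
    def __lt__(self, other):
        return self.codepoint < other.codepoint
    def __repr__(self):
        return "Glyph(%s)" % self.codepoint

# Made-up codepoints standing in for the glyph lists of the two fonts.
font1_glyphs = [Glyph(c) for c in (65, 66, 67, 68)]
font2_glyphs = [Glyph(c) for c in (66, 68, 69)]

glyphs_1 = set(font1_glyphs)
glyphs_2 = set(font2_glyphs)

# Sets are unordered, so sort each result to get codepoint order back.
only_in_1 = sorted(glyphs_1 - glyphs_2)
only_in_2 = sorted(glyphs_2 - glyphs_1)
in_both = sorted(glyphs_1 & glyphs_2)

print(only_in_1)  # [Glyph(65), Glyph(67)]
print(only_in_2)  # [Glyph(69)]
print(in_both)    # [Glyph(66), Glyph(68)]
```

Note that two glyphs compare equal here as soon as their codepoints
match. If the diff should also notice glyphs that exist in both fonts
but look different, that needs a separate comparison on the in_both
entries.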



