lxml, comparing nodes
Stefan Behnel
stefan_ml at behnel.de
Fri Jul 25 07:02:09 EDT 2008
code_berzerker wrote:
>> If document order doesn't matter, try sorting the elements of each level in
>> the two documents by some arbitrary deterministic key, such as (tag name,
>> text, attr count, whatever), and then compare them in order, instead of trying
>> to find matches in multiple passes. itertools.groupby() might be your friend here.
>
> I think that sorting multiple times by each attribute will cost more
> than I've managed to do:
[...]
> let1 = [x for x in et1.iter()]
> let2 = [x for x in et2.iter()]
>
[...]
> while let1:
>     el = let1.pop(0)
>     foundEl = findMatchingElem(el, let2)
>     if foundEl is None:
>         return False
>     let2.remove(foundEl)
> return True
>
> def findMatchingElem(el, eList):
>     for elem in eList:
>         if elemsEqual(el, elem):
>             return elem
>     return None
[...]
> Notice that if documents are in exact same order, each element is
> compared only once!
Not in your code.
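
As a rough illustration of the sorting idea quoted at the top (this is not code from the thread), the sketch below builds a sortable tuple for each element and sorts the children of every level by it, so that two trees can then be compared in one ordered pass. It uses the standard library's xml.etree.ElementTree so that it runs without lxml; the few calls it relies on (fromstring, tag, text, attrib, iterating over children) are also available in lxml.etree. The names canonical() and unordered_equal() are made up for this example, and the choice of key (tag, stripped text, sorted attributes, sorted children) is an assumption. Tail text is ignored, and the input is assumed to contain no comments or processing instructions.

```python
import xml.etree.ElementTree as ET  # assumption: lxml.etree would work the same here


def canonical(el):
    """Return an order-independent, sortable tuple describing `el`.

    Hypothetical key: (tag, stripped text, sorted attributes, sorted children).
    Children are canonicalised recursively and then sorted, so sibling order
    in the document no longer affects the result.
    """
    children = tuple(sorted(canonical(child) for child in el))
    attrs = tuple(sorted(el.attrib.items()))
    text = (el.text or "").strip()
    return (el.tag, text, attrs, children)


def unordered_equal(a, b):
    """True if the two trees match when sibling order is ignored."""
    return canonical(a) == canonical(b)


doc1 = ET.fromstring("<r><x k='1'>t</x><y/><x>u</x></r>")
doc2 = ET.fromstring("<r><x>u</x><x k='1'>t</x><y/></r>")
same = unordered_equal(doc1, doc2)
```

Each element is visited once to build its tuple, and the remaining cost is one sort per level, which avoids rescanning the second list for every element of the first.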
Stefan