What is the best way to merge two lists?

Alex Martelli aleax at aleax.it
Mon Nov 18 06:19:44 EST 2002


Padraig Brady wrote:

> Pierre Rouleau wrote:
>> What's the most efficient way to merge two lists inside a single one
>> with the resulting list having only one instance of each element (ie not
>> having two elements with the same value)?
>> 
>> The order of the elements is not a criteria for the selection of the
>> algorithm.
> 
> This is relevant I think:
> http://www.python.org/peps/pep-0218.html

Unfortunately inapplicable due to the following detail noted at the end:

"""
Both types of set require that their elements are
    immutable, hashable objects.
"""

Pierre's key issue is that he wants to be able to handle items of ANY type 
-- presumably any type comparable for equality, otherwise 'the same value'
makes no sense, but definitely can't accept restriction to immutable,
hashable objects.

One solution that comes to mind at once would be (untested):

class genericset:

    def __init__(self):
        self.dict = {}
        self.lookaside = []

    def add(self, item):
        try:
            self.dict[item] = 1
        except TypeError:
            if item not in self.lookaside:
                self.lookaside.append(item)

    def __contains__(self, item):
        try:
            return item in self.dict
        except TypeError:
            return item in self.lookaside


This doesn't deal with the further issue of extracting items in a 
deterministic order -- any order, as long as it can be guaranteed
that the order will always be the same for any given set of items
on whatever computer this code is run.  But that's not hard to fix,
it suffices to add an arbitrary index instead of the fixed '1' as
the contents of self.dict and do some sorting at extraction time
(an auxiliary index-to-item dict might make extraction faster):

class genericsequenceableset:

    def __init__(self):
        self.dict = {}
        self.lookaside = []
        self.ninserts = 0

    def add(self, item):
        self.ninserts += 1
        try:
            self.dict[item] = self.ninserts
        except TypeError:
            if item not in self.lookaside:
                self.lookaside.append(item)

    def __contains__(self, item):
        try:
            return item in self.dict
        except TypeError:
            return item in self.lookaside

    def extractall(self):
        aux = [ (v, k) for k, v in self.dict.items() ]
        aux.sort()
        return [ k for v, k in aux ] + self.lookaside


Alex




More information about the Python-list mailing list