What is the best way to merge two lists?

Pierre Rouleau pieroul at attglobal.net
Sun Nov 17 10:38:16 EST 2002


John Hunter wrote:
>>>>>>"Pierre" == Pierre Rouleau <pieroul at attglobal.net> writes:
>>>>>
> 
>     Pierre> What's the most efficient way to merge two lists inside a
>     Pierre> single one with the resulting list having only one
>     Pierre> instance of each element (ie not having two elements with
>     Pierre> the same value)?
> 
>     Pierre> The order of the elements is not a criteria for the
>     Pierre> selection of the algorithm.
> 
> The best way will depend on the size of the list and the type of
> elements contained in each, but for Merge Lists 101, the best approach
> is to use a dictionary.
> 
> If you can afford the memory consumption, the following will give
> pretty good performance
> 
> l1 = [1,2,3,4]
> l2 = [2,3,12,14]
> 
> m = dict( [(x,1) for x in l1+l2])
> print m.keys()
> 
> If memory is an issue, you can process each list separately:
> 
> m1 = dict( [(x,1) for x in l1])
> m1.update( dict( [(x,1) for x in l2]))
> print m1.keys()
> 
> 
> The solutions above require a recent version of python (2.2+).  For
> older versions (<2.0), use the manual loop, which is slower but will
> run on pretty much any python
> 
> m = {}
> for x in l1+l2: 
>     m[x]=1
> print m.keys()
> 
>
Thanks!

Now, if i add two more criterias to the question, the solutions based on 
dictionary don't work. The criteria being:
  1) The input lists can contain elements of *any* type.
  2) The resulting list must always be the same for every execution with 
   identical input (deterministic results)

Your solution works if the elements that are inside the lists to merge 
are all hashable.  If one of the element of the list is not, it can't be 
entered as a dictionary key and the merge fails.  For example, all of 
your examples would fail mergin the following lists:

 >>> a = [1,2,[3,4,5],6]
 >>> b = [5,2,3]
 >>> m = dict( [(x,1) for x in a+b])
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
TypeError: list objects are unhashable
 >>>

I came up with a function that will handle heterogenous lists containing 
anything, but its pretty much a brute force approach.  There ought to be 
a better way:

def uniq(*args) :
    theInputList = []
    for theList in args:
       theInputList += theList
    theFinalList = []
    for elem in theInputList:
       if elem not in theFinalList:
          theFinalList.append(elem)
    return theFinalList

 >>> a = [1,2,[3,4,5],6]
 >>> b = [5,2,3]
 >>> c = uniq(a,b)
 >>> c
[1, 2, [3, 4, 5], 6, 5, 3]
 >>>

It is also deterministic (the result list will always be the same in all 
executions given the same inputs for all computers).

Is there a more efficient way to merge heterogenous lists than the brute 
force approach used by uniq() above?

Thanks

Pierre






More information about the Python-list mailing list