Best way to insert sorted in a list
Chris Torek
nospam at torek.net
Fri Jun 17 20:46:44 EDT 2011
In article <itgi3801bra at news2.newsguy.com> I wrote, in part:
>>Appending to the list is much faster, and if you are going to
>>dump a set of new items in, you can do that with: [...]
In article <mailman.96.1308348643.1164.python-list at python.org>
Ethan Furman <ethan at stoneleaf.us> wrote:
>> a.append(large_list)
> ^- should be a.extend(large_list)
Er, right. Posted in haste (had to get out the door). I also
wrote:
>> If len(large_list) is m, this is O(m). Inserting each item in
>> the "right place" would be O(m log (n + m)). But we still
>> have to sort:
>>
>> a.sort()
In article <mailman.98.1308353648.1164.python-list at python.org>,
Ian Kelly <ian.g.kelly at gmail.com> wrote:
>> This is O(log (n + m)), hence likely better than repeatedly inserting
>> in the correct place.
>Surely you mean O((n + m) log (n + m)).
Er, "maybe"? (It depends on the relative values of m and n, and
the underlying sort algorithm to some extent. Some algorithms are
better at inserting a relatively small number of items into a
mostly-sorted large list. As I recall, Shell sort does well with
this.) But generally, yes. See "posted in haste" above. :-)
There are a lot of other options, such as sorting just the list of
"items to be inserted", which lets you do a single merge pass:
# UNTESTED
def merge_sorted(it1, it2, must_copy = True):
"""
Merge two sorted lists/iterators it1 and it2.
Roughly equivalent to sorted(list(it2) + list(it2)),
except for attempts to be space-efficient.
You can provide must_copy = False if the two iterators
are already lists and can be destroyed for the purpose
of creating the result.
"""
# If it1 and it1 are deque objects, we don't need to
# reverse them, as popping from the front is efficient.
# If they are plain lists, popping from the end is
# required. If they are iterators or tuples we need
# to make a list version anyway. So:
if must_copy:
it1 = list(it1)
it2 = list(it2)
# Reverse sorted lists (it1 and it2 are definitely
# lists now) so that we can pop off the end.
it1.reverse()
it2.reverse()
# Now accumulate final sorted list. Basically, this is:
# take first (now last) item from each list, and put whichever
# one is smaller into the result. When either list runs
# out, tack on the entire remaining list (whichever one is
# non-empty -- if both are empty, the two extend ops are
# no-ops, so we can just add both lists).
#
# Note that we have to re-reverse them to get
# them back into forward order before extending.
result = []
while it1 and it2:
# Note: I don't know if it might be faster
# to .pop() each item and .append() the one we
# did not want to pop after all. This is just
# an example, after all.
last1 = it1[-1]
last2 = it2[-1]
if last2 < last1:
result.append(last2)
it2.pop()
else:
result.append(last1)
it1.pop()
it1.reverse()
it2.reverse()
result.extend(it1)
result.extend(it2)
return result
So, now if "a" is the original (sorted) list and "b" is the not-yet-
sorted list of things to add:
a = merge_sorted(a, sorted(b), must_copy = False)
will work, provided you are not required to do the merge "in place".
Use the usual slicing trick if that is necessary:
a[:] = merge_sorted(a, sorted(b), must_copy = False)
If list b is already sorted, leave out the sorted() step. If list
b is not sorted and is particularly long, use b.sort() to sort in
place, rather than making a sorted copy.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: gmail (figure it out) http://web.torek.net/torek/index.html
More information about the Python-list
mailing list