[Tim]
> ...
> Sorting on field 'NumberOfEmployees' -- 6531 records
>     via  sort: 120.15 120.24 120.26 120.31 122.58
>     via msort: 134.25 134.29 134.50 134.74 135.09
> ...
[where the # of comparisons done is]
> Sorting on field 'NumberOfEmployees' -- 6531 records
>     via  sort: 76167 ...
>     via msort: 76554 ...
> ...
[and various hypotheses for why it's >10% slower anyway don't pan out]
> ...
> I conclude that msort is at worst only a tiny bit slower when doing
> NumberOfEmployees, and possibly a little faster.  The only measure that
> doesn't agree with that conclusion is time.clock() -- but time is such a
> subjective thing I won't let that bother me <wink>.
It's the dicts again.  NumberOfEmployees isn't always unique, and in
particular it's missing in 6635 - 6531 = 104 records, so that

    values = [(x.get(fieldname), x) for x in data]

includes 104 tuples with a None first element.  Comparing a pair of those
gets resolved by comparing the dicts, and dict comparison ain't cheap.
Building the DSU tuple-list via

    values = [(x.get(fieldname), i, x) for i, x in enumerate(data)]

instead leads to

    Sorting on field 'NumberOfEmployees' -- 6531 records
        via  sort: 47.47 47.50 47.54 47.66 47.75
        via msort: 48.21 48.23 48.43 48.81 48.85

which gives both methods a huge speed boost, and cuts .sort's speed
advantage much closer to its small advantage in total # of comparisons.
I expect it's just luck of the draw as to which method is going to end up
comparing tuples with equal first elements more often, and msort
apparently does in this case (and those comparisons are more expensive,
because they have to go on to invoke int compare too).

A larger lesson:  even if Python gets a stable sort and advertises
stability (we don't have to guarantee it even if it's there), there may
*still* be strong "go fast" reasons to include an object's index in its
DSU tuple.

tickledly y'rs  - tim
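[A minimal sketch of the DSU-with-index pattern discussed above, for the
record.  The field name and records are made up, and the missing-field
sentinel is -1 rather than None, since None doesn't compare to ints in
modern Python -- where dicts don't compare at all, which makes the index
trick mandatory rather than just a speedup.]

    # Decorate-sort-undecorate with an index tiebreaker.  The index i
    # guarantees every (key, i) prefix is unique, so a comparison never
    # falls through to the record dicts themselves.
    fieldname = "NumberOfEmployees"          # hypothetical field
    data = [{"NumberOfEmployees": 10}, {}, {"NumberOfEmployees": 3}, {}]

    # Decorate: (key, index, record); -1 stands in for a missing field.
    values = [(x.get(fieldname, -1), i, x) for i, x in enumerate(data)]
    values.sort()

    # Undecorate: recover the records in sorted order.  Records with
    # equal (or missing) keys keep their original relative order,
    # because the index breaks the tie stably.
    sorted_data = [x for _, _, x in values]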