On Thu, Sep 19, 2019 at 11:58 PM Richard Musil <risa2000x@gmail.com> wrote:

I did not verify the results (I hope they are correct), but the timing suggest that the set operation is always faster. When however we add the set setup to the mix, the build up time, which is O(n + m) becomes more significant than actually sorting and comparing the lists which is technically of O(nlog(n) + mlog(m)). It would also suggest that for some large lists the list implementation _can_ actually be faster if we calculate in an eventual conversion of the lists to the sets and possible even more faster if the lists were already ordered.

Ok, I misread the original code, the lists were not sorted in the previous results (and their should be). So with the correction to have them sorted, the results are:

For A not in B:
trying A 1000, B 5000
best naive_relcomp test time (msec): 38.147
best relcomp_set test time (msec): 0.035
best relcomp_set with set init test time (msec): 0.219
best list_relcomp test time (msec): 0.330

For A not in B:
trying A 10000, B 50000
best relcomp_set test time (msec): 0.546
best relcomp_set with set init test time (msec): 3.293
best list_relcomp test time (msec): 3.493

For A subset of B:
trying A 100000, B 500000
best relcomp_set test time (msec): 8.651
best relcomp_set with set init test time (msec): 55.133
best list_relcomp test time (msec): 144.094

Which actually puts it in line with the expectations. I guess there is not much more to say.

Richard M.