aleax at mac.com
Fri Aug 31 04:54:46 CEST 2007
Ricardo Aráoz <ricaraoz at gmail.com> wrote:
> Alex Martelli wrote:
> > <zzbbaadd at aol.com> wrote:
> > ...
> >> In my case I have done os.listdir() on two directories. I want to see
> >> what files are in directory A that are not in directory B.
> > So why would you care about WHERE, in the listdir of B, are to be found
> > the files that are in A but not B?! You should call .index only if you
> > CARE about the position.
> > import os
> > def inAnotB(A, B):
> >     inA = os.listdir(A)
> >     inBs = set(os.listdir(B))
> >     return [f for f in inA if f not in inBs]
> > is the "one obvious way to do it" (the set(...) is just a simple and
> > powerful optimization -- checking membership in a set is roughly O(1),
> > while checking membership in a list of N items is O(N)...).
> And what is the order of passing a list into a set? O(N)+?
Roughly O(N), yes (with the usual caveats about hashing costs, &c;-).
So, when A has M files and B has N, your total costs are roughly O(M+N)
instead of O(M*N) -- a really juicy improvement for large M and N!
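
A minimal sketch of the above, assuming Python 3 (for tempfile.TemporaryDirectory). The names in_a_not_b and demo, and the file names, are made up for illustration and are not from the thread. It builds two scratch directories and lists what is in the first but not in the second:

```python
import os
import tempfile

def in_a_not_b(dir_a, dir_b):
    """Return the names listed in dir_a that are absent from dir_b."""
    in_a = os.listdir(dir_a)        # M names, kept as a list
    in_b = set(os.listdir(dir_b))   # building the set is roughly O(N)
    # M membership tests, each roughly O(1) against the set
    return [f for f in in_a if f not in in_b]

def demo():
    # Hypothetical file names, created only for this illustration.
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        for name in ("one.txt", "two.txt", "three.txt"):
            open(os.path.join(a, name), "w").close()
        open(os.path.join(b, "two.txt"), "w").close()
        # os.listdir returns names in arbitrary order, so sort for a stable result
        return sorted(in_a_not_b(a, b))

if __name__ == "__main__":
    print(demo())
```

Dropping the set(...) call should give the same answer, but each "not in" test would then scan the list from B, which is where the O(M*N) comes from.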