reading directory entries one by one

Andrew Dalke dalke at dalkescientific.com
Wed May 22 11:57:51 EDT 2002


Michael P. Soulier:
>    I have a few with 4000+ files in them. There's enough memory on the
>box to easily handle this, but it's still not efficient.

"efficient"?  What does that mean in this context?  There are three
metrics I can think of:
 - development time, but a list is far easier to understand and use than
     an iterator.  E.g., to sort the results of an iterator you first
     have to collect them into a list().

 - memory size, for 4000+ files at, oh, 40 bytes per name/string is
     160K.  Which is about 1/10 of what Python is using, and less than
     1/1000th of what most machines have.

 - run time, it's likely faster for Python to build one list in one
     go than to pay the per-item iterator overhead

The last is likely mostly a theoretical advantage, just like the
second is mostly a theoretical disadvantage.  (Theoretical meaning
unlikely to affect real-life programs.)  The first is probably the
biggest advantage, and it's what makes the list the more efficient
choice here.

                    Andrew
                    dalke at dalkescientific.com
