Scanning through Directory

Michael Peuser mpeuser at web.de
Wed Aug 13 08:12:14 EDT 2003


Hi Anand,

os.path.walk is just a convenience function around listdir, although a
*very* convenient one that lets you write quite readable scanning programs.
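
To illustrate how readable such a scanner can be, here is a minimal sketch.
It assumes a current Python, where the callback-based os.path.walk no longer
exists, so it uses the generator os.walk as a stand-in. The name scan_tree
is hypothetical and not part of any library.

```python
import os


def scan_tree(root):
    """Return (file_count, total_bytes) for all files below root."""
    file_count = 0
    total_bytes = 0
    # os.walk yields one (dirpath, dirnames, filenames) triple per directory.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                total_bytes += os.stat(full_path).st_size
            except OSError:
                # The file vanished or is not accessible; skip it.
                continue
            file_count += 1
    return file_count, total_bytes


if __name__ == "__main__":
    count, size = scan_tree(os.curdir)
    print("%d files, %d bytes" % (count, size))
```

Note that this still calls os.stat once per file, which is the same pattern
as the listdir approach discussed in this thread.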

The timing on Windows 2000 is somewhat complicated, as there seems to be
some OS-internal caching, so actual reads from the drive are sometimes
suppressed. Even the drive's internal cache might influence the situation:

I can give you the following benchmark:
40 GB 2'' laptop disk, 8 GB in 120,000 files

                      initial    follow-up
win32api.FindFiles:    43 s        3.5 s
walk or listdir:       85 s       42 s

This means that in one-shot cases the speed-up (2x) does not look so
tremendous; it will increase with faster disks (and decrease with faster
processors). However, the potential speed-up in repeated scans is
tremendous (42 s versus 3.5 s, about 12x).
My initial question, how this could be possible, is still valid, because the
implementation of listdir will eventually use the API function FindFiles....
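
The initial versus follow-up measurement can be reproduced along these
lines. This is only a sketch under assumptions: count_files and time_passes
are hypothetical names, the scan is the portable listdir-plus-stat variant,
and the first pass is only a true "initial" run if the OS and drive caches
are cold (for example right after a reboot).

```python
import os
import stat
import time


def count_files(root):
    """Count regular files below root with os.listdir and os.lstat."""
    count = 0
    pending = [root]
    while pending:
        directory = pending.pop()
        try:
            names = os.listdir(directory)
        except OSError:
            continue  # unreadable directory; skip it
        for name in names:
            path = os.path.join(directory, name)
            try:
                # lstat does not follow symlinks, so linked directories
                # are not descended into and cannot cause loops.
                mode = os.lstat(path).st_mode
            except OSError:
                continue
            if stat.S_ISDIR(mode):
                pending.append(path)
            elif stat.S_ISREG(mode):
                count += 1
    return count


def time_passes(root, passes=2):
    """Return a list of (seconds, file_count), one entry per pass."""
    results = []
    for _ in range(passes):
        start = time.perf_counter()
        n = count_files(root)
        results.append((time.perf_counter() - start, n))
    return results


if __name__ == "__main__":
    for i, (seconds, n) in enumerate(time_passes(os.curdir)):
        label = "initial" if i == 0 else "follow-up"
        print("%-9s %8.2f s  (%d files)" % (label, seconds, n))
```

Comparing the two lines of output shows how much of the scan time is disk
access and how much is per-file overhead in the interpreter and the OS.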


Kindly
Michael P

"Anand Pillai" <pythonguy at Hotpop.com> wrote in message
news:84fc4588.0308130217.4e64c3a at posting.google.com...
> Did you try os.path.walk() in your experiments?
> If you did, you could post the profile data for it too.
> I find it one of the most useful functions defined in
> the os.path module.
>
> Thanks!
>
> Anand
>
> "Michael Peuser" <mpeuser at web.de> wrote in message
> news:<bhalk1$qlv$06$1 at news.t-online.com>...
> > There is a portable way of reading directories
> >    os.listdir
> > in conjunction with
> >   os.stat and a little help from os.path.split and os.path.join
> > Though using os.stat gives programs a somewhat clumsy look,
> > I had not been too annoyed with it until I tried to scan a whole disk.
> > Reworking the join and split operations already gave a considerable
> > speed-up.
> >
> > As I need it for Windows, I then switched, with very little effort, to:
> >    win32api.FindFiles
> > (I already had used win32api.GetDiskFreeSpace)
> >
> > This was a breakthrough: 5 secs instead of 50+ !
> >
> > os.listdir seems to do little more than win32api.FindFiles.
> > How can it be so slow???
> > Why is there no portable way of getting "DiskFreeSpace" info?
> > Questions and Questions....
> >
> > I strongly recommend the win32api calls if you have similar applications.
> > The interface is extremely simple and the only penalty is
> > non-compatibility ;-)
> >
> > I also recommend using the Python profiler from time to time for programs
> > running more than one second. It might give you interesting insights...
> >
> > Kindly
> > Michael Peuser
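
The profiler advice quoted above could be applied to a directory scan
roughly as follows. This is a sketch, not the setup used for the numbers in
this thread: it assumes the cProfile and pstats modules of a current Python,
and walk_and_stat and profile_scan are hypothetical names.

```python
import cProfile
import io
import os
import pstats


def walk_and_stat(root):
    """Stat every file below root; return the number of files seen."""
    seen = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                os.stat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is not accessible
            seen += 1
    return seen


def profile_scan(root, top=10):
    """Profile walk_and_stat(root); return (file_count, report_text)."""
    profiler = cProfile.Profile()
    # runcall executes the function under the profiler and returns its result.
    seen = profiler.runcall(walk_and_stat, root)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and print only the `top` most expensive entries.
    stats.sort_stats("cumulative").print_stats(top)
    return seen, buffer.getvalue()


if __name__ == "__main__":
    count, report = profile_scan(os.curdir)
    print("%d files" % count)
    print(report)
```

In the report, the relative cost of the stat and path-joining calls shows
whether the time goes into the per-file calls or into the directory listing
itself.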
