Amount of files on a drive?

Robin Munn rmunn at pobox.com
Thu Feb 7 11:21:09 EST 2002


On 6 Feb 2002 04:57:54 GMT, Philip Swartzleonard <starx at pacbell.net> wrote:
>G. Willoughby || Tue 05 Feb 2002 07:56:57a:
>
>> i am using this at the minute:
>> ------------------------------------------------------------------
>> import win32api
>> import string
>> import os
>> 
>> def countFiles(arg, dir, files):
>>     for file in files:
>>         count=count+1
>> 
>> fileListings=[]
>> driveList=string.split(win32api.GetLogicalDriveStrings(),'\0')[:-1]
>> 
>> for drive in driveList:
>>     os.path.walk(drive, countFiles, 0)
>> ------------------------------------------------------------------
>> but its pretty slow any other suggestion how i could speed it up??
>
>Well, first determine how long it actually takes (time it), and see how it 
>compares to say, selecting everything on the drive and hitting properties 
>in the context menu to get a count (i.e. see how good windows itself is at 
>it). It took my 800mhz athlon about 35 seconds to tell me that i have 6.15 
>gigs in 101,064 files in 3,930 folders on my C drive. Counting on that 
>scale can get slow.

Also, the bottleneck here is going to be the drive's access time, not
the processor speed. The access time is usually measured in milliseconds
and consists mostly of the average time for the drive's read/write head
to move from one location to another on the disk surface (plus the wait
for the platter to rotate the data under the head). The head movement is
called a "seek". Since the drive has to perform at least one seek for
every file, and usually more, the total time spent waiting for the hard
drive far outweighs the time spent by the processor.
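
If you want to follow Philip's advice and time it, something like the
following would do. This is only a rough sketch: it assumes a current
Python where os.walk() and time.perf_counter() exist (os.path.walk() is
gone there), and the names count_files() and timed_count() as well as the
throwaway directory tree are made up for illustration.

```python
import os
import tempfile
import time

def count_files(root):
    """Count the non-directory entries that os.walk() finds under root."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total

def timed_count(root):
    """Return (file count, elapsed wall-clock seconds) for one walk of root."""
    start = time.perf_counter()
    total = count_files(root)
    return total, time.perf_counter() - start

# Demo on a small temporary tree; point timed_count() at a real drive
# root (e.g. "C:\\") to measure the case discussed in this thread.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for name in ("a.txt", "b.txt", os.path.join("sub", "c.txt")):
        open(os.path.join(root, name), "w").close()
    total, elapsed = timed_count(root)
    print("%d files in %.3f seconds" % (total, elapsed))
```

Run it twice on the same drive and the second pass will probably be
faster, since the operating system tends to cache directory data after
the first walk.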

On the other hand, I do see one thing you could do to improve your code.
Change the countFiles() function thus:

------------------------------------------------------------------
def countFiles(arg, dir, files):
    count = count + len(files)
------------------------------------------------------------------

Also, I presume you're using a global variable to hold your count,
unlike the trimmed-down example code you showed us. In the code you
showed us above, the assignment makes count a variable local to
countFiles(), so it could not hold its value across separate invocations
of countFiles(); worse, the first call would fail with an
UnboundLocalError, because count is read before it is ever assigned.
Instead, you would do:

------------------------------------------------------------------
count = 0

def countFiles(arg, dir, files):
    global count
    count = count + len(files)
------------------------------------------------------------------

The "global count" statement inside countFiles() specifies that from
here on, the name count will refer not to a variable local to
countFiles() but instead to the global variable of that name.

Gee, this is reminding me of that student blooper (probably an urban
legend) that went, "The Iliad was not written by Homer, but by another
man of that name." :)
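
Back to the counter: here is a self-contained sketch of the global-count
pattern that you can run as-is. It is an illustration, not a drop-in
replacement for your script. It assumes os.walk() in place of
os.path.walk(), so countFiles() takes only the directory name and the
file list, and walk_and_count() and the temporary tree are hypothetical
helpers added for the demo.

```python
import os
import tempfile

count = 0  # module-level running total

def countFiles(dirname, files):
    # Without this declaration, the assignment below would make count a
    # local name and the function would raise UnboundLocalError.
    global count
    count = count + len(files)

def walk_and_count(root):
    """Feed every directory under root to countFiles(); return the total so far."""
    # os.walk() lists files separately from subdirectories, so only
    # non-directory entries are counted here.
    for dirpath, _dirnames, filenames in os.walk(root):
        countFiles(dirpath, filenames)
    return count

# Demo: three files spread over two directories.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for name in ("a.txt", "b.txt", os.path.join("sub", "c.txt")):
        open(os.path.join(root, name), "w").close()
    print(walk_and_count(root))
```

Because count lives at module level, calling walk_and_count() once per
drive keeps adding to the same total.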

-- 
Robin Munn
rmunn at pobox.com


