[Python-ideas] os.listdir iteration support
Aahz
aahz at pythoncraft.com
Sun Nov 25 01:29:17 CET 2007
On Fri, Nov 23, 2007, Giampaolo Rodola' wrote:
>
> Surely it's a rather specific use case, but it is one of the tasks
> which takes the longest amount of time on an FTP server. 20,000 is
> probably an exaggerated hypothetical situation, so I did a simple test
> with a more realistic scenario.
> On Windows a very crowded directory is C:\windows\system32. Currently
> the C:\windows\system32 of my Windows XP workstation contains 2201
> files.
> I tried to run the code below, which is how an FTP server should
> properly respond to a "LIST" command issued by a client.
> It took 1.70300006866 seconds to complete the first time and
> 0.266000032425 seconds the second time.
Your code calls os.stat() on each file. I know from past experience
that os.stat() is *extremely* expensive. Because os.listdir() runs at C
speed, it only gets slow when run against hundreds of thousands of
entries.

(One directory on a work server has over 200K entries, and os.listdir()
takes about twenty seconds on it. I believe that switching from ext3 to
a more appropriate filesystem would reduce that.)
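
As a rough illustration of the os.stat() point, here is a minimal sketch
that times a bare os.listdir() separately from a per-entry os.stat()
pass. It is written for modern Python 3 and is not the code from the
quoted message; the function names time_listdir, time_stat_each, and
compare are hypothetical, and the timings will vary widely with the
filesystem and with cache state.

```python
import os
import tempfile
import time


def time_listdir(path):
    """Return (names, seconds) for a single os.listdir() call."""
    start = time.perf_counter()
    names = os.listdir(path)
    return names, time.perf_counter() - start


def time_stat_each(path, names):
    """Call os.stat() on every entry; return (successful_calls, seconds)."""
    start = time.perf_counter()
    ok = 0
    for name in names:
        try:
            os.stat(os.path.join(path, name))
            ok += 1
        except OSError:
            # Entry vanished or is a broken symlink; skip it.
            pass
    return ok, time.perf_counter() - start


def compare(path):
    """Time the listing and the stat pass separately for one directory."""
    names, listdir_secs = time_listdir(path)
    stat_ok, stat_secs = time_stat_each(path, names)
    return {
        "entries": len(names),
        "stat_ok": stat_ok,
        "listdir_secs": listdir_secs,
        "stat_secs": stat_secs,
    }


if __name__ == "__main__":
    # Assumption: a throwaway directory of empty files stands in for a
    # crowded real directory such as the one described above.
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(2000):
            open(os.path.join(tmp, "file%04d.txt" % i), "w").close()
        print(compare(tmp))
```

Running it twice against the same real directory should also show the
warm-cache effect visible in the two timings quoted above.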
> I don't know whether such a specific use case could justify adding
> generator support for listdir to the stdlib, but having something like
> Greg Ewing's opendirs module could have saved a lot of time in this
> specific case.
Doubtful, since the time goes to the os.stat() calls rather than to
os.listdir() itself.
--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/
"Typing is cheap. Thinking is expensive." --Roy Smith