[Python-ideas] Speed up os.walk() 5x to 9x by using file attributes from FindFirst/NextFile() and readdir()

Jim Jewett jimjjewett at gmail.com
Thu Nov 15 06:53:39 CET 2012


On 11/15/12, Mike Meyer <mwm at mired.org> wrote:
> On Nov 14, 2012 10:19 PM, "Jim Jewett" <jimjjewett at gmail.com> wrote:

> Note that you're eliding the proposal these questions were about, that
> os.iterdir return some kind of object that had attributes that carried the
> stat values, or None if they weren't available.

>> Or are you saying that you want to distinguish between "This
>> filesystem doesn't track that information", "This process couldn't get
>> that information right now", and "That particular piece of information
>> requires a second call that hasn't been made yet"?

> I want to distinguish between the case where st_mode is filled from the
> BSD/Unix d_type directory entry - meaning there is information so st_mode
> is not None, but the information is incomplete and requires a second system
> call to fetch - and the case where it's filled via the Windows calls which
> provide all the information that is available for st_mode, so no second
> system call is needed.

So you're basically saying that you want to know whether an explicit
stat call would make a difference?  (Other than updating the
information if it has changed.)
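
A rough sketch of that distinction, as I understand it. The names
(ModeInfo, from_d_type, from_full_stat, stat_would_help) are made up
for illustration and are not part of any proposed API. The point is
only that a type-only st_mode cannot be told apart from a real one
with all permission bits cleared, so the record needs an explicit flag.

```python
import stat
from collections import namedtuple

# Hypothetical record: st_mode plus a flag saying whether it is the full value.
ModeInfo = namedtuple("ModeInfo", ["st_mode", "complete"])


def from_d_type(file_type_bits):
    """Build a record when only the file-type bits are known (the d_type case)."""
    # Keep only the type bits; the permission bits are unknown, so they stay 0.
    return ModeInfo(stat.S_IFMT(file_type_bits), complete=False)


def from_full_stat(st_mode):
    """Build a record when the directory read already supplied everything."""
    return ModeInfo(st_mode, complete=True)


def stat_would_help(info):
    """True if an explicit stat call could add information to this record."""
    return info is None or not info.complete
```

With this, a caller that only needs "is it a directory?" never asks
stat_would_help() at all, while one that needs permission bits does.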

>> > 2) How about making these attributes properties, so that touching one
>> > that isn't there causes them all to be populated.
>> Part of the motivation was to minimize extra system calls; that
>> suggests making another one should be a function call instead of a
>> property.

> Except that I don't see that there's anything to do once you've found a
> None-valued attribute *except* make that extra call.

Ah.  I was thinking of reporting, where you could just leave a column
off the report.

Or of some sort of optimization, where knowing the size (or last
change date) is not required, but may be helpful.  I suppose these
cases are uncommon enough that triggering the extra stat call,
instead of returning None, might be justified.
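
For concreteness, a minimal sketch of the property approach, assuming
a hypothetical LazyEntry class (not an existing os API) whose
mode_hint argument stands in for whatever the directory read handed
over for free.  Touching an attribute that was not supplied triggers
one lstat call, and the result is cached.

```python
import os
import stat


class LazyEntry:
    """Hypothetical directory entry that stats itself only on demand."""

    def __init__(self, dirpath, name, mode_hint=None):
        self.name = name
        self.path = os.path.join(dirpath, name)
        # File-type bits supplied by the directory read, or None if none were.
        self._mode_hint = mode_hint
        self._stat = None
        self.stat_calls = 0  # counts the extra system calls actually made

    def _full_stat(self):
        # Make the extra call at most once, then reuse the cached result.
        if self._stat is None:
            self._stat = os.lstat(self.path)
            self.stat_calls += 1
        return self._stat

    def is_dir(self):
        # Answerable from the hint alone, so no extra call when it is present.
        if self._mode_hint is not None:
            return stat.S_ISDIR(self._mode_hint)
        return stat.S_ISDIR(self._full_stat().st_mode)

    @property
    def st_size(self):
        # Never available from the hint, so this always needs the full stat.
        return self._full_stat().st_size
```

The cost is that an innocent-looking attribute access can hide a
system call, which is the objection to properties raised earlier.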

-jJ


