[Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
Ben Hoyt
benhoyt at gmail.com
Mon May 13 14:25:08 CEST 2013
Okay, I've renamed my "BetterWalk" module to "scandir" and updated it
as per our discussion:
https://github.com/benhoyt/scandir/#readme
It's not yet production-ready, and is basically still in the API and
performance testing stage. For instance, the underlying scandir_helper
functions don't even return iterators yet -- they're just glorified
versions of os.listdir() that return an additional d_ino/d_type
(Linux) or stat_result (Windows) for each entry.
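
To illustrate the idea of carrying the extra type information along with each name, here is a rough sketch of an entry object written in Python. It is not the code from the scandir module: the class name SketchDirEntry, its mode_hint parameter, and its method names are all hypothetical, chosen only for this example. The point is that when the OS has already reported the entry type, is_dir() can answer without a further system call, and lstat() is only called (and cached) when no hint is available.

```python
import os
import stat


class SketchDirEntry:
    """Hypothetical entry object: a name plus an optional type hint.

    mode_hint stands in for whatever the platform helper returned
    (for example a file-type value derived from d_type); None means
    the OS gave no usable hint.
    """

    def __init__(self, dirpath, name, mode_hint=None):
        self.name = name
        self._path = os.path.join(dirpath, name)
        self._mode_hint = mode_hint
        self._lstat = None  # filled in lazily, at most once

    def lstat(self):
        # Only hit the filesystem the first time; reuse the result after.
        if self._lstat is None:
            self._lstat = os.lstat(self._path)
        return self._lstat

    def is_dir(self):
        # Prefer the hint from the directory listing: no system call needed.
        if self._mode_hint is not None:
            return stat.S_ISDIR(self._mode_hint)
        # Otherwise fall back to a (cached) lstat() call.
        return stat.S_ISDIR(self.lstat().st_mode)
```

With a hint such as stat.S_IFDIR, is_dir() never touches the filesystem; without one, the first call costs a single lstat() and later calls are free.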
In any case, I really like the API (thanks mostly to Nick Coghlan),
and performance is great, even with DirEntry being written in Python.
PERFORMANCE: On Windows I'm seeing that scandir.walk() on a large test
tree (see benchmark.py) is 8-9 times faster than os.walk(), and on
Linux it's 3-4 times faster. Yes, it is that much faster, and yes,
those numbers are real. :-)
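
For readers who want to see where the saving comes from, the following is a minimal sketch of a tree walk that takes the file type from the directory listing instead of calling stat on every entry. It is an assumption-laden illustration, not scandir.walk() itself: it uses os.scandir() as it later shipped in the standard library (Python 3.5+), the function name sketch_walk is made up for this example, and unlike os.walk() it lists symlinks to directories under files rather than dirs.

```python
import os


def sketch_walk(top):
    """Yield (dirpath, dirnames, filenames) tuples, top-down.

    The directory/file split uses the type information that comes
    back with each directory entry, so on most platforms no extra
    stat call is needed per entry.
    """
    dirs, files = [], []
    with os.scandir(top) as entries:
        for entry in entries:
            # follow_symlinks=False: do not resolve symlinks, so this
            # can usually be answered from the listing alone.
            if entry.is_dir(follow_symlinks=False):
                dirs.append(entry.name)
            else:
                files.append(entry.name)
    yield top, dirs, files
    # Recurse only after the listing for this directory is closed.
    for name in dirs:
        yield from sketch_walk(os.path.join(top, name))
```

On a tree without symlinks this should visit the same directories and files as os.walk(); timing both over a large tree is one way to check the kind of numbers quoted above on your own machine.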
Please critique away. At this stage it'd be most helpful to critique
any API or performance-related issues rather than coding style or
minor bugs, as I'm expecting the code itself will change quite a bit
still.
Todos:
* Make _scandir.scandir_helper functions return real iterators instead of lists
* Move building of DirEntry objects into C module, so basically the
entire scandir() is in C
* Add tests
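
The first todo item (iterators instead of lists) can be pictured with the sketch below. Both function names are hypothetical and neither is the _scandir helper; they are built on the standard library's os.scandir() (Python 3.5+) purely to contrast the two shapes. The list version reads the whole directory before returning anything, while the generator version hands entries back one at a time, so a caller that only needs the first few entries of a huge directory can stop early.

```python
import itertools
import os


def scan_as_list(path):
    """Read the whole directory up front and return a list of
    (name, is_dir) tuples, similar in shape to a listdir()-style helper."""
    with os.scandir(path) as entries:
        return [(e.name, e.is_dir(follow_symlinks=False)) for e in entries]


def scan_as_iterator(path):
    """Yield (name, is_dir) tuples lazily, one entry at a time."""
    with os.scandir(path) as entries:
        for e in entries:
            yield e.name, e.is_dir(follow_symlinks=False)


def first_entries(path, n):
    """Take at most n entries without reading the rest of the directory."""
    return list(itertools.islice(scan_as_iterator(path), n))
```

For example, first_entries(".", 3) returns at most three tuples, in whatever order the OS lists them.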
-Ben