2015-02-13 11:52 GMT+01:00 Serhiy Storchaka <storchaka@gmail.com>:
> You can try to make Python implementation faster if
> 1) Don't set attributes to None in constructor.
The class uses __slots__. Setting attributes in the class body is forbidden when __slots__ is used.
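A minimal sketch of that restriction, using hypothetical stand-in classes (Entry and SlotEntry are illustrative names, not the real DirEntry code). It only shows that a class-level default cannot share a name with a slot, and that a slot left unassigned in the constructor simply has no value:

```python
def define_with_class_default():
    """Try to give a slot a class-level default; return the error text, if any."""
    try:
        class Entry:  # hypothetical stand-in for DirEntry
            __slots__ = ("name", "path")
            name = None  # class attribute with the same name as a slot
    except ValueError as exc:
        # Python rejects the class definition itself.
        return str(exc)
    return None


class SlotEntry:  # hypothetical stand-in for DirEntry
    __slots__ = ("name", "path")

    def __init__(self, name):
        self.name = name  # 'path' is intentionally left unassigned


conflict_message = define_with_class_default()
entry = SlotEntry("spam.txt")
# Reading an unassigned slot raises AttributeError, so hasattr() is False.
path_is_set = hasattr(entry, "path")

print(conflict_message)
print(path_is_set)
```

So skipping the None assignments in the constructor is possible, but every reader of such an attribute then has to cope with it being missing.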
> 3) Or pass DirEntry to _scandir:
>
> def scandir(path):
>     yield from _scandir(path, DirEntry)
I implemented that and there is no major change (1.3x faster => 1.5x faster; it's still far from the 3.5x speedup seen with the C implementation).

I analyzed the numbers (on my desktop PC, HDD, ext4):

- readdir: 380 ns
- os.stat: 1500 ns
- DirEntry (C): 100 ns
- DirEntry (Py): 530 ns (5.3x slower)
- is_dir (C): 75 ns
- is_dir (Py): 260 ns (3.5x slower)

The listdir+stat benchmark takes (readdir + stat) nanoseconds.
The scandir+is_dir benchmark takes (readdir + DirEntry + is_dir) nanoseconds.

=> scandir+is_dir is faster than listdir+stat if (DirEntry + is_dir) is faster than (stat).

Calling os.stat takes 1500 ns, while readdir() provides only the information required by the benchmark. So if DirEntry + DirEntry.is_dir is faster than 1500 ns, we win :-)

The Python implementation takes 790 ns, but the C implementation takes only 175 ns! (4.5x faster)

I don't think that any Python performance trick can reduce the Python overhead enough to make the C+Python implementation interesting compared to os.listdir+os.stat. We are talking about nanoseconds; Python cannot beat C at this resolution.

Victor
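The arithmetic can be spelled out as a small script. This is only a sketch: it assumes the per-entry costs simply add up, which is a simplification, and the variable and function names are made up for illustration. The timings are the ones quoted in this message.

```python
# Per-call timings in nanoseconds, copied from the measurements in this message.
READDIR = 380
OS_STAT = 1500
DIRENTRY = {"C": 100, "Py": 530}  # cost of creating one DirEntry
IS_DIR = {"C": 75, "Py": 260}     # cost of one DirEntry.is_dir() call


def scandir_cost(impl):
    """Estimated cost per entry of scandir+is_dir (additive model, an assumption)."""
    return READDIR + DIRENTRY[impl] + IS_DIR[impl]


# Estimated cost per entry of os.listdir + os.stat under the same model.
listdir_stat = READDIR + OS_STAT

for impl in ("C", "Py"):
    overhead = DIRENTRY[impl] + IS_DIR[impl]
    total = scandir_cost(impl)
    print(f"{impl}: overhead {overhead} ns, total {total} ns, "
          f"{listdir_stat / total:.1f}x faster than listdir+stat")
```

Under this model the predicted speedups are about 3.4x (C) and 1.6x (Python), in the same range as the measured 3.5x and 1.5x.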