scandir slower than listdir
Steve D'Aprano
steve+python at pearwood.info
Thu Jul 20 07:43:02 EDT 2017
On Thu, 20 Jul 2017 03:33 pm, Torsten Bronger wrote:
> Hello!
>
> With a directory of 24,000 files on an SSD running Ubuntu,
>
> #!/usr/bin/python3
>
> import os, time
>
>
> start = time.time()
> list(os.listdir("/home/bronger/.saves"))
> print("listdir:", time.time() - start)
>
> start = time.time()
> list(os.scandir("/home/bronger/.saves"))
> print("scandir:", time.time() - start)
>
> yields
>
> listdir: 0.045470237731933594
> scandir: 0.08043360710144043
>
> However, scandir is supposed to be faster than listdir. Why do I
> see this?
The documentation says:
"Using scandir() instead of listdir() can significantly increase the performance
of code that ALSO NEEDS FILE TYPE OR FILE ATTRIBUTE INFORMATION"
[emphasis added]
https://docs.python.org/3.5/library/os.html#os.scandir
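If all you need is the names, listdir() is faster because it only returns the
names. scandir() returns an iterator of os.DirEntry objects, and building one
object per entry costs more than building a plain string. Each DirEntry may
include cached values for:
- the name (entry.name)
- full path (entry.path)
- flag whether it is a directory (entry.is_dir())
- flag whether it is a file (entry.is_file())
- flag whether it is a symlink (entry.is_symlink())
- inode number (entry.inode())
- file stat record (entry.stat())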
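
To see where scandir() is meant to pay off, compare the two calls on a task
that needs the file type as well as the name. The following is only a rough
sketch: the function names count_dirs_listdir and count_dirs_scandir are made
up for illustration, and using scandir() in a "with" statement assumes
Python 3.6 or later.

```python
import os


def count_dirs_listdir(path):
    # listdir() gives names only, so each os.path.isdir() check has to
    # ask the operating system about the entry again.
    return sum(
        1
        for name in os.listdir(path)
        if os.path.isdir(os.path.join(path, name))
    )


def count_dirs_scandir(path):
    # DirEntry.is_dir() can often answer from information gathered while
    # the directory was scanned, avoiding an extra system call per entry.
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir())
```

Both functions should return the same count. Timing them against the same
large directory, rather than timing a bare listing, is the comparison the
documentation has in mind.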
--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.