[Python-ideas] BetterWalk, a better and faster os.walk() for Python

Robert Collins robertc at robertcollins.net
Fri Nov 23 01:26:48 CET 2012


If you want to test cold cache behaviour, see /proc/sys/vm/drop_caches
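
A minimal sketch of how a benchmark could do that before each cold run, assuming Linux, root privileges, and Python 3.3+ for os.sync(). The helper name drop_caches and its path parameter are made up for illustration and are not part of benchmark.py:

```python
import os

def drop_caches(mode=3, path="/proc/sys/vm/drop_caches"):
    """Ask the Linux kernel to drop its clean caches (needs root).

    mode: 1 = page cache, 2 = dentries and inodes, 3 = both.
    `path` is a parameter only so the helper can be tried on a scratch file.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2 or 3")
    # Flush dirty pages first: only clean cache entries can be dropped.
    os.sync()
    with open(path, "w") as f:
        f.write("%d\n" % mode)
```

Calling drop_caches() as root right before the timed walk should give a cold-cache number. For a directory walk the dentry and inode caches matter most, so mode 2 or 3 is the relevant choice.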

-Rob

On Fri, Nov 23, 2012 at 1:05 PM, Andrew Barnert <abarnert at yahoo.com> wrote:
> From: Ben Hoyt <benhoyt at gmail.com>
> Sent: Thu, November 22, 2012 2:44:02 PM
>
>
>> > I tested on OS X 10.8.2 on a Retina MBP 15" with 16GB and the stock SSD,
>> > using Apple 2.6 and 2.7 and python.org 3.3. It seems to be a bit slower in
>> > 2.x, a bit faster in 3.x, more so in 32-bit mode, and better without -s.
>> > The best result I got anywhere was 1.5x (3.3, 32-bit, no -s), but
>> > repeating that test gave anywhere from 1.2x to 1.5x.
>>
>> Yeah, that's about what I'm seeing on Linux and OS X. (Though for some
>> weird reason I'm seeing 10x as fast on OS X when I do "python
>> benchmark.py /usr" -- hence my comments in the README.)
>
> I get exactly 1.0x on this test with 2.6 and 2.7, 1.3x with 3.3, 1.4x with
> 32-bit 3.3. Any chance your /usr has a symlink to a remote or otherwise slow or
> non-HFS+ filesystem? Is that worth testing?
>
> Also, the -s version seems to fail on dangling symlinks:
>
> $ python2.7 benchmark.py -s /usr
> Priming the system's cache...Benchmarking walks on /usr, repeat 1/3...
> Traceback (most recent call last):
>   File "benchmark.py", line 121, in <module>
>     main()
>   File "benchmark.py", line 118, in main
>     benchmark(tree_dir, get_size=options.size)
>   File "benchmark.py", line 83, in benchmark
>     os_walk_time = min(os_walk_time, timeit.timeit(do_os_walk, number=1))
>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 228, in timeit
>     return Timer(stmt, setup, timer).timeit(number)
>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 194, in timeit
>     timing = self.inner(it, self.timer)
>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 100, in inner
>     _func()
>   File "benchmark.py", line 57, in do_os_walk
>     size += os.path.getsize(fullname)
>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/genericpath.py", line 49, in getsize
>     return os.stat(filename).st_size
> OSError: [Errno 2] No such file or directory: '/usr/local/Cellar/gfortran/4.2.4-5666.3/lib/gcc/i686-apple-darwin11/4.2.1/include/ppc_intrinsics.h'
>
>
> $ readlink /usr/local/Cellar/gfortran/4.2.4-5666.3/lib/gcc/i686-apple-darwin11/4.2.1/include/ppc_intrinsics.h
> ../../../../../include/gcc/darwin/4.2/ppc_intrinsics.h
> $ ls /usr/local/Cellar/gfortran/4.2.4-5666.3/lib/gcc/i686-apple-darwin11/4.2.1/include/../../../../../include/gcc/darwin/4.2/ppc_intrinsics.h
> ls: /usr/local/Cellar/gfortran/4.2.4-5666.3/lib/gcc/i686-apple-darwin11/4.2.1/include/../../../../../include/gcc/darwin/4.2/ppc_intrinsics.h: No such file or directory
>

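For the dangling-symlink failure in the traceback quoted above, one possible fix is to catch the OSError around os.path.getsize() and skip the entry. This is only a sketch under that assumption: tree_size is a hypothetical stand-in for the do_os_walk helper in benchmark.py, whose real body is not shown in this thread.

```python
import os

def tree_size(top):
    """Sum file sizes under `top`, skipping entries that cannot be stat'ed.

    Returns (total_bytes, skipped_paths).
    """
    total = 0
    skipped = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            fullname = os.path.join(dirpath, name)
            try:
                # getsize() follows symlinks, so a dangling link raises OSError.
                total += os.path.getsize(fullname)
            except OSError:
                skipped.append(fullname)
    return total, skipped
```

Using os.lstat(fullname).st_size instead would also avoid the error, but it reports the size of the link itself rather than its target, so the totals would differ from what the original benchmark measures.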


-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Cloud Services


