[Python-ideas] BetterWalk, a better and faster os.walk() for Python

Andrew Barnert abarnert at yahoo.com
Fri Nov 23 03:48:36 CET 2012


> From: Robert Collins <robertc at robertcollins.net>
> Sent: Thu, November 22, 2012 4:26:49 PM
> 
> If you want to test cold cache behaviour, see /proc/sys/vm/drop_caches
> 
> -Rob


On a Mac? There's no /proc filesystem on OS X; that's Linux-specific.
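
For cold-cache runs on both platforms, something like the following could pick the right commands. This is only a sketch: drop_cache_steps is a made-up helper name, it is not part of benchmark.py, and it assumes a `purge` utility is available on the Mac. The commands need root, so the function only returns them and does not run them.

```python
import sys


def drop_cache_steps(platform=sys.platform):
    """Return argv lists that flush the OS file cache (to be run as root).

    Hypothetical helper for illustration; it only builds the commands.
    """
    if platform.startswith("linux"):
        # Flush dirty pages first, then ask the kernel to drop the page
        # cache plus dentries and inodes.
        return [["sync"], ["sh", "-c", "echo 3 > /proc/sys/vm/drop_caches"]]
    if platform == "darwin":
        # OS X has no /proc; assume the `purge` utility is installed.
        return [["sync"], ["purge"]]
    raise NotImplementedError("no known cache-drop command for %r" % platform)
```

Each returned list could be handed to subprocess.check_call() between benchmark repeats.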

> On Fri, Nov 23, 2012 at 1:05 PM, Andrew Barnert <abarnert at yahoo.com> wrote:
> > From: Ben Hoyt <benhoyt at gmail.com>
> > Sent: Thu, November 22, 2012 2:44:02 PM
> >
> >
> >> > I tested on OS X 10.8.2 on a Retina MBP 15" with 16GB and the stock SSD,
> >> > using Apple 2.6 and 2.7 and python.org 3.3. It seems to be a bit slower
> >> > in 2.x, a bit faster in 3.x, more so in 32-bit mode, and better without
> >> > -s. The best result I got anywhere was 1.5x (3.3, 32-bit, no -s), but
> >> > repeating that test gave anywhere from 1.2x to 1.5x.
> >>
> >> Yeah, that's about what I'm seeing on Linux and OS X. (Though for some
> >> weird reason I'm seeing 10x as fast on OS X when I do "python
> >> benchmark.py /usr" -- hence my comments in the README.)
> >
> > I get exactly 1.0x on this test with 2.6 and 2.7, 1.3x with 3.3, 1.4x with
> > 32-bit 3.3. Any chance your /usr has a symlink to a remote or otherwise slow
> > or non-HFS+ filesystem? Is that worth testing?
> >
> > Also, the -s version seems to fail on dangling symlinks:
> >
> > $ python2.7 benchmark.py -s /usr
> > Priming the system's cache...Benchmarking walks on /usr, repeat 1/3...
> > Traceback (most recent call last):
> >   File "benchmark.py", line 121, in <module>
> >     main()
> >   File "benchmark.py", line 118, in main
> >     benchmark(tree_dir, get_size=options.size)
> >   File "benchmark.py", line 83, in benchmark
> >     os_walk_time = min(os_walk_time, timeit.timeit(do_os_walk, number=1))
> >   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 228, in timeit
> >     return Timer(stmt, setup, timer).timeit(number)
> >   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 194, in timeit
> >     timing = self.inner(it, self.timer)
> >   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/timeit.py", line 100, in inner
> >     _func()
> >   File "benchmark.py", line 57, in do_os_walk
> >     size += os.path.getsize(fullname)
> >   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/genericpath.py", line 49, in getsize
> >     return os.stat(filename).st_size
> > OSError: [Errno 2] No such file or directory: '/usr/local/Cellar/gfortran/4.2.4-5666.3/lib/gcc/i686-apple-darwin11/4.2.1/include/ppc_intrinsics.h'
> >
> > $ readlink /usr/local/Cellar/gfortran/4.2.4-5666.3/lib/gcc/i686-apple-darwin11/4.2.1/include/ppc_intrinsics.h
> > ../../../../../include/gcc/darwin/4.2/ppc_intrinsics.h
> > $ ls /usr/local/Cellar/gfortran/4.2.4-5666.3/lib/gcc/i686-apple-darwin11/4.2.1/include/../../../../../include/gcc/darwin/4.2/ppc_intrinsics.h
> > ls: /usr/local/Cellar/gfortran/4.2.4-5666.3/lib/gcc/i686-apple-darwin11/4.2.1/include/../../../../../include/gcc/darwin/4.2/ppc_intrinsics.h: No such file or directory
> 
> 
> 
> -- 
> Robert Collins <rbtcollins at hp.com>
> Distinguished Technologist
> HP Cloud Services
> 
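
The getsize failure in the quoted traceback comes from os.stat() following a symlink whose target is missing. One way a size-summing walk could tolerate that is to catch OSError around the stat call. This is a minimal sketch, not the actual benchmark.py code; tree_size is a hypothetical name, and skipping the entry (rather than, say, counting the link itself via os.lstat) is just one possible policy.

```python
import os


def tree_size(top):
    """Sum file sizes under top, skipping entries that cannot be stat'ed.

    Returns (total_bytes, skipped_paths). Hypothetical helper for
    illustration only.
    """
    total = 0
    skipped = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            fullname = os.path.join(dirpath, name)
            try:
                # getsize() calls os.stat(), which follows symlinks and
                # raises OSError (ENOENT) when the target does not exist.
                total += os.path.getsize(fullname)
            except OSError:
                skipped.append(fullname)
    return total, skipped
```

Returning the skipped paths makes it easy to report dangling links after the run instead of aborting in the middle of a timing loop.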


