[Distutils] buildout/setuptools slow as it scans the whole project dir

Reinout van Rees reinout at vanrees.org
Wed Apr 15 17:01:19 CEST 2015


Reinout van Rees wrote on 15-04-15 at 16:26:
> Setuptools seems to run all "egg_info.writers" entry points it can 
> find in the piece of code where the slowness occurs. So any of the 
> entry points could be the culprit. And perhaps even an entry point 
> outside of setuptools, due to the way entry points work.
>
> I'll try to debug further.
Ok, the egg_info.writers entry points are innocent.

A couple of debug "print" statements later I found that 
setuptools/command/egg_info.py's manifest_maker class calls "findall()",
which is the monkeypatched version from setuptools/__init__.py.

This does an "os.walk" through the entire project directory to create a 
full and complete list of files. Inside the os.walk() loop, it calls 
os.path.isfile() on every item.
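
To make that concrete, the behaviour boils down to roughly the following. 
This is a simplified sketch and not the actual setuptools source; the name 
findall_sketch is made up for illustration.

```python
import os


def findall_sketch(top=os.curdir):
    """Return the path of every regular file below `top`.

    Hypothetical stand-in for the monkeypatched findall(); it only
    mirrors the os.walk() + os.path.isfile() pattern described above.
    """
    all_files = []
    for base, dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(base, name)
            # One extra stat() call per directory entry. On a slow
            # filesystem this per-file cost is what adds up.
            if os.path.isfile(path):
                all_files.append(path)
    return all_files
```

Note that nothing is pruned here: every directory below the project root 
is visited, whether or not its files end up in the distribution.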

This takes a long time! 90%+ of the time is spent in parts/omelette/. 
And most of that in parts/omelette/django's localization folders...
Of course it also goes through the var/ folder, which might store all 
sorts of files.
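
For anyone who wants to check where the time goes in their own buildout, 
a small script along these lines should do. It is only a sketch: the 
function name time_walk_per_subdir is made up, and it repeats the same 
os.walk() plus os.path.isfile() pattern per top-level directory instead 
of calling into setuptools.

```python
import os
import time


def time_walk_per_subdir(project_dir=os.curdir):
    """Map each top-level subdirectory to (file_count, seconds).

    Hypothetical helper: walks every top-level directory separately
    and times the walk, so slow directories stand out.
    """
    results = {}
    for entry in sorted(os.listdir(project_dir)):
        path = os.path.join(project_dir, entry)
        if not os.path.isdir(path):
            continue  # only directories are timed
        start = time.time()
        count = 0
        for base, dirs, files in os.walk(path):
            for name in files:
                if os.path.isfile(os.path.join(base, name)):
                    count += 1
        results[entry] = (count, time.time() - start)
    return results
```

Calling time_walk_per_subdir() from the buildout directory and sorting 
the result by the seconds value lists the most expensive directories 
first. Repeated runs may be faster because of filesystem caching.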


Now... this is before it actually tries to include/exclude/graft stuff 
per the MANIFEST.in instructions. It just goes through the entire project.


Why is it so slow? Probably because I'm running it on OSX inside a 
vmware VM (running ubuntu). Probably the disk access via the 
virtualization layer is slow.
I'm not completely satisfied yet, as sometimes it is also slow on the 
server. (Also on VMs, but normally the performance hit isn't noticeable).



So.... it almost seems as if there's no solution to this?

Or can someone give a hint on os.walk performance relative to VMs?


Reinout

-- 
Reinout van Rees                          http://reinout.vanrees.org/
reinout at vanrees.org                   http://www.nelen-schuurmans.nl/
"Learning history by destroying artifacts is a time-honored atrocity"



