
shutil.copy*() use copyfileobj():
"""
while 1:
    buf = fsrc.read(length)
    if not buf:
        break
    fdst.write(buf)
"""
This allocates and frees a lot of buffers, and could be optimized with
readinto(). Unfortunately, I don't think we can change copyfileobj(),
because it might be passed objects that don't implement readinto().

By implementing it directly in copyfile() (it would probably be better to
expose it in shutil to make it available to tarfile & Co), there's a modest
improvement:

$ dd if=/dev/zero of=/tmp/foo bs=1M count=100

Without patch:
$ ./python -m timeit -s "import shutil" "shutil.copyfile('/tmp/foo', '/dev/null')"
10 loops, best of 3: 218 msec per loop

With readinto():
$ ./python -m timeit -s "import shutil" "shutil.copyfile('/tmp/foo', '/dev/null')"
10 loops, best of 3: 202 msec per loop

(I'm using /dev/null as target because my hdd is really slow; other
benchmarks are welcome, just beware that /tmp might be tmpfs.)

I've also written a dirty patch to use sendfile(). Here, the improvement is
really significant:

With sendfile():
$ ./python -m timeit -s "import shutil" "shutil.copyfile('/tmp/foo', '/dev/null')"
100 loops, best of 3: 5.39 msec per loop

Thoughts?

cf
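For reference, the readinto() idea being discussed could be sketched roughly like this (a minimal sketch, not the actual patch attached later in the thread — the function name is made up here):

```python
def copyfile_readinto(fsrc, fdst, length=16 * 1024):
    """Copy data from fsrc to fdst, reusing a single buffer.

    Instead of allocating a fresh bytes object on every read(),
    fill one preallocated bytearray and write out only the
    filled slice -- this is what avoids the alloc/free churn
    of the read()/write() loop quoted above.
    """
    buf = bytearray(length)
    view = memoryview(buf)
    while True:
        n = fsrc.readinto(buf)
        if not n:
            break
        fdst.write(view[:n])
```

This only works for objects that actually implement readinto(), which is exactly why it cannot simply replace copyfileobj().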

Am 03.03.2013 18:02, schrieb Charles-François Natali:
sendfile() is a Linux-only syscall. It's also limited to certain kinds of file descriptors; the limitations have been lifted in recent kernel versions. http://linux.die.net/man/2/sendfile TL;DR: the input fd must support mmap(). The output fd used to be restricted to socket fds; since 2.6.33, sendfile() supports any fd as output fd.

Or we could just use: if hasattr(fileobj, 'readinto') hoping that readinto() is really a readinto() implementation and not an unrelated method :-)
No, it's not Linux-only: many BSDs also have it, although not all of them support an arbitrary output file descriptor (Solaris does allow regular files too). It would be possible to catch EINVAL/EBADF and fall back to a regular copy loop.

Note that the above benchmark is really biased by writing the data to /dev/null: with a real target file, the zero-copy wouldn't bring such a large gain, because the bottleneck will really be the I/O devices (also, a read()/write() loop is more expensive in Python than in C). But I see at least two cases where it could be interesting: when reading/writing from/to a tmpfs partition, or when the source and target files are on different disks.

I'm not sure it's worth it though, that's why I'm asking here :-) (but I do think readinto() is interesting).
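The catch-EINVAL/EBADF-and-fall-back idea could be sketched like this (a sketch under the assumption that os.sendfile() is available, i.e. Python 3.3+ on a supported platform; the function name and the `fallback` parameter are invented for illustration):

```python
import errno
import os

def copyfile_sendfile(fsrc, fdst, fallback, blocksize=1024 * 1024):
    """Try zero-copy via os.sendfile(), falling back to a plain loop.

    fsrc/fdst are regular file objects opened in binary mode;
    `fallback` is any copyfileobj-style function used when
    sendfile() refuses the descriptor combination.
    """
    infd, outfd = fsrc.fileno(), fdst.fileno()
    offset = 0
    while True:
        try:
            sent = os.sendfile(outfd, infd, offset, blocksize)
        except OSError as e:
            # Unsupported fd combination on this platform/kernel:
            # fall back to a regular copy loop, but only if nothing
            # has been transferred yet.
            if offset == 0 and e.errno in (errno.EINVAL, errno.EBADF,
                                           errno.ENOSYS):
                return fallback(fsrc, fdst)
            raise
        if sent == 0:
            break  # EOF
        offset += sent
```

Only falling back when offset is still 0 avoids duplicating data that sendfile() already transferred before failing.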

On Sun, 3 Mar 2013 20:55:15 +0100 Charles-François Natali <cf.natali@gmail.com> wrote:
Can you post your benchmark's code? I could time it on a SSD.
Attached (for readinto() and sendfile()).
Ok, the readinto() version doesn't seem to make a difference here, only the sendfile() version is beneficial (and the benefits are mostly noticeable from tmpfs to /dev/null, as you point out :-)). Regards Antoine.

IMNSHO the *time* is less relevant than the fact that it uses less memory by not repeatedly making copies. In general we should use the more recent non-copying APIs when possible within the standard library, but most of that code is pretty old and has not been looked at for conversion. Any such changes are welcome in 3.4+. On Sun, Mar 3, 2013 at 11:00 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:

On Sun, 3 Mar 2013 13:38:05 -0800 "Gregory P. Smith" <greg@krypto.org> wrote:
IMNSHO the *time* is less relevant than the fact that it uses less memory by not repeatedly making copies.
Well, it doesn't repeatedly make copies, it just allocates a new buffer every loop. At best, it will consume 16 KB instead of 32 KB. Regards Antoine.

On 03.03.13 19:02, Charles-François Natali wrote:
8%. Note that in real cases the difference will be significantly less. First, output to a real file requires more time than output to /dev/null. Second, you are unlikely to copy the same input file 30 times in a row: only the first time in the test do you read from disk, the other 29 times you read from the cache. Third, sources such as tarfile have several levels between user code and the disk file: BufferedIO, GzipFile, the internal tarfile wrapper. Every level adds some overhead, and in sum this will be many times larger than the creation of one bytes object.
This looks more interesting. There are other ideas for speeding up tarfile extraction. Use the dir_fd parameter (if it is available) for opening target files; it can speed up extraction of a large number of small and deeply nested files. sendfile() should only speed up extraction of large files.
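The dir_fd idea could be sketched as follows (a sketch only — the function name is invented, and real tarfile code would also have to create nested directories and restore metadata; assumes os.open supports dir_fd on the platform, which os.supports_dir_fd can be used to check):

```python
import os

def write_member(target_dir, relname, data):
    """Create `relname` inside `target_dir` relative to a directory fd.

    Opening the directory once and then opening each member relative
    to that fd avoids a full path lookup per extracted file, which is
    where the win comes from for many small, deeply nested files.
    """
    dfd = os.open(target_dir, os.O_RDONLY)  # directory fd, opened once
    try:
        fd = os.open(relname, os.O_WRONLY | os.O_CREAT | os.O_TRUNC,
                     0o644, dir_fd=dfd)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
    finally:
        os.close(dfd)
```

In a real extraction loop the directory fd would of course be held open across all members rather than reopened per file.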

I know, I said it was really biased :-) The proper way to perform a cold-cache benchmark would be to run "echo 3 > /proc/sys/vm/drop_caches" before reading the file. The goal was to highlight the reallocation cost (which can vary depending on the implementation).
Not really, because like above, the extra syscalls and copy loops aren't really the bottleneck, it's still the I/O (try replacing /dev/null with an on-disk file and the gain plummets: it might be different if the source and target files are on different disks, though). Zero-copy really shines when writing data to a socket: a more interesting usage would be in ftplib & Co. cf
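The socket case mentioned above (where zero-copy really pays off, e.g. a hypothetical ftplib change) could be sketched like this, assuming os.sendfile() is available and the socket is blocking; real code would need a read()/send() fallback for platforms where the call fails:

```python
import os

def send_file_over_socket(fileobj, sock):
    """Zero-copy the contents of a regular file to a socket.

    Uses os.sendfile() so the data never passes through userspace
    buffers; fileobj must be a real file (the input fd has to
    support mmap). Returns the number of bytes sent.
    """
    infd = fileobj.fileno()
    size = os.fstat(infd).st_size
    offset = 0
    while offset < size:
        sent = os.sendfile(sock.fileno(), infd, offset, size - offset)
        if sent == 0:
            break  # peer closed the connection
        offset += sent
    return offset
```

This is the pattern that makes the tmpfs-to-socket case so much faster than a Python-level read()/write() loop.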

participants (6)
- Antoine Pitrou
- Charles-François Natali
- Christian Heimes
- Daniel Holth
- Gregory P. Smith
- Serhiy Storchaka