[New-bugs-announce] [issue27194] Tarfile superfluous truncate calls slows extraction.

Jason Fried report at bugs.python.org
Fri Jun 3 00:56:33 EDT 2016

New submission from Jason Fried:

When extracting large tar files, I noticed that tarfile was slower than it should be. It seems that on Linux, for large files (10 MB), truncate is not always a free operation, even when it should be a no-op. Example: the file is already 10 MB; seek to the end and truncate.
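To illustrate the idea, here is a minimal sketch of how extraction could skip the truncate for regular files, whose length already equals the member size once the data is copied, and keep the seek-and-truncate only for sparse members, where a trailing hole may need it. The names write_member, _copy, and CHUNK are hypothetical and exist only for this illustration; this is not the attached truncate.patch and not the actual tarfile code.

```python
CHUNK = 16 * 1024  # copy buffer size; an arbitrary choice for this sketch


def _copy(source, target, length):
    """Copy exactly `length` bytes from source to target."""
    remaining = length
    while remaining > 0:
        chunk = source.read(min(CHUNK, remaining))
        if not chunk:
            raise OSError("unexpected end of data")
        target.write(chunk)
        remaining -= len(chunk)


def write_member(source, target, size, sparse=None):
    """Write one archive member to `target` (hypothetical helper).

    `sparse` is assumed to be a list of (offset, length) data segments,
    or None for a regular member.
    """
    if sparse is not None:
        for offset, length in sparse:
            target.seek(offset)
            _copy(source, target, length)
        # A sparse member may end in a hole, so the final size has to be
        # set explicitly.
        target.seek(size)
        target.truncate()
    else:
        # Regular member: after copying `size` bytes the file is already
        # the right length, so seek + truncate would be redundant.
        _copy(source, target, size)
```

For a regular member, the only system calls left on the target are the writes themselves.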

I created a script to test the validity of this patch. It generates two random tar archives containing 1024 files of 10 MB each. The file contents are randomized so that disk caching should not interfere.

When extracting those 1g tar files, the following was observed:
Time Delta for TarFile: 148.23699307441711
Time Delta for FastTarFile: 107.71058106422424
Time Diff: 40.52641201019287 0.27338932859929255
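The test script itself is not included in this message. A rough sketch of how such a measurement could be set up is shown below, under stated assumptions: make_random_tar and time_extract are hypothetical helper names, and the tar_cls parameter assumes the patched variant is a tarfile.TarFile subclass (as the "FastTarFile" label in the output suggests, though the message does not confirm it).

```python
import io
import os
import tarfile
import time


def make_random_tar(path, n_files, file_size):
    """Create an uncompressed tar at `path` holding random-content files."""
    with tarfile.open(path, "w") as tar:
        for i in range(n_files):
            data = os.urandom(file_size)  # random bytes for each member
            info = tarfile.TarInfo(name="file_%04d.bin" % i)
            info.size = file_size
            tar.addfile(info, io.BytesIO(data))


def time_extract(path, dest, tar_cls=tarfile.TarFile):
    """Return the wall-clock seconds taken to extract `path` into `dest`.

    `tar_cls` is assumed to be TarFile or a subclass of it.
    """
    start = time.time()
    with tar_cls.open(path, "r") as tar:
        tar.extractall(dest)
    return time.time() - start
```

Calling time_extract once with tarfile.TarFile and once with the patched class, each on its own freshly generated archive, would give two deltas comparable to the ones reported above.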

components: Library (Lib)
files: truncate.patch
keywords: patch
messages: 267035
nosy: asvetlov, fried, lukasz.langa
priority: normal
severity: normal
status: open
title: Tarfile superfluous truncate calls slows extraction.
type: performance
versions: Python 3.5
Added file: http://bugs.python.org/file43138/truncate.patch

Python tracker <report at bugs.python.org>
