[New-bugs-announce] [issue32453] shutil.rmtree can have O(n^2) performance on large dirs

Niklas Hambüchen report at bugs.python.org
Sat Dec 30 00:15:09 EST 2017

New submission from Niklas Hambüchen <nh2 at deditus.de>:

See http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=24412edeaf556a for the explanation and equivalent fix in coreutils.

The gist ist that deleting entries in inode order can improve deletion performance dramatically.

To obtain inode numbers and sort by them, one needs to `getdents()` all the entries ahead of time, but rmtree() already gets all dirents ahead of the deletion. https://bugs.python.org/issue28564 recently improved shutil.rmtree() performance by using scandir(), but nevertheless the returned generator is list()ed, so we already have all necessary informtaion in memory and would just have to perform an O(n) integer sort.

I propose we check if the improvements made to `rm -r` in coreutils should be ported to shutil.rmtree().

components: Library (Lib)
messages: 309217
nosy: nh2
priority: normal
severity: normal
status: open
title: shutil.rmtree can have O(n^2) performance on large dirs
type: enhancement
versions: Python 3.7

Python tracker <report at bugs.python.org>

More information about the New-bugs-announce mailing list