[Python-ideas] fnmatch.filter_false

Steven D'Aprano steve at pearwood.info
Sun May 21 02:37:28 EDT 2017


On Sun, May 21, 2017 at 08:04:04AM +0200, Pavol Lisy wrote:

> If fnmatch.filter was written to solve performance, isn't calling
> _filtered step back?

It adds overhead of *one* function call regardless of how many 
thousands or millions of names you are filtering. And the benefit is 
that filter() and filterfalse() can share the same implementation, 
without duplicating code.

If you have a tiny list, then it might be a tiny bit slower, but we 
shouldn't care too much about the performance for tiny lists. We 
should care about performance for large lists, but for large N the 
overhead of one extra function call is insignificant.

At least... I hope so. If somebody demonstrates an actual and 
significant performance hit, then we should rethink the implementation.

But even a small performance hit is acceptible, if it comes with a 
significant benefit (no need to duplicate code) and it is small enough.

On my computer, using Python 3.5, I get little significant difference:

# original
[steve at ando ~]$ python3.5 -m timeit -s "from fnmatch import filter" 
-s "L = list('abcdefghi')" "filter(L, 'b')"
100000 loops, best of 3: 11.7 usec per loop


# patched
100000 loops, best of 3: 12.8 usec per loop

Who cares about 1 millionth of a second? :-)


For a larger list, the difference vanishes:

# original
[steve at ando ~]$ python3.5 -m timeit -s "from fnmatch import filter" -s 
"L = list('abcdefghi')*10000" "filter(L, 'b')"
10 loops, best of 3: 80.2 msec per loop


# patched
10 loops, best of 3: 79.1 msec per loop


(Yes, the patched version was faster, on my computer.)

These are quick and dirty benchmarks, using an older version of Python, 
but I expect the same will apply in 3.7 if anyone bothers to check :-)


-- 
Steve


More information about the Python-ideas mailing list