[Python-ideas] fnmatch.filter_false
Steven D'Aprano
steve at pearwood.info
Sun May 21 02:37:28 EDT 2017
On Sun, May 21, 2017 at 08:04:04AM +0200, Pavol Lisy wrote:
> If fnmatch.filter was written to solve performance, isn't calling
> _filtered step back?
It adds overhead of *one* function call regardless of how many
thousands or millions of names you are filtering. And the benefit is
that filter() and filterfalse() can share the same implementation,
without duplicating code.
If you have a tiny list, then it might be a tiny bit slower, but we
shouldn't care too much about the performance for tiny lists. We
should care about performance for large lists, but for large N the
overhead of one extra function call is insignificant.
At least... I hope so. If somebody demonstrates an actual and
significant performance hit, then we should rethink the implementation.
But even a small performance hit is acceptible, if it comes with a
significant benefit (no need to duplicate code) and it is small enough.
On my computer, using Python 3.5, I get little significant difference:
# original
[steve at ando ~]$ python3.5 -m timeit -s "from fnmatch import filter"
-s "L = list('abcdefghi')" "filter(L, 'b')"
100000 loops, best of 3: 11.7 usec per loop
# patched
100000 loops, best of 3: 12.8 usec per loop
Who cares about 1 millionth of a second? :-)
For a larger list, the difference vanishes:
# original
[steve at ando ~]$ python3.5 -m timeit -s "from fnmatch import filter" -s
"L = list('abcdefghi')*10000" "filter(L, 'b')"
10 loops, best of 3: 80.2 msec per loop
# patched
10 loops, best of 3: 79.1 msec per loop
(Yes, the patched version was faster, on my computer.)
These are quick and dirty benchmarks, using an older version of Python,
but I expect the same will apply in 3.7 if anyone bothers to check :-)
--
Steve
More information about the Python-ideas
mailing list