[Python-ideas] fnmatch.filter_false
Wolfgang Maier
wolfgang.maier at biologie.uni-freiburg.de
Fri May 19 10:03:16 EDT 2017
On 05/17/2017 07:55 PM, tritium-list at sdamon.com wrote:
> Top posting, apologies.
>
> I'm sure there is a better way to do it, and there is a performance hit, but
> it's negligible. This is also a three-line delta of the function.
>
> from fnmatch import _compile_pattern, filter as old_filter
> import os
> import os.path
> import posixpath
>
>
> data = os.listdir()
>
> def filter(names, pat, *, invert=False):
>     """Return the subset of the list NAMES that match PAT."""
>     result = []
>     pat = os.path.normcase(pat)
>     match = _compile_pattern(pat)
>     if os.path is posixpath:
>         # normcase on posix is NOP. Optimize it away from the loop.
>         for name in names:
>             if bool(match(name)) == (not invert):
>                 result.append(name)
>     else:
>         for name in names:
>             if bool(match(os.path.normcase(name))) == (not invert):
>                 result.append(name)
>     return result
>
> if __name__ == '__main__':
>     import timeit
>     print(timeit.timeit(
>         "filter(data, '__*')",
>         setup="from __main__ import filter, data"
>     ))
>     print(timeit.timeit(
>         "filter(data, '__*')",
>         setup="from __main__ import old_filter as filter, data"
>     ))
>
> The first test (modified code) timed at 22.492161903402575, whereas the
> second test (unmodified) timed at 19.555531892032324.
>
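For reference, the call the patch above enables would look roughly like this
(illustrative file names only; this relies on the modified filter() quoted
above, not on the current stdlib function):

    names = ['__init__.py', '__pycache__', 'setup.py', 'README.rst']
    filter(names, '__*')               # ['__init__.py', '__pycache__']
    filter(names, '__*', invert=True)  # ['setup.py', 'README.rst']

Going by the timings quoted above, the extra bool()/comparison per name costs
roughly 15% on that data set.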
If you don't care about slow-downs in this range, you could use this
pattern instead:

    excluded = set(filter(data, '__*'))
    result = [item for item in data if item not in excluded]

It seems to slow things down by about the same amount, although here the
slow-down is not constant but depends on the size of the set you need to
generate.
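For completeness, a self-contained version of that pattern, plus an
itertools.filterfalse spelling that gives the same inverted selection with
what is already in the stdlib (data is the os.listdir() result from the
quoted example):

    import os
    from fnmatch import filter as fnfilter, fnmatch
    from itertools import filterfalse

    data = os.listdir()

    # Build the set of matching names once, then keep everything else,
    # preserving the original order of data.
    excluded = set(fnfilter(data, '__*'))
    result = [item for item in data if item not in excluded]

    # Same result with itertools.filterfalse: keep the names for which
    # the predicate (does this name match the pattern?) is false.
    result_ff = list(filterfalse(lambda name: fnmatch(name, '__*'), data))

    assert result == result_ff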
Wolfgang