[Python-ideas] fnmatch.filter_false

Wolfgang Maier wolfgang.maier at biologie.uni-freiburg.de
Fri May 19 10:03:16 EDT 2017


On 05/17/2017 07:55 PM, tritium-list at sdamon.com wrote:
> Top posting, apologies.
> 
> I'm sure there is a better way to do it, and there is a performance hit, but
> it's negligible.  This is also a three-line delta of the function.
> 
> from fnmatch import _compile_pattern, filter as old_filter
> import os
> import os.path
> import posixpath
> 
> 
> data = os.listdir()
> 
> def filter(names, pat, *, invert=False):
>      """Return the subset of the list NAMES that match PAT."""
>      result = []
>      pat = os.path.normcase(pat)
>      match = _compile_pattern(pat)
>      if os.path is posixpath:
>          # normcase on posix is NOP. Optimize it away from the loop.
>          for name in names:
>              if bool(match(name)) == (not invert):
>                  result.append(name)
>      else:
>          for name in names:
>              if bool(match(os.path.normcase(name))) == (not invert):
>                  result.append(name)
>      return result
> 
> if __name__ == '__main__':
>      import timeit
>      print(timeit.timeit(
>          "filter(data, '__*')",
>          setup="from __main__ import filter, data"
>       ))
>      print(timeit.timeit(
>          "filter(data, '__*')",
>          setup="from __main__ import old_filter as filter, data"
>      ))
> 
> The first test (modified code) timed at 22.492161903402575, whereas the second
> test (unmodified) timed at 19.555531892032324.
> 
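(As an aside, and not part of the proposal above: a minimal sketch of the 
same inverted matching using only public APIs, namely itertools.filterfalse 
plus fnmatch.fnmatch, which applies os.path.normcase itself. The name 
filter_false below is purely illustrative, not an existing fnmatch function.)

import fnmatch
from itertools import filterfalse

def filter_false(names, pat):
    """Return the subset of NAMES that do *not* match PAT (illustrative)."""
    # fnmatch.fnmatch normcases both arguments, so this mirrors
    # filter(names, pat, invert=True) from the snippet above.
    return list(filterfalse(lambda name: fnmatch.fnmatch(name, pat), names))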

If you don't care about slow-downs in this range, you could use this 
pattern:

excluded = set(filter(data, '__*'))
result = [item for item in data if item not in excluded]

It seems to add roughly the same overhead, although the slow-down is not 
constant: it depends on the size of the set you need to generate.
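
(A quick way to check that is to reuse the timeit setup from the quoted 
message; this is only a sketch and implies no particular numbers. The names 
exclude and fnfilter are illustrative, and data is again an os.listdir() 
result.)

import os
import timeit
from fnmatch import filter as fnfilter

data = os.listdir()

def exclude(names, pat):
    # Build the set of matching names once, then keep everything else,
    # preserving the original order of names.
    excluded = set(fnfilter(names, pat))
    return [name for name in names if name not in excluded]

if __name__ == '__main__':
    print(timeit.timeit("exclude(data, '__*')",
                        setup="from __main__ import exclude, data"))
    print(timeit.timeit("fnfilter(data, '__*')",
                        setup="from __main__ import fnfilter, data"))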

Wolfgang