[Python-ideas] fnmatch.filter_false
tritium-list at sdamon.com
Fri May 19 14:01:27 EDT 2017
> -----Original Message-----
> From: Python-ideas [mailto:python-ideas-bounces+tritium-list=sdamon.com at python.org] On Behalf Of Wolfgang Maier
> Sent: Friday, May 19, 2017 10:03 AM
> To: python-ideas at python.org
> Subject: Re: [Python-ideas] fnmatch.filter_false
>
> On 05/17/2017 07:55 PM, tritium-list at sdamon.com wrote:
> > Top posting, apologies.
> >
> > I'm sure there is a better way to do it, and there is a performance hit, but
> > it's negligible. This is also a three line delta of the function.
> >
> > from fnmatch import _compile_pattern, filter as old_filter
> > import os
> > import os.path
> > import posixpath
> >
> >
> > data = os.listdir()
> >
> > def filter(names, pat, *, invert=False):
> >     """Return the subset of the list NAMES that match PAT."""
> >     result = []
> >     pat = os.path.normcase(pat)
> >     match = _compile_pattern(pat)
> >     if os.path is posixpath:
> >         # normcase on posix is NOP. Optimize it away from the loop.
> >         for name in names:
> >             if bool(match(name)) == (not invert):
> >                 result.append(name)
> >     else:
> >         for name in names:
> >             if bool(match(os.path.normcase(name))) == (not invert):
> >                 result.append(name)
> >     return result
> >
> > if __name__ == '__main__':
> >     import timeit
> >     print(timeit.timeit(
> >         "filter(data, '__*')",
> >         setup="from __main__ import filter, data"
> >     ))
> >     print(timeit.timeit(
> >         "filter(data, '__*')",
> >         setup="from __main__ import old_filter as filter, data"
> >     ))
> >
> > The first test (modified code) timed at 22.492161903402575, where the second
> > test (unmodified) timed at 19.555531892032324
> >
>
> If you don't care about slow-downs in this range, you could use this
> pattern:
>
> excluded = set(filter(data, '__*'))
> result = [item for item in data if item not in excluded]
>
> It seems to take just as much longer although the slow-down is not
> constant but depends on the size of the set you need to generate.
>
> Wolfgang
>
If I didn't care about performance, I wouldn't be using filter - the only
reason to use filter over a list comprehension is performance. The standard
library has a performant inclusion filter, but does not have a performant
exclusion filter.
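
For illustration, here is a rough sketch of what a standalone exclusion
filter could look like using only public pieces of the standard library.
The name filter_false is hypothetical (nothing by that name exists in
fnmatch), and this version recompiles the pattern on every call instead of
going through fnmatch's cached _compile_pattern, so it is not a claim about
how a real implementation would perform.

```python
import fnmatch
import itertools
import os.path
import posixpath
import re


def filter_false(names, pat):
    """Return the names that do NOT match pat (hypothetical helper)."""
    pat = os.path.normcase(pat)
    # fnmatch.translate() turns the shell-style pattern into a regex string.
    match = re.compile(fnmatch.translate(pat)).match
    if os.path is posixpath:
        # normcase is a no-op on posix, so the names can be tested directly.
        return list(itertools.filterfalse(match, names))
    # Elsewhere, normalize each name for matching but return it unchanged.
    return [name for name in names if not match(os.path.normcase(name))]


names = ['__init__.py', 'setup.py', '__main__.py', 'README']
print(filter_false(names, '__*'))  # ['setup.py', 'README']
```

The posix branch hands the loop to itertools.filterfalse, which is the same
kind of shortcut the existing inclusion filter takes on that platform.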