Self optimizing iterable
Carl Banks
pavlovevidence at gmail.com
Fri Jul 17 21:35:33 EDT 2009
On Jul 17, 5:40 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
> Zac Burns <zac... at gmail.com> writes:
> > An example use case for this would be for something like a large table
> > of regular expressions that would be iterated over trying to match in
> > some string. If some regular expressions are more statistically more
> > successful then the iteration will generally be short.
>
> Generally if you are matching against a bunch of regexps, they will
> tend to match overlapping sets, so you want to control precisely what
> order they are tested in. Having stuff bubble around based on hit
> counts doesn't sound good.
As a corrollary, if you happen to be matching non-overlapping sets,
then it is a red flag that there could be some other (better) way to
dispatch than a linear regexp search. For instance, if all your
regexps start with a different keyword, then you should dispatch on
that keyword using a dictionary. A regexp can be used after
dispatching to extract parameters if necessary.
It's possible to have disjoint regexps without a simple dispatch
criterion, but I'd guess that's less common.
Carl Banks
More information about the Python-list
mailing list