psss...I want to move from Perl to Python
Cameron Simpson
cs at zip.com.au
Sun Jan 31 16:16:46 EST 2016
On 31Jan2016 09:49, Paul Rubin <no.email at nospam.invalid> wrote:
>Cameron Simpson <cs at zip.com.au> writes:
>> Adzapper. It has many many regexps matching URLs. (Actually a more
>> globlike syntax, but it gets turned into a regexp.) You plug it into
>> your squid proxy.
>
>Oh cool, is that out there in circulation?
Yes:
http://adzapper.sourceforge.net/
which includes the installation instructions (install script, add a line to
squid.conf).
However, my publication workflow is broken. (And SourceForge isn't what it used
to be.) I need to get the update process improved. I'm happy to send the latest
copy to people by private email.
>It sounds like the approach of merging all the regexes into one and
>compiling to a FSM could be a big win. I wouldn't expect too big a
>state space explosion.
Perhaps so. The existing script (a) merges regexps for successive patterns for
the same class and (b) uses Perl's "study" function, which examines a string
that will have several regexps applied to it. I gather it notes things like
character positions, which are then used in the matching process. Since the
zapper applies all the rules to most URLs, this is a performance win.
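For anyone wanting to try part (a) in Python, here is a rough sketch of the merging step. It is not the adzapper's code (that is Perl). The rule list, the class names and the helpers glob_to_regexp, merge_rules and classify are all made up for illustration, and the glob handling only covers "*". As far as I know, Python's re module has no counterpart to "study", so only the merging is shown.

```python
import re

# Hypothetical rules as (class, glob-like pattern) pairs, in file order.
# These are illustrative only, not taken from the real adzapper list.
RULES = [
    ("AD", "http://ads.example.com/*"),
    ("AD", "http://*.example.net/banners/*"),
    ("COUNTER", "http://counter.example.org/hit*"),
]


def glob_to_regexp(glob):
    """Turn a glob-like pattern into regexp source; only "*" is special."""
    # Escape everything, then turn the escaped "*" back into ".*".
    return re.escape(glob).replace(r"\*", ".*")


def merge_rules(rules):
    """Merge successive rules of the same class into one compiled regexp."""
    merged = []  # list of (class, [regexp source, ...])
    for cls, glob in rules:
        source = glob_to_regexp(glob)
        if merged and merged[-1][0] == cls:
            # Same class as the previous rule: add another alternative.
            merged[-1][1].append(source)
        else:
            merged.append((cls, [source]))
    return [
        (cls, re.compile("|".join("(?:%s)" % s for s in sources)))
        for cls, sources in merged
    ]


def classify(url, compiled):
    """Return the class of the first merged rule matching the whole URL."""
    for cls, regexp in compiled:
        if regexp.fullmatch(url):
            return cls
    return None


compiled = merge_rules(RULES)
print(classify("http://ads.example.com/x.gif", compiled))
print(classify("http://www.python.org/", compiled))
```

Merging only adjacent rules of the same class keeps the original rule order intact, so an earlier rule for one class still wins over a later rule for another.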
Cheers,
Cameron Simpson <cs at zip.com.au>