[OT] a little about regex
rrr at ronadam.com
Wed Oct 18 09:32:39 CEST 2006
> I'm trying to get an assertion working that filters addresses from some domain,
> but not if the domain is prefixed by '.com'.
> Even putting the result in a negated test, I can't get the wanted result.
> The thought in program terms:
> >>> def filter(adr):
> ...     import re
> ...     allow = re.compile('.*\.my(>|$)')
> ...     deny = re.compile('.*\.com\.my(>|$)')
> ...     cnt = 0
> ...     if deny.search(adr): cnt += 1
> ...     if allow.search(adr): cnt += 1
> ...     return cnt
> >>> filter('some.ads at lazyfox.com.my')
> >>> filter('some.ads at lazyfox.net.my')
> It seems I'm missing a better regex that would keep both of the
> filters from taking action. I'm thinking of a lookbehind (negative or positive)
> option, but I haven't managed to work it out yet.
> I think either allow should match only when there is no '.com' before '.my', or
> deny should match _only_ '.com' before '.my'. Sorry, I can't get the correct
> syntax to do it.
> Suggestions are welcome.
Instead of using two separate ifs, use an if/elif and be sure to test the
narrower filter first. (You already have them in the correct order.) That way it
will skip the more general filter and not increment cnt twice.
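
A minimal sketch of the if/elif version, reusing the two patterns from the quoted code. The function name classify is my own (chosen so it doesn't shadow the builtin filter), and the patterns are written as raw strings; treat it as an illustration rather than a finished filter.

```python
import re

# Patterns taken from the quoted code, written as raw strings.
DENY = re.compile(r'.*\.com\.my(>|$)')   # narrower: only *.com.my
ALLOW = re.compile(r'.*\.my(>|$)')       # broader: any *.my

def classify(adr):
    """Count at most one hit; the narrower filter is tested first."""
    cnt = 0
    if DENY.search(adr):
        cnt += 1
    elif ALLOW.search(adr):   # skipped whenever DENY already matched
        cnt += 1
    return cnt
```

With this change both 'some.ads at lazyfox.com.my' and 'some.ads at lazyfox.net.my' give 1, and an address outside '.my' gives 0, so the counter alone no longer tells you which filter fired.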
It's not exactly clear what output you are seeking. If you want 0 for not
filtered and 1 for filtered, then look to Fred's hint.
Or are you writing a test at the moment, where a 1 means it passed only one
filter, so you know your filters are working as designed?
Another approach would be to assign values for filtered, accepted, and undefined
and set those accordingly instead of incrementing and decrementing a counter.
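
Something along these lines, perhaps. The three labels and the name check_address are hypothetical, picked only for illustration; the patterns are the ones from the quoted code.

```python
import re

# Patterns taken from the quoted code, written as raw strings.
DENY = re.compile(r'.*\.com\.my(>|$)')
ALLOW = re.compile(r'.*\.my(>|$)')

# Hypothetical labels for the three possible outcomes.
FILTERED = 'filtered'
ACCEPTED = 'accepted'
UNDEFINED = 'undefined'

def check_address(adr):
    """Return a named outcome instead of a counter value."""
    if DENY.search(adr):      # narrower test first
        return FILTERED
    if ALLOW.search(adr):
        return ACCEPTED
    return UNDEFINED          # neither pattern matched
```

Each address then gets exactly one explicit result, and the caller doesn't have to guess what a count of 1 or 2 means.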