hachoir-regex is a Python library for regular expression (regex or regexp)
manipulation. You can use the a|b (or) and a+b (and, i.e. concatenation)
operators. Expressions are optimized during construction: ranges are merged,
repetitions are simplified, etc. It also contains a class for pattern
matching that lets you search for multiple strings and regexes at the same
time.

Website: http://hachoir.org/wiki/hachoir-regex

Regex examples
==============

Different methods to create regex:

>>> from hachoir_regex import parse, createRange, createString
>>> createString("bike") + createString("motor")
>>> parse('(foo|fooo|foot|football)')
>>> regex = createString("1") | createString("3"); regex
>>> regex |= createRange("2", "4"); regex
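
To illustrate the kind of construction-time optimization that the ``|``
examples above rely on, here is a minimal standard-library sketch. It is not
hachoir-regex's implementation: the ``Chars`` class and the ``char_range``
helper are hypothetical names, and the sketch only handles alternation between
sets of single characters, merging them into a compact character class.

```python
import re


class Chars:
    """Hypothetical node for a set of single characters (not hachoir-regex API)."""

    def __init__(self, chars):
        self.chars = frozenset(chars)

    def __or__(self, other):
        # Alternation of two character sets is simply their union.
        return Chars(self.chars | other.chars)

    def __str__(self):
        codes = sorted(ord(c) for c in self.chars)
        # Group consecutive code points into [first, last] runs.
        runs = []
        for code in codes:
            if runs and code == runs[-1][1] + 1:
                runs[-1][1] = code
            else:
                runs.append([code, code])
        parts = []
        for first, last in runs:
            if last - first >= 2:
                # Three or more consecutive characters become "first-last".
                parts.append("%s-%s" % (re.escape(chr(first)), re.escape(chr(last))))
            else:
                parts.extend(re.escape(chr(c)) for c in range(first, last + 1))
        body = "".join(parts)
        # A single character needs no brackets.
        return body if len(self.chars) == 1 else "[%s]" % body


def char_range(first, last):
    """Hypothetical helper: all characters from first to last, inclusive."""
    return Chars(chr(c) for c in range(ord(first), ord(last) + 1))


regex = Chars("1") | Chars("3")
merged = regex | char_range("2", "4")
print(regex)   # [13]
print(merged)  # [1-4]
```

Storing the characters as a set and rendering ranges only when the pattern is
printed keeps the merge step trivial: overlapping or adjacent ranges collapse
automatically when the sets are united.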

As you can see, you can use classic "a|b" (or) and "a+b" (and) Python
operators, and expressions are optimized for fast pattern matching.

Regex using repetition:

>>> parse("(a{2,}){3,4}")
>>> parse("(a*|b)*")
>>> parse("(a*|b|){4,5}")

Compute minimum and maximum length of matched pattern:

>>> r = parse('(cat|horse)')
>>> r.minLength(), r.maxLength()
(3, 5)
>>> r = parse('(a{2,}|b+)')
>>> r.minLength(), r.maxLength()
(1, None)
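
The length bounds above follow from simple rules on the structure of a regex.
The sketch below shows one way to compute them. It assumes a toy
representation where each sub-expression is just a ``(min, max)`` pair and
``None`` means "unbounded"; the helper names ``lit``, ``alt``, ``cat`` and
``repeat`` are hypothetical and are not part of hachoir-regex.

```python
def lit(text):
    # A literal string always matches exactly len(text) characters.
    return (len(text), len(text))


def alt(*options):
    # a|b: shortest option gives the minimum, longest gives the maximum.
    lows = [lo for lo, _ in options]
    highs = [hi for _, hi in options]
    high = None if None in highs else max(highs)
    return (min(lows), high)


def cat(*parts):
    # Concatenation: lengths add up; one unbounded part makes the sum unbounded.
    highs = [hi for _, hi in parts]
    high = None if None in highs else sum(highs)
    return (sum(lo for lo, _ in parts), high)


def repeat(item, min_count, max_count=None):
    # item{min_count,max_count}; max_count=None means no upper limit.
    lo, hi = item
    if hi == 0 or max_count == 0:
        high = 0
    elif hi is None or max_count is None:
        high = None
    else:
        high = hi * max_count
    return (lo * min_count, high)


# (cat|horse)
animals = alt(lit("cat"), lit("horse"))
# (a{2,}|b+)
unbounded = alt(repeat(lit("a"), 2), repeat(lit("b"), 1))
print(animals)    # (3, 5)
print(unbounded)  # (1, None)
```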

Pattern matching
================

Use PatternMatching if you would like to match multiple strings and
regexes at the same time:

>>> from hachoir_regex import PatternMatching
>>> p = PatternMatching()
>>> p.addString("a")
>>> p.addString("b")
>>> p.addRegex("[cd]")
>>> for start, end, item in p.search("a b c d"):
... print("%s..%s: %s" % (start, end, item))
...
0..1: a
2..3: b
4..5: [cd]
6..7: [cd]
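
For readers who want to see the idea behind searching several patterns in one
pass, here is a rough equivalent built only on the standard ``re`` module. It
is a sketch of the general technique, not of how PatternMatching works
internally; the ``search_all`` function and the ``p0``, ``p1``, ... group
names are assumptions made for this example.

```python
import re


def search_all(patterns, text):
    """Yield (start, end, label) for each match of any pattern in text.

    patterns is a list of (label, regex_source) pairs.
    """
    # One named group per pattern, joined by "|", so a single scan of the
    # text can report which pattern matched.
    combined = re.compile("|".join(
        "(?P<p%d>%s)" % (index, source)
        for index, (label, source) in enumerate(patterns)))
    for match in combined.finditer(text):
        # lastgroup is the name of the alternative that matched, e.g. "p2".
        index = int(match.lastgroup[1:])
        yield match.start(), match.end(), patterns[index][0]


patterns = [
    ("a", re.escape("a")),   # plain strings are escaped
    ("b", re.escape("b")),
    ("[cd]", "[cd]"),        # regex sources are used as-is
]
hits = list(search_all(patterns, "a b c d"))
for start, end, label in hits:
    print("%s..%s: %s" % (start, end, label))
```

Note that this sketch reports non-overlapping matches only, and when two
patterns match at the same position the one listed first wins.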

You can also attach user data to a pattern:

>>> p = PatternMatching()
>>> p.addString("un", 1)
>>> p.addString("deux", 2)
>>> for start, end, item in p.search("un deux"):
... print("%r at %s: userdata=%r" % (item, start, item.user))
...