[New-bugs-announce] [issue12870] Regex object should have introspection methods
report at bugs.python.org
Wed Aug 31 19:29:34 CEST 2011
New submission from Matt Chaput <matt at whoosh.ca>:
Several times in the recent past I've wished for the following methods on the regular expression object. These would let me speed up search and parsing code by limiting the number of regex matches I need to try.
literal_prefix(): Returns the literal string that every match must start with, i.e. the part of the pattern before any "special" construct. E.g., for the pattern "ab(c|d)ef" the method would return "ab". For the pattern "abc|def" the method would return "", since the two alternatives share no common prefix. When matching a regex against keys in a btree, this would let me limit the search to just the range of keys that start with the prefix.
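A minimal sketch of what literal_prefix() might compute. It is not an existing method of the re module: it works on the pattern string rather than the compiled object, assumes no flags (such as re.IGNORECASE or re.VERBOSE) are in effect, and deliberately gives up early, so it may return a shorter prefix than a real implementation could. The helper names and the keys_matching() function, which uses a sorted list with bisect as a stand-in for a btree range scan, are hypothetical.

```python
import bisect
import re

# Characters with special meaning in `re` syntax; any of them ends the literal run.
_SPECIAL = set(".^$*+?{}[]()|\\")
# A character followed by one of these may be repeated or omitted, so the
# sketch conservatively stops before it.
_QUANTIFIER_START = set("*+?{")


def _has_top_level_alternation(pattern):
    """Return True if a '|' appears outside every group and character class."""
    depth = 0
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the escaped character
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
            if pattern[i + 1:i + 2] == "^":
                i += 1
            if pattern[i + 1:i + 2] == "]":
                i += 1  # a ']' right after '[' or '[^' is a literal
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "|" and depth == 0:
            return True
        i += 1
    return False


def literal_prefix(pattern):
    """Hypothetical helper: literal text that every match must start with."""
    if _has_top_level_alternation(pattern):
        return ""  # conservative: no attempt to find a common prefix
    prefix = []
    for i, ch in enumerate(pattern):
        if ch in _SPECIAL:
            break
        if pattern[i + 1:i + 2] in _QUANTIFIER_START:
            break
        prefix.append(ch)
    return "".join(prefix)


def keys_matching(sorted_keys, pattern):
    """Try the regex only on keys in the range that starts with the prefix."""
    prefix = literal_prefix(pattern)
    rx = re.compile(pattern)
    start = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # sorted order: no later key can have the prefix
        if rx.match(key):
            matches.append(key)
    return matches
```

With keys ["aa", "abcef", "abdef", "abxef", "b"] and the pattern "ab(c|d)ef", keys_matching() skips "aa", tries the regex on the three "ab..." keys, and stops at "b". When the prefix is "" it falls back to scanning every key.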
first_chars(): Returns a string/list/set/whatever of the characters that a matching string could start with. E.g., for the pattern "ab(c|d)ef" the method would return "a". For the pattern "[a-d]ef" the method would return "abcd". When parsing a string with regexes, this would let me test only the regexes that could match at the current character.
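A rough sketch of first_chars() along the same lines. Again, this is not an existing re API: it inspects the pattern string, assumes no flags are in effect, and only understands a leading literal character or a simple character class such as "[a-d]". For anything else it returns None, meaning "unknown, always try this regex". The build_dispatch() helper and all other names here are hypothetical.

```python
import re

# Characters with special meaning in `re` syntax.
_SPECIAL = set(".^$*+?{}[]()|\\")


def _split_top_level(pattern):
    """Split on '|' that lies outside every group and character class."""
    parts, start, depth, in_class, i = [], 0, 0, False, 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the escaped character
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
            if pattern[i + 1:i + 2] == "^":
                i += 1
            if pattern[i + 1:i + 2] == "]":
                i += 1  # a ']' right after '[' or '[^' is a literal
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "|" and depth == 0:
            parts.append(pattern[start:i])
            start = i + 1
        i += 1
    parts.append(pattern[start:])
    return parts


def _branch_first_chars(branch):
    """First characters of one alternative, or None when the sketch cannot tell."""
    if not branch:
        return None  # an empty alternative matches the empty string
    head = branch[0]
    if head == "[":
        end = branch.find("]", 2)  # a ']' at index 1 would be a literal
        if end == -1 or branch[1] == "^":
            return None
        body = branch[1:end]
        if "\\" in body:
            return None  # escapes such as \d are not handled here
        chars, i = set(), 0
        while i < len(body):
            if i + 2 < len(body) and body[i + 1] == "-":
                lo, hi = ord(body[i]), ord(body[i + 2])
                chars.update(chr(c) for c in range(lo, hi + 1))
                i += 3
            else:
                chars.add(body[i])
                i += 1
        rest = branch[end + 1:]
    elif head in _SPECIAL:
        return None
    else:
        chars, rest = {head}, branch[1:]
    if rest[:1] in ("*", "?", "{"):
        return None  # the first item may occur zero times
    return chars


def first_chars(pattern):
    """Hypothetical helper: set of possible first characters, or None if unknown."""
    result = set()
    for branch in _split_top_level(pattern):
        chars = _branch_first_chars(branch)
        if chars is None:
            return None
        result |= chars
    return result


def build_dispatch(patterns):
    """Index compiled regexes by first character; unknown ones go in `always`."""
    by_char, always = {}, []
    for p in patterns:
        rx, chars = re.compile(p), first_chars(p)
        if chars is None:
            always.append(rx)
        else:
            for c in sorted(chars):
                by_char.setdefault(c, []).append(rx)
    return by_char, always
```

A parser could then look up by_char.get(text[pos], []) and try only those regexes plus the ones in always, instead of trying every regex at every position.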
As long as you're making a new regex package, I thought I'd put in a request for these :)
components: Regular Expressions
title: Regex object should have introspection methods
type: feature request
versions: Python 3.3
Python tracker <report at bugs.python.org>