[Python-Dev] Re: pre-PEP [corrected]: Complete, Structured Regular Expression Group Matching
Fredrik Lundh
fredrik at pythonware.com
Thu Aug 12 20:18:27 CEST 2004
Mike Coleman wrote:
> Re maintenance, yeah regexp is pretty terse and ugly. Generally, though, I'd
> rather deal with a reasonably well-considered 80 char regexp than 100 lines of
> code that does the same thing.
well, the examples in your PEP can be written as:

    data = [line[:-1].split(":") for line in open(filename)]

and

    import ConfigParser

    c = ConfigParser.ConfigParser()
    c.read(filename)

    data = []
    for section in c.sections():
        data.append((section, c.items(section)))

both of which are shorter than your structparse examples.
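
for readers on Python 3, here is a rough rendering of the same two snippets, as a sketch only. it assumes the module is spelled "configparser" (the Python 3 name) and takes the input as lines or text rather than a filename so it can be run standalone; the function names parse_colon_lines and parse_ini are made up for illustration and are not part of the pre-PEP or this thread.

```python
import configparser


def parse_colon_lines(lines):
    """Split each line on ':' after stripping only the trailing newline."""
    # rstrip("\n") avoids chopping a real character off a final line
    # that has no newline, which line[:-1] would do.
    return [line.rstrip("\n").split(":") for line in lines]


def parse_ini(text):
    """Return [(section, [(key, value), ...]), ...] for INI-style text."""
    c = configparser.ConfigParser()
    c.read_string(text)  # use c.read(filename) to read from a file instead
    return [(section, c.items(section)) for section in c.sections()]
```

with a file on disk, something like "with open(filename) as f: data = parse_colon_lines(f)" would do the same job as the one-liner while also closing the file.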
and most of the one-liners in your pre-PEP can be handled with a
combination of "match" and "finditer". here's a 16-line helper that
parses strings matching the "a(b)*c" pattern into a prefix/list/tail tuple.

    import re

    def parse(string, pat1, pat2):
        """Parse a string having the form pat1(pat2)*"""
        m = re.match(pat1, string)
        i = m.end()
        a = m.group(1)
        b = []
        for m in re.compile(pat2 + "|.").finditer(string, i):
            try:
                token = m.group(m.lastindex)
            except IndexError:
                break
            b.append(token)
            i = m.end()
        return a, b, string[i:]

    >>> parse("hello 1 2 3 4 # 5", r"(\w+)", r"\s*(\d+)")
    ('hello', ['1', '2', '3', '4'], ' # 5')

tweak as necessary.
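
one possible tweak, as a sketch under stated assumptions rather than a recommended implementation: check for a failed prefix match explicitly, test "lastindex" against None instead of relying on the IndexError, and use "[\s\S]" as the fallback alternative so that a newline also stops the scan (a plain "." does not match a newline, so finditer would skip over it and keep looking). the name parse_strict is hypothetical, and both patterns are assumed to contain at least one capturing group.

```python
import re


def parse_strict(string, pat1, pat2):
    """Parse a string of the form pat1(pat2)* into (prefix, items, tail)."""
    m = re.match(pat1, string)
    if m is None:
        raise ValueError("string does not start with pat1")
    prefix = m.group(1)
    pos = m.end()
    items = []
    # The fallback alternative matches any single character (including a
    # newline) without setting a group, so lastindex is None as soon as
    # pat2 stops matching at the current position.
    scanner = re.compile("(?:%s)|[\\s\\S]" % pat2)
    for m in scanner.finditer(string, pos):
        if m.lastindex is None:
            break
        items.append(m.group(m.lastindex))
        pos = m.end()
    return prefix, items, string[pos:]


print(parse_strict("hello 1 2 3 4 # 5", r"(\w+)", r"\s*(\d+)"))
```

the unparsed tail is returned as-is, so the caller decides whether leftover text is an error.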
</F>