Powerful perl paradigm I don't find in python
Peter Otten
__peter__ at web.de
Fri Jan 15 08:34:40 EST 2016
Charles T. Smith wrote:
> What the original snippet does is parse *and consume* a string - actually,
> to avoid maintaining a cursor traverse the string. The perl feature is
> that substitute allows the found pattern to be replaced, but retains the
> group after the expression is complete.
That is too technical for my taste. When is your "paradigm" more useful than
a simple
re.finditer(), re.findall(), or re.split()
?
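For the simple case, a minimal sketch of what I mean, assuming the same sample text and pattern as in the doctest further down (the names text, pattern, pieces and rest are only for illustration):

```python
import re

text = "foo 12 bar 34 baz"
pattern = re.compile(r"[a-z]+ \d+\s*")

# finditer() yields match objects, so the end of the last match is
# available for recovering whatever was left unparsed.
matches = list(pattern.finditer(text))
pieces = [m.group() for m in matches]
rest = text[matches[-1].end():] if matches else text

# findall() returns the matched strings directly (the pattern has no groups).
same_pieces = pattern.findall(text)

# Caveat: both functions *search*, so text between matches is skipped
# silently instead of being reported as an error.
with_gap = pattern.findall("foo 12 ?? bar 34 baz")
```

Here pieces is ['foo 12 ', 'bar 34 '] and rest is 'baz'. The caveat in the last comment is the one place where an explicit consume-from-the-front loop may be preferable.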
>> things = []
>> while some_str != tail:
>>     m = re.match(pattern_str, some_str)
>>     things.append(some_str[:m.end()])
>>     some_str = some_str[m.end():]
If that were common (or even ever occurred) I'd write a helper which avoids
the brittle some_str != tail comparison and exposes the functionality in a
for loop:
import re


class MissingTailError(ValueError):
    pass


class UnparsedRestError(ValueError):
    pass


def shave_off(regex, text, tail=None):
    """
    >>> for s in shave_off(r"[a-z]+ \\d+\\s*",
    ...                    "foo 12 bar 34 baz", tail="baz"):
    ...     s
    'foo 12 '
    'bar 34 '
    """
    # Work out where parsing has to stop: just before the tail, if given.
    if tail is not None:
        if text.endswith(tail):
            end = len(text) - len(tail)
        else:
            raise MissingTailError("%r does not end with %r" % (text, tail))
    else:
        end = len(text)
    start = 0
    r = re.compile(regex)
    # Match repeatedly at the current position; no slicing of text needed.
    while start != end:
        m = r.match(text, start, end)
        if m is None:
            raise UnparsedRestError(
                "%r does not match pattern %r"
                % (text[start:end], r.pattern))
        yield text[m.start():m.end()]
        start = m.end()
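The helper leans on the optional pos and endpos arguments of a compiled pattern's match() method. As a small sketch of what those do, assuming the sample text from the doctest (the names r, text, anchored and searched are only for illustration):

```python
import re

r = re.compile(r"[a-z]+ \d+\s*")
text = "foo 12 bar 34 baz"

# match() is anchored at pos and treats the string as ending at endpos;
# the offsets it reports still refer to the full string.
anchored = r.match(text, 7, 14)
span = (anchored.start(), anchored.end())

# At index 14 only "baz" is left, which the pattern cannot match.
no_match = r.match(text, 14, len(text))

# search() scans forward from pos, match() does not.
searched = r.search(text, 3)
not_anchored = r.match(text, 3)
```

Because match() never scans ahead, a gap in the input shows up as None rather than being skipped, which is what allows the helper to raise UnparsedRestError.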