[Tutor] Regex
Peter Otten
__peter__ at web.de
Tue May 4 08:28:21 EDT 2021
On 04/05/2021 12:19, Alan Gauld via Tutor wrote:
> On 04/05/2021 10:51, Alan Gauld via Tutor wrote:
>>>>> s = "Jan 31 01:33:12 ubuntu.local ticky: ERROR Tried to add
>> information to closed ticket (mcintosh)"
>>>>> start = s.index('ERROR')
>>>>> end = s.index('ticket')+len('ticket')
>>>>> s[start:end]
>> 'ERROR Tried to add information to closed ticket'
>>>>>
>
> Following up my own post...
>
> find() would be better than index() here since it returns -1 if not
> found - which will work on the slice to give the whole remaining string...
>
> Also I should have included start in the second search:
>
> start = s.find('ERROR')
> end = s.find('ticket',start)+len('ticket')
> s[start:end]
If possible, I try to avoid that kind of index arithmetic. str.find()
in particular is a bit of a bug magnet, because the -1 it returns for an
unsuccessful search is itself a valid index.
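To illustrate the problem, here is a small sketch of what happens when the
end token is missing. The sample line is a made-up variant of the one quoted
above without the word "ticket", and the names broken and fixed are only
for illustration.

```python
# Hypothetical log line: same shape as the quoted one, but without "ticket".
s = "Jan 31 01:33:12 ubuntu.local ticky: ERROR Tried to add information (mcintosh)"

start = s.find("ERROR")                        # 36
end = s.find("ticket", start) + len("ticket")  # -1 + 6 == 5, not "end of string"
broken = s[start:end]                          # s[36:5] is the empty string

# One way around it: test for -1 explicitly before doing any arithmetic.
pos = s.find("ticket", start)
end = len(s) if pos == -1 else pos + len("ticket")
fixed = s[start:end]

print(repr(broken))
print(repr(fixed))
```

So the failed search does not give the rest of the string; the slice silently
comes back empty.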
The following function includes the start token and returns everything
up to, but not including, the end token, or the rest of the string if
there's no end token. If the start token is missing it returns an empty
string.
>>> def span(s, start, end):
...     a, b, c = s.partition(start)
...     if not b: return ""
...     return b + c.partition(end)[0]
...
>>> span("before start ... end", "start", "end")
'start ... '
>>> span("before start ...", "start", "end")
'start ...'
>>> span("something else", "start", "end")
''
>>> span("end before start ...", "start", "end")
'start ...'
>>> s = ("Jan 31 01:33:12 ubuntu.local ticky: ERROR "
...      "Tried to add information to closed ticket "
...      "(mcintosh)")
>>> span(s, "ERROR", "(")
'ERROR Tried to add information to closed ticket '
>>> _.strip() # get rid of enclosing whitespace
'ERROR Tried to add information to closed ticket'
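Since the thread is about regexes, here is a rough regex counterpart for
comparison. This is only a sketch: span_re is a name I made up, and the
pattern assumes both tokens are literal strings (hence re.escape) and that
the match may run across newlines (hence re.DOTALL).

```python
import re

def span_re(s, start, end):
    """Return start plus everything up to, but not including, end.

    Falls back to the rest of the string if end is missing and to ""
    if start is missing, mirroring the partition-based span() above.
    """
    # Lazy match after the start token, stopping right before the end
    # token or, failing that, at the very end of the string (\Z).
    pattern = re.escape(start) + r".*?(?=" + re.escape(end) + r"|\Z)"
    match = re.search(pattern, s, re.DOTALL)
    return match.group(0) if match else ""

s = ("Jan 31 01:33:12 ubuntu.local ticky: ERROR "
     "Tried to add information to closed ticket "
     "(mcintosh)")
print(repr(span_re(s, "ERROR", "(").strip()))
```

For fixed tokens like these I would still prefer str.partition(); the regex
only starts to pay off once the tokens themselves need to be patterns.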