[Tutor] Regex

Tue May 4 08:28:21 EDT 2021

On 04/05/2021 12:19, Alan Gauld via Tutor wrote:
> On 04/05/2021 10:51, Alan Gauld via Tutor wrote:
>>>>> s = "Jan 31 01:33:12 ubuntu.local ticky: ERROR Tried to add
>> information to closed ticket (mcintosh)"
>>>>> start = s.index('ERROR')
>>>>> end = s.index('ticket')+len('ticket')
>>>>> s[start:end]
>> 'ERROR Tried to add information to closed ticket'
>>>>>
> 
> Following up my own post...
> 
> find() would be better than index() here since it returns -1 if not
> found - which will work on the slice to give the whole remaining string...
> 
> Also I should have included start in the second search:
> 
> start = s.find('ERROR')
> end = s.find('ticket',start)+len('ticket')
> s[start:end]

If possible I try to avoid that kind of arithmetic. str.find() in
particular, with its return value of -1 for unsuccessful searches, is a
bit of a bug magnet.
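
To illustrate, here is a minimal sketch of what can go wrong. The helper
name find_slice is made up for this example and just wraps the
find()-based arithmetic quoted above; the short test strings are invented
too.

```python
def find_slice(s, start_token, end_token):
    # Hypothetical helper: the find()-based slicing from the quoted post.
    start = s.find(start_token)
    end = s.find(end_token, start) + len(end_token)
    return s[start:end]

s = ("Jan 31 01:33:12 ubuntu.local ticky: ERROR "
     "Tried to add information to closed ticket (mcintosh)")

# Both tokens present: works as intended.
print(repr(find_slice(s, "ERROR", "ticket")))

# End token missing: find() returns -1, so end becomes
# -1 + len("ticket") == 5 and the result is silently cut to 'ERROR'.
print(repr(find_slice("ERROR disk full", "ERROR", "ticket")))

# Start token missing: start is -1 (the last character), end is 5,
# and the slice is empty, with no error raised.
print(repr(find_slice("all is well", "ERROR", "ticket")))
```

Neither failure raises an exception, so a missing token has to be checked
for explicitly before slicing.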

The following function includes the start token and returns everything
up to, but not including, the end token, or the rest of the string if
there's no end token. If the start token is missing it returns an empty
string.

 >>> def span(s, start, end):
	a, b, c = s.partition(start)
	if not b: return ""
	return b + c.partition(end)[0]

 >>> span("before start ... end", "start", "end")
'start ... '
 >>> span("before start ...", "start", "end")
'start ...'
 >>> span("something else", "start", "end")
''
 >>> span("end before start ...", "start", "end")
'start ...'

 >>> s = ("Jan 31 01:33:12 ubuntu.local ticky: ERROR "
          "Tried to add information to closed ticket "
          "(mcintosh)")
 >>> span(s, "ERROR", "(")
'ERROR Tried to add information to closed ticket '
 >>> _.strip()  # strip leading and trailing whitespace
'ERROR Tried to add information to closed ticket'
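
Since the thread is about regular expressions, here is a rough regex
counterpart for comparison. This is only a sketch: span_re is a
hypothetical name, and the pattern is my assumption about how to mirror
span(), not something from the earlier posts.

```python
import re

def span_re(s, start, end):
    # Hypothetical regex counterpart of span().
    # re.escape() makes both tokens match literally; ".*?" is lazy, and the
    # lookahead stops at the end token or, failing that, at the end of s.
    pattern = re.escape(start) + r".*?(?=" + re.escape(end) + r"|\Z)"
    m = re.search(pattern, s, re.DOTALL)
    return m.group() if m else ""

s = ("Jan 31 01:33:12 ubuntu.local ticky: ERROR "
     "Tried to add information to closed ticket "
     "(mcintosh)")
print(repr(span_re(s, "ERROR", "(").strip()))
```

The partition() version needs no escaping and no lookahead, which is a
good part of why I prefer it for fixed tokens like these.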