position of matches

Peter Hansen peter at engcorp.com
Sat Jun 22 20:33:34 EDT 2002


les wrote:
> 
> i am new to python and want to do the following:
> given a string str='abcABCDefsgRSTVderDae'
> i would like to find all the internal lowercase strings between 2 caps
> i.e. pattern=re.compile(r'[A-Z]([a-z]+)[A-Z]')
> 
> match_obj=pattern.search(str)
> begin,end=match_obj.span()
> 
> however i would like to get all the beginning and end positions
> of the pattern,
> i.e.
> efsg  begin=7 end=10
> der   begin=15 end=17

Try this:
>>> import re
>>> s = 'abcABCDefsgRSTVderDae'
>>> m = re.compile(r'[A-Z]([a-z]+)[A-Z]')
>>> for x in m.finditer(s):
...   print('%s\tbegin=%s end=%s' % ((x.group(1),) + x.span(1)))
...
efsg    begin=7 end=11
der     begin=15 end=18

Note that you might want to adjust the "end" index to fit Python's
view of the world.  Python slice indices refer to positions *between*
characters rather than to the characters themselves, so the end of a
span is one past the last matched character.  This lets you use the
span directly in slice notation, as below.  (It also explains why my
code shows 11 and 18 instead of 10 and 17 for the end position.)  A
very recent thread explained this in more detail.

>>> s[7:11]
'efsg'
>>> s[15:18]
'der'
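
If you do want the inclusive end positions from the original question,
something like the following should work.  This is only a minimal
sketch in Python 3 syntax; the names PATTERN and inner_lowercase_spans
are made up for illustration and are not part of the re module.

```python
import re

# Same pattern as in the question: a lowercase run between two capitals.
PATTERN = re.compile(r'[A-Z]([a-z]+)[A-Z]')

def inner_lowercase_spans(text):
    """Return (substring, start, end) tuples; end is half-open (one past the last char)."""
    return [(m.group(1),) + m.span(1) for m in PATTERN.finditer(text)]

s = 'abcABCDefsgRSTVderDae'
results = inner_lowercase_spans(s)
for sub, start, end in results:
    # The half-open span slices straight back to the matched text.
    assert s[start:end] == sub
    # Subtract 1 from end to get the index of the last matched character.
    print('%s\tbegin=%d end=%d (last char at %d)' % (sub, start, end, end - 1))
```

Keeping the half-open span and subtracting 1 only when displaying it
means the same numbers can still be used for slicing.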

(Note: I also used 's' above instead of 'str', since 'str' is
a builtin in 2.2.  Safer not to use it as a variable name.)
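
To see why, here is a small sketch (Python 3 syntax; the function names
shadowed and not_shadowed are hypothetical, chosen just for this
illustration) of what can go wrong once the name is rebound:

```python
def shadowed():
    str = 'abcABCDefsgRSTVderDae'   # local name now hides the builtin str
    try:
        return str(42)              # attempts to call a string object
    except TypeError as exc:
        return 'TypeError: %s' % exc

def not_shadowed():
    s = 'abcABCDefsgRSTVderDae'     # a neutral name leaves str() usable
    return str(len(s))

print(shadowed())
print(not_shadowed())
```

The first call reports a TypeError because the local string is not
callable; the second works because the builtin is still visible.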

-Peter


