[Tutor] Substring substitution
Bernard Lebel
3dbernard at gmail.com
Thu Sep 8 22:15:02 CEST 2005
Hi Kent,
This is nice!
There is one thing though. When I run the oRe.sub() call, I get an error:
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "\\Linuxserver\prod\XSI\WORKGROUP_4.0\Data\Scripts\pipeline\filesystem\bb_processshotdigits.py", line 63, in ?
    processPath( r'C:\temp\MT_03_03_03\allo.txt', False )
  File "\\Linuxserver\prod\XSI\WORKGROUP_4.0\Data\Scripts\pipeline\filesystem\bb_processshotdigits.py", line 45, in processPath
    else: sSubString = matchShot( sSubString )
  File "\\Linuxserver\prod\XSI\WORKGROUP_4.0\Data\Scripts\pipeline\filesystem\bb_processshotdigits.py", line 11, in matchShot
    sNewString = oRe.sub( r'\10\2', sSubString )
  File "D:\Python24\Lib\sre.py", line 260, in filter
    return sre_parse.expand_template(template, match)
  File "D:\Python24\Lib\sre_parse.py", line 781, in expand_template
    raise error, "invalid group reference"
sre_constants.error: invalid group reference
This is my new match function:
def matchShot( sSubString ):
    # Create regular expression object
    oRe = re.compile( "(\d\d_\d\d\_)(\d\d)" )
    oMatch = oRe.search( sSubString )
    if oMatch != None:
        sNewString = oRe.sub( r'\10\2', sSubString )
        return sNewString
    else:
        return sSubString
I have read the sub() documentation entry, but I have to confess that
it made things more confusing for me...
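A note on the error above: in a sub() replacement template, \10 is read as a reference to group 10, not as group 1 followed by a literal 0. The pattern only defines two groups, which would explain the "invalid group reference". The \g<1> form removes the ambiguity. Below is a minimal sketch, assuming the goal is to pad the last two-digit number to three digits. The names SHOT_RE, match_shot and match_shot_callable are made up for illustration, and the second function is just the callable variant Kent describes in the quoted message.

```python
import re

# Two groups: the leading "dd_dd_" part and the final two-digit shot number.
SHOT_RE = re.compile(r"(\d\d_\d\d_)(\d\d)")


def match_shot(sub_string):
    # \g<1> refers to group 1, so the "0" after it is a literal character.
    # Writing r'\10\2' instead asks for group 10, which this pattern lacks.
    return SHOT_RE.sub(r"\g<1>0\2", sub_string)


def match_shot_callable(sub_string):
    # Same idea with a callable replacement: pad group 2 to three digits.
    return SHOT_RE.sub(lambda m: m.group(1) + m.group(2).zfill(3), sub_string)


print(match_shot("mt_03_04_04_anim"))  # mt_03_04_004_anim
```

sub() already returns the string unchanged when nothing matches, so the separate search() check is not strictly needed.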
Thanks again
Bernard
On 9/8/05, Kent Johnson <kent37 at tds.net> wrote:
> Bernard Lebel wrote:
> > Hello,
> >
> > I have a string, and I use a regular expression to search for a match in
> > it. When I find one, I would like to break down the string, using the
> > matched part of it, to be able to perform some formatting and to later
> > build a brand new string with the separate parts.
> >
> > The regular expression part works ok, but my problem is to extract the
> > matched pattern from the string. I'm not sure how to do that...
> >
> >
> > sString = 'mt_03_04_04_anim'
> >
> > # Create regular expression object
> > oRe = re.compile( "\d\d_\d\d\_\d\d" )
> >
> > # Break-up path
> > aString = sString.split( os.sep )
> >
> > # Iterate individual components
> > for i in range( 0, len( aString ) ):
> >
> >     sSubString = aString[i]
> >
> >     # Search with shot number of 2 digits
> >     oMatch = oRe.search( sSubString )
> >
> >     if oMatch != None:
> >         # Replace last sequence of two digits by 3 digits!!
>
> Hi Bernard,
>
> It sounds like you need to put some groups into your regex and use re.sub().
>
> By putting groups in the regex you can refer to pieces of the match. For example
>
> >>> import re
> >>> s = 'mt_03_04_04_anim'
> >>> oRe = re.compile( "(\d\d_\d\d\_)(\d\d)" )
> >>> m = oRe.search(s)
> >>> m.group(1)
> '03_04_'
> >>> m.group(2)
> '04'
>
> With re.sub(), you provide a replacement pattern that can refer to the groups from the match pattern. So inserting new characters between the groups is easy:
>
> >>> oRe.sub(r'\1XX\2', s)
> 'mt_03_04_XX04_anim'
>
> This may be enough power to do what you want, I'm not sure from your description. But re.sub() has another trick up its sleeve - the replacement 'expression' can be a callable which is passed the match object and returns the string to replace it with. For example, if you wanted to find all the two digit numbers in a string and add one to them, you could do it like this:
>
> >>> def incMatch(m):
> ...     s = m.group(0) # use the whole match
> ...     return str(int(s)+1).zfill(2)
> ...
> >>> re.sub(r'\d\d', incMatch, '01_09_23')
> '02_10_24'
>
> This capability can be used to do complicated replacements.
>
> Kent