[Tutor] strings matching problem

Clay Shirky clay at shirky.com
Thu Aug 7 23:08:15 EDT 2003


> Hello !
> 
> First of all thanks to everyone on the list for giving consideration to my
> problems. With the help of the tutor mailing list I am really learning a lot
> about programming in Python.
> 
> I have a small problem for now also.
> 
> I have a few names that consist of a name followed by a hash (#) and then a
> number, which indicates the repeated occurrence of the name.
> 
> Like .abc ,  .abc#1 ,  .abc#2
> 
> If I encounter a name of this kind, I want it to be converted to
> .abc
> 
> I think that this can be done with re.compile and some kind of match with
> re.match

You want re.sub (substitution) rather than re.match, and you only need
re.compile if you are doing a *lot* of matches and speed is an issue, so
if it's just a one-off renaming of a few hundred files, you can skip
re.compile.
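
For comparison, if speed ever did become an issue, the compiled form might
look something like the sketch below. It assumes the ".abc#N" style names
from the question; suffix_re and strip_counter are made-up names for
illustration, not anything from the re module.

```python
import re

# Hypothetical name: compile the pattern once, outside any loop.
suffix_re = re.compile(r"abc#\d+$")

def strip_counter(name):
    """Return name with a trailing '#<digits>' after 'abc' removed."""
    return suffix_re.sub("abc", name)

# Assumed sample names, modelled on the ones in the question.
names = ["spam.abc", "eggs.abc#1", "spam_eggs.abc#2"]
cleaned = [strip_counter(n) for n in names]
print(cleaned)
```

The re module also keeps a cache of recently used patterns, which is part of
why the plain re.sub call is usually fine for small jobs.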

Here's a sample using re.sub:

import re  # the regular expression module

# dummy file list
file_list = ["spam.abc", "eggs.abc#1", "spam_eggs.abc#2"]

for f in file_list:
    # replace "abc#" + digits at the end of the name with "abc"
    new_f = re.sub(r"abc#\d+$", "abc", f)
    print(f, "->", new_f)

Note that the regular expression, abc#\d+$, is pretty narrowly tailored to
the example you gave. It matches "abc followed by a hash mark followed by one
or more digits, at the end of the string", so if the names are more varied
(sometimes there's a hash but no number, or characters after the number,
or letters after the hash), you will need to change the regular expression.
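
As a rough sketch of what "changing the regular expression" could look like,
here are two looser variants. Both are guesses at what the more general names
might be; the function names and the sample list are invented for
illustration, and which behaviour you want depends on your real data.

```python
import re

def strip_hash_suffix(name):
    # Assumption: everything from "#" to the end is a counter made of
    # letters, digits, or underscores (possibly empty), and should go.
    return re.sub(r"#\w*$", "", name)

def strip_hash_number_anywhere(name):
    # Assumption: only "#<digits>" should go, wherever it appears,
    # and any characters after the number should be kept.
    return re.sub(r"#\d+", "", name)

# Made-up sample names covering the cases mentioned above.
samples = [".abc", ".abc#1", ".abc#", ".abc#2b", ".abc#3.bak"]
for s in samples:
    print("%s -> %s | %s" % (s, strip_hash_suffix(s),
                             strip_hash_number_anywhere(s)))
```

The first variant leaves ".abc#3.bak" alone because the hash part is not at
the end of the string; the second leaves ".abc#" alone because there are no
digits after the hash.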

-clay



