How to escape # hash character in regex match strings

Peter Otten __peter__ at web.de
Wed Jun 10 11:31:40 EDT 2009


504crank at gmail.com wrote:

> I've encountered a problem with my RegEx learning curve -- how to
> escape hash characters # in strings being matched, e.g.:
> 
> >>> string = re.escape('123#abc456')
> >>> match = re.match('\d+', string)
> >>> print match
> 
> <_sre.SRE_Match object at 0x00A6A800>
> >>> print match.group()
> 
> 123
> 
> The correct result should be:
> 
> 123456

>>> "".join(re.findall("\d+", "123#abc456"))
'123456'
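
As a rough sketch of why the original attempt stopped at 123 (written for
Python 3, whereas the session above is Python 2; the variable names are only
illustrative): re.match() tries the pattern at the start of the string only,
and \d+ ends at the first non-digit, while re.findall() returns every
non-overlapping run of digits.

```python
import re

text = "123#abc456"

# re.match() only tries at the start of the string, and \d+ stops at the
# first non-digit, so only the leading run of digits is returned.
first_run = re.match(r"\d+", text).group()

# re.findall() returns every non-overlapping run of digits.
all_runs = re.findall(r"\d+", text)
digits = "".join(all_runs)

# One possible alternative: delete everything that is not a digit.
digits_via_sub = re.sub(r"\D+", "", text)

print(first_run, all_runs, digits, digits_via_sub)
```

Either approach leaves the input string untouched; the '#' never has to be
escaped for this.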

> I've tried to escape the hash symbol in the match string without
> result.
> 
> Any ideas? Is the answer something I overlooked in my lurching Python
> schooling?

re.escape() is meant for the pattern, not for the string being searched: it
builds a regex from a string that may contain characters that have a special
meaning in regular expressions but that you want to treat as literals. You
can, for example, search for r"C:\dir" with

>>> re.compile(re.escape(r"C:\dir")).findall(r"C:\dir C:7ir")
['C:\\dir']

Without escaping you'd get

>>> re.compile(r"C:\dir").findall(r"C:\dir C:7ir")
['C:7ir']
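
The same comparison can be written as a small script. This is only a sketch
in Python 3 syntax with made-up variable names. It also shows that applying
re.escape() to the string being searched merely adds a backslash to the data,
and that, as far as I know, '#' is only special when the re.VERBOSE flag is
set, where it starts a comment.

```python
import re

subject = r"C:\dir C:7ir"

# Pattern escaped: the backslash is matched literally.
literal_hits = re.findall(re.escape(r"C:\dir"), subject)

# Pattern not escaped: \d is read as "any digit".
digit_hits = re.findall(r"C:\dir", subject)

# Escaping the *subject* just inserts a backslash before the '#'.
escaped_subject = re.escape("123#abc456")

# Without re.VERBOSE a '#' in the pattern is an ordinary character.
hash_plain = re.search(r"\d+#", "123#abc456").group()

# With re.VERBOSE an unescaped '#' starts a comment, so a literal one
# has to be written as \#.
hash_verbose = re.search(
    r"\d+\#  # digits followed by a literal hash",
    "123#abc456",
    re.VERBOSE,
).group()

print(literal_hits, digit_hits, escaped_subject, hash_plain, hash_verbose)
```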

Peter



