regex question on .findall and \b

Nobody nobody at nowhere.com
Thu Jul 2 14:41:54 EDT 2009


On Thu, 02 Jul 2009 09:38:56 -0700, Ethan Furman wrote:

> Greetings!
> 
> My closest to successful attempt:
> 
> Python 2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit (Intel)]
> Type "copyright", "credits" or "license" for more information.
> 
> IPython 0.9.1 -- An enhanced Interactive Python.
> 
>    In [161]: re.findall('\d+','this is test a3 attempt 79')
>    Out[161]: ['3', '79']
> 
> What I really want is just the 79, as a3 is not a decimal number, but
> when I add the \b word boundaries I get:
> 
>    In [162]: re.findall('\b\d+\b','this is test a3 attempt 79')
>    Out[162]: []
> 
> What am I missing?

You need to use a raw string (r'...') to prevent \b from being interpreted
as a backspace:

	re.findall(r'\b\d+\b','this is test a3 attempt 79')

\d isn't a recognised string escape sequence, so Python leaves the
backslash in place and the regex engine still sees \d:

	> print '\b'
	^H
	> print '\d'
	\d
	> print r'\b'
	\b

Try to get into the habit of using raw strings for regexps.
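
To see the difference side by side, here is a rough sketch. It is written
for Python 3 (the session above is Python 2.5), and the names text,
non_raw and raw are only for illustration. The non-raw pattern doubles
the backslash before d so that only the \b handling differs:

```python
import re

text = 'this is test a3 attempt 79'

# Non-raw literal: Python converts each '\b' into a backspace (0x08)
# before the re module ever sees the pattern.
non_raw = '\b\\d+\b'

# Raw literal: the backslashes reach the regex engine untouched,
# so \b is read as a word boundary.
raw = r'\b\d+\b'

print(repr(non_raw))               # shows the backspaces as \x08
print(repr(raw))
print(re.findall(non_raw, text))   # looks for backspace, digits, backspace
print(re.findall(raw, text))       # looks for digits between word boundaries
```

The raw pattern skips the 3 in a3 because there is no word boundary
between a letter and a digit.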



