Regex help needed!

MRAB python at mrabarnett.plus.com
Mon Dec 21 10:49:49 EST 2009


Oltmans wrote:
> Hello, everyone.
> 
> I've a string that looks something like
> ----
> lksjdfls <div id ='amazon_345343'> kdjff lsdfs </div> sdjfls <div id
> =   "amazon_35343433">sdfsd</div><div id='amazon_8898'>welcome</div>
> ----
> 
> From the above string I need the digits within the ID attribute. For
> example, the required output from the above string is
> - 35343433
> - 345343
> - 8898
> 
> I've written this regex that's kind of working
> re.findall("\w+\s*\W+amazon_(\d+)",str)
> 
> but I was just wondering whether there might be a better regex to do the
> same thing. Can you kindly suggest a better/improved regex? Thank you
> in advance.

Try:

     re.findall(r"""<div\s*id\s*=\s*['"]amazon_(\d+)['"]>""", str)

You shouldn't use 'str' as a variable name because it hides the
built-in string class 'str'.
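
For illustration, here is a minimal sketch that runs the pattern above
against the sample text from the question. The names 'text' and 'pattern'
are my own choices, not from the original post; any name that doesn't
collide with a builtin would do.

```python
import re

# Sample text from the question, stored under a name that does not
# shadow the builtin 'str' (the name 'text' is an assumption).
text = (
    "lksjdfls <div id ='amazon_345343'> kdjff lsdfs </div> sdjfls <div id\n"
    "=   \"amazon_35343433\">sdfsd</div><div id='amazon_8898'>welcome</div>"
)

# Same pattern as suggested above. \s also matches a newline, so the
# second tag, which is wrapped across two lines, is still found.
pattern = re.compile(r"""<div\s*id\s*=\s*['"]amazon_(\d+)['"]>""")

# findall returns only the captured group, i.e. the digits after 'amazon_'.
ids = pattern.findall(text)
print(ids)  # ['345343', '35343433', '8898']
```

The IDs come back in the order they appear in the text. Anchoring on the
tag and attribute name, rather than on \w+ and \W+, should make it less
likely that some other 'amazon_' occurrence elsewhere in the text is
picked up by accident.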


