Regex help needed!

mik3 mik3l3374 at gmail.com
Mon Dec 21 07:42:00 EST 2009


On Dec 21, 7:38 pm, Oltmans <rolf.oltm... at gmail.com> wrote:
> Hello, everyone.
>
> I've a string that looks something like
> ----
> lksjdfls <div id ='amazon_345343'> kdjff lsdfs </div> sdjfls <div id
> =   "amazon_35343433">sdfsd</div><div id='amazon_8898'>welcome</div>
> ----
>
> From the above string I need the digits within the ID attribute. For
> example, the required output from the above string is
> - 35343433
> - 345343
> - 8898
>
> I've written this regex that's kind of working
> re.findall("\w+\s*\W+amazon_(\d+)",str)
>
> but I was just wondering whether there might be a better regex to do the
> same thing. Can you kindly suggest a better/improved regex? Thank you
> in advance.

You don't need a regular expression. Just split the string on "amazon_":

>>> s="""lksjdfls <div id =\'amazon_345343\'> kdjff lsdfs </div> sdjfls <div id =   "amazon_35343433">sdfsd</div><div id=\'amazon_8898\'>welcome</div>"""

>>> for item in s.split("amazon_")[1:]:
...   print item
...
345343'> kdjff lsdfs </div> sdjfls <div id =   "
35343433">sdfsd</div><div id='
8898'>welcome</div>

Then find the index of the first ' or " in each piece and slice up to it to get the digits.
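
A minimal sketch of that last step, assuming every ID is followed directly by a closing ' or ". The function name amazon_ids and its marker parameter are made up for illustration, and the code is written so it should run on Python 3 as well as on 2.x:

```python
def amazon_ids(text, marker="amazon_"):
    """Return the digit strings that follow each occurrence of marker."""
    ids = []
    # Everything before the first marker is irrelevant, hence [1:].
    for piece in text.split(marker)[1:]:
        # Positions of the first single and double quote, if present.
        ends = [i for i in (piece.find("'"), piece.find('"')) if i != -1]
        if not ends:
            continue  # no closing quote in this piece, so skip it
        candidate = piece[:min(ends)]  # slice up to the nearest quote
        if candidate.isdigit():
            ids.append(candidate)
    return ids


s = """lksjdfls <div id ='amazon_345343'> kdjff lsdfs </div> sdjfls <div id =   "amazon_35343433">sdfsd</div><div id='amazon_8898'>welcome</div>"""
print(amazon_ids(s))
```

The IDs come back in the order they appear in the string. The isdigit() check drops any piece where "amazon_" is followed by something other than digits.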


