Regex help needed!
mik3
mik3l3374 at gmail.com
Mon Dec 21 07:42:00 EST 2009
On Dec 21, 7:38 pm, Oltmans <rolf.oltm... at gmail.com> wrote:
> Hello, everyone.
>
> I've a string that looks something like
> ----
> lksjdfls <div id ='amazon_345343'> kdjff lsdfs </div> sdjfls <div id
> = "amazon_35343433">sdfsd</div><div id='amazon_8898'>welcome</div>
> ----
>
> From above string I need the digits within the ID attribute. For
> example, required output from above string is
> - 35343433
> - 345343
> - 8898
>
> I've written this regex that's kind of working
> re.findall("\w+\s*\W+amazon_(\d+)",str)
>
> but I was just wondering whether there might be a better regex to do the
> same thing. Can you kindly suggest a better/improved regex? Thank you
> in advance.
You don't need a regular expression. Just split on "amazon_":
>>> s="""lksjdfls <div id =\'amazon_345343\'> kdjff lsdfs </div> sdjfls <div id = "amazon_35343433">sdfsd</div><div id=\'amazon_8898\'>welcome</div>"""
>>> for item in s.split("amazon_")[1:]:
...     print(item)
...
345343'> kdjff lsdfs </div> sdjfls <div id = "
35343433">sdfsd</div><div id='
8898'>welcome</div>
Then find the index of the first ' or " in each item and slice up to it.
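
For what it's worth, here is a rough sketch of that last step. The function name extract_ids and its marker parameter are made up for illustration, and it assumes every ID is terminated by a single or double quote, as in the sample string. Items with no quote after the marker are skipped.

```python
s = """lksjdfls <div id ='amazon_345343'> kdjff lsdfs </div> sdjfls <div id = "amazon_35343433">sdfsd</div><div id='amazon_8898'>welcome</div>"""

def extract_ids(text, marker="amazon_"):
    """Return the text between each marker and the next quote character."""
    ids = []
    # Everything before the first marker is discarded by [1:].
    for item in text.split(marker)[1:]:
        # Locate the first ' and the first " in this piece (-1 if absent).
        positions = [item.find(q) for q in ("'", '"')]
        positions = [p for p in positions if p != -1]
        if positions:
            # Slice up to whichever quote comes first.
            ids.append(item[:min(positions)])
    return ids

print(extract_ids(s))
```

This returns the IDs as strings in the order they appear in the input. It does not check that the extracted text is all digits, so add item[:n].isdigit() if that matters.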