<br>It's an extremely bad idea to use regex for HTML. Change one tiny little thing and you have to write the regex all over again. If it's a throwaway script, then go ahead. <br><div class="gmail_quote">2010/3/20 Luis M. González <span dir="ltr"><<a href="mailto:luismgz@gmail.com">luismgz@gmail.com</a>></span><br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div><div></div><div class="h5">On Mar 20, 12:04 am, Jimbo <<a href="mailto:nill...@yahoo.com">nill...@yahoo.com</a>> wrote:<br>


> Hello<br>

><br>

> I am trying to grab some numbers from a string containing HTML text.<br>

> Can you suggest any good functions that I could use to do this? What<br>

> would be the easiest way to extract the following numbers from this<br>

> string...<br>

><br>

> My String has this layout & I have commented what I want to grab:<br>

> [CODE] """</th><br>

>                                 <td class="last">43.200 </td><br>

>                                 <td class="change indicator" nowrap>0.040 </td><br>

><br>

>                                                    <td>43.150 </td> # I need to grab this number only<br>

>                                 <td>43.200 </td><br>

>                                                    <td>43.130 </td> # I need to grab this number only<br>

>                                 <td>43.290 </td><br>

>                                                    <td>43.100 </td> # I need to grab this number only<br>

>                                 <td>7,450,447 </td><br>

>                                 <td class="middle"><a<br>

>                                         href="/asx/markets/optionPrices.do?<br>

> by=underlyingCode&underlyingCode=BHP&expiryDate=&optionType=">Options</<br>

> a></td><br>

>                                 <td class="middle"><a<br>

>                                         href="/asx/markets/warrantPrices.do?<br>

> by=underlyingAsxCode&underlyingCode=BHP">Warrants & Structured<br>

> Products</a></td><br>

>                                 <td class="middle"><a<br>

>                                         href="/asx/markets/cfdPrices.do?<br>

> by=underlyingAsxCode&underlyingCode=BHP">CFDs</a></td><br>

>                                 <td class="middle"><a href="http://hfgapps.hubb.com/asxtools/<br>

> Charts.aspx?<br>

> TimeFrame=D6&compare=comp_index&indicies=XJO&pma1=20&pma2=20&asxCode=BHP">< img<br>

> src="/images/chart.gif" border="0" height="15" width="15"></a><br>

> </td><br>

>                                 <td><a href="/research/announcements/status_notes.htm#XD">XD</a><br>

>                                 </td><br>

>                                 <td><a href="/asx/statistics/announcements.do?<br>

> by=asxCode&asxCode=BHP&timeframe=D&period=W">Recent</a><br>

> </td><br>

>                         </tr>"""[/CODE]<br>

<br>

<br>

</div></div>You should use BeautifulSoup or perhaps regular expressions.<br>

Or if you are not very smart, like me, just try a brute force approach:<br>

<br>

>>> for i in s.split('>'):<br>

        for e in i.split():<br>

                if '.' in e and e[0].isdigit():<br>

                        print (e)<br>

<br>

<br>

43.200<br>

0.040<br>

43.150<br>

43.200<br>

43.130<br>

43.290<br>

43.100<br>

<div><div></div><div class="h5">>>><br>


</div></div></blockquote></div><br>
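For anything less throwaway, a parser-based version is not much longer than the brute-force loop quoted above. The sketch below is only one way to do it: it uses the standard library's html.parser instead of BeautifulSoup so that it runs without installing anything, and it assumes Python 3. The names PlainCellCollector, plain_cells and sample are made up for this example, and sample is a shortened stand-in for the string in the question. It also assumes the wanted numbers are always the 1st, 3rd and 5th attribute-free <td> cells, which is only what the posted snippet suggests.

```python
from html.parser import HTMLParser


class PlainCellCollector(HTMLParser):
    """Collect the text of <td> cells that carry no attributes."""

    def __init__(self):
        super().__init__()
        self.cells = []         # text of each attribute-free <td>
        self._in_cell = False   # True while inside such a cell

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            # Skip cells such as <td class="last">; keep bare <td> only.
            self._in_cell = not attrs
            if self._in_cell:
                self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def plain_cells(html):
    """Return the stripped text of every attribute-free <td> in html."""
    parser = PlainCellCollector()
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells]


# Shortened stand-in for the string in the question (an assumption).
sample = """
<td class="last">43.200 </td>
<td class="change indicator" nowrap>0.040 </td>
<td>43.150 </td>
<td>43.200 </td>
<td>43.130 </td>
<td>43.290 </td> <td>43.100 </td>
<td>7,450,447 </td>
"""

cells = plain_cells(sample)
# Assumed positions of the three wanted numbers among the bare cells.
wanted = [cells[i] for i in (0, 2, 4)]
print(wanted)
```

If the page layout changes, only the tuple of positions (or the attribute test in handle_starttag) needs adjusting, which is the point of not using a regex here.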