Python Regex Question
crybaby
joemystery123 at gmail.com
Thu Sep 20 19:52:16 EDT 2007
On Sep 20, 4:12 pm, Tobiah <t... at tobiah.org> wrote:
> joemystery... at gmail.com wrote:
> > I need to extract the number inside each <td> tag from an HTML file.
>
> > i.e 49.950 from the following:
>
> > <td align=right width=80><font size=2 face="New Times
> > Roman,Times,Serif"> 49.950 </font></td>
>
> > The actual number between: 49.950 can be any number of
> > digits before decimal and after decimal.
>
> > <td align=right width=80><font size=2 face="New Times
> > Roman,Times,Serif"> ######.#### </font></td>
>
> > How can I just extract the real/integer number using regex?
>
> '[0-9]*\.[0-9]*'
>
> --
> Posted via a free Usenet account from http://www.teranews.com
I am trying to use BeautifulSoup:
soup = BeautifulSoup(page)
td_tags = soup.findAll('td')
i=0
for td in td_tags:
    i = i+1
    print "td: ", td
    # re.search('[0-9]*\.[0-9]*', td)
    price = re.compile('[0-9]*\.[0-9]*').search(td)
I am getting an error:
price= re.compile('[0-9]*\.[0-9]*').search(td)
TypeError: expected string or buffer
Does BeautifulSoup return an array of objects? If so, how do I pass
the "td" instance as a string to re.search? What is the difference between
re.search and re.compile().search?
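For reference, here is a minimal sketch of one way around the TypeError. It assumes that str(td) gives the tag's markup as a plain string, so the literal string below stands in for str(td) and BeautifulSoup itself is not needed to run it. The names NUMBER_IN_TAG and extract_number are hypothetical, and the pattern is a slightly stricter variant of the one suggested above, not the original.

```python
import re

# Capture a number (integer or decimal) that sits between '>' and '<',
# so attribute values such as width=80 are not picked up.
NUMBER_IN_TAG = re.compile(r">\s*(\d+(?:\.\d+)?)\s*<")


def extract_number(markup):
    """Return the first number found between tags as a string, or None."""
    match = NUMBER_IN_TAG.search(markup)
    return match.group(1) if match else None


# Stand-in for str(td): the regex functions need a string, not a Tag object.
td_markup = ('<td align=right width=80><font size=2 face="New Times '
             'Roman,Times,Serif"> 49.950 </font></td>')

price = extract_number(td_markup)

# re.search(pattern, s) finds the same match as re.compile(pattern).search(s);
# compiling once just avoids restating the pattern when it is reused in a loop.
same = re.search(NUMBER_IN_TAG.pattern, td_markup).group(1)

print(price, same)
```

In the loop above, that would presumably mean calling extract_number(str(td)) for each td, then converting the result with float() if a numeric value is needed.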