Html character entity conversion
Marc 'BlackJack' Rintsch
bj_666 at gmx.net
Sun Jul 30 16:53:51 EDT 2006
In <1154266972.154519.175040 at m73g2000cwd.googlegroups.com>,
pak.andrei at gmail.com wrote:
> Here is my script:
>
> from mechanize import *
> from BeautifulSoup import *
> import StringIO
> b = Browser()
> f = b.open("http://www.translate.ru/text.asp?lang=ru")
> b.select_form(nr=0)
> b["source"] = "hello python"
> html = b.submit().get_data()
> soup = BeautifulSoup(html)
> print soup.find("span", id = "r_text").string
>
> OUTPUT:
> &#1087;&#1088;&#1080;&#1074;&#1077;&#1090;
> &#1087;&#1080;&#1090;&#1086;&#1085;
> ----------
> In russian it looks like:
> "привет питон"
>
> How can I translate this using standard Python libraries??
Have you tried a more recent version of BeautifulSoup? IIRC current
versions always decode text to unicode objects before returning them.
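
As for doing the conversion with the standard library alone, here is a
minimal sketch. It assumes Python 3.4 or later, which is newer than the
interpreter the script above targets; there, html.unescape() replaces
numeric and named character references with the characters they stand
for. The sample string is made up for illustration and is not taken from
the actual page.

```python
import html

# Hypothetical sample: numeric character references spelling two Russian words.
raw = ("&#1087;&#1088;&#1080;&#1074;&#1077;&#1090; "
       "&#1087;&#1080;&#1090;&#1086;&#1085;")

# html.unescape() decodes both numeric (&#1087;) and named (&amp;) references.
text = html.unescape(raw)
print(text)
```

The result is an ordinary str, so it can be printed or encoded like any
other text.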
Ciao,
Marc