[Tutor] [Fwd: Re: Spanish text in BS problem]
Ismael Garrido
ismaelgf at adinet.com.uy
Thu Nov 10 02:35:54 CET 2005
I found the problem myself (see below).
Ismael Garrido wrote:
> This is the script:
>
> import BeautifulSoup
> import os
>
> a = open("zona.htm")
> text = a.readlines()
> a.close()
>
> BS = BeautifulSoup.BeautifulSoup(str(text))
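Apparently, str(text) is the cause of the problem. Since text is the
list returned by readlines(), str(text) produces the printable
representation of the whole list (brackets, quotes and escapes
included), so 'ó' turns into the literal characters '\xf3'. If I use
"".join(text) instead, it all works all right, because "".join() just
concatenates the lines without changing them in any way. Now the
output from BS makes sense.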
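
To make the difference concrete, here is a minimal sketch of what the
two calls produce. It is an illustration, not the original script: the
sample lines are made up, and it is written for Python 3, where bytes
objects stand in for the plain byte strings that readlines() returned
in my Python 2 script. Latin-1 is assumed as the file's encoding, since
that is where 'ó' is the single byte \xf3.

```python
# Hypothetical sample data standing in for the result of a.readlines();
# bytes are used so that Python 3 behaves like the Python 2 str type.
lines = [b"canci\xf3n\n", b"tango\n"]

# str() on a list gives the repr of the list: brackets, quotes and
# backslash escapes all become ordinary characters in the result.
as_repr = str(lines)

# join() concatenates the elements and leaves every byte untouched.
joined = b"".join(lines)

print(as_repr)  # the list syntax itself is now part of the "document"
print(joined)   # the original file contents, ready for the parser

# Decoding is only valid under the Latin-1 assumption stated above.
text = joined.decode("latin-1")
```

Reading the file in one go with a.read() instead of a.readlines() would
avoid the list altogether and give the same result as the join.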
Bye,
Ismael
> for ed in BS('span', {'class':'ed_ant_fecha'}):
>     fecha = ed.next.split(" ")[1].replace(".","-")
>     urlynombre = ed.findNextSibling().findNextSibling().findNextSibling()
>     url = 'http://espectador.com/' + urlynombre.get('href')
>     nombre = urlynombre.next.next
>
>     print url
>     print "D:/dolina/"+fecha, nombre
>     print
> ###end