[Tutor] [Fwd: Re: Spanish text in BS problem]
Hugo González Monteverde
hugonz-lists at h-lab.net
Thu Nov 10 14:38:40 CET 2005
Hi Ismael,
I'm glad you found the answer. It is very enlightening, as I thought it
had to do with the locale and ran some tests without reproducing the
problem (I do not have BS installed now).
As I speak Spanish this had got me worried. :/
Hugo
Ismael Garrido wrote:
> Found the problem myself.
> (look down)
>
> Ismael Garrido wrote:
>
>
>>This is the script:
>>
>>import BeautifulSoup
>>import os
>>
>>a = open("zona.htm")
>>text = a.readlines()
>>a.close()
>>
>>BS = BeautifulSoup.BeautifulSoup(str(text))
>
>
> Apparently, str(text) is the cause of the problem. If I instead use
> "".join(text), it all works all right. I guess this is because str
> converts 'ó' to '\xf3' while "".join() does not change the strings in
> any way. Now the output from BS makes sense.
>
> Bye,
> Ismael
>
>
>>for ed in BS('span', {'class':'ed_ant_fecha'}):
>> fecha = ed.next.split(" ")[1].replace(".","-")
>> urlynombre = ed.findNextSibling().findNextSibling().findNextSibling()
>> url = 'http://espectador.com/' + urlynombre.get('href')
>> nombre = urlynombre.next.next
>>
>> print url
>> print "D:/dolina/"+fecha, nombre
>> print
>>###end
>
>
>
> _______________________________________________
> Tutor maillist - Tutor at python.org
> http://mail.python.org/mailman/listinfo/tutor
>
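The difference Ismael describes between str(text) and "".join(text) can be
reproduced without BS or the original file. The following is only a minimal
sketch: the two sample lines are a hypothetical stand-in for what
a.readlines() returned, and it is written for Python 3, whereas the script
above is Python 2. In Python 2, str() on a list of byte strings takes the
repr of each element, which is where escapes such as '\xf3' come from; in
Python 3, repr leaves printable non-ASCII characters alone, so ascii() is
used here to approximate the Python 2 result.

```python
# Hypothetical stand-in for the list returned by a.readlines() on zona.htm.
lines = ["<span>Edici\u00f3n</span>\n", "<p>fin</p>\n"]

# str() of a list is the list's repr: brackets, quotes, commas,
# and each newline turned into the two characters backslash + n.
as_str = str(lines)

# "".join() simply concatenates the lines, leaving the text untouched.
as_joined = "".join(lines)

# ascii() also escapes non-ASCII characters, roughly what Python 2's
# str() did to a list of byte strings.
as_ascii = ascii(lines)

print(as_str)
print(as_joined)
print(as_ascii)
```

Only the joined version is still the original HTML; the other two are a
Python-literal rendering of the list, which is presumably what confused the
parser.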