[Tutor] [Fwd: Re: Spanish text in BS problem]

Hugo González Monteverde hugonz-lists at h-lab.net
Thu Nov 10 14:38:40 CET 2005


Hi Ismael,

I'm glad you found the answer. It is very enlightening, as I thought it 
had to do with locale and ran some tests without reproducing the problem (I 
do not have BS installed now).

As I speak Spanish, this had me worried. :/

Hugo



Ismael Garrido wrote:
> Found the problem myself.
> (look down)
> 
> Ismael Garrido wrote:
> 
> 
>>This is the script:
>>
>>import BeautifulSoup
>>import os
>>
>>a = open("zona.htm")
>>text = a.readlines()
>>a.close()
>>
>>BS = BeautifulSoup.BeautifulSoup(str(text))
> 
> 
> Apparently, str(text) is the cause of the problem. If instead I do 
> "".join(text), it all works all right. I guess this is because str() 
> on a list returns the repr of the whole list, which turns 'ó' into 
> '\xf3', while "".join() does not change the strings in any way. Now 
> the output from BS makes sense.
> 
> Bye,
> Ismael
> 
> 
>>for ed in BS('span', {'class':'ed_ant_fecha'}):
>>   fecha = ed.next.split(" ")[1].replace(".","-")
>>   urlynombre = ed.findNextSibling().findNextSibling().findNextSibling()
>>   url = 'http://espectador.com/' + urlynombre.get('href')
>>   nombre = urlynombre.next.next
>>
>>   print url
>>   print "D:/dolina/"+fecha, nombre
>>   print
>>###end
> 
> 
> 
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> http://mail.python.org/mailman/listinfo/tutor
> 
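
For anyone reading this later, here is a minimal sketch of the difference 
Ismael describes. It is written for Python 3, where bytes objects stand in 
for the 8-bit strings of the original script. The sample lines and the 
Latin-1 encoding are assumptions made for illustration; they are not taken 
from zona.htm.

```python
# Stand-in for what a.readlines() returned in the original script.
# Sample content and Latin-1 encoding are assumed, not from the real file.
lines = [b"<span>Versi\xf3n</span>\n", b"<b>Dolina</b>\n"]

# str() on a list gives the repr of the list: brackets, quotes,
# and escape sequences such as \xf3 and \n spelled out as literal text.
as_repr = str(lines)

# join() just concatenates the items and leaves their contents alone.
as_text = b"".join(lines)

print(as_repr)
print(as_text.decode("latin-1"))
```

The first form is what the parser was being handed, which would explain the 
odd output. Calling read() on the file instead of readlines() should avoid 
the join step altogether.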

