[Tutor] Spanish text in BS problem

Kent Johnson kent37 at tds.net
Wed Nov 9 13:53:56 CET 2005


Ismael Garrido wrote:
> Hello
> 
> I'm using Beautiful Soup to scrape a site (that's in Spanish). I
> sometimes come across strings like:
> 'Ner\\xf3n como cantor'
> 
> Which gets printed:
> Ner\xf3n como cantor
> 
> When they should be:
> Nerón como cantor
> 
> I don't know whether this is my fault (from misusing BS) or a bug in
> BS. Anyway, is there a way to print the string correctly?
> 
> This is the code I'm using in BS
> 
> a = open("zona.htm")
> text = a.readlines()
> a.close()
> 
> BS = BeautifulSoup.BeautifulSoup(str(text))
> 
> for ed in BS('span', {'class':'ed_ant_fecha'}):
>     urlynombre = ed.findNextSibling().findNextSibling().findNextSibling()
>     nombre = urlynombre.next.next
> 
> And "nombre" is that string I mentioned.

Can you show a complete example, including the URL you are fetching and the code that prints nombre? From what you have shown, it's hard to tell where the problem is. We have seen some strangeness in BS when dealing with non-ASCII text, so that could be the problem, or you could be misinterpreting what is being printed.
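
One thing that may be worth checking in the quoted code: readlines() returns a list, and str() of a list contains the repr() of each item, which writes non-ASCII bytes as escapes such as \xf3. The sketch below only illustrates that effect; it is not a confirmed diagnosis. It uses Python 3 syntax, with bytes standing in for the byte strings of the Python used in this thread. The sample line and the Latin-1 encoding are assumptions.

```python
# Python 3 sketch; bytes here play the role of Python 2 byte strings.
# The sample line is hypothetical, and Latin-1 is an assumed encoding.
lines = [b"<a>Ner\xf3n como cantor</a>\n"]

# str() of a list embeds the repr() of each item, so the single byte 0xF3
# turns into the four literal characters: backslash, x, f, 3.
as_list_text = str(lines)

# Joining the lines and decoding them keeps the real character instead.
as_page_text = b"".join(lines).decode("latin-1")

print(as_list_text)
print(as_page_text)

# If a string already contains the literal escape sequence, the
# unicode_escape codec can turn it back into the intended character.
escaped = "Ner\\xf3n como cantor"
repaired = escaped.encode("ascii").decode("unicode_escape")
print(repaired)
```

If that is what is happening here, passing the file contents from a.read() rather than str(a.readlines()) would avoid the escaping in the first place.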

Kent

-- 
http://www.kentsjohnson.com


