Replacing utf-8 characters
no at spam
Wed Oct 5 16:35:36 CEST 2005
Hi, I am using Python to scrape web pages and I do not have any problems
unless I run into a site that is utf-8. It seems & is changed to &amp;
when the site is utf-8.
If I try to replace it with .replace('&amp;','&') it for some reason
does not replace it.
For example: http://today.reuters.co.uk/news/default.aspx
The url in the page looks like this
However when I pull it into python the URL ends up looking like this
(notice the &amp; instead of just & in the URL)
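
Here is a minimal sketch of the replacement I am trying to do, assuming the
scraped text really contains the HTML entity &amp;. The query string in
raw_url is made up for illustration, and html.unescape needs Python 3.4 or
later. One thing that may matter (not confirmed for my case): str.replace
returns a new string instead of changing the original, so the result has to
be assigned to a name.

```python
from html import unescape

# Hypothetical scraped link; the query string is invented for this example.
raw_url = "http://today.reuters.co.uk/news/default.aspx?a=1&amp;b=2"

# str.replace returns a new string and leaves raw_url unchanged,
# so the result must be kept in a variable.
fixed = raw_url.replace("&amp;", "&")

# html.unescape decodes &amp; along with any other HTML entities.
decoded = unescape(raw_url)

print(fixed)
print(decoded)
```

Both calls should give the same URL here; unescape is the more general
choice when the page contains other entities as well.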