Problem when fetching page using urllib2.urlopen
dorzey at googlemail.com
Mon Aug 10 18:15:07 CEST 2009
"geturl - this returns the real URL of the page fetched. This is
useful because urlopen (or the opener object used) may have followed a
redirect. The URL of the page fetched may not be the same as the URL
requested." from http://www.voidspace.org.uk/python/articles/urllib2.shtml#info-and-geturl
It might be worth checking that you are actually getting the page you
want; I seem to remember that semicolons need to be encoded, similar
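
A rough sketch of both checks, in case it helps. The helper names (encode_semicolons, was_redirected) and the example.com URL are made up for illustration and are not part of urllib2. Whether the server actually accepts %3B in place of ";" is an assumption you would need to test against the real site.

```python
try:  # Python 3
    from urllib.parse import urlsplit, urlunsplit
except ImportError:  # Python 2, as used in this thread
    from urlparse import urlsplit, urlunsplit


def encode_semicolons(url):
    """Percent-encode semicolons in the path of a URL (hypothetical helper)."""
    parts = urlsplit(url)
    # Only the path is touched; scheme, host, query and fragment are kept as-is.
    path = parts.path.replace(";", "%3B")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def was_redirected(requested_url, response):
    """Return True if the URL actually fetched differs from the one requested.

    `response` is anything with a geturl() method, such as the object
    returned by urllib2.urlopen().
    """
    return response.geturl() != requested_url
```

With a real fetch you would pass the object returned by urllib2.urlopen(url) as the response, and compare the result before and after encoding the semicolons.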
On Aug 10, 12:43 pm, jitu <nair.jiten... at gmail.com> wrote:
> On Aug 10, 4:39 pm, jitu <nair.jiten... at gmail.com> wrote:
> > Hi,
> > An HTML page contains 'anchor' elements whose 'href' attribute has
> > a semicolon in the URL. When the page is fetched using
> > urllib2.urlopen, all such hrefs containing semicolons are
> > truncated.
> > For example, the href http://travel.yahoo.com/p-travelguide-6901959-pune_restaurants-i;_ylt...
> > gets truncated to http://travel.yahoo.com/p-travelguide-6901959-pune_restaurants-i
> > The page I am talking about can be fetched from http://travel.yahoo.com/p-travelguide-485468-pune_india_vacations-i;_...
> > Thanks a Lot
> > Regards
> > jitu
> Sorry, the question I wanted to ask was whether this is the
> correct behaviour or a bug?
> Thanks A Lot.