Problem when fetching page using urllib2.urlopen

dorzey dorzey at googlemail.com
Mon Aug 10 12:15:07 EDT 2009


"geturl - this returns the real URL of the page fetched. This is
useful because urlopen (or the opener object used) may have followed a
redirect. The URL of the page fetched may not be the same as the URL
requested." from http://www.voidspace.org.uk/python/articles/urllib2.shtml#info-and-geturl

It might be worth checking that you are actually getting the page you
want; I seem to remember that semicolons need to be encoded, similar
to '&'.
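
A rough sketch of that check, in case it helps. The helper names (encode_semicolons, fetch_and_report) and the example.com URL are made up for illustration, and I am assuming percent-encoding (';' as '%3B') is the encoding in question. Whether the server treats the encoded form the same way is something you would have to try.

```python
try:                                    # Python 2, as used in this thread
    from urllib2 import urlopen
except ImportError:                     # same call under Python 3
    from urllib.request import urlopen


def encode_semicolons(url):
    """Hypothetical helper: percent-encode every ';' in the URL."""
    return url.replace(";", "%3B")


def fetch_and_report(url):
    """Hypothetical helper: fetch url and return (requested, final_url, body).

    final_url comes from geturl(), so it differs from the requested URL
    if a redirect was followed.
    """
    response = urlopen(url)
    try:
        return url, response.geturl(), response.read()
    finally:
        response.close()


# Placeholder URL, not one of the pages from the original question.
example = "http://example.com/page-i;_key=value"
encoded = encode_semicolons(example)
```

Calling fetch_and_report on both the plain and the encoded URL and comparing the second element of each result against what was requested would show whether a redirect is dropping the part after the semicolon.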

Dorzey

On Aug 10, 12:43 pm, jitu <nair.jiten... at gmail.com> wrote:
> On Aug 10, 4:39 pm, jitu <nair.jiten... at gmail.com> wrote:
>
> > Hi,
>
> > An HTML page contains 'anchor' elements whose 'href' attribute has
> > a semicolon in the URL. When fetching the page using
> > urllib2.urlopen, all such hrefs containing semicolons are
> > truncated.
>
> > For example, the href http://travel.yahoo.com/p-travelguide-6901959-pune_restaurants-i;_ylt...
> > gets truncated to http://travel.yahoo.com/p-travelguide-6901959-pune_restaurants-i
>
> > The page I am talking about can be fetched from http://travel.yahoo.com/p-travelguide-485468-pune_india_vacations-i;_...
>
> > Thanks a Lot
> > Regards
> > jitu
>
> Hi
>
>    Sorry, the question I wanted to ask was whether this is the
> correct behaviour or a bug?
>
> Thanks A Lot.
> Regards
> jitu



