[Python-Dev] Urllib code or the docs appear wrong

Guido van Rossum gvanrossum at gmail.com
Tue Mar 8 04:32:25 CET 2005


On Mon, 7 Mar 2005 14:43:01 -0600, Skip Montanaro <skip at pobox.com> wrote:
> 
> It seems to me that either urllib's docs are wrong or its code is wrong
> w.r.t. how the User-agent header is handled.  In part, the docs say:
> 
>     By default, the URLopener class sends a User-Agent: header of
>     "urllib/VVV", where VVV is the urllib version number. Applications can
>     define their own User-Agent: header by subclassing URLopener or
>     FancyURLopener and setting the instance attribute version to an
>     appropriate string value before the open() method is called.
> 
> Looking at the code it seems to me that the User-agent header is fixed at
> instantiation time:
> 
>     version = "Python-urllib/%s" % __version__
> 
>     # Constructor
>     def __init__(self, proxies=None, **x509):
>         ...
>         self.addheaders = [('User-agent', self.version)]
>         ...
> 
> and that when open_http() is called, it simply calls putheader() for each
> element of addheaders.  Setting the version instance attribute will have no
> effect.  If I managed to add another User-agent header before open_http()
> was called, the request would wind up with two copies, which is probably not
> desirable either.
> 
> I can see a couple ways around this:
> 
>     * Just change the docs to match the current implementation.  Users
>       wishing to override the User-agent header would then have to subclass
>       FancyURLopener and set the version class attribute.
> 
>     * Defer decisions about the value of the User-agent until open_http() is
>       called.
> 
> It appears the OpenerDirector class in urllib2 has a similar "early binding"
> problem.
> 
> I don't particularly care how this is solved, but it appears to need
> solving.

Good catch. I propose fixing the docs; "fixing" the code after so many
years of being out of sync with the docs might cause more surprises.
(Unless you can find evidence in CVS that this *used* to work and
someone introduced an unfortunate optimization that disabled the
feature.)
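
For illustration, here is a minimal sketch of the early-binding behaviour
Skip describes. It uses stand-in classes rather than urllib itself:
EarlyBoundOpener, LateBoundOpener, MyOpener, headers_for_request() and the
version strings are all hypothetical names invented for this example. Only
the `version` / `addheaders` pattern comes from the quoted code.

```python
class EarlyBoundOpener:
    """Hypothetical stand-in for the quoted pattern; not urllib itself."""

    version = "Python-urllib/x.y"  # placeholder, not a real version number

    def __init__(self):
        # The header value is copied once, at instantiation time.
        self.addheaders = [("User-agent", self.version)]

    def headers_for_request(self):
        # Hypothetical helper standing in for what open_http() would
        # pass to putheader().
        return list(self.addheaders)


class MyOpener(EarlyBoundOpener):
    # Overriding the class attribute works, because __init__ reads it.
    version = "MyApp/1.0"


class LateBoundOpener:
    """Hypothetical variant that defers the decision until request time."""

    version = "Python-urllib/x.y"

    def headers_for_request(self):
        # Built on every call, so a later instance-level change is honoured.
        return [("User-agent", self.version)]


early = EarlyBoundOpener()
early.version = "TooLate/1.0"  # set after __init__, so it is ignored
early_headers = early.headers_for_request()

subclass_headers = MyOpener().headers_for_request()

late = LateBoundOpener()
late.version = "MyApp/1.0"  # set before the request, so it takes effect
late_headers = late.headers_for_request()
```

In this sketch, setting the instance attribute after construction changes
nothing for the early-bound class, while overriding the class attribute in
a subclass does. That is the behaviour the corrected docs would describe.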

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

