[Tutor] Drifting Values in urllib.

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Sun May 7 03:24:14 CEST 2006


> parameters = urllib.urlencode
> ({"id":"%s","origin":"%s","dest":"%s","class1":"85",
> "weight1":"185","custdata":"%s","respickup":"","resdel":"%s",
> "insidechrg":"","furnchrg":"","cod":""})%(id,origin,dest,custdata,resdel)

Hi Doug,


On closer look, I think there's a misunderstanding here.

According to the documentation on urllib.urlencode():

     urlencode(query[, doseq])
     Convert a mapping object or a sequence of two-element tuples to a
     ``url-encoded'' string, suitable to pass to urlopen() above as the
     optional data argument.


For example:

####################################
>>> urllib.urlencode({'Hot' : '1',
...                   'Cross' : '2',
...                   'Buns' : '3'})
'Hot=1&Buns=3&Cross=2'
####################################

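Here, we're passing a dictionary of key/value pairs.  Notice that the pairs 
can come out in a different order than we wrote them, since a dictionary 
doesn't promise any particular order.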

Alternatively, we can pass a sequence of two-element tuples, which keeps 
the pairs in the order given:

####################################
>>> urllib.urlencode([('Hot', 1),
...                   ('Cross', '2'),
...                   ('Buns', '3')])
'Hot=1&Cross=2&Buns=3'
####################################
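
To put the two call styles side by side, here is a small sketch.  The 
try/except import is my own addition so that it also runs on Python 3, 
where the same function lives in urllib.parse; the name "same_pieces" is 
just something I made up for the comparison.

```python
# Import shim (an assumption on my part): this thread uses Python 2's
# urllib.urlencode; on Python 3 the same function is in urllib.parse.
try:
    from urllib import urlencode
except ImportError:
    from urllib.parse import urlencode

pairs = [('Hot', 1), ('Cross', '2'), ('Buns', '3')]

# A sequence of 2-tuples: the output follows the order of the list.
from_sequence = urlencode(pairs)

# A mapping: same pairs, but the order is whatever the dictionary yields.
from_mapping = urlencode(dict(pairs))

# Either way we get the same key=value pieces, even if the order differs.
same_pieces = sorted(from_sequence.split('&')) == sorted(from_mapping.split('&'))
```

If the server on the other end cares about parameter order, the list of 
tuples is probably the safer choice.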


I'm not quite sure I see where the string interpolation comes in. 
Wait... ok, now I understand what you're trying to do, looking back on 
what you tried:

##########################################################################
parameters = (urllib.urlencode({"id":"%s","origin":"%s","dest":"%s", ...})
               % (id,origin,dest, ...))
##########################################################################

You're using urlencode to build a template string, and then filling in 
that template with string interpolation.

Don't do this.  *grin*

You're missing the whole point behind urlencode(): it's meant to protect 
both keys and values so that their string content is clean.  Concretely: we 
know that parameter values aren't allowed to have things like raw 
ampersands, or else those ampersands will be misinterpreted as separators 
between parameters.

We use urlencode() to encode those values properly.  Rather than:

##################################################
>>> urllib.urlencode({'fish': '%s'}) % 'surf&turf'
'fish=                surf&turf'
##################################################

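which gives the wrong result (the ampersand is left unprotected, and the 
stray padding comes from the "%s" itself having been encoded as "%25s"), 
it is more correct to do: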

###########################################
>>> urllib.urlencode({'fish': 'surf&turf'})
'fish=surf%26turf'
###########################################
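
One way to convince ourselves is to decode both strings the way a server 
might.  This is only a sketch: I'm assuming parse_qsl is available 
(urlparse on a reasonably recent Python 2, urllib.parse on Python 3), and 
the variable names are mine.

```python
# Import shim (my assumption): Python 2 names first, then Python 3.
try:
    from urllib import urlencode
    from urlparse import parse_qsl
except ImportError:
    from urllib.parse import urlencode, parse_qsl

# Step 1 of the mistaken approach: the '%' in '%s' is itself escaped,
# so the "template" is not the template we thought we were writing.
template = urlencode({'fish': '%s'})

# Step 2: '%25s' is now read by the % operator as a width-25 field,
# and the ampersand goes into the string with no protection at all.
broken = template % 'surf&turf'

# The direct route: hand the real value to urlencode().
proper = urlencode({'fish': 'surf&turf'})

# Decode both, roughly as the receiving side would.
broken_value = dict(parse_qsl(broken)).get('fish')
proper_value = dict(parse_qsl(proper)).get('fish')

# Only the properly encoded string gives the original value back.
round_trips = (proper_value == 'surf&turf')
damaged = (broken_value != 'surf&turf')
```

The properly encoded string round-trips to 'surf&turf'; the interpolated 
one does not, because the raw ampersand is treated as a separator.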

So, again: when we're building those URL strings, doing the interpolation 
outside of urlencode() produces incorrectly protected URL strings.  Go 
the direct route, and pass the actual values to urlencode() itself.
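
For your form, that might look something like the sketch below.  The field 
names and the fixed "85" and "185" values come from your original call; 
the function name build_parameters, the argument name shipment_id (used to 
avoid shadowing the built-in id), and the sample values are all made up 
for illustration.

```python
# Import shim (my assumption): Python 2's urllib.urlencode, or the same
# function from urllib.parse on Python 3.
try:
    from urllib import urlencode
except ImportError:
    from urllib.parse import urlencode


def build_parameters(shipment_id, origin, dest, custdata, resdel):
    """Encode the form fields, letting urlencode() protect every value."""
    # A list of tuples keeps the fields in the order written here.
    return urlencode([
        ('id', shipment_id),
        ('origin', origin),
        ('dest', dest),
        ('class1', '85'),
        ('weight1', '185'),
        ('custdata', custdata),
        ('respickup', ''),
        ('resdel', resdel),
        ('insidechrg', ''),
        ('furnchrg', ''),
        ('cod', ''),
    ])


# Hypothetical sample values; note the ampersand and space in custdata.
parameters = build_parameters('12345', '90210', '10001', 'A&B Freight', 'yes')
```

No "%s" placeholders are needed at all: the values go straight in, and any 
awkward characters in them get escaped along the way.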

