[Tutor] retrieving httponly cookies on accessing webpage with urllib2

Kent Johnson kent37 at tds.net
Fri Oct 17 13:29:42 CEST 2008


On Thu, Oct 16, 2008 at 11:40 PM, xbmuncher <xboxmuncher at gmail.com> wrote:
> I'm trying to mimic my Firefox browser in requesting a webpage with Python.

> So I tried trusty ol' urllib2 to request it in python:
> import urllib2
>
>
> url = 'http://www.website.com'
>
> #headers
> h = {
> 'User-Agent' : 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3)
<snip>
> }
> #request page
> reqObj = urllib2.Request(url, None, h)
> urlObj = urllib2.urlopen(reqObj)

It doesn't work to set the User-Agent header this way. See
http://personalpages.tds.net/~kent37/kk/00010.html#e10request-headers
for a recipe.
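The recipe boils down to attaching the header to the Request object itself. A minimal sketch, keeping the placeholder URL and User-Agent string from the original post (the try/except covers the module rename in newer Pythons):

```python
# Sketch: set a browser-like User-Agent on a urllib2 Request.
# The URL and UA string are the placeholders from the original post.
try:
    from urllib2 import Request  # Python 2, as in the post
except ImportError:
    from urllib.request import Request  # module renamed in Python 3

url = 'http://www.website.com'
ua = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3)'

req = Request(url)
req.add_header('User-Agent', ua)  # stored with capitalized key 'User-agent'

print(req.get_header('User-agent'))
```

Note that urllib2 normalizes header names with str.capitalize(), which is why the stored key is 'User-agent' rather than 'User-Agent'; servers treat header names case-insensitively, so this is harmless.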

> #read response
> print urlObj.read()

What content do you get? Is it an error message? What does Wireshark
show for this request?

> Notice the content length is considerably smaller, and no cookies are sent
> to me like they were in Firefox. I know only a little about httpOnly
> cookies, but I understand it is a special kind of cookie, which I suppose
> has something to do with Python not being able to access it the way Firefox
> does. All I want to do is have Python receive the same cookies that Firefox
> did; how can I do this? I read somewhere that httpOnly cookies were
> implemented in the Python cookie module:
> http://glyphobet.net/blog/blurb/285
> ...yet the other cookies aren't being sent either...

I don't think that has anything to do with your problem. httpOnly is
set by the server and interpreted by the browser: it only hides a
cookie from page scripts (JavaScript); the cookie is still delivered
in ordinary HTTP responses. The Python change was to allow httpOnly to
be set by servers written in Python, not to change how clients receive
cookies.
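To actually receive and keep cookies the way Firefox does, the usual approach (not shown in the original post) is to route requests through an opener that carries a cookielib.CookieJar. A hedged sketch, again keeping the placeholder URL; the try/except covers the module rename in newer Pythons:

```python
# Sketch: collect cookies browser-style by giving urllib2 an opener
# that stores Set-Cookie values in a CookieJar.
try:
    import urllib2 as request_mod   # Python 2, as in the post
    from cookielib import CookieJar
except ImportError:
    import urllib.request as request_mod  # Python 3 renamed the modules
    from http.cookiejar import CookieJar

jar = CookieJar()  # starts empty; filled from Set-Cookie response headers
opener = request_mod.build_opener(request_mod.HTTPCookieProcessor(jar))

# opener.open('http://www.website.com')  # placeholder URL from the post
# for cookie in jar:
#     print(cookie.name, cookie.value)
```

Since httpOnly cookies are still sent over HTTP (the flag only tells browsers to hide them from scripts), a jar wired up this way should receive them along with ordinary cookies.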

Kent