urllib (and urllib2) read all data from page on open()?
alexs at advfn.com
Mon Mar 14 15:48:25 CET 2005
Whilst it might be able to do what I want, I feel this is a flaw in urllib
that should be fixed, or at least added to a bug list somewhere so I can at
least pretend someone other than me cares.
From: Swaroop C H [mailto:g2swaroop at yahoo.com]
Sent: 14 March 2005 14:45
To: Alex Stapleton
Subject: RE: urllib (and urllib2) read all data from page on open()?
--- Alex Stapleton <alexs at advfn.com> wrote:
> Except wouldn't it have already read the entire file when it opened,
> or does it occur on the first read()? Also, will the data returned
> from handle.read(100) be raw HTTP? In which case, what if the
> encoding is chunked or gzipped?
Maybe the httplib module can help you.
From http://docs.python.org/lib/httplib-examples.html :
>>> import httplib
>>> conn = httplib.HTTPConnection("www.python.org")
>>> conn.request("GET", "/index.html")
>>> r1 = conn.getresponse()
>>> print r1.status, r1.reason
200 OK
>>> data1 = r1.read()
>>> conn.request("GET", "/parrot.spam")
>>> r2 = conn.getresponse()
>>> print r2.status, r2.reason
404 Not Found
>>> data2 = r2.read()
As far as I can understand, you can read() the data only when you want to.
There's a warning that says "This module defines classes which
implement the client side of the HTTP and HTTPS protocols. It is
normally not used directly -- the module urllib uses it to handle
URLs that use HTTP and HTTPS."
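
To make the read-on-demand point concrete, here is a minimal sketch of pulling a response body 100 bytes at a time. It is written for Python 3, where httplib is named http.client, and it talks to a throwaway local server so it runs without network access. The names BODY, Handler and read_in_pieces are made up for this illustration and are not part of any library. As far as I know, read(n) hands back body bytes rather than raw HTTP (the status line and headers are consumed by getresponse(), and chunked transfer encoding is undone for you), but a gzipped body is not decompressed, so that part would still be up to the caller.

```python
import http.client
import http.server
import threading

# Hypothetical payload served by the local test server below.
BODY = b"x" * 1000


class Handler(http.server.BaseHTTPRequestHandler):
    """Tiny local handler so the sketch runs without network access."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the example quiet


def read_in_pieces(host, port, path, piece_size=100):
    """Fetch path and return (status, list of body pieces)."""
    conn = http.client.HTTPConnection(host, port)
    conn.request("GET", path)
    resp = conn.getresponse()  # status line and headers are parsed here
    pieces = []
    while True:
        piece = resp.read(piece_size)  # body bytes only, at most piece_size
        if not piece:
            break
        pieces.append(piece)
    conn.close()
    return resp.status, pieces


# Port 0 asks the OS for any free port.
server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

status, pieces = read_in_pieces("127.0.0.1", server.server_port, "/")

server.shutdown()
server.server_close()
```

Against a real site you would pass its host name and port 80 instead of the local server's address; the loop itself stays the same.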
Swaroop C H