How to grab a part of web page?

Derrick 'dman' Hudson dman at dman.ddts.net
Tue Jul 9 23:54:28 EDT 2002


On Tue, Jul 09, 2002 at 10:47:44PM +0200, A wrote:
| Hi,
| Is it possible to download only a part of web page?
| Say I need to find out an information about a customer that starts 
| at 1500 byte position and ends at 2000 byte position. If the whole 
| page has about 100 kb it seems to me waste of time to load all the 
| page.
| What is the best, yet easy, solution?
| Is it possible to use httplib or necessary socket module?

Depending on the HTTP server at the other end, you _may_ be able to
request that the document starts at a certain byte position, using the
"Range" request header.  Older (HTTP/1.0) servers generally won't
support that feature.  You can read up on it in the RFCs that define
HTTP/1.1.  I don't know much about it myself other than that
applications call it "resuming" a download.  Then you could just drop
the connection when you've seen as much data as you want.
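
Here is a rough sketch of the byte-range idea in current Python, in case
it helps.  The helper names range_header and fetch_bytes are made up for
illustration, and the fallback branch assumes a server that ignores the
header simply answers 200 with the whole document.  Both ends of an HTTP
byte range are inclusive, so 1500-2000 asks for 501 bytes.

```python
import urllib.request


def range_header(start, end):
    """Build a Range header for bytes start..end (both ends inclusive)."""
    if start < 0 or end < start:
        raise ValueError("need 0 <= start <= end")
    return {"Range": "bytes=%d-%d" % (start, end)}


def fetch_bytes(url, start, end):
    """Return bytes start..end of url, with or without Range support."""
    req = urllib.request.Request(url, headers=range_header(start, end))
    with urllib.request.urlopen(req) as resp:
        if resp.status == 206:
            # 206 Partial Content: the body is only the requested slice.
            return resp.read()
        # Assumption: the server ignored Range and is sending everything.
        # Read no further than `end`, then cut off the unwanted prefix.
        return resp.read(end + 1)[start:]
```

Something like fetch_bytes("http://example.com/customers.html", 1500, 2000)
would be the call, where the URL is only a placeholder.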

You can probably do the first part with the httplib module -- it lets
you pass "extra" headers along with the request.  To stop early, read
only as many bytes as you need from the response and then close() the
connection.
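
For what it's worth, httplib is called http.client in Python 3, and
there a connection can be closed after a partial read.  A minimal
sketch, assuming a plain HTTP server; read_prefix and its parameters are
hypothetical names, not part of the library:

```python
import http.client


def read_prefix(host, path, nbytes, extra_headers=None, port=80):
    """GET path from host, read at most nbytes of the body, then hang up."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        # `headers` is where the "extra" request headers go.
        conn.request("GET", path, headers=extra_headers or {})
        resp = conn.getresponse()
        # read(n) stops after n bytes instead of draining the whole body.
        return resp.status, resp.read(nbytes)
    finally:
        # Closing here drops the connection whether or not data remains.
        conn.close()
```

Passing {"Range": "bytes=1500-2000"} as extra_headers would combine this
with a byte-range request, if the server supports one.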

In any case, depending on where you are and where you download the
data from, 100 kB could take less than a second to transfer, and the
gain from not transferring the whole thing won't be noticeable to the
user.

-D

-- 
 
The nice thing about windoze is - it does not just crash,
it displays a dialog box and lets you press 'ok' first.
 
http://dman.ddts.net/~dman/
