Use of httplib?

Steve Holden sholden at holdenweb.com
Wed Dec 13 18:55:23 EST 2000


David Lees <DavidL at nospammy_raqia.com> wrote in message
news:3A37E456.772811C9 at nospammy_raqia.com...
> I am running the client code below to try to send and receive text from
> a server.  My problem is that I do not know how to receive back an
> arbitrary number of lines.  In a test case, when I know the exact number
> of lines as shown below (e.g. 11) I can use readline.  However, in
> general I do not know how many lines there are and the program hangs on
> the connection when I accidentally read past the end.  The commented-out
> pieces of code also either hang or do not give back all the text.
>
Hmm ... the manual page for httplib reminds us that it is used by urllib.
httplib does implement the client side of the HTTP protocol, but urllib is
much easier to use.  Here's a snippet from a crawler program I wrote some
time ago (warning: not tested under 2.0) which shows how easy it is to read
HTML pages from a server with urllib.  Note that f.read() returns the whole
page in one go, so you never need to know how many lines are coming:

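    ignored = []
    distance = 1
    while distance < MAXDIST:
        l.append([])
        for URL in l[distance-1]:
            try:
                # '-' means "read the page from standard input"
                if URL == '-':
                    f = sys.stdin
                else:
                    f = urllib.urlopen(URL)
                data = f.read()    # reads to end of file, however many lines
                if f is not sys.stdin:
                    f.close()

                p = myHTMLParser(fmt, URL)
                p.feed(data)
                links, rubbish = p.close()
                known[URL].setgood()
            # (excerpt ends here; the matching except clause is not shown)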
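
For anyone on a current Python 3, here is a minimal sketch of the same idea.
It assumes urlopen is taken from urllib.request (where it lives in Python 3;
the snippet above uses the old urllib.urlopen), and fetch_text is a made-up
helper name, not something from the crawler.  The point is only that read()
with no argument runs to end of stream, so no line count is needed:

```python
import urllib.request


def fetch_text(url, encoding="utf-8"):
    """Return the whole body found at `url` as text (hypothetical helper)."""
    # read() with no size argument keeps reading until end of stream,
    # so the caller never has to know how many lines the server sends.
    with urllib.request.urlopen(url) as f:
        data = f.read()
    # The encoding is an assumption; a real page may declare another one.
    return data.decode(encoding)
```

You would call it as fetch_text("http://www.example.com/") (a placeholder
URL) and split the result with splitlines() if you want a list of lines.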


> I am copying from the Library Reference section 11.4, but it does not
> seem to document the readline, read or readlines methods for http.
>
Bottom line: use urllib, that's what it's for.  Easier, too!  Good luck.
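
If you would rather handle the reply a line at a time, the object urlopen
returns can be looped over like a file, and the loop stops by itself at end
of stream.  A rough sketch, again assuming Python 3's urllib.request;
iter_lines is an invented name for illustration:

```python
import urllib.request


def iter_lines(url, encoding="utf-8"):
    """Yield each line of the body at `url`, without its line ending."""
    with urllib.request.urlopen(url) as f:
        # Iterating the response yields one bytes line at a time and
        # ends at end of stream, so no line count is required.
        for raw in f:
            yield raw.decode(encoding).rstrip("\r\n")
```

Something like "for line in iter_lines(url): print(line)" then works for
any number of lines.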

regards
 Steve





