New subject: httplib

June 2, 2000

      It looks like a definite bug. I have *no* idea, tho, why it is doing
that... I did quite a bit of testing with chunked replies. Admittedly,
though, I didn't stack up requests like you've done in your test. I'm
wrapping up mod_dav at the moment, so I don't really have time to look
deeply into this. Mebbe next week?

Regarding the pipeline request thing. I think it would probably be best to
just drop the whole "hold the previous response and wait for it to be
closed" thing. I don't know why that is in there; probably a leftover
(converted) semantic from the old-style HTTP() class. I'd be quite fine
just axing it and allowing the client to shove ten requests down the pipe
before pulling the first response back out.

Oh. Wait. Maybe that was it. You can't read the "next" response until the
first one has been read. Well... no need to block putting new responses;
we just need to create a way to "get the next reply" and/or "can I get the
next reply yet?"

Cheers,
-g

p.s. Moshe also had a short list of review items. I read thru them, but
not with the code in hand to understand some of his specifics.

On Wed, 31 May 2000, Jeremy Hylton wrote:
...
...
...
...
...
...
"GS" == Greg Stein <gstein@lyra.org> writes:
GS> [ and recall my email last week that I've updated httplib.py and
  GS> posted it to my web pages; it is awaiting review for integration
  GS> into the Python core; it still needs docs and more testing
  GS> scenarios, tho
I've been looking at the httplib code, and I found what may be a bug.
Not sure, because I'm not sure how the API works for pipelined
requests.
I've got some test code that looks a bit like this:
def test_new_interface_series(urls):
    paths = []
    the_host = None
    for url in urls:
        host, path = get_host_and_path(url)
        if the_host is None:
            the_host = host
        else:
            assert host == the_host
        paths.append(path)
conn = httplib.HTTPConnection(the_host)
    for path in paths:
        conn.request('GET', path, headers={'User-Agent': 'httplib/Python'})
    for path in paths:
        errcode, errmsg, resp = conn.getreply()
        buf = resp.read()
        if errcode == 200:
            print errcode, resp.headers
        else:
            print errcode, `errmsg`, resp
        print resp.getheader('Content-Length'), len(buf)
        print repr(buf[:40])
        print repr(buf[-40:])
        print
    conn.close()
test_new_interface_series(['http://www.python.org/',
                        'http://www.python.org/pics/PyBanner054.gif',
                        'http://www.python.org/pics/PythonHi.gif',
                        'http://www.python.org/Jobs.html',
                        'http://www.python.org/doc/',
                        'http://www.python.org/doc/current/',
                           ])
The second loop that reads the replies gets fouled up after a couple
of responses.  I added even more debugging and found that the first
line of the corrupted response is
...
'ontent-Type: text/html\015\012'
It looks like some part of the program is consuming too much input.  I
haven't been able to figure out what part yet.  Hoping that you might
have some good ideas.
Thinking about this issue, I came up with a potential API problem.
You must read the body after calling getreply and before calling
getreply a second time.  This kind of implicit requirement is a bit
tricky.  It would help if the implementation could raise an error if
this happens.  It might be even better if it just worked, although it
seems a bit too magical.
Jeremy
-- 
Greg Stein, http://www.lyra.org/

httplib (was: Adding LDAP to the Python core... ?!)

Greg Stein

Greg Stein

Greg Stein

Jeremy Hylton

Greg Stein

Jeremy Hylton

tags

participants (2)