[Python-bugs-list] [ python-Bugs-755080 ] AssertionError from urllib.retrieve / httplib

Tue Sep 23 08:35:39 EDT 2003

Bugs item #755080, was opened at 2003-06-15 19:37
Message generated for change (Comment added) made by jmoses
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=755080&group_id=5470

Category: Python Library
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Stuart Bishop (zenzen)
Assigned to: Nobody/Anonymous (nobody)
Summary: AssertionError from urllib.retrieve / httplib

Initial Comment:

The following statement is occasionally generating
AssertionErrors:

    current_page = urllib.urlopen(action,data).read()

Traceback (most recent call last):
  File "/Users/zen/bin/autospamrep.py", line 161, in ?
    current_page = handle_spamcop_page(current_page)
  File "/Users/zen/bin/autospamrep.py", line 137, in handle_spamcop_page
    current_page = urllib.urlopen(action,data).read()
  File "/sw/lib/python2.3/httplib.py", line 1150, in read
    assert not self._line_consumed and self._line_left

Fix may be to do the following in
LineAndFileWrapper.__init__ (last two lines are new):

    def __init__(self, line, file):
        self._line = line
        self._file = file
        self._line_consumed = 0
        self._line_offset = 0
        self._line_left = len(line)
        if not self._line_left:
            self._done()
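
To make the assertion in the traceback easier to reason about, here is a
minimal sketch of when that condition fails.  LineWrapperSketch and
assert_fires are hypothetical names invented for this illustration; the
class is not the httplib source and keeps only the fields that the assert
inspects.  It shows two states that trip the assert: an empty first line,
and a second read() after the line has already been consumed.

```python
class LineWrapperSketch(object):
    """Hypothetical stand-in; keeps only the state the assert inspects."""

    def __init__(self, line):
        self._line = line
        self._line_consumed = 0
        self._line_left = len(line)

    def read(self):
        # Same condition as the assert quoted in the traceback.
        assert not self._line_consumed and self._line_left
        data = self._line
        self._line_consumed = 1
        self._line_left = 0
        return data


def assert_fires(wrapper):
    """Return True if wrapper.read() raises AssertionError."""
    try:
        wrapper.read()
    except AssertionError:
        return True
    return False


# Case 1: the first line is empty, so _line_left is 0 from the start.
print(assert_fires(LineWrapperSketch("")))

# Case 2: the line was already consumed by an earlier read().
used = LineWrapperSketch("x\n")
used.read()
print(assert_fires(used))
```

Which of these states, if either, the original report actually hit is not
established in this thread.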

----------------------------------------------------------------------

Comment By: Jon Moses (jmoses)
Date: 2003-09-23 07:35

Message:
Logged In: YES 
user_id=55110

I switched from using urllib.urlretrieve / urllib.urlopen to
using httplib, since I can debug with it.  I no longer get
the error this bug is about.

The other problem I seemed to be having was related to the
data I was receiving, which was generated in part from the
data I was passing to the server.  I changed the data I was
sending (changed ' ' to '%20') and everything works fine,
even using urllib.urlopen().  Sorry for the confusion.
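
The change described above (' ' to '%20') can also be done with the
library's quoting helper instead of by hand.  The following is a minimal
sketch, not the code from this report: the qdata value is only the visible
start of the field in the trace quoted in this comment, and keeping "|"
unescaped is an assumption made to mirror the manual edit.  An unescaped
space splits the HTTP request line into extra fields, which may be why the
server's reply in that trace has no normal status line, though that reading
is a guess.

```python
try:
    from urllib import quote          # Python 2, as used in this report
except ImportError:
    from urllib.parse import quote    # Python 3

# Illustrative value only: the visible start of the qdata field.
qdata = "|Canadian Journal of Fisheries and Aquatic Sciences|Adkison|52|"

# Keep "|" literal so that only the spaces change, matching the manual edit.
escaped = quote(qdata, safe="|")
print(escaped)
# → |Canadian%20Journal%20of%20Fisheries%20and%20Aquatic%20Sciences|Adkison|52|
```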

The data that the server was sending back to the broken
request was outputted like this, using
httplib.http.set_debuglevel(1):

------start
Getting: doi.crossref.org
connect: (doi.crossref.org, 80)
send: 'GET /servlet/query?usr=<deleted>&pwd=<deleted>&qdata=|Canadian Journal of Fisheries and Aquatic Sciences|Adkison|52||2762||full_text|1|<snip> HTTP/1.0\r\n\r\n'
reply: '\n'
|Canadian
-----------end

I don't know if that helps, but maybe.

Thanks much.

----------------------------------------------------------------------

Comment By: Jon Moses (jmoses)
Date: 2003-09-23 06:52

Message:
Logged In: YES 
user_id=55110

Whups, my bad, I just assumed (and we know what happens
then) that this was for Python 2.2, since that's what I was
having the problem with.  My next step was to try with
Python 2.3.  I'll let you know if it works (since it sounds
like it should).

And yes, that's what I meant.  Data from the http read was
still being outputted to the screen, while other output from
_past_ where the read was occurring was also being output.
I'd end up with output like this:

[data from http read]
[data from after]
[data from http read]

and the data was from the same connection.

Hopefully the switch to 2.3 makes my issues moot.  Thanks

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2003-09-22 22:09

Message:
Logged In: YES 
user_id=31392

jmoses: Are you seeing this problem with Python 2.3?  I
thought we had fixed the problem in the original report.

Also, I'm not sure what you mean by program execution
continuing.  Do you mean that the for loop finished and the
rest of the program continued executing, even though there
was data left to read?

What would probably help most is a trace of the session with
httplib's set_debuglevel() enabled.  If that's got sensitive
data, you can email it to me privately.
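
For anyone wanting to capture such a trace, here is a minimal sketch of
switching on httplib's debug output for a single request.  The helper names
make_traced_connection and fetch_with_trace are hypothetical, and the host
and path are whatever the caller supplies; only HTTPConnection,
set_debuglevel(), request(), getresponse() and close() come from the library.

```python
try:
    from httplib import HTTPConnection        # Python 2
except ImportError:
    from http.client import HTTPConnection    # Python 3


def make_traced_connection(host, port=80):
    """Return an unconnected HTTPConnection with debug tracing switched on."""
    conn = HTTPConnection(host, port)
    conn.set_debuglevel(1)   # the send/reply trace is printed to stdout
    return conn


def fetch_with_trace(host, path):
    """Issue one GET and return the body; the trace prints as a side effect."""
    conn = make_traced_connection(host)
    try:
        conn.request("GET", path)
        return conn.getresponse().read()
    finally:
        conn.close()
```

Nothing is sent until fetch_with_trace() is called, so the trace can be
redirected to a file first if it is to be mailed privately.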

----------------------------------------------------------------------

Comment By: Jon Moses (jmoses)
Date: 2003-09-22 15:38

Message:
Logged In: YES 
user_id=55110

I also experience this problem, and it's repeatable.  When
trying to talk with the CrossRef (www.crossref.com) server, I
get this same error.  I don't know why.  All the CrossRef
server does is spit back text.  It normally takes between 10
and 20 seconds to receive all the data.  I've successfully
viewed the results with mozilla, and with wget.

I'd post the URL I'm hitting, but it's a for-pay service.
This is the code I'm using:

...
( name, headers ) = urllib.urlretrieve( url )
...

While attempting to receive this data, I tried doing a:

...
u = urllib.urlopen( url )
for line in u.readlines():
  print line
...

but program execution seemed to continue while the data was
being received, which is not cool.  I'm not sure if that's
expected behaviour or not.

Let me know if I can provide you with any more information.

-jon

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-06-24 07:46

Message:
Logged In: YES 
user_id=46639

I've been unable to repeat the problem through a tcpwatch.py
proxy, so I'm guessing the trigger is connecting to a fairly loaded
server over a 56k modem - possibly the socket is in a bad state
and nothing noticed?

I'll try not going through tcpwatch.py for a bit and see if I can still
trigger the problem in case there was a server problem triggering
it that has been fixed.

----------------------------------------------------------------------

Comment By: Jeremy Hylton (jhylton)
Date: 2003-06-16 14:40

Message:
Logged In: YES 
user_id=31392

Can you reproduce this problem easily?  We've seen something
like it before, but have had trouble figuring out what goes
wrong.

----------------------------------------------------------------------

Comment By: Stuart Bishop (zenzen)
Date: 2003-06-15 19:55

Message:
Logged In: YES 
user_id=46639

My suggested fix is wrong.

----------------------------------------------------------------------
