[ python-Bugs-725265 ] urlopen object's read() doesn't read to EOF
SourceForge.net
noreply at sourceforge.net
Wed Mar 31 23:23:32 EST 2004
Bugs item #725265, was opened at 2003-04-21 16:49
Message generated for change (Comment added) made by fdrake
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=725265&group_id=5470
Category: Documentation
Group: Python 2.2.2
>Status: Closed
Resolution: Fixed
Priority: 5
Submitted By: Christopher Smith (smichr)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: urlopen object's read() doesn't read to EOF
Initial Comment:
On http://python.org/doc/current/lib/module-urllib.html it says that
the object returned by urlopen supports the read() method and that
this and other methods "have the same interface as for file objects
-- see section 2.2.8". In that section on page
http://python.org/doc/current/lib/bltin-file-objects.html it says about
the read() method that "if the size argument is negative or omitted,
[read should] read all data until EOF is reached."
I was a bit surprised when a project that students of mine were
working on was failing when they tried to process the data
obtained by the read() method on a connection made to a web
page. The problem, apparently, is that the read may not obtain all
of the data requested in the first request and the total response
has to be built up something like the following:
import urllib
c = urllib.urlopen("http://www.blakeschool.org")
data = ''
while 1:
    packet = c.read()
    if packet == '': break
    data += packet
I'm not sure if this is a feature or a bug. Could a file's read method
fail to obtain the whole file in one read(), too? It seems that either
the documentation should be changed or the read() method for at
least urllib objects should be changed.
/c
Christopher P. Smith
The Blake School
Minneapolis, MN
----------------------------------------------------------------------
>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2004-03-31 23:23
Message:
Logged In: YES
user_id=3066
Backported to Python 2.3.4 as Doc/lib/liburllib.tex 1.50.8.2.
----------------------------------------------------------------------
Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2004-03-25 12:04
Message:
Logged In: YES
user_id=3066
This is an issue with reading from a socket; there's no way
to recognize the end of the stream until the remote end of
the socket actually closes the socket.
I've documented this limitation in Doc/lib/liburllib.tex
1.52. Someone should backport the patch to Python 2.3.x and
close this report.
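
To illustrate the socket behavior described above, here is a minimal
sketch in present-day Python 3 (not the Python 2.2 code from the report).
The helper name recv_until_closed and the chunk size are assumptions made
for illustration; they are not part of urllib. It uses a local socket
pair so no network access is needed, and shows that the reader only sees
end-of-stream (an empty result from recv()) once the other end closes.

```python
import socket


def recv_until_closed(sock, chunk_size=4096):
    """Collect bytes from sock until the peer closes its end.

    Hypothetical helper for illustration; chunk_size is an arbitrary choice.
    """
    chunks = []
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:  # b'' is returned only after the peer has closed
            break
        chunks.append(chunk)
    return b"".join(chunks)


# Local demonstration with a connected socket pair.
left, right = socket.socketpair()
left.sendall(b"first part, ")
left.sendall(b"second part")
left.close()  # without this close, the loop above would block waiting for more
received = recv_until_closed(right)
right.close()
```

A single recv() call may return only part of what was sent, which is why
the loop accumulates chunks rather than trusting the first result.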
----------------------------------------------------------------------