[New-bugs-announce] [issue26499] http.client.IncompleteRead from HTTPResponse read after part reading file

Mon Mar 7 05:10:53 EST 2016

New submission from Peter:

This is a regression in Python 3.5 tested under Linux and Mac OS X, spotted from a failing test in Biopython https://github.com/biopython/biopython/issues/773 where we would parse a file from the internet. The trigger is partially reading the network handle line by line (e.g. until an end record marker is found), and then calling handle.read() to fetch any remaining data. Self contained examples below.

Note that partially reading a file like this still works:

$ python3.5
Python 3.5.0 (default, Sep 14 2015, 12:13:24) 
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> from urllib.request import urlopen
>>> handle = urlopen("http://www.python.org")
>>> chunk = handle.read(50)
>>> rest = handle.read()
>>> handle.close()

However, the following variants read a few lines and then attempt to call handle.read() and fail. The URL is not important (as long as it has over four lines in these examples).

Using readline,

>>> from urllib.request import urlopen
>>> handle = urlopen("http://www.python.org")
>>> for i in range(4):
...     line = handle.readline()
... 
>>> rest = handle.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/xxx/lib/python3.5/http/client.py", line 446, in read
    s = self._safe_read(self.length)
  File "/Users/xxx/lib/python3.5/http/client.py", line 594, in _safe_read
    raise IncompleteRead(b''.join(s), amt)
http.client.IncompleteRead: IncompleteRead(46698 bytes read, 259 more expected)

Using line iteration via next,

>>> from urllib.request import urlopen
>>> handle = urlopen("http://www.python.org")
>>> for i in range(4):
...      line = next(handle)
... 
>>> rest = handle.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/xxx/lib/python3.5/http/client.py", line 446, in read
    s = self._safe_read(self.length)
  File "/Users/xxx/lib/python3.5/http/client.py", line 594, in _safe_read
    raise IncompleteRead(b''.join(s), amt)
http.client.IncompleteRead: IncompleteRead(46698 bytes read, 259 more expected)

Using line iteration directly,

>>> from urllib.request import urlopen
>>> count = 0
>>> handle = urlopen("http://www.python.org")
>>> for line in handle:
...     count += 1
...     if count == 4:
...         break
... 
>>> rest = handle.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/xxx/lib/python3.5/http/client.py", line 446, in read
    s = self._safe_read(self.length)
  File "/Users/xxx/lib/python3.5/http/client.py", line 594, in _safe_read
    raise IncompleteRead(b''.join(s), amt)
http.client.IncompleteRead: IncompleteRead(46698 bytes read, 259 more expected)

These examples all worked on Python 3.3 and 3.4 so this is a regression.

----------
components: Library (Lib)
messages: 261287
nosy: maubp
priority: normal
severity: normal
status: open
title: http.client.IncompleteRead from HTTPResponse read after part reading file
type: behavior
versions: Python 3.5

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue26499>
_______________________________________