[Python-Dev] very bad network performance
janssen at parc.com
Mon Apr 14 20:36:33 CEST 2008
There's some really convoluted code in socket._fileobject.__init__()
here. When initializing a _fileobject, if the 'bufsize' parameter is
explicitly given as zero, that's turned into an _rbufsize of 1, which,
combined with the 'min' change, will produce the read-one-byte
behavior. The code for setting _rbufsize seems odd; it would be nice if
it were commented with some notes on why these specific values were chosen.
    if bufsize < 0:
        bufsize = self.default_bufsize
    if bufsize == 0:
        self._rbufsize = 1
    elif bufsize == 1:
        self._rbufsize = self.default_bufsize
    else:
        self._rbufsize = bufsize
    self._wbufsize = bufsize
It also depends on whether 'read' is called with an explicit number of
bytes to read (which appears to be the case here).
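To make the interaction concrete, here is a rough sketch (not the actual
socket.py code) of how the buffer-size selection combines with the
max-to-min change. The helper names effective_rbufsize and
count_recv_calls are made up for illustration, the default of 8192 is
assumed to match _fileobject.default_bufsize, and the simulation assumes
every recv() returns exactly the number of bytes requested.

```python
# Assumed to match _fileobject.default_bufsize in socket.py.
DEFAULT_BUFSIZE = 8192


def effective_rbufsize(bufsize, default_bufsize=DEFAULT_BUFSIZE):
    """Hypothetical helper mirroring the _rbufsize selection shown above."""
    if bufsize < 0:
        bufsize = default_bufsize
    if bufsize == 0:
        return 1                  # "unbuffered" becomes a 1-byte read buffer
    if bufsize == 1:
        return default_bufsize    # "line buffered" uses the default size
    return bufsize


def count_recv_calls(size, rbufsize, choose):
    """Count the recv() calls needed to read `size` bytes.

    `choose` is either max (old behavior) or min (new behavior).
    Simplifying assumption: each recv() returns exactly recv_size bytes.
    """
    calls = 0
    buf_len = 0
    while buf_len < size:
        left = size - buf_len
        recv_size = choose(rbufsize, left)
        buf_len += recv_size
        calls += 1
    return calls


if __name__ == "__main__":
    rbufsize = effective_rbufsize(0)  # what a bufsize of zero ends up as
    print("old (max):", count_recv_calls(4096, rbufsize, max))
    print("new (min):", count_recv_calls(4096, rbufsize, min))
```

Under those assumptions, a 4096-byte read with bufsize=0 takes one recv()
call with max and 4096 calls with min, which would be consistent with the
slowdown reported below.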
So, it's not the code in socket.py, necessarily; it's the code which
opens the socket, most likely. The only library which seems to use a
bufsize of zero is httplib (which has a lot of other problems as
well). I think the change cited below (while IMO correct) will affect
a number of other HTTP-based services, as well.
> Terry is right. Please file a bug. I do think there may be a problem
> with that change but I don't have the time to review it in depth.
> Hopefully others will. I do recall that sockets reading one byte at a
> time has been a problem before -- I recall a bug about this in the
> 1.5.2 era for Windows... Too bad it's back. :-(
> On Mon, Apr 14, 2008 at 10:25 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> > "Ralf Schmitt" <schmir at gmail.com> wrote in message
> > news:932f8baf0804140912u54adc7d5md7261541857f21bd at mail.gmail.com...
> > | Hi all,
> > |
> > | I'm using mercurial with the release25-maint branch. I noticed that
> > | checking out a local repository now takes more than
> > | 5 minutes (it should be around 30s).
> > |
> > | I've tracked it down to this change:
> > | http://hgpy.de/py/release25-maint/rev/e9446c6ab3cd
> > | this is svn revision 61009. Here is the diff inline:
> > |
> > | --- a/Lib/socket.py Fri Mar 23 14:27:29 2007 +0100
> > | +++ b/Lib/socket.py Sat Feb 23 20:30:59 2008 +0100
> > | @@ -305,7 +305,7 @@
> > |              self._rbuf = ""
> > |              while True:
> > |                  left = size - buf_len
> > | -                recv_size = max(self._rbufsize, left)
> > | +                recv_size = min(self._rbufsize, left)
> > |                  data = self._sock.recv(recv_size)
> > |                  if not data:
> > |                      break
> > |
> > |
> > |
> > | self._rbufsize is 1, and so the code reads one byte at a time. This is
> > | clearly wrong. I'm posting it to the mailing list, as I don't want
> > | this issue to get lost in the bugtracker.
> > --------------------------------------------------------------------------------
> > It is at least as likely to get lost here. There is a mailing list for new
> > tracker items that many devs subscribe to.