[Python-Dev] very bad network performance

Bill Janssen janssen at parc.com
Mon Apr 14 20:36:33 CEST 2008


There's some really convoluted code in socket._fileobject.__init__()
here.  When initializing a _fileobject, if the 'bufsize' parameter is
explicitly given as zero, that's turned into an _rbufsize of 1, which,
combined with the 'min' change, will produce the read-one-byte
behavior.  The code for setting _rbufsize seems odd; it would be nice
if it were commented with some notes on why these specific selections
were made.

        if bufsize < 0:
            bufsize = self.default_bufsize
        if bufsize == 0:
            self._rbufsize = 1
        elif bufsize == 1:
            self._rbufsize = self.default_bufsize
        else:
            self._rbufsize = bufsize
        self._wbufsize = bufsize

Whether you get the one-byte reads also depends on whether 'read' is
called with an explicit number of bytes to read (which appears to be
the case here).
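
To make the interaction concrete, here is a minimal sketch (not the
actual socket.py code) that mirrors the _rbufsize selection above and a
simplified version of the read(size) loop from the diff.  The names
rbufsize_for, FakeSock and count_recv_calls are hypothetical helpers
made up for this illustration, and the default buffer size of 8192 is
an assumption about socket._fileobject.default_bufsize.

```python
# Assumed value of socket._fileobject.default_bufsize (not stated in this thread).
DEFAULT_BUFSIZE = 8192


def rbufsize_for(bufsize, default_bufsize=DEFAULT_BUFSIZE):
    """Mirror the _rbufsize selection from _fileobject.__init__()."""
    if bufsize < 0:
        bufsize = default_bufsize
    if bufsize == 0:
        return 1                # "unbuffered" becomes a 1-byte read buffer
    elif bufsize == 1:
        return default_bufsize  # "line buffered" uses the default size
    return bufsize


class FakeSock(object):
    """Hypothetical stand-in for a socket that counts recv() calls."""

    def __init__(self, payload):
        self._payload = payload
        self.calls = 0

    def recv(self, n):
        self.calls += 1
        chunk, self._payload = self._payload[:n], self._payload[n:]
        return chunk


def count_recv_calls(size, rbufsize, pick):
    """Simplified read(size) loop; pick is min (new) or max (old)."""
    sock = FakeSock(b"x" * size)
    buf_len = 0
    while buf_len < size:
        left = size - buf_len
        recv_size = pick(rbufsize, left)
        data = sock.recv(recv_size)
        if not data:
            break
        buf_len += len(data)
    return sock.calls


if __name__ == "__main__":
    rbufsize = rbufsize_for(0)  # what a bufsize of zero turns into
    print("max:", count_recv_calls(4096, rbufsize, max), "recv call(s)")
    print("min:", count_recv_calls(4096, rbufsize, min), "recv call(s)")
```

Under these assumptions, a 4096-byte read with bufsize 0 needs one
recv() call with 'max' and 4096 calls with 'min', which is the
read-one-byte behavior described above.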

So, it's not the code in socket.py, necessarily; it's the code which
opens the socket, most likely.  The only library which seems to use a
bufsize of zero is httplib (which has a lot of other problems as
well).  I think the change cited below (while IMO correct) will affect
a number of other HTTP-based services, as well.

Bill

> Ralf,
> 
> Terry is right. Please file a bug. I do think there may be a problem
> with that change but I don't have the time to review it in depth.
> Hopefully others will. I do recall that sockets reading one byte at a
> time has been a problem before -- I recall a bug about this in the
> 1.5.2 era for Windows... Too bad it's back. :-(
> 
> --Guido
> 
> On Mon, Apr 14, 2008 at 10:25 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> >
> >  "Ralf Schmitt" <schmir at gmail.com> wrote in message
> >  news:932f8baf0804140912u54adc7d5md7261541857f21bd at mail.gmail.com...
> >
> >
> >  | Hi all,
> >  |
> >  | I'm using mercurial with the release25-maint branch. I noticed that
> >  | checking out a local repository now takes more than
> >  | 5 minutes (it should be around 30s).
> >  |
> >  | I've tracked it down to this change:
> >  | http://hgpy.de/py/release25-maint/rev/e9446c6ab3cd
> >  | this is svn revision 61009. Here is the diff inline:
> >  |
> >  | --- a/Lib/socket.py Fri Mar 23 14:27:29 2007 +0100
> >  | +++ b/Lib/socket.py Sat Feb 23 20:30:59 2008 +0100
> >  | @@ -305,7 +305,7 @@
> >  |             self._rbuf = ""
> >  |             while True:
> >  |                 left = size - buf_len
> >  | -                recv_size = max(self._rbufsize, left)
> >  | +                recv_size = min(self._rbufsize, left)
> >  |                 data = self._sock.recv(recv_size)
> >  |                 if not data:
> >  |                     break
> >  |
> >  |
> >  |
> >  | self._rbufsize is 1, and so the code reads one byte at a time. This is
> >  | clearly wrong. I'm posting it to the mailing list, as I don't want
> >  | this issue to get lost in the bugtracker.
> >
> >  --------------------------------------------------------------------------------
> >
> >  It is at least as likely to get lost here.  There is a mailing list for new
> >  tracker items that many devs subscribe to.






