Slow network reading?

Thu May 11 09:16:44 EDT 2006

Ivan Voras wrote:

>      def query(self, sql):
>          self.sf.write("SQL %s\r\n" % sql)
>          self.sf.flush()
>          resp = self.sf.readline().rstrip()
>          m = SQLCacheD.re_rec.match(resp)
>          if m != None: # only if some rows are returned (SELECT)
>              n_rows = int(m.group(1))
>              n_cols = int(m.group(2))
>              cols = []
>              for c in xrange(n_cols):
>                  cols.append(self.sf.readline().rstrip())
>              rs = []
>              for r in xrange(n_rows):
>                  row = {}
>                  for c in cols:
>                      row[c] = self.sf.readline().rstrip()
>                  rs.append(row)
>              return rs
>          m = SQLCacheD.re_ok.match(resp)
>          if m != None: # no rows returned (e.g. INSERT/UPDATE/DELETE)
>              return True
>          raise SQLCacheD_Exception(resp)

Comparative CPU & memory utilisation statistics, not to mention platform 
and version of Python, would be useful hints...

Note that the file-like object returned by makefile() does much of its
heavy lifting in Python rather than C, which can be a drag on
performance.  If you are on a Unix platform, it may be worth
experimenting with os.fdopen() on the socket's fileno() to see whether
the core Python file object (implemented in C) can be used in place of
the lookalike returned by the makefile() method.
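
As a rough sketch of that experiment (Unix only, and untested against
your server): the helper name fdopen_reader and the socketpair() demo
below are my own inventions for illustration. The descriptor is
duplicated with os.dup() so that closing the file object does not close
the socket underneath it. Only the reading side is wrapped; writes still
go through the socket. On Python 3, os.fdopen() returns an io buffered
reader rather than the old C file object, so any gain there may differ.

```python
import os
import socket


def fdopen_reader(sock, bufsize=16384):
    """Return a buffered binary reader over a duplicate of sock's fd.

    Hypothetical helper: assumes a blocking socket on a Unix platform.
    """
    fd = os.dup(sock.fileno())  # duplicate, so close() leaves sock usable
    return os.fdopen(fd, "rb", bufsize)


def demo():
    """Push two CRLF-terminated lines through a socket pair and read them."""
    a, b = socket.socketpair()
    try:
        a.sendall(b"OK 0\r\nsecond line\r\n")
        reader = fdopen_reader(b)
        try:
            first = reader.readline().rstrip()
            second = reader.readline().rstrip()
        finally:
            reader.close()
        return first, second
    finally:
        a.close()
        b.close()
```

A socket with a timeout set is non-blocking at the descriptor level, so
this approach probably only suits plain blocking sockets.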

Even without that, you are specifying a buffer size smaller than the
default (8k; see Lib/socket.py). Using at least the default would be
worth trying, and 16k might be even better.
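
For illustration only, this is roughly what asking for a 16k buffer
looks like. The helper name open_stream and the socketpair() demo are
assumptions of mine, not part of your code; the buffer size is passed
positionally as the second argument to makefile().

```python
import socket


def open_stream(sock, bufsize=16 * 1024):
    """Wrap sock in a buffered, binary, read/write file-like object."""
    # Second positional argument is the buffer size in bytes.
    return sock.makefile("rwb", bufsize)


def demo():
    """Send one request line across a socket pair and read it back."""
    client, server = socket.socketpair()
    cs = open_stream(client)
    ss = open_stream(server)
    try:
        cs.write(b"SQL SELECT 1\r\n")
        cs.flush()  # buffered writes stay local until flushed
        return ss.readline().rstrip()
    finally:
        cs.close()
        ss.close()
        client.close()
        server.close()
```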

Although they're only micro-optimisations, I'd be interested in the
relative performance of the query method re-written as:

     def query(self, sql):
         self.sf.write("SQL %s\r\n" % sql)
         self.sf.flush()
         sf_readline = self.sf.readline
         resp = sf_readline().rstrip()
         m = self.re_rec.match(resp)
         if m is not None:
             # some rows are returned (SELECT)
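             n_rows = int(m.group(1))
             n_cols = int(m.group(2))
             # pre-allocate both lists, then fill them in by index
             cols = [None] * n_cols
             for i in xrange(n_cols):
                 cols[i] = sf_readline().rstrip()
             rows = [None] * n_rows
             for i in xrange(n_rows):
                 row = {}
                 for c in cols:
                     row[c] = sf_readline().rstrip()
                 rows[i] = row
             return rows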
         elif self.re_ok.match(resp) is not None:
             # no rows returned (e.g. INSERT/UPDATE/DELETE)
             return True
         raise SQLCacheD_Exception(resp)

This implementation is based on 2 strategies for better performance:
- minimise name lookups by hoisting references from outside the method
   to local references;
- pre-allocate lists when the required sizes are known, to avoid the
   costs associated with growing them.

Both strategies can pay fair dividends when the repetition counts are
large enough; whether this is the case for your tests I can't say.
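
One way to find out is to time the two styles away from the network.
The following is only a sketch: the row count, column names, and
payload are made up, and an in-memory io.BytesIO stands in for the
socket file, so it measures just the Python-level loop overhead and
says nothing about your actual network path.

```python
import io
import timeit

# Made-up workload: 2000 rows of 3 columns, one CRLF-terminated field each.
N_ROWS = 2000
COLS = ["id", "name", "value"]
PAYLOAD = b"".join(b"field\r\n" for _ in range(N_ROWS * len(COLS)))


def read_append(stream):
    """Original style: attribute lookup on every call, list grown by append."""
    rs = []
    for _ in range(N_ROWS):
        row = {}
        for c in COLS:
            row[c] = stream.readline().rstrip()
        rs.append(row)
    return rs


def read_prealloc(stream):
    """Rewritten style: readline bound to a local name, list pre-allocated."""
    readline = stream.readline
    rows = [None] * N_ROWS
    for i in range(N_ROWS):
        row = {}
        for c in COLS:
            row[c] = readline().rstrip()
        rows[i] = row
    return rows


def time_variant(func, repeat=5, number=20):
    """Best-of-repeat wall time for `number` runs over a fresh stream."""
    return min(timeit.repeat(lambda: func(io.BytesIO(PAYLOAD)),
                             repeat=repeat, number=number))


if __name__ == "__main__":
    for func in (read_append, read_prealloc):
        print("%-14s %.4f s" % (func.__name__, time_variant(func)))
```

If the two timings are close at realistic row counts, the bottleneck is
more likely in the socket reads than in the loop itself.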

-- 
-------------------------------------------------------------------------
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac at bullseye.apana.org.au  (pref) | Snail: PO Box 370
        andymac at pcug.org.au             (alt) |        Belconnen ACT 2616
Web:    http://www.andymac.org/               |        Australia