[Python-Dev] xreadlines : readlines :: xrange : range

Thomas Wouters thomas@xs4all.net
Thu, 4 Jan 2001 07:51:17 +0100


On Wed, Jan 03, 2001 at 07:37:22PM -0500, Guido van Rossum wrote:
> > In other words: use it! :)
> 
> Mind doing a few platform tests on the (new version of the) patch?

Well, only a bit :) It's annoying that BSDI doesn't come with autoconf, but
I managed to use all my early-morning wit (it's 6:30AM <wink>) to work
around it. I've tested it on BSDI 4.1 and FreeBSD 4.2-RELEASE.

> I already know that it works on Red Hat Linux 6.2 (my box) and Solaris
> 2.6 (Andrew's box).  I would be delighted to know that it works on at
> least one other platform that has getc_unlocked() and one platform
> that doesn't have it!

Sorry, I have to disappoint you. FreeBSD does have getc_unlocked, they
just didn't document it. Hurrah for autoconf ;P Anyway, it worked like a
charm on BSDI:

(Python 2.0)
total 1794310 chars and 37660 lines
count_chars_lines     0.310  0.300
readlines_sizehint    0.150  0.150
using_fileinput       2.013  2.017
while_readline        1.006  1.000

(CVS Python + getc_unlocked)
daemon2:~/python/python/dist/src > ./python test.py termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.354  0.350
readlines_sizehint    0.182  0.183
using_fileinput       1.594  1.583
while_readline        0.363  0.367

But something weird is going on on FreeBSD:

(Standard CVS Python)
> ./python ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.265  0.266
readlines_sizehint    0.148  0.148
using_fileinput       0.943  0.938
while_readline        0.214  0.219

(CVS+getc_unlocked)
> ./python-getc-unlocked  ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.266  0.266
readlines_sizehint    0.151  0.141
using_fileinput       1.066  1.078
while_readline        0.283  0.281

This was sufficiently unexpected that I looked a bit further. The FreeBSD
Python was compiled without editing Modules/Setup, so it was statically
linked, no readline etc, but *with* threads (which are on by default, and
functional on both FreeBSD and BSDI 4.1.) Here are the timings after I
enabled just '*shared*':

(CVS + *shared*)
> ./python ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.276  0.273
readlines_sizehint    0.150  0.156
using_fileinput       0.902  0.898
while_readline        0.206  0.203

(This was not a fluke; I repeated it several times and got hardly any
variation.) Enabling readline and cursesmodule had no additional effect.
Adding *shared* to the getc_unlocked tree gave roughly the same improvement,
but the result was still slower than without getc_unlocked.

(CVS + *shared* + getc_unlocked)
> ./python ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.272  0.273
readlines_sizehint    0.149  0.148
using_fileinput       1.031  1.031
while_readline        0.267  0.266

Increasing the size of the testfile didn't change anything, other than the
absolute numbers. I browsed stdio.h, where both getc() and getc_unlocked()
are defined as macros. getc_unlocked is defined as:

#define __sgetc(p) (--(p)->_r < 0 ? __srget(p) : (int)(*(p)->_p++))
#define getc_unlocked(fp)       __sgetc(fp)

and getc either as

#define getc(fp)        getc_unlocked(fp)

(without threads) or

static __inline int                     \
__getc_locked(FILE *_fp)                \
{                                       \
        extern int __isthreaded;        \
        int _ret;                       \
        if (__isthreaded)               \
                _FLOCKFILE(_fp);        \
        _ret = getc_unlocked(_fp);      \
        if (__isthreaded)               \
                funlockfile(_fp);       \
        return (_ret);                  \
}
#define getc(fp)        __getc_locked(fp)
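
For comparison, here is a rough sketch of the by-hand pattern that those
macros compete with: take the stream lock once, read with getc_unlocked(),
and release the lock afterwards. This is not the code from the patch, and the
helper name read_line_unlocked is made up for illustration; only flockfile(),
getc_unlocked() and funlockfile() are the standard POSIX calls.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, not taken from the patch: read one line (including
 * the trailing '\n', if any) into buf while holding the stdio lock for the
 * whole line. Returns the number of bytes stored, or 0 at end of file.
 */
static size_t
read_line_unlocked(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    assert(fp != NULL && buf != NULL && size > 0);

    flockfile(fp);                      /* lock once, unconditionally */
    while (n + 1 < size && (c = getc_unlocked(fp)) != EOF) {
        buf[n++] = (char)c;
        if (c == '\n')
            break;
    }
    funlockfile(fp);                    /* unlock once, after the loop */

    buf[n] = '\0';
    return n;
}
```

Note that the lock and unlock here are unconditional, whereas the threaded
getc() quoted above only locks when __isthreaded is set.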

_FLOCKFILE(x) is defined as flockfile(x), so that isn't the difference. The
speed difference has to be in the quick-and-easy '__isthreaded' test for
whether the locking is even necessary: as long as no thread has been
started, getc() skips the lock entirely. Starting a thread on
'time.sleep(900)' in test.py shows these numbers:

(standard CVS python)
> ./python-shared-std ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.433  0.445
readlines_sizehint    0.204  0.188
using_fileinput       1.595  1.594
while_readline        0.456  0.453

(getc_unlocked)
> ./python-getc-unlocked-shared ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.441  0.453
readlines_sizehint    0.206  0.195
using_fileinput       1.677  1.688
while_readline        0.509  0.508

So... using getc_unlocked manually for performance reasons is a cardinal
sin on FreeBSD unless you are really using threads :-)
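
For anyone who wants to repeat the comparison on another platform without
going through Python, a small C harness along these lines might do. It is
only a sketch: the function names are hypothetical, it counts newlines
rather than doing what test.py does, and clock() measures CPU time at
fairly coarse resolution.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Count newlines with plain getc(); any locking is left to stdio. */
static long
count_lines_getc(FILE *fp)
{
    long lines = 0;
    int c;

    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            ++lines;
    return lines;
}

/* Count newlines with one explicit lock around a getc_unlocked() loop. */
static long
count_lines_getc_unlocked(FILE *fp)
{
    long lines = 0;
    int c;

    flockfile(fp);
    while ((c = getc_unlocked(fp)) != EOF)
        if (c == '\n')
            ++lines;
    funlockfile(fp);
    return lines;
}

/* Rewind fp, run counter over it once, and return the CPU seconds used. */
static double
time_counter(long (*counter)(FILE *), FILE *fp, long *lines)
{
    clock_t start;

    assert(counter != NULL && fp != NULL && lines != NULL);
    rewind(fp);
    start = clock();
    *lines = counter(fp);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_counter() once with count_lines_getc and once with
count_lines_getc_unlocked on the same open file gives two numbers to
compare; a large input (such as the termcapx10 file used above) keeps the
timings well above the clock() resolution.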

Lets-outsmart-the-OS-scheduler-next!-ly y'rs
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!