[Python-Dev] RE: xreadline speed vs readlines_sizehint

Tim Peters tim.one@home.com
Thu, 11 Jan 2001 01:10:40 -0500


[Tim, to MarkF]
>> You average over 255 chars/line?  [nag, nag, nag]

[Mark Favas]
> Real-life input, my boy! It's actually a syslog from my
> mailserver, consisting mainly of sendmail log messages, and I
> have a current need to process these things (MS Exchange,
> corrupted database, clobbered backup tapes), so this thread
> came along at the right time...

Hmm.  I tuned ms_getline_hack for Guido's logfiles, which he said don't
often exceed 160 chars/line.  I guess if you're on a 64-bit platform,
though, it must take about twice as many chars per line to record a log msg
<wink>.

> ...
> Removing the buffer size arg in the call to readlines_sizehint results
> in this (using up-to-the-minute CVS):
> total 131426612 chars and 514216 lines
> count_chars_lines     4.922  4.916
> readlines_sizehint    3.881  3.850
> using_fileinput      10.371 10.366
> while_readline       10.943 10.916
> for_xreadlines        2.990  2.967
>
> and with an 8Kb sizehint:
> total 131426612 chars and 514216 lines
> count_chars_lines     5.241  5.216
> readlines_sizehint    2.917  2.900
> using_fileinput      10.351 10.333
> while_readline       10.990 10.983
> for_xreadlines        2.877  2.867

That's sure consistent across platforms, then.  I guess we'll write it off
to "cache effects" (a catch-all explanation for any timing mystery -- go
ahead, just *try* to prove it's wrong <0.5 wink>).
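
For readers without the benchmark script to hand, the three line-reading idioms behind the readlines_sizehint, while_readline and for_xreadlines rows can be sketched roughly as below. This is a minimal Python 3 sketch, not the script Mark ran: the function bodies are assumptions, the 8 KB default is borrowed from the "8Kb sizehint" run, and plain file iteration stands in for xreadlines, which modern Python no longer has.

```python
import io


def readlines_sizehint(f, sizehint=8 * 1024):
    """Read batches of lines, each batch totalling roughly sizehint bytes."""
    chars = lines = 0
    while True:
        batch = f.readlines(sizehint)
        if not batch:          # empty list means end of file
            break
        for line in batch:
            chars += len(line)
            lines += 1
    return chars, lines


def while_readline(f):
    """Call readline() once per line until it returns an empty string."""
    chars = lines = 0
    while True:
        line = f.readline()
        if not line:
            break
        chars += len(line)
        lines += 1
    return chars, lines


def for_lines(f):
    """Iterate over the file object (assumed stand-in for xreadlines)."""
    chars = lines = 0
    for line in f:
        chars += len(line)
        lines += 1
    return chars, lines


if __name__ == "__main__":
    sample = "a\nbb\nccc\n"
    for func in (readlines_sizehint, while_readline, for_lines):
        print(func.__name__, func(io.StringIO(sample)))
```

All three return the same (chars, lines) pair for the same input; only the number of calls made per line differs, which is what the timings above are measuring.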

[and Mark has HAVE_GETC_UNLOCKED on his Tru64 Unix box, yet
 using_fileinput is quicker than while_readline]

> With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #define'd
> (although defining the former makes the latter def irrelevant):
> (test_bufio also OK)
> total 131426612 chars and 514216 lines
> count_chars_lines     5.056  5.050
> readlines_sizehint    3.771  3.667
> using_fileinput      11.128 11.116
> while_readline        8.287  8.233
> for_xreadlines        3.090  3.083

So ms_getline_hack is significantly faster on your box (I'm only looking at
while_readline:  11 using getc_unlocked, 8.3 using ms_getline_hack).  There
are only two reasons I can imagine for that:

1. Your vendor optimizes the inner loop in fgets (as all vendors should, but
few do).

and/or

2. Despite the long average length of your lines, many of them are
nevertheless shorter than 200 chars, and so all the pain ms_getline_hack
endures to avoid a realloc pays off.
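
Possibility 2 is something that could be measured directly on the log file. A minimal sketch, assuming the 200-char cutoff mentioned above is the relevant threshold; the helper name fraction_short_lines is hypothetical and not part of any benchmark in this thread:

```python
def fraction_short_lines(lines, limit=200):
    """Return the fraction of lines whose length is below limit.

    lines can be any iterable of strings, including an open file.
    The default limit of 200 is an assumption taken from the text above.
    """
    total = short = 0
    for line in lines:
        total += 1
        if len(line) < limit:
            short += 1
    # Avoid dividing by zero on empty input.
    return short / total if total else 0.0


if __name__ == "__main__":
    demo = ["x" * 50 + "\n", "y" * 120 + "\n", "z" * 900 + "\n", "w" * 10 + "\n"]
    print(fraction_short_lines(demo))
```

A result close to 1.0 on the real syslog would favor possibility 2; a small one would leave possibility 1 as the likelier explanation.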

Unfortunately, there's not enough info to figure out whether either, both, or
neither of those is on-target.  It's such a large percentage speedup, though,
that my bet goes primarily to #1 -- unless realloc is really pig slow on
your box.  Which some things *are*:

> With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #undef'ed (just
> for completeness):
> total 131426612 chars and 514216 lines
> count_chars_lines     4.916  4.900
> readlines_sizehint    3.875  3.867
> using_fileinput      14.404 14.383
> while_readline       322.728 321.837
> for_xreadlines        7.113  7.100
>
> So, having HAVE_GETC_UNLOCKED #define'd does make a small improvement
> <grin>

Yes, that's the "platform from Mars" evidence I was seeking:  if
ms_getline_hack survives test_bufio on *your* crazy box, it's as close to
provably correct as any algorithm in all of Computer Science <wink>.
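
The ratios between the three while_readline timings quoted above work out as follows. This is only arithmetic on the first timing column of Mark's tables; the dictionary keys are labels made up for this sketch.

```python
# First-column while_readline timings copied from the tables quoted above.
WHILE_READLINE = {
    "getc_unlocked": 10.943,     # HAVE_GETC_UNLOCKED only
    "ms_getline_hack": 8.287,    # USE_MS_GETLINE_HACK defined
    "neither": 322.728,          # both #undef'ed
}

TOTAL_CHARS = 131426612
TOTAL_LINES = 514216


def speedup(slow, fast):
    """How many times faster the 'fast' configuration is than 'slow'."""
    return WHILE_READLINE[slow] / WHILE_READLINE[fast]


if __name__ == "__main__":
    print("avg chars/line: %.1f" % (TOTAL_CHARS / TOTAL_LINES))
    print("neither vs ms_getline_hack: %.1f" % speedup("neither", "ms_getline_hack"))
    print("neither vs getc_unlocked:   %.1f" % speedup("neither", "getc_unlocked"))
    print("getc_unlocked vs ms_getline_hack: %.2f" % speedup("getc_unlocked", "ms_getline_hack"))
```

The first ratio rounds to 39, and the average line length comes out a little over 255 chars.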

a-factor-of-39-is-almost-big-enough-to-notice!-ly y'rs  - tim