[Python-Dev] RE: xreadline speed vs readlines_sizehint
Tim Peters
tim.one@home.com
Thu, 11 Jan 2001 01:10:40 -0500
[Tim, to MarkF]
>> You average over 255 chars/line? [nag, nag, nag]
[Mark Favas]
> Real-life input, my boy! It's actually a syslog from my
> mailserver, consisting mainly of sendmail log messages, and I
> have a current need to process these things (MS Exchange,
> corrupted database, clobbered backup tapes), so this thread
> came along at the right time...
Hmm. I tuned ms_getline_hack for Guido's logfiles, which he said don't
often exceed 160 chars/line. I guess if you're on a 64-bit platform,
though, it must take about twice as many chars per line to record a log msg
<wink>.
> ...
> Removing the buffer size arg in the call to readlines_sizehint results
> in this (using up-to-the-minute CVS):
> total 131426612 chars and 514216 lines
> count_chars_lines     4.922    4.916
> readlines_sizehint    3.881    3.850
> using_fileinput      10.371   10.366
> while_readline       10.943   10.916
> for_xreadlines        2.990    2.967
>
> and with an 8 KB sizehint:
> total 131426612 chars and 514216 lines
> count_chars_lines     5.241    5.216
> readlines_sizehint    2.917    2.900
> using_fileinput      10.351   10.333
> while_readline       10.990   10.983
> for_xreadlines        2.877    2.867
That's sure consistent across platforms, then. I guess we'll write it off
to "cache effects" (a catch-all explanation for any timing mystery -- go
ahead, just *try* to prove it's wrong <0.5 wink>).
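For anyone wanting to rerun this kind of comparison, here is a minimal sketch of three of the reading strategies named in Mark's tables. It is written in present-day Python, not against the 2001 CVS tree. The benchmark script itself is not shown in this thread, so the function bodies are assumptions that merely borrow the table labels. Plain file iteration stands in for xreadlines, which modern Python no longer provides, and time_it is a hypothetical helper.

```python
import time

def readlines_sizehint(path, sizehint=8 * 1024):
    """Read batches of lines totalling roughly `sizehint` bytes each."""
    chars = lines = 0
    with open(path, "rb") as f:
        while True:
            batch = f.readlines(sizehint)  # empty list signals EOF
            if not batch:
                break
            for line in batch:
                chars += len(line)
            lines += len(batch)
    return chars, lines

def while_readline(path):
    """Call readline() once per line until it returns an empty string."""
    chars = lines = 0
    with open(path, "rb") as f:
        while True:
            line = f.readline()
            if not line:
                break
            chars += len(line)
            lines += 1
    return chars, lines

def for_file_iteration(path):
    """Iterate over the file object (assumed stand-in for xreadlines)."""
    chars = lines = 0
    with open(path, "rb") as f:
        for line in f:
            chars += len(line)
            lines += 1
    return chars, lines

def time_it(func, path):
    """Return (elapsed seconds, result) for a single run of func(path)."""
    start = time.perf_counter()
    result = func(path)
    return time.perf_counter() - start, result
```

All three functions should report the same (chars, lines) pair for a given file; only the elapsed time from time_it is expected to differ.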
[and Mark has HAVE_GETC_UNLOCKED on his Tru64 Unix box, yet
using_fileinput is quicker than while_readline]
> With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #define'd
> (although defining the former makes the latter def irrelevant):
> (test_bufio also OK)
> total 131426612 chars and 514216 lines
> count_chars_lines     5.056    5.050
> readlines_sizehint    3.771    3.667
> using_fileinput      11.128   11.116
> while_readline        8.287    8.233
> for_xreadlines        3.090    3.083
So ms_getline_hack is significantly faster on your box (I'm only looking at
while_readline: 11 using getc_unlocked, 8.3 using ms_getline_hack). There
are only two reasons I can imagine for that:
1. Your vendor optimizes the inner loop in fgets (as all vendors should, but
few do).
and/or
2. Despite the long average length of your lines, many of them are
nevertheless shorter than 200 chars, and so all the pain ms_getline_hack
endures to avoid a realloc pays off.
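Reason 2 is the kind of thing a quick measurement of the input could settle. A minimal sketch, assuming all that's wanted is the share of lines under some length: the 200 comes from the text above, while the function name and the choice to count the trailing newline as part of the length are illustrative assumptions, not anything taken from ms_getline_hack itself.

```python
def fraction_shorter_than(lines, limit=200):
    """Return the fraction of lines whose length is below `limit`.

    `lines` is any iterable of str or bytes lines, such as an open file.
    Each length includes the trailing newline if the line has one.
    """
    total = short = 0
    for line in lines:
        total += 1
        if len(line) < limit:
            short += 1
    # An empty input has no lines at all; report 0.0 rather than divide by zero.
    return short / total if total else 0.0
```

Passing an open file object (binary or text mode) as `lines` gives the figure for a whole log in one pass.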
Unfortunately, there's not enough info to figure out whether either, both,
or neither of those is on target. It's such a large percentage speedup,
though, that my bet goes primarily to #1 -- unless realloc is really pig
slow on your box. Which some things *are*:
> With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #undef'ed (just
> for completeness):
> total 131426612 chars and 514216 lines
> count_chars_lines     4.916    4.900
> readlines_sizehint    3.875    3.867
> using_fileinput      14.404   14.383
> while_readline      322.728  321.837
> for_xreadlines        7.113    7.100
>
> So, having HAVE_GETC_UNLOCKED #define'd does make a small improvement
> <grin>
Yes, that's the "platform from Mars" evidence I was seeking: if
ms_getline_hack survives test_bufio on *your* crazy box, it's as close to
provably correct as any algorithm in all of Computer Science <wink>.
a-factor-of-39-is-almost-big-enough-to-notice!-ly y'rs - tim
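The arithmetic behind the "over 255 chars/line" nag and the factor of 39 can be rechecked from the figures quoted above. This is only a sketch: the variable names are made up here, and it uses the first timing column of the while_readline rows, since the thread doesn't say what the two columns measure.

```python
# Totals reported in every timing table in this message.
total_chars = 131426612
total_lines = 514216
avg_chars_per_line = total_chars / total_lines

# while_readline timings, first column of each table.
getc_unlocked_time = 10.943   # HAVE_GETC_UNLOCKED only, no sizehint run
ms_getline_hack_time = 8.287  # USE_MS_GETLINE_HACK defined
both_undef_time = 322.728     # both #undef'ed

# How much slower each configuration is than ms_getline_hack.
slowdown_without_either = both_undef_time / ms_getline_hack_time
slowdown_getc_unlocked = getc_unlocked_time / ms_getline_hack_time

print(f"average line length: {avg_chars_per_line:.1f} chars")
print(f"both undef vs ms_getline_hack: {slowdown_without_either:.1f}x")
print(f"getc_unlocked vs ms_getline_hack: {slowdown_getc_unlocked:.2f}x")
```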