[Python-Dev] Rehabilitating fgets

Tim Peters tim_one@email.msn.com
Sat, 6 Jan 2001 22:33:02 -0500


> [Tim suggests to use fgets(), preparing the buffer with non-null
> bytes, and searching for a null byte from the right.]

[Guido]
> If this is really sufficiently fast, I'd say, go for it.  Looks
> bullet-proof as long as the source code to MSVCRT doesn't change. :-)

Surprise?  Despite all the memsets, memchrs (looking for a newline), and
one-at-a-time backward searches (looking for a null byte), it's a huge win
on Windows:

total 117615824 chars and 3237568 lines
readlines_sizehint    9.550  9.578
using_fileinput      28.790 28.781
while_readline       13.120 13.134

The last one (while_readline) took 30.5 seconds before the fgets hackery.
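
For readers who want the trick spelled out, here is a minimal sketch of the idea described above: pre-fill the buffer with a non-null byte, call fgets(), look for a newline with memchr(), and otherwise scan backward for the null byte fgets() wrote. This is not the code that was checked in. The helper name fgets_chunk and the fill byte 'x' are made up for illustration. The sketch also assumes fgets() leaves everything after its terminating null untouched, which the C standard does not promise.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the code from the actual patch.
 * Reads at most n-1 bytes into buf via fgets() and returns how many
 * bytes were stored before the terminating null, counting any null
 * bytes that came from the file itself.  Returns 0 at EOF or on error.
 * Assumption: fgets() does not modify bytes past its terminating null. */
static size_t fgets_chunk(char *buf, size_t n, FILE *fp)
{
    char *p;

    assert(n >= 2);
    memset(buf, 'x', n);            /* any fill byte except '\0' and '\n' */
    if (fgets(buf, (int)n, fp) == NULL)
        return 0;

    /* fgets() stops after a newline, so at most one can be present,
     * and memchr() is not fooled by null bytes in front of it. */
    p = memchr(buf, '\n', n);
    if (p != NULL)
        return (size_t)(p - buf) + 1;

    /* No newline: everything right of the terminator is still fill,
     * so the rightmost null byte is the one fgets() wrote. */
    p = buf + n - 1;
    while (p > buf && *p != '\0')
        --p;
    assert(*p == '\0');
    return (size_t)(p - buf);
}
```

Note that when this returns n-1 with no newline at the end, the caller still cannot tell whether the line continues or the file simply ended there, so it has to call again to find out.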

I'll check it in tomorrow after sleeping on it (there's a large pile of
messy endcases (not only does fgets() invent a null byte, it can't tell you
whether it stopped reading due to EOF, so maybe the last line in the file
ends with 10000 null bytes + no newline + exactly lines up with a buffer
boundary -- etc); test_builtin is failing in a closely related area but
nobody would have checked in code that failed a std test <wink>; and it's
been a frustrating day all around).

i-want-my-cable-modem-back-now-ly y'rs  - tim