[Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
Tim Peters
tim.one@home.com
Wed, 3 Jan 2001 04:32:55 -0500
[Guido & Tim, wonder about faking getline-like functionality for Windows]
The attached is kinda baffling. The std tests pass with it, and it changes my test
timings from:
count_chars_lines 14.780 14.772
readlines_sizehint 9.390 9.375
using_fileinput 66.130 66.157
while_readline 30.380 30.337
to:
count_chars_lines 14.880 14.854
readlines_sizehint 9.280 9.302
using_fileinput 48.610 48.589
while_readline 13.450 13.451
Big win? You bet. But ...
The baffling parts:
1. That Perl still takes only 6 seconds in line-at-a-time mode.
2. I originally wrote a getline workalike, instead of building directly into a PyString
buffer. That made my test run *slower*, and I'm talking factor of 2, not a yawn. To
judge from my usually silent disk (I've got 256Mb RAM on this box), I'm afraid the extra
mallocs required may have triggered the horrid Win9x malloc-thrashing problem I wrote
about while I was still at Dragon. Consider that another vote for Vlad's PyMalloc --
we've got no handle on x-platform dynamic memory behavior now. Python's destiny is to
replace both the platform OS and libc anyway <0.9 wink>.
The scary parts:
+ As the "XXX" comments indicate, this is full of little insecurities.
+ Another one I just thought of: if the user's last operations on the fp were two or more
consecutive ungetc calls, all bets are off. But then MS doesn't define what happens then
either.
+ This is much less ambitious than I recall Perl's code being: it doesn't try to guess
anything about the file, and effectively captures only what would happen if you could
unroll the guts of a getc-in-a-loop and optimize the snot out of it. The good news is
that this means it's much easier to maintain (it touches only two of the MS FILE* fields,
and in ways that are pretty obviously correct). The bad news is that this seems also
pretty clearly all there *is* to be gotten out of breaking into the FILE* abstraction for
the particular test case I'm using; and increasing TUNEME doesn't save any time at all:
the sucker is flying at full speed already.
+ It drops (line-at-a-time) drops to a little under 13 seconds if I comment out the thread
macros.
+ I haven't looked at Perl's implementation in a year, and they must have dreamt up
another trick since then. That's a "scary part" indeed to anyone who has ever looked at
Perl's implementation.
retreating-into-a-fetal-position-ly y'rs - tim
Anyone wants to play, the sandbox is fileobject.c. Do two things: insert this new chunk
somewhere above get_line:
#ifdef MS_WIN32
static PyObject*
win32_getline(FILE *fp)
{
/* XXX ignores thread safety -- but so does MS's getc macro! */
PyObject* v;
char* pBuf; /* next free slot in v's buffer */
/* MS's internals are declared in terms of ints, but it's a sure bet
* that won't last forever -- use size_t now & live w/ the casting;
* ditto for Python's routines
*/
size_t total_buf_size = 100;
size_t free_buf_size = total_buf_size;
#define TUNEME 1000 /* how much to boost the string buffer when exhausted */
v = PyString_FromStringAndSize((char *)NULL, (int)total_buf_size);
if (v == NULL)
return NULL;
pBuf = BUF(v);
Py_BEGIN_ALLOW_THREADS
for (;;) {
char ch;
size_t ms_cnt; /* FILE->_cnt shadow */
char* ms_ptr; /* FILE->_ptr shadow */
size_t max_to_copy, i;
/* stdio buffer empty or in unknown state; rather
* than try to simulate every quirk of MS's internals,
* let the MS macros deal with it.
*/
/* XXX we also wind up here when we simply run out of string
* XXX buffer space, but I'm not sure I care: making this a
* XXX double-nested loop doesn't seem worth it
*/
ch = getc(fp);
if (ch == EOF)
break;
/* make sure we've got some breathing room */
if (free_buf_size < 100) {
size_t currentoffset = pBuf - BUF(v);
total_buf_size += TUNEME; /* XXX check for overflow */
Py_BLOCK_THREADS
if (_PyString_Resize(&v, (int)total_buf_size) < 0)
return NULL;
Py_UNBLOCK_THREADS
pBuf = BUF(v) + currentoffset;
free_buf_size = TUNEME;
}
/* ch wasn't EOF, so store it */
*pBuf++ = ch;
--free_buf_size;
if (ch == '\n') {
break;
}
ms_cnt = (size_t)fp->_cnt;
if (!ms_cnt) {
/* XXX this is a slow way to read one character at
* XXX a time if, e.g., the stream is unbuffered
*/
continue;
}
/* payback! now we don't have to check for buffer overflows or
* EOF inside the loop, nor does the macro _filbuf() branch force
* _ptr and _cnt in and out of memory on each iteration
*/
ms_ptr = fp->_ptr;
assert(ms_cnt > 0);
i = max_to_copy = ms_cnt < free_buf_size ? ms_cnt : free_buf_size;
do {
/* XXX unclear to me why MS's getc macro does "& 0xff" */
*pBuf++ = ch = *ms_ptr++ & 0xff;
} while (--i && ch != '\n');
/* update the shadows & counters */
fp->_ptr = ms_ptr;
free_buf_size -= max_to_copy - i;
fp->_cnt = ms_cnt - (max_to_copy - i);
if (ch == '\n')
break;
}
Py_END_ALLOW_THREADS
_PyString_Resize(&v, pBuf - BUF(v));
return v;
}
#endif
2. Within get_line, add this before the #endif (this is the getline #if block):
#elif defined(MS_WIN32)
if (n == 0) {
return win32_getline(fp);
}