[Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects fileobject.c,2.91,2.92

Guido van Rossum guido@python.org
Mon, 13 Nov 2000 17:36:06 -0500


> > Mh...  Thanks.  And how are we testing for large file support?
> > Perhaps the test should be improved rather than hacks piled on top of
> > each other?  What's wrong with the current test?
> > 
> 
> I agree: the test should improve. Here it is:
> 
> AC_MSG_CHECKING(whether to enable large file support)
> if test "$have_long_long" = yes -a \
>     "$ac_cv_sizeof_off_t" -gt "$ac_cv_sizeof_long" -a \
>     "$ac_cv_sizeof_long_long" -ge "$ac_cv_sizeof_off_t"; then
>   AC_DEFINE(HAVE_LARGEFILE_SUPPORT)
>   AC_MSG_RESULT(yes)
> else
>   AC_MSG_RESULT(no)
> fi
> 
> 
> BSDI, etc pass this test but do not have 64-bit capable ftell/fseek functions
> (i.e. ones that use that off_t variable that is as big as a LONG_LONG). In
> posix these are called ftello/fseeko. We also check for ftell64/fseek64 (I
> think). In windows it is _tell64, I think. ...anyway there are a lot of ways
> to spell it.
> 
> I don't know right away how to translate that into an appropriate configure
> test. I.e. how do you test that the platform has an ftell/fseek-like function
> that uses an index variable whose sizeof() is at least 8. 
> 
> Note that, IIRC, windows had some funny case where the test above would have
> failed but that was okay because there were lower level I/O functions that
> used a 64-bit capable fpos_t (instead of off_t). I can't remember the exact
> details. 

After a bit of grepping, it seems that HAVE_LARGEFILE_SUPPORT reliably
means that the low-level system calls (lseek(), stat() etc.)  support
large files, through an off_t type that is at least 8 bytes (assumed
to be equivalent with a long long in some places, given the use of
PyLong_FromLongLong() and PyLong_AsLongLong()).

But the problems occur in fileobject.c, where we're dealing with the
stdio library.  Not all stdio libraries seem to support large files in
the same way, and they use a different typedef, fpos_t, which may be
larger or smaller in size than off_t.
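
To see the mismatch on a given platform, a small sketch like the one
below can compare the two types.  The helper names are hypothetical
and not taken from fileobject.c; it assumes a POSIX-like system where
<sys/types.h> provides off_t.  Note that on some libraries fpos_t is a
struct rather than an integer, so its size alone says nothing about
whether it can be used arithmetically.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: 1 if off_t is at least 8 bytes, i.e. wide
   enough to address files beyond 2 GB, else 0. */
static int off_t_is_64bit(void)
{
    return sizeof(off_t) >= 8;
}

/* Hypothetical helper: compare the stdio position type with off_t.
   Returns -1 if fpos_t is smaller, 0 if equal, 1 if larger. */
static int fpos_vs_off(void)
{
    if (sizeof(fpos_t) < sizeof(off_t))
        return -1;
    return sizeof(fpos_t) > sizeof(off_t);
}
```

Any of the three results from fpos_vs_off() is possible, which is why
code that silently assumes the two types are interchangeable is fragile.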

Aha!  It looks like the problem is that under a variety of
circumstances TELL64(fd) is required; but these have in common that
HAVE_LARGEFILE_SUPPORT is defined, and in all cases except MS_WIN64 it
is defined as equal to lseek((fd),0,SEEK_CUR).

So wouldn't the problem be solved if we changed the #ifdefs so that
that is the default definition, instead of only defining that for
specific platforms?  As more and more platforms develop kernel support
for 64-bit offsets but their stdio libraries lag behind (or use APIs
we don't know about, or assume that fsetpos() is sufficient), we'll
run into this more and more often.
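
A minimal sketch of what that reordering might look like, assuming a
POSIX-like platform with lseek() and fileno().  The wrapper function
tell_via_fd() is hypothetical and only there to exercise the macro; it
is not code from fileobject.c.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict modes */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch of the proposal: keep the one special case and make the
   lseek() spelling the fallback for everything else. */
#if defined(MS_WIN64)
#define TELL64 _telli64
#else
#define TELL64(fd) lseek((fd), 0, SEEK_CUR)
#endif

/* Hypothetical helper: report a stream's position through its file
   descriptor.  Buffered output is flushed first so that the kernel
   offset matches what stdio has written.  Returns -1 on error. */
static long long tell_via_fd(FILE *fp)
{
    if (fflush(fp) != 0)
        return -1;
    return (long long)TELL64(fileno(fp));
}
```

With this shape, a platform whose kernel supports 64-bit offsets gets
a working TELL64() even when nothing is known about its stdio library.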

Here's the current logic for defining TELL64() in fileobject.c:

#if defined(MS_WIN64)
#define TELL64 _telli64
#elif defined(__NetBSD__) || ... /* etc. */
/* NOTE: this is only used on older
   NetBSD prior to f*o() functions */
#define TELL64(fd) lseek((fd),0,SEEK_CUR)
#endif

There's exactly one place where TELL64() is subsequently used, which
is inside

#elif defined(HAVE_LARGEFILE_SUPPORT) && SIZEOF_FPOS_T >= 8
#else

Any objections???

--Guido van Rossum (home page: http://www.python.org/~guido/)