[Python-bugs-list] [ python-Bugs-451890 ] Building with Large File Support fails

noreply@sourceforge.net noreply@sourceforge.net
Sat, 08 Sep 2001 21:18:40 -0700


Bugs item #451890, was opened at 2001-08-16 18:00
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=451890&group_id=5470

Category: Build
Group: Python 2.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Gerhard Häring (ghaering)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Building with Large File Support fails

Initial Comment:
(At least) on Linux, building 2.2-HEAD fails when 
building with Large File Support. The failure is in 
Objects/fileobject.c, function _portable_ftell, line 
262.


----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-09-08 21:18

Message:
Logged In: YES 
user_id=6380

Interesting!  My test script for large files worked, so
_FILE_OFFSET_BITS and _LARGEFILE_SOURCE are defined in your
pyconfig.h, but apparently the test for
HAVE_LARGEFILE_SUPPORT failed, because that symbol is *not*
set in your pyconfig.h -- and everything else keys off it!

So the only symbol you really need to pass is
HAVE_LARGEFILE_SUPPORT, and as a workaround you can define
that yourself in pyconfig.h.

This symbol is defined by a bit of configure code that looks
like this in the m4 input:

AC_MSG_CHECKING(whether to enable large file support)
if test "$have_long_long" = yes -a \
	"$ac_cv_sizeof_off_t" -gt "$ac_cv_sizeof_long" -a \
	"$ac_cv_sizeof_long_long" -ge "$ac_cv_sizeof_off_t"; then
  AC_DEFINE(HAVE_LARGEFILE_SUPPORT)
  AC_MSG_RESULT(yes)
else
  AC_MSG_RESULT(no)
fi

Can you upload config.status? That should tell me which of
those symbols doesn't have the right value. My guess is that
off_t is measured at 32 bits because _FILE_OFFSET_BITS is
not defined as 64 at the point that the symbol is measured.
So I have to tweak more stuff...  Back to the drawing board.
:-(
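
A minimal sketch of the condition the configure fragment above
tests, written as plain C so the sizes can be inspected directly.
The helper names would_enable_largefile and report_largefile_sizes
are hypothetical and do not appear in Python's sources; the sketch
also assumes have_long_long is true.

```c
/* Must precede every system header; glibc then selects a 64-bit off_t
   on 32-bit targets. On LP64 systems off_t is 64 bits regardless. */
#define _FILE_OFFSET_BITS 64

#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper (not part of Python's sources): the same
   condition as the configure test, with have_long_long assumed true. */
static int would_enable_largefile(size_t off_t_size, size_t long_size,
                                  size_t long_long_size)
{
    return off_t_size > long_size && long_long_size >= off_t_size;
}

/* Print the sizes this translation unit actually sees. */
static void report_largefile_sizes(void)
{
    printf("sizeof(off_t)=%zu sizeof(long)=%zu sizeof(long long)=%zu -> %s\n",
           sizeof(off_t), sizeof(long), sizeof(long long),
           would_enable_largefile(sizeof(off_t), sizeof(long),
                                  sizeof(long long)) ? "yes" : "no");
}
```

If off_t is measured without _FILE_OFFSET_BITS=64 in effect on a
32-bit glibc system, it should come out the same size as long, and
the condition is false. That would be consistent with the guess
above about why HAVE_LARGEFILE_SUPPORT ends up undefined.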

----------------------------------------------------------------------

Comment By: Gerhard Häring (ghaering)
Date: 2001-09-08 13:10

Message:
Logged In: YES 
user_id=163326

To find out the glibc version, you can invoke "glibcbug". 
My default bug report says:
...
Release:       libc-2.2.2
No, I don't get LFS support without manual work, with
CVS-HEAD and 2.2a3. I've uploaded my entire config.log file,
maybe you can make some sense of it. (It does find ftello and
fseeko, but my pyconfig.h doesn't define the needed macros.)
Come to think of it, I'll upload my pyconfig.h, too.



----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-09-08 12:22

Message:
Logged In: NO 

(This is Guido, in a hurry, not logged in :-)

Gerhard, I'm surprised you still had to pass options to
make. It works without those for me. (How do I tell the
version of glibc I'm using?)

Can you tell me what config.log says after
"checking for CFLAGS to enable large files"?

Have you tried 2.2a3?

----------------------------------------------------------------------

Comment By: Gerhard Häring (ghaering)
Date: 2001-09-08 12:12

Message:
Logged In: YES 
user_id=163326

Guido, I can build the current CVS now with LFS, too (Linux
2.4, glibc 2.2). I saw you did a lot in the configure
script, but I still had to give options to the make command
(grabbed them from Sean's latest source RPMs).

This worked for me:
./configure
make OPT="-g -O3 -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT" \
     CFLAGS="-g -O3 -D_FILE_OFFSET_BITS=64 -DHAVE_LARGEFILE_SUPPORT"

Shouldn't the feature define HAVE_LARGEFILE_SUPPORT be
automatically added to pyconfig.h?

It would perhaps be a good idea to add the info on how to build
with LFS to the build instructions.

Thanks,
Gerhard


----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-09-05 11:36

Message:
Logged In: YES 
user_id=6380

Gerhard, can you try the current CVS? I've done a few things
to try and fix this. I can now build just fine on a pretty
recent Linux 2.4 kernel.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-09-03 02:23

Message:
Logged In: YES 
user_id=21627

To fix the bug at hand (building fails), the following
strategy might be sufficient:
- produce an autoconf test that checks whether fpos_t is
integral, and "large"; define this by default for MSVC
- use this test in portable_fseek/portable_ftell.

I also wonder why the order in which APIs are tried is
different in fseek and ftell (fseek tries fseeko first,
ftell tries ftello only second).

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-08-20 13:19

Message:
Logged In: YES 
user_id=31435

By itself, adding opaque getpos/setpos sounds pretty easy 
(BTW, f{get,set}pos are std in C99).

Returning a usable 64-bit integer remains a x-platform 
mess.  The C99 rationale sez f{get,set}pos are the intended 
way to work with large files, but they provide no way to 
break the abstraction (Jeremy & I both looked in vain -- 
there is no defined way to extract the stream position from 
an fpos_t object, nor to do arithmetic on one).

On Windows, f{get,set}pos are (currently) the only way to 
get a 64-bit stream position from the MS C library (and MS 
doesn't (currently) mix that in with a state encoding; the 
Win32 API has other ways to deal with this).

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-08-20 06:21

Message:
Logged In: YES 
user_id=6380

OK, so we need to add separate getpos() and setpos() methods
that return an opaque wrapper for an fpos_t. That sounds
like serious work, plus it will require changing Python apps
that use seek and tell.

So I think we should *also* continue to search for a way to
use a 64-bit seek offset for Python's seek() and tell()
methods -- I'm presuming this is hidden *somewhere* in the
fpos_t still, since the underlying OS certainly uses
lseek64(). If there's no way to extract it out of the
fpos_t, I propose to call lseek64() directly (after using a
fflush()) on the file descriptor.
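
A rough sketch of the fallback proposed above: flush the stdio
buffer, then ask the kernel for the offset of the underlying file
descriptor. The helper names tell_via_fd and seek_via_fd are
hypothetical. The sketch assumes a POSIX system where defining
_FILE_OFFSET_BITS=64 makes lseek() operate on a 64-bit off_t (so
lseek64() need not be named explicitly), and it only considers
streams whose last operation was a write; input buffering would
need extra care.

```c
/* Must precede every system header so off_t and lseek() are 64-bit. */
#define _FILE_OFFSET_BITS 64

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: current position as an off_t, or -1 on error. */
static off_t tell_via_fd(FILE *fp)
{
    /* Push pending output to the descriptor so its offset is current. */
    if (fflush(fp) != 0)
        return (off_t)-1;
    return lseek(fileno(fp), 0, SEEK_CUR);
}

/* Hypothetical helper: absolute seek; returns 0 on success, -1 on error. */
static int seek_via_fd(FILE *fp, off_t offset)
{
    if (fflush(fp) != 0)
        return -1;
    return lseek(fileno(fp), offset, SEEK_SET) == (off_t)-1 ? -1 : 0;
}
```

Because the position never passes through an fpos_t, the result is
always a plain integer, at the cost of bypassing whatever state
stdio keeps alongside the position.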

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-08-19 22:24

Message:
Logged In: YES 
user_id=31435

Noting that C99 *requires* fpos_t values to hold all the 
info in an mbstate_t, in addition to stream position info.  
So we have to expect others to follow glibc in this, and 
eventually everyone.  fpos_t cannot resolve to an array 
type, but anything else is fair (in particular it need not 
map to an integral type -- and probably won't anymore).

We have to give up belief that fpos_t is a number, because 
it's not.  We can believe that ftell returns a number, 
because it does <wink> -- but ftell isn't suitable for 
large file support.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-08-17 06:13

Message:
Logged In: YES 
user_id=21627

This started in glibc 2.2, I believe, so it would appear in
Redhat 7, SuSE 7, etc.
To see the problem, you have to ./configure with
CFLAGS="-D_FILE_OFFSET_BITS=64" OPT="-O2 $(CFLAGS)"; see
pyconfig.h.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-08-17 03:55

Message:
Logged In: YES 
user_id=6380

Whoa.  Interesting. Which Linux version is this?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-08-17 00:21

Message:
Logged In: YES 
user_id=21627

This fails because in glibc, fpos_t contains an mb_state 
field, so that on restoring the file position, the 
multibyte encoding state of the file can be restored.

I see two solutions here:
- Python could give up the guarantee that the ftell result 
is a number, and return an object that embeds the fpos_t.
- Python could give up the guarantee that ftell/fseek 
works in all cases, and only use ftell(o), which should 
always return a number (at least in Posix). If that 
approach is taken, an additional fgetpos/fsetpos call may 
be appropriate.
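
To illustrate the difference between the two options, here is a
small sketch, assuming a POSIX system with ftello(). The helper
names position_as_number and restore_position are hypothetical.
The first returns a value that can be used in arithmetic; the
second only hands an opaque fpos_t back to the C library.

```c
/* Must precede every system header so off_t is 64-bit. */
#define _FILE_OFFSET_BITS 64

#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: ftello() yields an off_t, which is an integer
   type, so the position can be widened and used as a number.
   Returns -1 on error. */
static long long position_as_number(FILE *fp)
{
    off_t pos = ftello(fp);
    return (long long)pos;
}

/* Hypothetical helper: an fpos_t filled in by fgetpos() is opaque and
   can only be passed back to fsetpos(). Returns 0 on success. */
static int restore_position(FILE *fp, const fpos_t *saved)
{
    return fsetpos(fp, saved);
}
```

A caller would save a position with fgetpos(fp, &mark) and later
pass &mark to restore_position(); nothing portable can be done with
mark in between.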


----------------------------------------------------------------------
