Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by
On Tue, May 3, 2011 at 3:19 PM, victor.stinner
+# Issue #10276 - check that inputs of 2 GB are handled correctly. +# Be aware of issues #1202, #8650, #8651 and #10276 +class ChecksumBigBufferTestCase(unittest.TestCase): + int_max = 0x7FFFFFFF + + @unittest.skipUnless(mmap, "mmap() is not available.") + def test_big_buffer(self): + if sys.platform[:3] == 'win' or sys.platform == 'darwin': + requires('largefile', + 'test requires %s bytes and a long time to run' % + str(self.int_max)) + try: + with open(TESTFN, "wb+") as f: + f.seek(self.int_max-4) + f.write("asdf") + f.flush() + try: + m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) + self.assertEqual(zlib.crc32(m), 0x709418e7) + self.assertEqual(zlib.adler32(m), -2072837729) + finally: + m.close() + except (IOError, OverflowError): + raise unittest.SkipTest("filesystem doesn't have largefile support") + finally: + unlink(TESTFN) + +
0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be 0x80000000. However, if you make this change, crc32() and adler32() raise OverflowErrors (see changeset a0681e7a6ded). This makes the test to erroneously report that the filesystem doesn't support large files. The assertEqual() tests should probably be changed to assertRaises(..., OverflowError). Also, the assignment to m needs to be moved outside of the inner try...finally block. If mmap() fails, the call to m.close() raises a new exception because m has not yet been bound. This seems to be causing failures on some of the 32-bit buildbots. As an aside, in this sort of situation is it better to just go and commit a fix myself, or is raising it on the mailing list first the right way to do things? Cheers, Nadeem
Hello,
On Tue, 3 May 2011 16:22:27 +0200
Nadeem Vawda
As an aside, in this sort of situation is it better to just go and commit a fix myself, or is raising it on the mailing list first the right way to do things?
Raising it on the mailing-list makes it serve as a kind of post-commit review. Also, it ensures that the committer of the original patch understands the issues with it. cheers Antoine.
Le mardi 03 mai 2011 à 16:22 +0200, Nadeem Vawda a écrit :
On Tue, May 3, 2011 at 3:19 PM, victor.stinner
wrote: +# Issue #10276 - check that inputs of 2 GB are handled correctly. +# Be aware of issues #1202, #8650, #8651 and #10276 +class ChecksumBigBufferTestCase(unittest.TestCase): + int_max = 0x7FFFFFFF + + @unittest.skipUnless(mmap, "mmap() is not available.") + def test_big_buffer(self): + if sys.platform[:3] == 'win' or sys.platform == 'darwin': + requires('largefile', + 'test requires %s bytes and a long time to run' % + str(self.int_max)) + try: + with open(TESTFN, "wb+") as f: + f.seek(self.int_max-4) + f.write("asdf") + f.flush() + try: + m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) + self.assertEqual(zlib.crc32(m), 0x709418e7) + self.assertEqual(zlib.adler32(m), -2072837729) + finally: + m.close() + except (IOError, OverflowError): + raise unittest.SkipTest("filesystem doesn't have largefile support") + finally: + unlink(TESTFN) + +
0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be 0x80000000. However, if you make this change, crc32() and adler32() raise OverflowErrors (see changeset a0681e7a6ded).
I don't want to check OverflowError: the test is supposed to compute the checksum of a buffer of 0x7FFFFFFF bytes, to check crc32() and adler32(). 0x7FFFFFFF is the biggest size supported by these functions (zlib doesn't use Py_ssize_t in Python 2.7). If you use a buffer of 0x80000000 bytes, you test PyArg_Parse*() functions, which have already a dedicated test (in test_xml_etree_c, it's not the best file to store such test...).
Also, the assignment to m needs to be moved outside of the inner try...finally block.
Yeah, I noticed this with buildbots: already fixed by dd58f8072216.
As an aside, in this sort of situation is it better to just go and commit a fix myself, or is raising it on the mailing list first the right way to do things?
I'm not sure that you understood the test, so I think that it's better to ask first on IRC and/or the mailing list. Victor
On Tue, May 3, 2011 at 10:38 PM, Victor Stinner
I don't want to check OverflowError: the test is supposed to compute the checksum of a buffer of 0x7FFFFFFF bytes, to check crc32() and adler32(). 0x7FFFFFFF is the biggest size supported by these functions (zlib doesn't use Py_ssize_t in Python 2.7).
I see. Since you mentioned issue 10276 in the commit message, I assumed you were testing for the underlying C functions truncating their arguments. It seems that I was mistaken. Sorry for the confusion. Cheers, Nadeem
Victor Stinner wrote:
Le mardi 03 mai 2011 à 16:22 +0200, Nadeem Vawda a écrit :
On Tue, May 3, 2011 at 3:19 PM, victor.stinner
wrote: +# Issue #10276 - check that inputs of 2 GB are handled correctly. +# Be aware of issues #1202, #8650, #8651 and #10276 +class ChecksumBigBufferTestCase(unittest.TestCase): + int_max = 0x7FFFFFFF + + @unittest.skipUnless(mmap, "mmap() is not available.") + def test_big_buffer(self): + if sys.platform[:3] == 'win' or sys.platform == 'darwin': + requires('largefile', + 'test requires %s bytes and a long time to run' % + str(self.int_max)) + try: + with open(TESTFN, "wb+") as f: + f.seek(self.int_max-4) + f.write("asdf") + f.flush() + try: + m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) + self.assertEqual(zlib.crc32(m), 0x709418e7) + self.assertEqual(zlib.adler32(m), -2072837729) + finally: + m.close() + except (IOError, OverflowError): + raise unittest.SkipTest("filesystem doesn't have largefile support") + finally: + unlink(TESTFN) + + 0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be 0x80000000. However, if you make this change, crc32() and adler32() raise OverflowErrors (see changeset a0681e7a6ded).
I don't want to check OverflowError: the test is supposed to compute the checksum of a buffer of 0x7FFFFFFF bytes
The comment says 'check that inputs of 2 GB are handled correctly' but the file created is 1 byte short of 2Gb. Is the test wrong, or just wrongly commented? Or am I not understanding? ~Ethan~
Le mercredi 04 mai 2011 à 15:40 -0700, Ethan Furman a écrit :
Victor Stinner wrote:
Le mardi 03 mai 2011 à 16:22 +0200, Nadeem Vawda a écrit :
On Tue, May 3, 2011 at 3:19 PM, victor.stinner
wrote: +# Issue #10276 - check that inputs of 2 GB are handled correctly. +# Be aware of issues #1202, #8650, #8651 and #10276 +class ChecksumBigBufferTestCase(unittest.TestCase): + int_max = 0x7FFFFFFF + + @unittest.skipUnless(mmap, "mmap() is not available.") + def test_big_buffer(self): + if sys.platform[:3] == 'win' or sys.platform == 'darwin': + requires('largefile', + 'test requires %s bytes and a long time to run' % + str(self.int_max)) + try: + with open(TESTFN, "wb+") as f: + f.seek(self.int_max-4) + f.write("asdf") + f.flush() + try: + m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) + self.assertEqual(zlib.crc32(m), 0x709418e7) + self.assertEqual(zlib.adler32(m), -2072837729) + finally: + m.close() + except (IOError, OverflowError): + raise unittest.SkipTest("filesystem doesn't have largefile support") + finally: + unlink(TESTFN) + + 0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be 0x80000000. However, if you make this change, crc32() and adler32() raise OverflowErrors (see changeset a0681e7a6ded).
I don't want to check OverflowError: the test is supposed to compute the checksum of a buffer of 0x7FFFFFFF bytes
The comment says 'check that inputs of 2 GB are handled correctly' but the file created is 1 byte short of 2Gb. Is the test wrong, or just wrongly commented? Or am I not understanding?
If you write a byte after 2 GB of zeros, the file size is 2 GB+the few bytes. This trick is to create quickly a large file: some OSes support sparse files, zeros are not written on disk. But on Mac OS X and Windows, you really write 2 GB+some bytes. Victor
On Thu, May 5, 2011 at 11:33 AM, Victor Stinner
Le mercredi 04 mai 2011 à 15:40 -0700, Ethan Furman a écrit :
The comment says 'check that inputs of 2 GB are handled correctly' but the file created is 1 byte short of 2Gb. Is the test wrong, or just wrongly commented? Or am I not understanding?
If you write a byte after 2 GB of zeros, the file size is 2 GB+the few bytes. This trick is to create quickly a large file: some OSes support sparse files, zeros are not written on disk. But on Mac OS X and Windows, you really write 2 GB+some bytes.
Ethan's point is that 0x7FFFFFFF is not 2GB - it is (2G-1) bytes. So the test and the preceding comment are inconsistent.
On 5 May 2011 10:33, Victor Stinner
If you write a byte after 2 GB of zeros, the file size is 2 GB+the few bytes. This trick is to create quickly a large file: some OSes support sparse files, zeros are not written on disk. But on Mac OS X and Windows, you really write 2 GB+some bytes.
FWIW, on Windows you can create sparse files, using DeviceIoControl(FILE_SET_SPARSE). It's probably too messy to be worth it for this case, though... Paul
Victor Stinner wrote:
Le mercredi 04 mai 2011 à 15:40 -0700, Ethan Furman a écrit :
Victor Stinner wrote:
Le mardi 03 mai 2011 à 16:22 +0200, Nadeem Vawda a écrit :
On Tue, May 3, 2011 at 3:19 PM, victor.stinner
wrote: + int_max = 0x7FFFFFFF
+ with open(TESTFN, "wb+") as f: + f.seek(self.int_max-4) + f.write("asdf") + f.flush()
0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be 0x80000000. However, if you make this change, crc32() and adler32() raise OverflowErrors (see changeset a0681e7a6ded).
I don't want to check OverflowError: the test is supposed to compute the checksum of a buffer of 0x7FFFFFFF bytes
The comment says 'check that inputs of 2 GB are handled correctly' but the file created is 1 byte short of 2Gb. Is the test wrong, or just wrongly commented? Or am I not understanding?
If you write a byte after 2 GB of zeros, the file size is 2 GB+the few bytes. This trick is to create quickly a large file: some OSes support sparse files, zeros are not written on disk. But on Mac OS X and Windows, you really write 2 GB+some bytes.
True, but that's not what's happening -- four bytes are being written at int_max - 4, and int_max is one less that 2GB; hence the resulting file is one less than 2GB. ~Ethan~
Le jeudi 05 mai 2011 à 05:07 -0700, Ethan Furman a écrit :
... hence the resulting file is one less than 2GB.
Yep, it's 0x7FFFFFFF because it's INT_MAX, the biggest value storable in an int. The zlib module stores the buffer size into an int in Python 2.7 (and Py_ssize_t in Python 3.3). Victor
Victor Stinner wrote:
Le jeudi 05 mai 2011 à 05:07 -0700, Ethan Furman a écrit :
... hence the resulting file is one less than 2GB.
Yep, it's 0x7FFFFFFF because it's INT_MAX, the biggest value storable in an int. The zlib module stores the buffer size into an int in Python 2.7 (and Py_ssize_t in Python 3.3).
So we are agreed that the file is not, in fact, 2GB in size...
On Tue, May 3, 2011 at 3:19 PM, victor.stinner
wrote: +# Issue #10276 - check that inputs of 2 GB are handled correctly. +# Be aware of issues #1202, #8650, #8651 and #10276
So why do the comments say we are testing a 2GB input? ~Ethan~
participants (5)
-
Antoine Pitrou
-
Ethan Furman
-
Nadeem Vawda
-
Paul Moore
-
Victor Stinner