Mailman 3 Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by - Python-Dev

newer
Re: [Python-Dev] cpython (merge...

Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by

older
What if replacing items in a...

Nadeem Vawda

3 May 2011 3 May '11

7:52 p.m.

On Tue, May 3, 2011 at 3:19 PM, victor.stinner wrote:

...

+# Issue #10276 - check that inputs of 2 GB are handled correctly. +# Be aware of issues #1202, #8650, #8651 and #10276 +class ChecksumBigBufferTestCase(unittest.TestCase): + int_max = 0x7FFFFFFF + + @unittest.skipUnless(mmap, "mmap() is not available.") + def test_big_buffer(self): + if sys.platform[:3] == 'win' or sys.platform == 'darwin': + requires('largefile', + 'test requires %s bytes and a long time to run' % + str(self.int_max)) + try: + with open(TESTFN, "wb+") as f: + f.seek(self.int_max-4) + f.write("asdf") + f.flush() + try: + m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) + self.assertEqual(zlib.crc32(m), 0x709418e7) + self.assertEqual(zlib.adler32(m), -2072837729) + finally: + m.close() + except (IOError, OverflowError): + raise unittest.SkipTest("filesystem doesn't have largefile support") + finally: + unlink(TESTFN) + +

0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be 0x80000000. However, if you make this change, crc32() and adler32() raise OverflowErrors (see changeset a0681e7a6ded). This makes the test to erroneously report that the filesystem doesn't support large files. The assertEqual() tests should probably be changed to assertRaises(..., OverflowError). Also, the assignment to m needs to be moved outside of the inner try...finally block. If mmap() fails, the call to m.close() raises a new exception because m has not yet been bound. This seems to be causing failures on some of the 32-bit buildbots. As an aside, in this sort of situation is it better to just go and commit a fix myself, or is raising it on the mailing list first the right way to do things? Cheers, Nadeem

Show replies by thread

Antoine Pitrou

4 May 4 May

12:27 a.m.

New subject: [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by

Hello, On Tue, 3 May 2011 16:22:27 +0200 Nadeem Vawda wrote:

...

As an aside, in this sort of situation is it better to just go and commit a fix myself, or is raising it on the mailing list first the right way to do things?

Raising it on the mailing-list makes it serve as a kind of post-commit review. Also, it ensures that the committer of the original patch understands the issues with it. cheers Antoine.

Victor Stinner

2:08 a.m.

New subject: [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by

Le mardi 03 mai 2011 à 16:22 +0200, Nadeem Vawda a écrit :

...

On Tue, May 3, 2011 at 3:19 PM, victor.stinner wrote:

...
+# Issue #10276 - check that inputs of 2 GB are handled correctly. +# Be aware of issues #1202, #8650, #8651 and #10276 +class ChecksumBigBufferTestCase(unittest.TestCase): + int_max = 0x7FFFFFFF + + @unittest.skipUnless(mmap, "mmap() is not available.") + def test_big_buffer(self): + if sys.platform[:3] == 'win' or sys.platform == 'darwin': + requires('largefile', + 'test requires %s bytes and a long time to run' % + str(self.int_max)) + try: + with open(TESTFN, "wb+") as f: + f.seek(self.int_max-4) + f.write("asdf") + f.flush() + try: + m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) + self.assertEqual(zlib.crc32(m), 0x709418e7) + self.assertEqual(zlib.adler32(m), -2072837729) + finally: + m.close() + except (IOError, OverflowError): + raise unittest.SkipTest("filesystem doesn't have largefile support") + finally: + unlink(TESTFN) + +

0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be 0x80000000. However, if you make this change, crc32() and adler32() raise OverflowErrors (see changeset a0681e7a6ded).

I don't want to check OverflowError: the test is supposed to compute the checksum of a buffer of 0x7FFFFFFF bytes, to check crc32() and adler32(). 0x7FFFFFFF is the biggest size supported by these functions (zlib doesn't use Py_ssize_t in Python 2.7). If you use a buffer of 0x80000000 bytes, you test PyArg_Parse*() functions, which have already a dedicated test (in test_xml_etree_c, it's not the best file to store such test...).

...

Also, the assignment to m needs to be moved outside of the inner try...finally block.

Yeah, I noticed this with buildbots: already fixed by dd58f8072216.

...

As an aside, in this sort of situation is it better to just go and commit a fix myself, or is raising it on the mailing list first the right way to do things?

I'm not sure that you understood the test, so I think that it's better to ask first on IRC and/or the mailing list. Victor

Nadeem Vawda

2:41 a.m.

New subject: [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by

On Tue, May 3, 2011 at 10:38 PM, Victor Stinner wrote:

...

I don't want to check OverflowError: the test is supposed to compute the checksum of a buffer of 0x7FFFFFFF bytes, to check crc32() and adler32(). 0x7FFFFFFF is the biggest size supported by these functions (zlib doesn't use Py_ssize_t in Python 2.7).

I see. Since you mentioned issue 10276 in the commit message, I assumed you were testing for the underlying C functions truncating their arguments. It seems that I was mistaken. Sorry for the confusion. Cheers, Nadeem

Ethan Furman

5 May 5 May

4:10 a.m.

New subject: [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by

Victor Stinner wrote:

...

Le mardi 03 mai 2011 à 16:22 +0200, Nadeem Vawda a écrit :

...
On Tue, May 3, 2011 at 3:19 PM, victor.stinner wrote:

...
+# Issue #10276 - check that inputs of 2 GB are handled correctly. +# Be aware of issues #1202, #8650, #8651 and #10276 +class ChecksumBigBufferTestCase(unittest.TestCase): + int_max = 0x7FFFFFFF + + @unittest.skipUnless(mmap, "mmap() is not available.") + def test_big_buffer(self): + if sys.platform[:3] == 'win' or sys.platform == 'darwin': + requires('largefile', + 'test requires %s bytes and a long time to run' % + str(self.int_max)) + try: + with open(TESTFN, "wb+") as f: + f.seek(self.int_max-4) + f.write("asdf") + f.flush() + try: + m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) + self.assertEqual(zlib.crc32(m), 0x709418e7) + self.assertEqual(zlib.adler32(m), -2072837729) + finally: + m.close() + except (IOError, OverflowError): + raise unittest.SkipTest("filesystem doesn't have largefile support") + finally: + unlink(TESTFN) + + 0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be 0x80000000. However, if you make this change, crc32() and adler32() raise OverflowErrors (see changeset a0681e7a6ded).

I don't want to check OverflowError: the test is supposed to compute the checksum of a buffer of 0x7FFFFFFF bytes

The comment says 'check that inputs of 2 GB are handled correctly' but the file created is 1 byte short of 2Gb. Is the test wrong, or just wrongly commented? Or am I not understanding? ~Ethan~

Victor Stinner

3:03 p.m.

New subject: [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by

Le mercredi 04 mai 2011 à 15:40 -0700, Ethan Furman a écrit :

...

Victor Stinner wrote:

...
Le mardi 03 mai 2011 à 16:22 +0200, Nadeem Vawda a écrit :

...
On Tue, May 3, 2011 at 3:19 PM, victor.stinner wrote:

...
+# Issue #10276 - check that inputs of 2 GB are handled correctly. +# Be aware of issues #1202, #8650, #8651 and #10276 +class ChecksumBigBufferTestCase(unittest.TestCase): + int_max = 0x7FFFFFFF + + @unittest.skipUnless(mmap, "mmap() is not available.") + def test_big_buffer(self): + if sys.platform[:3] == 'win' or sys.platform == 'darwin': + requires('largefile', + 'test requires %s bytes and a long time to run' % + str(self.int_max)) + try: + with open(TESTFN, "wb+") as f: + f.seek(self.int_max-4) + f.write("asdf") + f.flush() + try: + m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) + self.assertEqual(zlib.crc32(m), 0x709418e7) + self.assertEqual(zlib.adler32(m), -2072837729) + finally: + m.close() + except (IOError, OverflowError): + raise unittest.SkipTest("filesystem doesn't have largefile support") + finally: + unlink(TESTFN) + + 0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be 0x80000000. However, if you make this change, crc32() and adler32() raise OverflowErrors (see changeset a0681e7a6ded).

I don't want to check OverflowError: the test is supposed to compute the checksum of a buffer of 0x7FFFFFFF bytes

The comment says 'check that inputs of 2 GB are handled correctly' but the file created is 1 byte short of 2Gb. Is the test wrong, or just wrongly commented? Or am I not understanding?

If you write a byte after 2 GB of zeros, the file size is 2 GB+the few bytes. This trick is to create quickly a large file: some OSes support sparse files, zeros are not written on disk. But on Mac OS X and Windows, you really write 2 GB+some bytes. Victor

Nadeem Vawda

3:13 p.m.

New subject: [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by

On Thu, May 5, 2011 at 11:33 AM, Victor Stinner wrote:

...

Le mercredi 04 mai 2011 à 15:40 -0700, Ethan Furman a écrit :

...
The comment says 'check that inputs of 2 GB are handled correctly' but the file created is 1 byte short of 2Gb. Is the test wrong, or just wrongly commented? Or am I not understanding?

If you write a byte after 2 GB of zeros, the file size is 2 GB+the few bytes. This trick is to create quickly a large file: some OSes support sparse files, zeros are not written on disk. But on Mac OS X and Windows, you really write 2 GB+some bytes.

Ethan's point is that 0x7FFFFFFF is not 2GB - it is (2G-1) bytes. So the test and the preceding comment are inconsistent.

Paul Moore

3:23 p.m.

New subject: [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by

On 5 May 2011 10:33, Victor Stinner wrote:

...

If you write a byte after 2 GB of zeros, the file size is 2 GB+the few bytes. This trick is to create quickly a large file: some OSes support sparse files, zeros are not written on disk. But on Mac OS X and Windows, you really write 2 GB+some bytes.

FWIW, on Windows you can create sparse files, using DeviceIoControl(FILE_SET_SPARSE). It's probably too messy to be worth it for this case, though... Paul

Ethan Furman

5:37 p.m.

New subject: [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by

Victor Stinner wrote:

...

Le mercredi 04 mai 2011 à 15:40 -0700, Ethan Furman a écrit :

...
Victor Stinner wrote:

...
Le mardi 03 mai 2011 à 16:22 +0200, Nadeem Vawda a écrit :

...
On Tue, May 3, 2011 at 3:19 PM, victor.stinner wrote:

...
+ int_max = 0x7FFFFFFF

+ with open(TESTFN, "wb+") as f: + f.seek(self.int_max-4) + f.write("asdf") + f.flush()

0x7FFFFFFF is (2G-1) bytes. For a 2GB buffer, int_max should be 0x80000000. However, if you make this change, crc32() and adler32() raise OverflowErrors (see changeset a0681e7a6ded).

I don't want to check OverflowError: the test is supposed to compute the checksum of a buffer of 0x7FFFFFFF bytes

The comment says 'check that inputs of 2 GB are handled correctly' but the file created is 1 byte short of 2Gb. Is the test wrong, or just wrongly commented? Or am I not understanding?

If you write a byte after 2 GB of zeros, the file size is 2 GB+the few bytes. This trick is to create quickly a large file: some OSes support sparse files, zeros are not written on disk. But on Mac OS X and Windows, you really write 2 GB+some bytes.

True, but that's not what's happening -- four bytes are being written at int_max - 4, and int_max is one less that 2GB; hence the resulting file is one less than 2GB. ~Ethan~

Victor Stinner

5:57 p.m.

New subject: [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by

Le jeudi 05 mai 2011 à 05:07 -0700, Ethan Furman a écrit :

...

... hence the resulting file is one less than 2GB.

Yep, it's 0x7FFFFFFF because it's INT_MAX, the biggest value storable in an int. The zlib module stores the buffer size into an int in Python 2.7 (and Py_ssize_t in Python 3.3). Victor

Ethan Furman

8:47 p.m.

New subject: [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by

Victor Stinner wrote:

...

Le jeudi 05 mai 2011 à 05:07 -0700, Ethan Furman a écrit :

...
... hence the resulting file is one less than 2GB.

Yep, it's 0x7FFFFFFF because it's INT_MAX, the biggest value storable in an int. The zlib module stores the buffer size into an int in Python 2.7 (and Py_ssize_t in Python 3.3).

So we are agreed that the file is not, in fact, 2GB in size...

...

On Tue, May 3, 2011 at 3:19 PM, victor.stinner wrote:

...
+# Issue #10276 - check that inputs of 2 GB are handled correctly. +# Be aware of issues #1202, #8650, #8651 and #10276

So why do the comments say we are testing a 2GB input? ~Ethan~

4734

Age (days ago)

4736

Last active (days ago)

List overview

Download

10 comments

5 participants

participants (5)

Antoine Pitrou
Ethan Furman
Nadeem Vawda
Paul Moore
Victor Stinner

Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #10276: test_zlib checks that inputs of 2 GB are handled correctly by

tags

participants (5)