From python-checkins at python.org Fri Apr 1 00:46:54 2011 From: python-checkins at python.org (raymond.hettinger) Date: Fri, 01 Apr 2011 00:46:54 +0200 Subject: [Python-checkins] cpython (3.2): Issue #7796: Add link to Jan Kaliszewski's alternate constructor and ABC for Message-ID: http://hg.python.org/cpython/rev/7aa3f1f7ac94 changeset: 69093:7aa3f1f7ac94 branch: 3.2 parent: 69091:9797bfe8240f user: Raymond Hettinger date: Thu Mar 31 15:46:06 2011 -0700 summary: Issue #7796: Add link to Jan Kaliszewski's alternate constructor and ABC for named tuples. files: Doc/library/collections.rst | 11 +++++++++-- Misc/ACKS | 1 + 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/Doc/library/collections.rst b/Doc/library/collections.rst --- a/Doc/library/collections.rst +++ b/Doc/library/collections.rst @@ -775,8 +775,15 @@ .. seealso:: - `Named tuple recipe `_ - adapted for Python 2.4. + * `Named tuple recipe `_ + adapted for Python 2.4. + + * `Recipe for named tuple abstract base class with a metaclass mix-in + `_ + by Jan Kaliszewski. Besides providing an :term:`abstract base class` for + named tuples, it also supports an alternate :term:`metaclass`-based + constructor that is convenient for use cases where named tuples are being + subclassed. 
:class:`OrderedDict` objects diff --git a/Misc/ACKS b/Misc/ACKS --- a/Misc/ACKS +++ b/Misc/ACKS @@ -449,6 +449,7 @@ Peter van Kampen Rafe Kaplan Jacob Kaplan-Moss +Jan Kaliszewski Arkady Koplyarov Lou Kates Hiroaki Kawai -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 1 00:46:56 2011 From: python-checkins at python.org (raymond.hettinger) Date: Fri, 01 Apr 2011 00:46:56 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #7796: Add link to Jan Kaliszewski's alternate constructor and ABC for Message-ID: http://hg.python.org/cpython/rev/330d3482cad8 changeset: 69094:330d3482cad8 parent: 69092:3e191db416a6 parent: 69093:7aa3f1f7ac94 user: Raymond Hettinger date: Thu Mar 31 15:46:39 2011 -0700 summary: Issue #7796: Add link to Jan Kaliszewski's alternate constructor and ABC for named tuples. files: Doc/library/collections.rst | 11 +++++++++-- Misc/ACKS | 1 + 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/Doc/library/collections.rst b/Doc/library/collections.rst --- a/Doc/library/collections.rst +++ b/Doc/library/collections.rst @@ -857,8 +857,15 @@ .. seealso:: - `Named tuple recipe `_ - adapted for Python 2.4. + * `Named tuple recipe `_ + adapted for Python 2.4. + + * `Recipe for named tuple abstract base class with a metaclass mix-in + `_ + by Jan Kaliszewski. Besides providing an :term:`abstract base class` for + named tuples, it also supports an alternate :term:`metaclass`-based + constructor that is convenient for use cases where named tuples are being + subclassed. 
:class:`OrderedDict` objects diff --git a/Misc/ACKS b/Misc/ACKS --- a/Misc/ACKS +++ b/Misc/ACKS @@ -452,6 +452,7 @@ Peter van Kampen Rafe Kaplan Jacob Kaplan-Moss +Jan Kaliszewski Arkady Koplyarov Lou Kates Hiroaki Kawai -- Repository URL: http://hg.python.org/cpython From tjreedy at udel.edu Fri Apr 1 01:44:12 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 31 Mar 2011 19:44:12 -0400 Subject: [Python-checkins] cpython (3.2): Add links to make the math docs more usable. In-Reply-To: <4D94D2A8.3000405@gmail.com> References: <4D94D2A8.3000405@gmail.com> Message-ID: <4D9511CC.3010901@udel.edu> >> + Return the `Gamma function` at *x*. >> > > There's a space missing here, and the link doesn't work. It does for me. This may depend on the mail reader and whether it parses the url out in spite of the missing space. From python-checkins at python.org Fri Apr 1 02:31:31 2011 From: python-checkins at python.org (victor.stinner) Date: Fri, 01 Apr 2011 02:31:31 +0200 Subject: [Python-checkins] cpython: Issue #11393: Fix faulthandler_thread(): release cancel lock before join lock Message-ID: http://hg.python.org/cpython/rev/8b1341d51fe6 changeset: 69095:8b1341d51fe6 user: Victor Stinner date: Fri Apr 01 02:28:22 2011 +0200 summary: Issue #11393: Fix faulthandler_thread(): release cancel lock before join lock If the thread releases the join lock before the cancel lock, the thread may sometimes still be alive at cancel_dump_tracebacks_later() exit. So the cancel lock may be destroyed while the thread is still alive, whereas the thread will try to release the cancel lock, which just crash. Another minor fix: the thread doesn't release the cancel lock if it didn't acquire it. 
files: Modules/faulthandler.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -401,6 +401,7 @@ thread.timeout_ms, 0); if (st == PY_LOCK_ACQUIRED) { /* Cancelled by user */ + PyThread_release_lock(thread.cancel_event); break; } /* Timeout => dump traceback */ @@ -419,7 +420,6 @@ /* The only way out */ thread.running = 0; PyThread_release_lock(thread.join_event); - PyThread_release_lock(thread.cancel_event); } static void -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 1 03:00:21 2011 From: python-checkins at python.org (victor.stinner) Date: Fri, 01 Apr 2011 03:00:21 +0200 Subject: [Python-checkins] cpython: Issue #11393: New try to fix faulthandler_thread() Message-ID: http://hg.python.org/cpython/rev/0fb0fbd442b4 changeset: 69096:0fb0fbd442b4 user: Victor Stinner date: Fri Apr 01 03:00:05 2011 +0200 summary: Issue #11393: New try to fix faulthandler_thread() Always release the cancel join. Fix also another corner case: _PyFaulthandler_Fini() called after setting running variable to zero, but before releasing the join lock. 
files: Modules/faulthandler.c | 12 ++++++------ 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -401,7 +401,6 @@ thread.timeout_ms, 0); if (st == PY_LOCK_ACQUIRED) { /* Cancelled by user */ - PyThread_release_lock(thread.cancel_event); break; } /* Timeout => dump traceback */ @@ -418,8 +417,9 @@ } while (ok && thread.repeat); /* The only way out */ + PyThread_release_lock(thread.cancel_event); + PyThread_release_lock(thread.join_event); thread.running = 0; - PyThread_release_lock(thread.join_event); } static void @@ -428,11 +428,11 @@ if (thread.running) { /* Notify cancellation */ PyThread_release_lock(thread.cancel_event); - /* Wait for thread to join */ - PyThread_acquire_lock(thread.join_event, 1); - assert(thread.running == 0); - PyThread_release_lock(thread.join_event); } + /* Wait for thread to join */ + PyThread_acquire_lock(thread.join_event, 1); + assert(thread.running == 0); + PyThread_release_lock(thread.join_event); Py_CLEAR(thread.file); } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 1 03:17:36 2011 From: python-checkins at python.org (victor.stinner) Date: Fri, 01 Apr 2011 03:17:36 +0200 Subject: [Python-checkins] cpython: Issue #11393: fix usage of locks in faulthandler Message-ID: http://hg.python.org/cpython/rev/3558eecd84f0 changeset: 69097:3558eecd84f0 user: Victor Stinner date: Fri Apr 01 03:16:51 2011 +0200 summary: Issue #11393: fix usage of locks in faulthandler * faulthandler_cancel_dump_tracebacks_later() is responsible to set running to zero (so we don't need the volatile keyword anymore) * release locks if PyThread_start_new_thread() fails assert(thread.running == 0) was wrong in a corner case files: Modules/faulthandler.c | 7 ++++--- 1 files changed, 4 insertions(+), 3 deletions(-) diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c 
+++ b/Modules/faulthandler.c @@ -48,7 +48,7 @@ int fd; PY_TIMEOUT_T timeout_ms; /* timeout in microseconds */ int repeat; - volatile int running; + int running; PyInterpreterState *interp; int exit; /* released by parent thread when cancel request */ @@ -419,7 +419,6 @@ /* The only way out */ PyThread_release_lock(thread.cancel_event); PyThread_release_lock(thread.join_event); - thread.running = 0; } static void @@ -431,8 +430,8 @@ } /* Wait for thread to join */ PyThread_acquire_lock(thread.join_event, 1); - assert(thread.running == 0); PyThread_release_lock(thread.join_event); + thread.running = 0; Py_CLEAR(thread.file); } @@ -486,6 +485,8 @@ thread.running = 1; if (PyThread_start_new_thread(faulthandler_thread, NULL) == -1) { thread.running = 0; + PyThread_release_lock(thread.join_event); + PyThread_release_lock(thread.cancel_event); Py_CLEAR(thread.file); PyErr_SetString(PyExc_RuntimeError, "unable to start watchdog thread"); -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Fri Apr 1 04:55:16 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Fri, 01 Apr 2011 04:55:16 +0200 Subject: [Python-checkins] Daily reference leaks (3558eecd84f0): sum=0 Message-ID: results for 3558eecd84f0 on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogYX7SDN', '-x'] From python-checkins at python.org Fri Apr 1 09:20:17 2011 From: python-checkins at python.org (georg.brandl) Date: Fri, 01 Apr 2011 09:20:17 +0200 Subject: [Python-checkins] cpython: Fix markup. Message-ID: http://hg.python.org/cpython/rev/214d0608fb84 changeset: 69098:214d0608fb84 user: Georg Brandl date: Fri Apr 01 09:19:57 2011 +0200 summary: Fix markup. 
files: Doc/library/faulthandler.rst | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Doc/library/faulthandler.rst b/Doc/library/faulthandler.rst --- a/Doc/library/faulthandler.rst +++ b/Doc/library/faulthandler.rst @@ -69,8 +69,8 @@ Dump the tracebacks of all threads, after a timeout of *timeout* seconds, or each *timeout* seconds if *repeat* is ``True``. If *exit* is True, call - :cfunc:`_exit` with status=1 after dumping the tracebacks to terminate - immediatly the process, which is not safe. For example, :cfunc:`_exit` + :c:func:`_exit` with status=1 after dumping the tracebacks to terminate + immediatly the process, which is not safe. For example, :c:func:`_exit` doesn't flush file buffers. If the function is called twice, the new call replaces previous parameters (resets the timeout). The timer has a sub-second resolution. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 1 12:14:28 2011 From: python-checkins at python.org (victor.stinner) Date: Fri, 01 Apr 2011 12:14:28 +0200 Subject: [Python-checkins] cpython: Issue #11393: fault handler uses raise(signum) for SIGILL on Windows Message-ID: http://hg.python.org/cpython/rev/e51d8a160a8a changeset: 69099:e51d8a160a8a user: Victor Stinner date: Fri Apr 01 12:08:57 2011 +0200 summary: Issue #11393: fault handler uses raise(signum) for SIGILL on Windows files: Modules/faulthandler.c | 27 ++++++++++++--------------- 1 files changed, 12 insertions(+), 15 deletions(-) diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -270,14 +270,16 @@ else _Py_DumpTraceback(fd, tstate); -#ifndef MS_WINDOWS - /* call the previous signal handler: it is called if we use sigaction() - thanks to SA_NODEFER flag, otherwise it is deferred */ +#ifdef MS_WINDOWS + if (signum == SIGSEGV) { + /* don't call explictly the previous handler for SIGSEGV in this signal + handler, because the Windows signal handler 
would not be called */ + return; + } +#endif + /* call the previous signal handler: it is called immediatly if we use + sigaction() thanks to SA_NODEFER flag, otherwise it is deferred */ raise(signum); -#else - /* on Windows, don't call explictly the previous handler, because Windows - signal handler would not be called */ -#endif } /* Install handler for fatal signals (SIGSEGV, SIGFPE, ...). */ @@ -681,8 +683,9 @@ faulthandler_sigsegv(PyObject *self, PyObject *args) { #if defined(MS_WINDOWS) - /* faulthandler_fatal_error() restores the previous signal handler and then - gives back the execution flow to the program. In a normal case, the + /* For SIGSEGV, faulthandler_fatal_error() restores the previous signal + handler and then gives back the execution flow to the program (without + calling explicitly the previous error handler). In a normal case, the SIGSEGV was raised by the kernel because of a fault, and so if the program retries to execute the same instruction, the fault will be raised again. 
@@ -724,13 +727,7 @@ static PyObject * faulthandler_sigill(PyObject *self, PyObject *args) { -#if defined(MS_WINDOWS) - /* see faulthandler_sigsegv() for the explanation about while(1) */ - while(1) - raise(SIGILL); -#else raise(SIGILL); -#endif Py_RETURN_NONE; } #endif -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 1 12:14:29 2011 From: python-checkins at python.org (victor.stinner) Date: Fri, 01 Apr 2011 12:14:29 +0200 Subject: [Python-checkins] cpython: Issue #11393: The fault handler handles also SIGABRT Message-ID: http://hg.python.org/cpython/rev/a4fa79b0d478 changeset: 69100:a4fa79b0d478 user: Victor Stinner date: Fri Apr 01 12:13:55 2011 +0200 summary: Issue #11393: The fault handler handles also SIGABRT files: Doc/library/faulthandler.rst | 14 ++++---- Doc/using/cmdline.rst | 5 ++- Lib/test/test_faulthandler.py | 9 ++++++ Modules/faulthandler.c | 33 +++++++++++++++++----- Python/pythonrun.c | 1 + 5 files changed, 45 insertions(+), 17 deletions(-) diff --git a/Doc/library/faulthandler.rst b/Doc/library/faulthandler.rst --- a/Doc/library/faulthandler.rst +++ b/Doc/library/faulthandler.rst @@ -6,10 +6,10 @@ This module contains functions to dump the Python traceback explicitly, on a fault, after a timeout or on a user signal. Call :func:`faulthandler.enable` to -install fault handlers for :const:`SIGSEGV`, :const:`SIGFPE`, :const:`SIGBUS` -and :const:`SIGILL` signals. You can also enable them at startup by setting the -:envvar:`PYTHONFAULTHANDLER` environment variable or by using :option:`-X` -``faulthandler`` command line option. +install fault handlers for :const:`SIGSEGV`, :const:`SIGFPE`, :const:`SIGABRT`, +:const:`SIGBUS` and :const:`SIGILL` signals. You can also enable them at +startup by setting the :envvar:`PYTHONFAULTHANDLER` environment variable or by +using :option:`-X` ``faulthandler`` command line option. 
The fault handler is compatible with system fault handlers like Apport or the Windows fault handler. The module uses an alternative stack for signal @@ -48,9 +48,9 @@ .. function:: enable(file=sys.stderr, all_threads=False) Enable the fault handler: install handlers for :const:`SIGSEGV`, - :const:`SIGFPE`, :const:`SIGBUS` and :const:`SIGILL` signals to dump the - Python traceback. It dumps the traceback of the current thread, or all - threads if *all_threads* is ``True``, into *file*. + :const:`SIGFPE`, :const:`SIGABRT`, :const:`SIGBUS` and :const:`SIGILL` + signals to dump the Python traceback. It dumps the traceback of the current + thread, or all threads if *all_threads* is ``True``, into *file*. .. function:: disable() diff --git a/Doc/using/cmdline.rst b/Doc/using/cmdline.rst --- a/Doc/using/cmdline.rst +++ b/Doc/using/cmdline.rst @@ -502,8 +502,9 @@ If this environment variable is set, :func:`faulthandler.enable` is called at startup: install a handler for :const:`SIGSEGV`, :const:`SIGFPE`, - :const:`SIGBUS` and :const:`SIGILL` signals to dump the Python traceback. - This is equivalent to :option:`-X` ``faulthandler`` option. + :const:`SIGABRT`, :const:`SIGBUS` and :const:`SIGILL` signals to dump the + Python traceback. This is equivalent to :option:`-X` ``faulthandler`` + option. 
Debug-mode variables diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py --- a/Lib/test/test_faulthandler.py +++ b/Lib/test/test_faulthandler.py @@ -112,6 +112,15 @@ 3, 'Segmentation fault') + def test_sigabrt(self): + self.check_fatal_error(""" +import faulthandler +faulthandler.enable() +faulthandler._sigabrt() +""".strip(), + 3, + 'Aborted') + @unittest.skipIf(sys.platform == 'win32', "SIGFPE cannot be caught on Windows") def test_sigfpe(self): diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -10,9 +10,9 @@ #endif #ifndef MS_WINDOWS - /* register() is useless on Windows, because only SIGSEGV and SIGILL can be - handled by the process, and these signals can only be used with enable(), - not using register() */ + /* register() is useless on Windows, because only SIGSEGV, SIGABRT and + SIGILL can be handled by the process, and these signals can only be used + with enable(), not using register() */ # define FAULTHANDLER_USER #endif @@ -96,6 +96,7 @@ {SIGILL, 0, "Illegal instruction", }, #endif {SIGFPE, 0, "Floating point exception", }, + {SIGABRT, 0, "Aborted", }, /* define SIGSEGV at the end to make it the default choice if searching the handler fails in faulthandler_fatal_error() */ {SIGSEGV, 0, "Segmentation fault", } @@ -202,7 +203,7 @@ } -/* Handler of SIGSEGV, SIGFPE, SIGBUS and SIGILL signals. +/* Handler of SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL signals. Display the current Python traceback, restore the previous handler and call the previous handler. @@ -253,9 +254,9 @@ PUTS(fd, handler->name); PUTS(fd, "\n\n"); - /* SIGSEGV, SIGFPE, SIGBUS and SIGILL are synchronous signals and so are - delivered to the thread that caused the fault. Get the Python thread - state of the current thread. + /* SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL are synchronous signals and + so are delivered to the thread that caused the fault. 
Get the Python + thread state of the current thread. PyThreadState_Get() doesn't give the state of the thread that caused the fault if the thread released the GIL, and so this function cannot be @@ -282,7 +283,7 @@ raise(signum); } -/* Install handler for fatal signals (SIGSEGV, SIGFPE, ...). */ +/* Install the handler for fatal signals, faulthandler_fatal_error(). */ static PyObject* faulthandler_enable(PyObject *self, PyObject *args, PyObject *kwargs) @@ -714,6 +715,20 @@ Py_RETURN_NONE; } +static PyObject * +faulthandler_sigabrt(PyObject *self, PyObject *args) +{ +#if _MSC_VER + /* If Python is compiled in debug mode with Visual Studio, abort() opens + a popup asking the user how to handle the assertion. Use raise(SIGABRT) + instead. */ + raise(SIGABRT); +#else + abort(); +#endif + Py_RETURN_NONE; +} + #ifdef SIGBUS static PyObject * faulthandler_sigbus(PyObject *self, PyObject *args) @@ -847,6 +862,8 @@ "a SIGSEGV or SIGBUS signal depending on the platform")}, {"_sigsegv", faulthandler_sigsegv, METH_VARARGS, PyDoc_STR("_sigsegv(): raise a SIGSEGV signal")}, + {"_sigabrt", faulthandler_sigabrt, METH_VARARGS, + PyDoc_STR("_sigabrt(): raise a SIGABRT signal")}, {"_sigfpe", (PyCFunction)faulthandler_sigfpe, METH_NOARGS, PyDoc_STR("_sigfpe(): raise a SIGFPE signal")}, #ifdef SIGBUS diff --git a/Python/pythonrun.c b/Python/pythonrun.c --- a/Python/pythonrun.c +++ b/Python/pythonrun.c @@ -2124,6 +2124,7 @@ fflush(stderr); _Py_DumpTraceback(fd, tstate); } + _PyFaulthandler_Fini(); } #ifdef MS_WINDOWS -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 1 13:10:41 2011 From: python-checkins at python.org (victor.stinner) Date: Fri, 01 Apr 2011 13:10:41 +0200 Subject: [Python-checkins] cpython: Issue #11393: Fix faulthandler.disable() and add a test Message-ID: http://hg.python.org/cpython/rev/a27755b10448 changeset: 69101:a27755b10448 user: Victor Stinner date: Fri Apr 01 12:56:17 2011 +0200 summary: Issue #11393: Fix 
faulthandler.disable() and add a test files: Lib/test/test_faulthandler.py | 32 +++++++++++++++++----- Modules/faulthandler.c | 8 ++-- 2 files changed, 28 insertions(+), 12 deletions(-) diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py --- a/Lib/test/test_faulthandler.py +++ b/Lib/test/test_faulthandler.py @@ -431,13 +431,15 @@ @unittest.skipIf(not hasattr(faulthandler, "register"), "need faulthandler.register") - def check_register(self, filename=False, all_threads=False): + def check_register(self, filename=False, all_threads=False, + unregister=False): """ Register a handler displaying the traceback on a user signal. Raise the signal and check the written traceback. Raise an error if the output doesn't match the expected format. """ + signum = signal.SIGUSR1 code = """ import faulthandler import os @@ -446,12 +448,15 @@ def func(signum): os.kill(os.getpid(), signum) -signum = signal.SIGUSR1 +signum = {signum} +unregister = {unregister} if {has_filename}: file = open({filename}, "wb") else: file = None faulthandler.register(signum, file=file, all_threads={all_threads}) +if unregister: + faulthandler.unregister(signum) func(signum) if file is not None: file.close() @@ -460,20 +465,31 @@ filename=repr(filename), has_filename=bool(filename), all_threads=all_threads, + signum=signum, + unregister=unregister, ) trace, exitcode = self.get_output(code, filename) trace = '\n'.join(trace) - if all_threads: - regex = 'Current thread XXX:\n' + if not unregister: + if all_threads: + regex = 'Current thread XXX:\n' + else: + regex = 'Traceback $most recent call first$:\n' + regex = expected_traceback(6, 17, regex) + self.assertRegex(trace, regex) else: - regex = 'Traceback $most recent call first$:\n' - regex = expected_traceback(6, 14, regex) - self.assertRegex(trace, regex) - self.assertEqual(exitcode, 0) + self.assertEqual(trace, '') + if unregister: + self.assertNotEqual(exitcode, 0) + else: + self.assertEqual(exitcode, 0) def test_register(self): 
self.check_register() + def test_unregister(self): + self.check_register(unregister=True) + def test_register_file(self): with temporary_filename() as filename: self.check_register(filename=filename) diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -628,7 +628,7 @@ static int faulthandler_unregister(user_signal_t *user, int signum) { - if (user->enabled) + if (!user->enabled) return 0; user->enabled = 0; #ifdef HAVE_SIGACTION @@ -976,7 +976,7 @@ void _PyFaulthandler_Fini(void) { #ifdef FAULTHANDLER_USER - unsigned int i; + unsigned int signum; #endif #ifdef FAULTHANDLER_LATER @@ -995,8 +995,8 @@ #ifdef FAULTHANDLER_USER /* user */ if (user_signals != NULL) { - for (i=0; i < NSIG; i++) - faulthandler_unregister(&user_signals[i], i+1); + for (signum=0; signum < NSIG; signum++) + faulthandler_unregister(&user_signals[signum], signum); free(user_signals); user_signals = NULL; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 1 15:39:59 2011 From: python-checkins at python.org (victor.stinner) Date: Fri, 01 Apr 2011 15:39:59 +0200 Subject: [Python-checkins] cpython: Issue #11393: _Py_DumpTraceback() writes the header even if there is no frame Message-ID: http://hg.python.org/cpython/rev/7e3ed426962f changeset: 69102:7e3ed426962f user: Victor Stinner date: Fri Apr 01 15:34:01 2011 +0200 summary: Issue #11393: _Py_DumpTraceback() writes the header even if there is no frame files: Include/traceback.h | 4 +--- Python/traceback.c | 14 +++++++------- 2 files changed, 8 insertions(+), 10 deletions(-) diff --git a/Include/traceback.h b/Include/traceback.h --- a/Include/traceback.h +++ b/Include/traceback.h @@ -38,8 +38,6 @@ ... File "xxx", line xxx in - Return 0 on success, -1 on error. - This function is written for debug purpose only, to dump the traceback in the worst case: after a segmentation fault, at fatal error, etc. That's why, it is very limited. 
Strings are truncated to 100 characters and encoded to @@ -49,7 +47,7 @@ This function is signal safe. */ -PyAPI_DATA(int) _Py_DumpTraceback( +PyAPI_DATA(void) _Py_DumpTraceback( int fd, PyThreadState *tstate); diff --git a/Python/traceback.c b/Python/traceback.c --- a/Python/traceback.c +++ b/Python/traceback.c @@ -556,18 +556,19 @@ write(fd, "\n", 1); } -static int +static void dump_traceback(int fd, PyThreadState *tstate, int write_header) { PyFrameObject *frame; unsigned int depth; + if (write_header) + PUTS(fd, "Traceback (most recent call first):\n"); + frame = _PyThreadState_GetFrame(tstate); if (frame == NULL) - return -1; + return; - if (write_header) - PUTS(fd, "Traceback (most recent call first):\n"); depth = 0; while (frame != NULL) { if (MAX_FRAME_DEPTH <= depth) { @@ -580,13 +581,12 @@ frame = frame->f_back; depth++; } - return 0; } -int +void _Py_DumpTraceback(int fd, PyThreadState *tstate) { - return dump_traceback(fd, tstate, 1); + dump_traceback(fd, tstate, 1); } /* Write the thread identifier into the file 'fd': "Current thread 0xHHHH:\" if -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 1 15:40:03 2011 From: python-checkins at python.org (victor.stinner) Date: Fri, 01 Apr 2011 15:40:03 +0200 Subject: [Python-checkins] cpython: Issue #11393: signal of user signal displays tracebacks even if tstate==NULL Message-ID: http://hg.python.org/cpython/rev/e609105dff64 changeset: 69103:e609105dff64 user: Victor Stinner date: Fri Apr 01 15:37:12 2011 +0200 summary: Issue #11393: signal of user signal displays tracebacks even if tstate==NULL * faulthandler_user() displays the tracebacks of all threads even if it is unable to get the state of the current thread * test_faulthandler: only release the GIL in test_gil_released() check * create check_signum() subfunction files: Lib/test/test_faulthandler.py | 9 ++- Modules/faulthandler.c | 58 ++++++++++++++-------- 2 files changed, 43 insertions(+), 24 deletions(-) diff 
--git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py --- a/Lib/test/test_faulthandler.py +++ b/Lib/test/test_faulthandler.py @@ -8,6 +8,8 @@ import tempfile import unittest +TIMEOUT = 0.5 + try: from resource import setrlimit, RLIMIT_CORE, error as resource_error except ImportError: @@ -189,7 +191,7 @@ import faulthandler output = open({filename}, 'wb') faulthandler.enable(output) -faulthandler._read_null(True) +faulthandler._read_null() """.strip().format(filename=repr(filename)), 4, '(?:Segmentation fault|Bus error)', @@ -199,7 +201,7 @@ self.check_fatal_error(""" import faulthandler faulthandler.enable(all_threads=True) -faulthandler._read_null(True) +faulthandler._read_null() """.strip(), 3, '(?:Segmentation fault|Bus error)', @@ -376,7 +378,7 @@ # Check that sleep() was not interrupted assert (b - a) >= min_pause, "{{}} < {{}}".format(b - a, min_pause) -timeout = 0.5 +timeout = {timeout} repeat = {repeat} cancel = {cancel} if {has_filename}: @@ -394,6 +396,7 @@ has_filename=bool(filename), repeat=repeat, cancel=cancel, + timeout=TIMEOUT, ) trace, exitcode = self.get_output(code, filename) trace = '\n'.join(trace) diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -65,6 +65,7 @@ int fd; int all_threads; _Py_sighandler_t previous; + PyInterpreterState *interp; } user_signal_t; static user_signal_t *user_signals; @@ -529,15 +530,35 @@ the thread doesn't hold the GIL. Read the thread local storage (TLS) instead: call PyGILState_GetThisThreadState(). 
*/ tstate = PyGILState_GetThisThreadState(); - if (tstate == NULL) { - /* unable to get the current thread, do nothing */ - return; - } if (user->all_threads) - _Py_DumpTracebackThreads(user->fd, tstate->interp, tstate); - else + _Py_DumpTracebackThreads(user->fd, user->interp, tstate); + else { + if (tstate == NULL) + return; _Py_DumpTraceback(user->fd, tstate); + } +} + +static int +check_signum(int signum) +{ + unsigned int i; + + for (i=0; i < faulthandler_nsignals; i++) { + if (faulthandler_handlers[i].signum == signum) { + PyErr_Format(PyExc_RuntimeError, + "signal %i cannot be registered, " + "use enable() instead", + signum); + return 0; + } + } + if (signum < 1 || NSIG <= signum) { + PyErr_SetString(PyExc_ValueError, "signal number out of range"); + return 0; + } + return 1; } static PyObject* @@ -549,12 +570,12 @@ PyObject *file = NULL; int all_threads = 0; int fd; - unsigned int i; user_signal_t *user; _Py_sighandler_t previous; #ifdef HAVE_SIGACTION struct sigaction action; #endif + PyThreadState *tstate; int err; if (!PyArg_ParseTupleAndKeywords(args, kwargs, @@ -562,19 +583,15 @@ &signum, &file, &all_threads)) return NULL; - if (signum < 1 || NSIG <= signum) { - PyErr_SetString(PyExc_ValueError, "signal number out of range"); + if (!check_signum(signum)) return NULL; - } - for (i=0; i < faulthandler_nsignals; i++) { - if (faulthandler_handlers[i].signum == signum) { - PyErr_Format(PyExc_RuntimeError, - "signal %i cannot be registered by register(), " - "use enable() instead", - signum); - return NULL; - } + /* The caller holds the GIL and so PyThreadState_Get() can be used */ + tstate = PyThreadState_Get(); + if (tstate == NULL) { + PyErr_SetString(PyExc_RuntimeError, + "unable to get the current thread state"); + return NULL; } file = faulthandler_get_fileno(file, &fd); @@ -620,6 +637,7 @@ user->fd = fd; user->all_threads = all_threads; user->previous = previous; + user->interp = tstate->interp; user->enabled = 1; Py_RETURN_NONE; @@ -651,10 +669,8 @@ 
if (!PyArg_ParseTuple(args, "i:unregister", &signum)) return NULL; - if (signum < 1 || NSIG <= signum) { - PyErr_SetString(PyExc_ValueError, "signal number out of range"); + if (!check_signum(signum)) return NULL; - } user = &user_signals[signum]; change = faulthandler_unregister(user, signum); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 1 16:00:11 2011 From: python-checkins at python.org (victor.stinner) Date: Fri, 01 Apr 2011 16:00:11 +0200 Subject: [Python-checkins] cpython: Issue #11727: set regrtest default timeout to 15 minutes Message-ID: http://hg.python.org/cpython/rev/15f6fe139181 changeset: 69104:15f6fe139181 user: Victor Stinner date: Fri Apr 01 15:59:59 2011 +0200 summary: Issue #11727: set regrtest default timeout to 15 minutes files: Lib/test/regrtest.py | 5 +++-- Misc/NEWS | 4 +++- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -22,7 +22,8 @@ -h/--help -- print this text and exit --timeout TIMEOUT -- dump the traceback and exit if a test takes more - than TIMEOUT seconds + than TIMEOUT seconds (default: 15 minutes); disable + the timeout if TIMEOUT is zero Verbosity @@ -239,7 +240,7 @@ findleaks=False, use_resources=None, trace=False, coverdir='coverage', runleaks=False, huntrleaks=False, verbose2=False, print_slow=False, random_seed=None, use_mp=None, verbose3=False, forever=False, - header=False, timeout=None): + header=False, timeout=15*60): """Execute a test suite. This also parses command-line options and modifies its behavior diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -361,7 +361,9 @@ Tests ----- -- Issue #11727: add --timeout option to regrtest (disabled by default). +- Issue #11727: If a test takes more than 15 minutes, regrtest dumps the + traceback of all threads and exits. Use --timeout option to change the + default timeout or to disable it. 
- Issue #11653: fix -W with -j in regrtest. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 1 18:17:16 2011 From: python-checkins at python.org (r.david.murray) Date: Fri, 01 Apr 2011 18:17:16 +0200 Subject: [Python-checkins] devguide: Top level sections only in sidebar TOC on FAQ page. Message-ID: http://hg.python.org/devguide/rev/1dc036ca6c94 changeset: 409:1dc036ca6c94 user: R David Murray date: Fri Apr 01 12:16:53 2011 -0400 summary: Top level sections only in sidebar TOC on FAQ page. files: faq.rst | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/faq.rst b/faq.rst --- a/faq.rst +++ b/faq.rst @@ -1,3 +1,5 @@ +:tocdepth: 2 + .. _faq: Python Developer FAQ -- Repository URL: http://hg.python.org/devguide From python-checkins at python.org Fri Apr 1 18:40:20 2011 From: python-checkins at python.org (victor.stinner) Date: Fri, 01 Apr 2011 18:40:20 +0200 Subject: [Python-checkins] cpython: Issue #11727: set regrtest default timeout to 30 minutes Message-ID: http://hg.python.org/cpython/rev/053bc5ca199b changeset: 69105:053bc5ca199b user: Victor Stinner date: Fri Apr 01 18:16:36 2011 +0200 summary: Issue #11727: set regrtest default timeout to 30 minutes files: Lib/test/regrtest.py | 4 ++-- Misc/NEWS | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -22,7 +22,7 @@ -h/--help -- print this text and exit --timeout TIMEOUT -- dump the traceback and exit if a test takes more - than TIMEOUT seconds (default: 15 minutes); disable + than TIMEOUT seconds (default: 30 minutes); disable the timeout if TIMEOUT is zero Verbosity @@ -240,7 +240,7 @@ findleaks=False, use_resources=None, trace=False, coverdir='coverage', runleaks=False, huntrleaks=False, verbose2=False, print_slow=False, random_seed=None, use_mp=None, verbose3=False, forever=False, - header=False, timeout=15*60): + header=False, 
timeout=30*60): """Execute a test suite. This also parses command-line options and modifies its behavior diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -361,7 +361,7 @@ Tests ----- -- Issue #11727: If a test takes more than 15 minutes, regrtest dumps the +- Issue #11727: If a test takes more than 30 minutes, regrtest dumps the traceback of all threads and exits. Use --timeout option to change the default timeout or to disable it. -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Sat Apr 2 04:57:59 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Sat, 02 Apr 2011 04:57:59 +0200 Subject: [Python-checkins] Daily reference leaks (053bc5ca199b): sum=0 Message-ID: results for 053bc5ca199b on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflog21pzX3', '-x'] From solipsis at pitrou.net Sun Apr 3 04:55:31 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Sun, 03 Apr 2011 04:55:31 +0200 Subject: [Python-checkins] Daily reference leaks (053bc5ca199b): sum=-56 Message-ID: results for 053bc5ca199b on branch "default" -------------------------------------------- test_pyexpat leaked [0, 0, -56] references, sum=-56 Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogPtWxL8', '-x'] From python-checkins at python.org Sun Apr 3 15:26:49 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 03 Apr 2011 15:26:49 +0200 Subject: [Python-checkins] cpython (3.1): Fix typo noticed by Sandro Tosi. Message-ID: http://hg.python.org/cpython/rev/821244a44163 changeset: 69106:821244a44163 branch: 3.1 parent: 69066:8e074d9b1587 user: Ezio Melotti date: Sun Apr 03 16:20:21 2011 +0300 summary: Fix typo noticed by Sandro Tosi. 
files: Doc/library/profile.rst | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Doc/library/profile.rst b/Doc/library/profile.rst --- a/Doc/library/profile.rst +++ b/Doc/library/profile.rst @@ -48,7 +48,7 @@ The profiler modules are designed to provide an execution profile for a given program, not for benchmarking purposes (for that, there is :mod:`timeit` for - resonably accurate results). This particularly applies to benchmarking + reasonably accurate results). This particularly applies to benchmarking Python code against C code: the profilers introduce overhead for Python code, but not for C-level functions, and so the C code would seem faster than any Python one. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 15:26:49 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 03 Apr 2011 15:26:49 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Merge with 3.1 Message-ID: http://hg.python.org/cpython/rev/5fd1ac1c9474 changeset: 69107:5fd1ac1c9474 branch: 3.2 parent: 69093:7aa3f1f7ac94 parent: 69106:821244a44163 user: Ezio Melotti date: Sun Apr 03 16:24:22 2011 +0300 summary: Merge with 3.1 files: Doc/library/profile.rst | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Doc/library/profile.rst b/Doc/library/profile.rst --- a/Doc/library/profile.rst +++ b/Doc/library/profile.rst @@ -50,7 +50,7 @@ The profiler modules are designed to provide an execution profile for a given program, not for benchmarking purposes (for that, there is :mod:`timeit` for - resonably accurate results). This particularly applies to benchmarking + reasonably accurate results). This particularly applies to benchmarking Python code against C code: the profilers introduce overhead for Python code, but not for C-level functions, and so the C code would seem faster than any Python one. 
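The paragraph corrected in this hunk separates profiling from benchmarking. A minimal sketch of that distinction follows, assuming a current Python 3. The helper names `python_sum` and `builtin_sum` are made up for illustration and are not part of the commit. `timeit` measures each call with no profiler attached, while `cProfile` reports where time goes and adds its own overhead to Python-level code.

```python
import cProfile
import io
import pstats
import timeit


def python_sum(n):
    """Pure-Python loop (hypothetical helper, not from the commit)."""
    total = 0
    for i in range(n):
        total += i
    return total


def builtin_sum(n):
    """Same result via the C-implemented built-in sum()."""
    return sum(range(n))


# Benchmarking: timeit runs each callable repeatedly without a profiler.
py_time = timeit.timeit(lambda: python_sum(10000), number=200)
c_time = timeit.timeit(lambda: builtin_sum(10000), number=200)
print("timeit  python_sum: %.4fs  builtin_sum: %.4fs" % (py_time, c_time))

# Profiling: cProfile shows where time is spent, but its per-call overhead
# is included in the figures, so they are not a fair speed comparison.
profiler = cProfile.Profile()
profiler.enable()
python_sum(10000)
builtin_sum(10000)
profiler.disable()

report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The absolute timings depend on the machine and interpreter, so none are quoted here. Only the `timeit` figures are meant for comparing the two functions.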
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 15:26:50 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 03 Apr 2011 15:26:50 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge with 3.2 Message-ID: http://hg.python.org/cpython/rev/ca5932a51a9b changeset: 69108:ca5932a51a9b parent: 69105:053bc5ca199b parent: 69107:5fd1ac1c9474 user: Ezio Melotti date: Sun Apr 03 16:25:49 2011 +0300 summary: Merge with 3.2 files: Doc/library/profile.rst | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Doc/library/profile.rst b/Doc/library/profile.rst --- a/Doc/library/profile.rst +++ b/Doc/library/profile.rst @@ -50,7 +50,7 @@ The profiler modules are designed to provide an execution profile for a given program, not for benchmarking purposes (for that, there is :mod:`timeit` for - resonably accurate results). This particularly applies to benchmarking + reasonably accurate results). This particularly applies to benchmarking Python code against C code: the profilers introduce overhead for Python code, but not for C-level functions, and so the C code would seem faster than any Python one. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 17:03:32 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 03 Apr 2011 17:03:32 +0200 Subject: [Python-checkins] cpython (3.2): #11282: the fail* methods will stay around a few more versions. Message-ID: http://hg.python.org/cpython/rev/1fd736395df3 changeset: 69109:1fd736395df3 branch: 3.2 parent: 69107:5fd1ac1c9474 user: Ezio Melotti date: Sun Apr 03 17:37:58 2011 +0300 summary: #11282: the fail* methods will stay around a few more versions. 
files: Doc/library/unittest.rst | 2 +- Lib/unittest/case.py | 3 +-- Lib/unittest/test/test_case.py | 6 ++---- 3 files changed, 4 insertions(+), 7 deletions(-) diff --git a/Doc/library/unittest.rst b/Doc/library/unittest.rst --- a/Doc/library/unittest.rst +++ b/Doc/library/unittest.rst @@ -1459,7 +1459,7 @@ :meth:`.assertRaisesRegex` assertRaisesRegexp ============================== ====================== ====================== - .. deprecated-removed:: 3.1 3.3 + .. deprecated:: 3.1 the fail* aliases listed in the second column. .. deprecated:: 3.2 the assert* aliases listed in the third column. diff --git a/Lib/unittest/case.py b/Lib/unittest/case.py --- a/Lib/unittest/case.py +++ b/Lib/unittest/case.py @@ -1181,8 +1181,7 @@ return original_func(*args, **kwargs) return deprecated_func - # The fail* methods can be removed in 3.3, the 5 assert* methods will - # have to stay around for a few more versions. See #9424. + # see #9424 failUnlessEqual = assertEquals = _deprecate(assertEqual) failIfEqual = assertNotEquals = _deprecate(assertNotEqual) failUnlessAlmostEqual = assertAlmostEquals = _deprecate(assertAlmostEqual) diff --git a/Lib/unittest/test/test_case.py b/Lib/unittest/test/test_case.py --- a/Lib/unittest/test/test_case.py +++ b/Lib/unittest/test/test_case.py @@ -1088,10 +1088,8 @@ _runtime_warn("barz") def testDeprecatedMethodNames(self): - """Test that the deprecated methods raise a DeprecationWarning. - - The fail* methods will be removed in 3.3. The assert* methods will - have to stay around for a few more versions. See #9424. + """ + Test that the deprecated methods raise a DeprecationWarning. See #9424. """ old = ( (self.failIfEqual, (3, 5)), -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 17:03:33 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 03 Apr 2011 17:03:33 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): #11282: merge with 3.2. 
Message-ID: http://hg.python.org/cpython/rev/110bb604bc2f changeset: 69110:110bb604bc2f parent: 69108:ca5932a51a9b parent: 69109:1fd736395df3 user: Ezio Melotti date: Sun Apr 03 17:39:19 2011 +0300 summary: #11282: merge with 3.2. files: Doc/library/unittest.rst | 2 +- Lib/unittest/case.py | 3 +-- Lib/unittest/test/test_case.py | 6 ++---- 3 files changed, 4 insertions(+), 7 deletions(-) diff --git a/Doc/library/unittest.rst b/Doc/library/unittest.rst --- a/Doc/library/unittest.rst +++ b/Doc/library/unittest.rst @@ -1462,7 +1462,7 @@ :meth:`.assertRaisesRegex` assertRaisesRegexp ============================== ====================== ====================== - .. deprecated-removed:: 3.1 3.3 + .. deprecated:: 3.1 the fail* aliases listed in the second column. .. deprecated:: 3.2 the assert* aliases listed in the third column. diff --git a/Lib/unittest/case.py b/Lib/unittest/case.py --- a/Lib/unittest/case.py +++ b/Lib/unittest/case.py @@ -1110,8 +1110,7 @@ return original_func(*args, **kwargs) return deprecated_func - # The fail* methods can be removed in 3.3, the 5 assert* methods will - # have to stay around for a few more versions. See #9424. + # see #9424 assertEquals = _deprecate(assertEqual) assertNotEquals = _deprecate(assertNotEqual) assertAlmostEquals = _deprecate(assertAlmostEqual) diff --git a/Lib/unittest/test/test_case.py b/Lib/unittest/test/test_case.py --- a/Lib/unittest/test/test_case.py +++ b/Lib/unittest/test/test_case.py @@ -1093,10 +1093,8 @@ _runtime_warn("barz") def testDeprecatedMethodNames(self): - """Test that the deprecated methods raise a DeprecationWarning. - - The fail* methods will be removed in 3.3. The assert* methods will - have to stay around for a few more versions. See #9424. + """ + Test that the deprecated methods raise a DeprecationWarning. See #9424. 
""" old = ( (self.assertNotEquals, (3, 5)), -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 17:03:36 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 03 Apr 2011 17:03:36 +0200 Subject: [Python-checkins] cpython: #11282: add back the fail* methods and assertDictContainsSubset. Message-ID: http://hg.python.org/cpython/rev/aa658836e090 changeset: 69111:aa658836e090 user: Ezio Melotti date: Sun Apr 03 18:02:13 2011 +0300 summary: #11282: add back the fail* methods and assertDictContainsSubset. files: Lib/unittest/case.py | 41 ++++++++++- Lib/unittest/test/_test_warnings.py | 7 +- Lib/unittest/test/test_assertions.py | 9 ++ Lib/unittest/test/test_case.py | 53 ++++++++++++++++ Lib/unittest/test/test_runner.py | 10 ++- Misc/NEWS | 2 + 6 files changed, 113 insertions(+), 9 deletions(-) diff --git a/Lib/unittest/case.py b/Lib/unittest/case.py --- a/Lib/unittest/case.py +++ b/Lib/unittest/case.py @@ -938,6 +938,35 @@ standardMsg = self._truncateMessage(standardMsg, diff) self.fail(self._formatMessage(msg, standardMsg)) + def assertDictContainsSubset(self, subset, dictionary, msg=None): + """Checks whether dictionary is a superset of subset.""" + warnings.warn('assertDictContainsSubset is deprecated', + DeprecationWarning) + missing = [] + mismatched = [] + for key, value in subset.items(): + if key not in dictionary: + missing.append(key) + elif value != dictionary[key]: + mismatched.append('%s, expected: %s, actual: %s' % + (safe_repr(key), safe_repr(value), + safe_repr(dictionary[key]))) + + if not (missing or mismatched): + return + + standardMsg = '' + if missing: + standardMsg = 'Missing: %s' % ','.join(safe_repr(m) for m in + missing) + if mismatched: + if standardMsg: + standardMsg += '; ' + standardMsg += 'Mismatched values: %s' % ','.join(mismatched) + + self.fail(self._formatMessage(msg, standardMsg)) + + def assertCountEqual(self, first, second, msg=None): """An unordered sequence comparison asserting 
that the same elements, regardless of order. If the same element occurs more than once, @@ -1111,11 +1140,13 @@ return deprecated_func # see #9424 - assertEquals = _deprecate(assertEqual) - assertNotEquals = _deprecate(assertNotEqual) - assertAlmostEquals = _deprecate(assertAlmostEqual) - assertNotAlmostEquals = _deprecate(assertNotAlmostEqual) - assert_ = _deprecate(assertTrue) + failUnlessEqual = assertEquals = _deprecate(assertEqual) + failIfEqual = assertNotEquals = _deprecate(assertNotEqual) + failUnlessAlmostEqual = assertAlmostEquals = _deprecate(assertAlmostEqual) + failIfAlmostEqual = assertNotAlmostEquals = _deprecate(assertNotAlmostEqual) + failUnless = assert_ = _deprecate(assertTrue) + failUnlessRaises = _deprecate(assertRaises) + failIf = _deprecate(assertFalse) assertRaisesRegexp = _deprecate(assertRaisesRegex) assertRegexpMatches = _deprecate(assertRegex) diff --git a/Lib/unittest/test/_test_warnings.py b/Lib/unittest/test/_test_warnings.py --- a/Lib/unittest/test/_test_warnings.py +++ b/Lib/unittest/test/_test_warnings.py @@ -19,12 +19,17 @@ warnings.warn('rw', RuntimeWarning) class TestWarnings(unittest.TestCase): - # unittest warnings will be printed at most once per type + # unittest warnings will be printed at most once per type (max one message + # for the fail* methods, and one for the assert* methods) def test_assert(self): self.assertEquals(2+2, 4) self.assertEquals(2*2, 4) self.assertEquals(2**2, 4) + def test_fail(self): + self.failUnless(1) + self.failUnless(True) + def test_other_unittest(self): self.assertAlmostEqual(2+2, 4) self.assertNotAlmostEqual(4+4, 2) diff --git a/Lib/unittest/test/test_assertions.py b/Lib/unittest/test/test_assertions.py --- a/Lib/unittest/test/test_assertions.py +++ b/Lib/unittest/test/test_assertions.py @@ -223,6 +223,15 @@ "\+ \{'key': 'value'\}$", "\+ \{'key': 'value'\} : oops$"]) + def testAssertDictContainsSubset(self): + with warnings.catch_warnings(): + warnings.simplefilter("ignore", 
DeprecationWarning) + + self.assertMessages('assertDictContainsSubset', ({'key': 'value'}, {}), + ["^Missing: 'key'$", "^oops$", + "^Missing: 'key'$", + "^Missing: 'key' : oops$"]) + def testAssertMultiLineEqual(self): self.assertMessages('assertMultiLineEqual', ("", "foo"), [r"\+ foo$", "^oops$", diff --git a/Lib/unittest/test/test_case.py b/Lib/unittest/test/test_case.py --- a/Lib/unittest/test/test_case.py +++ b/Lib/unittest/test/test_case.py @@ -523,6 +523,36 @@ self.assertRaises(self.failureException, self.assertNotIn, 'cow', animals) + def testAssertDictContainsSubset(self): + with warnings.catch_warnings(): + warnings.simplefilter("ignore", DeprecationWarning) + + self.assertDictContainsSubset({}, {}) + self.assertDictContainsSubset({}, {'a': 1}) + self.assertDictContainsSubset({'a': 1}, {'a': 1}) + self.assertDictContainsSubset({'a': 1}, {'a': 1, 'b': 2}) + self.assertDictContainsSubset({'a': 1, 'b': 2}, {'a': 1, 'b': 2}) + + with self.assertRaises(self.failureException): + self.assertDictContainsSubset({1: "one"}, {}) + + with self.assertRaises(self.failureException): + self.assertDictContainsSubset({'a': 2}, {'a': 1}) + + with self.assertRaises(self.failureException): + self.assertDictContainsSubset({'c': 1}, {'a': 1}) + + with self.assertRaises(self.failureException): + self.assertDictContainsSubset({'a': 1, 'c': 1}, {'a': 1}) + + with self.assertRaises(self.failureException): + self.assertDictContainsSubset({'a': 1, 'c': 1}, {'a': 1}) + + one = ''.join(chr(i) for i in range(255)) + # this used to cause a UnicodeDecodeError constructing the failure msg + with self.assertRaises(self.failureException): + self.assertDictContainsSubset({'foo': one}, {'foo': '\uFFFD'}) + def testAssertEqual(self): equal_pairs = [ ((), ()), @@ -1097,11 +1127,19 @@ Test that the deprecated methods raise a DeprecationWarning. See #9424. 
""" old = ( + (self.failIfEqual, (3, 5)), (self.assertNotEquals, (3, 5)), + (self.failUnlessEqual, (3, 3)), (self.assertEquals, (3, 3)), + (self.failUnlessAlmostEqual, (2.0, 2.0)), (self.assertAlmostEquals, (2.0, 2.0)), + (self.failIfAlmostEqual, (3.0, 5.0)), (self.assertNotAlmostEquals, (3.0, 5.0)), + (self.failUnless, (True,)), (self.assert_, (True,)), + (self.failUnlessRaises, (TypeError, lambda _: 3.14 + 'spam')), + (self.failIf, (False,)), + (self.assertDictContainsSubset, (dict(a=1, b=2), dict(a=1, b=2, c=3))), (self.assertRaisesRegexp, (KeyError, 'foo', lambda: {}['foo'])), (self.assertRegexpMatches, ('bar', 'bar')), ) @@ -1109,6 +1147,21 @@ with self.assertWarns(DeprecationWarning): meth(*args) + # disable this test for now. When the version where the fail* methods will + # be removed is decided, re-enable it and update the version + def _testDeprecatedFailMethods(self): + """Test that the deprecated fail* methods get removed in 3.x""" + if sys.version_info[:2] < (3, 3): + return + deprecated_names = [ + 'failIfEqual', 'failUnlessEqual', 'failUnlessAlmostEqual', + 'failIfAlmostEqual', 'failUnless', 'failUnlessRaises', 'failIf', + 'assertDictContainsSubset', + ] + for deprecated_name in deprecated_names: + with self.assertRaises(AttributeError): + getattr(self, deprecated_name) # remove these in 3.x + def testDeepcopy(self): # Issue: 5660 class TestableTest(unittest.TestCase): diff --git a/Lib/unittest/test/test_runner.py b/Lib/unittest/test/test_runner.py --- a/Lib/unittest/test/test_runner.py +++ b/Lib/unittest/test/test_runner.py @@ -257,17 +257,19 @@ return [b.splitlines() for b in p.communicate()] opts = dict(stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=os.path.dirname(__file__)) + ae_msg = b'Please use assertEqual instead.' + at_msg = b'Please use assertTrue instead.' 
# no args -> all the warnings are printed, unittest warnings only once p = subprocess.Popen([sys.executable, '_test_warnings.py'], **opts) out, err = get_parse_out_err(p) self.assertIn(b'OK', err) # check that the total number of warnings in the output is correct - self.assertEqual(len(out), 11) + self.assertEqual(len(out), 12) # check that the numbers of the different kind of warnings is correct for msg in [b'dw', b'iw', b'uw']: self.assertEqual(out.count(msg), 3) - for msg in [b'rw']: + for msg in [ae_msg, at_msg, b'rw']: self.assertEqual(out.count(msg), 1) args_list = ( @@ -292,9 +294,11 @@ **opts) out, err = get_parse_out_err(p) self.assertIn(b'OK', err) - self.assertEqual(len(out), 13) + self.assertEqual(len(out), 14) for msg in [b'dw', b'iw', b'uw', b'rw']: self.assertEqual(out.count(msg), 3) + for msg in [ae_msg, at_msg]: + self.assertEqual(out.count(msg), 1) def testStdErrLookedUpAtInstantiationTime(self): # see issue 10786 diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -87,6 +87,8 @@ Library ------- +- unittest.TestCase.assertSameElements has been removed. + - sys.getfilesystemencoding() raises a RuntimeError if initfsencoding() was not called yet: detect bootstrap (startup) issues earlier. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 17:09:10 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sun, 03 Apr 2011 17:09:10 +0200 Subject: [Python-checkins] cpython: Issue #5863: Rewrite BZ2File in pure Python, and allow it to accept Message-ID: http://hg.python.org/cpython/rev/2cb07a46f4b5 changeset: 69112:2cb07a46f4b5 user: Antoine Pitrou date: Sun Apr 03 17:05:46 2011 +0200 summary: Issue #5863: Rewrite BZ2File in pure Python, and allow it to accept file-like objects using a new `fileobj` constructor argument. Patch by Nadeem Vawda. 
files: Doc/ACKS.txt | 1 + Doc/library/bz2.rst | 221 +- Lib/bz2.py | 392 +++++ Lib/test/test_bz2.py | 142 +- Misc/NEWS | 4 + Modules/bz2module.c | 2281 ++++------------------------- PCbuild/bz2.vcproj | 4 +- PCbuild/pcbuild.sln | 2 +- PCbuild/readme.txt | 6 +- setup.py | 4 +- 10 files changed, 960 insertions(+), 2097 deletions(-) diff --git a/Doc/ACKS.txt b/Doc/ACKS.txt --- a/Doc/ACKS.txt +++ b/Doc/ACKS.txt @@ -202,6 +202,7 @@ * Jim Tittsler * David Turner * Ville Vainio + * Nadeem Vawda * Martijn Vries * Charles G. Waldman * Greg Ward diff --git a/Doc/library/bz2.rst b/Doc/library/bz2.rst --- a/Doc/library/bz2.rst +++ b/Doc/library/bz2.rst @@ -1,189 +1,149 @@ -:mod:`bz2` --- Compression compatible with :program:`bzip2` -=========================================================== +:mod:`bz2` --- Support for :program:`bzip2` compression +======================================================= .. module:: bz2 - :synopsis: Interface to compression and decompression routines - compatible with bzip2. + :synopsis: Interfaces for bzip2 compression and decompression. .. moduleauthor:: Gustavo Niemeyer +.. moduleauthor:: Nadeem Vawda .. sectionauthor:: Gustavo Niemeyer +.. sectionauthor:: Nadeem Vawda -This module provides a comprehensive interface for the bz2 compression library. -It implements a complete file interface, one-shot (de)compression functions, and -types for sequential (de)compression. +This module provides a comprehensive interface for compressing and +decompressing data using the bzip2 compression algorithm. -For other archive formats, see the :mod:`gzip`, :mod:`zipfile`, and +For related file formats, see the :mod:`gzip`, :mod:`zipfile`, and :mod:`tarfile` modules. 
-Here is a summary of the features offered by the bz2 module: +The :mod:`bz2` module contains: -* :class:`BZ2File` class implements a complete file interface, including - :meth:`~BZ2File.readline`, :meth:`~BZ2File.readlines`, - :meth:`~BZ2File.writelines`, :meth:`~BZ2File.seek`, etc; +* The :class:`BZ2File` class for reading and writing compressed files. +* The :class:`BZ2Compressor` and :class:`BZ2Decompressor` classes for + incremental (de)compression. +* The :func:`compress` and :func:`decompress` functions for one-shot + (de)compression. -* :class:`BZ2File` class implements emulated :meth:`~BZ2File.seek` support; - -* :class:`BZ2File` class implements universal newline support; - -* :class:`BZ2File` class offers an optimized line iteration using a readahead - algorithm; - -* Sequential (de)compression supported by :class:`BZ2Compressor` and - :class:`BZ2Decompressor` classes; - -* One-shot (de)compression supported by :func:`compress` and :func:`decompress` - functions; - -* Thread safety uses individual locking mechanism. +All of the classes in this module may safely be accessed from multiple threads. (De)compression of files ------------------------ -Handling of compressed files is offered by the :class:`BZ2File` class. +.. class:: BZ2File(filename=None, mode='r', buffering=None, compresslevel=9, fileobj=None) + Open a bzip2-compressed file. -.. class:: BZ2File(filename, mode='r', buffering=0, compresslevel=9) + The :class:`BZ2File` can wrap an existing :term:`file object` (given by + *fileobj*), or operate directly on a named file (named by *filename*). + Exactly one of these two parameters should be provided. - Open a bz2 file. Mode can be either ``'r'`` or ``'w'``, for reading (default) - or writing. When opened for writing, the file will be created if it doesn't - exist, and truncated otherwise. If *buffering* is given, ``0`` means - unbuffered, and larger numbers specify the buffer size; the default is - ``0``. 
If *compresslevel* is given, it must be a number between ``1`` and - ``9``; the default is ``9``. Add a ``'U'`` to mode to open the file for input - with universal newline support. Any line ending in the input file will be - seen as a ``'\n'`` in Python. Also, a file so opened gains the attribute - :attr:`newlines`; the value for this attribute is one of ``None`` (no newline - read yet), ``'\r'``, ``'\n'``, ``'\r\n'`` or a tuple containing all the - newline types seen. Universal newlines are available only when - reading. Instances support iteration in the same way as normal :class:`file` - instances. + The *mode* argument can be either ``'r'`` for reading (default), or ``'w'`` + for writing. - :class:`BZ2File` supports the :keyword:`with` statement. + The *buffering* argument is ignored. Its use is deprecated. + + If *mode* is ``'w'``, *compresslevel* can be a number between ``1`` and + ``9`` specifying the level of compression: ``1`` produces the least + compression, and ``9`` (default) produces the most compression. + + :class:`BZ2File` provides all of the members specified by the + :class:`io.BufferedIOBase`, except for :meth:`detach` and :meth:`truncate`. + Iteration and the :keyword:`with` statement are supported. + + :class:`BZ2File` also provides the following method: + + .. method:: peek([n]) + + Return buffered data without advancing the file position. At least one + byte of data will be returned (unless at EOF). The exact number of bytes + returned is unspecified. + + .. versionadded:: 3.3 .. versionchanged:: 3.1 Support for the :keyword:`with` statement was added. + .. versionchanged:: 3.3 + The :meth:`fileno`, :meth:`readable`, :meth:`seekable`, :meth:`writable`, + :meth:`read1` and :meth:`readinto` methods were added. - .. method:: close() + .. versionchanged:: 3.3 + The *fileobj* argument to the constructor was added. - Close the file. Sets data attribute :attr:`closed` to true. A closed file - cannot be used for further I/O operations. 
:meth:`close` may be called - more than once without error. - - .. method:: read([size]) - - Read at most *size* uncompressed bytes, returned as a byte string. If the - *size* argument is negative or omitted, read until EOF is reached. - - - .. method:: readline([size]) - - Return the next line from the file, as a byte string, retaining newline. - A non-negative *size* argument limits the maximum number of bytes to - return (an incomplete line may be returned then). Return an empty byte - string at EOF. - - - .. method:: readlines([size]) - - Return a list of lines read. The optional *size* argument, if given, is an - approximate bound on the total number of bytes in the lines returned. - - - .. method:: seek(offset[, whence]) - - Move to new file position. Argument *offset* is a byte count. Optional - argument *whence* defaults to ``os.SEEK_SET`` or ``0`` (offset from start - of file; offset should be ``>= 0``); other values are ``os.SEEK_CUR`` or - ``1`` (move relative to current position; offset can be positive or - negative), and ``os.SEEK_END`` or ``2`` (move relative to end of file; - offset is usually negative, although many platforms allow seeking beyond - the end of a file). - - Note that seeking of bz2 files is emulated, and depending on the - parameters the operation may be extremely slow. - - - .. method:: tell() - - Return the current file position, an integer. - - - .. method:: write(data) - - Write the byte string *data* to file. Note that due to buffering, - :meth:`close` may be needed before the file on disk reflects the data - written. - - - .. method:: writelines(sequence_of_byte_strings) - - Write the sequence of byte strings to the file. Note that newlines are not - added. The sequence can be any iterable object producing byte strings. - This is equivalent to calling write() for each byte string. 
- - -Sequential (de)compression --------------------------- - -Sequential compression and decompression is done using the classes -:class:`BZ2Compressor` and :class:`BZ2Decompressor`. - +Incremental (de)compression +--------------------------- .. class:: BZ2Compressor(compresslevel=9) Create a new compressor object. This object may be used to compress data - sequentially. If you want to compress data in one shot, use the - :func:`compress` function instead. The *compresslevel* parameter, if given, - must be a number between ``1`` and ``9``; the default is ``9``. + incrementally. For one-shot compression, use the :func:`compress` function + instead. + + *compresslevel*, if given, must be a number between ``1`` and ``9``. The + default is ``9``. .. method:: compress(data) - Provide more data to the compressor object. It will return chunks of - compressed data whenever possible. When you've finished providing data to - compress, call the :meth:`flush` method to finish the compression process, - and return what is left in internal buffers. + Provide data to the compressor object. Returns a chunk of compressed data + if possible, or an empty byte string otherwise. + + When you have finished providing data to the compressor, call the + :meth:`flush` method to finish the compression process. .. method:: flush() - Finish the compression process and return what is left in internal - buffers. You must not use the compressor object after calling this method. + Finish the compression process. Returns the compressed data left in + internal buffers. + + The compressor object may not be used after this method has been called. .. class:: BZ2Decompressor() Create a new decompressor object. This object may be used to decompress data - sequentially. If you want to decompress data in one shot, use the - :func:`decompress` function instead. + incrementally. For one-shot compression, use the :func:`decompress` function + instead. .. 
method:: decompress(data) - Provide more data to the decompressor object. It will return chunks of - decompressed data whenever possible. If you try to decompress data after - the end of stream is found, :exc:`EOFError` will be raised. If any data - was found after the end of stream, it'll be ignored and saved in - :attr:`unused_data` attribute. + Provide data to the decompressor object. Returns a chunk of decompressed + data if possible, or an empty byte string otherwise. + + Attempting to decompress data after the end of stream is reached raises + an :exc:`EOFError`. If any data is found after the end of the stream, it + is ignored and saved in the :attr:`unused_data` attribute. + + + .. attribute:: eof + + True if the end-of-stream marker has been reached. + + .. versionadded:: 3.3 + + + .. attribute:: unused_data + + Data found after the end of the compressed stream. One-shot (de)compression ------------------------ -One-shot compression and decompression is provided through the :func:`compress` -and :func:`decompress` functions. - - .. function:: compress(data, compresslevel=9) - Compress *data* in one shot. If you want to compress data sequentially, use - an instance of :class:`BZ2Compressor` instead. The *compresslevel* parameter, - if given, must be a number between ``1`` and ``9``; the default is ``9``. + Compress *data*. + + *compresslevel*, if given, must be a number between ``1`` and ``9``. The + default is ``9``. + + For incremental compression, use a :class:`BZ2Compressor` instead. .. function:: decompress(data) - Decompress *data* in one shot. If you want to decompress data sequentially, - use an instance of :class:`BZ2Decompressor` instead. + Decompress *data*. + For incremental decompression, use a :class:`BZ2Decompressor` instead. + diff --git a/Lib/bz2.py b/Lib/bz2.py new file mode 100644 --- /dev/null +++ b/Lib/bz2.py @@ -0,0 +1,392 @@ +"""Interface to the libbzip2 compression library. 
+ +This module provides a file interface, classes for incremental +(de)compression, and functions for one-shot (de)compression. +""" + +__all__ = ["BZ2File", "BZ2Compressor", "BZ2Decompressor", "compress", + "decompress"] + +__author__ = "Nadeem Vawda " + +import io +import threading +import warnings + +from _bz2 import BZ2Compressor, BZ2Decompressor + + +_MODE_CLOSED = 0 +_MODE_READ = 1 +_MODE_READ_EOF = 2 +_MODE_WRITE = 3 + +_BUFFER_SIZE = 8192 + + +class BZ2File(io.BufferedIOBase): + + """A file object providing transparent bzip2 (de)compression. + + A BZ2File can act as a wrapper for an existing file object, or refer + directly to a named file on disk. + + Note that BZ2File provides a *binary* file interface - data read is + returned as bytes, and data to be written should be given as bytes. + """ + + def __init__(self, filename=None, mode="r", buffering=None, + compresslevel=9, fileobj=None): + """Open a bzip2-compressed file. + + If filename is given, open the named file. Otherwise, operate on + the file object given by fileobj. Exactly one of these two + parameters should be provided. + + mode can be 'r' for reading (default), or 'w' for writing. + + buffering is ignored. Its use is deprecated. + + If mode is 'w', compresslevel can be a number between 1 and 9 + specifying the level of compression: 1 produces the least + compression, and 9 (default) produces the most compression. + """ + # This lock must be recursive, so that BufferedIOBase's + # readline(), readlines() and writelines() don't deadlock. 
+ self._lock = threading.RLock() + self._fp = None + self._closefp = False + self._mode = _MODE_CLOSED + self._pos = 0 + self._size = -1 + + if buffering is not None: + warnings.warn("Use of 'buffering' argument is deprecated", + DeprecationWarning) + + if not (1 <= compresslevel <= 9): + raise ValueError("compresslevel must be between 1 and 9") + + if mode in ("", "r", "rb"): + mode = "rb" + mode_code = _MODE_READ + self._decompressor = BZ2Decompressor() + self._buffer = None + elif mode in ("w", "wb"): + mode = "wb" + mode_code = _MODE_WRITE + self._compressor = BZ2Compressor() + else: + raise ValueError("Invalid mode: {!r}".format(mode)) + + if filename is not None and fileobj is None: + self._fp = open(filename, mode) + self._closefp = True + self._mode = mode_code + elif fileobj is not None and filename is None: + self._fp = fileobj + self._mode = mode_code + else: + raise ValueError("Must give exactly one of filename and fileobj") + + def close(self): + """Flush and close the file. + + May be called more than once without error. Once the file is + closed, any other operation on it will raise a ValueError. 
+ """ + with self._lock: + if self._mode == _MODE_CLOSED: + return + try: + if self._mode in (_MODE_READ, _MODE_READ_EOF): + self._decompressor = None + elif self._mode == _MODE_WRITE: + self._fp.write(self._compressor.flush()) + self._compressor = None + finally: + try: + if self._closefp: + self._fp.close() + finally: + self._fp = None + self._closefp = False + self._mode = _MODE_CLOSED + self._buffer = None + + @property + def closed(self): + """True if this file is closed.""" + return self._mode == _MODE_CLOSED + + def fileno(self): + """Return the file descriptor for the underlying file.""" + return self._fp.fileno() + + def seekable(self): + """Return whether the file supports seeking.""" + return self.readable() + + def readable(self): + """Return whether the file was opened for reading.""" + return self._mode in (_MODE_READ, _MODE_READ_EOF) + + def writable(self): + """Return whether the file was opened for writing.""" + return self._mode == _MODE_WRITE + + # Mode-checking helper functions. + + def _check_not_closed(self): + if self.closed: + raise ValueError("I/O operation on closed file") + + def _check_can_read(self): + if not self.readable(): + self._check_not_closed() + raise io.UnsupportedOperation("File not open for reading") + + def _check_can_write(self): + if not self.writable(): + self._check_not_closed() + raise io.UnsupportedOperation("File not open for writing") + + def _check_can_seek(self): + if not self.seekable(): + self._check_not_closed() + raise io.UnsupportedOperation("Seeking is only supported " + "on files opening for reading") + + # Fill the readahead buffer if it is empty. Returns False on EOF. 
+ def _fill_buffer(self): + if self._buffer: + return True + if self._decompressor.eof: + self._mode = _MODE_READ_EOF + self._size = self._pos + return False + rawblock = self._fp.read(_BUFFER_SIZE) + if not rawblock: + raise EOFError("Compressed file ended before the " + "end-of-stream marker was reached") + self._buffer = self._decompressor.decompress(rawblock) + return True + + # Read data until EOF. + # If return_data is false, consume the data without returning it. + def _read_all(self, return_data=True): + blocks = [] + while self._fill_buffer(): + if return_data: + blocks.append(self._buffer) + self._pos += len(self._buffer) + self._buffer = None + if return_data: + return b"".join(blocks) + + # Read a block of up to n bytes. + # If return_data is false, consume the data without returning it. + def _read_block(self, n, return_data=True): + blocks = [] + while n > 0 and self._fill_buffer(): + if n < len(self._buffer): + data = self._buffer[:n] + self._buffer = self._buffer[n:] + else: + data = self._buffer + self._buffer = None + if return_data: + blocks.append(data) + self._pos += len(data) + n -= len(data) + if return_data: + return b"".join(blocks) + + def peek(self, n=0): + """Return buffered data without advancing the file position. + + Always returns at least one byte of data, unless at EOF. + The exact number of bytes returned is unspecified. + """ + with self._lock: + self._check_can_read() + if self._mode == _MODE_READ_EOF or not self._fill_buffer(): + return b"" + return self._buffer + + def read(self, size=-1): + """Read up to size uncompressed bytes from the file. + + If size is negative or omitted, read until EOF is reached. + Returns b'' if the file is already at EOF. 
+ """ + with self._lock: + self._check_can_read() + if self._mode == _MODE_READ_EOF or size == 0: + return b"" + elif size < 0: + return self._read_all() + else: + return self._read_block(size) + + def read1(self, size=-1): + """Read up to size uncompressed bytes with at most one read + from the underlying stream. + + Returns b'' if the file is at EOF. + """ + with self._lock: + self._check_can_read() + if (size == 0 or self._mode == _MODE_READ_EOF or + not self._fill_buffer()): + return b"" + if 0 < size < len(self._buffer): + data = self._buffer[:size] + self._buffer = self._buffer[size:] + else: + data = self._buffer + self._buffer = None + self._pos += len(data) + return data + + def readinto(self, b): + """Read up to len(b) bytes into b. + + Returns the number of bytes read (0 for EOF). + """ + with self._lock: + return io.BufferedIOBase.readinto(self, b) + + def readline(self, size=-1): + """Read a line of uncompressed bytes from the file. + + The terminating newline (if present) is retained. If size is + non-negative, no more than size bytes will be read (in which + case the line may be incomplete). Returns b'' if already at EOF. + """ + if not hasattr(size, "__index__"): + raise TypeError("Integer argument expected") + size = size.__index__() + with self._lock: + return io.BufferedIOBase.readline(self, size) + + def readlines(self, size=-1): + """Read a list of lines of uncompressed bytes from the file. + + size can be specified to control the number of lines read: no + further lines will be read once the total size of the lines read + so far equals or exceeds size. + """ + if not hasattr(size, "__index__"): + raise TypeError("Integer argument expected") + size = size.__index__() + with self._lock: + return io.BufferedIOBase.readlines(self, size) + + def write(self, data): + """Write a byte string to the file. + + Returns the number of uncompressed bytes written, which is + always len(data). 
Note that due to buffering, the file on disk + may not reflect the data written until close() is called. + """ + with self._lock: + self._check_can_write() + compressed = self._compressor.compress(data) + self._fp.write(compressed) + self._pos += len(data) + return len(data) + + def writelines(self, seq): + """Write a sequence of byte strings to the file. + + Returns the number of uncompressed bytes written. + seq can be any iterable yielding byte strings. + + Line separators are not added between the written byte strings. + """ + with self._lock: + return io.BufferedIOBase.writelines(self, seq) + + # Rewind the file to the beginning of the data stream. + def _rewind(self): + self._fp.seek(0, 0) + self._mode = _MODE_READ + self._pos = 0 + self._decompressor = BZ2Decompressor() + self._buffer = None + + def seek(self, offset, whence=0): + """Change the file position. + + The new position is specified by offset, relative to the + position indicated by whence. Values for whence are: + + 0: start of stream (default); offset must not be negative + 1: current stream position + 2: end of stream; offset must not be positive + + Returns the new file position. + + Note that seeking is emulated, so depending on the parameters, + this operation may be extremely slow. + """ + with self._lock: + self._check_can_seek() + + # Recalculate offset as an absolute file position. + if whence == 0: + pass + elif whence == 1: + offset = self._pos + offset + elif whence == 2: + # Seeking relative to EOF - we need to know the file's size. + if self._size < 0: + self._read_all(return_data=False) + offset = self._size + offset + else: + raise ValueError("Invalid value for whence: {}".format(whence)) + + # Make it so that offset is the number of bytes to skip forward. + if offset < self._pos: + self._rewind() + else: + offset -= self._pos + + # Read and discard data until we reach the desired position. 
+ if self._mode != _MODE_READ_EOF: + self._read_block(offset, return_data=False) + + return self._pos + + def tell(self): + """Return the current file position.""" + with self._lock: + self._check_not_closed() + return self._pos + + +def compress(data, compresslevel=9): + """Compress a block of data. + + compresslevel, if given, must be a number between 1 and 9. + + For incremental compression, use a BZ2Compressor object instead. + """ + comp = BZ2Compressor(compresslevel) + return comp.compress(data) + comp.flush() + + +def decompress(data): + """Decompress a block of data. + + For incremental decompression, use a BZ2Decompressor object instead. + """ + if len(data) == 0: + return b"" + decomp = BZ2Decompressor() + result = decomp.decompress(data) + if not decomp.eof: + raise ValueError("Compressed data ended before the " + "end-of-stream marker was reached") + return result diff --git a/Lib/test/test_bz2.py b/Lib/test/test_bz2.py --- a/Lib/test/test_bz2.py +++ b/Lib/test/test_bz2.py @@ -21,7 +21,30 @@ class BaseTest(unittest.TestCase): "Base for other testcases." 
- TEXT = b'root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:\ndaemon:x:2:2:daemon:/sbin:\nadm:x:3:4:adm:/var/adm:\nlp:x:4:7:lp:/var/spool/lpd:\nsync:x:5:0:sync:/sbin:/bin/sync\nshutdown:x:6:0:shutdown:/sbin:/sbin/shutdown\nhalt:x:7:0:halt:/sbin:/sbin/halt\nmail:x:8:12:mail:/var/spool/mail:\nnews:x:9:13:news:/var/spool/news:\nuucp:x:10:14:uucp:/var/spool/uucp:\noperator:x:11:0:operator:/root:\ngames:x:12:100:games:/usr/games:\ngopher:x:13:30:gopher:/usr/lib/gopher-data:\nftp:x:14:50:FTP User:/var/ftp:/bin/bash\nnobody:x:65534:65534:Nobody:/home:\npostfix:x:100:101:postfix:/var/spool/postfix:\nniemeyer:x:500:500::/home/niemeyer:/bin/bash\npostgres:x:101:102:PostgreSQL Server:/var/lib/pgsql:/bin/bash\nmysql:x:102:103:MySQL server:/var/lib/mysql:/bin/bash\nwww:x:103:104::/var/www:/bin/false\n' + TEXT_LINES = [ + b'root:x:0:0:root:/root:/bin/bash\n', + b'bin:x:1:1:bin:/bin:\n', + b'daemon:x:2:2:daemon:/sbin:\n', + b'adm:x:3:4:adm:/var/adm:\n', + b'lp:x:4:7:lp:/var/spool/lpd:\n', + b'sync:x:5:0:sync:/sbin:/bin/sync\n', + b'shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown\n', + b'halt:x:7:0:halt:/sbin:/sbin/halt\n', + b'mail:x:8:12:mail:/var/spool/mail:\n', + b'news:x:9:13:news:/var/spool/news:\n', + b'uucp:x:10:14:uucp:/var/spool/uucp:\n', + b'operator:x:11:0:operator:/root:\n', + b'games:x:12:100:games:/usr/games:\n', + b'gopher:x:13:30:gopher:/usr/lib/gopher-data:\n', + b'ftp:x:14:50:FTP User:/var/ftp:/bin/bash\n', + b'nobody:x:65534:65534:Nobody:/home:\n', + b'postfix:x:100:101:postfix:/var/spool/postfix:\n', + b'niemeyer:x:500:500::/home/niemeyer:/bin/bash\n', + b'postgres:x:101:102:PostgreSQL Server:/var/lib/pgsql:/bin/bash\n', + b'mysql:x:102:103:MySQL server:/var/lib/mysql:/bin/bash\n', + b'www:x:103:104::/var/www:/bin/false\n', + ] + TEXT = b''.join(TEXT_LINES) DATA = 
b'BZh91AY&SY.\xc8N\x18\x00\x01>_\x80\x00\x10@\x02\xff\xf0\x01\x07n\x00?\xe7\xff\xe00\x01\x99\xaa\x00\xc0\x03F\x86\x8c#&\x83F\x9a\x03\x06\xa6\xd0\xa6\x93M\x0fQ\xa7\xa8\x06\x804hh\x12$\x11\xa4i4\xf14S\xd2\x88\xe5\xcd9gd6\x0b\n\xe9\x9b\xd5\x8a\x99\xf7\x08.K\x8ev\xfb\xf7xw\xbb\xdf\xa1\x92\xf1\xdd|/";\xa2\xba\x9f\xd5\xb1#A\xb6\xf6\xb3o\xc9\xc5y\\\xebO\xe7\x85\x9a\xbc\xb6f8\x952\xd5\xd7"%\x89>V,\xf7\xa6z\xe2\x9f\xa3\xdf\x11\x11"\xd6E)I\xa9\x13^\xca\xf3r\xd0\x03U\x922\xf26\xec\xb6\xed\x8b\xc3U\x13\x9d\xc5\x170\xa4\xfa^\x92\xacDF\x8a\x97\xd6\x19\xfe\xdd\xb8\xbd\x1a\x9a\x19\xa3\x80ankR\x8b\xe5\xd83]\xa9\xc6\x08\x82f\xf6\xb9"6l$\xb8j@\xc0\x8a\xb0l1..\xbak\x83ls\x15\xbc\xf4\xc1\x13\xbe\xf8E\xb8\x9d\r\xa8\x9dk\x84\xd3n\xfa\xacQ\x07\xb1%y\xaav\xb4\x08\xe0z\x1b\x16\xf5\x04\xe9\xcc\xb9\x08z\x1en7.G\xfc]\xc9\x14\xe1B@\xbb!8`' DATA_CRLF = b'BZh91AY&SY\xaez\xbbN\x00\x01H\xdf\x80\x00\x12@\x02\xff\xf0\x01\x07n\x00?\xe7\xff\xe0@\x01\xbc\xc6`\x86*\x8d=M\xa9\x9a\x86\xd0L@\x0fI\xa6!\xa1\x13\xc8\x88jdi\x8d@\x03@\x1a\x1a\x0c\x0c\x83 \x00\xc4h2\x19\x01\x82D\x84e\t\xe8\x99\x89\x19\x1ah\x00\r\x1a\x11\xaf\x9b\x0fG\xf5(\x1b\x1f?\t\x12\xcf\xb5\xfc\x95E\x00ps\x89\x12^\xa4\xdd\xa2&\x05(\x87\x04\x98\x89u\xe40%\xb6\x19\'\x8c\xc4\x89\xca\x07\x0e\x1b!\x91UIFU%C\x994!DI\xd2\xfa\xf0\xf1N8W\xde\x13A\xf5\x9cr%?\x9f3;I45A\xd1\x8bT\xb1\xa4\xc7\x8d\x1a\\"\xad\xa1\xabyBg\x15\xb9l\x88\x88\x91k"\x94\xa4\xd4\x89\xae*\xa6\x0b\x10\x0c\xd6\xd4m\xe86\xec\xb5j\x8a\x86j\';\xca.\x01I\xf2\xaaJ\xe8\x88\x8cU+t3\xfb\x0c\n\xa33\x13r2\r\x16\xe0\xb3(\xbf\x1d\x83r\xe7M\xf0D\x1365\xd8\x88\xd3\xa4\x92\xcb2\x06\x04\\\xc1\xb0\xea//\xbek&\xd8\xe6+t\xe5\xa1\x13\xada\x16\xder5"w]\xa2i\xb7[\x97R \xe2IT\xcd;Z\x04dk4\xad\x8a\t\xd3\x81z\x10\xf1:^`\xab\x1f\xc5\xdc\x91N\x14$+\x9e\xae\xd3\x80' @@ -54,13 +77,15 @@ if os.path.isfile(self.filename): os.unlink(self.filename) - def createTempFile(self, crlf=0): + def getData(self, crlf=False): + if crlf: + return self.DATA_CRLF + else: + return self.DATA + + def createTempFile(self, crlf=False): 
with open(self.filename, "wb") as f: - if crlf: - data = self.DATA_CRLF - else: - data = self.DATA - f.write(data) + f.write(self.getData(crlf)) def testRead(self): # "Test BZ2File.read()" @@ -70,7 +95,7 @@ self.assertEqual(bz2f.read(), self.TEXT) def testRead0(self): - # Test BBZ2File.read(0)" + # "Test BBZ2File.read(0)" self.createTempFile() with BZ2File(self.filename) as bz2f: self.assertRaises(TypeError, bz2f.read, None) @@ -94,6 +119,28 @@ with BZ2File(self.filename) as bz2f: self.assertEqual(bz2f.read(100), self.TEXT[:100]) + def testPeek(self): + # "Test BZ2File.peek()" + self.createTempFile() + with BZ2File(self.filename) as bz2f: + pdata = bz2f.peek() + self.assertNotEqual(len(pdata), 0) + self.assertTrue(self.TEXT.startswith(pdata)) + self.assertEqual(bz2f.read(), self.TEXT) + + def testReadInto(self): + # "Test BZ2File.readinto()" + self.createTempFile() + with BZ2File(self.filename) as bz2f: + n = 128 + b = bytearray(n) + self.assertEqual(bz2f.readinto(b), n) + self.assertEqual(b, self.TEXT[:n]) + n = len(self.TEXT) - n + b = bytearray(len(self.TEXT)) + self.assertEqual(bz2f.readinto(b), n) + self.assertEqual(b[:n], self.TEXT[-n:]) + def testReadLine(self): # "Test BZ2File.readline()" self.createTempFile() @@ -125,7 +172,7 @@ bz2f = BZ2File(self.filename) bz2f.close() self.assertRaises(ValueError, bz2f.__next__) - # This call will deadlock of the above .__next__ call failed to + # This call will deadlock if the above .__next__ call failed to # release the lock.
self.assertRaises(ValueError, bz2f.readlines) @@ -217,6 +264,13 @@ self.assertEqual(bz2f.tell(), 0) self.assertEqual(bz2f.read(), self.TEXT) + def testFileno(self): + # "Test BZ2File.fileno()" + self.createTempFile() + with open(self.filename) as rawf: + with BZ2File(fileobj=rawf) as bz2f: + self.assertEqual(bz2f.fileno(), rawf.fileno()) + def testOpenDel(self): # "Test opening and deleting a file many times" self.createTempFile() @@ -278,17 +332,65 @@ t.join() def testMixedIterationReads(self): - # Issue #8397: mixed iteration and reads should be forbidden. - with bz2.BZ2File(self.filename, 'wb') as f: - # The internal buffer size is hard-wired to 8192 bytes, we must - # write out more than that for the test to stop half through - # the buffer. - f.write(self.TEXT * 100) - with bz2.BZ2File(self.filename, 'rb') as f: - next(f) - self.assertRaises(ValueError, f.read) - self.assertRaises(ValueError, f.readline) - self.assertRaises(ValueError, f.readlines) + # "Test mixed iteration and reads."
+ self.createTempFile() + linelen = len(self.TEXT_LINES[0]) + halflen = linelen // 2 + with bz2.BZ2File(self.filename) as bz2f: + bz2f.read(halflen) + self.assertEqual(next(bz2f), self.TEXT_LINES[0][halflen:]) + self.assertEqual(bz2f.read(), self.TEXT[linelen:]) + with bz2.BZ2File(self.filename) as bz2f: + bz2f.readline() + self.assertEqual(next(bz2f), self.TEXT_LINES[1]) + self.assertEqual(bz2f.readline(), self.TEXT_LINES[2]) + with bz2.BZ2File(self.filename) as bz2f: + bz2f.readlines() + with self.assertRaises(StopIteration): + next(bz2f) + self.assertEqual(bz2f.readlines(), []) + + def testReadBytesIO(self): + # "Test BZ2File.read() with BytesIO source" + with BytesIO(self.getData()) as bio: + with BZ2File(fileobj=bio) as bz2f: + self.assertRaises(TypeError, bz2f.read, None) + self.assertEqual(bz2f.read(), self.TEXT) + self.assertFalse(bio.closed) + + def testPeekBytesIO(self): + # "Test BZ2File.peek() with BytesIO source" + with BytesIO(self.getData()) as bio: + with BZ2File(fileobj=bio) as bz2f: + pdata = bz2f.peek() + self.assertNotEqual(len(pdata), 0) + self.assertTrue(self.TEXT.startswith(pdata)) + self.assertEqual(bz2f.read(), self.TEXT) + + def testWriteBytesIO(self): + # "Test BZ2File.write() with BytesIO destination" + with BytesIO() as bio: + with BZ2File(fileobj=bio, mode="w") as bz2f: + self.assertRaises(TypeError, bz2f.write) + bz2f.write(self.TEXT) + self.assertEqual(self.decompress(bio.getvalue()), self.TEXT) + self.assertFalse(bio.closed) + + def testSeekForwardBytesIO(self): + # "Test BZ2File.seek(150, 0) with BytesIO source" + with BytesIO(self.getData()) as bio: + with BZ2File(fileobj=bio) as bz2f: + self.assertRaises(TypeError, bz2f.seek) + bz2f.seek(150) + self.assertEqual(bz2f.read(), self.TEXT[150:]) + + def testSeekBackwardsBytesIO(self): + # "Test BZ2File.seek(-150, 1) with BytesIO source" + with BytesIO(self.getData()) as bio: + with BZ2File(fileobj=bio) as bz2f: + bz2f.read(500) + bz2f.seek(-150, 1) + self.assertEqual(bz2f.read(), 
self.TEXT[500-150:]) class BZ2CompressorTest(BaseTest): def testCompress(self): diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -87,6 +87,10 @@ Library ------- +- Issue #5863: Rewrite BZ2File in pure Python, and allow it to accept + file-like objects using a new ``fileobj`` constructor argument. Patch by + Nadeem Vawda. + - unittest.TestCase.assertSameElements has been removed. - sys.getfilesystemencoding() raises a RuntimeError if initfsencoding() was not diff --git a/Modules/bz2module.c b/Modules/_bz2module.c rename from Modules/bz2module.c rename to Modules/_bz2module.c --- a/Modules/bz2module.c +++ b/Modules/_bz2module.c @@ -1,215 +1,111 @@ -/* +/* _bz2 - Low-level Python interface to libbzip2. */ -python-bz2 - python bz2 library interface - -Copyright (c) 2002 Gustavo Niemeyer -Copyright (c) 2002 Python Software Foundation; All Rights Reserved - -*/ +#define PY_SSIZE_T_CLEAN #include "Python.h" -#include -#include #include "structmember.h" #ifdef WITH_THREAD #include "pythread.h" #endif -static char __author__[] = -"The bz2 python module was written by:\n\ -\n\ - Gustavo Niemeyer \n\ -"; +#include +#include -/* Our very own off_t-like type, 64-bit if possible */ -/* copied from Objects/fileobject.c */ -#if !defined(HAVE_LARGEFILE_SUPPORT) -typedef off_t Py_off_t; -#elif SIZEOF_OFF_T >= 8 -typedef off_t Py_off_t; -#elif SIZEOF_FPOS_T >= 8 -typedef fpos_t Py_off_t; -#else -#error "Large file support, but neither off_t nor fpos_t is large enough." 
-#endif -#define BUF(v) PyBytes_AS_STRING(v) - -#define MODE_CLOSED 0 -#define MODE_READ 1 -#define MODE_READ_EOF 2 -#define MODE_WRITE 3 - -#define BZ2FileObject_Check(v) (Py_TYPE(v) == &BZ2File_Type) - - -#ifdef BZ_CONFIG_ERROR - -#if SIZEOF_LONG >= 8 -#define BZS_TOTAL_OUT(bzs) \ - (((long)bzs->total_out_hi32 << 32) + bzs->total_out_lo32) -#elif SIZEOF_LONG_LONG >= 8 -#define BZS_TOTAL_OUT(bzs) \ - (((PY_LONG_LONG)bzs->total_out_hi32 << 32) + bzs->total_out_lo32) -#else -#define BZS_TOTAL_OUT(bzs) \ - bzs->total_out_lo32 -#endif - -#else /* ! BZ_CONFIG_ERROR */ - -#define BZ2_bzRead bzRead -#define BZ2_bzReadOpen bzReadOpen -#define BZ2_bzReadClose bzReadClose -#define BZ2_bzWrite bzWrite -#define BZ2_bzWriteOpen bzWriteOpen -#define BZ2_bzWriteClose bzWriteClose +#ifndef BZ_CONFIG_ERROR #define BZ2_bzCompress bzCompress #define BZ2_bzCompressInit bzCompressInit #define BZ2_bzCompressEnd bzCompressEnd #define BZ2_bzDecompress bzDecompress #define BZ2_bzDecompressInit bzDecompressInit #define BZ2_bzDecompressEnd bzDecompressEnd - -#define BZS_TOTAL_OUT(bzs) bzs->total_out - -#endif /* ! BZ_CONFIG_ERROR */ +#endif /* ! BZ_CONFIG_ERROR */ #ifdef WITH_THREAD #define ACQUIRE_LOCK(obj) do { \ - if (!PyThread_acquire_lock(obj->lock, 0)) { \ + if (!PyThread_acquire_lock((obj)->lock, 0)) { \ Py_BEGIN_ALLOW_THREADS \ - PyThread_acquire_lock(obj->lock, 1); \ + PyThread_acquire_lock((obj)->lock, 1); \ Py_END_ALLOW_THREADS \ - } } while(0) -#define RELEASE_LOCK(obj) PyThread_release_lock(obj->lock) + } } while (0) +#define RELEASE_LOCK(obj) PyThread_release_lock((obj)->lock) #else #define ACQUIRE_LOCK(obj) #define RELEASE_LOCK(obj) #endif -/* Bits in f_newlinetypes */ -#define NEWLINE_UNKNOWN 0 /* No newline seen, yet */ -#define NEWLINE_CR 1 /* \r newline seen */ -#define NEWLINE_LF 2 /* \n newline seen */ -#define NEWLINE_CRLF 4 /* \r\n newline seen */ - -/* ===================================================================== */ -/* Structure definitions. 
*/ - -typedef struct { - PyObject_HEAD - FILE *rawfp; - - char* f_buf; /* Allocated readahead buffer */ - char* f_bufend; /* Points after last occupied position */ - char* f_bufptr; /* Current buffer position */ - - BZFILE *fp; - int mode; - Py_off_t pos; - Py_off_t size; -#ifdef WITH_THREAD - PyThread_type_lock lock; -#endif -} BZ2FileObject; typedef struct { PyObject_HEAD bz_stream bzs; - int running; + int flushed; #ifdef WITH_THREAD PyThread_type_lock lock; #endif -} BZ2CompObject; +} BZ2Compressor; typedef struct { PyObject_HEAD bz_stream bzs; - int running; + char eof; /* T_BOOL expects a char */ PyObject *unused_data; #ifdef WITH_THREAD PyThread_type_lock lock; #endif -} BZ2DecompObject; +} BZ2Decompressor; -/* ===================================================================== */ -/* Utility functions. */ -/* Refuse regular I/O if there's data in the iteration-buffer. - * Mixing them would cause data to arrive out of order, as the read* - * methods don't use the iteration buffer. */ -static int -check_iterbuffered(BZ2FileObject *f) -{ - if (f->f_buf != NULL && - (f->f_bufend - f->f_bufptr) > 0 && - f->f_buf[0] != '\0') { - PyErr_SetString(PyExc_ValueError, - "Mixing iteration and read methods would lose data"); - return -1; - } - return 0; -} +/* Helper functions. 
*/ static int -Util_CatchBZ2Error(int bzerror) +catch_bz2_error(int bzerror) { - int ret = 0; switch(bzerror) { case BZ_OK: + case BZ_RUN_OK: + case BZ_FLUSH_OK: + case BZ_FINISH_OK: case BZ_STREAM_END: - break; + return 0; #ifdef BZ_CONFIG_ERROR case BZ_CONFIG_ERROR: PyErr_SetString(PyExc_SystemError, - "the bz2 library was not compiled " - "correctly"); - ret = 1; - break; + "libbzip2 was not compiled correctly"); + return 1; #endif - case BZ_PARAM_ERROR: PyErr_SetString(PyExc_ValueError, - "the bz2 library has received wrong " - "parameters"); - ret = 1; - break; - + "Internal error - " + "invalid parameters passed to libbzip2"); + return 1; case BZ_MEM_ERROR: PyErr_NoMemory(); - ret = 1; - break; - + return 1; case BZ_DATA_ERROR: case BZ_DATA_ERROR_MAGIC: - PyErr_SetString(PyExc_IOError, "invalid data stream"); - ret = 1; - break; - + PyErr_SetString(PyExc_IOError, "Invalid data stream"); + return 1; case BZ_IO_ERROR: - PyErr_SetString(PyExc_IOError, "unknown IO error"); - ret = 1; - break; - + PyErr_SetString(PyExc_IOError, "Unknown I/O error"); + return 1; case BZ_UNEXPECTED_EOF: PyErr_SetString(PyExc_EOFError, - "compressed file ended before the " - "logical end-of-stream was detected"); - ret = 1; - break; - + "Compressed file ended before the logical " + "end-of-stream was detected"); + return 1; case BZ_SEQUENCE_ERROR: PyErr_SetString(PyExc_RuntimeError, - "wrong sequence of bz2 library " - "commands used"); - ret = 1; - break; + "Internal error - " + "Invalid sequence of commands sent to libbzip2"); + return 1; + default: + PyErr_Format(PyExc_IOError, + "Unrecognized error from libbzip2: %d", bzerror); + return 1; } - return ret; } #if BUFSIZ < 8192 @@ -224,1599 +120,316 @@ #define BIGCHUNK (512 * 1024) #endif -/* This is a hacked version of Python's fileobject.c:new_buffersize(). 
*/ -static size_t -Util_NewBufferSize(size_t currentsize) +static int +grow_buffer(PyObject **buf) { - if (currentsize > SMALLCHUNK) { - /* Keep doubling until we reach BIGCHUNK; - then keep adding BIGCHUNK. */ - if (currentsize <= BIGCHUNK) - return currentsize + currentsize; - else - return currentsize + BIGCHUNK; - } - return currentsize + SMALLCHUNK; + size_t size = PyBytes_GET_SIZE(*buf); + if (size <= SMALLCHUNK) + return _PyBytes_Resize(buf, size + SMALLCHUNK); + else if (size <= BIGCHUNK) + return _PyBytes_Resize(buf, size * 2); + else + return _PyBytes_Resize(buf, size + BIGCHUNK); } -/* This is a hacked version of Python's fileobject.c:get_line(). */ + +/* BZ2Compressor class. */ + static PyObject * -Util_GetLine(BZ2FileObject *f, int n) +compress(BZ2Compressor *c, char *data, size_t len, int action) { - char c; - char *buf, *end; - size_t total_v_size; /* total # of slots in buffer */ - size_t used_v_size; /* # used slots in buffer */ - size_t increment; /* amount to increment the buffer */ - PyObject *v; - int bzerror; - int bytes_read; + size_t data_size = 0; + PyObject *result; - total_v_size = n > 0 ? n : 100; - v = PyBytes_FromStringAndSize((char *)NULL, total_v_size); - if (v == NULL) + result = PyBytes_FromStringAndSize(NULL, SMALLCHUNK); + if (result == NULL) return NULL; + c->bzs.next_in = data; + /* FIXME This is not 64-bit clean - avail_in is an int. 
*/ + c->bzs.avail_in = len; + c->bzs.next_out = PyBytes_AS_STRING(result); + c->bzs.avail_out = PyBytes_GET_SIZE(result); + for (;;) { + char *this_out; + int bzerror; - buf = BUF(v); - end = buf + total_v_size; + Py_BEGIN_ALLOW_THREADS + this_out = c->bzs.next_out; + bzerror = BZ2_bzCompress(&c->bzs, action); + data_size += c->bzs.next_out - this_out; + Py_END_ALLOW_THREADS + if (catch_bz2_error(bzerror)) + goto error; - for (;;) { - Py_BEGIN_ALLOW_THREADS - do { - bytes_read = BZ2_bzRead(&bzerror, f->fp, &c, 1); - f->pos++; - if (bytes_read == 0) - break; - *buf++ = c; - } while (bzerror == BZ_OK && c != '\n' && buf != end); - Py_END_ALLOW_THREADS - if (bzerror == BZ_STREAM_END) { - f->size = f->pos; - f->mode = MODE_READ_EOF; + /* In regular compression mode, stop when input data is exhausted. + In flushing mode, stop when all buffered data has been flushed. */ + if ((action == BZ_RUN && c->bzs.avail_in == 0) || + (action == BZ_FINISH && bzerror == BZ_STREAM_END)) break; - } else if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - Py_DECREF(v); - return NULL; - } - if (c == '\n') - break; - /* Must be because buf == end */ - if (n > 0) - break; - used_v_size = total_v_size; - increment = total_v_size >> 2; /* mild exponential growth */ - total_v_size += increment; - if (total_v_size > INT_MAX) { - PyErr_SetString(PyExc_OverflowError, - "line is longer than a Python string can hold"); - Py_DECREF(v); - return NULL; - } - if (_PyBytes_Resize(&v, total_v_size) < 0) { - return NULL; - } - buf = BUF(v) + used_v_size; - end = BUF(v) + total_v_size; - } - used_v_size = buf - BUF(v); - if (used_v_size != total_v_size) { - if (_PyBytes_Resize(&v, used_v_size) < 0) { - v = NULL; + if (c->bzs.avail_out == 0) { + if (grow_buffer(&result) < 0) + goto error; + c->bzs.next_out = PyBytes_AS_STRING(result) + data_size; + c->bzs.avail_out = PyBytes_GET_SIZE(result) - data_size; } } - return v; + if (data_size != PyBytes_GET_SIZE(result)) + if (_PyBytes_Resize(&result, 
data_size) < 0) + goto error; + return result; + +error: + Py_XDECREF(result); + return NULL; } -/* This is a hacked version of Python's fileobject.c:drop_readahead(). */ -static void -Util_DropReadAhead(BZ2FileObject *f) +PyDoc_STRVAR(BZ2Compressor_compress__doc__, +"compress(data) -> bytes\n" +"\n" +"Provide data to the compressor object. Returns a chunk of\n" +"compressed data if possible, or b'' otherwise.\n" +"\n" +"When you have finished providing data to the compressor, call the\n" +"flush() method to finish the compression process.\n"); + +static PyObject * +BZ2Compressor_compress(BZ2Compressor *self, PyObject *args) { - if (f->f_buf != NULL) { - PyMem_Free(f->f_buf); - f->f_buf = NULL; - } -} + Py_buffer buffer; + PyObject *result = NULL; -/* This is a hacked version of Python's fileobject.c:readahead(). */ -static int -Util_ReadAhead(BZ2FileObject *f, int bufsize) -{ - int chunksize; - int bzerror; - - if (f->f_buf != NULL) { - if((f->f_bufend - f->f_bufptr) >= 1) - return 0; - else - Util_DropReadAhead(f); - } - if (f->mode == MODE_READ_EOF) { - f->f_bufptr = f->f_buf; - f->f_bufend = f->f_buf; - return 0; - } - if ((f->f_buf = PyMem_Malloc(bufsize)) == NULL) { - PyErr_NoMemory(); - return -1; - } - Py_BEGIN_ALLOW_THREADS - chunksize = BZ2_bzRead(&bzerror, f->fp, f->f_buf, bufsize); - Py_END_ALLOW_THREADS - f->pos += chunksize; - if (bzerror == BZ_STREAM_END) { - f->size = f->pos; - f->mode = MODE_READ_EOF; - } else if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - Util_DropReadAhead(f); - return -1; - } - f->f_bufptr = f->f_buf; - f->f_bufend = f->f_buf + chunksize; - return 0; -} - -/* This is a hacked version of Python's - * fileobject.c:readahead_get_line_skip(). 
*/ -static PyBytesObject * -Util_ReadAheadGetLineSkip(BZ2FileObject *f, int skip, int bufsize) -{ - PyBytesObject* s; - char *bufptr; - char *buf; - int len; - - if (f->f_buf == NULL) - if (Util_ReadAhead(f, bufsize) < 0) - return NULL; - - len = f->f_bufend - f->f_bufptr; - if (len == 0) - return (PyBytesObject *) - PyBytes_FromStringAndSize(NULL, skip); - bufptr = memchr(f->f_bufptr, '\n', len); - if (bufptr != NULL) { - bufptr++; /* Count the '\n' */ - len = bufptr - f->f_bufptr; - s = (PyBytesObject *) - PyBytes_FromStringAndSize(NULL, skip+len); - if (s == NULL) - return NULL; - memcpy(PyBytes_AS_STRING(s)+skip, f->f_bufptr, len); - f->f_bufptr = bufptr; - if (bufptr == f->f_bufend) - Util_DropReadAhead(f); - } else { - bufptr = f->f_bufptr; - buf = f->f_buf; - f->f_buf = NULL; /* Force new readahead buffer */ - s = Util_ReadAheadGetLineSkip(f, skip+len, - bufsize + (bufsize>>2)); - if (s == NULL) { - PyMem_Free(buf); - return NULL; - } - memcpy(PyBytes_AS_STRING(s)+skip, bufptr, len); - PyMem_Free(buf); - } - return s; -} - -/* ===================================================================== */ -/* Methods of BZ2File. */ - -PyDoc_STRVAR(BZ2File_read__doc__, -"read([size]) -> string\n\ -\n\ -Read at most size uncompressed bytes, returned as a string. If the size\n\ -argument is negative or omitted, read until EOF is reached.\n\ -"); - -/* This is a hacked version of Python's fileobject.c:file_read(). 
*/ -static PyObject * -BZ2File_read(BZ2FileObject *self, PyObject *args) -{ - long bytesrequested = -1; - size_t bytesread, buffersize, chunksize; - int bzerror; - PyObject *ret = NULL; - - if (!PyArg_ParseTuple(args, "|l:read", &bytesrequested)) + if (!PyArg_ParseTuple(args, "y*:compress", &buffer)) return NULL; ACQUIRE_LOCK(self); - switch (self->mode) { - case MODE_READ: - break; - case MODE_READ_EOF: - ret = PyBytes_FromStringAndSize("", 0); - goto cleanup; - case MODE_CLOSED: - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - goto cleanup; - default: - PyErr_SetString(PyExc_IOError, - "file is not ready for reading"); - goto cleanup; - } - - /* refuse to mix with f.next() */ - if (check_iterbuffered(self)) - goto cleanup; - - if (bytesrequested < 0) - buffersize = Util_NewBufferSize((size_t)0); + if (self->flushed) + PyErr_SetString(PyExc_ValueError, "Compressor has been flushed"); else - buffersize = bytesrequested; - if (buffersize > INT_MAX) { - PyErr_SetString(PyExc_OverflowError, - "requested number of bytes is " - "more than a Python string can hold"); - goto cleanup; - } - ret = PyBytes_FromStringAndSize((char *)NULL, buffersize); - if (ret == NULL || buffersize == 0) - goto cleanup; - bytesread = 0; - - for (;;) { - Py_BEGIN_ALLOW_THREADS - chunksize = BZ2_bzRead(&bzerror, self->fp, - BUF(ret)+bytesread, - buffersize-bytesread); - self->pos += chunksize; - Py_END_ALLOW_THREADS - bytesread += chunksize; - if (bzerror == BZ_STREAM_END) { - self->size = self->pos; - self->mode = MODE_READ_EOF; - break; - } else if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - Py_DECREF(ret); - ret = NULL; - goto cleanup; - } - if (bytesrequested < 0) { - buffersize = Util_NewBufferSize(buffersize); - if (_PyBytes_Resize(&ret, buffersize) < 0) { - ret = NULL; - goto cleanup; - } - } else { - break; - } - } - if (bytesread != buffersize) { - if (_PyBytes_Resize(&ret, bytesread) < 0) { - ret = NULL; - } - } - -cleanup: + result = compress(self, 
buffer.buf, buffer.len, BZ_RUN); RELEASE_LOCK(self); - return ret; + PyBuffer_Release(&buffer); + return result; } -PyDoc_STRVAR(BZ2File_readline__doc__, -"readline([size]) -> string\n\ -\n\ -Return the next line from the file, as a string, retaining newline.\n\ -A non-negative size argument will limit the maximum number of bytes to\n\ -return (an incomplete line may be returned then). Return an empty\n\ -string at EOF.\n\ -"); +PyDoc_STRVAR(BZ2Compressor_flush__doc__, +"flush() -> bytes\n" +"\n" +"Finish the compression process. Returns the compressed data left\n" +"in internal buffers.\n" +"\n" +"The compressor object may not be used after this method is called.\n"); static PyObject * -BZ2File_readline(BZ2FileObject *self, PyObject *args) +BZ2Compressor_flush(BZ2Compressor *self, PyObject *noargs) { - PyObject *ret = NULL; - int sizehint = -1; - - if (!PyArg_ParseTuple(args, "|i:readline", &sizehint)) - return NULL; + PyObject *result = NULL; ACQUIRE_LOCK(self); - switch (self->mode) { - case MODE_READ: - break; - case MODE_READ_EOF: - ret = PyBytes_FromStringAndSize("", 0); - goto cleanup; - case MODE_CLOSED: - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - goto cleanup; - default: - PyErr_SetString(PyExc_IOError, - "file is not ready for reading"); - goto cleanup; + if (self->flushed) + PyErr_SetString(PyExc_ValueError, "Repeated call to flush()"); + else { + self->flushed = 1; + result = compress(self, NULL, 0, BZ_FINISH); } - - /* refuse to mix with f.next() */ - if (check_iterbuffered(self)) - goto cleanup; - - if (sizehint == 0) - ret = PyBytes_FromStringAndSize("", 0); - else - ret = Util_GetLine(self, (sizehint < 0) ? 
0 : sizehint); - -cleanup: RELEASE_LOCK(self); - return ret; + return result; } -PyDoc_STRVAR(BZ2File_readlines__doc__, -"readlines([size]) -> list\n\ -\n\ -Call readline() repeatedly and return a list of lines read.\n\ -The optional size argument, if given, is an approximate bound on the\n\ -total number of bytes in the lines returned.\n\ -"); - -/* This is a hacked version of Python's fileobject.c:file_readlines(). */ -static PyObject * -BZ2File_readlines(BZ2FileObject *self, PyObject *args) +static int +BZ2Compressor_init(BZ2Compressor *self, PyObject *args, PyObject *kwargs) { - long sizehint = 0; - PyObject *list = NULL; - PyObject *line; - char small_buffer[SMALLCHUNK]; - char *buffer = small_buffer; - size_t buffersize = SMALLCHUNK; - PyObject *big_buffer = NULL; - size_t nfilled = 0; - size_t nread; - size_t totalread = 0; - char *p, *q, *end; - int err; - int shortread = 0; + int compresslevel = 9; int bzerror; - if (!PyArg_ParseTuple(args, "|l:readlines", &sizehint)) - return NULL; - - ACQUIRE_LOCK(self); - switch (self->mode) { - case MODE_READ: - break; - case MODE_READ_EOF: - list = PyList_New(0); - goto cleanup; - case MODE_CLOSED: - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - goto cleanup; - default: - PyErr_SetString(PyExc_IOError, - "file is not ready for reading"); - goto cleanup; - } - - /* refuse to mix with f.next() */ - if (check_iterbuffered(self)) - goto cleanup; - - if ((list = PyList_New(0)) == NULL) - goto cleanup; - - for (;;) { - Py_BEGIN_ALLOW_THREADS - nread = BZ2_bzRead(&bzerror, self->fp, - buffer+nfilled, buffersize-nfilled); - self->pos += nread; - Py_END_ALLOW_THREADS - if (bzerror == BZ_STREAM_END) { - self->size = self->pos; - self->mode = MODE_READ_EOF; - if (nread == 0) { - sizehint = 0; - break; - } - shortread = 1; - } else if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - error: - Py_DECREF(list); - list = NULL; - goto cleanup; - } - totalread += nread; - p = memchr(buffer+nfilled, '\n', 
nread); - if (!shortread && p == NULL) { - /* Need a larger buffer to fit this line */ - nfilled += nread; - buffersize *= 2; - if (buffersize > INT_MAX) { - PyErr_SetString(PyExc_OverflowError, - "line is longer than a Python string can hold"); - goto error; - } - if (big_buffer == NULL) { - /* Create the big buffer */ - big_buffer = PyBytes_FromStringAndSize( - NULL, buffersize); - if (big_buffer == NULL) - goto error; - buffer = PyBytes_AS_STRING(big_buffer); - memcpy(buffer, small_buffer, nfilled); - } - else { - /* Grow the big buffer */ - if (_PyBytes_Resize(&big_buffer, buffersize) < 0){ - big_buffer = NULL; - goto error; - } - buffer = PyBytes_AS_STRING(big_buffer); - } - continue; - } - end = buffer+nfilled+nread; - q = buffer; - while (p != NULL) { - /* Process complete lines */ - p++; - line = PyBytes_FromStringAndSize(q, p-q); - if (line == NULL) - goto error; - err = PyList_Append(list, line); - Py_DECREF(line); - if (err != 0) - goto error; - q = p; - p = memchr(q, '\n', end-q); - } - /* Move the remaining incomplete line to the start */ - nfilled = end-q; - memmove(buffer, q, nfilled); - if (sizehint > 0) - if (totalread >= (size_t)sizehint) - break; - if (shortread) { - sizehint = 0; - break; - } - } - if (nfilled != 0) { - /* Partial last line */ - line = PyBytes_FromStringAndSize(buffer, nfilled); - if (line == NULL) - goto error; - if (sizehint > 0) { - /* Need to complete the last line */ - PyObject *rest = Util_GetLine(self, 0); - if (rest == NULL) { - Py_DECREF(line); - goto error; - } - PyBytes_Concat(&line, rest); - Py_DECREF(rest); - if (line == NULL) - goto error; - } - err = PyList_Append(list, line); - Py_DECREF(line); - if (err != 0) - goto error; - } - - cleanup: - RELEASE_LOCK(self); - if (big_buffer) { - Py_DECREF(big_buffer); - } - return list; -} - -PyDoc_STRVAR(BZ2File_write__doc__, -"write(data) -> None\n\ -\n\ -Write the 'data' string to file. 
Note that due to buffering, close() may\n\ -be needed before the file on disk reflects the data written.\n\ -"); - -/* This is a hacked version of Python's fileobject.c:file_write(). */ -static PyObject * -BZ2File_write(BZ2FileObject *self, PyObject *args) -{ - PyObject *ret = NULL; - Py_buffer pbuf; - char *buf; - int len; - int bzerror; - - if (!PyArg_ParseTuple(args, "y*:write", &pbuf)) - return NULL; - buf = pbuf.buf; - len = pbuf.len; - - ACQUIRE_LOCK(self); - switch (self->mode) { - case MODE_WRITE: - break; - - case MODE_CLOSED: - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - goto cleanup; - - default: - PyErr_SetString(PyExc_IOError, - "file is not ready for writing"); - goto cleanup; - } - - Py_BEGIN_ALLOW_THREADS - BZ2_bzWrite (&bzerror, self->fp, buf, len); - self->pos += len; - Py_END_ALLOW_THREADS - - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto cleanup; - } - - Py_INCREF(Py_None); - ret = Py_None; - -cleanup: - PyBuffer_Release(&pbuf); - RELEASE_LOCK(self); - return ret; -} - -PyDoc_STRVAR(BZ2File_writelines__doc__, -"writelines(sequence_of_strings) -> None\n\ -\n\ -Write the sequence of strings to the file. Note that newlines are not\n\ -added. The sequence can be any iterable object producing strings. This is\n\ -equivalent to calling write() for each string.\n\ -"); - -/* This is a hacked version of Python's fileobject.c:file_writelines(). 
*/ -static PyObject * -BZ2File_writelines(BZ2FileObject *self, PyObject *seq) -{ -#define CHUNKSIZE 1000 - PyObject *list = NULL; - PyObject *iter = NULL; - PyObject *ret = NULL; - PyObject *line; - int i, j, index, len, islist; - int bzerror; - - ACQUIRE_LOCK(self); - switch (self->mode) { - case MODE_WRITE: - break; - - case MODE_CLOSED: - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - goto error; - - default: - PyErr_SetString(PyExc_IOError, - "file is not ready for writing"); - goto error; - } - - islist = PyList_Check(seq); - if (!islist) { - iter = PyObject_GetIter(seq); - if (iter == NULL) { - PyErr_SetString(PyExc_TypeError, - "writelines() requires an iterable argument"); - goto error; - } - list = PyList_New(CHUNKSIZE); - if (list == NULL) - goto error; - } - - /* Strategy: slurp CHUNKSIZE lines into a private list, - checking that they are all strings, then write that list - without holding the interpreter lock, then come back for more. */ - for (index = 0; ; index += CHUNKSIZE) { - if (islist) { - Py_XDECREF(list); - list = PyList_GetSlice(seq, index, index+CHUNKSIZE); - if (list == NULL) - goto error; - j = PyList_GET_SIZE(list); - } - else { - for (j = 0; j < CHUNKSIZE; j++) { - line = PyIter_Next(iter); - if (line == NULL) { - if (PyErr_Occurred()) - goto error; - break; - } - PyList_SetItem(list, j, line); - } - } - if (j == 0) - break; - - /* Check that all entries are indeed byte strings. If not, - apply the same rules as for file.write() and - convert the rets to strings. This is slow, but - seems to be the only way since all conversion APIs - could potentially execute Python code. 
*/ - for (i = 0; i < j; i++) { - PyObject *v = PyList_GET_ITEM(list, i); - if (!PyBytes_Check(v)) { - const char *buffer; - Py_ssize_t len; - if (PyObject_AsCharBuffer(v, &buffer, &len)) { - PyErr_SetString(PyExc_TypeError, - "writelines() " - "argument must be " - "a sequence of " - "bytes objects"); - goto error; - } - line = PyBytes_FromStringAndSize(buffer, - len); - if (line == NULL) - goto error; - Py_DECREF(v); - PyList_SET_ITEM(list, i, line); - } - } - - /* Since we are releasing the global lock, the - following code may *not* execute Python code. */ - Py_BEGIN_ALLOW_THREADS - for (i = 0; i < j; i++) { - line = PyList_GET_ITEM(list, i); - len = PyBytes_GET_SIZE(line); - BZ2_bzWrite (&bzerror, self->fp, - PyBytes_AS_STRING(line), len); - if (bzerror != BZ_OK) { - Py_BLOCK_THREADS - Util_CatchBZ2Error(bzerror); - goto error; - } - } - Py_END_ALLOW_THREADS - - if (j < CHUNKSIZE) - break; - } - - Py_INCREF(Py_None); - ret = Py_None; - - error: - RELEASE_LOCK(self); - Py_XDECREF(list); - Py_XDECREF(iter); - return ret; -#undef CHUNKSIZE -} - -PyDoc_STRVAR(BZ2File_seek__doc__, -"seek(offset [, whence]) -> None\n\ -\n\ -Move to new file position. Argument offset is a byte count. 
Optional\n\ -argument whence defaults to 0 (offset from start of file, offset\n\ -should be >= 0); other values are 1 (move relative to current position,\n\ -positive or negative), and 2 (move relative to end of file, usually\n\ -negative, although many platforms allow seeking beyond the end of a file).\n\ -\n\ -Note that seeking of bz2 files is emulated, and depending on the parameters\n\ -the operation may be extremely slow.\n\ -"); - -static PyObject * -BZ2File_seek(BZ2FileObject *self, PyObject *args) -{ - int where = 0; - PyObject *offobj; - Py_off_t offset; - char small_buffer[SMALLCHUNK]; - char *buffer = small_buffer; - size_t buffersize = SMALLCHUNK; - Py_off_t bytesread = 0; - size_t readsize; - int chunksize; - int bzerror; - PyObject *ret = NULL; - - if (!PyArg_ParseTuple(args, "O|i:seek", &offobj, &where)) - return NULL; -#if !defined(HAVE_LARGEFILE_SUPPORT) - offset = PyLong_AsLong(offobj); -#else - offset = PyLong_Check(offobj) ? - PyLong_AsLongLong(offobj) : PyLong_AsLong(offobj); -#endif - if (PyErr_Occurred()) - return NULL; - - ACQUIRE_LOCK(self); - Util_DropReadAhead(self); - switch (self->mode) { - case MODE_READ: - case MODE_READ_EOF: - break; - - case MODE_CLOSED: - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - goto cleanup; - - default: - PyErr_SetString(PyExc_IOError, - "seek works only while reading"); - goto cleanup; - } - - if (where == 2) { - if (self->size == -1) { - assert(self->mode != MODE_READ_EOF); - for (;;) { - Py_BEGIN_ALLOW_THREADS - chunksize = BZ2_bzRead(&bzerror, self->fp, - buffer, buffersize); - self->pos += chunksize; - Py_END_ALLOW_THREADS - - bytesread += chunksize; - if (bzerror == BZ_STREAM_END) { - break; - } else if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto cleanup; - } - } - self->mode = MODE_READ_EOF; - self->size = self->pos; - bytesread = 0; - } - offset = self->size + offset; - } else if (where == 1) { - offset = self->pos + offset; - } - - /* Before getting here, 
offset must be the absolute position the file - * pointer should be set to. */ - - if (offset >= self->pos) { - /* we can move forward */ - offset -= self->pos; - } else { - /* we cannot move back, so rewind the stream */ - BZ2_bzReadClose(&bzerror, self->fp); - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto cleanup; - } - rewind(self->rawfp); - self->pos = 0; - self->fp = BZ2_bzReadOpen(&bzerror, self->rawfp, - 0, 0, NULL, 0); - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto cleanup; - } - self->mode = MODE_READ; - } - - if (offset <= 0 || self->mode == MODE_READ_EOF) - goto exit; - - /* Before getting here, offset must be set to the number of bytes - * to walk forward. */ - for (;;) { - if (offset-bytesread > buffersize) - readsize = buffersize; - else - /* offset might be wider that readsize, but the result - * of the subtraction is bound by buffersize (see the - * condition above). buffersize is 8192. */ - readsize = (size_t)(offset-bytesread); - Py_BEGIN_ALLOW_THREADS - chunksize = BZ2_bzRead(&bzerror, self->fp, buffer, readsize); - self->pos += chunksize; - Py_END_ALLOW_THREADS - bytesread += chunksize; - if (bzerror == BZ_STREAM_END) { - self->size = self->pos; - self->mode = MODE_READ_EOF; - break; - } else if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto cleanup; - } - if (bytesread == offset) - break; - } - -exit: - Py_INCREF(Py_None); - ret = Py_None; - -cleanup: - RELEASE_LOCK(self); - return ret; -} - -PyDoc_STRVAR(BZ2File_tell__doc__, -"tell() -> int\n\ -\n\ -Return the current file position, an integer (may be a long integer).\n\ -"); - -static PyObject * -BZ2File_tell(BZ2FileObject *self, PyObject *args) -{ - PyObject *ret = NULL; - - if (self->mode == MODE_CLOSED) { - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - goto cleanup; - } - -#if !defined(HAVE_LARGEFILE_SUPPORT) - ret = PyLong_FromLong(self->pos); -#else - ret = PyLong_FromLongLong(self->pos); -#endif - -cleanup: - return 
ret; -} - -PyDoc_STRVAR(BZ2File_close__doc__, -"close() -> None or (perhaps) an integer\n\ -\n\ -Close the file. Sets data attribute .closed to true. A closed file\n\ -cannot be used for further I/O operations. close() may be called more\n\ -than once without error.\n\ -"); - -static PyObject * -BZ2File_close(BZ2FileObject *self) -{ - PyObject *ret = NULL; - int bzerror = BZ_OK; - - if (self->mode == MODE_CLOSED) { - Py_RETURN_NONE; - } - - ACQUIRE_LOCK(self); - switch (self->mode) { - case MODE_READ: - case MODE_READ_EOF: - BZ2_bzReadClose(&bzerror, self->fp); - break; - case MODE_WRITE: - BZ2_bzWriteClose(&bzerror, self->fp, - 0, NULL, NULL); - break; - } - self->mode = MODE_CLOSED; - fclose(self->rawfp); - self->rawfp = NULL; - if (bzerror == BZ_OK) { - Py_INCREF(Py_None); - ret = Py_None; - } - else { - Util_CatchBZ2Error(bzerror); - } - - RELEASE_LOCK(self); - return ret; -} - -PyDoc_STRVAR(BZ2File_enter_doc, -"__enter__() -> self."); - -static PyObject * -BZ2File_enter(BZ2FileObject *self) -{ - if (self->mode == MODE_CLOSED) { - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - return NULL; - } - Py_INCREF(self); - return (PyObject *) self; -} - -PyDoc_STRVAR(BZ2File_exit_doc, -"__exit__(*excinfo) -> None. 
Closes the file."); - -static PyObject * -BZ2File_exit(BZ2FileObject *self, PyObject *args) -{ - PyObject *ret = PyObject_CallMethod((PyObject *) self, "close", NULL); - if (!ret) - /* If error occurred, pass through */ - return NULL; - Py_DECREF(ret); - Py_RETURN_NONE; -} - - -static PyObject *BZ2File_getiter(BZ2FileObject *self); - -static PyMethodDef BZ2File_methods[] = { - {"read", (PyCFunction)BZ2File_read, METH_VARARGS, BZ2File_read__doc__}, - {"readline", (PyCFunction)BZ2File_readline, METH_VARARGS, BZ2File_readline__doc__}, - {"readlines", (PyCFunction)BZ2File_readlines, METH_VARARGS, BZ2File_readlines__doc__}, - {"write", (PyCFunction)BZ2File_write, METH_VARARGS, BZ2File_write__doc__}, - {"writelines", (PyCFunction)BZ2File_writelines, METH_O, BZ2File_writelines__doc__}, - {"seek", (PyCFunction)BZ2File_seek, METH_VARARGS, BZ2File_seek__doc__}, - {"tell", (PyCFunction)BZ2File_tell, METH_NOARGS, BZ2File_tell__doc__}, - {"close", (PyCFunction)BZ2File_close, METH_NOARGS, BZ2File_close__doc__}, - {"__enter__", (PyCFunction)BZ2File_enter, METH_NOARGS, BZ2File_enter_doc}, - {"__exit__", (PyCFunction)BZ2File_exit, METH_VARARGS, BZ2File_exit_doc}, - {NULL, NULL} /* sentinel */ -}; - - -/* ===================================================================== */ -/* Getters and setters of BZ2File. */ - -static PyObject * -BZ2File_get_closed(BZ2FileObject *self, void *closure) -{ - return PyLong_FromLong(self->mode == MODE_CLOSED); -} - -static PyGetSetDef BZ2File_getset[] = { - {"closed", (getter)BZ2File_get_closed, NULL, - "True if the file is closed"}, - {NULL} /* Sentinel */ -}; - - -/* ===================================================================== */ -/* Slot definitions for BZ2File_Type. 
*/ - -static int -BZ2File_init(BZ2FileObject *self, PyObject *args, PyObject *kwargs) -{ - static char *kwlist[] = {"filename", "mode", "buffering", - "compresslevel", 0}; - PyObject *name_obj = NULL; - char *name; - char *mode = "r"; - int buffering = -1; - int compresslevel = 9; - int bzerror; - int mode_char = 0; - - self->size = -1; - - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O&|sii:BZ2File", - kwlist, PyUnicode_FSConverter, &name_obj, - &mode, &buffering, - &compresslevel)) + if (!PyArg_ParseTuple(args, "|i:BZ2Compressor", &compresslevel)) return -1; - - name = PyBytes_AsString(name_obj); - if (compresslevel < 1 || compresslevel > 9) { + if (!(1 <= compresslevel && compresslevel <= 9)) { PyErr_SetString(PyExc_ValueError, "compresslevel must be between 1 and 9"); - Py_DECREF(name_obj); return -1; } - for (;;) { - int error = 0; - switch (*mode) { - case 'r': - case 'w': - if (mode_char) - error = 1; - mode_char = *mode; - break; - - case 'b': - break; - - default: - error = 1; - break; - } - if (error) { - PyErr_Format(PyExc_ValueError, - "invalid mode char %c", *mode); - Py_DECREF(name_obj); - return -1; - } - mode++; - if (*mode == '\0') - break; - } - - if (mode_char == 0) { - mode_char = 'r'; - } - - mode = (mode_char == 'r') ? 
"rb" : "wb"; - - self->rawfp = fopen(name, mode); - Py_DECREF(name_obj); - if (self->rawfp == NULL) { - PyErr_SetFromErrno(PyExc_IOError); - return -1; - } - /* XXX Ignore buffering */ - - /* From now on, we have stuff to dealloc, so jump to error label - * instead of returning */ - #ifdef WITH_THREAD self->lock = PyThread_allocate_lock(); - if (!self->lock) { - PyErr_SetString(PyExc_MemoryError, "unable to allocate lock"); - goto error; + if (self->lock == NULL) { + PyErr_SetString(PyExc_MemoryError, "Unable to allocate lock"); + return -1; } #endif - if (mode_char == 'r') - self->fp = BZ2_bzReadOpen(&bzerror, self->rawfp, - 0, 0, NULL, 0); - else - self->fp = BZ2_bzWriteOpen(&bzerror, self->rawfp, - compresslevel, 0, 0); - - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); + bzerror = BZ2_bzCompressInit(&self->bzs, compresslevel, 0, 0); + if (catch_bz2_error(bzerror)) goto error; - } - - self->mode = (mode_char == 'r') ? MODE_READ : MODE_WRITE; return 0; error: - fclose(self->rawfp); - self->rawfp = NULL; #ifdef WITH_THREAD - if (self->lock) { - PyThread_free_lock(self->lock); - self->lock = NULL; - } + PyThread_free_lock(self->lock); + self->lock = NULL; #endif return -1; } static void -BZ2File_dealloc(BZ2FileObject *self) +BZ2Compressor_dealloc(BZ2Compressor *self) { - int bzerror; + BZ2_bzCompressEnd(&self->bzs); #ifdef WITH_THREAD - if (self->lock) + if (self->lock != NULL) PyThread_free_lock(self->lock); #endif - switch (self->mode) { - case MODE_READ: - case MODE_READ_EOF: - BZ2_bzReadClose(&bzerror, self->fp); - break; - case MODE_WRITE: - BZ2_bzWriteClose(&bzerror, self->fp, - 0, NULL, NULL); - break; - } - Util_DropReadAhead(self); - if (self->rawfp != NULL) - fclose(self->rawfp); Py_TYPE(self)->tp_free((PyObject *)self); } -/* This is a hacked version of Python's fileobject.c:file_getiter(). 
*/ -static PyObject * -BZ2File_getiter(BZ2FileObject *self) -{ - if (self->mode == MODE_CLOSED) { - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - return NULL; - } - Py_INCREF((PyObject*)self); - return (PyObject *)self; -} - -/* This is a hacked version of Python's fileobject.c:file_iternext(). */ -#define READAHEAD_BUFSIZE 8192 -static PyObject * -BZ2File_iternext(BZ2FileObject *self) -{ - PyBytesObject* ret; - ACQUIRE_LOCK(self); - if (self->mode == MODE_CLOSED) { - RELEASE_LOCK(self); - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - return NULL; - } - ret = Util_ReadAheadGetLineSkip(self, 0, READAHEAD_BUFSIZE); - RELEASE_LOCK(self); - if (ret == NULL || PyBytes_GET_SIZE(ret) == 0) { - Py_XDECREF(ret); - return NULL; - } - return (PyObject *)ret; -} - -/* ===================================================================== */ -/* BZ2File_Type definition. */ - -PyDoc_VAR(BZ2File__doc__) = -PyDoc_STR( -"BZ2File(name [, mode='r', buffering=0, compresslevel=9]) -> file object\n\ -\n\ -Open a bz2 file. The mode can be 'r' or 'w', for reading (default) or\n\ -writing. When opened for writing, the file will be created if it doesn't\n\ -exist, and truncated otherwise. If the buffering argument is given, 0 means\n\ -unbuffered, and larger numbers specify the buffer size. 
If compresslevel\n\ -is given, must be a number between 1 and 9.\n\ -Data read is always returned in bytes; data written ought to be bytes.\n\ -"); - -static PyTypeObject BZ2File_Type = { - PyVarObject_HEAD_INIT(NULL, 0) - "bz2.BZ2File", /*tp_name*/ - sizeof(BZ2FileObject), /*tp_basicsize*/ - 0, /*tp_itemsize*/ - (destructor)BZ2File_dealloc, /*tp_dealloc*/ - 0, /*tp_print*/ - 0, /*tp_getattr*/ - 0, /*tp_setattr*/ - 0, /*tp_reserved*/ - 0, /*tp_repr*/ - 0, /*tp_as_number*/ - 0, /*tp_as_sequence*/ - 0, /*tp_as_mapping*/ - 0, /*tp_hash*/ - 0, /*tp_call*/ - 0, /*tp_str*/ - PyObject_GenericGetAttr,/*tp_getattro*/ - PyObject_GenericSetAttr,/*tp_setattro*/ - 0, /*tp_as_buffer*/ - Py_TPFLAGS_DEFAULT|Py_TPFLAGS_BASETYPE, /*tp_flags*/ - BZ2File__doc__, /*tp_doc*/ - 0, /*tp_traverse*/ - 0, /*tp_clear*/ - 0, /*tp_richcompare*/ - 0, /*tp_weaklistoffset*/ - (getiterfunc)BZ2File_getiter, /*tp_iter*/ - (iternextfunc)BZ2File_iternext, /*tp_iternext*/ - BZ2File_methods, /*tp_methods*/ - 0, /*tp_members*/ - BZ2File_getset, /*tp_getset*/ - 0, /*tp_base*/ - 0, /*tp_dict*/ - 0, /*tp_descr_get*/ - 0, /*tp_descr_set*/ - 0, /*tp_dictoffset*/ - (initproc)BZ2File_init, /*tp_init*/ - PyType_GenericAlloc, /*tp_alloc*/ - PyType_GenericNew, /*tp_new*/ - PyObject_Free, /*tp_free*/ - 0, /*tp_is_gc*/ +static PyMethodDef BZ2Compressor_methods[] = { + {"compress", (PyCFunction)BZ2Compressor_compress, METH_VARARGS, + BZ2Compressor_compress__doc__}, + {"flush", (PyCFunction)BZ2Compressor_flush, METH_NOARGS, + BZ2Compressor_flush__doc__}, + {NULL} }; +PyDoc_STRVAR(BZ2Compressor__doc__, +"BZ2Compressor(compresslevel=9)\n" +"\n" +"Create a compressor object for compressing data incrementally.\n" +"\n" +"compresslevel, if given, must be a number between 1 and 9.\n" +"\n" +"For one-shot compression, use the compress() function instead.\n"); -/* ===================================================================== */ -/* Methods of BZ2Comp. 
*/ +static PyTypeObject BZ2Compressor_Type = { + PyVarObject_HEAD_INIT(NULL, 0) + "_bz2.BZ2Compressor", /* tp_name */ + sizeof(BZ2Compressor), /* tp_basicsize */ + 0, /* tp_itemsize */ + (destructor)BZ2Compressor_dealloc, /* tp_dealloc */ + 0, /* tp_print */ + 0, /* tp_getattr */ + 0, /* tp_setattr */ + 0, /* tp_reserved */ + 0, /* tp_repr */ + 0, /* tp_as_number */ + 0, /* tp_as_sequence */ + 0, /* tp_as_mapping */ + 0, /* tp_hash */ + 0, /* tp_call */ + 0, /* tp_str */ + 0, /* tp_getattro */ + 0, /* tp_setattro */ + 0, /* tp_as_buffer */ + Py_TPFLAGS_DEFAULT, /* tp_flags */ + BZ2Compressor__doc__, /* tp_doc */ + 0, /* tp_traverse */ + 0, /* tp_clear */ + 0, /* tp_richcompare */ + 0, /* tp_weaklistoffset */ + 0, /* tp_iter */ + 0, /* tp_iternext */ + BZ2Compressor_methods, /* tp_methods */ + 0, /* tp_members */ + 0, /* tp_getset */ + 0, /* tp_base */ + 0, /* tp_dict */ + 0, /* tp_descr_get */ + 0, /* tp_descr_set */ + 0, /* tp_dictoffset */ + (initproc)BZ2Compressor_init, /* tp_init */ + 0, /* tp_alloc */ + PyType_GenericNew, /* tp_new */ +}; -PyDoc_STRVAR(BZ2Comp_compress__doc__, -"compress(data) -> string\n\ -\n\ -Provide more data to the compressor object. It will return chunks of\n\ -compressed data whenever possible. When you've finished providing data\n\ -to compress, call the flush() method to finish the compression process,\n\ -and return what is left in the internal buffers.\n\ -"); + +/* BZ2Decompressor class. 
*/ static PyObject * -BZ2Comp_compress(BZ2CompObject *self, PyObject *args) +decompress(BZ2Decompressor *d, char *data, size_t len) { - Py_buffer pdata; - char *data; - int datasize; - int bufsize = SMALLCHUNK; - PY_LONG_LONG totalout; - PyObject *ret = NULL; - bz_stream *bzs = &self->bzs; - int bzerror; + size_t data_size = 0; + PyObject *result; - if (!PyArg_ParseTuple(args, "y*:compress", &pdata)) - return NULL; - data = pdata.buf; - datasize = pdata.len; + result = PyBytes_FromStringAndSize(NULL, SMALLCHUNK); + if (result == NULL) + return result; + d->bzs.next_in = data; + /* FIXME This is not 64-bit clean - avail_in is an int. */ + d->bzs.avail_in = len; + d->bzs.next_out = PyBytes_AS_STRING(result); + d->bzs.avail_out = PyBytes_GET_SIZE(result); + for (;;) { + char *this_out; + int bzerror; - if (datasize == 0) { - PyBuffer_Release(&pdata); - return PyBytes_FromStringAndSize("", 0); - } - - ACQUIRE_LOCK(self); - if (!self->running) { - PyErr_SetString(PyExc_ValueError, - "this object was already flushed"); - goto error; - } - - ret = PyBytes_FromStringAndSize(NULL, bufsize); - if (!ret) - goto error; - - bzs->next_in = data; - bzs->avail_in = datasize; - bzs->next_out = BUF(ret); - bzs->avail_out = bufsize; - - totalout = BZS_TOTAL_OUT(bzs); - - for (;;) { Py_BEGIN_ALLOW_THREADS - bzerror = BZ2_bzCompress(bzs, BZ_RUN); + this_out = d->bzs.next_out; + bzerror = BZ2_bzDecompress(&d->bzs); + data_size += d->bzs.next_out - this_out; Py_END_ALLOW_THREADS - if (bzerror != BZ_RUN_OK) { - Util_CatchBZ2Error(bzerror); + if (catch_bz2_error(bzerror)) goto error; + if (bzerror == BZ_STREAM_END) { + d->eof = 1; + if (d->bzs.avail_in > 0) { /* Save leftover input to unused_data */ + Py_CLEAR(d->unused_data); + d->unused_data = PyBytes_FromStringAndSize(d->bzs.next_in, + d->bzs.avail_in); + if (d->unused_data == NULL) + goto error; + } + break; } - if (bzs->avail_in == 0) - break; /* no more input data */ - if (bzs->avail_out == 0) { - bufsize = 
Util_NewBufferSize(bufsize); - if (_PyBytes_Resize(&ret, bufsize) < 0) { - BZ2_bzCompressEnd(bzs); + if (d->bzs.avail_in == 0) + break; + if (d->bzs.avail_out == 0) { + if (grow_buffer(&result) < 0) goto error; - } - bzs->next_out = BUF(ret) + (BZS_TOTAL_OUT(bzs) - - totalout); - bzs->avail_out = bufsize - (bzs->next_out - BUF(ret)); + d->bzs.next_out = PyBytes_AS_STRING(result) + data_size; + d->bzs.avail_out = PyBytes_GET_SIZE(result) - data_size; } } - - if (_PyBytes_Resize(&ret, - (Py_ssize_t)(BZS_TOTAL_OUT(bzs) - totalout)) < 0) - goto error; - - RELEASE_LOCK(self); - PyBuffer_Release(&pdata); - return ret; + if (data_size != PyBytes_GET_SIZE(result)) + if (_PyBytes_Resize(&result, data_size) < 0) + goto error; + return result; error: - RELEASE_LOCK(self); - PyBuffer_Release(&pdata); - Py_XDECREF(ret); + Py_XDECREF(result); return NULL; } -PyDoc_STRVAR(BZ2Comp_flush__doc__, -"flush() -> string\n\ -\n\ -Finish the compression process and return what is left in internal buffers.\n\ -You must not use the compressor object after calling this method.\n\ -"); +PyDoc_STRVAR(BZ2Decompressor_decompress__doc__, +"decompress(data) -> bytes\n" +"\n" +"Provide data to the decompressor object. Returns a chunk of\n" +"decompressed data if possible, or b'' otherwise.\n" +"\n" +"Attempting to decompress data after the end of stream is reached\n" +"raises an EOFError. 
Any data found after the end of the stream\n" +"is ignored and saved in the unused_data attribute.\n"); static PyObject * -BZ2Comp_flush(BZ2CompObject *self) +BZ2Decompressor_decompress(BZ2Decompressor *self, PyObject *args) { - int bufsize = SMALLCHUNK; - PyObject *ret = NULL; - bz_stream *bzs = &self->bzs; - PY_LONG_LONG totalout; - int bzerror; + Py_buffer buffer; + PyObject *result = NULL; + + if (!PyArg_ParseTuple(args, "y*:decompress", &buffer)) + return NULL; ACQUIRE_LOCK(self); - if (!self->running) { - PyErr_SetString(PyExc_ValueError, "object was already " - "flushed"); - goto error; - } - self->running = 0; - - ret = PyBytes_FromStringAndSize(NULL, bufsize); - if (!ret) - goto error; - - bzs->next_out = BUF(ret); - bzs->avail_out = bufsize; - - totalout = BZS_TOTAL_OUT(bzs); - - for (;;) { - Py_BEGIN_ALLOW_THREADS - bzerror = BZ2_bzCompress(bzs, BZ_FINISH); - Py_END_ALLOW_THREADS - if (bzerror == BZ_STREAM_END) { - break; - } else if (bzerror != BZ_FINISH_OK) { - Util_CatchBZ2Error(bzerror); - goto error; - } - if (bzs->avail_out == 0) { - bufsize = Util_NewBufferSize(bufsize); - if (_PyBytes_Resize(&ret, bufsize) < 0) - goto error; - bzs->next_out = BUF(ret); - bzs->next_out = BUF(ret) + (BZS_TOTAL_OUT(bzs) - - totalout); - bzs->avail_out = bufsize - (bzs->next_out - BUF(ret)); - } - } - - if (bzs->avail_out != 0) { - if (_PyBytes_Resize(&ret, - (Py_ssize_t)(BZS_TOTAL_OUT(bzs) - totalout)) < 0) - goto error; - } - + if (self->eof) + PyErr_SetString(PyExc_EOFError, "End of stream already reached"); + else + result = decompress(self, buffer.buf, buffer.len); RELEASE_LOCK(self); - return ret; - -error: - RELEASE_LOCK(self); - Py_XDECREF(ret); - return NULL; + PyBuffer_Release(&buffer); + return result; } -static PyMethodDef BZ2Comp_methods[] = { - {"compress", (PyCFunction)BZ2Comp_compress, METH_VARARGS, - BZ2Comp_compress__doc__}, - {"flush", (PyCFunction)BZ2Comp_flush, METH_NOARGS, - BZ2Comp_flush__doc__}, - {NULL, NULL} /* sentinel */ -}; - - -/* 
===================================================================== */ -/* Slot definitions for BZ2Comp_Type. */ - static int -BZ2Comp_init(BZ2CompObject *self, PyObject *args, PyObject *kwargs) -{ - int compresslevel = 9; - int bzerror; - static char *kwlist[] = {"compresslevel", 0}; - - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|i:BZ2Compressor", - kwlist, &compresslevel)) - return -1; - - if (compresslevel < 1 || compresslevel > 9) { - PyErr_SetString(PyExc_ValueError, - "compresslevel must be between 1 and 9"); - goto error; - } - -#ifdef WITH_THREAD - self->lock = PyThread_allocate_lock(); - if (!self->lock) { - PyErr_SetString(PyExc_MemoryError, "unable to allocate lock"); - goto error; - } -#endif - - memset(&self->bzs, 0, sizeof(bz_stream)); - bzerror = BZ2_bzCompressInit(&self->bzs, compresslevel, 0, 0); - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto error; - } - - self->running = 1; - - return 0; -error: -#ifdef WITH_THREAD - if (self->lock) { - PyThread_free_lock(self->lock); - self->lock = NULL; - } -#endif - return -1; -} - -static void -BZ2Comp_dealloc(BZ2CompObject *self) -{ -#ifdef WITH_THREAD - if (self->lock) - PyThread_free_lock(self->lock); -#endif - BZ2_bzCompressEnd(&self->bzs); - Py_TYPE(self)->tp_free((PyObject *)self); -} - - -/* ===================================================================== */ -/* BZ2Comp_Type definition. */ - -PyDoc_STRVAR(BZ2Comp__doc__, -"BZ2Compressor([compresslevel=9]) -> compressor object\n\ -\n\ -Create a new compressor object. This object may be used to compress\n\ -data sequentially. If you want to compress data in one shot, use the\n\ -compress() function instead. 
The compresslevel parameter, if given,\n\ -must be a number between 1 and 9.\n\ -"); - -static PyTypeObject BZ2Comp_Type = { - PyVarObject_HEAD_INIT(NULL, 0) - "bz2.BZ2Compressor", /*tp_name*/ - sizeof(BZ2CompObject), /*tp_basicsize*/ - 0, /*tp_itemsize*/ - (destructor)BZ2Comp_dealloc, /*tp_dealloc*/ - 0, /*tp_print*/ - 0, /*tp_getattr*/ - 0, /*tp_setattr*/ - 0, /*tp_reserved*/ - 0, /*tp_repr*/ - 0, /*tp_as_number*/ - 0, /*tp_as_sequence*/ - 0, /*tp_as_mapping*/ - 0, /*tp_hash*/ - 0, /*tp_call*/ - 0, /*tp_str*/ - PyObject_GenericGetAttr,/*tp_getattro*/ - PyObject_GenericSetAttr,/*tp_setattro*/ - 0, /*tp_as_buffer*/ - Py_TPFLAGS_DEFAULT|Py_TPFLAGS_BASETYPE, /*tp_flags*/ - BZ2Comp__doc__, /*tp_doc*/ - 0, /*tp_traverse*/ - 0, /*tp_clear*/ - 0, /*tp_richcompare*/ - 0, /*tp_weaklistoffset*/ - 0, /*tp_iter*/ - 0, /*tp_iternext*/ - BZ2Comp_methods, /*tp_methods*/ - 0, /*tp_members*/ - 0, /*tp_getset*/ - 0, /*tp_base*/ - 0, /*tp_dict*/ - 0, /*tp_descr_get*/ - 0, /*tp_descr_set*/ - 0, /*tp_dictoffset*/ - (initproc)BZ2Comp_init, /*tp_init*/ - PyType_GenericAlloc, /*tp_alloc*/ - PyType_GenericNew, /*tp_new*/ - PyObject_Free, /*tp_free*/ - 0, /*tp_is_gc*/ -}; - - -/* ===================================================================== */ -/* Members of BZ2Decomp. */ - -#undef OFF -#define OFF(x) offsetof(BZ2DecompObject, x) - -static PyMemberDef BZ2Decomp_members[] = { - {"unused_data", T_OBJECT, OFF(unused_data), READONLY}, - {NULL} /* Sentinel */ -}; - - -/* ===================================================================== */ -/* Methods of BZ2Decomp. */ - -PyDoc_STRVAR(BZ2Decomp_decompress__doc__, -"decompress(data) -> string\n\ -\n\ -Provide more data to the decompressor object. It will return chunks\n\ -of decompressed data whenever possible. If you try to decompress data\n\ -after the end of stream is found, EOFError will be raised. 
If any data\n\ -was found after the end of stream, it'll be ignored and saved in\n\ -unused_data attribute.\n\ -"); - -static PyObject * -BZ2Decomp_decompress(BZ2DecompObject *self, PyObject *args) -{ - Py_buffer pdata; - char *data; - int datasize; - int bufsize = SMALLCHUNK; - PY_LONG_LONG totalout; - PyObject *ret = NULL; - bz_stream *bzs = &self->bzs; - int bzerror; - - if (!PyArg_ParseTuple(args, "y*:decompress", &pdata)) - return NULL; - data = pdata.buf; - datasize = pdata.len; - - ACQUIRE_LOCK(self); - if (!self->running) { - PyErr_SetString(PyExc_EOFError, "end of stream was " - "already found"); - goto error; - } - - ret = PyBytes_FromStringAndSize(NULL, bufsize); - if (!ret) - goto error; - - bzs->next_in = data; - bzs->avail_in = datasize; - bzs->next_out = BUF(ret); - bzs->avail_out = bufsize; - - totalout = BZS_TOTAL_OUT(bzs); - - for (;;) { - Py_BEGIN_ALLOW_THREADS - bzerror = BZ2_bzDecompress(bzs); - Py_END_ALLOW_THREADS - if (bzerror == BZ_STREAM_END) { - if (bzs->avail_in != 0) { - Py_DECREF(self->unused_data); - self->unused_data = - PyBytes_FromStringAndSize(bzs->next_in, - bzs->avail_in); - } - self->running = 0; - break; - } - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto error; - } - if (bzs->avail_in == 0) - break; /* no more input data */ - if (bzs->avail_out == 0) { - bufsize = Util_NewBufferSize(bufsize); - if (_PyBytes_Resize(&ret, bufsize) < 0) { - BZ2_bzDecompressEnd(bzs); - goto error; - } - bzs->next_out = BUF(ret); - bzs->next_out = BUF(ret) + (BZS_TOTAL_OUT(bzs) - - totalout); - bzs->avail_out = bufsize - (bzs->next_out - BUF(ret)); - } - } - - if (bzs->avail_out != 0) { - if (_PyBytes_Resize(&ret, - (Py_ssize_t)(BZS_TOTAL_OUT(bzs) - totalout)) < 0) - goto error; - } - - RELEASE_LOCK(self); - PyBuffer_Release(&pdata); - return ret; - -error: - RELEASE_LOCK(self); - PyBuffer_Release(&pdata); - Py_XDECREF(ret); - return NULL; -} - -static PyMethodDef BZ2Decomp_methods[] = { - {"decompress", 
(PyCFunction)BZ2Decomp_decompress, METH_VARARGS, BZ2Decomp_decompress__doc__}, - {NULL, NULL} /* sentinel */ -}; - - -/* ===================================================================== */ -/* Slot definitions for BZ2Decomp_Type. */ - -static int -BZ2Decomp_init(BZ2DecompObject *self, PyObject *args, PyObject *kwargs) +BZ2Decompressor_init(BZ2Decompressor *self, PyObject *args, PyObject *kwargs) { int bzerror; @@ -1825,325 +438,120 @@ #ifdef WITH_THREAD self->lock = PyThread_allocate_lock(); - if (!self->lock) { - PyErr_SetString(PyExc_MemoryError, "unable to allocate lock"); - goto error; + if (self->lock == NULL) { + PyErr_SetString(PyExc_MemoryError, "Unable to allocate lock"); + return -1; } #endif self->unused_data = PyBytes_FromStringAndSize("", 0); - if (!self->unused_data) + if (self->unused_data == NULL) goto error; - memset(&self->bzs, 0, sizeof(bz_stream)); bzerror = BZ2_bzDecompressInit(&self->bzs, 0, 0); - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); + if (catch_bz2_error(bzerror)) goto error; - } - - self->running = 1; return 0; error: + Py_CLEAR(self->unused_data); #ifdef WITH_THREAD - if (self->lock) { - PyThread_free_lock(self->lock); - self->lock = NULL; - } + PyThread_free_lock(self->lock); + self->lock = NULL; #endif - Py_CLEAR(self->unused_data); return -1; } static void -BZ2Decomp_dealloc(BZ2DecompObject *self) +BZ2Decompressor_dealloc(BZ2Decompressor *self) { + BZ2_bzDecompressEnd(&self->bzs); + Py_CLEAR(self->unused_data); #ifdef WITH_THREAD - if (self->lock) + if (self->lock != NULL) PyThread_free_lock(self->lock); #endif - Py_XDECREF(self->unused_data); - BZ2_bzDecompressEnd(&self->bzs); Py_TYPE(self)->tp_free((PyObject *)self); } - -/* ===================================================================== */ -/* BZ2Decomp_Type definition. */ - -PyDoc_STRVAR(BZ2Decomp__doc__, -"BZ2Decompressor() -> decompressor object\n\ -\n\ -Create a new decompressor object. This object may be used to decompress\n\ -data sequentially. 
If you want to decompress data in one shot, use the\n\ -decompress() function instead.\n\ -"); - -static PyTypeObject BZ2Decomp_Type = { - PyVarObject_HEAD_INIT(NULL, 0) - "bz2.BZ2Decompressor", /*tp_name*/ - sizeof(BZ2DecompObject), /*tp_basicsize*/ - 0, /*tp_itemsize*/ - (destructor)BZ2Decomp_dealloc, /*tp_dealloc*/ - 0, /*tp_print*/ - 0, /*tp_getattr*/ - 0, /*tp_setattr*/ - 0, /*tp_reserved*/ - 0, /*tp_repr*/ - 0, /*tp_as_number*/ - 0, /*tp_as_sequence*/ - 0, /*tp_as_mapping*/ - 0, /*tp_hash*/ - 0, /*tp_call*/ - 0, /*tp_str*/ - PyObject_GenericGetAttr,/*tp_getattro*/ - PyObject_GenericSetAttr,/*tp_setattro*/ - 0, /*tp_as_buffer*/ - Py_TPFLAGS_DEFAULT|Py_TPFLAGS_BASETYPE, /*tp_flags*/ - BZ2Decomp__doc__, /*tp_doc*/ - 0, /*tp_traverse*/ - 0, /*tp_clear*/ - 0, /*tp_richcompare*/ - 0, /*tp_weaklistoffset*/ - 0, /*tp_iter*/ - 0, /*tp_iternext*/ - BZ2Decomp_methods, /*tp_methods*/ - BZ2Decomp_members, /*tp_members*/ - 0, /*tp_getset*/ - 0, /*tp_base*/ - 0, /*tp_dict*/ - 0, /*tp_descr_get*/ - 0, /*tp_descr_set*/ - 0, /*tp_dictoffset*/ - (initproc)BZ2Decomp_init, /*tp_init*/ - PyType_GenericAlloc, /*tp_alloc*/ - PyType_GenericNew, /*tp_new*/ - PyObject_Free, /*tp_free*/ - 0, /*tp_is_gc*/ +static PyMethodDef BZ2Decompressor_methods[] = { + {"decompress", (PyCFunction)BZ2Decompressor_decompress, METH_VARARGS, + BZ2Decompressor_decompress__doc__}, + {NULL} }; +PyDoc_STRVAR(BZ2Decompressor_eof__doc__, +"True if the end-of-stream marker has been reached."); -/* ===================================================================== */ -/* Module functions. */ +PyDoc_STRVAR(BZ2Decompressor_unused_data__doc__, +"Data found after the end of the compressed stream."); -PyDoc_STRVAR(bz2_compress__doc__, -"compress(data [, compresslevel=9]) -> string\n\ -\n\ -Compress data in one shot. If you want to compress data sequentially,\n\ -use an instance of BZ2Compressor instead. 
The compresslevel parameter, if\n\ -given, must be a number between 1 and 9.\n\ -"); - -static PyObject * -bz2_compress(PyObject *self, PyObject *args, PyObject *kwargs) -{ - int compresslevel=9; - Py_buffer pdata; - char *data; - int datasize; - int bufsize; - PyObject *ret = NULL; - bz_stream _bzs; - bz_stream *bzs = &_bzs; - int bzerror; - static char *kwlist[] = {"data", "compresslevel", 0}; - - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "y*|i", - kwlist, &pdata, - &compresslevel)) - return NULL; - data = pdata.buf; - datasize = pdata.len; - - if (compresslevel < 1 || compresslevel > 9) { - PyErr_SetString(PyExc_ValueError, - "compresslevel must be between 1 and 9"); - PyBuffer_Release(&pdata); - return NULL; - } - - /* Conforming to bz2 manual, this is large enough to fit compressed - * data in one shot. We will check it later anyway. */ - bufsize = datasize + (datasize/100+1) + 600; - - ret = PyBytes_FromStringAndSize(NULL, bufsize); - if (!ret) { - PyBuffer_Release(&pdata); - return NULL; - } - - memset(bzs, 0, sizeof(bz_stream)); - - bzs->next_in = data; - bzs->avail_in = datasize; - bzs->next_out = BUF(ret); - bzs->avail_out = bufsize; - - bzerror = BZ2_bzCompressInit(bzs, compresslevel, 0, 0); - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - PyBuffer_Release(&pdata); - Py_DECREF(ret); - return NULL; - } - - for (;;) { - Py_BEGIN_ALLOW_THREADS - bzerror = BZ2_bzCompress(bzs, BZ_FINISH); - Py_END_ALLOW_THREADS - if (bzerror == BZ_STREAM_END) { - break; - } else if (bzerror != BZ_FINISH_OK) { - BZ2_bzCompressEnd(bzs); - Util_CatchBZ2Error(bzerror); - PyBuffer_Release(&pdata); - Py_DECREF(ret); - return NULL; - } - if (bzs->avail_out == 0) { - bufsize = Util_NewBufferSize(bufsize); - if (_PyBytes_Resize(&ret, bufsize) < 0) { - BZ2_bzCompressEnd(bzs); - PyBuffer_Release(&pdata); - return NULL; - } - bzs->next_out = BUF(ret) + BZS_TOTAL_OUT(bzs); - bzs->avail_out = bufsize - (bzs->next_out - BUF(ret)); - } - } - - if (bzs->avail_out != 0) { - if 
(_PyBytes_Resize(&ret, (Py_ssize_t)BZS_TOTAL_OUT(bzs)) < 0) { - ret = NULL; - } - } - BZ2_bzCompressEnd(bzs); - - PyBuffer_Release(&pdata); - return ret; -} - -PyDoc_STRVAR(bz2_decompress__doc__, -"decompress(data) -> decompressed data\n\ -\n\ -Decompress data in one shot. If you want to decompress data sequentially,\n\ -use an instance of BZ2Decompressor instead.\n\ -"); - -static PyObject * -bz2_decompress(PyObject *self, PyObject *args) -{ - Py_buffer pdata; - char *data; - int datasize; - int bufsize = SMALLCHUNK; - PyObject *ret; - bz_stream _bzs; - bz_stream *bzs = &_bzs; - int bzerror; - - if (!PyArg_ParseTuple(args, "y*:decompress", &pdata)) - return NULL; - data = pdata.buf; - datasize = pdata.len; - - if (datasize == 0) { - PyBuffer_Release(&pdata); - return PyBytes_FromStringAndSize("", 0); - } - - ret = PyBytes_FromStringAndSize(NULL, bufsize); - if (!ret) { - PyBuffer_Release(&pdata); - return NULL; - } - - memset(bzs, 0, sizeof(bz_stream)); - - bzs->next_in = data; - bzs->avail_in = datasize; - bzs->next_out = BUF(ret); - bzs->avail_out = bufsize; - - bzerror = BZ2_bzDecompressInit(bzs, 0, 0); - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - Py_DECREF(ret); - PyBuffer_Release(&pdata); - return NULL; - } - - for (;;) { - Py_BEGIN_ALLOW_THREADS - bzerror = BZ2_bzDecompress(bzs); - Py_END_ALLOW_THREADS - if (bzerror == BZ_STREAM_END) { - break; - } else if (bzerror != BZ_OK) { - BZ2_bzDecompressEnd(bzs); - Util_CatchBZ2Error(bzerror); - PyBuffer_Release(&pdata); - Py_DECREF(ret); - return NULL; - } - if (bzs->avail_in == 0) { - BZ2_bzDecompressEnd(bzs); - PyErr_SetString(PyExc_ValueError, - "couldn't find end of stream"); - PyBuffer_Release(&pdata); - Py_DECREF(ret); - return NULL; - } - if (bzs->avail_out == 0) { - bufsize = Util_NewBufferSize(bufsize); - if (_PyBytes_Resize(&ret, bufsize) < 0) { - BZ2_bzDecompressEnd(bzs); - PyBuffer_Release(&pdata); - return NULL; - } - bzs->next_out = BUF(ret) + BZS_TOTAL_OUT(bzs); - bzs->avail_out = 
bufsize - (bzs->next_out - BUF(ret)); - } - } - - if (bzs->avail_out != 0) { - if (_PyBytes_Resize(&ret, (Py_ssize_t)BZS_TOTAL_OUT(bzs)) < 0) { - ret = NULL; - } - } - BZ2_bzDecompressEnd(bzs); - PyBuffer_Release(&pdata); - - return ret; -} - -static PyMethodDef bz2_methods[] = { - {"compress", (PyCFunction) bz2_compress, METH_VARARGS|METH_KEYWORDS, - bz2_compress__doc__}, - {"decompress", (PyCFunction) bz2_decompress, METH_VARARGS, - bz2_decompress__doc__}, - {NULL, NULL} /* sentinel */ +static PyMemberDef BZ2Decompressor_members[] = { + {"eof", T_BOOL, offsetof(BZ2Decompressor, eof), + READONLY, BZ2Decompressor_eof__doc__}, + {"unused_data", T_OBJECT_EX, offsetof(BZ2Decompressor, unused_data), + READONLY, BZ2Decompressor_unused_data__doc__}, + {NULL} }; -/* ===================================================================== */ -/* Initialization function. */ +PyDoc_STRVAR(BZ2Decompressor__doc__, +"BZ2Decompressor()\n" +"\n" +"Create a decompressor object for decompressing data incrementally.\n" +"\n" +"For one-shot decompression, use the decompress() function instead.\n"); -PyDoc_STRVAR(bz2__doc__, -"The python bz2 module provides a comprehensive interface for\n\ -the bz2 compression library. 
It implements a complete file\n\ -interface, one shot (de)compression functions, and types for\n\ -sequential (de)compression.\n\ -"); +static PyTypeObject BZ2Decompressor_Type = { + PyVarObject_HEAD_INIT(NULL, 0) + "_bz2.BZ2Decompressor", /* tp_name */ + sizeof(BZ2Decompressor), /* tp_basicsize */ + 0, /* tp_itemsize */ + (destructor)BZ2Decompressor_dealloc,/* tp_dealloc */ + 0, /* tp_print */ + 0, /* tp_getattr */ + 0, /* tp_setattr */ + 0, /* tp_reserved */ + 0, /* tp_repr */ + 0, /* tp_as_number */ + 0, /* tp_as_sequence */ + 0, /* tp_as_mapping */ + 0, /* tp_hash */ + 0, /* tp_call */ + 0, /* tp_str */ + 0, /* tp_getattro */ + 0, /* tp_setattro */ + 0, /* tp_as_buffer */ + Py_TPFLAGS_DEFAULT, /* tp_flags */ + BZ2Decompressor__doc__, /* tp_doc */ + 0, /* tp_traverse */ + 0, /* tp_clear */ + 0, /* tp_richcompare */ + 0, /* tp_weaklistoffset */ + 0, /* tp_iter */ + 0, /* tp_iternext */ + BZ2Decompressor_methods, /* tp_methods */ + BZ2Decompressor_members, /* tp_members */ + 0, /* tp_getset */ + 0, /* tp_base */ + 0, /* tp_dict */ + 0, /* tp_descr_get */ + 0, /* tp_descr_set */ + 0, /* tp_dictoffset */ + (initproc)BZ2Decompressor_init, /* tp_init */ + 0, /* tp_alloc */ + PyType_GenericNew, /* tp_new */ +}; -static struct PyModuleDef bz2module = { +/* Module initialization. 
*/ + +static struct PyModuleDef _bz2module = { PyModuleDef_HEAD_INIT, - "bz2", - bz2__doc__, + "_bz2", + NULL, -1, - bz2_methods, + NULL, NULL, NULL, NULL, @@ -2151,30 +559,25 @@ }; PyMODINIT_FUNC -PyInit_bz2(void) +PyInit__bz2(void) { PyObject *m; - if (PyType_Ready(&BZ2File_Type) < 0) + if (PyType_Ready(&BZ2Compressor_Type) < 0) return NULL; - if (PyType_Ready(&BZ2Comp_Type) < 0) - return NULL; - if (PyType_Ready(&BZ2Decomp_Type) < 0) + if (PyType_Ready(&BZ2Decompressor_Type) < 0) return NULL; - m = PyModule_Create(&bz2module); + m = PyModule_Create(&_bz2module); if (m == NULL) return NULL; - PyModule_AddObject(m, "__author__", PyUnicode_FromString(__author__)); + Py_INCREF(&BZ2Compressor_Type); + PyModule_AddObject(m, "BZ2Compressor", (PyObject *)&BZ2Compressor_Type); - Py_INCREF(&BZ2File_Type); - PyModule_AddObject(m, "BZ2File", (PyObject *)&BZ2File_Type); + Py_INCREF(&BZ2Decompressor_Type); + PyModule_AddObject(m, "BZ2Decompressor", + (PyObject *)&BZ2Decompressor_Type); - Py_INCREF(&BZ2Comp_Type); - PyModule_AddObject(m, "BZ2Compressor", (PyObject *)&BZ2Comp_Type); - - Py_INCREF(&BZ2Decomp_Type); - PyModule_AddObject(m, "BZ2Decompressor", (PyObject *)&BZ2Decomp_Type); return m; } diff --git a/PCbuild/bz2.vcproj b/PCbuild/_bz2.vcproj rename from PCbuild/bz2.vcproj rename to PCbuild/_bz2.vcproj --- a/PCbuild/bz2.vcproj +++ b/PCbuild/_bz2.vcproj @@ -2,7 +2,7 @@ diff --git a/PCbuild/pcbuild.sln b/PCbuild/pcbuild.sln --- a/PCbuild/pcbuild.sln +++ b/PCbuild/pcbuild.sln @@ -87,7 +87,7 @@ {CF7AC3D1-E2DF-41D2-BEA6-1E2556CDEA26} = {CF7AC3D1-E2DF-41D2-BEA6-1E2556CDEA26} EndProjectSection EndProject -Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "bz2", "bz2.vcproj", "{73FCD2BD-F133-46B7-8EC1-144CD82A59D5}" +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "_bz2", "_bz2.vcproj", "{73FCD2BD-F133-46B7-8EC1-144CD82A59D5}" ProjectSection(ProjectDependencies) = postProject {CF7AC3D1-E2DF-41D2-BEA6-1E2556CDEA26} = {CF7AC3D1-E2DF-41D2-BEA6-1E2556CDEA26} EndProjectSection 
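A note on the `_bz2module.c` changes above: the new `BZ2Decompressor` type exposes two read-only members, `eof` and `unused_data`. The following is a minimal sketch of how they might be observed from Python through the public `bz2` module. The payload and variable names are illustrative assumptions, not part of the commit.

```python
import bz2

# Illustrative payload: one complete bz2 stream followed by unrelated bytes.
payload = bz2.compress(b"hello world") + b"trailing bytes"

decomp = bz2.BZ2Decompressor()
eof_before = decomp.eof            # no end-of-stream marker seen yet

data = decomp.decompress(payload)  # decompress incrementally (one chunk here)
eof_after = decomp.eof             # set once the end-of-stream marker is reached
leftover = decomp.unused_data      # bytes found after the compressed stream

# Feeding more data after the end of the stream raises EOFError.
try:
    decomp.decompress(b"more")
    raised = False
except EOFError:
    raised = True
```

When reading concatenated or padded streams, checking `eof` and then inspecting `unused_data` is one way to decide whether a fresh decompressor is needed for the remaining bytes.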
diff --git a/PCbuild/readme.txt b/PCbuild/readme.txt --- a/PCbuild/readme.txt +++ b/PCbuild/readme.txt @@ -112,9 +112,9 @@ pre-built Tcl/Tk in either ..\..\tcltk for 32-bit or ..\..\tcltk64 for 64-bit (relative to this directory). See below for instructions to build Tcl/Tk. -bz2 - Python wrapper for the libbz2 compression library. Homepage - http://sources.redhat.com/bzip2/ +_bz2 + Python wrapper for the libbzip2 compression library. Homepage + http://www.bzip.org/ Download the source from the python.org copy into the dist directory: diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -1233,11 +1233,11 @@ bz2_extra_link_args = ('-Wl,-search_paths_first',) else: bz2_extra_link_args = () - exts.append( Extension('bz2', ['bz2module.c'], + exts.append( Extension('_bz2', ['_bz2module.c'], libraries = ['bz2'], extra_link_args = bz2_extra_link_args) ) else: - missing.append('bz2') + missing.append('_bz2') # Interface to the Expat XML parser # -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 17:09:11 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sun, 03 Apr 2011 17:09:11 +0200 Subject: [Python-checkins] cpython: Fix whitespace Message-ID: http://hg.python.org/cpython/rev/ff105faf1bac changeset: 69113:ff105faf1bac user: Antoine Pitrou date: Sun Apr 03 17:08:49 2011 +0200 summary: Fix whitespace files: Lib/bz2.py | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Lib/bz2.py b/Lib/bz2.py --- a/Lib/bz2.py +++ b/Lib/bz2.py @@ -105,7 +105,7 @@ self._fp.write(self._compressor.flush()) self._compressor = None finally: - try: + try: if self._closefp: self._fp.close() finally: @@ -251,7 +251,7 @@ def readinto(self, b): """Read up to len(b) bytes into b. - + Returns the number of bytes read (0 for EOF). 
""" with self._lock: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 18:20:19 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sun, 03 Apr 2011 18:20:19 +0200 Subject: [Python-checkins] cpython (3.2): Issue #11746: Fix SSLContext.load_cert_chain() to accept elliptic curve private Message-ID: http://hg.python.org/cpython/rev/88ed3de28520 changeset: 69114:88ed3de28520 branch: 3.2 parent: 69109:1fd736395df3 user: Antoine Pitrou date: Sun Apr 03 18:15:34 2011 +0200 summary: Issue #11746: Fix SSLContext.load_cert_chain() to accept elliptic curve private keys. files: Misc/NEWS | 3 +++ Modules/_ssl.c | 2 +- 2 files changed, 4 insertions(+), 1 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -49,6 +49,9 @@ Library ------- +- Issue #11746: Fix SSLContext.load_cert_chain() to accept elliptic curve + private keys. + - sys.getfilesystemencoding() raises a RuntimeError if initfsencoding() was not called yet: detect bootstrap (startup) issues earlier. diff --git a/Modules/_ssl.c b/Modules/_ssl.c --- a/Modules/_ssl.c +++ b/Modules/_ssl.c @@ -1623,7 +1623,7 @@ goto error; } PySSL_BEGIN_ALLOW_THREADS - r = SSL_CTX_use_RSAPrivateKey_file(self->ctx, + r = SSL_CTX_use_PrivateKey_file(self->ctx, PyBytes_AS_STRING(keyfile ? 
keyfile_bytes : certfile_bytes),
         SSL_FILETYPE_PEM);
     PySSL_END_ALLOW_THREADS

--
Repository URL: http://hg.python.org/cpython
From python-checkins at python.org Sun Apr 3 18:20:20 2011
From: python-checkins at python.org (antoine.pitrou)
Date: Sun, 03 Apr 2011 18:20:20 +0200
Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge fix for issue #11746
Message-ID:

http://hg.python.org/cpython/rev/c11e05a60d36
changeset: 69115:c11e05a60d36
parent: 69113:ff105faf1bac
parent: 69114:88ed3de28520
user: Antoine Pitrou
date: Sun Apr 03 18:16:50 2011 +0200
summary:
  Merge fix for issue #11746

files:
  Misc/NEWS | 3 +++
  Modules/_ssl.c | 2 +-
  2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -87,6 +87,9 @@
 Library
 -------
 
+- Issue #11746: Fix SSLContext.load_cert_chain() to accept elliptic curve
+  private keys.
+
 - Issue #5863: Rewrite BZ2File in pure Python, and allow it to accept
   file-like objects using a new ``fileobj`` constructor argument. Patch by
   Nadeem Vawda.
diff --git a/Modules/_ssl.c b/Modules/_ssl.c
--- a/Modules/_ssl.c
+++ b/Modules/_ssl.c
@@ -1620,7 +1620,7 @@
         goto error;
     }
     PySSL_BEGIN_ALLOW_THREADS
-    r = SSL_CTX_use_RSAPrivateKey_file(self->ctx,
+    r = SSL_CTX_use_PrivateKey_file(self->ctx,
         PyBytes_AS_STRING(keyfile ? keyfile_bytes : certfile_bytes),
         SSL_FILETYPE_PEM);
     PySSL_END_ALLOW_THREADS

--
Repository URL: http://hg.python.org/cpython
From python-checkins at python.org Sun Apr 3 18:29:49 2011
From: python-checkins at python.org (antoine.pitrou)
Date: Sun, 03 Apr 2011 18:29:49 +0200
Subject: [Python-checkins] cpython: Issue #11748: try to fix sporadic failures in test_ftplib
Message-ID:

http://hg.python.org/cpython/rev/8a2d848244a2
changeset: 69116:8a2d848244a2
user: Antoine Pitrou
date: Sun Apr 03 18:29:45 2011 +0200
summary:
  Issue #11748: try to fix sporadic failures in test_ftplib

files:
  Lib/test/test_ftplib.py | 22 ++++++++++++++++------
  1 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/Lib/test/test_ftplib.py b/Lib/test/test_ftplib.py
--- a/Lib/test/test_ftplib.py
+++ b/Lib/test/test_ftplib.py
@@ -611,16 +611,26 @@
     def test_source_address(self):
         self.client.quit()
         port = support.find_unused_port()
-        self.client.connect(self.server.host, self.server.port,
-                            source_address=(HOST, port))
-        self.assertEqual(self.client.sock.getsockname()[1], port)
-        self.client.quit()
+        try:
+            self.client.connect(self.server.host, self.server.port,
+                                source_address=(HOST, port))
+            self.assertEqual(self.client.sock.getsockname()[1], port)
+            self.client.quit()
+        except IOError as e:
+            if e.errno == errno.EADDRINUSE:
+                self.skipTest("couldn't bind to port %d" % port)
+            raise
 
     def test_source_address_passive_connection(self):
         port = support.find_unused_port()
         self.client.source_address = (HOST, port)
-        with self.client.transfercmd('list') as sock:
-            self.assertEqual(sock.getsockname()[1], port)
+        try:
+            with self.client.transfercmd('list') as sock:
+                self.assertEqual(sock.getsockname()[1], port)
+        except IOError as e:
+            if e.errno == errno.EADDRINUSE:
+                self.skipTest("couldn't bind to port %d" % port)
+            raise
 
     def test_parse257(self):
         self.assertEqual(ftplib.parse257('257 "/foo/bar"'), '/foo/bar')

--
Repository URL: http://hg.python.org/cpython
From python-checkins at python.org Sun
Apr 3 18:46:26 2011 From: python-checkins at python.org (victor.stinner) Date: Sun, 03 Apr 2011 18:46:26 +0200 Subject: [Python-checkins] cpython: test_faulthandler: fix regex on the check_dump_traceback_threads() traceback Message-ID: http://hg.python.org/cpython/rev/cb169f61785b changeset: 69117:cb169f61785b user: Victor Stinner date: Sun Apr 03 18:41:22 2011 +0200 summary: test_faulthandler: fix regex on the check_dump_traceback_threads() traceback The traceback may contain "_is_owned": Thread 0x40962b90: File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 220 in _is_owned File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 227 in wait File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 421 in wait File "", line 23 in run File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 735 in _bootstrap_inner File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 708 in _bootstrap Current thread XXX: File "", line 10 in dump File "", line 28 in files: Lib/test/test_faulthandler.py | 5 ++--- 1 files changed, 2 insertions(+), 3 deletions(-) diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py --- a/Lib/test/test_faulthandler.py +++ b/Lib/test/test_faulthandler.py @@ -325,9 +325,8 @@ lineno = 10 regex = """ ^Thread 0x[0-9a-f]+: -(?: File ".*threading.py", line [0-9]+ in wait -)? 
File ".*threading.py", line [0-9]+ in wait - File "", line 23 in run +(?: File ".*threading.py", line [0-9]+ in [_a-z]+ +){{1,3}} File "", line 23 in run File ".*threading.py", line [0-9]+ in _bootstrap_inner File ".*threading.py", line [0-9]+ in _bootstrap -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 18:46:28 2011 From: python-checkins at python.org (victor.stinner) Date: Sun, 03 Apr 2011 18:46:28 +0200 Subject: [Python-checkins] cpython: test_faulthandler: improve the test on dump_tracebacks_later(cancel=True) Message-ID: http://hg.python.org/cpython/rev/2d0a855ce30a changeset: 69118:2d0a855ce30a user: Victor Stinner date: Sun Apr 03 18:45:42 2011 +0200 summary: test_faulthandler: improve the test on dump_tracebacks_later(cancel=True) files: Lib/test/test_faulthandler.py | 33 ++++++++++------------ 1 files changed, 15 insertions(+), 18 deletions(-) diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py --- a/Lib/test/test_faulthandler.py +++ b/Lib/test/test_faulthandler.py @@ -358,25 +358,19 @@ import time def func(repeat, cancel, timeout): + if cancel: + faulthandler.cancel_dump_tracebacks_later() + pause = timeout * 2.5 # on Windows XP, b-a gives 1.249931 after sleep(1.25) min_pause = pause * 0.9 a = time.time() time.sleep(pause) + b = time.time() faulthandler.cancel_dump_tracebacks_later() - b = time.time() # Check that sleep() was not interrupted assert (b - a) >= min_pause, "{{}} < {{}}".format(b - a, min_pause) - if cancel: - pause = timeout * 1.5 - min_pause = pause * 0.9 - a = time.time() - time.sleep(pause) - b = time.time() - # Check that sleep() was not interrupted - assert (b - a) >= min_pause, "{{}} < {{}}".format(b - a, min_pause) - timeout = {timeout} repeat = {repeat} cancel = {cancel} @@ -400,13 +394,16 @@ trace, exitcode = self.get_output(code, filename) trace = '\n'.join(trace) - if repeat: - count = 2 + if not cancel: + if repeat: + count = 2 + else: + count = 1 + header = 
'Thread 0x[0-9a-f]+:\n' + regex = expected_traceback(12, 27, header, count=count) + self.assertRegex(trace, regex) else: - count = 1 - header = 'Thread 0x[0-9a-f]+:\n' - regex = expected_traceback(9, 33, header, count=count) - self.assertRegex(trace, regex) + self.assertEqual(trace, '') self.assertEqual(exitcode, 0) @unittest.skipIf(not hasattr(faulthandler, 'dump_tracebacks_later'), @@ -425,8 +422,8 @@ def test_dump_tracebacks_later_repeat(self): self.check_dump_tracebacks_later(repeat=True) - def test_dump_tracebacks_later_repeat_cancel(self): - self.check_dump_tracebacks_later(repeat=True, cancel=True) + def test_dump_tracebacks_later_cancel(self): + self.check_dump_tracebacks_later(cancel=True) def test_dump_tracebacks_later_file(self): self.check_dump_tracebacks_later(file=True) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 23:46:44 2011 From: python-checkins at python.org (victor.stinner) Date: Sun, 03 Apr 2011 23:46:44 +0200 Subject: [Python-checkins] cpython: Issue #11727, issue #11753, issue #11755: disable regrtest timeout Message-ID: http://hg.python.org/cpython/rev/394f0ea0d29e changeset: 69119:394f0ea0d29e user: Victor Stinner date: Sun Apr 03 23:46:42 2011 +0200 summary: Issue #11727, issue #11753, issue #11755: disable regrtest timeout Disable regrtest timeout until #11753 and #11755 are fixed files: Lib/test/regrtest.py | 2 +- Misc/NEWS | 5 ++--- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -240,7 +240,7 @@ findleaks=False, use_resources=None, trace=False, coverdir='coverage', runleaks=False, huntrleaks=False, verbose2=False, print_slow=False, random_seed=None, use_mp=None, verbose3=False, forever=False, - header=False, timeout=30*60): + header=False, timeout=None): """Execute a test suite. 
This also parses command-line options and modifies its behavior diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -370,9 +370,8 @@ Tests ----- -- Issue #11727: If a test takes more than 30 minutes, regrtest dumps the - traceback of all threads and exits. Use --timeout option to change the - default timeout or to disable it. +- Issue #11727: Add a --timeout option to regrtest: if a test takes more than + TIMEOUT seconds, dumps the traceback of all threads and exits. - Issue #11653: fix -W with -j in regrtest. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 00:13:06 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 00:13:06 +0200 Subject: [Python-checkins] cpython: Issue #11688: Add sqlite3.Connection.set_trace_callback(). Patch by Torsten Message-ID: http://hg.python.org/cpython/rev/575ee55081dc changeset: 69120:575ee55081dc user: Antoine Pitrou date: Mon Apr 04 00:12:04 2011 +0200 summary: Issue #11688: Add sqlite3.Connection.set_trace_callback(). Patch by Torsten Landschoff. files: Doc/library/sqlite3.rst | 16 ++++++ Lib/sqlite3/test/hooks.py | 48 ++++++++++++++++++- Misc/NEWS | 3 + Modules/_sqlite/connection.c | 62 ++++++++++++++++++++++++ 4 files changed, 128 insertions(+), 1 deletions(-) diff --git a/Doc/library/sqlite3.rst b/Doc/library/sqlite3.rst --- a/Doc/library/sqlite3.rst +++ b/Doc/library/sqlite3.rst @@ -369,6 +369,22 @@ method with :const:`None` for *handler*. +.. method:: Connection.set_trace_callback(trace_callback) + + Registers *trace_callback* to be called for each SQL statement that is + actually executed by the SQLite backend. + + The only argument passed to the callback is the statement (as string) that + is being executed. The return value of the callback is ignored. Note that + the backend does not only run statements passed to the :meth:`Cursor.execute` + methods. 
Other sources include the transaction management of the Python + module and the execution of triggers defined in the current database. + + Passing :const:`None` as *trace_callback* will disable the trace callback. + + .. versionadded:: 3.3 + + .. method:: Connection.enable_load_extension(enabled) This routine allows/disallows the SQLite engine to load SQLite extensions diff --git a/Lib/sqlite3/test/hooks.py b/Lib/sqlite3/test/hooks.py --- a/Lib/sqlite3/test/hooks.py +++ b/Lib/sqlite3/test/hooks.py @@ -175,10 +175,56 @@ con.execute("select 1 union select 2 union select 3").fetchall() self.assertEqual(action, 0, "progress handler was not cleared") +class TraceCallbackTests(unittest.TestCase): + def CheckTraceCallbackUsed(self): + """ + Test that the trace callback is invoked once it is set. + """ + con = sqlite.connect(":memory:") + traced_statements = [] + def trace(statement): + traced_statements.append(statement) + con.set_trace_callback(trace) + con.execute("create table foo(a, b)") + self.assertTrue(traced_statements) + self.assertTrue(any("create table foo" in stmt for stmt in traced_statements)) + + def CheckClearTraceCallback(self): + """ + Test that setting the trace callback to None clears the previously set callback. + """ + con = sqlite.connect(":memory:") + traced_statements = [] + def trace(statement): + traced_statements.append(statement) + con.set_trace_callback(trace) + con.set_trace_callback(None) + con.execute("create table foo(a, b)") + self.assertFalse(traced_statements, "trace callback was not cleared") + + def CheckUnicodeContent(self): + """ + Test that the statement can contain unicode literals. 
+ """ + unicode_value = '\xf6\xe4\xfc\xd6\xc4\xdc\xdf\u20ac' + con = sqlite.connect(":memory:") + traced_statements = [] + def trace(statement): + traced_statements.append(statement) + con.set_trace_callback(trace) + con.execute("create table foo(x)") + con.execute("insert into foo(x) values (?)", (unicode_value,)) + con.commit() + self.assertTrue(any(unicode_value in stmt for stmt in traced_statements), + "Unicode data garbled in trace callback") + + + def suite(): collation_suite = unittest.makeSuite(CollationTests, "Check") progress_suite = unittest.makeSuite(ProgressTests, "Check") - return unittest.TestSuite((collation_suite, progress_suite)) + trace_suite = unittest.makeSuite(TraceCallbackTests, "Check") + return unittest.TestSuite((collation_suite, progress_suite, trace_suite)) def test(): runner = unittest.TextTestRunner() diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -87,6 +87,9 @@ Library ------- +- Issue #11688: Add sqlite3.Connection.set_trace_callback(). Patch by + Torsten Landschoff. + - Issue #11746: Fix SSLContext.load_cert_chain() to accept elliptic curve private keys. 
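A note on the `set_trace_callback()` documentation and tests above: the following is a minimal sketch, assuming a Python build whose `sqlite3` module includes this method (3.3 or later). The table names and the `traced` list are illustrative, not taken from the patch.

```python
import sqlite3

traced = []  # collects every statement the SQLite backend reports

con = sqlite3.connect(":memory:")

# Register the callback; it receives each executed statement as a string.
con.set_trace_callback(traced.append)
con.execute("create table foo(a, b)")

# Passing None disables tracing again.
con.set_trace_callback(None)
con.execute("create table bar(a)")
con.close()

saw_foo = any("create table foo" in stmt for stmt in traced)
saw_bar = any("create table bar" in stmt for stmt in traced)
```

Because the backend also reports statements issued by the module's own transaction management, checking for a substring (as the test suite in the patch does) is more robust than comparing the whole list.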
diff --git a/Modules/_sqlite/connection.c b/Modules/_sqlite/connection.c --- a/Modules/_sqlite/connection.c +++ b/Modules/_sqlite/connection.c @@ -904,6 +904,38 @@ return rc; } +static void _trace_callback(void* user_arg, const char* statement_string) +{ + PyObject *py_statement = NULL; + PyObject *ret = NULL; + +#ifdef WITH_THREAD + PyGILState_STATE gilstate; + + gilstate = PyGILState_Ensure(); +#endif + py_statement = PyUnicode_DecodeUTF8(statement_string, + strlen(statement_string), "replace"); + if (py_statement) { + ret = PyObject_CallFunctionObjArgs((PyObject*)user_arg, py_statement, NULL); + Py_DECREF(py_statement); + } + + if (ret) { + Py_DECREF(ret); + } else { + if (_enable_callback_tracebacks) { + PyErr_Print(); + } else { + PyErr_Clear(); + } + } + +#ifdef WITH_THREAD + PyGILState_Release(gilstate); +#endif +} + static PyObject* pysqlite_connection_set_authorizer(pysqlite_Connection* self, PyObject* args, PyObject* kwargs) { PyObject* authorizer_cb; @@ -963,6 +995,34 @@ return Py_None; } +static PyObject* pysqlite_connection_set_trace_callback(pysqlite_Connection* self, PyObject* args, PyObject* kwargs) +{ + PyObject* trace_callback; + + static char *kwlist[] = { "trace_callback", NULL }; + + if (!pysqlite_check_thread(self) || !pysqlite_check_connection(self)) { + return NULL; + } + + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O:set_trace_callback", + kwlist, &trace_callback)) { + return NULL; + } + + if (trace_callback == Py_None) { + /* None clears the trace callback previously set */ + sqlite3_trace(self->db, 0, (void*)0); + } else { + if (PyDict_SetItem(self->function_pinboard, trace_callback, Py_None) == -1) + return NULL; + sqlite3_trace(self->db, _trace_callback, trace_callback); + } + + Py_INCREF(Py_None); + return Py_None; +} + #ifdef HAVE_LOAD_EXTENSION static PyObject* pysqlite_enable_load_extension(pysqlite_Connection* self, PyObject* args) { @@ -1516,6 +1576,8 @@ #endif {"set_progress_handler", 
(PyCFunction)pysqlite_connection_set_progress_handler, METH_VARARGS|METH_KEYWORDS, PyDoc_STR("Sets progress handler callback. Non-standard.")}, + {"set_trace_callback", (PyCFunction)pysqlite_connection_set_trace_callback, METH_VARARGS|METH_KEYWORDS, + PyDoc_STR("Sets a trace callback called for each SQL statement (passed as unicode). Non-standard.")}, {"execute", (PyCFunction)pysqlite_connection_execute, METH_VARARGS, PyDoc_STR("Executes a SQL statement. Non-standard.")}, {"executemany", (PyCFunction)pysqlite_connection_executemany, METH_VARARGS, -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 00:50:05 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 00:50:05 +0200 Subject: [Python-checkins] cpython: Improve error message in test Message-ID: http://hg.python.org/cpython/rev/23519bc7d752 changeset: 69121:23519bc7d752 user: Antoine Pitrou date: Mon Apr 04 00:50:01 2011 +0200 summary: Improve error message in test files: Lib/sqlite3/test/hooks.py | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/Lib/sqlite3/test/hooks.py b/Lib/sqlite3/test/hooks.py --- a/Lib/sqlite3/test/hooks.py +++ b/Lib/sqlite3/test/hooks.py @@ -216,7 +216,8 @@ con.execute("insert into foo(x) values (?)", (unicode_value,)) con.commit() self.assertTrue(any(unicode_value in stmt for stmt in traced_statements), - "Unicode data garbled in trace callback") + "Unicode data %s garbled in trace callback: %s" + % (ascii(unicode_value), ', '.join(map(ascii, traced_statements)))) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 01:22:34 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 01:22:34 +0200 Subject: [Python-checkins] cpython (3.2): Issue #11749: try to fix transient test_socket failure Message-ID: http://hg.python.org/cpython/rev/68a319ef70fc changeset: 69122:68a319ef70fc branch: 3.2 parent: 69114:88ed3de28520 user: Antoine Pitrou 
date: Mon Apr 04 01:21:37 2011 +0200 summary: Issue #11749: try to fix transient test_socket failure files: Lib/test/test_socket.py | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_socket.py b/Lib/test/test_socket.py --- a/Lib/test/test_socket.py +++ b/Lib/test/test_socket.py @@ -1384,6 +1384,10 @@ self.evt1.set() self.evt2.wait(1.0) first_seg = self.read_file.read(len(self.read_msg) - 3) + if first_seg is None: + # Data not arrived (can happen under Windows), wait a bit + time.sleep(0.5) + first_seg = self.read_file.read(len(self.read_msg) - 3) buf = bytearray(10) n = self.read_file.readinto(buf) self.assertEqual(n, 3) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 01:22:35 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 01:22:35 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #11749: try to fix transient test_socket failure Message-ID: http://hg.python.org/cpython/rev/44fc5f94bc90 changeset: 69123:44fc5f94bc90 parent: 69121:23519bc7d752 parent: 69122:68a319ef70fc user: Antoine Pitrou date: Mon Apr 04 01:22:06 2011 +0200 summary: Issue #11749: try to fix transient test_socket failure files: Lib/test/test_socket.py | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_socket.py b/Lib/test/test_socket.py --- a/Lib/test/test_socket.py +++ b/Lib/test/test_socket.py @@ -1411,6 +1411,10 @@ self.evt1.set() self.evt2.wait(1.0) first_seg = self.read_file.read(len(self.read_msg) - 3) + if first_seg is None: + # Data not arrived (can happen under Windows), wait a bit + time.sleep(0.5) + first_seg = self.read_file.read(len(self.read_msg) - 3) buf = bytearray(10) n = self.read_file.readinto(buf) self.assertEqual(n, 3) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 01:50:56 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 01:50:56 
+0200 Subject: [Python-checkins] cpython: Fix TraceCallbackTests to not use bound parameters (followup to issue #11688) Message-ID: http://hg.python.org/cpython/rev/ce37570768f5 changeset: 69124:ce37570768f5 user: Antoine Pitrou date: Mon Apr 04 01:50:50 2011 +0200 summary: Fix TraceCallbackTests to not use bound parameters (followup to issue #11688) files: Lib/sqlite3/test/hooks.py | 5 ++++- 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/Lib/sqlite3/test/hooks.py b/Lib/sqlite3/test/hooks.py --- a/Lib/sqlite3/test/hooks.py +++ b/Lib/sqlite3/test/hooks.py @@ -213,7 +213,10 @@ traced_statements.append(statement) con.set_trace_callback(trace) con.execute("create table foo(x)") - con.execute("insert into foo(x) values (?)", (unicode_value,)) + # Can't execute bound parameters as their values don't appear + # in traced statements before SQLite 3.6.21 + # (cf. http://www.sqlite.org/draft/releaselog/3_6_21.html) + con.execute('insert into foo(x) values ("%s")' % unicode_value) con.commit() self.assertTrue(any(unicode_value in stmt for stmt in traced_statements), "Unicode data %s garbled in trace callback: %s" -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 02:14:42 2011 From: python-checkins at python.org (steven.bethard) Date: Mon, 04 Apr 2011 02:14:42 +0200 Subject: [Python-checkins] cpython (2.7): Issue #9347: Fix formatting for tuples in argparse type= error messages. Message-ID: http://hg.python.org/cpython/rev/f961e9179998 changeset: 69125:f961e9179998 branch: 2.7 parent: 69080:5e7fc2a42c3c user: Steven Bethard date: Mon Apr 04 01:47:52 2011 +0200 summary: Issue #9347: Fix formatting for tuples in argparse type= error messages. 
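Note on the fix for issue #9347 summarized above: the diff wraps the formatted value in a one-element tuple. The following is a minimal sketch of why that matters, not the argparse code itself; the function names `not_callable_message` and `naive_message` are hypothetical and exist only for illustration.

```python
def not_callable_message(value):
    # Wrapping the value in a one-element tuple makes "%" treat it as a
    # single argument, even when the value is itself a tuple.
    return '%r is not callable' % (value,)


def naive_message(value):
    # Without the wrapper, a tuple is unpacked into separate format
    # arguments, which raises TypeError when the counts do not match.
    return '%r is not callable' % value
```

With a plain string both functions behave the same; with a tuple such as `(int, float)` only the wrapped form produces a message, which is the case the new tests exercise.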
files: Lib/argparse.py | 4 ++-- Lib/test/test_argparse.py | 2 ++ Misc/NEWS | 2 ++ 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/Lib/argparse.py b/Lib/argparse.py --- a/Lib/argparse.py +++ b/Lib/argparse.py @@ -1277,13 +1277,13 @@ # create the action object, and add it to the parser action_class = self._pop_action_class(kwargs) if not _callable(action_class): - raise ValueError('unknown action "%s"' % action_class) + raise ValueError('unknown action "%s"' % (action_class,)) action = action_class(**kwargs) # raise an error if the action type is not callable type_func = self._registry_get('type', action.type, action.type) if not _callable(type_func): - raise ValueError('%r is not callable' % type_func) + raise ValueError('%r is not callable' % (type_func,)) # raise an error if the metavar does not match the type if hasattr(self, "_get_formatter"): diff --git a/Lib/test/test_argparse.py b/Lib/test/test_argparse.py --- a/Lib/test/test_argparse.py +++ b/Lib/test/test_argparse.py @@ -4016,10 +4016,12 @@ def test_invalid_type(self): self.assertValueError('--foo', type='int') + self.assertValueError('--foo', type=(int, float)) def test_invalid_action(self): self.assertValueError('-x', action='foo') self.assertValueError('foo', action='baz') + self.assertValueError('--foo', action=('store', 'append')) parser = argparse.ArgumentParser() try: parser.add_argument("--foo", action="store-true") diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -257,6 +257,8 @@ - Issue #9026: Fix order of argparse sub-commands in help messages. +- Issue #9347: Fix formatting for tuples in argparse type= error messages. 
+ Extension Modules ----------------- -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 02:14:43 2011 From: python-checkins at python.org (steven.bethard) Date: Mon, 04 Apr 2011 02:14:43 +0200 Subject: [Python-checkins] cpython (3.2): Issue #9347: Fix formatting for tuples in argparse type= error messages. Message-ID: http://hg.python.org/cpython/rev/69ab5251f3f0 changeset: 69126:69ab5251f3f0 branch: 3.2 parent: 69122:68a319ef70fc user: Steven Bethard date: Mon Apr 04 01:53:02 2011 +0200 summary: Issue #9347: Fix formatting for tuples in argparse type= error messages. files: Lib/argparse.py | 4 ++-- Lib/test/test_argparse.py | 2 ++ Misc/NEWS | 2 ++ 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/Lib/argparse.py b/Lib/argparse.py --- a/Lib/argparse.py +++ b/Lib/argparse.py @@ -1287,13 +1287,13 @@ # create the action object, and add it to the parser action_class = self._pop_action_class(kwargs) if not _callable(action_class): - raise ValueError('unknown action "%s"' % action_class) + raise ValueError('unknown action "%s"' % (action_class,)) action = action_class(**kwargs) # raise an error if the action type is not callable type_func = self._registry_get('type', action.type, action.type) if not _callable(type_func): - raise ValueError('%r is not callable' % type_func) + raise ValueError('%r is not callable' % (type_func,)) # raise an error if the metavar does not match the type if hasattr(self, "_get_formatter"): diff --git a/Lib/test/test_argparse.py b/Lib/test/test_argparse.py --- a/Lib/test/test_argparse.py +++ b/Lib/test/test_argparse.py @@ -4051,10 +4051,12 @@ def test_invalid_type(self): self.assertValueError('--foo', type='int') + self.assertValueError('--foo', type=(int, float)) def test_invalid_action(self): self.assertValueError('-x', action='foo') self.assertValueError('foo', action='baz') + self.assertValueError('--foo', action=('store', 'append')) parser = argparse.ArgumentParser() try: 
parser.add_argument("--foo", action="store-true") diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -188,6 +188,8 @@ - Issue #9026: Fix order of argparse sub-commands in help messages. +- Issue #9347: Fix formatting for tuples in argparse type= error messages. + Build ----- -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 02:14:46 2011 From: python-checkins at python.org (steven.bethard) Date: Mon, 04 Apr 2011 02:14:46 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #9347: Fix formatting for tuples in argparse type= error messages. Message-ID: http://hg.python.org/cpython/rev/1f3f6443810a changeset: 69127:1f3f6443810a parent: 69123:44fc5f94bc90 parent: 69126:69ab5251f3f0 user: Steven Bethard date: Mon Apr 04 02:10:40 2011 +0200 summary: Issue #9347: Fix formatting for tuples in argparse type= error messages. files: Lib/argparse.py | 4 ++-- Lib/test/test_argparse.py | 2 ++ Misc/NEWS | 2 ++ 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/Lib/argparse.py b/Lib/argparse.py --- a/Lib/argparse.py +++ b/Lib/argparse.py @@ -1312,13 +1312,13 @@ # create the action object, and add it to the parser action_class = self._pop_action_class(kwargs) if not _callable(action_class): - raise ValueError('unknown action "%s"' % action_class) + raise ValueError('unknown action "%s"' % (action_class,)) action = action_class(**kwargs) # raise an error if the action type is not callable type_func = self._registry_get('type', action.type, action.type) if not _callable(type_func): - raise ValueError('%r is not callable' % type_func) + raise ValueError('%r is not callable' % (type_func,)) # raise an error if the metavar does not match the type if hasattr(self, "_get_formatter"): diff --git a/Lib/test/test_argparse.py b/Lib/test/test_argparse.py --- a/Lib/test/test_argparse.py +++ b/Lib/test/test_argparse.py @@ -4082,10 +4082,12 @@ def test_invalid_type(self): self.assertValueError('--foo', 
type='int') + self.assertValueError('--foo', type=(int, float)) def test_invalid_action(self): self.assertValueError('-x', action='foo') self.assertValueError('foo', action='baz') + self.assertValueError('--foo', action=('store', 'append')) parser = argparse.ArgumentParser() try: parser.add_argument("--foo", action="store-true") diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -349,6 +349,8 @@ - Issue #9026: Fix order of argparse sub-commands in help messages. +- Issue #9347: Fix formatting for tuples in argparse type= error messages. + Build ----- -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 02:14:48 2011 From: python-checkins at python.org (steven.bethard) Date: Mon, 04 Apr 2011 02:14:48 +0200 Subject: [Python-checkins] cpython (merge default -> default): Merge Message-ID: http://hg.python.org/cpython/rev/838e3b07a7f8 changeset: 69128:838e3b07a7f8 parent: 69127:1f3f6443810a parent: 69124:ce37570768f5 user: Steven Bethard date: Mon Apr 04 02:14:25 2011 +0200 summary: Merge files: Lib/sqlite3/test/hooks.py | 5 ++++- 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/Lib/sqlite3/test/hooks.py b/Lib/sqlite3/test/hooks.py --- a/Lib/sqlite3/test/hooks.py +++ b/Lib/sqlite3/test/hooks.py @@ -213,7 +213,10 @@ traced_statements.append(statement) con.set_trace_callback(trace) con.execute("create table foo(x)") - con.execute("insert into foo(x) values (?)", (unicode_value,)) + # Can't execute bound parameters as their values don't appear + # in traced statements before SQLite 3.6.21 + # (cf. 
http://www.sqlite.org/draft/releaselog/3_6_21.html) + con.execute('insert into foo(x) values ("%s")' % unicode_value) con.commit() self.assertTrue(any(unicode_value in stmt for stmt in traced_statements), "Unicode data %s garbled in trace callback: %s" -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Mon Apr 4 04:55:49 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Mon, 04 Apr 2011 04:55:49 +0200 Subject: [Python-checkins] Daily reference leaks (838e3b07a7f8): sum=0 Message-ID: results for 838e3b07a7f8 on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogYOwho_', '-x'] From python-checkins at python.org Mon Apr 4 11:05:51 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 04 Apr 2011 11:05:51 +0200 Subject: [Python-checkins] cpython: Issue #11753: faulthandler thread uses pthread_sigmask() Message-ID: http://hg.python.org/cpython/rev/ebc03d7e7110 changeset: 69129:ebc03d7e7110 user: Victor Stinner date: Mon Apr 04 11:05:21 2011 +0200 summary: Issue #11753: faulthandler thread uses pthread_sigmask() The thread must not receive any signal. If the thread receives a signal, sem_timedwait() is interrupted and returns EINTR, but in this case, PyThread_acquire_lock_timed() retries sem_timedwait() and the main thread is not aware of the signal. The problem is that some tests expect that the main thread receives the signal, not faulthandler handler, which should be invisible. On Linux, the signal looks to be received by the main thread, whereas on FreeBSD, it can be any thread. 
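Note on the approach described above: the same idea (a helper thread that blocks every signal so that delivery falls to other threads) can be sketched in Python. This is an illustration under stated assumptions, not the faulthandler implementation: it assumes a POSIX platform and Python 3.8 or later (for `signal.valid_signals()`), and the name `run_with_signals_blocked` is hypothetical.

```python
import signal
import threading


def run_with_signals_blocked(func):
    """Run func() in a helper thread whose signal mask blocks every signal."""
    outcome = {}

    def worker():
        # The signal mask is per-thread, so this does not change the
        # mask of the thread that called run_with_signals_blocked().
        signal.pthread_sigmask(signal.SIG_SETMASK, signal.valid_signals())
        # SIG_BLOCK with an empty set only reads back the current mask.
        outcome["mask"] = signal.pthread_sigmask(signal.SIG_BLOCK, [])
        outcome["result"] = func()

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return outcome
```

The kernel silently ignores attempts to block SIGKILL and SIGSTOP, so the helper thread's mask covers every other signal while the caller's mask is left as it was.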
files: Modules/faulthandler.c | 11 +++++++++++ 1 files changed, 11 insertions(+), 0 deletions(-) diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -399,6 +399,17 @@ const char* errmsg; PyThreadState *current; int ok; +#ifdef HAVE_PTHREAD_H + sigset_t set; + + /* we don't want to receive any signal */ + sigfillset(&set); +#if defined(HAVE_PTHREAD_SIGMASK) && !defined(HAVE_BROKEN_PTHREAD_SIGMASK) + pthread_sigmask(SIG_SETMASK, &set, NULL); +#else + sigprocmask(SIG_SETMASK, &set, NULL); +#endif +#endif do { st = PyThread_acquire_lock_timed(thread.cancel_event, -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 12:54:47 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 04 Apr 2011 12:54:47 +0200 Subject: [Python-checkins] cpython: Reenable regrtest.py timeout (30 min): #11738 and #11753 looks to be fixed Message-ID: http://hg.python.org/cpython/rev/9d59ae98013c changeset: 69130:9d59ae98013c user: Victor Stinner date: Mon Apr 04 12:54:33 2011 +0200 summary: Reenable regrtest.py timeout (30 min): #11738 and #11753 looks to be fixed files: Lib/test/regrtest.py | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -240,7 +240,7 @@ findleaks=False, use_resources=None, trace=False, coverdir='coverage', runleaks=False, huntrleaks=False, verbose2=False, print_slow=False, random_seed=None, use_mp=None, verbose3=False, forever=False, - header=False, timeout=None): + header=False, timeout=30*60): """Execute a test suite. 
This also parses command-line options and modifies its behavior -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 15:33:37 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 15:33:37 +0200 Subject: [Python-checkins] peps: Fix use of "either" and typo in "checking" (Jim Jewett) Message-ID: http://hg.python.org/peps/rev/f65beac56930 changeset: 3854:f65beac56930 user: Antoine Pitrou date: Mon Apr 04 15:33:34 2011 +0200 summary: Fix use of "either" and typo in "checking" (Jim Jewett) files: pep-3151.txt | 18 +++++++++--------- 1 files changed, 9 insertions(+), 9 deletions(-) diff --git a/pep-3151.txt b/pep-3151.txt --- a/pep-3151.txt +++ b/pep-3151.txt @@ -83,12 +83,12 @@ A further proof of the ambiguity of this segmentation is that the standard library itself sometimes has problems deciding. For example, in the -``select`` module, similar failures will raise either ``select.error``, -``OSError`` or ``IOError`` depending on whether you are using select(), -a poll object, a kqueue object, or an epoll object. This makes user code -uselessly complicated since it has to be prepared to catch various -exception types, depending on which exact implementation of a single -primitive it chooses to use at runtime. +``select`` module, similar failures will raise ``select.error``, ``OSError`` +or ``IOError`` depending on whether you are using select(), a poll object, +a kqueue object, or an epoll object. This makes user code uselessly +complicated since it has to be prepared to catch various exception types, +depending on which exact implementation of a single primitive it chooses +to use at runtime. As for WindowsError, it seems to be a pointless distinction. First, it only exists on Windows systems, which requires tedious compatibility code @@ -171,10 +171,10 @@ For this we first must explain what we will call *careful* and *careless* exception handling. 
*Careless* (or "naïve") code is defined as code which -blindly catches either of ``OSError``, ``IOError``, ``socket.error``, -``mmap.error``, ``WindowsError``, ``select.error`` without cheking the ``errno`` +blindly catches any of ``OSError``, ``IOError``, ``socket.error``, +``mmap.error``, ``WindowsError``, ``select.error`` without checking the ``errno`` attribute. This is because such exception types are much too broad to signify -anything. Either of them can be raised for error conditions as diverse as: a +anything. Any of them can be raised for error conditions as diverse as: a bad file descriptor (which will usually indicate a programming error), an unconnected socket (ditto), a socket timeout, a file type mismatch, an invalid argument, a transmission failure, insufficient permissions, a non-existent -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Mon Apr 4 18:30:23 2011 From: python-checkins at python.org (raymond.hettinger) Date: Mon, 04 Apr 2011 18:30:23 +0200 Subject: [Python-checkins] cpython: Update timeit to use the new string formatting syntax. Message-ID: http://hg.python.org/cpython/rev/81c981ceb83e changeset: 69131:81c981ceb83e user: Raymond Hettinger date: Mon Apr 04 09:28:25 2011 -0700 summary: Update timeit to use the new string formatting syntax. files: Lib/timeit.py | 8 ++++---- 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/Lib/timeit.py b/Lib/timeit.py --- a/Lib/timeit.py +++ b/Lib/timeit.py @@ -79,10 +79,10 @@ # being indented 8 spaces. 
template = """ def inner(_it, _timer): - %(setup)s + {setup} _t0 = _timer() for _i in _it: - %(stmt)s + {stmt} _t1 = _timer() return _t1 - _t0 """ @@ -126,9 +126,9 @@ stmt = reindent(stmt, 8) if isinstance(setup, str): setup = reindent(setup, 4) - src = template % {'stmt': stmt, 'setup': setup} + src = template.format(stmt=stmt, setup=setup) elif hasattr(setup, '__call__'): - src = template % {'stmt': stmt, 'setup': '_setup()'} + src = template.format(stmt=stmt, setup='_setup()') ns['_setup'] = setup else: raise ValueError("setup is neither a string nor callable") -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 19:53:20 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 19:53:20 +0200 Subject: [Python-checkins] cpython (3.1): Issue #11761: make tests for gc.get_count() less fragile Message-ID: http://hg.python.org/cpython/rev/36d92e923a1a changeset: 69132:36d92e923a1a branch: 3.1 parent: 69106:821244a44163 user: Antoine Pitrou date: Mon Apr 04 19:50:42 2011 +0200 summary: Issue #11761: make tests for gc.get_count() less fragile files: Lib/test/test_gc.py | 43 ++++++++++++++++++++------------ 1 files changed, 27 insertions(+), 16 deletions(-) diff --git a/Lib/test/test_gc.py b/Lib/test/test_gc.py --- a/Lib/test/test_gc.py +++ b/Lib/test/test_gc.py @@ -239,30 +239,41 @@ # The following two tests are fragile: # They precisely count the number of allocations, # which is highly implementation-dependent. - # For example: - # - disposed tuples are not freed, but reused - # - the call to assertEqual somehow avoids building its args tuple + # For example, disposed tuples are not freed, but reused. + # To minimize variations, though, we first store the get_count() results + # and check them at the end. 
def test_get_count(self): - # Avoid future allocation of method object - assertEqual = self._baseAssertEqual gc.collect() - assertEqual(gc.get_count(), (0, 0, 0)) - a = dict() - # since gc.collect(), we created two objects: - # the dict, and the tuple returned by get_count() - assertEqual(gc.get_count(), (2, 0, 0)) + a, b, c = gc.get_count() + x = [] + d, e, f = gc.get_count() + self.assertEqual((b, c), (0, 0)) + self.assertEqual((e, f), (0, 0)) + # This is less fragile than asserting that a equals 0. + self.assertLess(a, 5) + # Between the two calls to get_count(), at least one object was + # created (the list). + self.assertGreater(d, a) def test_collect_generations(self): - # Avoid future allocation of method object - assertEqual = self.assertEqual gc.collect() - a = dict() + # This object will "trickle" into generation N + 1 after + # each call to collect(N) + x = [] gc.collect(0) - assertEqual(gc.get_count(), (0, 1, 0)) + # x is now in gen 1 + a, b, c = gc.get_count() gc.collect(1) - assertEqual(gc.get_count(), (0, 0, 1)) + # x is now in gen 2 + d, e, f = gc.get_count() gc.collect(2) - assertEqual(gc.get_count(), (0, 0, 0)) + # x is now in gen 3 + g, h, i = gc.get_count() + # We don't check a, d, g since their exact values depends on + # internal implementation details of the interpreter. 
+ self.assertEqual((b, c), (1, 0)) + self.assertEqual((e, f), (0, 1)) + self.assertEqual((h, i), (0, 0)) def test_trashcan(self): class Ouch: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 19:53:21 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 19:53:21 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Issue #11761: make tests for gc.get_count() less fragile Message-ID: http://hg.python.org/cpython/rev/5daf9a8dc4e8 changeset: 69133:5daf9a8dc4e8 branch: 3.2 parent: 69126:69ab5251f3f0 parent: 69132:36d92e923a1a user: Antoine Pitrou date: Mon Apr 04 19:51:33 2011 +0200 summary: Issue #11761: make tests for gc.get_count() less fragile files: Lib/test/test_gc.py | 43 ++++++++++++++++++++------------ 1 files changed, 27 insertions(+), 16 deletions(-) diff --git a/Lib/test/test_gc.py b/Lib/test/test_gc.py --- a/Lib/test/test_gc.py +++ b/Lib/test/test_gc.py @@ -239,30 +239,41 @@ # The following two tests are fragile: # They precisely count the number of allocations, # which is highly implementation-dependent. - # For example: - # - disposed tuples are not freed, but reused - # - the call to assertEqual somehow avoids building its args tuple + # For example, disposed tuples are not freed, but reused. + # To minimize variations, though, we first store the get_count() results + # and check them at the end. def test_get_count(self): - # Avoid future allocation of method object - assertEqual = self._baseAssertEqual gc.collect() - assertEqual(gc.get_count(), (0, 0, 0)) - a = dict() - # since gc.collect(), we created two objects: - # the dict, and the tuple returned by get_count() - assertEqual(gc.get_count(), (2, 0, 0)) + a, b, c = gc.get_count() + x = [] + d, e, f = gc.get_count() + self.assertEqual((b, c), (0, 0)) + self.assertEqual((e, f), (0, 0)) + # This is less fragile than asserting that a equals 0. 
+ self.assertLess(a, 5) + # Between the two calls to get_count(), at least one object was + # created (the list). + self.assertGreater(d, a) def test_collect_generations(self): - # Avoid future allocation of method object - assertEqual = self.assertEqual gc.collect() - a = dict() + # This object will "trickle" into generation N + 1 after + # each call to collect(N) + x = [] gc.collect(0) - assertEqual(gc.get_count(), (0, 1, 0)) + # x is now in gen 1 + a, b, c = gc.get_count() gc.collect(1) - assertEqual(gc.get_count(), (0, 0, 1)) + # x is now in gen 2 + d, e, f = gc.get_count() gc.collect(2) - assertEqual(gc.get_count(), (0, 0, 0)) + # x is now in gen 3 + g, h, i = gc.get_count() + # We don't check a, d, g since their exact values depends on + # internal implementation details of the interpreter. + self.assertEqual((b, c), (1, 0)) + self.assertEqual((e, f), (0, 1)) + self.assertEqual((h, i), (0, 0)) def test_trashcan(self): class Ouch: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 19:53:23 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 19:53:23 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #11761: make tests for gc.get_count() less fragile Message-ID: http://hg.python.org/cpython/rev/24d4c5fd3bc6 changeset: 69134:24d4c5fd3bc6 parent: 69131:81c981ceb83e parent: 69133:5daf9a8dc4e8 user: Antoine Pitrou date: Mon Apr 04 19:52:56 2011 +0200 summary: Issue #11761: make tests for gc.get_count() less fragile files: Lib/test/test_gc.py | 43 ++++++++++++++++++++------------ 1 files changed, 27 insertions(+), 16 deletions(-) diff --git a/Lib/test/test_gc.py b/Lib/test/test_gc.py --- a/Lib/test/test_gc.py +++ b/Lib/test/test_gc.py @@ -241,32 +241,43 @@ # The following two tests are fragile: # They precisely count the number of allocations, # which is highly implementation-dependent. 
- # For example: - # - disposed tuples are not freed, but reused - # - the call to assertEqual somehow avoids building its args tuple + # For example, disposed tuples are not freed, but reused. + # To minimize variations, though, we first store the get_count() results + # and check them at the end. @refcount_test def test_get_count(self): - # Avoid future allocation of method object - assertEqual = self._baseAssertEqual gc.collect() - assertEqual(gc.get_count(), (0, 0, 0)) - a = dict() - # since gc.collect(), we created two objects: - # the dict, and the tuple returned by get_count() - assertEqual(gc.get_count(), (2, 0, 0)) + a, b, c = gc.get_count() + x = [] + d, e, f = gc.get_count() + self.assertEqual((b, c), (0, 0)) + self.assertEqual((e, f), (0, 0)) + # This is less fragile than asserting that a equals 0. + self.assertLess(a, 5) + # Between the two calls to get_count(), at least one object was + # created (the list). + self.assertGreater(d, a) @refcount_test def test_collect_generations(self): - # Avoid future allocation of method object - assertEqual = self.assertEqual gc.collect() - a = dict() + # This object will "trickle" into generation N + 1 after + # each call to collect(N) + x = [] gc.collect(0) - assertEqual(gc.get_count(), (0, 1, 0)) + # x is now in gen 1 + a, b, c = gc.get_count() gc.collect(1) - assertEqual(gc.get_count(), (0, 0, 1)) + # x is now in gen 2 + d, e, f = gc.get_count() gc.collect(2) - assertEqual(gc.get_count(), (0, 0, 0)) + # x is now in gen 3 + g, h, i = gc.get_count() + # We don't check a, d, g since their exact values depends on + # internal implementation details of the interpreter. 
+ self.assertEqual((b, c), (1, 0)) + self.assertEqual((e, f), (0, 1)) + self.assertEqual((h, i), (0, 0)) def test_trashcan(self): class Ouch: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 20:00:56 2011 From: python-checkins at python.org (brian.curtin) Date: Mon, 04 Apr 2011 20:00:56 +0200 Subject: [Python-checkins] cpython: Add x64-temp to ignore, prepend a forward slash to "build/" to include Message-ID: http://hg.python.org/cpython/rev/4d2575d971bc changeset: 69135:4d2575d971bc user: brian.curtin date: Mon Apr 04 13:00:49 2011 -0500 summary: Add x64-temp to ignore, prepend a forward slash to "build/" to include PCbuild/ changes (for VS project files, etc). files: .hgignore | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/.hgignore b/.hgignore --- a/.hgignore +++ b/.hgignore @@ -5,7 +5,7 @@ Makefile.pre$ TAGS$ autom4te.cache$ -build/ +/build/ buildno$ config.cache config.log @@ -63,4 +63,5 @@ PCbuild/*.ncb PCbuild/*.bsc PCbuild/Win32-temp-* +PCbuild/x64-temp-* __pycache__ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 20:56:06 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 20:56:06 +0200 Subject: [Python-checkins] cpython: Ignore build/ and Doc/build Message-ID: http://hg.python.org/cpython/rev/739bed65e445 changeset: 69136:739bed65e445 user: Antoine Pitrou date: Mon Apr 04 20:52:50 2011 +0200 summary: Ignore build/ and Doc/build files: .hgignore | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/.hgignore b/.hgignore --- a/.hgignore +++ b/.hgignore @@ -5,7 +5,8 @@ Makefile.pre$ TAGS$ autom4te.cache$ -/build/ +^build/ +^Doc/build/ buildno$ config.cache config.log -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 20:56:09 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 20:56:09 +0200 Subject: [Python-checkins] cpython: Ignore 
AMD64 build files under Windows Message-ID: http://hg.python.org/cpython/rev/ef97e997aa02 changeset: 69137:ef97e997aa02 user: Antoine Pitrou date: Mon Apr 04 20:55:12 2011 +0200 summary: Ignore AMD64 build files under Windows files: .hgignore | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/.hgignore b/.hgignore --- a/.hgignore +++ b/.hgignore @@ -33,6 +33,7 @@ Modules/ld_so_aix$ Parser/pgen$ Parser/pgen.stamp$ +PCbuild/amd64/ ^core ^python-gdb.py ^python.exe-gdb.py -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 20:56:14 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 20:56:14 +0200 Subject: [Python-checkins] cpython: Ignore other MSVC by-products Message-ID: http://hg.python.org/cpython/rev/100561a0f093 changeset: 69138:100561a0f093 user: Antoine Pitrou date: Mon Apr 04 20:55:48 2011 +0200 summary: Ignore other MSVC by-products files: .hgignore | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/.hgignore b/.hgignore --- a/.hgignore +++ b/.hgignore @@ -64,6 +64,8 @@ PCbuild/*.o PCbuild/*.ncb PCbuild/*.bsc +PCbuild/*.user +PCbuild/*.suo PCbuild/Win32-temp-* PCbuild/x64-temp-* __pycache__ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 21:01:57 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 21:01:57 +0200 Subject: [Python-checkins] cpython: Issue #10791: Implement missing method GzipFile.read1(), allowing GzipFile Message-ID: http://hg.python.org/cpython/rev/9775d67c9af9 changeset: 69139:9775d67c9af9 user: Antoine Pitrou date: Mon Apr 04 21:00:37 2011 +0200 summary: Issue #10791: Implement missing method GzipFile.read1(), allowing GzipFile to be wrapped in a TextIOWrapper. Patch by Nadeem Vawda. 
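Note on the use case named in the summary above: with `read1()` available, a `GzipFile` can be handed to `io.TextIOWrapper` and read as text. A minimal sketch, assuming Python 3.3 or later and an in-memory payload; the helper name `gzip_text_lines` is hypothetical and not part of the patch.

```python
import gzip
import io


def gzip_text_lines(raw, encoding="ascii"):
    """Decompress gzip bytes and return the decoded text as a list of lines."""
    with gzip.GzipFile(fileobj=io.BytesIO(raw), mode="rb") as gz:
        # TextIOWrapper relies on the buffered-reader interface of the
        # object it wraps, which includes read1().
        with io.TextIOWrapper(gz, encoding=encoding) as text:
            return text.readlines()
```

Passing the result of `gzip.compress(b"a\nb\n")` to this helper should give back the two decoded lines, mirroring what the new `test_textio_readlines` test checks against a file on disk.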
files: Lib/gzip.py | 22 ++++++++++++++++++++++ Lib/test/test_gzip.py | 23 +++++++++++++++++++++++ Misc/NEWS | 3 +++ 3 files changed, 48 insertions(+), 0 deletions(-) diff --git a/Lib/gzip.py b/Lib/gzip.py --- a/Lib/gzip.py +++ b/Lib/gzip.py @@ -348,6 +348,28 @@ self.offset += size return chunk + def read1(self, size=-1): + self._check_closed() + if self.mode != READ: + import errno + raise IOError(errno.EBADF, "read1() on write-only GzipFile object") + + if self.extrasize <= 0 and self.fileobj is None: + return b'' + + try: + self._read() + except EOFError: + pass + if size < 0 or size > self.extrasize: + size = self.extrasize + + offset = self.offset - self.extrastart + chunk = self.extrabuf[offset: offset + size] + self.extrasize -= size + self.offset += size + return chunk + def peek(self, n): if self.mode != READ: import errno diff --git a/Lib/test/test_gzip.py b/Lib/test/test_gzip.py --- a/Lib/test/test_gzip.py +++ b/Lib/test/test_gzip.py @@ -64,6 +64,21 @@ d = f.read() self.assertEqual(d, data1*50) + def test_read1(self): + self.test_write() + blocks = [] + nread = 0 + with gzip.GzipFile(self.filename, 'r') as f: + while True: + d = f.read1() + if not d: + break + blocks.append(d) + nread += len(d) + # Check that position was updated correctly (see issue10791). + self.assertEqual(f.tell(), nread) + self.assertEqual(b''.join(blocks), data1 * 50) + def test_io_on_closed_object(self): # Test that I/O operations on closed GzipFile objects raise a # ValueError, just like the corresponding functions on file objects. @@ -323,6 +338,14 @@ self.assertEqual(f.read(100), b'') self.assertEqual(nread, len(uncompressed)) + def test_textio_readlines(self): + # Issue #10791: TextIOWrapper.readlines() fails when wrapping GzipFile. 
+ lines = (data1 * 50).decode("ascii").splitlines(True) + self.test_write() + with gzip.GzipFile(self.filename, 'r') as f: + with io.TextIOWrapper(f, encoding="ascii") as t: + self.assertEqual(t.readlines(), lines) + # Testing compress/decompress shortcut functions def test_compress(self): diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -87,6 +87,9 @@ Library ------- +- Issue #10791: Implement missing method GzipFile.read1(), allowing GzipFile + to be wrapped in a TextIOWrapper. Patch by Nadeem Vawda. + - Issue #11688: Add sqlite3.Connection.set_trace_callback(). Patch by Torsten Landschoff. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 21:09:09 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 21:09:09 +0200 Subject: [Python-checkins] cpython (3.2): Clarify that GzipFile.read1() isn't implemented. Message-ID: http://hg.python.org/cpython/rev/8a2639fdf433 changeset: 69140:8a2639fdf433 branch: 3.2 parent: 69133:5daf9a8dc4e8 user: Antoine Pitrou date: Mon Apr 04 21:06:20 2011 +0200 summary: Clarify that GzipFile.read1() isn't implemented. files: Doc/library/gzip.rst | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Doc/library/gzip.rst b/Doc/library/gzip.rst --- a/Doc/library/gzip.rst +++ b/Doc/library/gzip.rst @@ -72,7 +72,7 @@ :class:`GzipFile` supports the :class:`io.BufferedIOBase` interface, including iteration and the :keyword:`with` statement. Only the - :meth:`truncate` method isn't implemented. + :meth:`read1` and :meth:`truncate` methods aren't implemented. 
:class:`GzipFile` also provides the following method: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 21:09:10 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 21:09:10 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Clarify that GzipFile.read1() is now implemented Message-ID: http://hg.python.org/cpython/rev/4fa9bfa21a7e changeset: 69141:4fa9bfa21a7e parent: 69139:9775d67c9af9 parent: 69140:8a2639fdf433 user: Antoine Pitrou date: Mon Apr 04 21:09:05 2011 +0200 summary: Clarify that GzipFile.read1() is now implemented files: Doc/library/gzip.rst | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/Doc/library/gzip.rst b/Doc/library/gzip.rst --- a/Doc/library/gzip.rst +++ b/Doc/library/gzip.rst @@ -94,6 +94,9 @@ .. versionchanged:: 3.2 Support for unseekable files was added. + .. versionchanged:: 3.3 + The :meth:`io.BufferedIOBase.read1` method is now implemented. + .. function:: open(filename, mode='rb', compresslevel=9) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 22:00:51 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 22:00:51 +0200 Subject: [Python-checkins] cpython (3.1): Try to fix sporadic failure in test_thread/test_threading Message-ID: http://hg.python.org/cpython/rev/04b5cd2f8c87 changeset: 69142:04b5cd2f8c87 branch: 3.1 parent: 69132:36d92e923a1a user: Antoine Pitrou date: Mon Apr 04 21:59:09 2011 +0200 summary: Try to fix sporadic failure in test_thread/test_threading files: Lib/test/lock_tests.py | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/Lib/test/lock_tests.py b/Lib/test/lock_tests.py --- a/Lib/test/lock_tests.py +++ b/Lib/test/lock_tests.py @@ -141,7 +141,13 @@ # We run many threads in the hope that existing threads ids won't # be recycled. 
Bunch(f, 15).wait_for_finished() - self.assertEqual(n, len(threading.enumerate())) + if len(threading.enumerate()) != n: + # There is a small window during which a Thread instance's + # target function has finished running, but the Thread is still + # alive and registered. Avoid spurious failures by waiting a + # bit more (seen on a buildbot). + time.sleep(0.4) + self.assertEqual(n, len(threading.enumerate())) class LockTests(BaseLockTests): -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 22:00:52 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 22:00:52 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Try to fix sporadic failure in test_thread/test_threading Message-ID: http://hg.python.org/cpython/rev/8d5ea25d79d0 changeset: 69143:8d5ea25d79d0 branch: 3.2 parent: 69140:8a2639fdf433 parent: 69142:04b5cd2f8c87 user: Antoine Pitrou date: Mon Apr 04 22:00:10 2011 +0200 summary: Try to fix sporadic failure in test_thread/test_threading files: Lib/test/lock_tests.py | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/Lib/test/lock_tests.py b/Lib/test/lock_tests.py --- a/Lib/test/lock_tests.py +++ b/Lib/test/lock_tests.py @@ -149,7 +149,13 @@ # We run many threads in the hope that existing threads ids won't # be recycled. Bunch(f, 15).wait_for_finished() - self.assertEqual(n, len(threading.enumerate())) + if len(threading.enumerate()) != n: + # There is a small window during which a Thread instance's + # target function has finished running, but the Thread is still + # alive and registered. Avoid spurious failures by waiting a + # bit more (seen on a buildbot). 
+ time.sleep(0.4) + self.assertEqual(n, len(threading.enumerate())) def test_timeout(self): lock = self.locktype() -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 22:00:58 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 22:00:58 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Try to fix sporadic failure in test_thread/test_threading Message-ID: http://hg.python.org/cpython/rev/877e152c2eee changeset: 69144:877e152c2eee parent: 69141:4fa9bfa21a7e parent: 69143:8d5ea25d79d0 user: Antoine Pitrou date: Mon Apr 04 22:00:45 2011 +0200 summary: Try to fix sporadic failure in test_thread/test_threading files: Lib/test/lock_tests.py | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/Lib/test/lock_tests.py b/Lib/test/lock_tests.py --- a/Lib/test/lock_tests.py +++ b/Lib/test/lock_tests.py @@ -149,7 +149,13 @@ # We run many threads in the hope that existing threads ids won't # be recycled. Bunch(f, 15).wait_for_finished() - self.assertEqual(n, len(threading.enumerate())) + if len(threading.enumerate()) != n: + # There is a small window during which a Thread instance's + # target function has finished running, but the Thread is still + # alive and registered. Avoid spurious failures by waiting a + # bit more (seen on a buildbot). 
+ time.sleep(0.4) + self.assertEqual(n, len(threading.enumerate())) def test_timeout(self): lock = self.locktype() -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 23:13:51 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 04 Apr 2011 23:13:51 +0200 Subject: [Python-checkins] cpython: Issue #11619: _PyImport_LoadDynamicModule() doesn't encode the path to bytes Message-ID: http://hg.python.org/cpython/rev/1b7f484bab6e changeset: 69145:1b7f484bab6e user: Victor Stinner date: Mon Apr 04 23:05:53 2011 +0200 summary: Issue #11619: _PyImport_LoadDynamicModule() doesn't encode the path to bytes on Windows. files: Misc/NEWS | 3 ++ Python/dynload_win.c | 33 +++++++++++++++---------------- Python/importdl.c | 11 ++++++++++ 3 files changed, 30 insertions(+), 17 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,9 @@ Core and Builtins ----------------- +- Issue #11619: _PyImport_LoadDynamicModule() doesn't encode the path to bytes + on Windows. + - Issue #10998: Remove mentions of -Q, sys.flags.division_warning and Py_DivisionWarningFlag left over from Python 2. 
diff --git a/Python/dynload_win.c b/Python/dynload_win.c --- a/Python/dynload_win.c +++ b/Python/dynload_win.c @@ -171,8 +171,8 @@ return NULL; } -dl_funcptr _PyImport_GetDynLoadFunc(const char *shortname, - const char *pathname, FILE *fp) +dl_funcptr _PyImport_GetDynLoadWindows(const char *shortname, + PyObject *pathname, FILE *fp) { dl_funcptr p; char funcname[258], *import_python; @@ -185,8 +185,7 @@ { HINSTANCE hDLL = NULL; - char pathbuf[260]; - LPTSTR dummy; + wchar_t pathbuf[260]; unsigned int old_mode; ULONG_PTR cookie = 0; /* We use LoadLibraryEx so Windows looks for dependent DLLs @@ -198,14 +197,14 @@ /* Don't display a message box when Python can't load a DLL */ old_mode = SetErrorMode(SEM_FAILCRITICALERRORS); - if (GetFullPathName(pathname, - sizeof(pathbuf), - pathbuf, - &dummy)) { + if (GetFullPathNameW(PyUnicode_AS_UNICODE(pathname), + sizeof(pathbuf) / sizeof(pathbuf[0]), + pathbuf, + NULL)) { ULONG_PTR cookie = _Py_ActivateActCtx(); /* XXX This call doesn't exist in Windows CE */ - hDLL = LoadLibraryEx(pathname, NULL, - LOAD_WITH_ALTERED_SEARCH_PATH); + hDLL = LoadLibraryExW(PyUnicode_AS_UNICODE(pathname), NULL, + LOAD_WITH_ALTERED_SEARCH_PATH); _Py_DeactivateActCtx(cookie); } @@ -264,21 +263,21 @@ } else { char buffer[256]; + PyOS_snprintf(buffer, sizeof(buffer), #ifdef _DEBUG - PyOS_snprintf(buffer, sizeof(buffer), "python%d%d_d.dll", + "python%d%d_d.dll", #else - PyOS_snprintf(buffer, sizeof(buffer), "python%d%d.dll", + "python%d%d.dll", #endif PY_MAJOR_VERSION,PY_MINOR_VERSION); import_python = GetPythonImport(hDLL); if (import_python && strcasecmp(buffer,import_python)) { - PyOS_snprintf(buffer, sizeof(buffer), - "Module use of %.150s conflicts " - "with this version of Python.", - import_python); - PyErr_SetString(PyExc_ImportError,buffer); + PyErr_Format(PyExc_ImportError, + "Module use of %.150s conflicts " + "with this version of Python.", + import_python); FreeLibrary(hDLL); return NULL; } diff --git a/Python/importdl.c 
b/Python/importdl.c --- a/Python/importdl.c +++ b/Python/importdl.c @@ -12,8 +12,13 @@ #include "importdl.h" +#ifdef MS_WINDOWS +extern dl_funcptr _PyImport_GetDynLoadWindows(const char *shortname, + PyObject *pathname, FILE *fp); +#else extern dl_funcptr _PyImport_GetDynLoadFunc(const char *shortname, const char *pathname, FILE *fp); +#endif /* name should be ASCII only because the C language doesn't accept non-ASCII identifiers, and dynamic modules are written in C. */ @@ -22,7 +27,9 @@ _PyImport_LoadDynamicModule(PyObject *name, PyObject *path, FILE *fp) { PyObject *m; +#ifndef MS_WINDOWS PyObject *pathbytes; +#endif char *namestr, *lastdot, *shortname, *packagecontext, *oldcontext; dl_funcptr p0; PyObject* (*p)(void); @@ -48,12 +55,16 @@ shortname = lastdot+1; } +#ifdef MS_WINDOWS + p0 = _PyImport_GetDynLoadWindows(shortname, path, fp); +#else pathbytes = PyUnicode_EncodeFSDefault(path); if (pathbytes == NULL) return NULL; p0 = _PyImport_GetDynLoadFunc(shortname, PyBytes_AS_STRING(pathbytes), fp); Py_DECREF(pathbytes); +#endif p = (PyObject*(*)(void))p0; if (PyErr_Occurred()) return NULL; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 23:42:34 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 04 Apr 2011 23:42:34 +0200 Subject: [Python-checkins] cpython: Issue #11765: don't test time.sleep() in test_faulthandler Message-ID: http://hg.python.org/cpython/rev/8da8cd1ba9d9 changeset: 69146:8da8cd1ba9d9 user: Victor Stinner date: Mon Apr 04 23:42:30 2011 +0200 summary: Issue #11765: don't test time.sleep() in test_faulthandler time.time() and/or time.sleep() are not accurate on Windows, don't test them in test_faulthandler. Anyway, the check was written for an old implementation of dump_tracebacks_later(), it is not more needed. 
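Illustration only, not part of this changeset: the timing check being dropped can be sketched in isolation. The helper name `measure_sleep` is hypothetical. It takes `time.time()` before and after `time.sleep()`, as the old test did, and shows why asserting on the difference is fragile: on a coarse or adjusted clock the observed value may come out marginally below the requested pause.

```python
import time


def measure_sleep(pause):
    """Return the wall-clock time observed around time.sleep(pause).

    Hypothetical helper for illustration; it mirrors the two
    time.time() readings taken by the check that is being removed.
    """
    a = time.time()
    time.sleep(pause)
    b = time.time()
    return b - a


if __name__ == "__main__":
    pause = 0.05
    elapsed = measure_sleep(pause)
    # The observed value is not guaranteed to be >= pause on every
    # platform, so a strict assertion on it can fail spuriously.
    print("requested %.3f s, observed %.6f s" % (pause, elapsed))
```

A test that only needs the pause to happen can call `time.sleep()` directly and skip the measurement, which is the approach the summary above describes.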
files: Lib/test/test_faulthandler.py | 12 ++---------- 1 files changed, 2 insertions(+), 10 deletions(-) diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py --- a/Lib/test/test_faulthandler.py +++ b/Lib/test/test_faulthandler.py @@ -360,16 +360,8 @@ def func(repeat, cancel, timeout): if cancel: faulthandler.cancel_dump_tracebacks_later() - - pause = timeout * 2.5 - # on Windows XP, b-a gives 1.249931 after sleep(1.25) - min_pause = pause * 0.9 - a = time.time() - time.sleep(pause) - b = time.time() + time.sleep(timeout * 2.5) faulthandler.cancel_dump_tracebacks_later() - # Check that sleep() was not interrupted - assert (b - a) >= min_pause, "{{}} < {{}}".format(b - a, min_pause) timeout = {timeout} repeat = {repeat} @@ -400,7 +392,7 @@ else: count = 1 header = 'Thread 0x[0-9a-f]+:\n' - regex = expected_traceback(12, 27, header, count=count) + regex = expected_traceback(7, 19, header, count=count) self.assertRegex(trace, regex) else: self.assertEqual(trace, '') -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 01:37:19 2011 From: python-checkins at python.org (brett.cannon) Date: Tue, 05 Apr 2011 01:37:19 +0200 Subject: [Python-checkins] peps: Draft of PEP 399: Pure Python/C Accelerator Module Compatibiilty Requirements Message-ID: http://hg.python.org/peps/rev/7b9a5b01b479 changeset: 3855:7b9a5b01b479 user: Brett Cannon date: Mon Apr 04 16:37:07 2011 -0700 summary: Draft of PEP 399: Pure Python/C Accelerator Module Compatibiilty Requirements files: pep-0399.txt | 205 +++++++++++++++++++++++++++++++++++++++ 1 files changed, 205 insertions(+), 0 deletions(-) diff --git a/pep-0399.txt b/pep-0399.txt new file mode 100644 --- /dev/null +++ b/pep-0399.txt @@ -0,0 +1,205 @@ +PEP: 399 +Title: Pure Python/C Accelerator Module Compatibiilty Requirements +Version: $Revision: 88219 $ +Last-Modified: $Date: 2011-01-27 13:47:00 -0800 (Thu, 27 Jan 2011) $ +Author: Brett Cannon +Status: Draft +Type: Informational 
+Content-Type: text/x-rst +Created: 04-Apr-2011 +Python-Version: 3.3 +Post-History: + +Abstract +======== + +The Python standard library under CPython contains various instances +of modules implemented in both pure Python and C. This PEP requires +that in these instances that both the Python and C code *must* be +semantically identical (except in cases where implementation details +of a VM prevents it entirely). It is also required that new C-based +modules lacking a pure Python equivalent implementation get special +permissions to be added to the standard library. + + +Rationale +========= + +Python has grown beyond the CPython virtual machine (VM). IronPython_, +Jython_, and PyPy_ all currently being viable alternatives to the +CPython VM. This VM ecosystem that has sprung up around the Python +programming language has led to Python being used in many different +areas where CPython cannot be used, e.g., Jython allowing Python to be +used in Java applications. + +A problem all of the VMs other than CPython face is handling modules +from the standard library that are implemented in C. Since they do not +typically support the entire `C API of Python`_ they are unable to use +the code used to create the module. Often times this leads these other +VMs to either re-implement the modules in pure Python or in the +programming language used to implement the VM (e.g., in C# for +IronPython). This duplication of effort between CPython, PyPy, Jython, +and IronPython is extremely unfortunate as implementing a module *at +least* in pure Python would help mitigate this duplicate effort. + +The purpose of this PEP is to minimize this duplicate effort by +mandating that all new modules added to Python's standard library +*must* have a pure Python implementation _unless_ special dispensation +is given. This makes sure that a module in the stdlib is available to +all VMs and not just to CPython. 
+ +Re-implementing parts (or all) of a module in C (in the case +of CPython) is still allowed for performance reasons, but any such +accelerated code must semantically match the pure Python equivalent to +prevent divergence. To accomplish this, the pure Python and C code must +be thoroughly tested with the *same* test suite to verify compliance. +This is to prevent users from accidentally relying +on semantics that are specific to the C code and are not reflected in +the pure Python implementation that other VMs rely upon, e.g., in +CPython 3.2.0, ``heapq.heappop()`` raises different exceptions +depending on whether the accelerated C code is used or not:: + + from test.support import import_fresh_module + + c_heapq = import_fresh_module('heapq', fresh=['_heapq']) + py_heapq = import_fresh_module('heapq', blocked=['_heapq']) + + + class Spam: + """Tester class which defines no other magic methods but + __len__().""" + def __len__(self): + return 0 + + + try: + c_heapq.heappop(Spam()) + except TypeError: + # "heap argument must be a list" + pass + + try: + py_heapq.heappop(Spam()) + except AttributeError: + # "'Foo' object has no attribute 'pop'" + pass + +This kind of divergence is a problem for users as they unwittingly +write code that is CPython-specific. This is also an issue for other +VM teams as they have to deal with bug reports from users thinking +that they incorrectly implemented the module when in fact it was +caused by an untested case. + + +Details +======= + +Starting in Python 3.3, any modules added to the standard library must +have a pure Python implementation. This rule can only be ignored if +the Python development team grants a special exemption for the module. +Typically the exemption would be granted only when a module wraps a +specific C-based library (e.g., sqlite3_). 
In granting an exemption it +will be recognized that the module will most likely be considered +exclusive to CPython and not part of Python's standard library that +other VMs are expected to support. Usage of ``ctypes`` to provide an +API for a C library will continue to be frowned upon as ``ctypes`` +lacks compiler guarantees that C code typically relies upon to prevent +certain errors from occurring (e.g., API changes). + +Even though a pure Python implementation is mandated by this PEP, it +does not preclude the use of a companion acceleration module. If an +acceleration module is provided it is to be named the same as the +module it is accelerating with an underscore attached as a prefix, +e.g., ``_warnings`` for ``warnings``. The common pattern to access +the accelerated code from the pure Python implementation is to import +it with an ``import *``, e.g., ``from _warnings import *``. This is +typically done at the end of the module to allow it to overwrite +specific Python objects with their accelerated equivalents. This kind +of import can also be done before the end of the module when needed, +e.g., an accelerated base class is provided but is then subclassed by +Python code. This PEP does not mandate that pre-existing modules in +the stdlib that lack a pure Python equivalent gain such a module. But +if people do volunteer to provide and maintain a pure Python +equivalent (e.g., the PyPy team volunteering their pure Python +implementation of the ``csv`` module and maintaining it) then such +code will be accepted. + +Any accelerated code must be semantically identical to the pure Python +implementation. The only time any semantics are allowed to be +different are when technical details of the VM providing the +accelerated code prevent matching semantics from being possible, e.g., +a class being a ``type`` when implemented in C. 
The semantics +equivalence requirement also dictates that no public API be provided +in accelerated code that does not exist in the pure Python code. +Without this requirement people could accidentally come to rely on a +detail in the acclerated code which is not made available to other VMs +that use the pure Python implementation. To help verify that the +contract of semantic equivalence is being met, a module must be tested +both with and without its accelerated code as thoroughly as possible. + +As an example, to write tests which exercise both the pure Python and +C acclerated versions of a module, a basic idiom can be followed:: + + import collections.abc + from test.support import import_fresh_module, run_unittest + import unittest + + c_heapq = import_fresh_module('heapq', fresh=['_heapq']) + py_heapq = import_fresh_module('heapq', blocked=['_heapq']) + + + class ExampleTest(unittest.TestCase): + + def test_heappop_exc_for_non_MutableSequence(self): + # Raise TypeError when heap is not a + # collections.abc.MutableSequence. + class Spam: + """Test class lacking many ABC-required methods + (e.g., pop()).""" + def __len__(self): + return 0 + + heap = Spam() + self.assertFalse(isinstance(heap, + collections.abc.MutableSequence)) + with self.assertRaises(TypeError): + self.heapq.heappop(heap) + + + class AcceleratedExampleTest(ExampleTest): + + """Test using the acclerated code.""" + + heapq = c_heapq + + + class PyExampleTest(ExampleTest): + + """Test with just the pure Python code.""" + + heapq = py_heapq + + + def test_main(): + run_unittest(AcceleratedExampleTest, PyExampleTest) + + + if __name__ == '__main__': + test_main() + +Thoroughness of the test can be verified using coverage measurements +with branching coverage on the pure Python code to verify that all +possible scenarios are tested using (or not using) accelerator code. + + +Copyright +========= + +This document has been placed in the public domain. + + +.. _IronPython: http://ironpython.net/ +.. 
_Jython: http://www.jython.org/ +.. _PyPy: http://pypy.org/ +.. _C API of Python: http://docs.python.org/py3k/c-api/index.html +.. _sqlite3: http://docs.python.org/py3k/library/sqlite3.html -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Tue Apr 5 01:47:20 2011 From: python-checkins at python.org (brett.cannon) Date: Tue, 05 Apr 2011 01:47:20 +0200 Subject: [Python-checkins] peps: Fix a spelling mistake in the title of PEP 399 Message-ID: http://hg.python.org/peps/rev/359ccf54bc52 changeset: 3856:359ccf54bc52 user: Brett Cannon date: Mon Apr 04 16:47:09 2011 -0700 summary: Fix a spelling mistake in the title of PEP 399 files: pep-0399.txt | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/pep-0399.txt b/pep-0399.txt --- a/pep-0399.txt +++ b/pep-0399.txt @@ -1,5 +1,5 @@ PEP: 399 -Title: Pure Python/C Accelerator Module Compatibiilty Requirements +Title: Pure Python/C Accelerator Module Compatibility Requirements Version: $Revision: 88219 $ Last-Modified: $Date: 2011-01-27 13:47:00 -0800 (Thu, 27 Jan 2011) $ Author: Brett Cannon -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Tue Apr 5 01:48:20 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 05 Apr 2011 01:48:20 +0200 Subject: [Python-checkins] cpython: Issue #10785: Store the filename as Unicode in the Python parser. Message-ID: http://hg.python.org/cpython/rev/6e9dc970ac0e changeset: 69147:6e9dc970ac0e user: Victor Stinner date: Tue Apr 05 00:39:01 2011 +0200 summary: Issue #10785: Store the filename as Unicode in the Python parser. 
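Illustration only, not part of this changeset: a rough Python-level analogue of the conversion involved. In C, a bytes filename is decoded with `PyUnicode_DecodeFSDefault()`; in Python code, `os.fsdecode()` does a comparable job using the filesystem encoding. The helper name `as_unicode_filename` is an assumption made for this sketch and does not appear in CPython.

```python
import os


def as_unicode_filename(filename):
    """Return filename as str, decoding bytes with the filesystem encoding.

    Hypothetical helper: os.fsdecode() stands in for the role that
    PyUnicode_DecodeFSDefault() plays in the C code.
    """
    if isinstance(filename, bytes):
        return os.fsdecode(filename)
    return filename


if __name__ == "__main__":
    raw = os.fsencode("spam.py")     # bytes, as a C caller might pass it
    name = as_unicode_filename(raw)  # str, decoded once up front
    print(repr(raw), "->", repr(name))
```

Decoding once and keeping the resulting string avoids repeating the conversion at every place where the filename is later needed, for example when formatting an error message.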
files: Include/parsetok.h | 9 +++++- Makefile.pre.in | 7 +++-- Misc/NEWS | 2 + Modules/parsermodule.c | 1 + Parser/parsetok.c | 32 ++++++++++++++++++----- Parser/parsetok_pgen.c | 2 + Parser/tokenizer.c | 35 ++++++++++++++++--------- Parser/tokenizer.h | 8 +++++- Python/pythonrun.c | 40 ++++++++++++++++++------------ 9 files changed, 94 insertions(+), 42 deletions(-) diff --git a/Include/parsetok.h b/Include/parsetok.h --- a/Include/parsetok.h +++ b/Include/parsetok.h @@ -9,7 +9,10 @@ typedef struct { int error; - const char *filename; /* decoded from the filesystem encoding */ +#ifndef PGEN + /* The filename is useless for pgen, see comment in tok_state structure */ + PyObject *filename; +#endif int lineno; int offset; char *text; /* UTF-8-encoded string */ @@ -66,8 +69,10 @@ perrdetail *err_ret, int *flags); -/* Note that he following function is defined in pythonrun.c not parsetok.c. */ +/* Note that the following functions are defined in pythonrun.c, + not in parsetok.c */ PyAPI_FUNC(void) PyParser_SetError(perrdetail *); +PyAPI_FUNC(void) PyParser_ClearError(perrdetail *); #ifdef __cplusplus } diff --git a/Makefile.pre.in b/Makefile.pre.in --- a/Makefile.pre.in +++ b/Makefile.pre.in @@ -238,14 +238,13 @@ Parser/listnode.o \ Parser/node.o \ Parser/parser.o \ - Parser/parsetok.o \ Parser/bitset.o \ Parser/metagrammar.o \ Parser/firstsets.o \ Parser/grammar.o \ Parser/pgen.o -PARSER_OBJS= $(POBJS) Parser/myreadline.o Parser/tokenizer.o +PARSER_OBJS= $(POBJS) Parser/myreadline.o Parser/parsetok.o Parser/tokenizer.o PGOBJS= \ Objects/obmalloc.o \ @@ -254,10 +253,12 @@ Python/pyctype.o \ Parser/tokenizer_pgen.o \ Parser/printgrammar.o \ + Parser/parsetok_pgen.o \ Parser/pgenmain.o PARSER_HEADERS= \ Parser/parser.h \ + Include/parsetok.h \ Parser/tokenizer.h PGENOBJS= $(PGENMAIN) $(POBJS) $(PGOBJS) @@ -593,6 +594,7 @@ Parser/metagrammar.o: $(srcdir)/Parser/metagrammar.c Parser/tokenizer_pgen.o: $(srcdir)/Parser/tokenizer.c +Parser/parsetok_pgen.o: 
$(srcdir)/Parser/parsetok.c Parser/pgenmain.o: $(srcdir)/Include/parsetok.h @@ -700,7 +702,6 @@ Include/objimpl.h \ Include/opcode.h \ Include/osdefs.h \ - Include/parsetok.h \ Include/patchlevel.h \ Include/pgen.h \ Include/pgenheaders.h \ diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,8 @@ Core and Builtins ----------------- +- Issue #10785: Store the filename as Unicode in the Python parser. + - Issue #11619: _PyImport_LoadDynamicModule() doesn't encode the path to bytes on Windows. diff --git a/Modules/parsermodule.c b/Modules/parsermodule.c --- a/Modules/parsermodule.c +++ b/Modules/parsermodule.c @@ -584,6 +584,7 @@ else PyParser_SetError(&err); } + PyParser_ClearError(&err); return (res); } diff --git a/Parser/parsetok.c b/Parser/parsetok.c --- a/Parser/parsetok.c +++ b/Parser/parsetok.c @@ -13,7 +13,7 @@ /* Forward */ static node *parsetok(struct tok_state *, grammar *, int, perrdetail *, int *); -static void initerr(perrdetail *err_ret, const char* filename); +static int initerr(perrdetail *err_ret, const char* filename); /* Parse input coming from a string. Return error code, print some errors. */ node * @@ -48,7 +48,8 @@ struct tok_state *tok; int exec_input = start == file_input; - initerr(err_ret, filename); + if (initerr(err_ret, filename) < 0) + return NULL; if (*flags & PyPARSE_IGNORE_COOKIE) tok = PyTokenizer_FromUTF8(s, exec_input); @@ -59,7 +60,10 @@ return NULL; } - tok->filename = filename ? 
filename : ""; +#ifndef PGEN + Py_INCREF(err_ret->filename); + tok->filename = err_ret->filename; +#endif return parsetok(tok, g, start, err_ret, flags); } @@ -90,13 +94,17 @@ { struct tok_state *tok; - initerr(err_ret, filename); + if (initerr(err_ret, filename) < 0) + return NULL; if ((tok = PyTokenizer_FromFile(fp, (char *)enc, ps1, ps2)) == NULL) { err_ret->error = E_NOMEM; return NULL; } - tok->filename = filename; +#ifndef PGEN + Py_INCREF(err_ret->filename); + tok->filename = err_ret->filename; +#endif return parsetok(tok, g, start, err_ret, flags); } @@ -267,14 +275,24 @@ return n; } -static void +static int initerr(perrdetail *err_ret, const char *filename) { err_ret->error = E_OK; - err_ret->filename = filename; err_ret->lineno = 0; err_ret->offset = 0; err_ret->text = NULL; err_ret->token = -1; err_ret->expected = -1; +#ifndef PGEN + if (filename) + err_ret->filename = PyUnicode_DecodeFSDefault(filename); + else + err_ret->filename = PyUnicode_FromString(""); + if (err_ret->filename == NULL) { + err_ret->error = E_ERROR; + return -1; + } +#endif + return 0; } diff --git a/Parser/parsetok_pgen.c b/Parser/parsetok_pgen.c new file mode 100644 --- /dev/null +++ b/Parser/parsetok_pgen.c @@ -0,0 +1,2 @@ +#define PGEN +#include "parsetok.c" diff --git a/Parser/tokenizer.c b/Parser/tokenizer.c --- a/Parser/tokenizer.c +++ b/Parser/tokenizer.c @@ -128,7 +128,6 @@ tok->prompt = tok->nextprompt = NULL; tok->lineno = 0; tok->level = 0; - tok->filename = NULL; tok->altwarning = 1; tok->alterror = 1; tok->alttabsize = 1; @@ -140,6 +139,7 @@ tok->encoding = NULL; tok->cont_line = 0; #ifndef PGEN + tok->filename = NULL; tok->decoding_readline = NULL; tok->decoding_buffer = NULL; #endif @@ -545,7 +545,6 @@ { char *line = NULL; int badchar = 0; - PyObject *filename; for (;;) { if (tok->decoding_state == STATE_NORMAL) { /* We already have a codec associated with @@ -586,16 +585,12 @@ if (badchar) { /* Need to add 1 to the line number, since this line has not been counted, 
yet. */ - filename = PyUnicode_DecodeFSDefault(tok->filename); - if (filename != NULL) { - PyErr_Format(PyExc_SyntaxError, - "Non-UTF-8 code starting with '\\x%.2x' " - "in file %U on line %i, " - "but no encoding declared; " - "see http://python.org/dev/peps/pep-0263/ for details", - badchar, filename, tok->lineno + 1); - Py_DECREF(filename); - } + PyErr_Format(PyExc_SyntaxError, + "Non-UTF-8 code starting with '\\x%.2x' " + "in file %U on line %i, " + "but no encoding declared; " + "see http://python.org/dev/peps/pep-0263/ for details", + badchar, tok->filename, tok->lineno + 1); return error_ret(tok); } #endif @@ -853,6 +848,7 @@ #ifndef PGEN Py_XDECREF(tok->decoding_readline); Py_XDECREF(tok->decoding_buffer); + Py_XDECREF(tok->filename); #endif if (tok->fp != NULL && tok->buf != NULL) PyMem_FREE(tok->buf); @@ -1247,8 +1243,13 @@ return 1; } if (tok->altwarning) { - PySys_WriteStderr("%s: inconsistent use of tabs and spaces " +#ifdef PGEN + PySys_WriteStderr("inconsistent use of tabs and spaces " + "in indentation\n"); +#else + PySys_FormatStderr("%U: inconsistent use of tabs and spaces " "in indentation\n", tok->filename); +#endif tok->altwarning = 0; } return 0; @@ -1718,6 +1719,11 @@ fclose(fp); return NULL; } +#ifndef PGEN + tok->filename = PyUnicode_FromString(""); + if (tok->filename == NULL) + goto error; +#endif while (tok->lineno < 2 && tok->done == E_OK) { PyTokenizer_Get(tok, &p_start, &p_end); } @@ -1727,6 +1733,9 @@ if (encoding) strcpy(encoding, tok->encoding); } +#ifndef PGEN +error: +#endif PyTokenizer_Free(tok); return encoding; } diff --git a/Parser/tokenizer.h b/Parser/tokenizer.h --- a/Parser/tokenizer.h +++ b/Parser/tokenizer.h @@ -40,7 +40,13 @@ int level; /* () [] {} Parentheses nesting level */ /* Used to allow free continuations inside them */ /* Stuff for checking on different tab sizes */ - const char *filename; /* encoded to the filesystem encoding */ +#ifndef PGEN + /* pgen doesn't have access to Python codecs, it cannot decode the 
input + filename. The bytes filename might be kept, but it is only used by + indenterror() and it is not really needed: pgen only compiles one file + (Grammar/Grammar). */ + PyObject *filename; +#endif int altwarning; /* Issue warning if alternate tabs don't match */ int alterror; /* Issue error if alternate tabs don't match */ int alttabsize; /* Alternate tab spacing */ diff --git a/Python/pythonrun.c b/Python/pythonrun.c --- a/Python/pythonrun.c +++ b/Python/pythonrun.c @@ -62,6 +62,7 @@ static PyObject *run_pyc_file(FILE *, const char *, PyObject *, PyObject *, PyCompilerFlags *); static void err_input(perrdetail *); +static void err_free(perrdetail *); static void initsigs(void); static void call_py_exitfuncs(void); static void wait_for_thread_shutdown(void); @@ -1887,12 +1888,13 @@ flags->cf_flags |= iflags & PyCF_MASK; mod = PyAST_FromNode(n, flags, filename, arena); PyNode_Free(n); - return mod; } else { err_input(&err); - return NULL; + mod = NULL; } + err_free(&err); + return mod; } mod_ty @@ -1917,14 +1919,15 @@ flags->cf_flags |= iflags & PyCF_MASK; mod = PyAST_FromNode(n, flags, filename, arena); PyNode_Free(n); - return mod; } else { err_input(&err); if (errcode) *errcode = err.error; - return NULL; + mod = NULL; } + err_free(&err); + return mod; } /* Simplified interface to parsefile -- return node or set exception */ @@ -1938,6 +1941,7 @@ start, NULL, NULL, &err, flags); if (n == NULL) err_input(&err); + err_free(&err); return n; } @@ -1952,6 +1956,7 @@ start, &err, flags); if (n == NULL) err_input(&err); + err_free(&err); return n; } @@ -1964,6 +1969,7 @@ &_PyParser_Grammar, start, &err, flags); if (n == NULL) err_input(&err); + err_free(&err); return n; } @@ -1977,11 +1983,23 @@ even parser modules. 
*/ void +PyParser_ClearError(perrdetail *err) +{ + err_free(err); +} + +void PyParser_SetError(perrdetail *err) { err_input(err); } +static void +err_free(perrdetail *err) +{ + Py_CLEAR(err->filename); +} + /* Set the error appropriate to the given input error code (see errcode.h) */ static void @@ -1989,7 +2007,6 @@ { PyObject *v, *w, *errtype, *errtext; PyObject *msg_obj = NULL; - PyObject *filename; char *msg = NULL; errtype = PyExc_SyntaxError; @@ -2075,17 +2092,8 @@ errtext = PyUnicode_DecodeUTF8(err->text, strlen(err->text), "replace"); } - if (err->filename != NULL) - filename = PyUnicode_DecodeFSDefault(err->filename); - else { - Py_INCREF(Py_None); - filename = Py_None; - } - if (filename != NULL) - v = Py_BuildValue("(NiiN)", filename, - err->lineno, err->offset, errtext); - else - v = NULL; + v = Py_BuildValue("(OiiN)", err->filename, + err->lineno, err->offset, errtext); if (v != NULL) { if (msg_obj) w = Py_BuildValue("(OO)", msg_obj, v); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 01:48:35 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 05 Apr 2011 01:48:35 +0200 Subject: [Python-checkins] cpython: Issue #9319: Include the filename in "Non-UTF8 code ..." syntax error. Message-ID: http://hg.python.org/cpython/rev/7b8d625eb6e4 changeset: 69148:7b8d625eb6e4 user: Victor Stinner date: Tue Apr 05 01:48:03 2011 +0200 summary: Issue #9319: Include the filename in "Non-UTF8 code ..." syntax error. 
files: Lib/test/test_imp.py | 6 ++++ Misc/NEWS | 2 + Parser/tokenizer.c | 41 +++++++++++++++++++++---------- Parser/tokenizer.h | 1 - Python/import.c | 10 +++--- Python/traceback.c | 6 ++-- 6 files changed, 43 insertions(+), 23 deletions(-) diff --git a/Lib/test/test_imp.py b/Lib/test/test_imp.py --- a/Lib/test/test_imp.py +++ b/Lib/test/test_imp.py @@ -58,6 +58,12 @@ with imp.find_module('module_' + mod, self.test_path)[0] as fd: self.assertEqual(fd.encoding, encoding) + path = [os.path.dirname(__file__)] + self.assertRaisesRegex(SyntaxError, + r"Non-UTF-8 code starting with '\\xf6'" + r" in file .*badsyntax_pep3120.py", + imp.find_module, 'badsyntax_pep3120', path) + def test_issue1267(self): for mod, encoding, _ in self.test_strings: fp, filename, info = imp.find_module('module_' + mod, diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,8 @@ Core and Builtins ----------------- +- Issue #9319: Include the filename in "Non-UTF8 code ..." syntax error. + - Issue #10785: Store the filename as Unicode in the Python parser. - Issue #11619: _PyImport_LoadDynamicModule() doesn't encode the path to bytes diff --git a/Parser/tokenizer.c b/Parser/tokenizer.c --- a/Parser/tokenizer.c +++ b/Parser/tokenizer.c @@ -1690,17 +1690,18 @@ return result; } -/* Get -*- encoding -*- from a Python file. +/* Get the encoding of a Python file. Check for the coding cookie and check if + the file starts with a BOM. - PyTokenizer_FindEncoding returns NULL when it can't find the encoding in - the first or second line of the file (in which case the encoding - should be assumed to be PyUnicode_GetDefaultEncoding()). + PyTokenizer_FindEncodingFilename() returns NULL when it can't find the + encoding in the first or second line of the file (in which case the encoding + should be assumed to be UTF-8). - The char * returned is malloc'ed via PyMem_MALLOC() and thus must be freed - by the caller. 
-*/ + The char* returned is malloc'ed via PyMem_MALLOC() and thus must be freed + by the caller. */ + char * -PyTokenizer_FindEncoding(int fd) +PyTokenizer_FindEncodingFilename(int fd, PyObject *filename) { struct tok_state *tok; FILE *fp; @@ -1720,9 +1721,18 @@ return NULL; } #ifndef PGEN - tok->filename = PyUnicode_FromString(""); - if (tok->filename == NULL) - goto error; + if (filename != NULL) { + Py_INCREF(filename); + tok->filename = filename; + } + else { + tok->filename = PyUnicode_FromString(""); + if (tok->filename == NULL) { + fclose(fp); + PyTokenizer_Free(tok); + return encoding; + } + } #endif while (tok->lineno < 2 && tok->done == E_OK) { PyTokenizer_Get(tok, &p_start, &p_end); @@ -1733,13 +1743,16 @@ if (encoding) strcpy(encoding, tok->encoding); } -#ifndef PGEN -error: -#endif PyTokenizer_Free(tok); return encoding; } +char * +PyTokenizer_FindEncoding(int fd) +{ + return PyTokenizer_FindEncodingFilename(fd, NULL); +} + #ifdef Py_DEBUG void diff --git a/Parser/tokenizer.h b/Parser/tokenizer.h --- a/Parser/tokenizer.h +++ b/Parser/tokenizer.h @@ -75,7 +75,6 @@ extern int PyTokenizer_Get(struct tok_state *, char **, char **); extern char * PyTokenizer_RestoreEncoding(struct tok_state* tok, int len, int *offset); -extern char * PyTokenizer_FindEncoding(int); #ifdef __cplusplus } diff --git a/Python/import.c b/Python/import.c --- a/Python/import.c +++ b/Python/import.c @@ -124,12 +124,12 @@ /* See _PyImport_FixupExtensionObject() below */ static PyObject *extensions = NULL; +/* Function from Parser/tokenizer.c */ +extern char * PyTokenizer_FindEncodingFilename(int, PyObject *); + /* This table is defined in config.c: */ extern struct _inittab _PyImport_Inittab[]; -/* Method from Parser/tokenizer.c */ -extern char * PyTokenizer_FindEncoding(int); - struct _inittab *PyImport_Inittab = _PyImport_Inittab; /* these tables define the module suffixes that Python recognizes */ @@ -3540,9 +3540,9 @@ } if (fd != -1) { if (strchr(fdp->mode, 'b') == NULL) { - /* 
PyTokenizer_FindEncoding() returns PyMem_MALLOC'ed + /* PyTokenizer_FindEncodingFilename() returns PyMem_MALLOC'ed memory. */ - found_encoding = PyTokenizer_FindEncoding(fd); + found_encoding = PyTokenizer_FindEncodingFilename(fd, pathobj); lseek(fd, 0, 0); /* Reset position */ if (found_encoding == NULL && PyErr_Occurred()) { Py_XDECREF(pathobj); diff --git a/Python/traceback.c b/Python/traceback.c --- a/Python/traceback.c +++ b/Python/traceback.c @@ -18,8 +18,8 @@ #define MAX_FRAME_DEPTH 100 #define MAX_NTHREADS 100 -/* Method from Parser/tokenizer.c */ -extern char * PyTokenizer_FindEncoding(int); +/* Function from Parser/tokenizer.c */ +extern char * PyTokenizer_FindEncodingFilename(int, PyObject *); static PyObject * tb_dir(PyTracebackObject *self) @@ -251,7 +251,7 @@ /* use the right encoding to decode the file as unicode */ fd = PyObject_AsFileDescriptor(binary); - found_encoding = PyTokenizer_FindEncoding(fd); + found_encoding = PyTokenizer_FindEncodingFilename(fd, filename); encoding = (found_encoding != NULL) ? 
found_encoding : "utf-8"; lseek(fd, 0, 0); /* Reset position */ fob = PyObject_CallMethod(io, "TextIOWrapper", "Os", binary, encoding); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 02:30:07 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 05 Apr 2011 02:30:07 +0200 Subject: [Python-checkins] cpython: Issue #11768: add debug messages in test_threadsignals.test_signals Message-ID: http://hg.python.org/cpython/rev/d14eac872a46 changeset: 69149:d14eac872a46 user: Victor Stinner date: Tue Apr 05 02:29:30 2011 +0200 summary: Issue #11768: add debug messages in test_threadsignals.test_signals files: Lib/test/test_threadsignals.py | 8 ++++++++ 1 files changed, 8 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_threadsignals.py b/Lib/test/test_threadsignals.py --- a/Lib/test/test_threadsignals.py +++ b/Lib/test/test_threadsignals.py @@ -30,9 +30,14 @@ # a function that will be spawned as a separate thread. def send_signals(): + print("send_signals: enter (thread %s)" % thread.get_ident(), file=sys.stderr) + print("send_signals: raise SIGUSR1", file=sys.stderr) os.kill(process_pid, signal.SIGUSR1) + print("send_signals: raise SIGUSR2", file=sys.stderr) os.kill(process_pid, signal.SIGUSR2) + print("send_signals: release signalled_all", file=sys.stderr) signalled_all.release() + print("send_signals: exit (thread %s)" % thread.get_ident(), file=sys.stderr) class ThreadSignals(unittest.TestCase): @@ -41,9 +46,12 @@ # We spawn a thread, have the thread send two signals, and # wait for it to finish. Check that we got both signals # and that they were run by the main thread. 
+ print("test_signals: acquire lock (thread %s)" % thread.get_ident(), file=sys.stderr) signalled_all.acquire() self.spawnSignallingThread() + print("test_signals: wait lock (thread %s)" % thread.get_ident(), file=sys.stderr) signalled_all.acquire() + print("test_signals: lock acquired", file=sys.stderr) # the signals that we asked the kernel to send # will come back, but we don't know when. # (it might even be after the thread exits -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Tue Apr 5 04:55:50 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Tue, 05 Apr 2011 04:55:50 +0200 Subject: [Python-checkins] Daily reference leaks (d14eac872a46): sum=0 Message-ID: results for d14eac872a46 on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogrsldRs', '-x'] From python-checkins at python.org Tue Apr 5 11:34:06 2011 From: python-checkins at python.org (raymond.hettinger) Date: Tue, 05 Apr 2011 11:34:06 +0200 Subject: [Python-checkins] cpython: Issue #11707: Fast C version of functools.cmp_to_key() Message-ID: http://hg.python.org/cpython/rev/a03fb2fc3ed8 changeset: 69150:a03fb2fc3ed8 user: Raymond Hettinger date: Tue Apr 05 02:33:54 2011 -0700 summary: Issue #11707: Fast C version of functools.cmp_to_key() files: Lib/functools.py | 7 +- Lib/test/test_functools.py | 66 ++++++++++- Misc/NEWS | 3 + Modules/_functoolsmodule.c | 161 +++++++++++++++++++++++++ 4 files changed, 235 insertions(+), 2 deletions(-) diff --git a/Lib/functools.py b/Lib/functools.py --- a/Lib/functools.py +++ b/Lib/functools.py @@ -97,7 +97,7 @@ """Convert a cmp= function into a key= function""" class K(object): __slots__ = ['obj'] - def __init__(self, obj, *args): + def __init__(self, obj): self.obj = obj def __lt__(self, other): return mycmp(self.obj, other.obj) < 0 @@ -115,6 +115,11 @@ raise TypeError('hash not implemented') return K 
+try: + from _functools import cmp_to_key +except ImportError: + pass + _CacheInfo = namedtuple("CacheInfo", "hits misses maxsize currsize") def lru_cache(maxsize=100): diff --git a/Lib/test/test_functools.py b/Lib/test/test_functools.py --- a/Lib/test/test_functools.py +++ b/Lib/test/test_functools.py @@ -435,18 +435,81 @@ self.assertEqual(self.func(add, d), "".join(d.keys())) class TestCmpToKey(unittest.TestCase): + def test_cmp_to_key(self): + def cmp1(x, y): + return (x > y) - (x < y) + key = functools.cmp_to_key(cmp1) + self.assertEqual(key(3), key(3)) + self.assertGreater(key(3), key(1)) + def cmp2(x, y): + return int(x) - int(y) + key = functools.cmp_to_key(cmp2) + self.assertEqual(key(4.0), key('4')) + self.assertLess(key(2), key('35')) + + def test_cmp_to_key_arguments(self): + def cmp1(x, y): + return (x > y) - (x < y) + key = functools.cmp_to_key(mycmp=cmp1) + self.assertEqual(key(obj=3), key(obj=3)) + self.assertGreater(key(obj=3), key(obj=1)) + with self.assertRaises((TypeError, AttributeError)): + key(3) > 1 # rhs is not a K object + with self.assertRaises((TypeError, AttributeError)): + 1 < key(3) # lhs is not a K object + with self.assertRaises(TypeError): + key = functools.cmp_to_key() # too few args + with self.assertRaises(TypeError): + key = functools.cmp_to_key(cmp1, None) # too many args + key = functools.cmp_to_key(cmp1) + with self.assertRaises(TypeError): + key() # too few args + with self.assertRaises(TypeError): + key(None, None) # too many args + + def test_bad_cmp(self): + def cmp1(x, y): + raise ZeroDivisionError + key = functools.cmp_to_key(cmp1) + with self.assertRaises(ZeroDivisionError): + key(3) > key(1) + + class BadCmp: + def __lt__(self, other): + raise ZeroDivisionError + def cmp1(x, y): + return BadCmp() + with self.assertRaises(ZeroDivisionError): + key(3) > key(1) + + def test_obj_field(self): + def cmp1(x, y): + return (x > y) - (x < y) + key = functools.cmp_to_key(mycmp=cmp1) + self.assertEqual(key(50).obj, 50) + + def 
test_sort_int(self): def mycmp(x, y): return y - x self.assertEqual(sorted(range(5), key=functools.cmp_to_key(mycmp)), [4, 3, 2, 1, 0]) + def test_sort_int_str(self): + def mycmp(x, y): + x, y = int(x), int(y) + return (x > y) - (x < y) + values = [5, '3', 7, 2, '0', '1', 4, '10', 1] + values = sorted(values, key=functools.cmp_to_key(mycmp)) + self.assertEqual([int(value) for value in values], + [0, 1, 1, 2, 3, 4, 5, 7, 10]) + def test_hash(self): def mycmp(x, y): return y - x key = functools.cmp_to_key(mycmp) k = key(10) - self.assertRaises(TypeError, hash(k)) + self.assertRaises(TypeError, hash, k) class TestTotalOrdering(unittest.TestCase): @@ -655,6 +718,7 @@ def test_main(verbose=None): test_classes = ( + TestCmpToKey, TestPartial, TestPartialSubclass, TestPythonPartial, diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -97,6 +97,9 @@ - Issue #10791: Implement missing method GzipFile.read1(), allowing GzipFile to be wrapped in a TextIOWrapper. Patch by Nadeem Vawda. +- Issue #11707: Added a fast C version of functools.cmp_to_key(). + Patch by Filip Gruszczyński. + - Issue #11688: Add sqlite3.Connection.set_trace_callback(). Patch by Torsten Landschoff. 
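Archive note (not part of the original commit mail): the behaviour that the new tests in this changeset exercise can be sketched in a few lines of Python. The snippet below is only an illustration adapted from the test_sort_int_str and test_obj_field cases in the patch; the name compare_as_int is made up for this sketch, and it should behave the same whether the pure Python or the C implementation of cmp_to_key is in use.

```python
import functools


def compare_as_int(x, y):
    # Old-style three-way comparison: negative, zero or positive.
    # (Hypothetical helper, modelled on mycmp in the patch's tests.)
    x, y = int(x), int(y)
    return (x > y) - (x < y)


# cmp_to_key() wraps the comparison so it can be passed as key=.
key = functools.cmp_to_key(compare_as_int)

values = [5, '3', 7, 2, '0', '1', 4, '10', 1]
ordered = sorted(values, key=key)
as_ints = [int(value) for value in ordered]

# Each key object keeps the wrapped value in its "obj" attribute.
wrapped = key(50)
```

Here as_ints comes out in ascending numeric order even though the input mixes integers and strings, and wrapped.obj is the original value 50.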
diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c --- a/Modules/_functoolsmodule.c +++ b/Modules/_functoolsmodule.c @@ -330,6 +330,165 @@ }; +/* cmp_to_key ***************************************************************/ + +typedef struct { + PyObject_HEAD; + PyObject *cmp; + PyObject *object; +} keyobject; + +static void +keyobject_dealloc(keyobject *ko) +{ + Py_DECREF(ko->cmp); + Py_XDECREF(ko->object); + PyObject_FREE(ko); +} + +static int +keyobject_traverse(keyobject *ko, visitproc visit, void *arg) +{ + Py_VISIT(ko->cmp); + if (ko->object) + Py_VISIT(ko->object); + return 0; +} + +static PyMemberDef keyobject_members[] = { + {"obj", T_OBJECT, + offsetof(keyobject, object), 0, + PyDoc_STR("Value wrapped by a key function.")}, + {NULL} +}; + +static PyObject * +keyobject_call(keyobject *ko, PyObject *args, PyObject *kw); + +static PyObject * +keyobject_richcompare(PyObject *ko, PyObject *other, int op); + +static PyTypeObject keyobject_type = { + PyVarObject_HEAD_INIT(&PyType_Type, 0) + "functools.KeyWrapper", /* tp_name */ + sizeof(keyobject), /* tp_basicsize */ + 0, /* tp_itemsize */ + /* methods */ + (destructor)keyobject_dealloc, /* tp_dealloc */ + 0, /* tp_print */ + 0, /* tp_getattr */ + 0, /* tp_setattr */ + 0, /* tp_reserved */ + 0, /* tp_repr */ + 0, /* tp_as_number */ + 0, /* tp_as_sequence */ + 0, /* tp_as_mapping */ + 0, /* tp_hash */ + (ternaryfunc)keyobject_call, /* tp_call */ + 0, /* tp_str */ + PyObject_GenericGetAttr, /* tp_getattro */ + 0, /* tp_setattro */ + 0, /* tp_as_buffer */ + Py_TPFLAGS_DEFAULT, /* tp_flags */ + 0, /* tp_doc */ + (traverseproc)keyobject_traverse, /* tp_traverse */ + 0, /* tp_clear */ + keyobject_richcompare, /* tp_richcompare */ + 0, /* tp_weaklistoffset */ + 0, /* tp_iter */ + 0, /* tp_iternext */ + 0, /* tp_methods */ + keyobject_members, /* tp_members */ + 0, /* tp_getset */ +}; + +static PyObject * +keyobject_call(keyobject *ko, PyObject *args, PyObject *kwds) +{ + PyObject *object; + keyobject 
*result; + static char *kwargs[] = {"obj", NULL}; + + if (!PyArg_ParseTupleAndKeywords(args, kwds, "O:K", kwargs, &object)) + return NULL; + result = PyObject_New(keyobject, &keyobject_type); + if (!result) + return NULL; + Py_INCREF(ko->cmp); + result->cmp = ko->cmp; + Py_INCREF(object); + result->object = object; + return (PyObject *)result; +} + +static PyObject * +keyobject_richcompare(PyObject *ko, PyObject *other, int op) +{ + PyObject *res; + PyObject *args; + PyObject *x; + PyObject *y; + PyObject *compare; + PyObject *answer; + static PyObject *zero; + + if (zero == NULL) { + zero = PyLong_FromLong(0); + if (!zero) + return NULL; + } + + if (Py_TYPE(other) != &keyobject_type){ + PyErr_Format(PyExc_TypeError, "other argument must be K instance"); + return NULL; + } + compare = ((keyobject *) ko)->cmp; + assert(compare != NULL); + x = ((keyobject *) ko)->object; + y = ((keyobject *) other)->object; + if (!x || !y){ + PyErr_Format(PyExc_AttributeError, "object"); + return NULL; + } + + /* Call the user's comparison function and translate the 3-way + * result into true or false (or error). 
+ */ + args = PyTuple_New(2); + if (args == NULL) + return NULL; + Py_INCREF(x); + Py_INCREF(y); + PyTuple_SET_ITEM(args, 0, x); + PyTuple_SET_ITEM(args, 1, y); + res = PyObject_Call(compare, args, NULL); + Py_DECREF(args); + if (res == NULL) + return NULL; + answer = PyObject_RichCompare(res, zero, op); + Py_DECREF(res); + return answer; +} + +static PyObject * +functools_cmp_to_key(PyObject *self, PyObject *args, PyObject *kwds){ + PyObject *cmp; + static char *kwargs[] = {"mycmp", NULL}; + + if (!PyArg_ParseTupleAndKeywords(args, kwds, "O:cmp_to_key", kwargs, &cmp)) + return NULL; + keyobject *object = PyObject_New(keyobject, &keyobject_type); + if (!object) + return NULL; + Py_INCREF(cmp); + object->cmp = cmp; + object->object = NULL; + return (PyObject *)object; +} + +PyDoc_STRVAR(functools_cmp_to_key_doc, +"Convert a cmp= function into a key= function."); + /* reduce (used to be a builtin) ********************************************/ static PyObject * @@ -413,6 +572,8 @@ static PyMethodDef module_methods[] = { {"reduce", functools_reduce, METH_VARARGS, functools_reduce_doc}, + {"cmp_to_key", functools_cmp_to_key, METH_VARARGS | METH_KEYWORDS, + functools_cmp_to_key_doc}, {NULL, NULL} /* sentinel */ }; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 12:21:51 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 05 Apr 2011 12:21:51 +0200 Subject: [Python-checkins] cpython: Issue #11707: Fix compilation errors with Visual Studio Message-ID: http://hg.python.org/cpython/rev/76ed6a061ebe changeset: 69151:76ed6a061ebe user: Victor Stinner date: Tue Apr 05 12:21:35 2011 +0200 summary: Issue #11707: Fix compilation errors with Visual Studio Fix also a compiler (gcc) warning. 
files: Modules/_functoolsmodule.c | 14 ++++++++------ 1 files changed, 8 insertions(+), 6 deletions(-) diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c --- a/Modules/_functoolsmodule.c +++ b/Modules/_functoolsmodule.c @@ -333,7 +333,7 @@ /* cmp_to_key ***************************************************************/ typedef struct { - PyObject_HEAD; + PyObject_HEAD PyObject *cmp; PyObject *object; } keyobject; @@ -471,13 +471,15 @@ } static PyObject * -functools_cmp_to_key(PyObject *self, PyObject *args, PyObject *kwds){ - PyObject *cmp; +functools_cmp_to_key(PyObject *self, PyObject *args, PyObject *kwds) +{ + PyObject *cmp; static char *kwargs[] = {"mycmp", NULL}; + keyobject *object; if (!PyArg_ParseTupleAndKeywords(args, kwds, "O:cmp_to_key", kwargs, &cmp)) return NULL; - keyobject *object = PyObject_New(keyobject, &keyobject_type); + object = PyObject_New(keyobject, &keyobject_type); if (!object) return NULL; Py_INCREF(cmp); @@ -572,8 +574,8 @@ static PyMethodDef module_methods[] = { {"reduce", functools_reduce, METH_VARARGS, functools_reduce_doc}, - {"cmp_to_key", functools_cmp_to_key, METH_VARARGS | METH_KEYWORDS, - functools_cmp_to_key_doc}, + {"cmp_to_key", (PyCFunction)functools_cmp_to_key, + METH_VARARGS | METH_KEYWORDS, functools_cmp_to_key_doc}, {NULL, NULL} /* sentinel */ }; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 13:16:12 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 05 Apr 2011 13:16:12 +0200 Subject: [Python-checkins] cpython: Issue #11757: subprocess ensures that select() and poll() timeout >= 0 Message-ID: http://hg.python.org/cpython/rev/3664fc29e867 changeset: 69152:3664fc29e867 user: Victor Stinner date: Tue Apr 05 13:13:08 2011 +0200 summary: Issue #11757: subprocess ensures that select() and poll() timeout >= 0 files: Lib/subprocess.py | 33 +++++++++++++++++++-------------- 1 files changed, 19 insertions(+), 14 deletions(-) diff --git 
a/Lib/subprocess.py b/Lib/subprocess.py --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -817,15 +817,10 @@ if self._communication_started and input: raise ValueError("Cannot send input after starting communication") - if timeout is not None: - endtime = time.time() + timeout - else: - endtime = None - # Optimization: If we are not worried about timeouts, we haven't # started communicating, and we have one or zero pipes, using select() # or threads is unnecessary. - if (endtime is None and not self._communication_started and + if (timeout is None and not self._communication_started and [self.stdin, self.stdout, self.stderr].count(None) >= 2): stdout = None stderr = None @@ -840,14 +835,18 @@ stderr = self.stderr.read() self.stderr.close() self.wait() - return (stdout, stderr) + else: + if timeout is not None: + endtime = time.time() + timeout + else: + endtime = None - try: - stdout, stderr = self._communicate(input, endtime, timeout) - finally: - self._communication_started = True + try: + stdout, stderr = self._communicate(input, endtime, timeout) + finally: + self._communication_started = True - sts = self.wait(timeout=self._remaining_time(endtime)) + sts = self.wait(timeout=self._remaining_time(endtime)) return (stdout, stderr) @@ -1604,8 +1603,11 @@ self._input = self._input.encode(self.stdin.encoding) while self._fd2file: + timeout = self._remaining_time(endtime) + if timeout is not None and timeout < 0: + raise TimeoutExpired(self.args, orig_timeout) try: - ready = poller.poll(self._remaining_time(endtime)) + ready = poller.poll(timeout) except select.error as e: if e.args[0] == errno.EINTR: continue @@ -1664,10 +1666,13 @@ stderr = self._stderr_buff while self._read_set or self._write_set: + timeout = self._remaining_time(endtime) + if timeout is not None and timeout < 0: + raise TimeoutExpired(self.args, orig_timeout) try: (rlist, wlist, xlist) = \ select.select(self._read_set, self._write_set, [], - self._remaining_time(endtime)) + timeout) except 
select.error as e: if e.args[0] == errno.EINTR: continue -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 16:10:14 2011 From: python-checkins at python.org (ross.lagerwall) Date: Tue, 05 Apr 2011 16:10:14 +0200 Subject: [Python-checkins] cpython (2.7): Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. Message-ID: http://hg.python.org/cpython/rev/c10d55c51d81 changeset: 69153:c10d55c51d81 branch: 2.7 parent: 69125:f961e9179998 user: Ross Lagerwall date: Tue Apr 05 15:24:34 2011 +0200 summary: Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. files: Lib/subprocess.py | 45 ++++++++++++++++++------ Lib/test/test_subprocess.py | 18 ++++++++++ Misc/NEWS | 2 + 3 files changed, 54 insertions(+), 11 deletions(-) diff --git a/Lib/subprocess.py b/Lib/subprocess.py --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -396,6 +396,7 @@ import traceback import gc import signal +import errno # Exception classes used by this module. 
class CalledProcessError(Exception): @@ -427,7 +428,6 @@ else: import select _has_poll = hasattr(select, 'poll') - import errno import fcntl import pickle @@ -726,7 +726,11 @@ stderr = None if self.stdin: if input: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE and e.errno != errno.EINVAL: + raise self.stdin.close() elif self.stdout: stdout = self.stdout.read() @@ -956,7 +960,11 @@ if self.stdin: if input is not None: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE: + raise self.stdin.close() if self.stdout: @@ -1336,9 +1344,16 @@ for fd, mode in ready: if mode & select.POLLOUT: chunk = input[input_offset : input_offset + _PIPE_BUF] - input_offset += os.write(fd, chunk) - if input_offset >= len(input): - close_unregister_and_remove(fd) + try: + input_offset += os.write(fd, chunk) + except OSError as e: + if e.errno == errno.EPIPE: + close_unregister_and_remove(fd) + else: + raise + else: + if input_offset >= len(input): + close_unregister_and_remove(fd) elif mode & select_POLLIN_POLLPRI: data = os.read(fd, 4096) if not data: @@ -1377,11 +1392,19 @@ if self.stdin in wlist: chunk = input[input_offset : input_offset + _PIPE_BUF] - bytes_written = os.write(self.stdin.fileno(), chunk) - input_offset += bytes_written - if input_offset >= len(input): - self.stdin.close() - write_set.remove(self.stdin) + try: + bytes_written = os.write(self.stdin.fileno(), chunk) + except OSError as e: + if e.errno == errno.EPIPE: + self.stdin.close() + write_set.remove(self.stdin) + else: + raise + else: + input_offset += bytes_written + if input_offset >= len(input): + self.stdin.close() + write_set.remove(self.stdin) if self.stdout in rlist: data = os.read(self.stdout.fileno(), 1024) diff --git a/Lib/test/test_subprocess.py b/Lib/test/test_subprocess.py --- a/Lib/test/test_subprocess.py +++ b/Lib/test/test_subprocess.py @@ -597,6 +597,24 @@ 
self.assertFalse(os.path.exists(ofname)) self.assertFalse(os.path.exists(efname)) + def test_communicate_epipe(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE) + self.addCleanup(p.stdout.close) + self.addCleanup(p.stderr.close) + self.addCleanup(p.stdin.close) + p.communicate("x" * 2**20) + + def test_communicate_epipe_only_stdin(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE) + self.addCleanup(p.stdin.close) + time.sleep(2) + p.communicate("x" * 2**20) # context manager class _SuppressCoreFiles(object): diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -47,6 +47,8 @@ Library ------- +- Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. + - Issue #11662: Make urllib and urllib2 ignore redirections if the scheme is not HTTP, HTTPS or FTP (CVE-2011-1521). -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 16:10:21 2011 From: python-checkins at python.org (ross.lagerwall) Date: Tue, 05 Apr 2011 16:10:21 +0200 Subject: [Python-checkins] cpython (3.1): Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. Message-ID: http://hg.python.org/cpython/rev/158495d49f58 changeset: 69154:158495d49f58 branch: 3.1 parent: 69142:04b5cd2f8c87 user: Ross Lagerwall date: Tue Apr 05 15:34:00 2011 +0200 summary: Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. files: Lib/subprocess.py | 45 ++++++++++++++++++------ Lib/test/test_subprocess.py | 19 ++++++++++ Misc/NEWS | 2 + 3 files changed, 55 insertions(+), 11 deletions(-) diff --git a/Lib/subprocess.py b/Lib/subprocess.py --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -326,6 +326,7 @@ import traceback import gc import signal +import errno # Exception classes used by this module. 
class CalledProcessError(Exception): @@ -358,7 +359,6 @@ else: import select _has_poll = hasattr(select, 'poll') - import errno import fcntl import pickle @@ -699,7 +699,11 @@ stderr = None if self.stdin: if input: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE and e.errno != errno.EINVAL: + raise self.stdin.close() elif self.stdout: stdout = self.stdout.read() @@ -929,7 +933,11 @@ if self.stdin: if input is not None: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE: + raise self.stdin.close() if self.stdout: @@ -1290,9 +1298,16 @@ for fd, mode in ready: if mode & select.POLLOUT: chunk = input[input_offset : input_offset + _PIPE_BUF] - input_offset += os.write(fd, chunk) - if input_offset >= len(input): - close_unregister_and_remove(fd) + try: + input_offset += os.write(fd, chunk) + except OSError as e: + if e.errno == errno.EPIPE: + close_unregister_and_remove(fd) + else: + raise + else: + if input_offset >= len(input): + close_unregister_and_remove(fd) elif mode & select_POLLIN_POLLPRI: data = os.read(fd, 4096) if not data: @@ -1334,11 +1349,19 @@ if self.stdin in wlist: chunk = input[input_offset : input_offset + _PIPE_BUF] - bytes_written = os.write(self.stdin.fileno(), chunk) - input_offset += bytes_written - if input_offset >= len(input): - self.stdin.close() - write_set.remove(self.stdin) + try: + bytes_written = os.write(self.stdin.fileno(), chunk) + except OSError as e: + if e.errno == errno.EPIPE: + self.stdin.close() + write_set.remove(self.stdin) + else: + raise + else: + input_offset += bytes_written + if input_offset >= len(input): + self.stdin.close() + write_set.remove(self.stdin) if self.stdout in rlist: data = os.read(self.stdout.fileno(), 1024) diff --git a/Lib/test/test_subprocess.py b/Lib/test/test_subprocess.py --- a/Lib/test/test_subprocess.py +++ b/Lib/test/test_subprocess.py @@ -592,6 +592,25 @@ 
self.assertFalse(os.path.exists(ofname)) self.assertFalse(os.path.exists(efname)) + def test_communicate_epipe(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE) + self.addCleanup(p.stdout.close) + self.addCleanup(p.stderr.close) + self.addCleanup(p.stdin.close) + p.communicate(b"x" * 2**20) + + def test_communicate_epipe_only_stdin(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE) + self.addCleanup(p.stdin.close) + time.sleep(2) + p.communicate(b"x" * 2**20) + # # POSIX tests # diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -44,6 +44,8 @@ Library ------- +- Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. + - Issue #11696: Fix ID generation in msilib. - Issue #9696: Fix exception incorrectly raised by xdrlib.Packer.pack_int when -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 16:10:23 2011 From: python-checkins at python.org (ross.lagerwall) Date: Tue, 05 Apr 2011 16:10:23 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Merge with 3.1 Message-ID: http://hg.python.org/cpython/rev/a7363288c8d4 changeset: 69155:a7363288c8d4 branch: 3.2 parent: 69143:8d5ea25d79d0 parent: 69154:158495d49f58 user: Ross Lagerwall date: Tue Apr 05 15:48:47 2011 +0200 summary: Merge with 3.1 files: Lib/subprocess.py | 45 ++++++++++++++++++------ Lib/test/test_subprocess.py | 19 ++++++++++ Misc/NEWS | 2 + Objects/typeslots.inc | 2 +- 4 files changed, 56 insertions(+), 12 deletions(-) diff --git a/Lib/subprocess.py b/Lib/subprocess.py --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -345,6 +345,7 @@ import signal import builtins import warnings +import errno # Exception classes used by this module. 
class CalledProcessError(Exception): @@ -376,7 +377,6 @@ else: import select _has_poll = hasattr(select, 'poll') - import errno import fcntl import pickle @@ -785,7 +785,11 @@ stderr = None if self.stdin: if input: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE and e.errno != errno.EINVAL: + raise self.stdin.close() elif self.stdout: stdout = self.stdout.read() @@ -1019,7 +1023,11 @@ if self.stdin: if input is not None: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE: + raise self.stdin.close() if self.stdout: @@ -1455,9 +1463,16 @@ for fd, mode in ready: if mode & select.POLLOUT: chunk = input[input_offset : input_offset + _PIPE_BUF] - input_offset += os.write(fd, chunk) - if input_offset >= len(input): - close_unregister_and_remove(fd) + try: + input_offset += os.write(fd, chunk) + except OSError as e: + if e.errno == errno.EPIPE: + close_unregister_and_remove(fd) + else: + raise + else: + if input_offset >= len(input): + close_unregister_and_remove(fd) elif mode & select_POLLIN_POLLPRI: data = os.read(fd, 4096) if not data: @@ -1499,11 +1514,19 @@ if self.stdin in wlist: chunk = input[input_offset : input_offset + _PIPE_BUF] - bytes_written = os.write(self.stdin.fileno(), chunk) - input_offset += bytes_written - if input_offset >= len(input): - self.stdin.close() - write_set.remove(self.stdin) + try: + bytes_written = os.write(self.stdin.fileno(), chunk) + except OSError as e: + if e.errno == errno.EPIPE: + self.stdin.close() + write_set.remove(self.stdin) + else: + raise + else: + input_offset += bytes_written + if input_offset >= len(input): + self.stdin.close() + write_set.remove(self.stdin) if self.stdout in rlist: data = os.read(self.stdout.fileno(), 1024) diff --git a/Lib/test/test_subprocess.py b/Lib/test/test_subprocess.py --- a/Lib/test/test_subprocess.py +++ b/Lib/test/test_subprocess.py @@ -626,6 +626,25 @@ 
self.assertFalse(os.path.exists(ofname)) self.assertFalse(os.path.exists(efname)) + def test_communicate_epipe(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE) + self.addCleanup(p.stdout.close) + self.addCleanup(p.stderr.close) + self.addCleanup(p.stdin.close) + p.communicate(b"x" * 2**20) + + def test_communicate_epipe_only_stdin(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE) + self.addCleanup(p.stdin.close) + time.sleep(2) + p.communicate(b"x" * 2**20) + # context manager class _SuppressCoreFiles(object): diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -49,6 +49,8 @@ Library ------- +- Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. + - Issue #11746: Fix SSLContext.load_cert_chain() to accept elliptic curve private keys. 
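Archive note (not part of the original commit mail): the pattern this changeset applies around os.write() in subprocess.py can be sketched on its own. The following is a minimal illustration under stated assumptions, not the subprocess code itself: write_chunk and demo are hypothetical names, and the sketch assumes a POSIX system where Python ignores SIGPIPE, so writing to a pipe with no reader raises OSError with errno EPIPE.

```python
import errno
import os


def write_chunk(fd, chunk):
    """Write chunk to fd.

    Return the number of bytes written, or None if the reading side
    has gone away (EPIPE). Any other OSError is re-raised, as in the
    patch.
    """
    try:
        return os.write(fd, chunk)
    except OSError as e:
        if e.errno == errno.EPIPE:
            return None
        raise


def demo():
    # Create a pipe and close the read end to simulate a child
    # process that exited before consuming its input.
    read_end, write_end = os.pipe()
    os.close(read_end)
    try:
        return write_chunk(write_end, b"x" * 1024)
    finally:
        os.close(write_end)
```

In the patch the EPIPE branch closes stdin and removes it from the set of descriptors being written, so communicate() carries on reading stdout and stderr instead of propagating the error.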
diff --git a/Objects/typeslots.inc b/Objects/typeslots.inc --- a/Objects/typeslots.inc +++ b/Objects/typeslots.inc @@ -1,4 +1,4 @@ -/* Generated by typeslots.py $Revision: 87806 $ */ +/* Generated by typeslots.py $Revision$ */ 0, 0, offsetof(PyHeapTypeObject, as_mapping.mp_ass_subscript), -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 16:10:25 2011 From: python-checkins at python.org (ross.lagerwall) Date: Tue, 05 Apr 2011 16:10:25 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge with 3.2 Message-ID: http://hg.python.org/cpython/rev/c81ad4361c49 changeset: 69156:c81ad4361c49 parent: 69152:3664fc29e867 parent: 69155:a7363288c8d4 user: Ross Lagerwall date: Tue Apr 05 16:07:49 2011 +0200 summary: Merge with 3.2 files: Lib/subprocess.py | 45 ++++++++++++++++++------ Lib/test/test_subprocess.py | 19 ++++++++++ Misc/NEWS | 2 + 3 files changed, 55 insertions(+), 11 deletions(-) diff --git a/Lib/subprocess.py b/Lib/subprocess.py --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -348,6 +348,7 @@ import signal import builtins import warnings +import errno # Exception classes used by this module. class SubprocessError(Exception): pass @@ -396,7 +397,6 @@ else: import select _has_poll = hasattr(select, 'poll') - import errno import fcntl import pickle @@ -826,7 +826,11 @@ stderr = None if self.stdin: if input: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE and e.errno != errno.EINVAL: + raise self.stdin.close() elif self.stdout: stdout = self.stdout.read() @@ -1104,7 +1108,11 @@ if self.stdin: if input is not None: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE: + raise self.stdin.close() # Wait for the reader threads, or time out. 
If we time out, the @@ -1621,9 +1629,16 @@ if mode & select.POLLOUT: chunk = self._input[self._input_offset : self._input_offset + _PIPE_BUF] - self._input_offset += os.write(fd, chunk) - if self._input_offset >= len(self._input): - close_unregister_and_remove(fd) + try: + self._input_offset += os.write(fd, chunk) + except OSError as e: + if e.errno == errno.EPIPE: + close_unregister_and_remove(fd) + else: + raise + else: + if self._input_offset >= len(self._input): + close_unregister_and_remove(fd) elif mode & select_POLLIN_POLLPRI: data = os.read(fd, 4096) if not data: @@ -1691,11 +1706,19 @@ if self.stdin in wlist: chunk = self._input[self._input_offset : self._input_offset + _PIPE_BUF] - bytes_written = os.write(self.stdin.fileno(), chunk) - self._input_offset += bytes_written - if self._input_offset >= len(self._input): - self.stdin.close() - self._write_set.remove(self.stdin) + try: + bytes_written = os.write(self.stdin.fileno(), chunk) + except OSError as e: + if e.errno == errno.EPIPE: + self.stdin.close() + self._write_set.remove(self.stdin) + else: + raise + else: + self._input_offset += bytes_written + if self._input_offset >= len(self._input): + self.stdin.close() + self._write_set.remove(self.stdin) if self.stdout in rlist: data = os.read(self.stdout.fileno(), 1024) diff --git a/Lib/test/test_subprocess.py b/Lib/test/test_subprocess.py --- a/Lib/test/test_subprocess.py +++ b/Lib/test/test_subprocess.py @@ -720,6 +720,25 @@ self.assertFalse(os.path.exists(ofname)) self.assertFalse(os.path.exists(efname)) + def test_communicate_epipe(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE) + self.addCleanup(p.stdout.close) + self.addCleanup(p.stderr.close) + self.addCleanup(p.stdin.close) + p.communicate(b"x" * 2**20) + + def test_communicate_epipe_only_stdin(self): + # Issue 10963: communicate() should hide EPIPE + p = 
subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE) + self.addCleanup(p.stdin.close) + time.sleep(2) + p.communicate(b"x" * 2**20) + # context manager class _SuppressCoreFiles(object): diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -94,6 +94,8 @@ Library ------- +- Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. + - Issue #10791: Implement missing method GzipFile.read1(), allowing GzipFile to be wrapped in a TextIOWrapper. Patch by Nadeem Vawda. -- Repository URL: http://hg.python.org/cpython From ezio.melotti at gmail.com Tue Apr 5 16:53:04 2011 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Tue, 05 Apr 2011 17:53:04 +0300 Subject: [Python-checkins] peps: Draft of PEP 399: Pure Python/C Accelerator Module Compatibiilty Requirements In-Reply-To: References: Message-ID: <4D9B2CD0.7060002@gmail.com> Hi, On 05/04/2011 2.37, brett.cannon wrote: > http://hg.python.org/peps/rev/7b9a5b01b479 > changeset: 3855:7b9a5b01b479 > user: Brett Cannon > date: Mon Apr 04 16:37:07 2011 -0700 > summary: > Draft of PEP 399: Pure Python/C Accelerator Module Compatibiilty Requirements > > files: > pep-0399.txt | 205 +++++++++++++++++++++++++++++++++++++++ > 1 files changed, 205 insertions(+), 0 deletions(-) > > > diff --git a/pep-0399.txt b/pep-0399.txt > new file mode 100644 > --- /dev/null > +++ b/pep-0399.txt > @@ -0,0 +1,205 @@ > +PEP: 399 > +Title: Pure Python/C Accelerator Module Compatibiilty Requirements > +Version: $Revision: 88219 $ > +Last-Modified: $Date: 2011-01-27 13:47:00 -0800 (Thu, 27 Jan 2011) $ > +Author: Brett Cannon > +Status: Draft > +Type: Informational > +Content-Type: text/x-rst > +Created: 04-Apr-2011 > +Python-Version: 3.3 > +Post-History: > + > [...] > + > +Any accelerated code must be semantically identical to the pure Python > +implementation. 
The only time any semantics are allowed to be > +different are when technical details of the VM providing the > +accelerated code prevent matching semantics from being possible, e.g., > +a class being a ``type`` when implemented in C. The semantics > +equivalence requirement also dictates that no public API be provided > +in accelerated code that does not exist in the pure Python code. > +Without this requirement people could accidentally come to rely on a > +detail in the acclerated code which is not made available to other VMs s/acclerated/accelerated/ > +that use the pure Python implementation. To help verify that the > +contract of semantic equivalence is being met, a module must be tested > +both with and without its accelerated code as thoroughly as possible. > + > +As an example, to write tests which exercise both the pure Python and > +C acclerated versions of a module, a basic idiom can be followed:: ditto > + > + import collections.abc > + from test.support import import_fresh_module, run_unittest > + import unittest > + > + c_heapq = import_fresh_module('heapq', fresh=['_heapq']) > + py_heapq = import_fresh_module('heapq', blocked=['_heapq']) > + > + > + class ExampleTest(unittest.TestCase): > [...] > +Copyright > +========= > + > +This document has been placed in the public domain. > + > + > +.. _IronPython: http://ironpython.net/ > +.. _Jython: http://www.jython.org/ > +.. _PyPy: http://pypy.org/ > +.. _C API of Python: http://docs.python.org/py3k/c-api/index.html > +.. 
_sqlite3: http://docs.python.org/py3k/library/sqlite3.html > Best Regards, Ezio Melotti From python-checkins at python.org Tue Apr 5 18:13:19 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 05 Apr 2011 18:13:19 +0200 Subject: [Python-checkins] cpython (3.1): Try to fix sporadic test_multiprocessing failure Message-ID: http://hg.python.org/cpython/rev/7a1ef59d765b changeset: 69157:7a1ef59d765b branch: 3.1 parent: 69154:158495d49f58 user: Antoine Pitrou date: Tue Apr 05 18:11:33 2011 +0200 summary: Try to fix sporadic test_multiprocessing failure files: Lib/test/test_multiprocessing.py | 12 +++++++++++- 1 files changed, 11 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_multiprocessing.py b/Lib/test/test_multiprocessing.py --- a/Lib/test/test_multiprocessing.py +++ b/Lib/test/test_multiprocessing.py @@ -12,6 +12,7 @@ import sys import os import gc +import errno import signal import array import socket @@ -1277,7 +1278,16 @@ manager.shutdown() manager = QueueManager( address=addr, authkey=authkey, serializer=SERIALIZER) - manager.start() + try: + manager.start() + except IOError as e: + if e.errno != errno.EADDRINUSE: + raise + # Retry after some time, in case the old socket was lingering + # (sporadic failure on buildbots) + time.sleep(1.0) + manager = QueueManager( + address=addr, authkey=authkey, serializer=SERIALIZER) manager.shutdown() # -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 18:13:20 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 05 Apr 2011 18:13:20 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Try to fix sporadic test_multiprocessing failure Message-ID: http://hg.python.org/cpython/rev/8a65e6aff672 changeset: 69158:8a65e6aff672 branch: 3.2 parent: 69155:a7363288c8d4 parent: 69157:7a1ef59d765b user: Antoine Pitrou date: Tue Apr 05 18:12:15 2011 +0200 summary: Try to fix sporadic test_multiprocessing failure files: 
Lib/test/test_multiprocessing.py | 12 +++++++++++- 1 files changed, 11 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_multiprocessing.py b/Lib/test/test_multiprocessing.py --- a/Lib/test/test_multiprocessing.py +++ b/Lib/test/test_multiprocessing.py @@ -11,6 +11,7 @@ import sys import os import gc +import errno import signal import array import socket @@ -1359,7 +1360,16 @@ manager.shutdown() manager = QueueManager( address=addr, authkey=authkey, serializer=SERIALIZER) - manager.start() + try: + manager.start() + except IOError as e: + if e.errno != errno.EADDRINUSE: + raise + # Retry after some time, in case the old socket was lingering + # (sporadic failure on buildbots) + time.sleep(1.0) + manager = QueueManager( + address=addr, authkey=authkey, serializer=SERIALIZER) manager.shutdown() # -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 18:13:24 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 05 Apr 2011 18:13:24 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Try to fix sporadic test_multiprocessing failure Message-ID: http://hg.python.org/cpython/rev/a9371cf1cc61 changeset: 69159:a9371cf1cc61 parent: 69156:c81ad4361c49 parent: 69158:8a65e6aff672 user: Antoine Pitrou date: Tue Apr 05 18:13:06 2011 +0200 summary: Try to fix sporadic test_multiprocessing failure files: Lib/test/test_multiprocessing.py | 12 +++++++++++- 1 files changed, 11 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_multiprocessing.py b/Lib/test/test_multiprocessing.py --- a/Lib/test/test_multiprocessing.py +++ b/Lib/test/test_multiprocessing.py @@ -11,6 +11,7 @@ import sys import os import gc +import errno import signal import array import socket @@ -1371,7 +1372,16 @@ manager.shutdown() manager = QueueManager( address=addr, authkey=authkey, serializer=SERIALIZER) - manager.start() + try: + manager.start() + except IOError as e: + if e.errno != errno.EADDRINUSE: + raise + # Retry after some 
time, in case the old socket was lingering + # (sporadic failure on buildbots) + time.sleep(1.0) + manager = QueueManager( + address=addr, authkey=authkey, serializer=SERIALIZER) manager.shutdown() # -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 19:40:55 2011 From: python-checkins at python.org (ezio.melotti) Date: Tue, 05 Apr 2011 19:40:55 +0200 Subject: [Python-checkins] cpython (2.7): #7311: fix HTMLParser to accept non-ASCII attribute values. Message-ID: http://hg.python.org/cpython/rev/7d4dea76c476 changeset: 69160:7d4dea76c476 branch: 2.7 parent: 69153:c10d55c51d81 user: Ezio Melotti date: Tue Apr 05 20:40:52 2011 +0300 summary: #7311: fix HTMLParser to accept non-ASCII attribute values. files: Lib/HTMLParser.py | 2 +- Lib/test/test_htmlparser.py | 17 +++++++++++++++++ Misc/NEWS | 2 ++ 3 files changed, 20 insertions(+), 1 deletions(-) diff --git a/Lib/HTMLParser.py b/Lib/HTMLParser.py --- a/Lib/HTMLParser.py +++ b/Lib/HTMLParser.py @@ -26,7 +26,7 @@ tagfind = re.compile('[a-zA-Z][-.a-zA-Z0-9:_]*') attrfind = re.compile( r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*' - r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$_#=~@]*))?') + r'(\'[^\']*\'|"[^"]*"|[^\s"\'=<>`]*))?') locatestarttagend = re.compile(r""" <[a-zA-Z][-.a-zA-Z0-9:_]* # tag name diff --git a/Lib/test/test_htmlparser.py b/Lib/test/test_htmlparser.py --- a/Lib/test/test_htmlparser.py +++ b/Lib/test/test_htmlparser.py @@ -208,6 +208,23 @@ ("starttag", "a", [("href", "mailto:xyz at example.com")]), ]) + def test_attr_nonascii(self): + # see issue 7311 + self._run_check(u" $\u4e2d\u6587$ ", [ + ("starttag", "img", [("src", "/foo/bar.png"), + ("alt", u"\u4e2d\u6587")]), + ]) + self._run_check(u"", [ + ("starttag", "a", [("title", u"\u30c6\u30b9\u30c8"), + ("href", u"\u30c6\u30b9\u30c8.html")]), + ]) + self._run_check(u'', [ + ("starttag", "a", [("title", u"\u30c6\u30b9\u30c8"), + ("href", u"\u30c6\u30b9\u30c8.html")]), + ]) + def 
test_attr_entity_replacement(self): self._run_check("""""", [ ("starttag", "a", [("b", "&><\"'")]), diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -47,6 +47,8 @@ Library ------- +- Issue #7311: fix HTMLParser to accept non-ASCII attribute values. + - Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. - Issue #11662: Make urllib and urllib2 ignore redirections if the -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 20:47:51 2011 From: python-checkins at python.org (barry.warsaw) Date: Tue, 05 Apr 2011 20:47:51 +0200 Subject: [Python-checkins] peps: PEP 396. Message-ID: http://hg.python.org/peps/rev/1857fe1e65ab changeset: 3857:1857fe1e65ab parent: 3854:f65beac56930 user: Barry Warsaw date: Tue Apr 05 14:47:18 2011 -0400 summary: PEP 396. files: pep-0396.txt | 311 +++++++++++++++++++++++++++++++++++++++ 1 files changed, 311 insertions(+), 0 deletions(-) diff --git a/pep-0396.txt b/pep-0396.txt new file mode 100644 --- /dev/null +++ b/pep-0396.txt @@ -0,0 +1,311 @@ +PEP: 396 +Title: Module Version Numbers +Version: $Revision: 65628 $ +Last-Modified: $Date: 2008-08-10 09:59:20 -0400 (Sun, 10 Aug 2008) $ +Author: Barry Warsaw +Status: Draft +Type: Informational +Content-Type: text/x-rst +Created: 2011-03-16 +Post-History: + + +Abstract +======== + +Given that it is useful and common to specify version numbers for +Python modules, and given that different ways of doing this have grown +organically within the Python community, it is useful to establish +standard conventions for module authors to adhere to and reference. +This informational PEP describes best practices for Python module +authors who want to define the version number of their Python module. + +Conformance with this PEP is optional, however other Python tools +(such as ``distutils2`` [1]_) may be adapted to use the conventions +defined here. 
+ + +User Stories +============ + +Alice is writing a new module, called ``alice``, which she wants to +share with other Python developers. ``alice`` is a simple module and +lives in one file, ``alice.py``. Alice wants to specify a version +number so that her users can tell which version they are using. +Because her module lives entirely in one file, she wants to add the +version number to that file. + +Bob has written a module called ``bob`` which he has shared with many +users. ``bob.py`` contains a version number for the convenience of +his users. Bob learns about the Cheeseshop [2]_, and adds some simple +packaging using classic distutils so that he can upload *The Bob +Bundle* to the Cheeseshop. Because ``bob.py`` already specifies a +version number which his users can access programmatically, he wants +the same API to continue to work even though his users now get it from +the Cheeseshop. + +Carole maintains several namespace packages, each of which are +independently developed and distributed. In order for her users to +properly specify dependencies on the right versions of her packages, +she specifies the version numbers in the namespace package's +``setup.py`` file. Because Carol wants to have to update one version +number per package, she specifies the version number in her module and +has the ``setup.py`` extract the module version number when she builds +the *sdist* archive. + +David maintains a package in the standard library, and also produces +standalone versions for other versions of Python. The standard +library copy defines the version number in the module, and this same +version number is used for the standalone distributions as well. + + +Rationale +========= + +Python modules, both in the standard library and available from third +parties, have long included version numbers. There are established +de-facto standards for describing version numbers, and many ad-hoc +ways have grown organically over the years. 
Often, version numbers +can be retrieved from a module programmatically, by importing the +module and inspecting an attribute. Classic Python distutils +``setup()`` functions [3]_ describe a ``version`` argument where the +release's version number can be specified. PEP 8 [4]_ describes the +use of a module attribute called ``__version__`` for recording +"Subversion, CVS, or RCS" version strings using keyword expansion. In +the PEP author's own email archives, the earliest example of the use +of an ``__version__`` module attribute by independent module +developers dates back to 1995. + +Another example of version information is the sqlite3 [5]_ library +with its ``sqlite_version_info``, ``version``, and ``version_info`` +attributes. It may not be immediately obvious which attribute +contains a version number for the module, and which contains a version +number for the underlying SQLite3 library. + +This informational PEP codifies established practice, and recommends +standard ways of describing module version numbers, along with some +use cases for when -- and when *not* -- to include them. Its adoption +by module authors is purely voluntary; packaging tools in the standard +library will provide optional support for the standards defined +herein, and other tools in the Python universe may comply as well. + + +Specification +============= + +#. In general, modules in the standard library SHOULD NOT have version + numbers. They implicitly carry the version number of the Python + release they are included in. + +#. On a case-by-case basis, standard library modules which are also + released in standalone form for other Python versions MAY include a + module version number when included in the standard library, and + SHOULD include a version number when packaged separately. + +#. When a module includes a version number, it SHOULD be available in + the ``__version__`` attribute on that module. + +#. 
For modules which are also packages, the module namespace SHOULD + include the ``__version__`` attribute. + +#. For modules which live inside a namespace package, the sub-package + name SHOULD include the ``__version__`` attribute. The namespace + module itself SHOULD NOT include its own ``__version__`` attribute. + +#. The ``__version__`` attribute's value SHOULD be a string. + +#. Module version numbers SHOULD conform to the normalized version + format specified in PEP 386 [6]_. + +#. Module version numbers SHOULD NOT contain version control system + supplied revision numbers, or any other semantically different + version numbers (e.g. underlying library version number). + +#. Wherever a ``__version__`` attribute exists, a module MAY also + include a ``__version_info__`` attribute, containing a tuple + representation of the module version number, for easy comparisons. + +#. ``__version_info__`` SHOULD be of the format returned by PEP 386's + ``parse_version()`` function. + +#. The ``version`` attribute in a classic distutils ``setup.py`` + file, or the PEP 345 [7]_ ``Version`` metadata field SHOULD be + derived from the ``__version__`` field, or vice versa. 
+ + +Examples +======== + +Retrieving the version number from a third party package:: + + >>> import bzrlib + >>> bzrlib.__version__ + '2.3.0' + +Retrieving the version number from a standard library package that is +also distributed as a standalone module:: + + >>> import email + >>> email.__version__ + '5.1.0' + +Version numbers for namespace packages:: + + >>> import flufl.i18n + >>> import flufl.enum + >>> import flufl.lock + + >>> print flufl.i18n.__version__ + 1.0.4 + >>> print flufl.enum.__version__ + 3.1 + >>> print flufl.lock.__version__ + 2.1 + + >>> import flufl + >>> flufl.__version__ + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + AttributeError: 'module' object has no attribute '__version__' + >>> + + +Deriving +======== + +Module version numbers can appear in at least two places, and +sometimes more. For example, in accordance with this PEP, they are +available programmatically on the module's ``__version__`` attribute. +In a classic distutils ``setup.py`` file, the ``setup()`` function +takes a ``version`` argument, while the distutils2 ``setup.cfg`` file +has a ``version`` key. The version number must also get into the PEP +345 metadata, preferably when the *sdist* archive is built. It's +desirable for module authors to only have to specify the version +number once, and have all the other uses derive from this single +definition. + +While there are any number of ways this could be done, this section +describes one possible approach, for each scenario.
+ +Let's say Elle adds this attribute to her module file ``elle.py``:: + + __version__ = '3.1.1' + + +Classic distutils +----------------- + +In classic distutils, the simplest way to add the version string to +the ``setup()`` function in ``setup.py`` is to do something like +this:: + + from elle import __version__ + setup(name='elle', version=__version__) + +In the PEP author's experience however, this can fail in some cases, +such as when the module uses automatic Python 3 conversion via the +``2to3`` program (because ``setup.py`` is executed by Python 3 before +the ``elle`` module has been converted). + +In that case, it's not much more difficult to write a little code to +parse the ``__version__`` from the file rather than importing it:: + + import re + DEFAULT_VERSION_RE = re.compile(r'(?P<version>\d+\.\d(?:\.\d+)?)') + + def get_version(filename, pattern=None): + if pattern is None: + cre = DEFAULT_VERSION_RE + else: + cre = re.compile(pattern) + with open(filename) as fp: + for line in fp: + if line.startswith('__version__'): + mo = cre.search(line) + assert mo, 'No valid __version__ string found' + return mo.group('version') + raise AssertionError('No __version__ assignment found') + + setup(name='elle', version=get_version('elle.py')) + + +Distutils2 +---------- + +Because the distutils2 style ``setup.cfg`` is declarative, we can't +run any code to extract the ``__version__`` attribute, either via +import or via parsing. This PEP suggests a special key be added to +the ``[metadata]`` section of the ``setup.cfg`` file to indicate "get +the version from this file". Something like this might work:: + + [metadata] + version-from-file: elle.py + +where ``parse`` means to use a parsing method similar to the above, on +the file named after the colon. The exact recipe for doing this will +be discussed in the appropriate distutils2 development forum.
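The parse-instead-of-import approach from the classic distutils section can be tried end to end with the sketch below. It is a variation on the PEP's `get_version()`, not a copy: the named group `version` is spelled out, the minor component uses `\d+` so that two-digit minors are captured, and `ValueError` replaces the assertions. The `demo` function and the temporary `elle.py` file are assumptions made only for this illustration.

```python
import os
import re
import tempfile

# Named group 'version'; \d+ for the minor part is a deliberate change
# from the pattern quoted in the PEP draft.
VERSION_RE = re.compile(r'(?P<version>\d+\.\d+(?:\.\d+)?)')


def get_version(filename, pattern=VERSION_RE):
    """Return the version string from the first __version__ line."""
    with open(filename) as fp:
        for line in fp:
            if line.startswith('__version__'):
                mo = pattern.search(line)
                if mo is None:
                    raise ValueError('no valid __version__ string found')
                return mo.group('version')
    raise ValueError('no __version__ assignment found')


def demo():
    # Write a throwaway module and parse it without importing it.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'elle.py')
        with open(path, 'w') as fp:
            fp.write('"""Elle."""\n__version__ = \'3.1.1\'\n')
        return get_version(path)
```

Because the file is only read as text, this works even when the module cannot be imported by the interpreter running `setup.py`, which is the 2to3 situation the PEP describes.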
+ +An alternative is to only define the version number in ``setup.cfg`` +and use the ``pkgutil`` module [8]_ to make it available +programmatically. E.g. in ``elle.py``:: + + from distutils2._backport import pkgutil + __version__ = pkgutil.get_distribution('elle').metadata['version'] + + +PEP 376 metadata +================ + +PEP 376 [9]_ defines a standard for static metadata, but doesn't +describe the process by which this metadata gets created. It is +highly desirable for the derived version information to be placed into +the PEP 376 ``.dist-info`` metadata at build-time rather than +install-time. This way, the metadata will be available for +introspection even when the code is not installed. + + +References +========== + +.. [1] Distutils2 documentation + (http://distutils2.notmyidea.org/) + +.. [2] The Cheeseshop (Python Package Index) + (http://pypi.python.org) + +.. [3] http://docs.python.org/distutils/setupscript.html + +.. [4] PEP 8, Style Guide for Python Code + (http://www.python.org/dev/peps/pep-0008) + +.. [5] sqlite3 module documentation + (http://docs.python.org/library/sqlite3.html) + +.. [6] PEP 386, Changing the version comparison module in Distutils + (http://www.python.org/dev/peps/pep-0386/) + +.. [7] PEP 345, Metadata for Python Software Packages 1.2 + (http://www.python.org/dev/peps/pep-0345/#version) + +.. [8] pkgutil - Package utilities + (http://distutils2.notmyidea.org/library/pkgutil.html) + +.. [9] PEP 376, Database of Installed Python Distributions + (http://www.python.org/dev/peps/pep-0376/) + + +Copyright +========= + +This document has been placed in the public domain. + + + +.. 
+ Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Tue Apr 5 20:47:52 2011 From: python-checkins at python.org (barry.warsaw) Date: Tue, 05 Apr 2011 20:47:52 +0200 Subject: [Python-checkins] peps (merge default -> default): merge Message-ID: http://hg.python.org/peps/rev/fc65dddc2af3 changeset: 3858:fc65dddc2af3 parent: 3857:1857fe1e65ab parent: 3856:359ccf54bc52 user: Barry Warsaw date: Tue Apr 05 14:47:46 2011 -0400 summary: merge files: pep-0399.txt | 205 +++++++++++++++++++++++++++++++++++++++ 1 files changed, 205 insertions(+), 0 deletions(-) diff --git a/pep-0399.txt b/pep-0399.txt new file mode 100644 --- /dev/null +++ b/pep-0399.txt @@ -0,0 +1,205 @@ +PEP: 399 +Title: Pure Python/C Accelerator Module Compatibility Requirements +Version: $Revision: 88219 $ +Last-Modified: $Date: 2011-01-27 13:47:00 -0800 (Thu, 27 Jan 2011) $ +Author: Brett Cannon +Status: Draft +Type: Informational +Content-Type: text/x-rst +Created: 04-Apr-2011 +Python-Version: 3.3 +Post-History: + +Abstract +======== + +The Python standard library under CPython contains various instances +of modules implemented in both pure Python and C. This PEP requires +that in these instances that both the Python and C code *must* be +semantically identical (except in cases where implementation details +of a VM prevents it entirely). It is also required that new C-based +modules lacking a pure Python equivalent implementation get special +permissions to be added to the standard library. + + +Rationale +========= + +Python has grown beyond the CPython virtual machine (VM). IronPython_, +Jython_, and PyPy_ all currently being viable alternatives to the +CPython VM. 
This VM ecosystem that has sprung up around the Python +programming language has led to Python being used in many different +areas where CPython cannot be used, e.g., Jython allowing Python to be +used in Java applications. + +A problem all of the VMs other than CPython face is handling modules +from the standard library that are implemented in C. Since they do not +typically support the entire `C API of Python`_ they are unable to use +the code used to create the module. Often times this leads these other +VMs to either re-implement the modules in pure Python or in the +programming language used to implement the VM (e.g., in C# for +IronPython). This duplication of effort between CPython, PyPy, Jython, +and IronPython is extremely unfortunate as implementing a module *at +least* in pure Python would help mitigate this duplicate effort. + +The purpose of this PEP is to minimize this duplicate effort by +mandating that all new modules added to Python's standard library +*must* have a pure Python implementation _unless_ special dispensation +is given. This makes sure that a module in the stdlib is available to +all VMs and not just to CPython. + +Re-implementing parts (or all) of a module in C (in the case +of CPython) is still allowed for performance reasons, but any such +accelerated code must semantically match the pure Python equivalent to +prevent divergence. To accomplish this, the pure Python and C code must +be thoroughly tested with the *same* test suite to verify compliance. 
+This is to prevent users from accidentally relying +on semantics that are specific to the C code and are not reflected in +the pure Python implementation that other VMs rely upon, e.g., in +CPython 3.2.0, ``heapq.heappop()`` raises different exceptions +depending on whether the accelerated C code is used or not:: + + from test.support import import_fresh_module + + c_heapq = import_fresh_module('heapq', fresh=['_heapq']) + py_heapq = import_fresh_module('heapq', blocked=['_heapq']) + + + class Spam: + """Tester class which defines no other magic methods but + __len__().""" + def __len__(self): + return 0 + + + try: + c_heapq.heappop(Spam()) + except TypeError: + # "heap argument must be a list" + pass + + try: + py_heapq.heappop(Spam()) + except AttributeError: + # "'Foo' object has no attribute 'pop'" + pass + +This kind of divergence is a problem for users as they unwittingly +write code that is CPython-specific. This is also an issue for other +VM teams as they have to deal with bug reports from users thinking +that they incorrectly implemented the module when in fact it was +caused by an untested case. + + +Details +======= + +Starting in Python 3.3, any modules added to the standard library must +have a pure Python implementation. This rule can only be ignored if +the Python development team grants a special exemption for the module. +Typically the exemption would be granted only when a module wraps a +specific C-based library (e.g., sqlite3_). In granting an exemption it +will be recognized that the module will most likely be considered +exclusive to CPython and not part of Python's standard library that +other VMs are expected to support. Usage of ``ctypes`` to provide an +API for a C library will continue to be frowned upon as ``ctypes`` +lacks compiler guarantees that C code typically relies upon to prevent +certain errors from occurring (e.g., API changes). 
+ +Even though a pure Python implementation is mandated by this PEP, it +does not preclude the use of a companion acceleration module. If an +acceleration module is provided it is to be named the same as the +module it is accelerating with an underscore attached as a prefix, +e.g., ``_warnings`` for ``warnings``. The common pattern to access +the accelerated code from the pure Python implementation is to import +it with an ``import *``, e.g., ``from _warnings import *``. This is +typically done at the end of the module to allow it to overwrite +specific Python objects with their accelerated equivalents. This kind +of import can also be done before the end of the module when needed, +e.g., an accelerated base class is provided but is then subclassed by +Python code. This PEP does not mandate that pre-existing modules in +the stdlib that lack a pure Python equivalent gain such a module. But +if people do volunteer to provide and maintain a pure Python +equivalent (e.g., the PyPy team volunteering their pure Python +implementation of the ``csv`` module and maintaining it) then such +code will be accepted. + +Any accelerated code must be semantically identical to the pure Python +implementation. The only time any semantics are allowed to be +different are when technical details of the VM providing the +accelerated code prevent matching semantics from being possible, e.g., +a class being a ``type`` when implemented in C. The semantics +equivalence requirement also dictates that no public API be provided +in accelerated code that does not exist in the pure Python code. +Without this requirement people could accidentally come to rely on a +detail in the acclerated code which is not made available to other VMs +that use the pure Python implementation. To help verify that the +contract of semantic equivalence is being met, a module must be tested +both with and without its accelerated code as thoroughly as possible. 
+ +As an example, to write tests which exercise both the pure Python and +C acclerated versions of a module, a basic idiom can be followed:: + + import collections.abc + from test.support import import_fresh_module, run_unittest + import unittest + + c_heapq = import_fresh_module('heapq', fresh=['_heapq']) + py_heapq = import_fresh_module('heapq', blocked=['_heapq']) + + + class ExampleTest(unittest.TestCase): + + def test_heappop_exc_for_non_MutableSequence(self): + # Raise TypeError when heap is not a + # collections.abc.MutableSequence. + class Spam: + """Test class lacking many ABC-required methods + (e.g., pop()).""" + def __len__(self): + return 0 + + heap = Spam() + self.assertFalse(isinstance(heap, + collections.abc.MutableSequence)) + with self.assertRaises(TypeError): + self.heapq.heappop(heap) + + + class AcceleratedExampleTest(ExampleTest): + + """Test using the acclerated code.""" + + heapq = c_heapq + + + class PyExampleTest(ExampleTest): + + """Test with just the pure Python code.""" + + heapq = py_heapq + + + def test_main(): + run_unittest(AcceleratedExampleTest, PyExampleTest) + + + if __name__ == '__main__': + test_main() + +Thoroughness of the test can be verified using coverage measurements +with branching coverage on the pure Python code to verify that all +possible scenarios are tested using (or not using) accelerator code. + + +Copyright +========= + +This document has been placed in the public domain. + + +.. _IronPython: http://ironpython.net/ +.. _Jython: http://www.jython.org/ +.. _PyPy: http://pypy.org/ +.. _C API of Python: http://docs.python.org/py3k/c-api/index.html +.. _sqlite3: http://docs.python.org/py3k/library/sqlite3.html -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Tue Apr 5 20:48:35 2011 From: python-checkins at python.org (barry.warsaw) Date: Tue, 05 Apr 2011 20:48:35 +0200 Subject: [Python-checkins] peps: Added Post-History. 
Message-ID: http://hg.python.org/peps/rev/6d0808c23ad8 changeset: 3859:6d0808c23ad8 user: Barry Warsaw date: Tue Apr 05 14:48:30 2011 -0400 summary: Added Post-History. files: pep-0396.txt | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/pep-0396.txt b/pep-0396.txt --- a/pep-0396.txt +++ b/pep-0396.txt @@ -7,7 +7,7 @@ Type: Informational Content-Type: text/x-rst Created: 2011-03-16 -Post-History: +Post-History: 2011-04-05 Abstract -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Wed Apr 6 00:23:43 2011 From: python-checkins at python.org (benjamin.peterson) Date: Wed, 06 Apr 2011 00:23:43 +0200 Subject: [Python-checkins] cpython: implement tp_clear Message-ID: http://hg.python.org/cpython/rev/7b5d09343929 changeset: 69161:7b5d09343929 parent: 69159:a9371cf1cc61 user: Benjamin Peterson date: Tue Apr 05 17:25:14 2011 -0500 summary: implement tp_clear files: Modules/_functoolsmodule.c | 11 ++++++++++- 1 files changed, 10 insertions(+), 1 deletions(-) diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c --- a/Modules/_functoolsmodule.c +++ b/Modules/_functoolsmodule.c @@ -355,6 +355,15 @@ return 0; } +static int +keyobject_clear(keyobject *ko) +{ + Py_CLEAR(ko->cmp); + if (ko->object) + Py_CLEAR(ko->object); + return 0; +} + static PyMemberDef keyobject_members[] = { {"obj", T_OBJECT, offsetof(keyobject, object), 0, @@ -392,7 +401,7 @@ Py_TPFLAGS_DEFAULT, /* tp_flags */ 0, /* tp_doc */ (traverseproc)keyobject_traverse, /* tp_traverse */ - 0, /* tp_clear */ + (inquiry)keyobject_clear, /* tp_clear */ keyobject_richcompare, /* tp_richcompare */ 0, /* tp_weaklistoffset */ 0, /* tp_iter */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 02:20:05 2011 From: python-checkins at python.org (ned.deily) Date: Wed, 06 Apr 2011 02:20:05 +0200 Subject: [Python-checkins] cpython (2.7): Issue #7108: Fix test_commands to not fail when special attributes ('@' Message-ID: 
http://hg.python.org/cpython/rev/5616cbce0bee changeset: 69162:5616cbce0bee branch: 2.7 parent: 69160:7d4dea76c476 user: Ned Deily date: Tue Apr 05 17:16:09 2011 -0700 summary: Issue #7108: Fix test_commands to not fail when special attributes ('@' or '.') appear in 'ls -l' output. files: Lib/test/test_commands.py | 6 +++++- Misc/NEWS | 3 +++ 2 files changed, 8 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_commands.py b/Lib/test/test_commands.py --- a/Lib/test/test_commands.py +++ b/Lib/test/test_commands.py @@ -49,8 +49,12 @@ # drwxr-xr-x 15 Joe User My Group 4096 Aug 12 12:50 / # Note that the first case above has a space in the group name # while the second one has a space in both names. + # Special attributes supported: + # + = has ACLs + # @ = has Mac OS X extended attributes + # . = has a SELinux security context pat = r'''d......... # It is a directory. - \+? # It may have ACLs. + [.+@]? # It may have special attributes. \s+\d+ # It has some number of links. [^/]* # Skip user, group, size, and date. /\. # and end with the name of the file. diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -315,6 +315,9 @@ Tests ----- +- Issue #7108: Fix test_commands to not fail when special attributes ('@' + or '.') appear in 'ls -l' output. + - Issue #11490: test_subprocess:test_leaking_fds_on_error no longer gives a false positive if the last directory in the path is inaccessible. 
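For readers of this changeset, the widened pattern can be tried outside the test suite. The following is a minimal sketch, not part of the commit: the names `LS_ROOT_DIR` and `is_root_dir_listing` are hypothetical, and the sample `ls -ld /.` lines are invented for illustration. It assumes only the regular expression shown in the diff above, compiled with `re.VERBOSE`.

```python
import re

# Hypothetical name; the commit itself keeps the pattern in a local `pat`.
LS_ROOT_DIR = re.compile(r"""
    d.........   # 'd' for directory, then nine permission characters
    [.+@]?       # optional marker: '+' ACLs, '@' extended attributes, '.' SELinux
    \s+\d+       # link count
    [^/]*        # user, group, size, and date
    /\.          # the listed path
""", re.VERBOSE)


def is_root_dir_listing(line):
    """Return True if `line` looks like `ls -ld /.` output for a directory."""
    return LS_ROOT_DIR.match(line) is not None


# Invented sample lines, one per supported marker plus the unmarked case.
accepted = [
    "drwxr-xr-x   15 root  root  4096 Aug 12 12:50 /.",
    "drwxr-xr-x+  15 root  root  4096 Aug 12 12:50 /.",
    "drwxr-xr-x@  15 Joe User  My Group  4096 Aug 12 12:50 /.",
    "drwxr-xr-x.  15 root  root  4096 Aug 12 12:50 /.",
]
rejected = [
    "-rw-r--r--    1 root  root     0 Aug 12 12:50 /.",   # not a directory
    "drwxr-xr-x!  15 root  root  4096 Aug 12 12:50 /.",   # unknown marker
]

for line in accepted:
    print(is_root_dir_listing(line), line)
for line in rejected:
    print(is_root_dir_listing(line), line)
```

The character class keeps the marker optional, so output from systems that print no marker at all still matches, while an unrecognised marker fails at the following `\s+`.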
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 02:44:00 2011 From: python-checkins at python.org (alexander.belopolsky) Date: Wed, 06 Apr 2011 02:44:00 +0200 Subject: [Python-checkins] cpython (3.2): Issue #11576: Fixed timedelta subtraction glitch on big timedelta values Message-ID: http://hg.python.org/cpython/rev/76180cc853b6 changeset: 69163:76180cc853b6 branch: 3.2 parent: 69158:8a65e6aff672 user: Alexander Belopolsky date: Tue Apr 05 20:07:38 2011 -0400 summary: Issue #11576: Fixed timedelta subtraction glitch on big timedelta values files: Lib/datetime.py | 6 +++++- Lib/test/datetimetester.py | 6 ++++++ Modules/_datetimemodule.c | 15 ++++++++------- 3 files changed, 19 insertions(+), 8 deletions(-) diff --git a/Lib/datetime.py b/Lib/datetime.py --- a/Lib/datetime.py +++ b/Lib/datetime.py @@ -485,7 +485,11 @@ def __sub__(self, other): if isinstance(other, timedelta): - return self + -other + # for CPython compatibility, we cannot use + # our __class__ here, but need a real timedelta + return timedelta(self._days - other._days, + self._seconds - other._seconds, + self._microseconds - other._microseconds) return NotImplemented def __rsub__(self, other): diff --git a/Lib/test/datetimetester.py b/Lib/test/datetimetester.py --- a/Lib/test/datetimetester.py +++ b/Lib/test/datetimetester.py @@ -383,6 +383,12 @@ for i in range(-10, 10): eq((i*us/-3)//us, round(i/-3)) + # Issue #11576 + eq(td(999999999, 86399, 999999) - td(999999999, 86399, 999998), + td(0, 0, 1)) + eq(td(999999999, 1, 1) - td(999999999, 1, 0), + td(0, 0, 1)) + def test_disallowed_computations(self): a = timedelta(42) diff --git a/Modules/_datetimemodule.c b/Modules/_datetimemodule.c --- a/Modules/_datetimemodule.c +++ b/Modules/_datetimemodule.c @@ -1801,13 +1801,14 @@ if (PyDelta_Check(left) && PyDelta_Check(right)) { /* delta - delta */ - PyObject *minus_right = PyNumber_Negative(right); - if (minus_right) { - result = delta_add(left, minus_right); - 
Py_DECREF(minus_right); - } - else - result = NULL; + /* The C-level additions can't overflow because of the + * invariant bounds. + */ + int days = GET_TD_DAYS(left) - GET_TD_DAYS(right); + int seconds = GET_TD_SECONDS(left) - GET_TD_SECONDS(right); + int microseconds = GET_TD_MICROSECONDS(left) - + GET_TD_MICROSECONDS(right); + result = new_delta(days, seconds, microseconds, 1); } if (result == Py_NotImplemented) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 02:44:01 2011 From: python-checkins at python.org (alexander.belopolsky) Date: Wed, 06 Apr 2011 02:44:01 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #11576: Fixed timedelta subtraction glitch on big timedelta values Message-ID: http://hg.python.org/cpython/rev/d492915cf76d changeset: 69164:d492915cf76d parent: 69161:7b5d09343929 parent: 69163:76180cc853b6 user: Alexander Belopolsky date: Tue Apr 05 20:43:15 2011 -0400 summary: Issue #11576: Fixed timedelta subtraction glitch on big timedelta values files: Lib/datetime.py | 6 +++++- Lib/test/datetimetester.py | 6 ++++++ Modules/_datetimemodule.c | 15 ++++++++------- 3 files changed, 19 insertions(+), 8 deletions(-) diff --git a/Lib/datetime.py b/Lib/datetime.py --- a/Lib/datetime.py +++ b/Lib/datetime.py @@ -485,7 +485,11 @@ def __sub__(self, other): if isinstance(other, timedelta): - return self + -other + # for CPython compatibility, we cannot use + # our __class__ here, but need a real timedelta + return timedelta(self._days - other._days, + self._seconds - other._seconds, + self._microseconds - other._microseconds) return NotImplemented def __rsub__(self, other): diff --git a/Lib/test/datetimetester.py b/Lib/test/datetimetester.py --- a/Lib/test/datetimetester.py +++ b/Lib/test/datetimetester.py @@ -383,6 +383,12 @@ for i in range(-10, 10): eq((i*us/-3)//us, round(i/-3)) + # Issue #11576 + eq(td(999999999, 86399, 999999) - td(999999999, 86399, 999998), + td(0, 0, 1)) + 
eq(td(999999999, 1, 1) - td(999999999, 1, 0), + td(0, 0, 1)) + def test_disallowed_computations(self): a = timedelta(42) diff --git a/Modules/_datetimemodule.c b/Modules/_datetimemodule.c --- a/Modules/_datetimemodule.c +++ b/Modules/_datetimemodule.c @@ -1801,13 +1801,14 @@ if (PyDelta_Check(left) && PyDelta_Check(right)) { /* delta - delta */ - PyObject *minus_right = PyNumber_Negative(right); - if (minus_right) { - result = delta_add(left, minus_right); - Py_DECREF(minus_right); - } - else - result = NULL; + /* The C-level additions can't overflow because of the + * invariant bounds. + */ + int days = GET_TD_DAYS(left) - GET_TD_DAYS(right); + int seconds = GET_TD_SECONDS(left) - GET_TD_SECONDS(right); + int microseconds = GET_TD_MICROSECONDS(left) - + GET_TD_MICROSECONDS(right); + result = new_delta(days, seconds, microseconds, 1); } if (result == Py_NotImplemented) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 04:14:27 2011 From: python-checkins at python.org (alexander.belopolsky) Date: Wed, 06 Apr 2011 04:14:27 +0200 Subject: [Python-checkins] cpython (2.7): Issue #11576: Fixed timedelta subtraction glitch on big timedelta values Message-ID: http://hg.python.org/cpython/rev/202a9feb1fd6 changeset: 69165:202a9feb1fd6 branch: 2.7 parent: 69162:5616cbce0bee user: Alexander Belopolsky date: Tue Apr 05 22:12:22 2011 -0400 summary: Issue #11576: Fixed timedelta subtraction glitch on big timedelta values files: Lib/test/test_datetime.py | 7 +++++++ Modules/datetimemodule.c | 15 ++++++++------- 2 files changed, 15 insertions(+), 7 deletions(-) diff --git a/Lib/test/test_datetime.py b/Lib/test/test_datetime.py --- a/Lib/test/test_datetime.py +++ b/Lib/test/test_datetime.py @@ -231,6 +231,13 @@ eq(a//10, td(0, 7*24*360)) eq(a//3600000, td(0, 0, 7*24*1000)) + # Issue #11576 + eq(td(999999999, 86399, 999999) - td(999999999, 86399, 999998), + td(0, 0, 1)) + eq(td(999999999, 1, 1) - td(999999999, 1, 0), + td(0, 0, 1)) + + def 
test_disallowed_computations(self): a = timedelta(42) diff --git a/Modules/datetimemodule.c b/Modules/datetimemodule.c --- a/Modules/datetimemodule.c +++ b/Modules/datetimemodule.c @@ -1737,13 +1737,14 @@ if (PyDelta_Check(left) && PyDelta_Check(right)) { /* delta - delta */ - PyObject *minus_right = PyNumber_Negative(right); - if (minus_right) { - result = delta_add(left, minus_right); - Py_DECREF(minus_right); - } - else - result = NULL; + /* The C-level additions can't overflow because of the + * invariant bounds. + */ + int days = GET_TD_DAYS(left) - GET_TD_DAYS(right); + int seconds = GET_TD_SECONDS(left) - GET_TD_SECONDS(right); + int microseconds = GET_TD_MICROSECONDS(left) - + GET_TD_MICROSECONDS(right); + result = new_delta(days, seconds, microseconds, 1); } if (result == Py_NotImplemented) -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Wed Apr 6 04:56:58 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Wed, 06 Apr 2011 04:56:58 +0200 Subject: [Python-checkins] Daily reference leaks (d492915cf76d): sum=0 Message-ID: results for d492915cf76d on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/refloglZqgHa', '-x'] From python-checkins at python.org Wed Apr 6 08:16:34 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 06 Apr 2011 08:16:34 +0200 Subject: [Python-checkins] cpython (3.1): Issue #10762: Guard against invalid/non-supported format string '%f' on Message-ID: http://hg.python.org/cpython/rev/2ca1bc677a60 changeset: 69166:2ca1bc677a60 branch: 3.1 parent: 69157:7a1ef59d765b user: Senthil Kumaran date: Wed Apr 06 12:54:06 2011 +0800 summary: Issue #10762: Guard against invalid/non-supported format string '%f' on Windows. Patch Santoso Wijaya. 
files: Lib/test/test_time.py | 8 ++++++++ Modules/timemodule.c | 2 +- 2 files changed, 9 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_time.py b/Lib/test/test_time.py --- a/Lib/test/test_time.py +++ b/Lib/test/test_time.py @@ -2,6 +2,7 @@ import time import unittest import locale +import sys class TimeTestCase(unittest.TestCase): @@ -37,6 +38,13 @@ except ValueError: self.fail('conversion specifier: %r failed.' % format) + # Issue #10762: Guard against invalid/non-supported format string + # so that Python don't crash (Windows crashes when the format string + # input to [w]strftime is not kosher. + if sys.platform.startswith('win'): + with self.assertRaises(ValueError): + time.strftime('%f') + def test_strftime_bounds_checking(self): # Make sure that strftime() checks the bounds of the various parts #of the time tuple (0 is valid for *all* values). diff --git a/Modules/timemodule.c b/Modules/timemodule.c --- a/Modules/timemodule.c +++ b/Modules/timemodule.c @@ -549,7 +549,7 @@ if (outbuf[1]=='#') ++outbuf; /* not documented by python, */ if (outbuf[1]=='\0' || - !wcschr(L"aAbBcdfHIjmMpSUwWxXyYzZ%", outbuf[1])) + !wcschr(L"aAbBcdHIjmMpSUwWxXyYzZ%", outbuf[1])) { PyErr_SetString(PyExc_ValueError, "Invalid format string"); return 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 08:16:36 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 06 Apr 2011 08:16:36 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Merge from 3.1 Message-ID: http://hg.python.org/cpython/rev/1accc17055c9 changeset: 69167:1accc17055c9 branch: 3.2 parent: 69163:76180cc853b6 parent: 69166:2ca1bc677a60 user: Senthil Kumaran date: Wed Apr 06 14:11:09 2011 +0800 summary: Merge from 3.1 files: Lib/test/test_time.py | 8 ++++++++ Modules/timemodule.c | 2 +- 2 files changed, 9 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_time.py b/Lib/test/test_time.py --- a/Lib/test/test_time.py +++ 
b/Lib/test/test_time.py @@ -3,6 +3,7 @@ import unittest import locale import sysconfig +import sys import warnings class TimeTestCase(unittest.TestCase): @@ -39,6 +40,13 @@ except ValueError: self.fail('conversion specifier: %r failed.' % format) + # Issue #10762: Guard against invalid/non-supported format string + # so that Python don't crash (Windows crashes when the format string + # input to [w]strftime is not kosher. + if sys.platform.startswith('win'): + with self.assertRaises(ValueError): + time.strftime('%f') + def _bounds_checking(self, func=time.strftime): # Make sure that strftime() checks the bounds of the various parts #of the time tuple (0 is valid for *all* values). diff --git a/Modules/timemodule.c b/Modules/timemodule.c --- a/Modules/timemodule.c +++ b/Modules/timemodule.c @@ -512,7 +512,7 @@ if (outbuf[1]=='#') ++outbuf; /* not documented by python, */ if (outbuf[1]=='\0' || - !wcschr(L"aAbBcdfHIjmMpSUwWxXyYzZ%", outbuf[1])) + !wcschr(L"aAbBcdHIjmMpSUwWxXyYzZ%", outbuf[1])) { PyErr_SetString(PyExc_ValueError, "Invalid format string"); return 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 08:16:37 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 06 Apr 2011 08:16:37 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): merge from 3.2 Message-ID: http://hg.python.org/cpython/rev/dc728ac66c3c changeset: 69168:dc728ac66c3c parent: 69164:d492915cf76d parent: 69167:1accc17055c9 user: Senthil Kumaran date: Wed Apr 06 14:16:08 2011 +0800 summary: merge from 3.2 files: Lib/test/test_time.py | 8 ++++++++ Modules/timemodule.c | 2 +- 2 files changed, 9 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_time.py b/Lib/test/test_time.py --- a/Lib/test/test_time.py +++ b/Lib/test/test_time.py @@ -3,6 +3,7 @@ import unittest import locale import sysconfig +import sys import warnings class TimeTestCase(unittest.TestCase): @@ -39,6 +40,13 @@ except ValueError: 
self.fail('conversion specifier: %r failed.' % format) + # Issue #10762: Guard against invalid/non-supported format string + # so that Python don't crash (Windows crashes when the format string + # input to [w]strftime is not kosher. + if sys.platform.startswith('win'): + with self.assertRaises(ValueError): + time.strftime('%f') + def _bounds_checking(self, func=time.strftime): # Make sure that strftime() checks the bounds of the various parts #of the time tuple (0 is valid for *all* values). diff --git a/Modules/timemodule.c b/Modules/timemodule.c --- a/Modules/timemodule.c +++ b/Modules/timemodule.c @@ -512,7 +512,7 @@ if (outbuf[1]=='#') ++outbuf; /* not documented by python, */ if (outbuf[1]=='\0' || - !wcschr(L"aAbBcdfHIjmMpSUwWxXyYzZ%", outbuf[1])) + !wcschr(L"aAbBcdHIjmMpSUwWxXyYzZ%", outbuf[1])) { PyErr_SetString(PyExc_ValueError, "Invalid format string"); return 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 08:45:33 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 06 Apr 2011 08:45:33 +0200 Subject: [Python-checkins] cpython (2.7): Issue #10762: Guard against invalid/non-supported format string '%f' on Message-ID: http://hg.python.org/cpython/rev/1320f29bcf98 changeset: 69169:1320f29bcf98 branch: 2.7 parent: 69162:5616cbce0bee user: Senthil Kumaran date: Wed Apr 06 14:27:47 2011 +0800 summary: Issue #10762: Guard against invalid/non-supported format string '%f' on Windows. Patch Santoso Wijaya. files: Lib/test/test_time.py | 8 ++++++++ Modules/timemodule.c | 2 +- 2 files changed, 9 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_time.py b/Lib/test/test_time.py --- a/Lib/test/test_time.py +++ b/Lib/test/test_time.py @@ -1,6 +1,7 @@ from test import test_support import time import unittest +import sys class TimeTestCase(unittest.TestCase): @@ -37,6 +38,13 @@ except ValueError: self.fail('conversion specifier: %r failed.' 
% format) + # Issue #10762: Guard against invalid/non-supported format string + # so that Python don't crash (Windows crashes when the format string + # input to [w]strftime is not kosher. + if sys.platform.startswith('win'): + with self.assertRaises(ValueError): + time.strftime('%f') + def test_strftime_bounds_checking(self): # Make sure that strftime() checks the bounds of the various parts #of the time tuple (0 is valid for *all* values). diff --git a/Modules/timemodule.c b/Modules/timemodule.c --- a/Modules/timemodule.c +++ b/Modules/timemodule.c @@ -487,7 +487,7 @@ if (outbuf[1]=='#') ++outbuf; /* not documented by python, */ if (outbuf[1]=='\0' || - !strchr("aAbBcdfHIjmMpSUwWxXyYzZ%", outbuf[1])) + !strchr("aAbBcdHIjmMpSUwWxXyYzZ%", outbuf[1])) { PyErr_SetString(PyExc_ValueError, "Invalid format string"); return 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 08:45:36 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 06 Apr 2011 08:45:36 +0200 Subject: [Python-checkins] cpython (merge 2.7 -> 2.7): hg pull/merge - Changes to accomodate. Message-ID: http://hg.python.org/cpython/rev/da212fa62fea changeset: 69170:da212fa62fea branch: 2.7 parent: 69169:1320f29bcf98 parent: 69165:202a9feb1fd6 user: Senthil Kumaran date: Wed Apr 06 14:41:42 2011 +0800 summary: hg pull/merge - Changes to accomodate. 
files: Lib/test/test_datetime.py | 7 +++++++ Modules/datetimemodule.c | 15 ++++++++------- 2 files changed, 15 insertions(+), 7 deletions(-) diff --git a/Lib/test/test_datetime.py b/Lib/test/test_datetime.py --- a/Lib/test/test_datetime.py +++ b/Lib/test/test_datetime.py @@ -231,6 +231,13 @@ eq(a//10, td(0, 7*24*360)) eq(a//3600000, td(0, 0, 7*24*1000)) + # Issue #11576 + eq(td(999999999, 86399, 999999) - td(999999999, 86399, 999998), + td(0, 0, 1)) + eq(td(999999999, 1, 1) - td(999999999, 1, 0), + td(0, 0, 1)) + + def test_disallowed_computations(self): a = timedelta(42) diff --git a/Modules/datetimemodule.c b/Modules/datetimemodule.c --- a/Modules/datetimemodule.c +++ b/Modules/datetimemodule.c @@ -1737,13 +1737,14 @@ if (PyDelta_Check(left) && PyDelta_Check(right)) { /* delta - delta */ - PyObject *minus_right = PyNumber_Negative(right); - if (minus_right) { - result = delta_add(left, minus_right); - Py_DECREF(minus_right); - } - else - result = NULL; + /* The C-level additions can't overflow because of the + * invariant bounds. + */ + int days = GET_TD_DAYS(left) - GET_TD_DAYS(right); + int seconds = GET_TD_SECONDS(left) - GET_TD_SECONDS(right); + int microseconds = GET_TD_MICROSECONDS(left) - + GET_TD_MICROSECONDS(right); + result = new_delta(days, seconds, microseconds, 1); } if (result == Py_NotImplemented) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 14:16:41 2011 From: python-checkins at python.org (r.david.murray) Date: Wed, 06 Apr 2011 14:16:41 +0200 Subject: [Python-checkins] cpython (3.2): #11605: don't use set/get_payload in feedparser; they do conversions. Message-ID: http://hg.python.org/cpython/rev/b807cf929e26 changeset: 69171:b807cf929e26 branch: 3.2 parent: 69167:1accc17055c9 user: R David Murray date: Wed Apr 06 08:13:02 2011 -0400 summary: #11605: don't use set/get_payload in feedparser; they do conversions. 
Really the whole API needs to be gone over to restore the separation of concerns; but that's what email6 is about. files: Lib/email/feedparser.py | 4 +- Lib/email/test/test_email.py | 47 ++++++++++++++++++++++++ Misc/NEWS | 3 + 3 files changed, 52 insertions(+), 2 deletions(-) diff --git a/Lib/email/feedparser.py b/Lib/email/feedparser.py --- a/Lib/email/feedparser.py +++ b/Lib/email/feedparser.py @@ -368,12 +368,12 @@ end = len(mo.group(0)) self._last.epilogue = epilogue[:-end] else: - payload = self._last.get_payload() + payload = self._last._payload if isinstance(payload, str): mo = NLCRE_eol.search(payload) if mo: payload = payload[:-len(mo.group(0))] - self._last.set_payload(payload) + self._last._payload = payload self._input.pop_eof_matcher() self._pop_message() # Set the multipart up for newline cleansing, which will diff --git a/Lib/email/test/test_email.py b/Lib/email/test/test_email.py --- a/Lib/email/test/test_email.py +++ b/Lib/email/test/test_email.py @@ -3168,6 +3168,53 @@ g = email.generator.BytesGenerator(s) g.flatten(msg, linesep='\r\n') self.assertEqual(s.getvalue(), text) + + def test_8bit_multipart(self): + # Issue 11605 + source = textwrap.dedent("""\ + Date: Fri, 18 Mar 2011 17:15:43 +0100 + To: foo at example.com + From: foodwatch-Newsletter + Subject: Aktuelles zu Japan, Klonfleisch und Smiley-System + Message-ID: <76a486bee62b0d200f33dc2ca08220ad at localhost.localdomain> + MIME-Version: 1.0 + Content-Type: multipart/alternative; + boundary="b1_76a486bee62b0d200f33dc2ca08220ad" + + --b1_76a486bee62b0d200f33dc2ca08220ad + Content-Type: text/plain; charset="utf-8" + Content-Transfer-Encoding: 8bit + + Guten Tag, , + + mit großer Betroffenheit verfolgen auch wir im foodwatch-Team die + Nachrichten aus Japan. + + + --b1_76a486bee62b0d200f33dc2ca08220ad + Content-Type: text/html; charset="utf-8" + Content-Transfer-Encoding: 8bit + + + + + foodwatch - Newsletter + + +

mit großer Betroffenheit verfolgen auch wir im foodwatch-Team + die Nachrichten aus Japan.