From python-checkins at python.org Fri Apr 1 00:46:54 2011
From: python-checkins at python.org (raymond.hettinger)
Date: Fri, 01 Apr 2011 00:46:54 +0200
Subject: [Python-checkins] cpython (3.2): Issue #7796: Add link to Jan Kaliszewski's alternate constructor and ABC for
Message-ID:

http://hg.python.org/cpython/rev/7aa3f1f7ac94
changeset: 69093:7aa3f1f7ac94
branch: 3.2
parent: 69091:9797bfe8240f
user: Raymond Hettinger
date: Thu Mar 31 15:46:06 2011 -0700
summary:
  Issue #7796: Add link to Jan Kaliszewski's alternate constructor and ABC for named tuples.

files:
  Doc/library/collections.rst | 11 +++++++++--
  Misc/ACKS | 1 +
  2 files changed, 10 insertions(+), 2 deletions(-)


diff --git a/Doc/library/collections.rst b/Doc/library/collections.rst
--- a/Doc/library/collections.rst
+++ b/Doc/library/collections.rst
@@ -775,8 +775,15 @@

 .. seealso::

-   `Named tuple recipe `_
-   adapted for Python 2.4.
+   * `Named tuple recipe `_
+     adapted for Python 2.4.
+
+   * `Recipe for named tuple abstract base class with a metaclass mix-in
+     `_
+     by Jan Kaliszewski. Besides providing an :term:`abstract base class` for
+     named tuples, it also supports an alternate :term:`metaclass`-based
+     constructor that is convenient for use cases where named tuples are being
+     subclassed.
 :class:`OrderedDict` objects
diff --git a/Misc/ACKS b/Misc/ACKS
--- a/Misc/ACKS
+++ b/Misc/ACKS
@@ -449,6 +449,7 @@
 Peter van Kampen
 Rafe Kaplan
 Jacob Kaplan-Moss
+Jan Kaliszewski
 Arkady Koplyarov
 Lou Kates
 Hiroaki Kawai

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Fri Apr 1 00:46:56 2011
From: python-checkins at python.org (raymond.hettinger)
Date: Fri, 01 Apr 2011 00:46:56 +0200
Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #7796: Add link to Jan Kaliszewski's alternate constructor and ABC for
Message-ID:

http://hg.python.org/cpython/rev/330d3482cad8
changeset: 69094:330d3482cad8
parent: 69092:3e191db416a6
parent: 69093:7aa3f1f7ac94
user: Raymond Hettinger
date: Thu Mar 31 15:46:39 2011 -0700
summary:
  Issue #7796: Add link to Jan Kaliszewski's alternate constructor and ABC for named tuples.

files:
  Doc/library/collections.rst | 11 +++++++++--
  Misc/ACKS | 1 +
  2 files changed, 10 insertions(+), 2 deletions(-)


diff --git a/Doc/library/collections.rst b/Doc/library/collections.rst
--- a/Doc/library/collections.rst
+++ b/Doc/library/collections.rst
@@ -857,8 +857,15 @@

 .. seealso::

-   `Named tuple recipe `_
-   adapted for Python 2.4.
+   * `Named tuple recipe `_
+     adapted for Python 2.4.
+
+   * `Recipe for named tuple abstract base class with a metaclass mix-in
+     `_
+     by Jan Kaliszewski. Besides providing an :term:`abstract base class` for
+     named tuples, it also supports an alternate :term:`metaclass`-based
+     constructor that is convenient for use cases where named tuples are being
+     subclassed.
 :class:`OrderedDict` objects
diff --git a/Misc/ACKS b/Misc/ACKS
--- a/Misc/ACKS
+++ b/Misc/ACKS
@@ -452,6 +452,7 @@
 Peter van Kampen
 Rafe Kaplan
 Jacob Kaplan-Moss
+Jan Kaliszewski
 Arkady Koplyarov
 Lou Kates
 Hiroaki Kawai

--
Repository URL: http://hg.python.org/cpython

From tjreedy at udel.edu Fri Apr 1 01:44:12 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 31 Mar 2011 19:44:12 -0400
Subject: [Python-checkins] cpython (3.2): Add links to make the math docs more usable.
In-Reply-To: <4D94D2A8.3000405@gmail.com>
References: <4D94D2A8.3000405@gmail.com>
Message-ID: <4D9511CC.3010901@udel.edu>

>> + Return the `Gamma function` at *x*.
>>
>
> There's a space missing here, and the link doesn't work.

It does for me. This may depend on the mail reader and whether it parses
the url out in spite of the missing space.

From python-checkins at python.org Fri Apr 1 02:31:31 2011
From: python-checkins at python.org (victor.stinner)
Date: Fri, 01 Apr 2011 02:31:31 +0200
Subject: [Python-checkins] cpython: Issue #11393: Fix faulthandler_thread(): release cancel lock before join lock
Message-ID:

http://hg.python.org/cpython/rev/8b1341d51fe6
changeset: 69095:8b1341d51fe6
user: Victor Stinner
date: Fri Apr 01 02:28:22 2011 +0200
summary:
  Issue #11393: Fix faulthandler_thread(): release cancel lock before join lock

If the thread releases the join lock before the cancel lock, the thread may
sometimes still be alive at cancel_dump_tracebacks_later() exit. So the cancel
lock may be destroyed while the thread is still alive, whereas the thread will
try to release the cancel lock, which just crash.

Another minor fix: the thread doesn't release the cancel lock if it didn't
acquire it.
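
The lock ordering described above can be sketched in Python. This is only a
minimal illustration, not the code in Modules/faulthandler.c: the Watchdog
class, its method names and the on_timeout callback are hypothetical, and
threading.Lock stands in for the PyThread locks. It assumes, as in the later
commits of this issue, that the cancelling side is the one that clears the
running flag.

```python
import threading


class Watchdog:
    """Toy watchdog using two locks as events (hypothetical names)."""

    def __init__(self, timeout, on_timeout):
        self.timeout = timeout
        self.on_timeout = on_timeout
        self.cancel_event = threading.Lock()
        self.join_event = threading.Lock()
        self.running = False

    def start(self):
        # The parent holds both locks while the watchdog thread is alive.
        self.cancel_event.acquire()
        self.join_event.acquire()
        self.running = True
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # Wait until cancel() releases cancel_event or the timeout expires.
        cancelled = self.cancel_event.acquire(timeout=self.timeout)
        if cancelled:
            # Release the cancel lock only if this thread acquired it ...
            self.cancel_event.release()
        else:
            self.on_timeout()
        # ... and always before the join lock: once join_event is released
        # the parent may tear everything down, so this must be the last step.
        self.join_event.release()

    def cancel(self):
        if not self.running:
            return
        self.cancel_event.release()  # notify cancellation
        self.join_event.acquire()    # wait for the thread to finish
        self.join_event.release()
        self.running = False
```

With this ordering, both locks are guaranteed to be free by the time cancel()
returns, whether the thread was cancelled or had already timed out.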

files:
  Modules/faulthandler.c | 2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)


diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c
--- a/Modules/faulthandler.c
+++ b/Modules/faulthandler.c
@@ -401,6 +401,7 @@
                                          thread.timeout_ms, 0);
         if (st == PY_LOCK_ACQUIRED) {
             /* Cancelled by user */
+            PyThread_release_lock(thread.cancel_event);
             break;
         }
         /* Timeout => dump traceback */
@@ -419,7 +420,6 @@
     /* The only way out */
     thread.running = 0;
     PyThread_release_lock(thread.join_event);
-    PyThread_release_lock(thread.cancel_event);
 }

 static void

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Fri Apr 1 03:00:21 2011
From: python-checkins at python.org (victor.stinner)
Date: Fri, 01 Apr 2011 03:00:21 +0200
Subject: [Python-checkins] cpython: Issue #11393: New try to fix faulthandler_thread()
Message-ID:

http://hg.python.org/cpython/rev/0fb0fbd442b4
changeset: 69096:0fb0fbd442b4
user: Victor Stinner
date: Fri Apr 01 03:00:05 2011 +0200
summary:
  Issue #11393: New try to fix faulthandler_thread()

Always release the cancel join. Fix also another corner case:
_PyFaulthandler_Fini() called after setting running variable to zero, but
before releasing the join lock.

files:
  Modules/faulthandler.c | 12 ++++++------
  1 files changed, 6 insertions(+), 6 deletions(-)


diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c
--- a/Modules/faulthandler.c
+++ b/Modules/faulthandler.c
@@ -401,7 +401,6 @@
                                          thread.timeout_ms, 0);
         if (st == PY_LOCK_ACQUIRED) {
             /* Cancelled by user */
-            PyThread_release_lock(thread.cancel_event);
             break;
         }
         /* Timeout => dump traceback */
@@ -418,8 +417,9 @@
     } while (ok && thread.repeat);

     /* The only way out */
+    PyThread_release_lock(thread.cancel_event);
+    PyThread_release_lock(thread.join_event);
     thread.running = 0;
-    PyThread_release_lock(thread.join_event);
 }

 static void
@@ -428,11 +428,11 @@
     if (thread.running) {
         /* Notify cancellation */
         PyThread_release_lock(thread.cancel_event);
-        /* Wait for thread to join */
-        PyThread_acquire_lock(thread.join_event, 1);
-        assert(thread.running == 0);
-        PyThread_release_lock(thread.join_event);
     }
+    /* Wait for thread to join */
+    PyThread_acquire_lock(thread.join_event, 1);
+    assert(thread.running == 0);
+    PyThread_release_lock(thread.join_event);
     Py_CLEAR(thread.file);
 }

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Fri Apr 1 03:17:36 2011
From: python-checkins at python.org (victor.stinner)
Date: Fri, 01 Apr 2011 03:17:36 +0200
Subject: [Python-checkins] cpython: Issue #11393: fix usage of locks in faulthandler
Message-ID:

http://hg.python.org/cpython/rev/3558eecd84f0
changeset: 69097:3558eecd84f0
user: Victor Stinner
date: Fri Apr 01 03:16:51 2011 +0200
summary:
  Issue #11393: fix usage of locks in faulthandler

 * faulthandler_cancel_dump_tracebacks_later() is responsible to set running
   to zero (so we don't need the volatile keyword anymore)
 * release locks if PyThread_start_new_thread() fails

assert(thread.running == 0) was wrong in a corner case

files:
  Modules/faulthandler.c | 7 ++++---
  1 files changed, 4 insertions(+), 3 deletions(-)


diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c
--- a/Modules/faulthandler.c
+++ b/Modules/faulthandler.c
@@ -48,7 +48,7 @@
     int fd;
     PY_TIMEOUT_T timeout_ms; /* timeout in microseconds */
     int repeat;
-    volatile int running;
+    int running;
     PyInterpreterState *interp;
     int exit;
     /* released by parent thread when cancel request */
@@ -419,7 +419,6 @@
     /* The only way out */
     PyThread_release_lock(thread.cancel_event);
     PyThread_release_lock(thread.join_event);
-    thread.running = 0;
 }

 static void
@@ -431,8 +430,8 @@
     }
     /* Wait for thread to join */
     PyThread_acquire_lock(thread.join_event, 1);
-    assert(thread.running == 0);
     PyThread_release_lock(thread.join_event);
+    thread.running = 0;
     Py_CLEAR(thread.file);
 }

@@ -486,6 +485,8 @@
     thread.running = 1;
     if (PyThread_start_new_thread(faulthandler_thread, NULL) == -1) {
         thread.running = 0;
+        PyThread_release_lock(thread.join_event);
+        PyThread_release_lock(thread.cancel_event);
         Py_CLEAR(thread.file);
         PyErr_SetString(PyExc_RuntimeError,
                         "unable to start watchdog thread");

--
Repository URL: http://hg.python.org/cpython

From solipsis at pitrou.net Fri Apr 1 04:55:16 2011
From: solipsis at pitrou.net (solipsis at pitrou.net)
Date: Fri, 01 Apr 2011 04:55:16 +0200
Subject: [Python-checkins] Daily reference leaks (3558eecd84f0): sum=0
Message-ID:

results for 3558eecd84f0 on branch "default"
--------------------------------------------

Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogYX7SDN', '-x']

From python-checkins at python.org Fri Apr 1 09:20:17 2011
From: python-checkins at python.org (georg.brandl)
Date: Fri, 01 Apr 2011 09:20:17 +0200
Subject: [Python-checkins] cpython: Fix markup.
Message-ID:

http://hg.python.org/cpython/rev/214d0608fb84
changeset: 69098:214d0608fb84
user: Georg Brandl
date: Fri Apr 01 09:19:57 2011 +0200
summary:
  Fix markup.

files:
  Doc/library/faulthandler.rst | 4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)


diff --git a/Doc/library/faulthandler.rst b/Doc/library/faulthandler.rst
--- a/Doc/library/faulthandler.rst
+++ b/Doc/library/faulthandler.rst
@@ -69,8 +69,8 @@

    Dump the tracebacks of all threads, after a timeout of *timeout* seconds, or
    each *timeout* seconds if *repeat* is ``True``. If *exit* is True, call
-   :cfunc:`_exit` with status=1 after dumping the tracebacks to terminate
-   immediatly the process, which is not safe. For example, :cfunc:`_exit`
+   :c:func:`_exit` with status=1 after dumping the tracebacks to terminate
+   immediatly the process, which is not safe. For example, :c:func:`_exit`
    doesn't flush file buffers. If the function is called twice, the new call
    replaces previous parameters (resets the timeout). The timer has a
    sub-second resolution.

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Fri Apr 1 12:14:28 2011
From: python-checkins at python.org (victor.stinner)
Date: Fri, 01 Apr 2011 12:14:28 +0200
Subject: [Python-checkins] cpython: Issue #11393: fault handler uses raise(signum) for SIGILL on Windows
Message-ID:

http://hg.python.org/cpython/rev/e51d8a160a8a
changeset: 69099:e51d8a160a8a
user: Victor Stinner
date: Fri Apr 01 12:08:57 2011 +0200
summary:
  Issue #11393: fault handler uses raise(signum) for SIGILL on Windows

files:
  Modules/faulthandler.c | 27 ++++++++++++---------------
  1 files changed, 12 insertions(+), 15 deletions(-)


diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c
--- a/Modules/faulthandler.c
+++ b/Modules/faulthandler.c
@@ -270,14 +270,16 @@
     else
         _Py_DumpTraceback(fd, tstate);

-#ifndef MS_WINDOWS
-    /* call the previous signal handler: it is called if we use sigaction()
-       thanks to SA_NODEFER flag, otherwise it is deferred */
+#ifdef MS_WINDOWS
+    if (signum == SIGSEGV) {
+        /* don't call explictly the previous handler for SIGSEGV in this signal
+           handler, because the Windows signal handler would not be called */
+        return;
+    }
+#endif
+    /* call the previous signal handler: it is called immediatly if we use
+       sigaction() thanks to SA_NODEFER flag, otherwise it is deferred */
     raise(signum);
-#else
-    /* on Windows, don't call explictly the previous handler, because Windows
-       signal handler would not be called */
-#endif
 }

 /* Install handler for fatal signals (SIGSEGV, SIGFPE, ...). */
@@ -681,8 +683,9 @@
 faulthandler_sigsegv(PyObject *self, PyObject *args)
 {
 #if defined(MS_WINDOWS)
-    /* faulthandler_fatal_error() restores the previous signal handler and then
-       gives back the execution flow to the program. In a normal case, the
+    /* For SIGSEGV, faulthandler_fatal_error() restores the previous signal
+       handler and then gives back the execution flow to the program (without
+       calling explicitly the previous error handler). In a normal case, the
        SIGSEGV was raised by the kernel because of a fault, and so if the program
        retries to execute the same instruction, the fault will be raised again.

@@ -724,13 +727,7 @@
 static PyObject *
 faulthandler_sigill(PyObject *self, PyObject *args)
 {
-#if defined(MS_WINDOWS)
-    /* see faulthandler_sigsegv() for the explanation about while(1) */
-    while(1)
-        raise(SIGILL);
-#else
     raise(SIGILL);
-#endif
     Py_RETURN_NONE;
 }
 #endif

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Fri Apr 1 12:14:29 2011
From: python-checkins at python.org (victor.stinner)
Date: Fri, 01 Apr 2011 12:14:29 +0200
Subject: [Python-checkins] cpython: Issue #11393: The fault handler handles also SIGABRT
Message-ID:

http://hg.python.org/cpython/rev/a4fa79b0d478
changeset: 69100:a4fa79b0d478
user: Victor Stinner
date: Fri Apr 01 12:13:55 2011 +0200
summary:
  Issue #11393: The fault handler handles also SIGABRT

files:
  Doc/library/faulthandler.rst | 14 ++++----
  Doc/using/cmdline.rst | 5 ++-
  Lib/test/test_faulthandler.py | 9 ++++++
  Modules/faulthandler.c | 33 +++++++++++++++++-----
  Python/pythonrun.c | 1 +
  5 files changed, 45 insertions(+), 17 deletions(-)


diff --git a/Doc/library/faulthandler.rst b/Doc/library/faulthandler.rst
--- a/Doc/library/faulthandler.rst
+++ b/Doc/library/faulthandler.rst
@@ -6,10 +6,10 @@

 This module contains functions to dump the Python traceback explicitly, on a
 fault, after a timeout or on a user signal. Call :func:`faulthandler.enable` to
-install fault handlers for :const:`SIGSEGV`, :const:`SIGFPE`, :const:`SIGBUS`
-and :const:`SIGILL` signals. You can also enable them at startup by setting the
-:envvar:`PYTHONFAULTHANDLER` environment variable or by using :option:`-X`
-``faulthandler`` command line option.
+install fault handlers for :const:`SIGSEGV`, :const:`SIGFPE`, :const:`SIGABRT`,
+:const:`SIGBUS` and :const:`SIGILL` signals. You can also enable them at
+startup by setting the :envvar:`PYTHONFAULTHANDLER` environment variable or by
+using :option:`-X` ``faulthandler`` command line option.

 The fault handler is compatible with system fault handlers like Apport or the
 Windows fault handler. The module uses an alternative stack for signal
@@ -48,9 +48,9 @@
 .. function:: enable(file=sys.stderr, all_threads=False)

    Enable the fault handler: install handlers for :const:`SIGSEGV`,
-   :const:`SIGFPE`, :const:`SIGBUS` and :const:`SIGILL` signals to dump the
-   Python traceback. It dumps the traceback of the current thread, or all
-   threads if *all_threads* is ``True``, into *file*.
+   :const:`SIGFPE`, :const:`SIGABRT`, :const:`SIGBUS` and :const:`SIGILL`
+   signals to dump the Python traceback. It dumps the traceback of the current
+   thread, or all threads if *all_threads* is ``True``, into *file*.

 .. function:: disable()

diff --git a/Doc/using/cmdline.rst b/Doc/using/cmdline.rst
--- a/Doc/using/cmdline.rst
+++ b/Doc/using/cmdline.rst
@@ -502,8 +502,9 @@

    If this environment variable is set, :func:`faulthandler.enable` is called
    at startup: install a handler for :const:`SIGSEGV`, :const:`SIGFPE`,
-   :const:`SIGBUS` and :const:`SIGILL` signals to dump the Python traceback.
-   This is equivalent to :option:`-X` ``faulthandler`` option.
+   :const:`SIGABRT`, :const:`SIGBUS` and :const:`SIGILL` signals to dump the
+   Python traceback. This is equivalent to :option:`-X` ``faulthandler``
+   option.

 Debug-mode variables
diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py
--- a/Lib/test/test_faulthandler.py
+++ b/Lib/test/test_faulthandler.py
@@ -112,6 +112,15 @@
             3,
             'Segmentation fault')

+    def test_sigabrt(self):
+        self.check_fatal_error("""
+import faulthandler
+faulthandler.enable()
+faulthandler._sigabrt()
+""".strip(),
+            3,
+            'Aborted')
+
     @unittest.skipIf(sys.platform == 'win32',
                      "SIGFPE cannot be caught on Windows")
     def test_sigfpe(self):
diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c
--- a/Modules/faulthandler.c
+++ b/Modules/faulthandler.c
@@ -10,9 +10,9 @@
 #endif

 #ifndef MS_WINDOWS
-   /* register() is useless on Windows, because only SIGSEGV and SIGILL can be
-      handled by the process, and these signals can only be used with enable(),
-      not using register() */
+   /* register() is useless on Windows, because only SIGSEGV, SIGABRT and
+      SIGILL can be handled by the process, and these signals can only be used
+      with enable(), not using register() */
 #  define FAULTHANDLER_USER
 #endif

@@ -96,6 +96,7 @@
     {SIGILL, 0, "Illegal instruction", },
 #endif
     {SIGFPE, 0, "Floating point exception", },
+    {SIGABRT, 0, "Aborted", },
     /* define SIGSEGV at the end to make it the default choice if searching the
        handler fails in faulthandler_fatal_error() */
     {SIGSEGV, 0, "Segmentation fault", }
@@ -202,7 +203,7 @@
 }


-/* Handler of SIGSEGV, SIGFPE, SIGBUS and SIGILL signals.
+/* Handler of SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL signals.

    Display the current Python traceback, restore the previous handler and call
    the previous handler.
@@ -253,9 +254,9 @@
     PUTS(fd, handler->name);
     PUTS(fd, "\n\n");

-    /* SIGSEGV, SIGFPE, SIGBUS and SIGILL are synchronous signals and so are
-       delivered to the thread that caused the fault. Get the Python thread
-       state of the current thread.
+    /* SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL are synchronous signals and
+       so are delivered to the thread that caused the fault. Get the Python
+       thread state of the current thread.

        PyThreadState_Get() doesn't give the state of the thread that caused the
        fault if the thread released the GIL, and so this function cannot be
@@ -282,7 +283,7 @@
     raise(signum);
 }

-/* Install handler for fatal signals (SIGSEGV, SIGFPE, ...). */
+/* Install the handler for fatal signals, faulthandler_fatal_error(). */

 static PyObject*
 faulthandler_enable(PyObject *self, PyObject *args, PyObject *kwargs)
@@ -714,6 +715,20 @@
     Py_RETURN_NONE;
 }

+static PyObject *
+faulthandler_sigabrt(PyObject *self, PyObject *args)
+{
+#if _MSC_VER
+    /* If Python is compiled in debug mode with Visual Studio, abort() opens
+       a popup asking the user how to handle the assertion. Use raise(SIGABRT)
+       instead. */
+    raise(SIGABRT);
+#else
+    abort();
+#endif
+    Py_RETURN_NONE;
+}
+
 #ifdef SIGBUS
 static PyObject *
 faulthandler_sigbus(PyObject *self, PyObject *args)
@@ -847,6 +862,8 @@
                "a SIGSEGV or SIGBUS signal depending on the platform")},
     {"_sigsegv", faulthandler_sigsegv, METH_VARARGS,
      PyDoc_STR("_sigsegv(): raise a SIGSEGV signal")},
+    {"_sigabrt", faulthandler_sigabrt, METH_VARARGS,
+     PyDoc_STR("_sigabrt(): raise a SIGABRT signal")},
     {"_sigfpe", (PyCFunction)faulthandler_sigfpe, METH_NOARGS,
      PyDoc_STR("_sigfpe(): raise a SIGFPE signal")},
 #ifdef SIGBUS
diff --git a/Python/pythonrun.c b/Python/pythonrun.c
--- a/Python/pythonrun.c
+++ b/Python/pythonrun.c
@@ -2124,6 +2124,7 @@
         fflush(stderr);
         _Py_DumpTraceback(fd, tstate);
     }
+    _PyFaulthandler_Fini();
 }

 #ifdef MS_WINDOWS

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Fri Apr 1 13:10:41 2011
From: python-checkins at python.org (victor.stinner)
Date: Fri, 01 Apr 2011 13:10:41 +0200
Subject: [Python-checkins] cpython: Issue #11393: Fix faulthandler.disable() and add a test
Message-ID:

http://hg.python.org/cpython/rev/a27755b10448
changeset: 69101:a27755b10448
user: Victor Stinner
date: Fri Apr 01 12:56:17 2011 +0200
summary:
  Issue #11393: Fix faulthandler.disable() and add a test

files:
  Lib/test/test_faulthandler.py | 32 +++++++++++++++++-----
  Modules/faulthandler.c | 8 ++--
  2 files changed, 28 insertions(+), 12 deletions(-)


diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py
--- a/Lib/test/test_faulthandler.py
+++ b/Lib/test/test_faulthandler.py
@@ -431,13 +431,15 @@

     @unittest.skipIf(not hasattr(faulthandler, "register"),
                      "need faulthandler.register")
-    def check_register(self, filename=False, all_threads=False):
+    def check_register(self, filename=False, all_threads=False,
+                       unregister=False):
         """
         Register a handler displaying the traceback on a user signal. Raise the
         signal and check the written traceback.

         Raise an error if the output doesn't match the expected format.
         """
+        signum = signal.SIGUSR1
         code = """
 import faulthandler
 import os
@@ -446,12 +448,15 @@
 def func(signum):
     os.kill(os.getpid(), signum)

-signum = signal.SIGUSR1
+signum = {signum}
+unregister = {unregister}
 if {has_filename}:
     file = open({filename}, "wb")
 else:
     file = None
 faulthandler.register(signum, file=file, all_threads={all_threads})
+if unregister:
+    faulthandler.unregister(signum)
 func(signum)
 if file is not None:
     file.close()
@@ -460,20 +465,31 @@
             filename=repr(filename),
             has_filename=bool(filename),
             all_threads=all_threads,
+            signum=signum,
+            unregister=unregister,
         )
         trace, exitcode = self.get_output(code, filename)
         trace = '\n'.join(trace)
-        if all_threads:
-            regex = 'Current thread XXX:\n'
+        if not unregister:
+            if all_threads:
+                regex = 'Current thread XXX:\n'
+            else:
+                regex = 'Traceback \(most recent call first\):\n'
+            regex = expected_traceback(6, 17, regex)
+            self.assertRegex(trace, regex)
         else:
-            regex = 'Traceback \(most recent call first\):\n'
-        regex = expected_traceback(6, 14, regex)
-        self.assertRegex(trace, regex)
-        self.assertEqual(exitcode, 0)
+            self.assertEqual(trace, '')
+        if unregister:
+            self.assertNotEqual(exitcode, 0)
+        else:
+            self.assertEqual(exitcode, 0)

     def test_register(self):
         self.check_register()

+    def test_unregister(self):
+        self.check_register(unregister=True)
+
     def test_register_file(self):
         with temporary_filename() as filename:
             self.check_register(filename=filename)
diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c
--- a/Modules/faulthandler.c
+++ b/Modules/faulthandler.c
@@ -628,7 +628,7 @@
 static int
 faulthandler_unregister(user_signal_t *user, int signum)
 {
-    if (user->enabled)
+    if (!user->enabled)
         return 0;
     user->enabled = 0;
 #ifdef HAVE_SIGACTION
@@ -976,7 +976,7 @@
 void _PyFaulthandler_Fini(void)
 {
 #ifdef FAULTHANDLER_USER
-    unsigned int i;
+    unsigned int signum;
 #endif

 #ifdef FAULTHANDLER_LATER
@@ -995,8 +995,8 @@
 #ifdef FAULTHANDLER_USER
     /* user */
     if (user_signals != NULL) {
-        for (i=0; i < NSIG; i++)
-            faulthandler_unregister(&user_signals[i], i+1);
+        for (signum=0; signum < NSIG; signum++)
+            faulthandler_unregister(&user_signals[signum], signum);
         free(user_signals);
         user_signals = NULL;
     }

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Fri Apr 1 15:39:59 2011
From: python-checkins at python.org (victor.stinner)
Date: Fri, 01 Apr 2011 15:39:59 +0200
Subject: [Python-checkins] cpython: Issue #11393: _Py_DumpTraceback() writes the header even if there is no frame
Message-ID:

http://hg.python.org/cpython/rev/7e3ed426962f
changeset: 69102:7e3ed426962f
user: Victor Stinner
date: Fri Apr 01 15:34:01 2011 +0200
summary:
  Issue #11393: _Py_DumpTraceback() writes the header even if there is no frame

files:
  Include/traceback.h | 4 +---
  Python/traceback.c | 14 +++++++-------
  2 files changed, 8 insertions(+), 10 deletions(-)


diff --git a/Include/traceback.h b/Include/traceback.h
--- a/Include/traceback.h
+++ b/Include/traceback.h
@@ -38,8 +38,6 @@
      ...
      File "xxx", line xxx in

-   Return 0 on success, -1 on error.
-
    This function is written for debug purpose only, to dump the traceback in the
    worst case: after a segmentation fault, at fatal error, etc.
    That's why, it
    is very limited. Strings are truncated to 100 characters and encoded to
@@ -49,7 +47,7 @@

    This function is signal safe. */

-PyAPI_DATA(int) _Py_DumpTraceback(
+PyAPI_DATA(void) _Py_DumpTraceback(
     int fd,
     PyThreadState *tstate);

diff --git a/Python/traceback.c b/Python/traceback.c
--- a/Python/traceback.c
+++ b/Python/traceback.c
@@ -556,18 +556,19 @@
     write(fd, "\n", 1);
 }

-static int
+static void
 dump_traceback(int fd, PyThreadState *tstate, int write_header)
 {
     PyFrameObject *frame;
     unsigned int depth;

+    if (write_header)
+        PUTS(fd, "Traceback (most recent call first):\n");
+
     frame = _PyThreadState_GetFrame(tstate);
     if (frame == NULL)
-        return -1;
+        return;

-    if (write_header)
-        PUTS(fd, "Traceback (most recent call first):\n");
     depth = 0;
     while (frame != NULL) {
         if (MAX_FRAME_DEPTH <= depth) {
@@ -580,13 +581,12 @@
         frame = frame->f_back;
         depth++;
     }
-    return 0;
 }

-int
+void
 _Py_DumpTraceback(int fd, PyThreadState *tstate)
 {
-    return dump_traceback(fd, tstate, 1);
+    dump_traceback(fd, tstate, 1);
 }

 /* Write the thread identifier into the file 'fd': "Current thread 0xHHHH:\" if

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Fri Apr 1 15:40:03 2011
From: python-checkins at python.org (victor.stinner)
Date: Fri, 01 Apr 2011 15:40:03 +0200
Subject: [Python-checkins] cpython: Issue #11393: signal of user signal displays tracebacks even if tstate==NULL
Message-ID:

http://hg.python.org/cpython/rev/e609105dff64
changeset: 69103:e609105dff64
user: Victor Stinner
date: Fri Apr 01 15:37:12 2011 +0200
summary:
  Issue #11393: signal of user signal displays tracebacks even if tstate==NULL

 * faulthandler_user() displays the tracebacks of all threads even if it is
   unable to get the state of the current thread
 * test_faulthandler: only release the GIL in test_gil_released() check
 * create check_signum() subfunction

files:
  Lib/test/test_faulthandler.py | 9 ++-
  Modules/faulthandler.c | 58 ++++++++++++++--------
  2 files changed, 43 insertions(+), 24 deletions(-)


diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py
--- a/Lib/test/test_faulthandler.py
+++ b/Lib/test/test_faulthandler.py
@@ -8,6 +8,8 @@
 import tempfile
 import unittest

+TIMEOUT = 0.5
+
 try:
     from resource import setrlimit, RLIMIT_CORE, error as resource_error
 except ImportError:
@@ -189,7 +191,7 @@
 import faulthandler
 output = open({filename}, 'wb')
 faulthandler.enable(output)
-faulthandler._read_null(True)
+faulthandler._read_null()
 """.strip().format(filename=repr(filename)),
             4,
             '(?:Segmentation fault|Bus error)',
@@ -199,7 +201,7 @@
         self.check_fatal_error("""
 import faulthandler
 faulthandler.enable(all_threads=True)
-faulthandler._read_null(True)
+faulthandler._read_null()
 """.strip(),
             3,
             '(?:Segmentation fault|Bus error)',
@@ -376,7 +378,7 @@
     # Check that sleep() was not interrupted
     assert (b - a) >= min_pause, "{{}} < {{}}".format(b - a, min_pause)

-timeout = 0.5
+timeout = {timeout}
 repeat = {repeat}
 cancel = {cancel}
 if {has_filename}:
@@ -394,6 +396,7 @@
             has_filename=bool(filename),
             repeat=repeat,
             cancel=cancel,
+            timeout=TIMEOUT,
         )
         trace, exitcode = self.get_output(code, filename)
         trace = '\n'.join(trace)
diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c
--- a/Modules/faulthandler.c
+++ b/Modules/faulthandler.c
@@ -65,6 +65,7 @@
     int fd;
     int all_threads;
     _Py_sighandler_t previous;
+    PyInterpreterState *interp;
 } user_signal_t;

 static user_signal_t *user_signals;
@@ -529,15 +530,35 @@
        the thread doesn't hold the GIL. Read the thread local storage (TLS)
        instead: call PyGILState_GetThisThreadState(). */
     tstate = PyGILState_GetThisThreadState();
-    if (tstate == NULL) {
-        /* unable to get the current thread, do nothing */
-        return;
-    }

     if (user->all_threads)
-        _Py_DumpTracebackThreads(user->fd, tstate->interp, tstate);
-    else
+        _Py_DumpTracebackThreads(user->fd, user->interp, tstate);
+    else {
+        if (tstate == NULL)
+            return;
         _Py_DumpTraceback(user->fd, tstate);
+    }
+}
+
+static int
+check_signum(int signum)
+{
+    unsigned int i;
+
+    for (i=0; i < faulthandler_nsignals; i++) {
+        if (faulthandler_handlers[i].signum == signum) {
+            PyErr_Format(PyExc_RuntimeError,
+                         "signal %i cannot be registered, "
+                         "use enable() instead",
+                         signum);
+            return 0;
+        }
+    }
+    if (signum < 1 || NSIG <= signum) {
+        PyErr_SetString(PyExc_ValueError, "signal number out of range");
+        return 0;
+    }
+    return 1;
 }

 static PyObject*
@@ -549,12 +570,12 @@
     PyObject *file = NULL;
     int all_threads = 0;
     int fd;
-    unsigned int i;
     user_signal_t *user;
     _Py_sighandler_t previous;
 #ifdef HAVE_SIGACTION
     struct sigaction action;
 #endif
+    PyThreadState *tstate;
     int err;

     if (!PyArg_ParseTupleAndKeywords(args, kwargs,
@@ -562,19 +583,15 @@
         &signum, &file, &all_threads))
         return NULL;

-    if (signum < 1 || NSIG <= signum) {
-        PyErr_SetString(PyExc_ValueError, "signal number out of range");
+    if (!check_signum(signum))
         return NULL;
-    }

-    for (i=0; i < faulthandler_nsignals; i++) {
-        if (faulthandler_handlers[i].signum == signum) {
-            PyErr_Format(PyExc_RuntimeError,
-                         "signal %i cannot be registered by register(), "
-                         "use enable() instead",
-                         signum);
-            return NULL;
-        }
+    /* The caller holds the GIL and so PyThreadState_Get() can be used */
+    tstate = PyThreadState_Get();
+    if (tstate == NULL) {
+        PyErr_SetString(PyExc_RuntimeError,
+                        "unable to get the current thread state");
+        return NULL;
     }

     file = faulthandler_get_fileno(file, &fd);
@@ -620,6 +637,7 @@
     user->fd = fd;
     user->all_threads = all_threads;
     user->previous = previous;
+    user->interp = tstate->interp;
     user->enabled = 1;

     Py_RETURN_NONE;
@@ -651,10 +669,8 @@
     if (!PyArg_ParseTuple(args, "i:unregister", &signum))
         return NULL;

-    if (signum < 1 || NSIG <= signum) {
-        PyErr_SetString(PyExc_ValueError, "signal number out of range");
+    if (!check_signum(signum))
         return NULL;
-    }

     user = &user_signals[signum];
     change = faulthandler_unregister(user, signum);

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Fri Apr 1 16:00:11 2011
From: python-checkins at python.org (victor.stinner)
Date: Fri, 01 Apr 2011 16:00:11 +0200
Subject: [Python-checkins] cpython: Issue #11727: set regrtest default timeout to 15 minutes
Message-ID:

http://hg.python.org/cpython/rev/15f6fe139181
changeset: 69104:15f6fe139181
user: Victor Stinner
date: Fri Apr 01 15:59:59 2011 +0200
summary:
  Issue #11727: set regrtest default timeout to 15 minutes

files:
  Lib/test/regrtest.py | 5 +++--
  Misc/NEWS | 4 +++-
  2 files changed, 6 insertions(+), 3 deletions(-)


diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py
--- a/Lib/test/regrtest.py
+++ b/Lib/test/regrtest.py
@@ -22,7 +22,8 @@
 -h/--help       -- print this text and exit
 --timeout TIMEOUT
                 -- dump the traceback and exit if a test takes more
-                   than TIMEOUT seconds
+                   than TIMEOUT seconds (default: 15 minutes); disable
+                   the timeout if TIMEOUT is zero

 Verbosity

@@ -239,7 +240,7 @@
          findleaks=False, use_resources=None, trace=False, coverdir='coverage',
          runleaks=False, huntrleaks=False, verbose2=False, print_slow=False,
          random_seed=None, use_mp=None, verbose3=False, forever=False,
-         header=False, timeout=None):
+         header=False, timeout=15*60):
     """Execute a test suite.

     This also parses command-line options and modifies its behavior
diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -361,7 +361,9 @@
 Tests
 -----

-- Issue #11727: add --timeout option to regrtest (disabled by default).
+- Issue #11727: If a test takes more than 15 minutes, regrtest dumps the
+  traceback of all threads and exits. Use --timeout option to change the
+  default timeout or to disable it.

 - Issue #11653: fix -W with -j in regrtest.

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Fri Apr 1 18:17:16 2011
From: python-checkins at python.org (r.david.murray)
Date: Fri, 01 Apr 2011 18:17:16 +0200
Subject: [Python-checkins] devguide: Top level sections only in sidebar TOC on FAQ page.
Message-ID:

http://hg.python.org/devguide/rev/1dc036ca6c94
changeset: 409:1dc036ca6c94
user: R David Murray
date: Fri Apr 01 12:16:53 2011 -0400
summary:
  Top level sections only in sidebar TOC on FAQ page.

files:
  faq.rst | 2 ++
  1 files changed, 2 insertions(+), 0 deletions(-)


diff --git a/faq.rst b/faq.rst
--- a/faq.rst
+++ b/faq.rst
@@ -1,3 +1,5 @@
+:tocdepth: 2
+
 .. _faq:

 Python Developer FAQ

--
Repository URL: http://hg.python.org/devguide

From python-checkins at python.org Fri Apr 1 18:40:20 2011
From: python-checkins at python.org (victor.stinner)
Date: Fri, 01 Apr 2011 18:40:20 +0200
Subject: [Python-checkins] cpython: Issue #11727: set regrtest default timeout to 30 minutes
Message-ID:

http://hg.python.org/cpython/rev/053bc5ca199b
changeset: 69105:053bc5ca199b
user: Victor Stinner
date: Fri Apr 01 18:16:36 2011 +0200
summary:
  Issue #11727: set regrtest default timeout to 30 minutes

files:
  Lib/test/regrtest.py | 4 ++--
  Misc/NEWS | 2 +-
  2 files changed, 3 insertions(+), 3 deletions(-)


diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py
--- a/Lib/test/regrtest.py
+++ b/Lib/test/regrtest.py
@@ -22,7 +22,7 @@
 -h/--help       -- print this text and exit
 --timeout TIMEOUT
                 -- dump the traceback and exit if a test takes more
-                   than TIMEOUT seconds (default: 15 minutes); disable
+                   than TIMEOUT seconds (default: 30 minutes); disable
                    the timeout if TIMEOUT is zero

 Verbosity
@@ -240,7 +240,7 @@
          findleaks=False, use_resources=None, trace=False, coverdir='coverage',
          runleaks=False, huntrleaks=False, verbose2=False, print_slow=False,
          random_seed=None, use_mp=None, verbose3=False, forever=False,
-         header=False, timeout=15*60):
+         header=False, timeout=30*60):
     """Execute a test suite.

     This also parses command-line options and modifies its behavior
diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -361,7 +361,7 @@
 Tests
 -----

-- Issue #11727: If a test takes more than 15 minutes, regrtest dumps the
+- Issue #11727: If a test takes more than 30 minutes, regrtest dumps the
   traceback of all threads and exits. Use --timeout option to change the
   default timeout or to disable it.

--
Repository URL: http://hg.python.org/cpython

From solipsis at pitrou.net Sat Apr 2 04:57:59 2011
From: solipsis at pitrou.net (solipsis at pitrou.net)
Date: Sat, 02 Apr 2011 04:57:59 +0200
Subject: [Python-checkins] Daily reference leaks (053bc5ca199b): sum=0
Message-ID:

results for 053bc5ca199b on branch "default"
--------------------------------------------

Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflog21pzX3', '-x']

From solipsis at pitrou.net Sun Apr 3 04:55:31 2011
From: solipsis at pitrou.net (solipsis at pitrou.net)
Date: Sun, 03 Apr 2011 04:55:31 +0200
Subject: [Python-checkins] Daily reference leaks (053bc5ca199b): sum=-56
Message-ID:

results for 053bc5ca199b on branch "default"
--------------------------------------------

test_pyexpat leaked [0, 0, -56] references, sum=-56

Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogPtWxL8', '-x']

From python-checkins at python.org Sun Apr 3 15:26:49 2011
From: python-checkins at python.org (ezio.melotti)
Date: Sun, 03 Apr 2011 15:26:49 +0200
Subject: [Python-checkins] cpython (3.1): Fix typo noticed by Sandro Tosi.
Message-ID:

http://hg.python.org/cpython/rev/821244a44163
changeset: 69106:821244a44163
branch: 3.1
parent: 69066:8e074d9b1587
user: Ezio Melotti
date: Sun Apr 03 16:20:21 2011 +0300
summary:
  Fix typo noticed by Sandro Tosi.
files: Doc/library/profile.rst | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Doc/library/profile.rst b/Doc/library/profile.rst --- a/Doc/library/profile.rst +++ b/Doc/library/profile.rst @@ -48,7 +48,7 @@ The profiler modules are designed to provide an execution profile for a given program, not for benchmarking purposes (for that, there is :mod:`timeit` for - resonably accurate results). This particularly applies to benchmarking + reasonably accurate results). This particularly applies to benchmarking Python code against C code: the profilers introduce overhead for Python code, but not for C-level functions, and so the C code would seem faster than any Python one. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 15:26:49 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 03 Apr 2011 15:26:49 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Merge with 3.1 Message-ID: http://hg.python.org/cpython/rev/5fd1ac1c9474 changeset: 69107:5fd1ac1c9474 branch: 3.2 parent: 69093:7aa3f1f7ac94 parent: 69106:821244a44163 user: Ezio Melotti date: Sun Apr 03 16:24:22 2011 +0300 summary: Merge with 3.1 files: Doc/library/profile.rst | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Doc/library/profile.rst b/Doc/library/profile.rst --- a/Doc/library/profile.rst +++ b/Doc/library/profile.rst @@ -50,7 +50,7 @@ The profiler modules are designed to provide an execution profile for a given program, not for benchmarking purposes (for that, there is :mod:`timeit` for - resonably accurate results). This particularly applies to benchmarking + reasonably accurate results). This particularly applies to benchmarking Python code against C code: the profilers introduce overhead for Python code, but not for C-level functions, and so the C code would seem faster than any Python one. 
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 15:26:50 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 03 Apr 2011 15:26:50 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge with 3.2 Message-ID: http://hg.python.org/cpython/rev/ca5932a51a9b changeset: 69108:ca5932a51a9b parent: 69105:053bc5ca199b parent: 69107:5fd1ac1c9474 user: Ezio Melotti date: Sun Apr 03 16:25:49 2011 +0300 summary: Merge with 3.2 files: Doc/library/profile.rst | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Doc/library/profile.rst b/Doc/library/profile.rst --- a/Doc/library/profile.rst +++ b/Doc/library/profile.rst @@ -50,7 +50,7 @@ The profiler modules are designed to provide an execution profile for a given program, not for benchmarking purposes (for that, there is :mod:`timeit` for - resonably accurate results). This particularly applies to benchmarking + reasonably accurate results). This particularly applies to benchmarking Python code against C code: the profilers introduce overhead for Python code, but not for C-level functions, and so the C code would seem faster than any Python one. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 17:03:32 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 03 Apr 2011 17:03:32 +0200 Subject: [Python-checkins] cpython (3.2): #11282: the fail* methods will stay around a few more versions. Message-ID: http://hg.python.org/cpython/rev/1fd736395df3 changeset: 69109:1fd736395df3 branch: 3.2 parent: 69107:5fd1ac1c9474 user: Ezio Melotti date: Sun Apr 03 17:37:58 2011 +0300 summary: #11282: the fail* methods will stay around a few more versions. 
files: Doc/library/unittest.rst | 2 +- Lib/unittest/case.py | 3 +-- Lib/unittest/test/test_case.py | 6 ++---- 3 files changed, 4 insertions(+), 7 deletions(-) diff --git a/Doc/library/unittest.rst b/Doc/library/unittest.rst --- a/Doc/library/unittest.rst +++ b/Doc/library/unittest.rst @@ -1459,7 +1459,7 @@ :meth:`.assertRaisesRegex` assertRaisesRegexp ============================== ====================== ====================== - .. deprecated-removed:: 3.1 3.3 + .. deprecated:: 3.1 the fail* aliases listed in the second column. .. deprecated:: 3.2 the assert* aliases listed in the third column. diff --git a/Lib/unittest/case.py b/Lib/unittest/case.py --- a/Lib/unittest/case.py +++ b/Lib/unittest/case.py @@ -1181,8 +1181,7 @@ return original_func(*args, **kwargs) return deprecated_func - # The fail* methods can be removed in 3.3, the 5 assert* methods will - # have to stay around for a few more versions. See #9424. + # see #9424 failUnlessEqual = assertEquals = _deprecate(assertEqual) failIfEqual = assertNotEquals = _deprecate(assertNotEqual) failUnlessAlmostEqual = assertAlmostEquals = _deprecate(assertAlmostEqual) diff --git a/Lib/unittest/test/test_case.py b/Lib/unittest/test/test_case.py --- a/Lib/unittest/test/test_case.py +++ b/Lib/unittest/test/test_case.py @@ -1088,10 +1088,8 @@ _runtime_warn("barz") def testDeprecatedMethodNames(self): - """Test that the deprecated methods raise a DeprecationWarning. - - The fail* methods will be removed in 3.3. The assert* methods will - have to stay around for a few more versions. See #9424. + """ + Test that the deprecated methods raise a DeprecationWarning. See #9424. """ old = ( (self.failIfEqual, (3, 5)), -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 17:03:33 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 03 Apr 2011 17:03:33 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): #11282: merge with 3.2. 
Message-ID: http://hg.python.org/cpython/rev/110bb604bc2f changeset: 69110:110bb604bc2f parent: 69108:ca5932a51a9b parent: 69109:1fd736395df3 user: Ezio Melotti date: Sun Apr 03 17:39:19 2011 +0300 summary: #11282: merge with 3.2. files: Doc/library/unittest.rst | 2 +- Lib/unittest/case.py | 3 +-- Lib/unittest/test/test_case.py | 6 ++---- 3 files changed, 4 insertions(+), 7 deletions(-) diff --git a/Doc/library/unittest.rst b/Doc/library/unittest.rst --- a/Doc/library/unittest.rst +++ b/Doc/library/unittest.rst @@ -1462,7 +1462,7 @@ :meth:`.assertRaisesRegex` assertRaisesRegexp ============================== ====================== ====================== - .. deprecated-removed:: 3.1 3.3 + .. deprecated:: 3.1 the fail* aliases listed in the second column. .. deprecated:: 3.2 the assert* aliases listed in the third column. diff --git a/Lib/unittest/case.py b/Lib/unittest/case.py --- a/Lib/unittest/case.py +++ b/Lib/unittest/case.py @@ -1110,8 +1110,7 @@ return original_func(*args, **kwargs) return deprecated_func - # The fail* methods can be removed in 3.3, the 5 assert* methods will - # have to stay around for a few more versions. See #9424. + # see #9424 assertEquals = _deprecate(assertEqual) assertNotEquals = _deprecate(assertNotEqual) assertAlmostEquals = _deprecate(assertAlmostEqual) diff --git a/Lib/unittest/test/test_case.py b/Lib/unittest/test/test_case.py --- a/Lib/unittest/test/test_case.py +++ b/Lib/unittest/test/test_case.py @@ -1093,10 +1093,8 @@ _runtime_warn("barz") def testDeprecatedMethodNames(self): - """Test that the deprecated methods raise a DeprecationWarning. - - The fail* methods will be removed in 3.3. The assert* methods will - have to stay around for a few more versions. See #9424. + """ + Test that the deprecated methods raise a DeprecationWarning. See #9424. 
""" old = ( (self.assertNotEquals, (3, 5)), -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 17:03:36 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 03 Apr 2011 17:03:36 +0200 Subject: [Python-checkins] cpython: #11282: add back the fail* methods and assertDictContainsSubset. Message-ID: http://hg.python.org/cpython/rev/aa658836e090 changeset: 69111:aa658836e090 user: Ezio Melotti date: Sun Apr 03 18:02:13 2011 +0300 summary: #11282: add back the fail* methods and assertDictContainsSubset. files: Lib/unittest/case.py | 41 ++++++++++- Lib/unittest/test/_test_warnings.py | 7 +- Lib/unittest/test/test_assertions.py | 9 ++ Lib/unittest/test/test_case.py | 53 ++++++++++++++++ Lib/unittest/test/test_runner.py | 10 ++- Misc/NEWS | 2 + 6 files changed, 113 insertions(+), 9 deletions(-) diff --git a/Lib/unittest/case.py b/Lib/unittest/case.py --- a/Lib/unittest/case.py +++ b/Lib/unittest/case.py @@ -938,6 +938,35 @@ standardMsg = self._truncateMessage(standardMsg, diff) self.fail(self._formatMessage(msg, standardMsg)) + def assertDictContainsSubset(self, subset, dictionary, msg=None): + """Checks whether dictionary is a superset of subset.""" + warnings.warn('assertDictContainsSubset is deprecated', + DeprecationWarning) + missing = [] + mismatched = [] + for key, value in subset.items(): + if key not in dictionary: + missing.append(key) + elif value != dictionary[key]: + mismatched.append('%s, expected: %s, actual: %s' % + (safe_repr(key), safe_repr(value), + safe_repr(dictionary[key]))) + + if not (missing or mismatched): + return + + standardMsg = '' + if missing: + standardMsg = 'Missing: %s' % ','.join(safe_repr(m) for m in + missing) + if mismatched: + if standardMsg: + standardMsg += '; ' + standardMsg += 'Mismatched values: %s' % ','.join(mismatched) + + self.fail(self._formatMessage(msg, standardMsg)) + + def assertCountEqual(self, first, second, msg=None): """An unordered sequence comparison asserting 
that the same elements, regardless of order. If the same element occurs more than once, @@ -1111,11 +1140,13 @@ return deprecated_func # see #9424 - assertEquals = _deprecate(assertEqual) - assertNotEquals = _deprecate(assertNotEqual) - assertAlmostEquals = _deprecate(assertAlmostEqual) - assertNotAlmostEquals = _deprecate(assertNotAlmostEqual) - assert_ = _deprecate(assertTrue) + failUnlessEqual = assertEquals = _deprecate(assertEqual) + failIfEqual = assertNotEquals = _deprecate(assertNotEqual) + failUnlessAlmostEqual = assertAlmostEquals = _deprecate(assertAlmostEqual) + failIfAlmostEqual = assertNotAlmostEquals = _deprecate(assertNotAlmostEqual) + failUnless = assert_ = _deprecate(assertTrue) + failUnlessRaises = _deprecate(assertRaises) + failIf = _deprecate(assertFalse) assertRaisesRegexp = _deprecate(assertRaisesRegex) assertRegexpMatches = _deprecate(assertRegex) diff --git a/Lib/unittest/test/_test_warnings.py b/Lib/unittest/test/_test_warnings.py --- a/Lib/unittest/test/_test_warnings.py +++ b/Lib/unittest/test/_test_warnings.py @@ -19,12 +19,17 @@ warnings.warn('rw', RuntimeWarning) class TestWarnings(unittest.TestCase): - # unittest warnings will be printed at most once per type + # unittest warnings will be printed at most once per type (max one message + # for the fail* methods, and one for the assert* methods) def test_assert(self): self.assertEquals(2+2, 4) self.assertEquals(2*2, 4) self.assertEquals(2**2, 4) + def test_fail(self): + self.failUnless(1) + self.failUnless(True) + def test_other_unittest(self): self.assertAlmostEqual(2+2, 4) self.assertNotAlmostEqual(4+4, 2) diff --git a/Lib/unittest/test/test_assertions.py b/Lib/unittest/test/test_assertions.py --- a/Lib/unittest/test/test_assertions.py +++ b/Lib/unittest/test/test_assertions.py @@ -223,6 +223,15 @@ "\+ \{'key': 'value'\}$", "\+ \{'key': 'value'\} : oops$"]) + def testAssertDictContainsSubset(self): + with warnings.catch_warnings(): + warnings.simplefilter("ignore", 
DeprecationWarning) + + self.assertMessages('assertDictContainsSubset', ({'key': 'value'}, {}), + ["^Missing: 'key'$", "^oops$", + "^Missing: 'key'$", + "^Missing: 'key' : oops$"]) + def testAssertMultiLineEqual(self): self.assertMessages('assertMultiLineEqual', ("", "foo"), [r"\+ foo$", "^oops$", diff --git a/Lib/unittest/test/test_case.py b/Lib/unittest/test/test_case.py --- a/Lib/unittest/test/test_case.py +++ b/Lib/unittest/test/test_case.py @@ -523,6 +523,36 @@ self.assertRaises(self.failureException, self.assertNotIn, 'cow', animals) + def testAssertDictContainsSubset(self): + with warnings.catch_warnings(): + warnings.simplefilter("ignore", DeprecationWarning) + + self.assertDictContainsSubset({}, {}) + self.assertDictContainsSubset({}, {'a': 1}) + self.assertDictContainsSubset({'a': 1}, {'a': 1}) + self.assertDictContainsSubset({'a': 1}, {'a': 1, 'b': 2}) + self.assertDictContainsSubset({'a': 1, 'b': 2}, {'a': 1, 'b': 2}) + + with self.assertRaises(self.failureException): + self.assertDictContainsSubset({1: "one"}, {}) + + with self.assertRaises(self.failureException): + self.assertDictContainsSubset({'a': 2}, {'a': 1}) + + with self.assertRaises(self.failureException): + self.assertDictContainsSubset({'c': 1}, {'a': 1}) + + with self.assertRaises(self.failureException): + self.assertDictContainsSubset({'a': 1, 'c': 1}, {'a': 1}) + + with self.assertRaises(self.failureException): + self.assertDictContainsSubset({'a': 1, 'c': 1}, {'a': 1}) + + one = ''.join(chr(i) for i in range(255)) + # this used to cause a UnicodeDecodeError constructing the failure msg + with self.assertRaises(self.failureException): + self.assertDictContainsSubset({'foo': one}, {'foo': '\uFFFD'}) + def testAssertEqual(self): equal_pairs = [ ((), ()), @@ -1097,11 +1127,19 @@ Test that the deprecated methods raise a DeprecationWarning. See #9424. 
""" old = ( + (self.failIfEqual, (3, 5)), (self.assertNotEquals, (3, 5)), + (self.failUnlessEqual, (3, 3)), (self.assertEquals, (3, 3)), + (self.failUnlessAlmostEqual, (2.0, 2.0)), (self.assertAlmostEquals, (2.0, 2.0)), + (self.failIfAlmostEqual, (3.0, 5.0)), (self.assertNotAlmostEquals, (3.0, 5.0)), + (self.failUnless, (True,)), (self.assert_, (True,)), + (self.failUnlessRaises, (TypeError, lambda _: 3.14 + 'spam')), + (self.failIf, (False,)), + (self.assertDictContainsSubset, (dict(a=1, b=2), dict(a=1, b=2, c=3))), (self.assertRaisesRegexp, (KeyError, 'foo', lambda: {}['foo'])), (self.assertRegexpMatches, ('bar', 'bar')), ) @@ -1109,6 +1147,21 @@ with self.assertWarns(DeprecationWarning): meth(*args) + # disable this test for now. When the version where the fail* methods will + # be removed is decided, re-enable it and update the version + def _testDeprecatedFailMethods(self): + """Test that the deprecated fail* methods get removed in 3.x""" + if sys.version_info[:2] < (3, 3): + return + deprecated_names = [ + 'failIfEqual', 'failUnlessEqual', 'failUnlessAlmostEqual', + 'failIfAlmostEqual', 'failUnless', 'failUnlessRaises', 'failIf', + 'assertDictContainsSubset', + ] + for deprecated_name in deprecated_names: + with self.assertRaises(AttributeError): + getattr(self, deprecated_name) # remove these in 3.x + def testDeepcopy(self): # Issue: 5660 class TestableTest(unittest.TestCase): diff --git a/Lib/unittest/test/test_runner.py b/Lib/unittest/test/test_runner.py --- a/Lib/unittest/test/test_runner.py +++ b/Lib/unittest/test/test_runner.py @@ -257,17 +257,19 @@ return [b.splitlines() for b in p.communicate()] opts = dict(stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=os.path.dirname(__file__)) + ae_msg = b'Please use assertEqual instead.' + at_msg = b'Please use assertTrue instead.' 
# no args -> all the warnings are printed, unittest warnings only once p = subprocess.Popen([sys.executable, '_test_warnings.py'], **opts) out, err = get_parse_out_err(p) self.assertIn(b'OK', err) # check that the total number of warnings in the output is correct - self.assertEqual(len(out), 11) + self.assertEqual(len(out), 12) # check that the numbers of the different kind of warnings is correct for msg in [b'dw', b'iw', b'uw']: self.assertEqual(out.count(msg), 3) - for msg in [b'rw']: + for msg in [ae_msg, at_msg, b'rw']: self.assertEqual(out.count(msg), 1) args_list = ( @@ -292,9 +294,11 @@ **opts) out, err = get_parse_out_err(p) self.assertIn(b'OK', err) - self.assertEqual(len(out), 13) + self.assertEqual(len(out), 14) for msg in [b'dw', b'iw', b'uw', b'rw']: self.assertEqual(out.count(msg), 3) + for msg in [ae_msg, at_msg]: + self.assertEqual(out.count(msg), 1) def testStdErrLookedUpAtInstantiationTime(self): # see issue 10786 diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -87,6 +87,8 @@ Library ------- +- unittest.TestCase.assertSameElements has been removed. + - sys.getfilesystemencoding() raises a RuntimeError if initfsencoding() was not called yet: detect bootstrap (startup) issues earlier. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 17:09:10 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sun, 03 Apr 2011 17:09:10 +0200 Subject: [Python-checkins] cpython: Issue #5863: Rewrite BZ2File in pure Python, and allow it to accept Message-ID: http://hg.python.org/cpython/rev/2cb07a46f4b5 changeset: 69112:2cb07a46f4b5 user: Antoine Pitrou date: Sun Apr 03 17:05:46 2011 +0200 summary: Issue #5863: Rewrite BZ2File in pure Python, and allow it to accept file-like objects using a new `fileobj` constructor argument. Patch by Nadeem Vawda. 
files: Doc/ACKS.txt | 1 + Doc/library/bz2.rst | 221 +- Lib/bz2.py | 392 +++++ Lib/test/test_bz2.py | 142 +- Misc/NEWS | 4 + Modules/bz2module.c | 2281 ++++------------------------- PCbuild/bz2.vcproj | 4 +- PCbuild/pcbuild.sln | 2 +- PCbuild/readme.txt | 6 +- setup.py | 4 +- 10 files changed, 960 insertions(+), 2097 deletions(-) diff --git a/Doc/ACKS.txt b/Doc/ACKS.txt --- a/Doc/ACKS.txt +++ b/Doc/ACKS.txt @@ -202,6 +202,7 @@ * Jim Tittsler * David Turner * Ville Vainio + * Nadeem Vawda * Martijn Vries * Charles G. Waldman * Greg Ward diff --git a/Doc/library/bz2.rst b/Doc/library/bz2.rst --- a/Doc/library/bz2.rst +++ b/Doc/library/bz2.rst @@ -1,189 +1,149 @@ -:mod:`bz2` --- Compression compatible with :program:`bzip2` -=========================================================== +:mod:`bz2` --- Support for :program:`bzip2` compression +======================================================= .. module:: bz2 - :synopsis: Interface to compression and decompression routines - compatible with bzip2. + :synopsis: Interfaces for bzip2 compression and decompression. .. moduleauthor:: Gustavo Niemeyer +.. moduleauthor:: Nadeem Vawda .. sectionauthor:: Gustavo Niemeyer +.. sectionauthor:: Nadeem Vawda -This module provides a comprehensive interface for the bz2 compression library. -It implements a complete file interface, one-shot (de)compression functions, and -types for sequential (de)compression. +This module provides a comprehensive interface for compressing and +decompressing data using the bzip2 compression algorithm. -For other archive formats, see the :mod:`gzip`, :mod:`zipfile`, and +For related file formats, see the :mod:`gzip`, :mod:`zipfile`, and :mod:`tarfile` modules. 
-Here is a summary of the features offered by the bz2 module: +The :mod:`bz2` module contains: -* :class:`BZ2File` class implements a complete file interface, including - :meth:`~BZ2File.readline`, :meth:`~BZ2File.readlines`, - :meth:`~BZ2File.writelines`, :meth:`~BZ2File.seek`, etc; +* The :class:`BZ2File` class for reading and writing compressed files. +* The :class:`BZ2Compressor` and :class:`BZ2Decompressor` classes for + incremental (de)compression. +* The :func:`compress` and :func:`decompress` functions for one-shot + (de)compression. -* :class:`BZ2File` class implements emulated :meth:`~BZ2File.seek` support; - -* :class:`BZ2File` class implements universal newline support; - -* :class:`BZ2File` class offers an optimized line iteration using a readahead - algorithm; - -* Sequential (de)compression supported by :class:`BZ2Compressor` and - :class:`BZ2Decompressor` classes; - -* One-shot (de)compression supported by :func:`compress` and :func:`decompress` - functions; - -* Thread safety uses individual locking mechanism. +All of the classes in this module may safely be accessed from multiple threads. (De)compression of files ------------------------ -Handling of compressed files is offered by the :class:`BZ2File` class. +.. class:: BZ2File(filename=None, mode='r', buffering=None, compresslevel=9, fileobj=None) + Open a bzip2-compressed file. -.. class:: BZ2File(filename, mode='r', buffering=0, compresslevel=9) + The :class:`BZ2File` can wrap an existing :term:`file object` (given by + *fileobj*), or operate directly on a named file (named by *filename*). + Exactly one of these two parameters should be provided. - Open a bz2 file. Mode can be either ``'r'`` or ``'w'``, for reading (default) - or writing. When opened for writing, the file will be created if it doesn't - exist, and truncated otherwise. If *buffering* is given, ``0`` means - unbuffered, and larger numbers specify the buffer size; the default is - ``0``. 
If *compresslevel* is given, it must be a number between ``1`` and - ``9``; the default is ``9``. Add a ``'U'`` to mode to open the file for input - with universal newline support. Any line ending in the input file will be - seen as a ``'\n'`` in Python. Also, a file so opened gains the attribute - :attr:`newlines`; the value for this attribute is one of ``None`` (no newline - read yet), ``'\r'``, ``'\n'``, ``'\r\n'`` or a tuple containing all the - newline types seen. Universal newlines are available only when - reading. Instances support iteration in the same way as normal :class:`file` - instances. + The *mode* argument can be either ``'r'`` for reading (default), or ``'w'`` + for writing. - :class:`BZ2File` supports the :keyword:`with` statement. + The *buffering* argument is ignored. Its use is deprecated. + + If *mode* is ``'w'``, *compresslevel* can be a number between ``1`` and + ``9`` specifying the level of compression: ``1`` produces the least + compression, and ``9`` (default) produces the most compression. + + :class:`BZ2File` provides all of the members specified by the + :class:`io.BufferedIOBase`, except for :meth:`detach` and :meth:`truncate`. + Iteration and the :keyword:`with` statement are supported. + + :class:`BZ2File` also provides the following method: + + .. method:: peek([n]) + + Return buffered data without advancing the file position. At least one + byte of data will be returned (unless at EOF). The exact number of bytes + returned is unspecified. + + .. versionadded:: 3.3 .. versionchanged:: 3.1 Support for the :keyword:`with` statement was added. + .. versionchanged:: 3.3 + The :meth:`fileno`, :meth:`readable`, :meth:`seekable`, :meth:`writable`, + :meth:`read1` and :meth:`readinto` methods were added. - .. method:: close() + .. versionchanged:: 3.3 + The *fileobj* argument to the constructor was added. - Close the file. Sets data attribute :attr:`closed` to true. A closed file - cannot be used for further I/O operations. 
:meth:`close` may be called - more than once without error. - - .. method:: read([size]) - - Read at most *size* uncompressed bytes, returned as a byte string. If the - *size* argument is negative or omitted, read until EOF is reached. - - - .. method:: readline([size]) - - Return the next line from the file, as a byte string, retaining newline. - A non-negative *size* argument limits the maximum number of bytes to - return (an incomplete line may be returned then). Return an empty byte - string at EOF. - - - .. method:: readlines([size]) - - Return a list of lines read. The optional *size* argument, if given, is an - approximate bound on the total number of bytes in the lines returned. - - - .. method:: seek(offset[, whence]) - - Move to new file position. Argument *offset* is a byte count. Optional - argument *whence* defaults to ``os.SEEK_SET`` or ``0`` (offset from start - of file; offset should be ``>= 0``); other values are ``os.SEEK_CUR`` or - ``1`` (move relative to current position; offset can be positive or - negative), and ``os.SEEK_END`` or ``2`` (move relative to end of file; - offset is usually negative, although many platforms allow seeking beyond - the end of a file). - - Note that seeking of bz2 files is emulated, and depending on the - parameters the operation may be extremely slow. - - - .. method:: tell() - - Return the current file position, an integer. - - - .. method:: write(data) - - Write the byte string *data* to file. Note that due to buffering, - :meth:`close` may be needed before the file on disk reflects the data - written. - - - .. method:: writelines(sequence_of_byte_strings) - - Write the sequence of byte strings to the file. Note that newlines are not - added. The sequence can be any iterable object producing byte strings. - This is equivalent to calling write() for each byte string. 
- - -Sequential (de)compression --------------------------- - -Sequential compression and decompression is done using the classes -:class:`BZ2Compressor` and :class:`BZ2Decompressor`. - +Incremental (de)compression +--------------------------- .. class:: BZ2Compressor(compresslevel=9) Create a new compressor object. This object may be used to compress data - sequentially. If you want to compress data in one shot, use the - :func:`compress` function instead. The *compresslevel* parameter, if given, - must be a number between ``1`` and ``9``; the default is ``9``. + incrementally. For one-shot compression, use the :func:`compress` function + instead. + + *compresslevel*, if given, must be a number between ``1`` and ``9``. The + default is ``9``. .. method:: compress(data) - Provide more data to the compressor object. It will return chunks of - compressed data whenever possible. When you've finished providing data to - compress, call the :meth:`flush` method to finish the compression process, - and return what is left in internal buffers. + Provide data to the compressor object. Returns a chunk of compressed data + if possible, or an empty byte string otherwise. + + When you have finished providing data to the compressor, call the + :meth:`flush` method to finish the compression process. .. method:: flush() - Finish the compression process and return what is left in internal - buffers. You must not use the compressor object after calling this method. + Finish the compression process. Returns the compressed data left in + internal buffers. + + The compressor object may not be used after this method has been called. .. class:: BZ2Decompressor() Create a new decompressor object. This object may be used to decompress data - sequentially. If you want to decompress data in one shot, use the - :func:`decompress` function instead. + incrementally. For one-shot compression, use the :func:`decompress` function + instead. .. 
method:: decompress(data) - Provide more data to the decompressor object. It will return chunks of - decompressed data whenever possible. If you try to decompress data after - the end of stream is found, :exc:`EOFError` will be raised. If any data - was found after the end of stream, it'll be ignored and saved in - :attr:`unused_data` attribute. + Provide data to the decompressor object. Returns a chunk of decompressed + data if possible, or an empty byte string otherwise. + + Attempting to decompress data after the end of stream is reached raises + an :exc:`EOFError`. If any data is found after the end of the stream, it + is ignored and saved in the :attr:`unused_data` attribute. + + + .. attribute:: eof + + True if the end-of-stream marker has been reached. + + .. versionadded:: 3.3 + + + .. attribute:: unused_data + + Data found after the end of the compressed stream. One-shot (de)compression ------------------------ -One-shot compression and decompression is provided through the :func:`compress` -and :func:`decompress` functions. - - .. function:: compress(data, compresslevel=9) - Compress *data* in one shot. If you want to compress data sequentially, use - an instance of :class:`BZ2Compressor` instead. The *compresslevel* parameter, - if given, must be a number between ``1`` and ``9``; the default is ``9``. + Compress *data*. + + *compresslevel*, if given, must be a number between ``1`` and ``9``. The + default is ``9``. + + For incremental compression, use a :class:`BZ2Compressor` instead. .. function:: decompress(data) - Decompress *data* in one shot. If you want to decompress data sequentially, - use an instance of :class:`BZ2Decompressor` instead. + Decompress *data*. + For incremental decompression, use a :class:`BZ2Decompressor` instead. + diff --git a/Lib/bz2.py b/Lib/bz2.py new file mode 100644 --- /dev/null +++ b/Lib/bz2.py @@ -0,0 +1,392 @@ +"""Interface to the libbzip2 compression library. 
+ +This module provides a file interface, classes for incremental +(de)compression, and functions for one-shot (de)compression. +""" + +__all__ = ["BZ2File", "BZ2Compressor", "BZ2Decompressor", "compress", + "decompress"] + +__author__ = "Nadeem Vawda " + +import io +import threading +import warnings + +from _bz2 import BZ2Compressor, BZ2Decompressor + + +_MODE_CLOSED = 0 +_MODE_READ = 1 +_MODE_READ_EOF = 2 +_MODE_WRITE = 3 + +_BUFFER_SIZE = 8192 + + +class BZ2File(io.BufferedIOBase): + + """A file object providing transparent bzip2 (de)compression. + + A BZ2File can act as a wrapper for an existing file object, or refer + directly to a named file on disk. + + Note that BZ2File provides a *binary* file interface - data read is + returned as bytes, and data to be written should be given as bytes. + """ + + def __init__(self, filename=None, mode="r", buffering=None, + compresslevel=9, fileobj=None): + """Open a bzip2-compressed file. + + If filename is given, open the named file. Otherwise, operate on + the file object given by fileobj. Exactly one of these two + parameters should be provided. + + mode can be 'r' for reading (default), or 'w' for writing. + + buffering is ignored. Its use is deprecated. + + If mode is 'w', compresslevel can be a number between 1 and 9 + specifying the level of compression: 1 produces the least + compression, and 9 (default) produces the most compression. + """ + # This lock must be recursive, so that BufferedIOBase's + # readline(), readlines() and writelines() don't deadlock. 
+ self._lock = threading.RLock() + self._fp = None + self._closefp = False + self._mode = _MODE_CLOSED + self._pos = 0 + self._size = -1 + + if buffering is not None: + warnings.warn("Use of 'buffering' argument is deprecated", + DeprecationWarning) + + if not (1 <= compresslevel <= 9): + raise ValueError("compresslevel must be between 1 and 9") + + if mode in ("", "r", "rb"): + mode = "rb" + mode_code = _MODE_READ + self._decompressor = BZ2Decompressor() + self._buffer = None + elif mode in ("w", "wb"): + mode = "wb" + mode_code = _MODE_WRITE + self._compressor = BZ2Compressor() + else: + raise ValueError("Invalid mode: {!r}".format(mode)) + + if filename is not None and fileobj is None: + self._fp = open(filename, mode) + self._closefp = True + self._mode = mode_code + elif fileobj is not None and filename is None: + self._fp = fileobj + self._mode = mode_code + else: + raise ValueError("Must give exactly one of filename and fileobj") + + def close(self): + """Flush and close the file. + + May be called more than once without error. Once the file is + closed, any other operation on it will raise a ValueError. 
+        """
+        with self._lock:
+            if self._mode == _MODE_CLOSED:
+                return
+            try:
+                if self._mode in (_MODE_READ, _MODE_READ_EOF):
+                    self._decompressor = None
+                elif self._mode == _MODE_WRITE:
+                    self._fp.write(self._compressor.flush())
+                    self._compressor = None
+            finally:
+                try:
+                    if self._closefp:
+                        self._fp.close()
+                finally:
+                    self._fp = None
+                    self._closefp = False
+                    self._mode = _MODE_CLOSED
+                    self._buffer = None
+
+    @property
+    def closed(self):
+        """True if this file is closed."""
+        return self._mode == _MODE_CLOSED
+
+    def fileno(self):
+        """Return the file descriptor for the underlying file."""
+        return self._fp.fileno()
+
+    def seekable(self):
+        """Return whether the file supports seeking."""
+        return self.readable()
+
+    def readable(self):
+        """Return whether the file was opened for reading."""
+        return self._mode in (_MODE_READ, _MODE_READ_EOF)
+
+    def writable(self):
+        """Return whether the file was opened for writing."""
+        return self._mode == _MODE_WRITE
+
+    # Mode-checking helper functions.
+
+    def _check_not_closed(self):
+        if self.closed:
+            raise ValueError("I/O operation on closed file")
+
+    def _check_can_read(self):
+        if not self.readable():
+            self._check_not_closed()
+            raise io.UnsupportedOperation("File not open for reading")
+
+    def _check_can_write(self):
+        if not self.writable():
+            self._check_not_closed()
+            raise io.UnsupportedOperation("File not open for writing")
+
+    def _check_can_seek(self):
+        if not self.seekable():
+            self._check_not_closed()
+            raise io.UnsupportedOperation("Seeking is only supported "
+                                          "on files opening for reading")
+
+    # Fill the readahead buffer if it is empty. Returns False on EOF.
+    def _fill_buffer(self):
+        if self._buffer:
+            return True
+        if self._decompressor.eof:
+            self._mode = _MODE_READ_EOF
+            self._size = self._pos
+            return False
+        rawblock = self._fp.read(_BUFFER_SIZE)
+        if not rawblock:
+            raise EOFError("Compressed file ended before the "
+                           "end-of-stream marker was reached")
+        self._buffer = self._decompressor.decompress(rawblock)
+        return True
+
+    # Read data until EOF.
+    # If return_data is false, consume the data without returning it.
+    def _read_all(self, return_data=True):
+        blocks = []
+        while self._fill_buffer():
+            if return_data:
+                blocks.append(self._buffer)
+            self._pos += len(self._buffer)
+            self._buffer = None
+        if return_data:
+            return b"".join(blocks)
+
+    # Read a block of up to n bytes.
+    # If return_data is false, consume the data without returning it.
+    def _read_block(self, n, return_data=True):
+        blocks = []
+        while n > 0 and self._fill_buffer():
+            if n < len(self._buffer):
+                data = self._buffer[:n]
+                self._buffer = self._buffer[n:]
+            else:
+                data = self._buffer
+                self._buffer = None
+            if return_data:
+                blocks.append(data)
+            self._pos += len(data)
+            n -= len(data)
+        if return_data:
+            return b"".join(blocks)
+
+    def peek(self, n=0):
+        """Return buffered data without advancing the file position.
+
+        Always returns at least one byte of data, unless at EOF.
+        The exact number of bytes returned is unspecified.
+        """
+        with self._lock:
+            self._check_can_read()
+            if self._mode == _MODE_READ_EOF or not self._fill_buffer():
+                return b""
+            return self._buffer
+
+    def read(self, size=-1):
+        """Read up to size uncompressed bytes from the file.
+
+        If size is negative or omitted, read until EOF is reached.
+        Returns b'' if the file is already at EOF.
+        """
+        with self._lock:
+            self._check_can_read()
+            if self._mode == _MODE_READ_EOF or size == 0:
+                return b""
+            elif size < 0:
+                return self._read_all()
+            else:
+                return self._read_block(size)
+
+    def read1(self, size=-1):
+        """Read up to size uncompressed bytes with at most one read
+        from the underlying stream.
+
+        Returns b'' if the file is at EOF.
+        """
+        with self._lock:
+            self._check_can_read()
+            if (size == 0 or self._mode == _MODE_READ_EOF or
+                not self._fill_buffer()):
+                return b""
+            if 0 < size < len(self._buffer):
+                data = self._buffer[:size]
+                self._buffer = self._buffer[size:]
+            else:
+                data = self._buffer
+                self._buffer = None
+            self._pos += len(data)
+            return data
+
+    def readinto(self, b):
+        """Read up to len(b) bytes into b.
+
+        Returns the number of bytes read (0 for EOF).
+        """
+        with self._lock:
+            return io.BufferedIOBase.readinto(self, b)
+
+    def readline(self, size=-1):
+        """Read a line of uncompressed bytes from the file.
+
+        The terminating newline (if present) is retained. If size is
+        non-negative, no more than size bytes will be read (in which
+        case the line may be incomplete). Returns b'' if already at EOF.
+        """
+        if not hasattr(size, "__index__"):
+            raise TypeError("Integer argument expected")
+        size = size.__index__()
+        with self._lock:
+            return io.BufferedIOBase.readline(self, size)
+
+    def readlines(self, size=-1):
+        """Read a list of lines of uncompressed bytes from the file.
+
+        size can be specified to control the number of lines read: no
+        further lines will be read once the total size of the lines read
+        so far equals or exceeds size.
+        """
+        if not hasattr(size, "__index__"):
+            raise TypeError("Integer argument expected")
+        size = size.__index__()
+        with self._lock:
+            return io.BufferedIOBase.readlines(self, size)
+
+    def write(self, data):
+        """Write a byte string to the file.
+
+        Returns the number of uncompressed bytes written, which is
+        always len(data).
Note that due to buffering, the file on disk
+        may not reflect the data written until close() is called.
+        """
+        with self._lock:
+            self._check_can_write()
+            compressed = self._compressor.compress(data)
+            self._fp.write(compressed)
+            self._pos += len(data)
+            return len(data)
+
+    def writelines(self, seq):
+        """Write a sequence of byte strings to the file.
+
+        Returns the number of uncompressed bytes written.
+        seq can be any iterable yielding byte strings.
+
+        Line separators are not added between the written byte strings.
+        """
+        with self._lock:
+            return io.BufferedIOBase.writelines(self, seq)
+
+    # Rewind the file to the beginning of the data stream.
+    def _rewind(self):
+        self._fp.seek(0, 0)
+        self._mode = _MODE_READ
+        self._pos = 0
+        self._decompressor = BZ2Decompressor()
+        self._buffer = None
+
+    def seek(self, offset, whence=0):
+        """Change the file position.
+
+        The new position is specified by offset, relative to the
+        position indicated by whence. Values for whence are:
+
+            0: start of stream (default); offset must not be negative
+            1: current stream position
+            2: end of stream; offset must not be positive
+
+        Returns the new file position.
+
+        Note that seeking is emulated, so depending on the parameters,
+        this operation may be extremely slow.
+        """
+        with self._lock:
+            self._check_can_seek()
+
+            # Recalculate offset as an absolute file position.
+            if whence == 0:
+                pass
+            elif whence == 1:
+                offset = self._pos + offset
+            elif whence == 2:
+                # Seeking relative to EOF - we need to know the file's size.
+                if self._size < 0:
+                    self._read_all(return_data=False)
+                offset = self._size + offset
+            else:
+                raise ValueError("Invalid value for whence: {}".format(whence))
+
+            # Make it so that offset is the number of bytes to skip forward.
+            if offset < self._pos:
+                self._rewind()
+            else:
+                offset -= self._pos
+
+            # Read and discard data until we reach the desired position.
+            if self._mode != _MODE_READ_EOF:
+                self._read_block(offset, return_data=False)
+
+            return self._pos
+
+    def tell(self):
+        """Return the current file position."""
+        with self._lock:
+            self._check_not_closed()
+            return self._pos
+
+
+def compress(data, compresslevel=9):
+    """Compress a block of data.
+
+    compresslevel, if given, must be a number between 1 and 9.
+
+    For incremental compression, use a BZ2Compressor object instead.
+    """
+    comp = BZ2Compressor(compresslevel)
+    return comp.compress(data) + comp.flush()
+
+
+def decompress(data):
+    """Decompress a block of data.
+
+    For incremental decompression, use a BZ2Decompressor object instead.
+    """
+    if len(data) == 0:
+        return b""
+    decomp = BZ2Decompressor()
+    result = decomp.decompress(data)
+    if not decomp.eof:
+        raise ValueError("Compressed data ended before the "
+                         "end-of-stream marker was reached")
+    return result
diff --git a/Lib/test/test_bz2.py b/Lib/test/test_bz2.py
--- a/Lib/test/test_bz2.py
+++ b/Lib/test/test_bz2.py
@@ -21,7 +21,30 @@
 class BaseTest(unittest.TestCase):
     "Base for other testcases."
-    TEXT = b'root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:\ndaemon:x:2:2:daemon:/sbin:\nadm:x:3:4:adm:/var/adm:\nlp:x:4:7:lp:/var/spool/lpd:\nsync:x:5:0:sync:/sbin:/bin/sync\nshutdown:x:6:0:shutdown:/sbin:/sbin/shutdown\nhalt:x:7:0:halt:/sbin:/sbin/halt\nmail:x:8:12:mail:/var/spool/mail:\nnews:x:9:13:news:/var/spool/news:\nuucp:x:10:14:uucp:/var/spool/uucp:\noperator:x:11:0:operator:/root:\ngames:x:12:100:games:/usr/games:\ngopher:x:13:30:gopher:/usr/lib/gopher-data:\nftp:x:14:50:FTP User:/var/ftp:/bin/bash\nnobody:x:65534:65534:Nobody:/home:\npostfix:x:100:101:postfix:/var/spool/postfix:\nniemeyer:x:500:500::/home/niemeyer:/bin/bash\npostgres:x:101:102:PostgreSQL Server:/var/lib/pgsql:/bin/bash\nmysql:x:102:103:MySQL server:/var/lib/mysql:/bin/bash\nwww:x:103:104::/var/www:/bin/false\n'
+    TEXT_LINES = [
+        b'root:x:0:0:root:/root:/bin/bash\n',
+        b'bin:x:1:1:bin:/bin:\n',
+        b'daemon:x:2:2:daemon:/sbin:\n',
+        b'adm:x:3:4:adm:/var/adm:\n',
+        b'lp:x:4:7:lp:/var/spool/lpd:\n',
+        b'sync:x:5:0:sync:/sbin:/bin/sync\n',
+        b'shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown\n',
+        b'halt:x:7:0:halt:/sbin:/sbin/halt\n',
+        b'mail:x:8:12:mail:/var/spool/mail:\n',
+        b'news:x:9:13:news:/var/spool/news:\n',
+        b'uucp:x:10:14:uucp:/var/spool/uucp:\n',
+        b'operator:x:11:0:operator:/root:\n',
+        b'games:x:12:100:games:/usr/games:\n',
+        b'gopher:x:13:30:gopher:/usr/lib/gopher-data:\n',
+        b'ftp:x:14:50:FTP User:/var/ftp:/bin/bash\n',
+        b'nobody:x:65534:65534:Nobody:/home:\n',
+        b'postfix:x:100:101:postfix:/var/spool/postfix:\n',
+        b'niemeyer:x:500:500::/home/niemeyer:/bin/bash\n',
+        b'postgres:x:101:102:PostgreSQL Server:/var/lib/pgsql:/bin/bash\n',
+        b'mysql:x:102:103:MySQL server:/var/lib/mysql:/bin/bash\n',
+        b'www:x:103:104::/var/www:/bin/false\n',
+    ]
+    TEXT = b''.join(TEXT_LINES)
     DATA =
b'BZh91AY&SY.\xc8N\x18\x00\x01>_\x80\x00\x10@\x02\xff\xf0\x01\x07n\x00?\xe7\xff\xe00\x01\x99\xaa\x00\xc0\x03F\x86\x8c#&\x83F\x9a\x03\x06\xa6\xd0\xa6\x93M\x0fQ\xa7\xa8\x06\x804hh\x12$\x11\xa4i4\xf14S\xd2\x88\xe5\xcd9gd6\x0b\n\xe9\x9b\xd5\x8a\x99\xf7\x08.K\x8ev\xfb\xf7xw\xbb\xdf\xa1\x92\xf1\xdd|/";\xa2\xba\x9f\xd5\xb1#A\xb6\xf6\xb3o\xc9\xc5y\\\xebO\xe7\x85\x9a\xbc\xb6f8\x952\xd5\xd7"%\x89>V,\xf7\xa6z\xe2\x9f\xa3\xdf\x11\x11"\xd6E)I\xa9\x13^\xca\xf3r\xd0\x03U\x922\xf26\xec\xb6\xed\x8b\xc3U\x13\x9d\xc5\x170\xa4\xfa^\x92\xacDF\x8a\x97\xd6\x19\xfe\xdd\xb8\xbd\x1a\x9a\x19\xa3\x80ankR\x8b\xe5\xd83]\xa9\xc6\x08\x82f\xf6\xb9"6l$\xb8j@\xc0\x8a\xb0l1..\xbak\x83ls\x15\xbc\xf4\xc1\x13\xbe\xf8E\xb8\x9d\r\xa8\x9dk\x84\xd3n\xfa\xacQ\x07\xb1%y\xaav\xb4\x08\xe0z\x1b\x16\xf5\x04\xe9\xcc\xb9\x08z\x1en7.G\xfc]\xc9\x14\xe1B@\xbb!8`' DATA_CRLF = b'BZh91AY&SY\xaez\xbbN\x00\x01H\xdf\x80\x00\x12@\x02\xff\xf0\x01\x07n\x00?\xe7\xff\xe0@\x01\xbc\xc6`\x86*\x8d=M\xa9\x9a\x86\xd0L@\x0fI\xa6!\xa1\x13\xc8\x88jdi\x8d@\x03@\x1a\x1a\x0c\x0c\x83 \x00\xc4h2\x19\x01\x82D\x84e\t\xe8\x99\x89\x19\x1ah\x00\r\x1a\x11\xaf\x9b\x0fG\xf5(\x1b\x1f?\t\x12\xcf\xb5\xfc\x95E\x00ps\x89\x12^\xa4\xdd\xa2&\x05(\x87\x04\x98\x89u\xe40%\xb6\x19\'\x8c\xc4\x89\xca\x07\x0e\x1b!\x91UIFU%C\x994!DI\xd2\xfa\xf0\xf1N8W\xde\x13A\xf5\x9cr%?\x9f3;I45A\xd1\x8bT\xb1\xa4\xc7\x8d\x1a\\"\xad\xa1\xabyBg\x15\xb9l\x88\x88\x91k"\x94\xa4\xd4\x89\xae*\xa6\x0b\x10\x0c\xd6\xd4m\xe86\xec\xb5j\x8a\x86j\';\xca.\x01I\xf2\xaaJ\xe8\x88\x8cU+t3\xfb\x0c\n\xa33\x13r2\r\x16\xe0\xb3(\xbf\x1d\x83r\xe7M\xf0D\x1365\xd8\x88\xd3\xa4\x92\xcb2\x06\x04\\\xc1\xb0\xea//\xbek&\xd8\xe6+t\xe5\xa1\x13\xada\x16\xder5"w]\xa2i\xb7[\x97R \xe2IT\xcd;Z\x04dk4\xad\x8a\t\xd3\x81z\x10\xf1:^`\xab\x1f\xc5\xdc\x91N\x14$+\x9e\xae\xd3\x80' @@ -54,13 +77,15 @@ if os.path.isfile(self.filename): os.unlink(self.filename) - def createTempFile(self, crlf=0): + def getData(self, crlf=False): + if crlf: + return self.DATA_CRLF + else: + return self.DATA + + def createTempFile(self, crlf=False): 
         with open(self.filename, "wb") as f:
-            if crlf:
-                data = self.DATA_CRLF
-            else:
-                data = self.DATA
-            f.write(data)
+            f.write(self.getData(crlf))
 
     def testRead(self):
         # "Test BZ2File.read()"
@@ -70,7 +95,7 @@
             self.assertEqual(bz2f.read(), self.TEXT)
 
     def testRead0(self):
-        # Test BBZ2File.read(0)"
+        # "Test BBZ2File.read(0)"
         self.createTempFile()
         with BZ2File(self.filename) as bz2f:
             self.assertRaises(TypeError, bz2f.read, None)
@@ -94,6 +119,28 @@
         with BZ2File(self.filename) as bz2f:
             self.assertEqual(bz2f.read(100), self.TEXT[:100])
 
+    def testPeek(self):
+        # "Test BZ2File.peek()"
+        self.createTempFile()
+        with BZ2File(self.filename) as bz2f:
+            pdata = bz2f.peek()
+            self.assertNotEqual(len(pdata), 0)
+            self.assertTrue(self.TEXT.startswith(pdata))
+            self.assertEqual(bz2f.read(), self.TEXT)
+
+    def testReadInto(self):
+        # "Test BZ2File.readinto()"
+        self.createTempFile()
+        with BZ2File(self.filename) as bz2f:
+            n = 128
+            b = bytearray(n)
+            self.assertEqual(bz2f.readinto(b), n)
+            self.assertEqual(b, self.TEXT[:n])
+            n = len(self.TEXT) - n
+            b = bytearray(len(self.TEXT))
+            self.assertEqual(bz2f.readinto(b), n)
+            self.assertEqual(b[:n], self.TEXT[-n:])
+
     def testReadLine(self):
         # "Test BZ2File.readline()"
         self.createTempFile()
@@ -125,7 +172,7 @@
         bz2f = BZ2File(self.filename)
         bz2f.close()
         self.assertRaises(ValueError, bz2f.__next__)
-        # This call will deadlock of the above .__next__ call failed to
+        # This call will deadlock if the above .__next__ call failed to
         # release the lock.
         self.assertRaises(ValueError, bz2f.readlines)
 
@@ -217,6 +264,13 @@
             self.assertEqual(bz2f.tell(), 0)
             self.assertEqual(bz2f.read(), self.TEXT)
 
+    def testFileno(self):
+        # "Test BZ2File.fileno()"
+        self.createTempFile()
+        with open(self.filename) as rawf:
+            with BZ2File(fileobj=rawf) as bz2f:
+                self.assertEqual(bz2f.fileno(), rawf.fileno())
+
     def testOpenDel(self):
         # "Test opening and deleting a file many times"
         self.createTempFile()
@@ -278,17 +332,65 @@
             t.join()
 
     def testMixedIterationReads(self):
-        # Issue #8397: mixed iteration and reads should be forbidden.
-        with bz2.BZ2File(self.filename, 'wb') as f:
-            # The internal buffer size is hard-wired to 8192 bytes, we must
-            # write out more than that for the test to stop half through
-            # the buffer.
-            f.write(self.TEXT * 100)
-        with bz2.BZ2File(self.filename, 'rb') as f:
-            next(f)
-            self.assertRaises(ValueError, f.read)
-            self.assertRaises(ValueError, f.readline)
-            self.assertRaises(ValueError, f.readlines)
+        # "Test mixed iteration and reads."
+        self.createTempFile()
+        linelen = len(self.TEXT_LINES[0])
+        halflen = linelen // 2
+        with bz2.BZ2File(self.filename) as bz2f:
+            bz2f.read(halflen)
+            self.assertEqual(next(bz2f), self.TEXT_LINES[0][halflen:])
+            self.assertEqual(bz2f.read(), self.TEXT[linelen:])
+        with bz2.BZ2File(self.filename) as bz2f:
+            bz2f.readline()
+            self.assertEqual(next(bz2f), self.TEXT_LINES[1])
+            self.assertEqual(bz2f.readline(), self.TEXT_LINES[2])
+        with bz2.BZ2File(self.filename) as bz2f:
+            bz2f.readlines()
+            with self.assertRaises(StopIteration):
+                next(bz2f)
+            self.assertEqual(bz2f.readlines(), [])
+
+    def testReadBytesIO(self):
+        # "Test BZ2File.read() with BytesIO source"
+        with BytesIO(self.getData()) as bio:
+            with BZ2File(fileobj=bio) as bz2f:
+                self.assertRaises(TypeError, bz2f.read, None)
+                self.assertEqual(bz2f.read(), self.TEXT)
+            self.assertFalse(bio.closed)
+
+    def testPeekBytesIO(self):
+        # "Test BZ2File.peek() with BytesIO source"
+        with BytesIO(self.getData()) as bio:
+            with BZ2File(fileobj=bio) as bz2f:
+                pdata = bz2f.peek()
+                self.assertNotEqual(len(pdata), 0)
+                self.assertTrue(self.TEXT.startswith(pdata))
+                self.assertEqual(bz2f.read(), self.TEXT)
+
+    def testWriteBytesIO(self):
+        # "Test BZ2File.write() with BytesIO destination"
+        with BytesIO() as bio:
+            with BZ2File(fileobj=bio, mode="w") as bz2f:
+                self.assertRaises(TypeError, bz2f.write)
+                bz2f.write(self.TEXT)
+            self.assertEqual(self.decompress(bio.getvalue()), self.TEXT)
+            self.assertFalse(bio.closed)
+
+    def testSeekForwardBytesIO(self):
+        # "Test BZ2File.seek(150, 0) with BytesIO source"
+        with BytesIO(self.getData()) as bio:
+            with BZ2File(fileobj=bio) as bz2f:
+                self.assertRaises(TypeError, bz2f.seek)
+                bz2f.seek(150)
+                self.assertEqual(bz2f.read(), self.TEXT[150:])
+
+    def testSeekBackwardsBytesIO(self):
+        # "Test BZ2File.seek(-150, 1) with BytesIO source"
+        with BytesIO(self.getData()) as bio:
+            with BZ2File(fileobj=bio) as bz2f:
+                bz2f.read(500)
+                bz2f.seek(-150, 1)
+                self.assertEqual(bz2f.read(),
self.TEXT[500-150:])
 
 class BZ2CompressorTest(BaseTest):
     def testCompress(self):
diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -87,6 +87,10 @@
 Library
 -------
 
+- Issue #5863: Rewrite BZ2File in pure Python, and allow it to accept
+  file-like objects using a new ``fileobj`` constructor argument. Patch by
+  Nadeem Vawda.
+
 - unittest.TestCase.assertSameElements has been removed.
 
 - sys.getfilesystemencoding() raises a RuntimeError if initfsencoding() was not
diff --git a/Modules/bz2module.c b/Modules/_bz2module.c
rename from Modules/bz2module.c
rename to Modules/_bz2module.c
--- a/Modules/bz2module.c
+++ b/Modules/_bz2module.c
@@ -1,215 +1,111 @@
-/*
+/* _bz2 - Low-level Python interface to libbzip2. */
-python-bz2 - python bz2 library interface
-
-Copyright (c) 2002 Gustavo Niemeyer
-Copyright (c) 2002 Python Software Foundation; All Rights Reserved
-
-*/
+#define PY_SSIZE_T_CLEAN
 #include "Python.h"
-#include
-#include
 #include "structmember.h"
 #ifdef WITH_THREAD
 #include "pythread.h"
 #endif
-static char __author__[] =
-"The bz2 python module was written by:\n\
-\n\
-    Gustavo Niemeyer \n\
-";
+#include
+#include
-/* Our very own off_t-like type, 64-bit if possible */
-/* copied from Objects/fileobject.c */
-#if !defined(HAVE_LARGEFILE_SUPPORT)
-typedef off_t Py_off_t;
-#elif SIZEOF_OFF_T >= 8
-typedef off_t Py_off_t;
-#elif SIZEOF_FPOS_T >= 8
-typedef fpos_t Py_off_t;
-#else
-#error "Large file support, but neither off_t nor fpos_t is large enough."
-#endif
-#define BUF(v) PyBytes_AS_STRING(v)
-
-#define MODE_CLOSED 0
-#define MODE_READ 1
-#define MODE_READ_EOF 2
-#define MODE_WRITE 3
-
-#define BZ2FileObject_Check(v) (Py_TYPE(v) == &BZ2File_Type)
-
-
-#ifdef BZ_CONFIG_ERROR
-
-#if SIZEOF_LONG >= 8
-#define BZS_TOTAL_OUT(bzs) \
-    (((long)bzs->total_out_hi32 << 32) + bzs->total_out_lo32)
-#elif SIZEOF_LONG_LONG >= 8
-#define BZS_TOTAL_OUT(bzs) \
-    (((PY_LONG_LONG)bzs->total_out_hi32 << 32) + bzs->total_out_lo32)
-#else
-#define BZS_TOTAL_OUT(bzs) \
-    bzs->total_out_lo32
-#endif
-
-#else /* ! BZ_CONFIG_ERROR */
-
-#define BZ2_bzRead bzRead
-#define BZ2_bzReadOpen bzReadOpen
-#define BZ2_bzReadClose bzReadClose
-#define BZ2_bzWrite bzWrite
-#define BZ2_bzWriteOpen bzWriteOpen
-#define BZ2_bzWriteClose bzWriteClose
+#ifndef BZ_CONFIG_ERROR
 #define BZ2_bzCompress bzCompress
 #define BZ2_bzCompressInit bzCompressInit
 #define BZ2_bzCompressEnd bzCompressEnd
 #define BZ2_bzDecompress bzDecompress
 #define BZ2_bzDecompressInit bzDecompressInit
 #define BZ2_bzDecompressEnd bzDecompressEnd
-
-#define BZS_TOTAL_OUT(bzs) bzs->total_out
-
-#endif /* ! BZ_CONFIG_ERROR */
+#endif /* ! BZ_CONFIG_ERROR */
 #ifdef WITH_THREAD
 #define ACQUIRE_LOCK(obj) do { \
-    if (!PyThread_acquire_lock(obj->lock, 0)) { \
+    if (!PyThread_acquire_lock((obj)->lock, 0)) { \
         Py_BEGIN_ALLOW_THREADS \
-        PyThread_acquire_lock(obj->lock, 1); \
+        PyThread_acquire_lock((obj)->lock, 1); \
         Py_END_ALLOW_THREADS \
-    } } while(0)
-#define RELEASE_LOCK(obj) PyThread_release_lock(obj->lock)
+    } } while (0)
+#define RELEASE_LOCK(obj) PyThread_release_lock((obj)->lock)
 #else
 #define ACQUIRE_LOCK(obj)
 #define RELEASE_LOCK(obj)
 #endif
-/* Bits in f_newlinetypes */
-#define NEWLINE_UNKNOWN 0 /* No newline seen, yet */
-#define NEWLINE_CR 1 /* \r newline seen */
-#define NEWLINE_LF 2 /* \n newline seen */
-#define NEWLINE_CRLF 4 /* \r\n newline seen */
-
-/* ===================================================================== */
-/* Structure definitions.
*/
-
-typedef struct {
-    PyObject_HEAD
-    FILE *rawfp;
-
-    char* f_buf; /* Allocated readahead buffer */
-    char* f_bufend; /* Points after last occupied position */
-    char* f_bufptr; /* Current buffer position */
-
-    BZFILE *fp;
-    int mode;
-    Py_off_t pos;
-    Py_off_t size;
-#ifdef WITH_THREAD
-    PyThread_type_lock lock;
-#endif
-} BZ2FileObject;
 typedef struct {
     PyObject_HEAD
     bz_stream bzs;
-    int running;
+    int flushed;
 #ifdef WITH_THREAD
     PyThread_type_lock lock;
 #endif
-} BZ2CompObject;
+} BZ2Compressor;
 typedef struct {
     PyObject_HEAD
     bz_stream bzs;
-    int running;
+    char eof; /* T_BOOL expects a char */
     PyObject *unused_data;
 #ifdef WITH_THREAD
     PyThread_type_lock lock;
 #endif
-} BZ2DecompObject;
+} BZ2Decompressor;
-/* ===================================================================== */
-/* Utility functions. */
-/* Refuse regular I/O if there's data in the iteration-buffer.
- * Mixing them would cause data to arrive out of order, as the read*
- * methods don't use the iteration buffer. */
-static int
-check_iterbuffered(BZ2FileObject *f)
-{
-    if (f->f_buf != NULL &&
-        (f->f_bufend - f->f_bufptr) > 0 &&
-        f->f_buf[0] != '\0') {
-        PyErr_SetString(PyExc_ValueError,
-                        "Mixing iteration and read methods would lose data");
-        return -1;
-    }
-    return 0;
-}
+/* Helper functions.
*/
 static int
-Util_CatchBZ2Error(int bzerror)
+catch_bz2_error(int bzerror)
 {
-    int ret = 0;
     switch(bzerror) {
         case BZ_OK:
+        case BZ_RUN_OK:
+        case BZ_FLUSH_OK:
+        case BZ_FINISH_OK:
         case BZ_STREAM_END:
-            break;
+            return 0;
 #ifdef BZ_CONFIG_ERROR
         case BZ_CONFIG_ERROR:
             PyErr_SetString(PyExc_SystemError,
-                            "the bz2 library was not compiled "
-                            "correctly");
-            ret = 1;
-            break;
+                            "libbzip2 was not compiled correctly");
+            return 1;
 #endif
-
         case BZ_PARAM_ERROR:
             PyErr_SetString(PyExc_ValueError,
-                            "the bz2 library has received wrong "
-                            "parameters");
-            ret = 1;
-            break;
-
+                            "Internal error - "
+                            "invalid parameters passed to libbzip2");
+            return 1;
         case BZ_MEM_ERROR:
             PyErr_NoMemory();
-            ret = 1;
-            break;
-
+            return 1;
         case BZ_DATA_ERROR:
         case BZ_DATA_ERROR_MAGIC:
-            PyErr_SetString(PyExc_IOError, "invalid data stream");
-            ret = 1;
-            break;
-
+            PyErr_SetString(PyExc_IOError, "Invalid data stream");
+            return 1;
         case BZ_IO_ERROR:
-            PyErr_SetString(PyExc_IOError, "unknown IO error");
-            ret = 1;
-            break;
-
+            PyErr_SetString(PyExc_IOError, "Unknown I/O error");
+            return 1;
         case BZ_UNEXPECTED_EOF:
             PyErr_SetString(PyExc_EOFError,
-                            "compressed file ended before the "
-                            "logical end-of-stream was detected");
-            ret = 1;
-            break;
-
+                            "Compressed file ended before the logical "
+                            "end-of-stream was detected");
+            return 1;
         case BZ_SEQUENCE_ERROR:
             PyErr_SetString(PyExc_RuntimeError,
-                            "wrong sequence of bz2 library "
-                            "commands used");
-            ret = 1;
-            break;
+                            "Internal error - "
+                            "Invalid sequence of commands sent to libbzip2");
+            return 1;
+        default:
+            PyErr_Format(PyExc_IOError,
+                         "Unrecognized error from libbzip2: %d", bzerror);
+            return 1;
     }
-    return ret;
 }
 #if BUFSIZ < 8192
@@ -224,1599 +120,316 @@
 #define BIGCHUNK (512 * 1024)
 #endif
-/* This is a hacked version of Python's fileobject.c:new_buffersize().
*/
-static size_t
-Util_NewBufferSize(size_t currentsize)
+static int
+grow_buffer(PyObject **buf)
 {
-    if (currentsize > SMALLCHUNK) {
-        /* Keep doubling until we reach BIGCHUNK;
-           then keep adding BIGCHUNK. */
-        if (currentsize <= BIGCHUNK)
-            return currentsize + currentsize;
-        else
-            return currentsize + BIGCHUNK;
-    }
-    return currentsize + SMALLCHUNK;
+    size_t size = PyBytes_GET_SIZE(*buf);
+    if (size <= SMALLCHUNK)
+        return _PyBytes_Resize(buf, size + SMALLCHUNK);
+    else if (size <= BIGCHUNK)
+        return _PyBytes_Resize(buf, size * 2);
+    else
+        return _PyBytes_Resize(buf, size + BIGCHUNK);
 }
-/* This is a hacked version of Python's fileobject.c:get_line(). */
+
+/* BZ2Compressor class. */
+
 static PyObject *
-Util_GetLine(BZ2FileObject *f, int n)
+compress(BZ2Compressor *c, char *data, size_t len, int action)
 {
-    char c;
-    char *buf, *end;
-    size_t total_v_size; /* total # of slots in buffer */
-    size_t used_v_size; /* # used slots in buffer */
-    size_t increment; /* amount to increment the buffer */
-    PyObject *v;
-    int bzerror;
-    int bytes_read;
+    size_t data_size = 0;
+    PyObject *result;
-    total_v_size = n > 0 ? n : 100;
-    v = PyBytes_FromStringAndSize((char *)NULL, total_v_size);
-    if (v == NULL)
+    result = PyBytes_FromStringAndSize(NULL, SMALLCHUNK);
+    if (result == NULL)
         return NULL;
+    c->bzs.next_in = data;
+    /* FIXME This is not 64-bit clean - avail_in is an int.
*/
+    c->bzs.avail_in = len;
+    c->bzs.next_out = PyBytes_AS_STRING(result);
+    c->bzs.avail_out = PyBytes_GET_SIZE(result);
+    for (;;) {
+        char *this_out;
+        int bzerror;
-    buf = BUF(v);
-    end = buf + total_v_size;
+        Py_BEGIN_ALLOW_THREADS
+        this_out = c->bzs.next_out;
+        bzerror = BZ2_bzCompress(&c->bzs, action);
+        data_size += c->bzs.next_out - this_out;
+        Py_END_ALLOW_THREADS
+        if (catch_bz2_error(bzerror))
+            goto error;
-    for (;;) {
-        Py_BEGIN_ALLOW_THREADS
-        do {
-            bytes_read = BZ2_bzRead(&bzerror, f->fp, &c, 1);
-            f->pos++;
-            if (bytes_read == 0)
-                break;
-            *buf++ = c;
-        } while (bzerror == BZ_OK && c != '\n' && buf != end);
-        Py_END_ALLOW_THREADS
-        if (bzerror == BZ_STREAM_END) {
-            f->size = f->pos;
-            f->mode = MODE_READ_EOF;
+        /* In regular compression mode, stop when input data is exhausted.
+           In flushing mode, stop when all buffered data has been flushed. */
+        if ((action == BZ_RUN && c->bzs.avail_in == 0) ||
+            (action == BZ_FINISH && bzerror == BZ_STREAM_END))
             break;
-        } else if (bzerror != BZ_OK) {
-            Util_CatchBZ2Error(bzerror);
-            Py_DECREF(v);
-            return NULL;
-        }
-        if (c == '\n')
-            break;
-        /* Must be because buf == end */
-        if (n > 0)
-            break;
-        used_v_size = total_v_size;
-        increment = total_v_size >> 2; /* mild exponential growth */
-        total_v_size += increment;
-        if (total_v_size > INT_MAX) {
-            PyErr_SetString(PyExc_OverflowError,
-                "line is longer than a Python string can hold");
-            Py_DECREF(v);
-            return NULL;
-        }
-        if (_PyBytes_Resize(&v, total_v_size) < 0) {
-            return NULL;
-        }
-        buf = BUF(v) + used_v_size;
-        end = BUF(v) + total_v_size;
-    }
-    used_v_size = buf - BUF(v);
-    if (used_v_size != total_v_size) {
-        if (_PyBytes_Resize(&v, used_v_size) < 0) {
-            v = NULL;
+        if (c->bzs.avail_out == 0) {
+            if (grow_buffer(&result) < 0)
+                goto error;
+            c->bzs.next_out = PyBytes_AS_STRING(result) + data_size;
+            c->bzs.avail_out = PyBytes_GET_SIZE(result) - data_size;
         }
     }
-    return v;
+    if (data_size != PyBytes_GET_SIZE(result))
+        if (_PyBytes_Resize(&result,
data_size) < 0)
+            goto error;
+    return result;
+
+error:
+    Py_XDECREF(result);
+    return NULL;
 }
-/* This is a hacked version of Python's fileobject.c:drop_readahead(). */
-static void
-Util_DropReadAhead(BZ2FileObject *f)
+PyDoc_STRVAR(BZ2Compressor_compress__doc__,
+"compress(data) -> bytes\n"
+"\n"
+"Provide data to the compressor object. Returns a chunk of\n"
+"compressed data if possible, or b'' otherwise.\n"
+"\n"
+"When you have finished providing data to the compressor, call the\n"
+"flush() method to finish the compression process.\n");
+
+static PyObject *
+BZ2Compressor_compress(BZ2Compressor *self, PyObject *args)
 {
-    if (f->f_buf != NULL) {
-        PyMem_Free(f->f_buf);
-        f->f_buf = NULL;
-    }
-}
+    Py_buffer buffer;
+    PyObject *result = NULL;
-/* This is a hacked version of Python's fileobject.c:readahead(). */
-static int
-Util_ReadAhead(BZ2FileObject *f, int bufsize)
-{
-    int chunksize;
-    int bzerror;
-
-    if (f->f_buf != NULL) {
-        if((f->f_bufend - f->f_bufptr) >= 1)
-            return 0;
-        else
-            Util_DropReadAhead(f);
-    }
-    if (f->mode == MODE_READ_EOF) {
-        f->f_bufptr = f->f_buf;
-        f->f_bufend = f->f_buf;
-        return 0;
-    }
-    if ((f->f_buf = PyMem_Malloc(bufsize)) == NULL) {
-        PyErr_NoMemory();
-        return -1;
-    }
-    Py_BEGIN_ALLOW_THREADS
-    chunksize = BZ2_bzRead(&bzerror, f->fp, f->f_buf, bufsize);
-    Py_END_ALLOW_THREADS
-    f->pos += chunksize;
-    if (bzerror == BZ_STREAM_END) {
-        f->size = f->pos;
-        f->mode = MODE_READ_EOF;
-    } else if (bzerror != BZ_OK) {
-        Util_CatchBZ2Error(bzerror);
-        Util_DropReadAhead(f);
-        return -1;
-    }
-    f->f_bufptr = f->f_buf;
-    f->f_bufend = f->f_buf + chunksize;
-    return 0;
-}
-
-/* This is a hacked version of Python's
- * fileobject.c:readahead_get_line_skip().
*/
-static PyBytesObject *
-Util_ReadAheadGetLineSkip(BZ2FileObject *f, int skip, int bufsize)
-{
-    PyBytesObject* s;
-    char *bufptr;
-    char *buf;
-    int len;
-
-    if (f->f_buf == NULL)
-        if (Util_ReadAhead(f, bufsize) < 0)
-            return NULL;
-
-    len = f->f_bufend - f->f_bufptr;
-    if (len == 0)
-        return (PyBytesObject *)
-            PyBytes_FromStringAndSize(NULL, skip);
-    bufptr = memchr(f->f_bufptr, '\n', len);
-    if (bufptr != NULL) {
-        bufptr++; /* Count the '\n' */
-        len = bufptr - f->f_bufptr;
-        s = (PyBytesObject *)
-            PyBytes_FromStringAndSize(NULL, skip+len);
-        if (s == NULL)
-            return NULL;
-        memcpy(PyBytes_AS_STRING(s)+skip, f->f_bufptr, len);
-        f->f_bufptr = bufptr;
-        if (bufptr == f->f_bufend)
-            Util_DropReadAhead(f);
-    } else {
-        bufptr = f->f_bufptr;
-        buf = f->f_buf;
-        f->f_buf = NULL; /* Force new readahead buffer */
-        s = Util_ReadAheadGetLineSkip(f, skip+len,
-                                      bufsize + (bufsize>>2));
-        if (s == NULL) {
-            PyMem_Free(buf);
-            return NULL;
-        }
-        memcpy(PyBytes_AS_STRING(s)+skip, bufptr, len);
-        PyMem_Free(buf);
-    }
-    return s;
-}
-
-/* ===================================================================== */
-/* Methods of BZ2File. */
-
-PyDoc_STRVAR(BZ2File_read__doc__,
-"read([size]) -> string\n\
-\n\
-Read at most size uncompressed bytes, returned as a string. If the size\n\
-argument is negative or omitted, read until EOF is reached.\n\
-");
-
-/* This is a hacked version of Python's fileobject.c:file_read().
*/
-static PyObject *
-BZ2File_read(BZ2FileObject *self, PyObject *args)
-{
-    long bytesrequested = -1;
-    size_t bytesread, buffersize, chunksize;
-    int bzerror;
-    PyObject *ret = NULL;
-
-    if (!PyArg_ParseTuple(args, "|l:read", &bytesrequested))
+    if (!PyArg_ParseTuple(args, "y*:compress", &buffer))
         return NULL;
     ACQUIRE_LOCK(self);
-    switch (self->mode) {
-        case MODE_READ:
-            break;
-        case MODE_READ_EOF:
-            ret = PyBytes_FromStringAndSize("", 0);
-            goto cleanup;
-        case MODE_CLOSED:
-            PyErr_SetString(PyExc_ValueError,
-                            "I/O operation on closed file");
-            goto cleanup;
-        default:
-            PyErr_SetString(PyExc_IOError,
-                            "file is not ready for reading");
-            goto cleanup;
-    }
-
-    /* refuse to mix with f.next() */
-    if (check_iterbuffered(self))
-        goto cleanup;
-
-    if (bytesrequested < 0)
-        buffersize = Util_NewBufferSize((size_t)0);
+    if (self->flushed)
+        PyErr_SetString(PyExc_ValueError, "Compressor has been flushed");
     else
-        buffersize = bytesrequested;
-    if (buffersize > INT_MAX) {
-        PyErr_SetString(PyExc_OverflowError,
-                        "requested number of bytes is "
-                        "more than a Python string can hold");
-        goto cleanup;
-    }
-    ret = PyBytes_FromStringAndSize((char *)NULL, buffersize);
-    if (ret == NULL || buffersize == 0)
-        goto cleanup;
-    bytesread = 0;
-
-    for (;;) {
-        Py_BEGIN_ALLOW_THREADS
-        chunksize = BZ2_bzRead(&bzerror, self->fp,
-                               BUF(ret)+bytesread,
-                               buffersize-bytesread);
-        self->pos += chunksize;
-        Py_END_ALLOW_THREADS
-        bytesread += chunksize;
-        if (bzerror == BZ_STREAM_END) {
-            self->size = self->pos;
-            self->mode = MODE_READ_EOF;
-            break;
-        } else if (bzerror != BZ_OK) {
-            Util_CatchBZ2Error(bzerror);
-            Py_DECREF(ret);
-            ret = NULL;
-            goto cleanup;
-        }
-        if (bytesrequested < 0) {
-            buffersize = Util_NewBufferSize(buffersize);
-            if (_PyBytes_Resize(&ret, buffersize) < 0) {
-                ret = NULL;
-                goto cleanup;
-            }
-        } else {
-            break;
-        }
-    }
-    if (bytesread != buffersize) {
-        if (_PyBytes_Resize(&ret, bytesread) < 0) {
-            ret = NULL;
-        }
-    }
-
-cleanup:
+        result = compress(self,
buffer.buf, buffer.len, BZ_RUN);
     RELEASE_LOCK(self);
-    return ret;
+    PyBuffer_Release(&buffer);
+    return result;
 }
-PyDoc_STRVAR(BZ2File_readline__doc__,
-"readline([size]) -> string\n\
-\n\
-Return the next line from the file, as a string, retaining newline.\n\
-A non-negative size argument will limit the maximum number of bytes to\n\
-return (an incomplete line may be returned then). Return an empty\n\
-string at EOF.\n\
-");
+PyDoc_STRVAR(BZ2Compressor_flush__doc__,
+"flush() -> bytes\n"
+"\n"
+"Finish the compression process. Returns the compressed data left\n"
+"in internal buffers.\n"
+"\n"
+"The compressor object may not be used after this method is called.\n");
 static PyObject *
-BZ2File_readline(BZ2FileObject *self, PyObject *args)
+BZ2Compressor_flush(BZ2Compressor *self, PyObject *noargs)
 {
-    PyObject *ret = NULL;
-    int sizehint = -1;
-
-    if (!PyArg_ParseTuple(args, "|i:readline", &sizehint))
-        return NULL;
+    PyObject *result = NULL;
     ACQUIRE_LOCK(self);
-    switch (self->mode) {
-        case MODE_READ:
-            break;
-        case MODE_READ_EOF:
-            ret = PyBytes_FromStringAndSize("", 0);
-            goto cleanup;
-        case MODE_CLOSED:
-            PyErr_SetString(PyExc_ValueError,
-                            "I/O operation on closed file");
-            goto cleanup;
-        default:
-            PyErr_SetString(PyExc_IOError,
-                            "file is not ready for reading");
-            goto cleanup;
+    if (self->flushed)
+        PyErr_SetString(PyExc_ValueError, "Repeated call to flush()");
+    else {
+        self->flushed = 1;
+        result = compress(self, NULL, 0, BZ_FINISH);
     }
-
-    /* refuse to mix with f.next() */
-    if (check_iterbuffered(self))
-        goto cleanup;
-
-    if (sizehint == 0)
-        ret = PyBytes_FromStringAndSize("", 0);
-    else
-        ret = Util_GetLine(self, (sizehint < 0) ?
0 : sizehint); - -cleanup: RELEASE_LOCK(self); - return ret; + return result; } -PyDoc_STRVAR(BZ2File_readlines__doc__, -"readlines([size]) -> list\n\ -\n\ -Call readline() repeatedly and return a list of lines read.\n\ -The optional size argument, if given, is an approximate bound on the\n\ -total number of bytes in the lines returned.\n\ -"); - -/* This is a hacked version of Python's fileobject.c:file_readlines(). */ -static PyObject * -BZ2File_readlines(BZ2FileObject *self, PyObject *args) +static int +BZ2Compressor_init(BZ2Compressor *self, PyObject *args, PyObject *kwargs) { - long sizehint = 0; - PyObject *list = NULL; - PyObject *line; - char small_buffer[SMALLCHUNK]; - char *buffer = small_buffer; - size_t buffersize = SMALLCHUNK; - PyObject *big_buffer = NULL; - size_t nfilled = 0; - size_t nread; - size_t totalread = 0; - char *p, *q, *end; - int err; - int shortread = 0; + int compresslevel = 9; int bzerror; - if (!PyArg_ParseTuple(args, "|l:readlines", &sizehint)) - return NULL; - - ACQUIRE_LOCK(self); - switch (self->mode) { - case MODE_READ: - break; - case MODE_READ_EOF: - list = PyList_New(0); - goto cleanup; - case MODE_CLOSED: - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - goto cleanup; - default: - PyErr_SetString(PyExc_IOError, - "file is not ready for reading"); - goto cleanup; - } - - /* refuse to mix with f.next() */ - if (check_iterbuffered(self)) - goto cleanup; - - if ((list = PyList_New(0)) == NULL) - goto cleanup; - - for (;;) { - Py_BEGIN_ALLOW_THREADS - nread = BZ2_bzRead(&bzerror, self->fp, - buffer+nfilled, buffersize-nfilled); - self->pos += nread; - Py_END_ALLOW_THREADS - if (bzerror == BZ_STREAM_END) { - self->size = self->pos; - self->mode = MODE_READ_EOF; - if (nread == 0) { - sizehint = 0; - break; - } - shortread = 1; - } else if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - error: - Py_DECREF(list); - list = NULL; - goto cleanup; - } - totalread += nread; - p = memchr(buffer+nfilled, '\n', 
nread); - if (!shortread && p == NULL) { - /* Need a larger buffer to fit this line */ - nfilled += nread; - buffersize *= 2; - if (buffersize > INT_MAX) { - PyErr_SetString(PyExc_OverflowError, - "line is longer than a Python string can hold"); - goto error; - } - if (big_buffer == NULL) { - /* Create the big buffer */ - big_buffer = PyBytes_FromStringAndSize( - NULL, buffersize); - if (big_buffer == NULL) - goto error; - buffer = PyBytes_AS_STRING(big_buffer); - memcpy(buffer, small_buffer, nfilled); - } - else { - /* Grow the big buffer */ - if (_PyBytes_Resize(&big_buffer, buffersize) < 0){ - big_buffer = NULL; - goto error; - } - buffer = PyBytes_AS_STRING(big_buffer); - } - continue; - } - end = buffer+nfilled+nread; - q = buffer; - while (p != NULL) { - /* Process complete lines */ - p++; - line = PyBytes_FromStringAndSize(q, p-q); - if (line == NULL) - goto error; - err = PyList_Append(list, line); - Py_DECREF(line); - if (err != 0) - goto error; - q = p; - p = memchr(q, '\n', end-q); - } - /* Move the remaining incomplete line to the start */ - nfilled = end-q; - memmove(buffer, q, nfilled); - if (sizehint > 0) - if (totalread >= (size_t)sizehint) - break; - if (shortread) { - sizehint = 0; - break; - } - } - if (nfilled != 0) { - /* Partial last line */ - line = PyBytes_FromStringAndSize(buffer, nfilled); - if (line == NULL) - goto error; - if (sizehint > 0) { - /* Need to complete the last line */ - PyObject *rest = Util_GetLine(self, 0); - if (rest == NULL) { - Py_DECREF(line); - goto error; - } - PyBytes_Concat(&line, rest); - Py_DECREF(rest); - if (line == NULL) - goto error; - } - err = PyList_Append(list, line); - Py_DECREF(line); - if (err != 0) - goto error; - } - - cleanup: - RELEASE_LOCK(self); - if (big_buffer) { - Py_DECREF(big_buffer); - } - return list; -} - -PyDoc_STRVAR(BZ2File_write__doc__, -"write(data) -> None\n\ -\n\ -Write the 'data' string to file. 
Note that due to buffering, close() may\n\ -be needed before the file on disk reflects the data written.\n\ -"); - -/* This is a hacked version of Python's fileobject.c:file_write(). */ -static PyObject * -BZ2File_write(BZ2FileObject *self, PyObject *args) -{ - PyObject *ret = NULL; - Py_buffer pbuf; - char *buf; - int len; - int bzerror; - - if (!PyArg_ParseTuple(args, "y*:write", &pbuf)) - return NULL; - buf = pbuf.buf; - len = pbuf.len; - - ACQUIRE_LOCK(self); - switch (self->mode) { - case MODE_WRITE: - break; - - case MODE_CLOSED: - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - goto cleanup; - - default: - PyErr_SetString(PyExc_IOError, - "file is not ready for writing"); - goto cleanup; - } - - Py_BEGIN_ALLOW_THREADS - BZ2_bzWrite (&bzerror, self->fp, buf, len); - self->pos += len; - Py_END_ALLOW_THREADS - - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto cleanup; - } - - Py_INCREF(Py_None); - ret = Py_None; - -cleanup: - PyBuffer_Release(&pbuf); - RELEASE_LOCK(self); - return ret; -} - -PyDoc_STRVAR(BZ2File_writelines__doc__, -"writelines(sequence_of_strings) -> None\n\ -\n\ -Write the sequence of strings to the file. Note that newlines are not\n\ -added. The sequence can be any iterable object producing strings. This is\n\ -equivalent to calling write() for each string.\n\ -"); - -/* This is a hacked version of Python's fileobject.c:file_writelines(). 
*/ -static PyObject * -BZ2File_writelines(BZ2FileObject *self, PyObject *seq) -{ -#define CHUNKSIZE 1000 - PyObject *list = NULL; - PyObject *iter = NULL; - PyObject *ret = NULL; - PyObject *line; - int i, j, index, len, islist; - int bzerror; - - ACQUIRE_LOCK(self); - switch (self->mode) { - case MODE_WRITE: - break; - - case MODE_CLOSED: - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - goto error; - - default: - PyErr_SetString(PyExc_IOError, - "file is not ready for writing"); - goto error; - } - - islist = PyList_Check(seq); - if (!islist) { - iter = PyObject_GetIter(seq); - if (iter == NULL) { - PyErr_SetString(PyExc_TypeError, - "writelines() requires an iterable argument"); - goto error; - } - list = PyList_New(CHUNKSIZE); - if (list == NULL) - goto error; - } - - /* Strategy: slurp CHUNKSIZE lines into a private list, - checking that they are all strings, then write that list - without holding the interpreter lock, then come back for more. */ - for (index = 0; ; index += CHUNKSIZE) { - if (islist) { - Py_XDECREF(list); - list = PyList_GetSlice(seq, index, index+CHUNKSIZE); - if (list == NULL) - goto error; - j = PyList_GET_SIZE(list); - } - else { - for (j = 0; j < CHUNKSIZE; j++) { - line = PyIter_Next(iter); - if (line == NULL) { - if (PyErr_Occurred()) - goto error; - break; - } - PyList_SetItem(list, j, line); - } - } - if (j == 0) - break; - - /* Check that all entries are indeed byte strings. If not, - apply the same rules as for file.write() and - convert the rets to strings. This is slow, but - seems to be the only way since all conversion APIs - could potentially execute Python code. 
*/ - for (i = 0; i < j; i++) { - PyObject *v = PyList_GET_ITEM(list, i); - if (!PyBytes_Check(v)) { - const char *buffer; - Py_ssize_t len; - if (PyObject_AsCharBuffer(v, &buffer, &len)) { - PyErr_SetString(PyExc_TypeError, - "writelines() " - "argument must be " - "a sequence of " - "bytes objects"); - goto error; - } - line = PyBytes_FromStringAndSize(buffer, - len); - if (line == NULL) - goto error; - Py_DECREF(v); - PyList_SET_ITEM(list, i, line); - } - } - - /* Since we are releasing the global lock, the - following code may *not* execute Python code. */ - Py_BEGIN_ALLOW_THREADS - for (i = 0; i < j; i++) { - line = PyList_GET_ITEM(list, i); - len = PyBytes_GET_SIZE(line); - BZ2_bzWrite (&bzerror, self->fp, - PyBytes_AS_STRING(line), len); - if (bzerror != BZ_OK) { - Py_BLOCK_THREADS - Util_CatchBZ2Error(bzerror); - goto error; - } - } - Py_END_ALLOW_THREADS - - if (j < CHUNKSIZE) - break; - } - - Py_INCREF(Py_None); - ret = Py_None; - - error: - RELEASE_LOCK(self); - Py_XDECREF(list); - Py_XDECREF(iter); - return ret; -#undef CHUNKSIZE -} - -PyDoc_STRVAR(BZ2File_seek__doc__, -"seek(offset [, whence]) -> None\n\ -\n\ -Move to new file position. Argument offset is a byte count. 
Optional\n\ -argument whence defaults to 0 (offset from start of file, offset\n\ -should be >= 0); other values are 1 (move relative to current position,\n\ -positive or negative), and 2 (move relative to end of file, usually\n\ -negative, although many platforms allow seeking beyond the end of a file).\n\ -\n\ -Note that seeking of bz2 files is emulated, and depending on the parameters\n\ -the operation may be extremely slow.\n\ -"); - -static PyObject * -BZ2File_seek(BZ2FileObject *self, PyObject *args) -{ - int where = 0; - PyObject *offobj; - Py_off_t offset; - char small_buffer[SMALLCHUNK]; - char *buffer = small_buffer; - size_t buffersize = SMALLCHUNK; - Py_off_t bytesread = 0; - size_t readsize; - int chunksize; - int bzerror; - PyObject *ret = NULL; - - if (!PyArg_ParseTuple(args, "O|i:seek", &offobj, &where)) - return NULL; -#if !defined(HAVE_LARGEFILE_SUPPORT) - offset = PyLong_AsLong(offobj); -#else - offset = PyLong_Check(offobj) ? - PyLong_AsLongLong(offobj) : PyLong_AsLong(offobj); -#endif - if (PyErr_Occurred()) - return NULL; - - ACQUIRE_LOCK(self); - Util_DropReadAhead(self); - switch (self->mode) { - case MODE_READ: - case MODE_READ_EOF: - break; - - case MODE_CLOSED: - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - goto cleanup; - - default: - PyErr_SetString(PyExc_IOError, - "seek works only while reading"); - goto cleanup; - } - - if (where == 2) { - if (self->size == -1) { - assert(self->mode != MODE_READ_EOF); - for (;;) { - Py_BEGIN_ALLOW_THREADS - chunksize = BZ2_bzRead(&bzerror, self->fp, - buffer, buffersize); - self->pos += chunksize; - Py_END_ALLOW_THREADS - - bytesread += chunksize; - if (bzerror == BZ_STREAM_END) { - break; - } else if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto cleanup; - } - } - self->mode = MODE_READ_EOF; - self->size = self->pos; - bytesread = 0; - } - offset = self->size + offset; - } else if (where == 1) { - offset = self->pos + offset; - } - - /* Before getting here, 
offset must be the absolute position the file - * pointer should be set to. */ - - if (offset >= self->pos) { - /* we can move forward */ - offset -= self->pos; - } else { - /* we cannot move back, so rewind the stream */ - BZ2_bzReadClose(&bzerror, self->fp); - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto cleanup; - } - rewind(self->rawfp); - self->pos = 0; - self->fp = BZ2_bzReadOpen(&bzerror, self->rawfp, - 0, 0, NULL, 0); - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto cleanup; - } - self->mode = MODE_READ; - } - - if (offset <= 0 || self->mode == MODE_READ_EOF) - goto exit; - - /* Before getting here, offset must be set to the number of bytes - * to walk forward. */ - for (;;) { - if (offset-bytesread > buffersize) - readsize = buffersize; - else - /* offset might be wider that readsize, but the result - * of the subtraction is bound by buffersize (see the - * condition above). buffersize is 8192. */ - readsize = (size_t)(offset-bytesread); - Py_BEGIN_ALLOW_THREADS - chunksize = BZ2_bzRead(&bzerror, self->fp, buffer, readsize); - self->pos += chunksize; - Py_END_ALLOW_THREADS - bytesread += chunksize; - if (bzerror == BZ_STREAM_END) { - self->size = self->pos; - self->mode = MODE_READ_EOF; - break; - } else if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto cleanup; - } - if (bytesread == offset) - break; - } - -exit: - Py_INCREF(Py_None); - ret = Py_None; - -cleanup: - RELEASE_LOCK(self); - return ret; -} - -PyDoc_STRVAR(BZ2File_tell__doc__, -"tell() -> int\n\ -\n\ -Return the current file position, an integer (may be a long integer).\n\ -"); - -static PyObject * -BZ2File_tell(BZ2FileObject *self, PyObject *args) -{ - PyObject *ret = NULL; - - if (self->mode == MODE_CLOSED) { - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - goto cleanup; - } - -#if !defined(HAVE_LARGEFILE_SUPPORT) - ret = PyLong_FromLong(self->pos); -#else - ret = PyLong_FromLongLong(self->pos); -#endif - -cleanup: - return 
ret; -} - -PyDoc_STRVAR(BZ2File_close__doc__, -"close() -> None or (perhaps) an integer\n\ -\n\ -Close the file. Sets data attribute .closed to true. A closed file\n\ -cannot be used for further I/O operations. close() may be called more\n\ -than once without error.\n\ -"); - -static PyObject * -BZ2File_close(BZ2FileObject *self) -{ - PyObject *ret = NULL; - int bzerror = BZ_OK; - - if (self->mode == MODE_CLOSED) { - Py_RETURN_NONE; - } - - ACQUIRE_LOCK(self); - switch (self->mode) { - case MODE_READ: - case MODE_READ_EOF: - BZ2_bzReadClose(&bzerror, self->fp); - break; - case MODE_WRITE: - BZ2_bzWriteClose(&bzerror, self->fp, - 0, NULL, NULL); - break; - } - self->mode = MODE_CLOSED; - fclose(self->rawfp); - self->rawfp = NULL; - if (bzerror == BZ_OK) { - Py_INCREF(Py_None); - ret = Py_None; - } - else { - Util_CatchBZ2Error(bzerror); - } - - RELEASE_LOCK(self); - return ret; -} - -PyDoc_STRVAR(BZ2File_enter_doc, -"__enter__() -> self."); - -static PyObject * -BZ2File_enter(BZ2FileObject *self) -{ - if (self->mode == MODE_CLOSED) { - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - return NULL; - } - Py_INCREF(self); - return (PyObject *) self; -} - -PyDoc_STRVAR(BZ2File_exit_doc, -"__exit__(*excinfo) -> None. 
Closes the file."); - -static PyObject * -BZ2File_exit(BZ2FileObject *self, PyObject *args) -{ - PyObject *ret = PyObject_CallMethod((PyObject *) self, "close", NULL); - if (!ret) - /* If error occurred, pass through */ - return NULL; - Py_DECREF(ret); - Py_RETURN_NONE; -} - - -static PyObject *BZ2File_getiter(BZ2FileObject *self); - -static PyMethodDef BZ2File_methods[] = { - {"read", (PyCFunction)BZ2File_read, METH_VARARGS, BZ2File_read__doc__}, - {"readline", (PyCFunction)BZ2File_readline, METH_VARARGS, BZ2File_readline__doc__}, - {"readlines", (PyCFunction)BZ2File_readlines, METH_VARARGS, BZ2File_readlines__doc__}, - {"write", (PyCFunction)BZ2File_write, METH_VARARGS, BZ2File_write__doc__}, - {"writelines", (PyCFunction)BZ2File_writelines, METH_O, BZ2File_writelines__doc__}, - {"seek", (PyCFunction)BZ2File_seek, METH_VARARGS, BZ2File_seek__doc__}, - {"tell", (PyCFunction)BZ2File_tell, METH_NOARGS, BZ2File_tell__doc__}, - {"close", (PyCFunction)BZ2File_close, METH_NOARGS, BZ2File_close__doc__}, - {"__enter__", (PyCFunction)BZ2File_enter, METH_NOARGS, BZ2File_enter_doc}, - {"__exit__", (PyCFunction)BZ2File_exit, METH_VARARGS, BZ2File_exit_doc}, - {NULL, NULL} /* sentinel */ -}; - - -/* ===================================================================== */ -/* Getters and setters of BZ2File. */ - -static PyObject * -BZ2File_get_closed(BZ2FileObject *self, void *closure) -{ - return PyLong_FromLong(self->mode == MODE_CLOSED); -} - -static PyGetSetDef BZ2File_getset[] = { - {"closed", (getter)BZ2File_get_closed, NULL, - "True if the file is closed"}, - {NULL} /* Sentinel */ -}; - - -/* ===================================================================== */ -/* Slot definitions for BZ2File_Type. 
*/ - -static int -BZ2File_init(BZ2FileObject *self, PyObject *args, PyObject *kwargs) -{ - static char *kwlist[] = {"filename", "mode", "buffering", - "compresslevel", 0}; - PyObject *name_obj = NULL; - char *name; - char *mode = "r"; - int buffering = -1; - int compresslevel = 9; - int bzerror; - int mode_char = 0; - - self->size = -1; - - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O&|sii:BZ2File", - kwlist, PyUnicode_FSConverter, &name_obj, - &mode, &buffering, - &compresslevel)) + if (!PyArg_ParseTuple(args, "|i:BZ2Compressor", &compresslevel)) return -1; - - name = PyBytes_AsString(name_obj); - if (compresslevel < 1 || compresslevel > 9) { + if (!(1 <= compresslevel && compresslevel <= 9)) { PyErr_SetString(PyExc_ValueError, "compresslevel must be between 1 and 9"); - Py_DECREF(name_obj); return -1; } - for (;;) { - int error = 0; - switch (*mode) { - case 'r': - case 'w': - if (mode_char) - error = 1; - mode_char = *mode; - break; - - case 'b': - break; - - default: - error = 1; - break; - } - if (error) { - PyErr_Format(PyExc_ValueError, - "invalid mode char %c", *mode); - Py_DECREF(name_obj); - return -1; - } - mode++; - if (*mode == '\0') - break; - } - - if (mode_char == 0) { - mode_char = 'r'; - } - - mode = (mode_char == 'r') ? 
"rb" : "wb"; - - self->rawfp = fopen(name, mode); - Py_DECREF(name_obj); - if (self->rawfp == NULL) { - PyErr_SetFromErrno(PyExc_IOError); - return -1; - } - /* XXX Ignore buffering */ - - /* From now on, we have stuff to dealloc, so jump to error label - * instead of returning */ - #ifdef WITH_THREAD self->lock = PyThread_allocate_lock(); - if (!self->lock) { - PyErr_SetString(PyExc_MemoryError, "unable to allocate lock"); - goto error; + if (self->lock == NULL) { + PyErr_SetString(PyExc_MemoryError, "Unable to allocate lock"); + return -1; } #endif - if (mode_char == 'r') - self->fp = BZ2_bzReadOpen(&bzerror, self->rawfp, - 0, 0, NULL, 0); - else - self->fp = BZ2_bzWriteOpen(&bzerror, self->rawfp, - compresslevel, 0, 0); - - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); + bzerror = BZ2_bzCompressInit(&self->bzs, compresslevel, 0, 0); + if (catch_bz2_error(bzerror)) goto error; - } - - self->mode = (mode_char == 'r') ? MODE_READ : MODE_WRITE; return 0; error: - fclose(self->rawfp); - self->rawfp = NULL; #ifdef WITH_THREAD - if (self->lock) { - PyThread_free_lock(self->lock); - self->lock = NULL; - } + PyThread_free_lock(self->lock); + self->lock = NULL; #endif return -1; } static void -BZ2File_dealloc(BZ2FileObject *self) +BZ2Compressor_dealloc(BZ2Compressor *self) { - int bzerror; + BZ2_bzCompressEnd(&self->bzs); #ifdef WITH_THREAD - if (self->lock) + if (self->lock != NULL) PyThread_free_lock(self->lock); #endif - switch (self->mode) { - case MODE_READ: - case MODE_READ_EOF: - BZ2_bzReadClose(&bzerror, self->fp); - break; - case MODE_WRITE: - BZ2_bzWriteClose(&bzerror, self->fp, - 0, NULL, NULL); - break; - } - Util_DropReadAhead(self); - if (self->rawfp != NULL) - fclose(self->rawfp); Py_TYPE(self)->tp_free((PyObject *)self); } -/* This is a hacked version of Python's fileobject.c:file_getiter(). 
*/ -static PyObject * -BZ2File_getiter(BZ2FileObject *self) -{ - if (self->mode == MODE_CLOSED) { - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - return NULL; - } - Py_INCREF((PyObject*)self); - return (PyObject *)self; -} - -/* This is a hacked version of Python's fileobject.c:file_iternext(). */ -#define READAHEAD_BUFSIZE 8192 -static PyObject * -BZ2File_iternext(BZ2FileObject *self) -{ - PyBytesObject* ret; - ACQUIRE_LOCK(self); - if (self->mode == MODE_CLOSED) { - RELEASE_LOCK(self); - PyErr_SetString(PyExc_ValueError, - "I/O operation on closed file"); - return NULL; - } - ret = Util_ReadAheadGetLineSkip(self, 0, READAHEAD_BUFSIZE); - RELEASE_LOCK(self); - if (ret == NULL || PyBytes_GET_SIZE(ret) == 0) { - Py_XDECREF(ret); - return NULL; - } - return (PyObject *)ret; -} - -/* ===================================================================== */ -/* BZ2File_Type definition. */ - -PyDoc_VAR(BZ2File__doc__) = -PyDoc_STR( -"BZ2File(name [, mode='r', buffering=0, compresslevel=9]) -> file object\n\ -\n\ -Open a bz2 file. The mode can be 'r' or 'w', for reading (default) or\n\ -writing. When opened for writing, the file will be created if it doesn't\n\ -exist, and truncated otherwise. If the buffering argument is given, 0 means\n\ -unbuffered, and larger numbers specify the buffer size. 
If compresslevel\n\ -is given, must be a number between 1 and 9.\n\ -Data read is always returned in bytes; data written ought to be bytes.\n\ -"); - -static PyTypeObject BZ2File_Type = { - PyVarObject_HEAD_INIT(NULL, 0) - "bz2.BZ2File", /*tp_name*/ - sizeof(BZ2FileObject), /*tp_basicsize*/ - 0, /*tp_itemsize*/ - (destructor)BZ2File_dealloc, /*tp_dealloc*/ - 0, /*tp_print*/ - 0, /*tp_getattr*/ - 0, /*tp_setattr*/ - 0, /*tp_reserved*/ - 0, /*tp_repr*/ - 0, /*tp_as_number*/ - 0, /*tp_as_sequence*/ - 0, /*tp_as_mapping*/ - 0, /*tp_hash*/ - 0, /*tp_call*/ - 0, /*tp_str*/ - PyObject_GenericGetAttr,/*tp_getattro*/ - PyObject_GenericSetAttr,/*tp_setattro*/ - 0, /*tp_as_buffer*/ - Py_TPFLAGS_DEFAULT|Py_TPFLAGS_BASETYPE, /*tp_flags*/ - BZ2File__doc__, /*tp_doc*/ - 0, /*tp_traverse*/ - 0, /*tp_clear*/ - 0, /*tp_richcompare*/ - 0, /*tp_weaklistoffset*/ - (getiterfunc)BZ2File_getiter, /*tp_iter*/ - (iternextfunc)BZ2File_iternext, /*tp_iternext*/ - BZ2File_methods, /*tp_methods*/ - 0, /*tp_members*/ - BZ2File_getset, /*tp_getset*/ - 0, /*tp_base*/ - 0, /*tp_dict*/ - 0, /*tp_descr_get*/ - 0, /*tp_descr_set*/ - 0, /*tp_dictoffset*/ - (initproc)BZ2File_init, /*tp_init*/ - PyType_GenericAlloc, /*tp_alloc*/ - PyType_GenericNew, /*tp_new*/ - PyObject_Free, /*tp_free*/ - 0, /*tp_is_gc*/ +static PyMethodDef BZ2Compressor_methods[] = { + {"compress", (PyCFunction)BZ2Compressor_compress, METH_VARARGS, + BZ2Compressor_compress__doc__}, + {"flush", (PyCFunction)BZ2Compressor_flush, METH_NOARGS, + BZ2Compressor_flush__doc__}, + {NULL} }; +PyDoc_STRVAR(BZ2Compressor__doc__, +"BZ2Compressor(compresslevel=9)\n" +"\n" +"Create a compressor object for compressing data incrementally.\n" +"\n" +"compresslevel, if given, must be a number between 1 and 9.\n" +"\n" +"For one-shot compression, use the compress() function instead.\n"); -/* ===================================================================== */ -/* Methods of BZ2Comp. 
*/ +static PyTypeObject BZ2Compressor_Type = { + PyVarObject_HEAD_INIT(NULL, 0) + "_bz2.BZ2Compressor", /* tp_name */ + sizeof(BZ2Compressor), /* tp_basicsize */ + 0, /* tp_itemsize */ + (destructor)BZ2Compressor_dealloc, /* tp_dealloc */ + 0, /* tp_print */ + 0, /* tp_getattr */ + 0, /* tp_setattr */ + 0, /* tp_reserved */ + 0, /* tp_repr */ + 0, /* tp_as_number */ + 0, /* tp_as_sequence */ + 0, /* tp_as_mapping */ + 0, /* tp_hash */ + 0, /* tp_call */ + 0, /* tp_str */ + 0, /* tp_getattro */ + 0, /* tp_setattro */ + 0, /* tp_as_buffer */ + Py_TPFLAGS_DEFAULT, /* tp_flags */ + BZ2Compressor__doc__, /* tp_doc */ + 0, /* tp_traverse */ + 0, /* tp_clear */ + 0, /* tp_richcompare */ + 0, /* tp_weaklistoffset */ + 0, /* tp_iter */ + 0, /* tp_iternext */ + BZ2Compressor_methods, /* tp_methods */ + 0, /* tp_members */ + 0, /* tp_getset */ + 0, /* tp_base */ + 0, /* tp_dict */ + 0, /* tp_descr_get */ + 0, /* tp_descr_set */ + 0, /* tp_dictoffset */ + (initproc)BZ2Compressor_init, /* tp_init */ + 0, /* tp_alloc */ + PyType_GenericNew, /* tp_new */ +}; -PyDoc_STRVAR(BZ2Comp_compress__doc__, -"compress(data) -> string\n\ -\n\ -Provide more data to the compressor object. It will return chunks of\n\ -compressed data whenever possible. When you've finished providing data\n\ -to compress, call the flush() method to finish the compression process,\n\ -and return what is left in the internal buffers.\n\ -"); + +/* BZ2Decompressor class. 
*/ static PyObject * -BZ2Comp_compress(BZ2CompObject *self, PyObject *args) +decompress(BZ2Decompressor *d, char *data, size_t len) { - Py_buffer pdata; - char *data; - int datasize; - int bufsize = SMALLCHUNK; - PY_LONG_LONG totalout; - PyObject *ret = NULL; - bz_stream *bzs = &self->bzs; - int bzerror; + size_t data_size = 0; + PyObject *result; - if (!PyArg_ParseTuple(args, "y*:compress", &pdata)) - return NULL; - data = pdata.buf; - datasize = pdata.len; + result = PyBytes_FromStringAndSize(NULL, SMALLCHUNK); + if (result == NULL) + return result; + d->bzs.next_in = data; + /* FIXME This is not 64-bit clean - avail_in is an int. */ + d->bzs.avail_in = len; + d->bzs.next_out = PyBytes_AS_STRING(result); + d->bzs.avail_out = PyBytes_GET_SIZE(result); + for (;;) { + char *this_out; + int bzerror; - if (datasize == 0) { - PyBuffer_Release(&pdata); - return PyBytes_FromStringAndSize("", 0); - } - - ACQUIRE_LOCK(self); - if (!self->running) { - PyErr_SetString(PyExc_ValueError, - "this object was already flushed"); - goto error; - } - - ret = PyBytes_FromStringAndSize(NULL, bufsize); - if (!ret) - goto error; - - bzs->next_in = data; - bzs->avail_in = datasize; - bzs->next_out = BUF(ret); - bzs->avail_out = bufsize; - - totalout = BZS_TOTAL_OUT(bzs); - - for (;;) { Py_BEGIN_ALLOW_THREADS - bzerror = BZ2_bzCompress(bzs, BZ_RUN); + this_out = d->bzs.next_out; + bzerror = BZ2_bzDecompress(&d->bzs); + data_size += d->bzs.next_out - this_out; Py_END_ALLOW_THREADS - if (bzerror != BZ_RUN_OK) { - Util_CatchBZ2Error(bzerror); + if (catch_bz2_error(bzerror)) goto error; + if (bzerror == BZ_STREAM_END) { + d->eof = 1; + if (d->bzs.avail_in > 0) { /* Save leftover input to unused_data */ + Py_CLEAR(d->unused_data); + d->unused_data = PyBytes_FromStringAndSize(d->bzs.next_in, + d->bzs.avail_in); + if (d->unused_data == NULL) + goto error; + } + break; } - if (bzs->avail_in == 0) - break; /* no more input data */ - if (bzs->avail_out == 0) { - bufsize = 
Util_NewBufferSize(bufsize); - if (_PyBytes_Resize(&ret, bufsize) < 0) { - BZ2_bzCompressEnd(bzs); + if (d->bzs.avail_in == 0) + break; + if (d->bzs.avail_out == 0) { + if (grow_buffer(&result) < 0) goto error; - } - bzs->next_out = BUF(ret) + (BZS_TOTAL_OUT(bzs) - - totalout); - bzs->avail_out = bufsize - (bzs->next_out - BUF(ret)); + d->bzs.next_out = PyBytes_AS_STRING(result) + data_size; + d->bzs.avail_out = PyBytes_GET_SIZE(result) - data_size; } } - - if (_PyBytes_Resize(&ret, - (Py_ssize_t)(BZS_TOTAL_OUT(bzs) - totalout)) < 0) - goto error; - - RELEASE_LOCK(self); - PyBuffer_Release(&pdata); - return ret; + if (data_size != PyBytes_GET_SIZE(result)) + if (_PyBytes_Resize(&result, data_size) < 0) + goto error; + return result; error: - RELEASE_LOCK(self); - PyBuffer_Release(&pdata); - Py_XDECREF(ret); + Py_XDECREF(result); return NULL; } -PyDoc_STRVAR(BZ2Comp_flush__doc__, -"flush() -> string\n\ -\n\ -Finish the compression process and return what is left in internal buffers.\n\ -You must not use the compressor object after calling this method.\n\ -"); +PyDoc_STRVAR(BZ2Decompressor_decompress__doc__, +"decompress(data) -> bytes\n" +"\n" +"Provide data to the decompressor object. Returns a chunk of\n" +"decompressed data if possible, or b'' otherwise.\n" +"\n" +"Attempting to decompress data after the end of stream is reached\n" +"raises an EOFError. 
Any data found after the end of the stream\n" +"is ignored and saved in the unused_data attribute.\n"); static PyObject * -BZ2Comp_flush(BZ2CompObject *self) +BZ2Decompressor_decompress(BZ2Decompressor *self, PyObject *args) { - int bufsize = SMALLCHUNK; - PyObject *ret = NULL; - bz_stream *bzs = &self->bzs; - PY_LONG_LONG totalout; - int bzerror; + Py_buffer buffer; + PyObject *result = NULL; + + if (!PyArg_ParseTuple(args, "y*:decompress", &buffer)) + return NULL; ACQUIRE_LOCK(self); - if (!self->running) { - PyErr_SetString(PyExc_ValueError, "object was already " - "flushed"); - goto error; - } - self->running = 0; - - ret = PyBytes_FromStringAndSize(NULL, bufsize); - if (!ret) - goto error; - - bzs->next_out = BUF(ret); - bzs->avail_out = bufsize; - - totalout = BZS_TOTAL_OUT(bzs); - - for (;;) { - Py_BEGIN_ALLOW_THREADS - bzerror = BZ2_bzCompress(bzs, BZ_FINISH); - Py_END_ALLOW_THREADS - if (bzerror == BZ_STREAM_END) { - break; - } else if (bzerror != BZ_FINISH_OK) { - Util_CatchBZ2Error(bzerror); - goto error; - } - if (bzs->avail_out == 0) { - bufsize = Util_NewBufferSize(bufsize); - if (_PyBytes_Resize(&ret, bufsize) < 0) - goto error; - bzs->next_out = BUF(ret); - bzs->next_out = BUF(ret) + (BZS_TOTAL_OUT(bzs) - - totalout); - bzs->avail_out = bufsize - (bzs->next_out - BUF(ret)); - } - } - - if (bzs->avail_out != 0) { - if (_PyBytes_Resize(&ret, - (Py_ssize_t)(BZS_TOTAL_OUT(bzs) - totalout)) < 0) - goto error; - } - + if (self->eof) + PyErr_SetString(PyExc_EOFError, "End of stream already reached"); + else + result = decompress(self, buffer.buf, buffer.len); RELEASE_LOCK(self); - return ret; - -error: - RELEASE_LOCK(self); - Py_XDECREF(ret); - return NULL; + PyBuffer_Release(&buffer); + return result; } -static PyMethodDef BZ2Comp_methods[] = { - {"compress", (PyCFunction)BZ2Comp_compress, METH_VARARGS, - BZ2Comp_compress__doc__}, - {"flush", (PyCFunction)BZ2Comp_flush, METH_NOARGS, - BZ2Comp_flush__doc__}, - {NULL, NULL} /* sentinel */ -}; - - -/* 
===================================================================== */ -/* Slot definitions for BZ2Comp_Type. */ - static int -BZ2Comp_init(BZ2CompObject *self, PyObject *args, PyObject *kwargs) -{ - int compresslevel = 9; - int bzerror; - static char *kwlist[] = {"compresslevel", 0}; - - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|i:BZ2Compressor", - kwlist, &compresslevel)) - return -1; - - if (compresslevel < 1 || compresslevel > 9) { - PyErr_SetString(PyExc_ValueError, - "compresslevel must be between 1 and 9"); - goto error; - } - -#ifdef WITH_THREAD - self->lock = PyThread_allocate_lock(); - if (!self->lock) { - PyErr_SetString(PyExc_MemoryError, "unable to allocate lock"); - goto error; - } -#endif - - memset(&self->bzs, 0, sizeof(bz_stream)); - bzerror = BZ2_bzCompressInit(&self->bzs, compresslevel, 0, 0); - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto error; - } - - self->running = 1; - - return 0; -error: -#ifdef WITH_THREAD - if (self->lock) { - PyThread_free_lock(self->lock); - self->lock = NULL; - } -#endif - return -1; -} - -static void -BZ2Comp_dealloc(BZ2CompObject *self) -{ -#ifdef WITH_THREAD - if (self->lock) - PyThread_free_lock(self->lock); -#endif - BZ2_bzCompressEnd(&self->bzs); - Py_TYPE(self)->tp_free((PyObject *)self); -} - - -/* ===================================================================== */ -/* BZ2Comp_Type definition. */ - -PyDoc_STRVAR(BZ2Comp__doc__, -"BZ2Compressor([compresslevel=9]) -> compressor object\n\ -\n\ -Create a new compressor object. This object may be used to compress\n\ -data sequentially. If you want to compress data in one shot, use the\n\ -compress() function instead. 
The compresslevel parameter, if given,\n\ -must be a number between 1 and 9.\n\ -"); - -static PyTypeObject BZ2Comp_Type = { - PyVarObject_HEAD_INIT(NULL, 0) - "bz2.BZ2Compressor", /*tp_name*/ - sizeof(BZ2CompObject), /*tp_basicsize*/ - 0, /*tp_itemsize*/ - (destructor)BZ2Comp_dealloc, /*tp_dealloc*/ - 0, /*tp_print*/ - 0, /*tp_getattr*/ - 0, /*tp_setattr*/ - 0, /*tp_reserved*/ - 0, /*tp_repr*/ - 0, /*tp_as_number*/ - 0, /*tp_as_sequence*/ - 0, /*tp_as_mapping*/ - 0, /*tp_hash*/ - 0, /*tp_call*/ - 0, /*tp_str*/ - PyObject_GenericGetAttr,/*tp_getattro*/ - PyObject_GenericSetAttr,/*tp_setattro*/ - 0, /*tp_as_buffer*/ - Py_TPFLAGS_DEFAULT|Py_TPFLAGS_BASETYPE, /*tp_flags*/ - BZ2Comp__doc__, /*tp_doc*/ - 0, /*tp_traverse*/ - 0, /*tp_clear*/ - 0, /*tp_richcompare*/ - 0, /*tp_weaklistoffset*/ - 0, /*tp_iter*/ - 0, /*tp_iternext*/ - BZ2Comp_methods, /*tp_methods*/ - 0, /*tp_members*/ - 0, /*tp_getset*/ - 0, /*tp_base*/ - 0, /*tp_dict*/ - 0, /*tp_descr_get*/ - 0, /*tp_descr_set*/ - 0, /*tp_dictoffset*/ - (initproc)BZ2Comp_init, /*tp_init*/ - PyType_GenericAlloc, /*tp_alloc*/ - PyType_GenericNew, /*tp_new*/ - PyObject_Free, /*tp_free*/ - 0, /*tp_is_gc*/ -}; - - -/* ===================================================================== */ -/* Members of BZ2Decomp. */ - -#undef OFF -#define OFF(x) offsetof(BZ2DecompObject, x) - -static PyMemberDef BZ2Decomp_members[] = { - {"unused_data", T_OBJECT, OFF(unused_data), READONLY}, - {NULL} /* Sentinel */ -}; - - -/* ===================================================================== */ -/* Methods of BZ2Decomp. */ - -PyDoc_STRVAR(BZ2Decomp_decompress__doc__, -"decompress(data) -> string\n\ -\n\ -Provide more data to the decompressor object. It will return chunks\n\ -of decompressed data whenever possible. If you try to decompress data\n\ -after the end of stream is found, EOFError will be raised. 
If any data\n\ -was found after the end of stream, it'll be ignored and saved in\n\ -unused_data attribute.\n\ -"); - -static PyObject * -BZ2Decomp_decompress(BZ2DecompObject *self, PyObject *args) -{ - Py_buffer pdata; - char *data; - int datasize; - int bufsize = SMALLCHUNK; - PY_LONG_LONG totalout; - PyObject *ret = NULL; - bz_stream *bzs = &self->bzs; - int bzerror; - - if (!PyArg_ParseTuple(args, "y*:decompress", &pdata)) - return NULL; - data = pdata.buf; - datasize = pdata.len; - - ACQUIRE_LOCK(self); - if (!self->running) { - PyErr_SetString(PyExc_EOFError, "end of stream was " - "already found"); - goto error; - } - - ret = PyBytes_FromStringAndSize(NULL, bufsize); - if (!ret) - goto error; - - bzs->next_in = data; - bzs->avail_in = datasize; - bzs->next_out = BUF(ret); - bzs->avail_out = bufsize; - - totalout = BZS_TOTAL_OUT(bzs); - - for (;;) { - Py_BEGIN_ALLOW_THREADS - bzerror = BZ2_bzDecompress(bzs); - Py_END_ALLOW_THREADS - if (bzerror == BZ_STREAM_END) { - if (bzs->avail_in != 0) { - Py_DECREF(self->unused_data); - self->unused_data = - PyBytes_FromStringAndSize(bzs->next_in, - bzs->avail_in); - } - self->running = 0; - break; - } - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - goto error; - } - if (bzs->avail_in == 0) - break; /* no more input data */ - if (bzs->avail_out == 0) { - bufsize = Util_NewBufferSize(bufsize); - if (_PyBytes_Resize(&ret, bufsize) < 0) { - BZ2_bzDecompressEnd(bzs); - goto error; - } - bzs->next_out = BUF(ret); - bzs->next_out = BUF(ret) + (BZS_TOTAL_OUT(bzs) - - totalout); - bzs->avail_out = bufsize - (bzs->next_out - BUF(ret)); - } - } - - if (bzs->avail_out != 0) { - if (_PyBytes_Resize(&ret, - (Py_ssize_t)(BZS_TOTAL_OUT(bzs) - totalout)) < 0) - goto error; - } - - RELEASE_LOCK(self); - PyBuffer_Release(&pdata); - return ret; - -error: - RELEASE_LOCK(self); - PyBuffer_Release(&pdata); - Py_XDECREF(ret); - return NULL; -} - -static PyMethodDef BZ2Decomp_methods[] = { - {"decompress", 
(PyCFunction)BZ2Decomp_decompress, METH_VARARGS, BZ2Decomp_decompress__doc__}, - {NULL, NULL} /* sentinel */ -}; - - -/* ===================================================================== */ -/* Slot definitions for BZ2Decomp_Type. */ - -static int -BZ2Decomp_init(BZ2DecompObject *self, PyObject *args, PyObject *kwargs) +BZ2Decompressor_init(BZ2Decompressor *self, PyObject *args, PyObject *kwargs) { int bzerror; @@ -1825,325 +438,120 @@ #ifdef WITH_THREAD self->lock = PyThread_allocate_lock(); - if (!self->lock) { - PyErr_SetString(PyExc_MemoryError, "unable to allocate lock"); - goto error; + if (self->lock == NULL) { + PyErr_SetString(PyExc_MemoryError, "Unable to allocate lock"); + return -1; } #endif self->unused_data = PyBytes_FromStringAndSize("", 0); - if (!self->unused_data) + if (self->unused_data == NULL) goto error; - memset(&self->bzs, 0, sizeof(bz_stream)); bzerror = BZ2_bzDecompressInit(&self->bzs, 0, 0); - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); + if (catch_bz2_error(bzerror)) goto error; - } - - self->running = 1; return 0; error: + Py_CLEAR(self->unused_data); #ifdef WITH_THREAD - if (self->lock) { - PyThread_free_lock(self->lock); - self->lock = NULL; - } + PyThread_free_lock(self->lock); + self->lock = NULL; #endif - Py_CLEAR(self->unused_data); return -1; } static void -BZ2Decomp_dealloc(BZ2DecompObject *self) +BZ2Decompressor_dealloc(BZ2Decompressor *self) { + BZ2_bzDecompressEnd(&self->bzs); + Py_CLEAR(self->unused_data); #ifdef WITH_THREAD - if (self->lock) + if (self->lock != NULL) PyThread_free_lock(self->lock); #endif - Py_XDECREF(self->unused_data); - BZ2_bzDecompressEnd(&self->bzs); Py_TYPE(self)->tp_free((PyObject *)self); } - -/* ===================================================================== */ -/* BZ2Decomp_Type definition. */ - -PyDoc_STRVAR(BZ2Decomp__doc__, -"BZ2Decompressor() -> decompressor object\n\ -\n\ -Create a new decompressor object. This object may be used to decompress\n\ -data sequentially. 
If you want to decompress data in one shot, use the\n\ -decompress() function instead.\n\ -"); - -static PyTypeObject BZ2Decomp_Type = { - PyVarObject_HEAD_INIT(NULL, 0) - "bz2.BZ2Decompressor", /*tp_name*/ - sizeof(BZ2DecompObject), /*tp_basicsize*/ - 0, /*tp_itemsize*/ - (destructor)BZ2Decomp_dealloc, /*tp_dealloc*/ - 0, /*tp_print*/ - 0, /*tp_getattr*/ - 0, /*tp_setattr*/ - 0, /*tp_reserved*/ - 0, /*tp_repr*/ - 0, /*tp_as_number*/ - 0, /*tp_as_sequence*/ - 0, /*tp_as_mapping*/ - 0, /*tp_hash*/ - 0, /*tp_call*/ - 0, /*tp_str*/ - PyObject_GenericGetAttr,/*tp_getattro*/ - PyObject_GenericSetAttr,/*tp_setattro*/ - 0, /*tp_as_buffer*/ - Py_TPFLAGS_DEFAULT|Py_TPFLAGS_BASETYPE, /*tp_flags*/ - BZ2Decomp__doc__, /*tp_doc*/ - 0, /*tp_traverse*/ - 0, /*tp_clear*/ - 0, /*tp_richcompare*/ - 0, /*tp_weaklistoffset*/ - 0, /*tp_iter*/ - 0, /*tp_iternext*/ - BZ2Decomp_methods, /*tp_methods*/ - BZ2Decomp_members, /*tp_members*/ - 0, /*tp_getset*/ - 0, /*tp_base*/ - 0, /*tp_dict*/ - 0, /*tp_descr_get*/ - 0, /*tp_descr_set*/ - 0, /*tp_dictoffset*/ - (initproc)BZ2Decomp_init, /*tp_init*/ - PyType_GenericAlloc, /*tp_alloc*/ - PyType_GenericNew, /*tp_new*/ - PyObject_Free, /*tp_free*/ - 0, /*tp_is_gc*/ +static PyMethodDef BZ2Decompressor_methods[] = { + {"decompress", (PyCFunction)BZ2Decompressor_decompress, METH_VARARGS, + BZ2Decompressor_decompress__doc__}, + {NULL} }; +PyDoc_STRVAR(BZ2Decompressor_eof__doc__, +"True if the end-of-stream marker has been reached."); -/* ===================================================================== */ -/* Module functions. */ +PyDoc_STRVAR(BZ2Decompressor_unused_data__doc__, +"Data found after the end of the compressed stream."); -PyDoc_STRVAR(bz2_compress__doc__, -"compress(data [, compresslevel=9]) -> string\n\ -\n\ -Compress data in one shot. If you want to compress data sequentially,\n\ -use an instance of BZ2Compressor instead. 
The compresslevel parameter, if\n\ -given, must be a number between 1 and 9.\n\ -"); - -static PyObject * -bz2_compress(PyObject *self, PyObject *args, PyObject *kwargs) -{ - int compresslevel=9; - Py_buffer pdata; - char *data; - int datasize; - int bufsize; - PyObject *ret = NULL; - bz_stream _bzs; - bz_stream *bzs = &_bzs; - int bzerror; - static char *kwlist[] = {"data", "compresslevel", 0}; - - if (!PyArg_ParseTupleAndKeywords(args, kwargs, "y*|i", - kwlist, &pdata, - &compresslevel)) - return NULL; - data = pdata.buf; - datasize = pdata.len; - - if (compresslevel < 1 || compresslevel > 9) { - PyErr_SetString(PyExc_ValueError, - "compresslevel must be between 1 and 9"); - PyBuffer_Release(&pdata); - return NULL; - } - - /* Conforming to bz2 manual, this is large enough to fit compressed - * data in one shot. We will check it later anyway. */ - bufsize = datasize + (datasize/100+1) + 600; - - ret = PyBytes_FromStringAndSize(NULL, bufsize); - if (!ret) { - PyBuffer_Release(&pdata); - return NULL; - } - - memset(bzs, 0, sizeof(bz_stream)); - - bzs->next_in = data; - bzs->avail_in = datasize; - bzs->next_out = BUF(ret); - bzs->avail_out = bufsize; - - bzerror = BZ2_bzCompressInit(bzs, compresslevel, 0, 0); - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - PyBuffer_Release(&pdata); - Py_DECREF(ret); - return NULL; - } - - for (;;) { - Py_BEGIN_ALLOW_THREADS - bzerror = BZ2_bzCompress(bzs, BZ_FINISH); - Py_END_ALLOW_THREADS - if (bzerror == BZ_STREAM_END) { - break; - } else if (bzerror != BZ_FINISH_OK) { - BZ2_bzCompressEnd(bzs); - Util_CatchBZ2Error(bzerror); - PyBuffer_Release(&pdata); - Py_DECREF(ret); - return NULL; - } - if (bzs->avail_out == 0) { - bufsize = Util_NewBufferSize(bufsize); - if (_PyBytes_Resize(&ret, bufsize) < 0) { - BZ2_bzCompressEnd(bzs); - PyBuffer_Release(&pdata); - return NULL; - } - bzs->next_out = BUF(ret) + BZS_TOTAL_OUT(bzs); - bzs->avail_out = bufsize - (bzs->next_out - BUF(ret)); - } - } - - if (bzs->avail_out != 0) { - if 
(_PyBytes_Resize(&ret, (Py_ssize_t)BZS_TOTAL_OUT(bzs)) < 0) { - ret = NULL; - } - } - BZ2_bzCompressEnd(bzs); - - PyBuffer_Release(&pdata); - return ret; -} - -PyDoc_STRVAR(bz2_decompress__doc__, -"decompress(data) -> decompressed data\n\ -\n\ -Decompress data in one shot. If you want to decompress data sequentially,\n\ -use an instance of BZ2Decompressor instead.\n\ -"); - -static PyObject * -bz2_decompress(PyObject *self, PyObject *args) -{ - Py_buffer pdata; - char *data; - int datasize; - int bufsize = SMALLCHUNK; - PyObject *ret; - bz_stream _bzs; - bz_stream *bzs = &_bzs; - int bzerror; - - if (!PyArg_ParseTuple(args, "y*:decompress", &pdata)) - return NULL; - data = pdata.buf; - datasize = pdata.len; - - if (datasize == 0) { - PyBuffer_Release(&pdata); - return PyBytes_FromStringAndSize("", 0); - } - - ret = PyBytes_FromStringAndSize(NULL, bufsize); - if (!ret) { - PyBuffer_Release(&pdata); - return NULL; - } - - memset(bzs, 0, sizeof(bz_stream)); - - bzs->next_in = data; - bzs->avail_in = datasize; - bzs->next_out = BUF(ret); - bzs->avail_out = bufsize; - - bzerror = BZ2_bzDecompressInit(bzs, 0, 0); - if (bzerror != BZ_OK) { - Util_CatchBZ2Error(bzerror); - Py_DECREF(ret); - PyBuffer_Release(&pdata); - return NULL; - } - - for (;;) { - Py_BEGIN_ALLOW_THREADS - bzerror = BZ2_bzDecompress(bzs); - Py_END_ALLOW_THREADS - if (bzerror == BZ_STREAM_END) { - break; - } else if (bzerror != BZ_OK) { - BZ2_bzDecompressEnd(bzs); - Util_CatchBZ2Error(bzerror); - PyBuffer_Release(&pdata); - Py_DECREF(ret); - return NULL; - } - if (bzs->avail_in == 0) { - BZ2_bzDecompressEnd(bzs); - PyErr_SetString(PyExc_ValueError, - "couldn't find end of stream"); - PyBuffer_Release(&pdata); - Py_DECREF(ret); - return NULL; - } - if (bzs->avail_out == 0) { - bufsize = Util_NewBufferSize(bufsize); - if (_PyBytes_Resize(&ret, bufsize) < 0) { - BZ2_bzDecompressEnd(bzs); - PyBuffer_Release(&pdata); - return NULL; - } - bzs->next_out = BUF(ret) + BZS_TOTAL_OUT(bzs); - bzs->avail_out = 
bufsize - (bzs->next_out - BUF(ret)); - } - } - - if (bzs->avail_out != 0) { - if (_PyBytes_Resize(&ret, (Py_ssize_t)BZS_TOTAL_OUT(bzs)) < 0) { - ret = NULL; - } - } - BZ2_bzDecompressEnd(bzs); - PyBuffer_Release(&pdata); - - return ret; -} - -static PyMethodDef bz2_methods[] = { - {"compress", (PyCFunction) bz2_compress, METH_VARARGS|METH_KEYWORDS, - bz2_compress__doc__}, - {"decompress", (PyCFunction) bz2_decompress, METH_VARARGS, - bz2_decompress__doc__}, - {NULL, NULL} /* sentinel */ +static PyMemberDef BZ2Decompressor_members[] = { + {"eof", T_BOOL, offsetof(BZ2Decompressor, eof), + READONLY, BZ2Decompressor_eof__doc__}, + {"unused_data", T_OBJECT_EX, offsetof(BZ2Decompressor, unused_data), + READONLY, BZ2Decompressor_unused_data__doc__}, + {NULL} }; -/* ===================================================================== */ -/* Initialization function. */ +PyDoc_STRVAR(BZ2Decompressor__doc__, +"BZ2Decompressor()\n" +"\n" +"Create a decompressor object for decompressing data incrementally.\n" +"\n" +"For one-shot decompression, use the decompress() function instead.\n"); -PyDoc_STRVAR(bz2__doc__, -"The python bz2 module provides a comprehensive interface for\n\ -the bz2 compression library. 
It implements a complete file\n\ -interface, one shot (de)compression functions, and types for\n\ -sequential (de)compression.\n\ -"); +static PyTypeObject BZ2Decompressor_Type = { + PyVarObject_HEAD_INIT(NULL, 0) + "_bz2.BZ2Decompressor", /* tp_name */ + sizeof(BZ2Decompressor), /* tp_basicsize */ + 0, /* tp_itemsize */ + (destructor)BZ2Decompressor_dealloc,/* tp_dealloc */ + 0, /* tp_print */ + 0, /* tp_getattr */ + 0, /* tp_setattr */ + 0, /* tp_reserved */ + 0, /* tp_repr */ + 0, /* tp_as_number */ + 0, /* tp_as_sequence */ + 0, /* tp_as_mapping */ + 0, /* tp_hash */ + 0, /* tp_call */ + 0, /* tp_str */ + 0, /* tp_getattro */ + 0, /* tp_setattro */ + 0, /* tp_as_buffer */ + Py_TPFLAGS_DEFAULT, /* tp_flags */ + BZ2Decompressor__doc__, /* tp_doc */ + 0, /* tp_traverse */ + 0, /* tp_clear */ + 0, /* tp_richcompare */ + 0, /* tp_weaklistoffset */ + 0, /* tp_iter */ + 0, /* tp_iternext */ + BZ2Decompressor_methods, /* tp_methods */ + BZ2Decompressor_members, /* tp_members */ + 0, /* tp_getset */ + 0, /* tp_base */ + 0, /* tp_dict */ + 0, /* tp_descr_get */ + 0, /* tp_descr_set */ + 0, /* tp_dictoffset */ + (initproc)BZ2Decompressor_init, /* tp_init */ + 0, /* tp_alloc */ + PyType_GenericNew, /* tp_new */ +}; -static struct PyModuleDef bz2module = { +/* Module initialization. 
*/ + +static struct PyModuleDef _bz2module = { PyModuleDef_HEAD_INIT, - "bz2", - bz2__doc__, + "_bz2", + NULL, -1, - bz2_methods, + NULL, NULL, NULL, NULL, @@ -2151,30 +559,25 @@ }; PyMODINIT_FUNC -PyInit_bz2(void) +PyInit__bz2(void) { PyObject *m; - if (PyType_Ready(&BZ2File_Type) < 0) + if (PyType_Ready(&BZ2Compressor_Type) < 0) return NULL; - if (PyType_Ready(&BZ2Comp_Type) < 0) - return NULL; - if (PyType_Ready(&BZ2Decomp_Type) < 0) + if (PyType_Ready(&BZ2Decompressor_Type) < 0) return NULL; - m = PyModule_Create(&bz2module); + m = PyModule_Create(&_bz2module); if (m == NULL) return NULL; - PyModule_AddObject(m, "__author__", PyUnicode_FromString(__author__)); + Py_INCREF(&BZ2Compressor_Type); + PyModule_AddObject(m, "BZ2Compressor", (PyObject *)&BZ2Compressor_Type); - Py_INCREF(&BZ2File_Type); - PyModule_AddObject(m, "BZ2File", (PyObject *)&BZ2File_Type); + Py_INCREF(&BZ2Decompressor_Type); + PyModule_AddObject(m, "BZ2Decompressor", + (PyObject *)&BZ2Decompressor_Type); - Py_INCREF(&BZ2Comp_Type); - PyModule_AddObject(m, "BZ2Compressor", (PyObject *)&BZ2Comp_Type); - - Py_INCREF(&BZ2Decomp_Type); - PyModule_AddObject(m, "BZ2Decompressor", (PyObject *)&BZ2Decomp_Type); return m; } diff --git a/PCbuild/bz2.vcproj b/PCbuild/_bz2.vcproj rename from PCbuild/bz2.vcproj rename to PCbuild/_bz2.vcproj --- a/PCbuild/bz2.vcproj +++ b/PCbuild/_bz2.vcproj @@ -2,7 +2,7 @@ diff --git a/PCbuild/pcbuild.sln b/PCbuild/pcbuild.sln --- a/PCbuild/pcbuild.sln +++ b/PCbuild/pcbuild.sln @@ -87,7 +87,7 @@ {CF7AC3D1-E2DF-41D2-BEA6-1E2556CDEA26} = {CF7AC3D1-E2DF-41D2-BEA6-1E2556CDEA26} EndProjectSection EndProject -Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "bz2", "bz2.vcproj", "{73FCD2BD-F133-46B7-8EC1-144CD82A59D5}" +Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "_bz2", "_bz2.vcproj", "{73FCD2BD-F133-46B7-8EC1-144CD82A59D5}" ProjectSection(ProjectDependencies) = postProject {CF7AC3D1-E2DF-41D2-BEA6-1E2556CDEA26} = {CF7AC3D1-E2DF-41D2-BEA6-1E2556CDEA26} EndProjectSection 
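Editorial note: the read-only eof and unused_data members that this changeset adds to BZ2Decompressor can be exercised from Python roughly as follows. This is only a sketch. It assumes the public bz2 module re-exports the BZ2Decompressor type from _bz2, and the helper name split_stream and its chunk_size parameter are made up for illustration; neither appears in the patch.

```python
import bz2


def split_stream(blob, chunk_size=64):
    """Decompress one bz2 stream from blob and return (data, leftover).

    Hypothetical helper: feeds the input in small chunks, stops once the
    decompressor reports end-of-stream, and returns whatever followed it.
    """
    decomp = bz2.BZ2Decompressor()
    pieces = []
    pos = 0
    # Stop feeding as soon as eof becomes True; calling decompress()
    # after the end of the stream raises EOFError.
    while not decomp.eof and pos < len(blob):
        pieces.append(decomp.decompress(blob[pos:pos + chunk_size]))
        pos += chunk_size
    if not decomp.eof:
        raise ValueError("end-of-stream marker not found")
    # unused_data holds the tail of the chunk in which the end of the
    # stream was found; anything not yet fed is appended after it.
    leftover = decomp.unused_data + blob[pos:]
    return b"".join(pieces), leftover


if __name__ == "__main__":
    payload = b"hello bz2 " * 100
    data, rest = split_stream(bz2.compress(payload) + b"trailing bytes")
    print(len(data), rest)
```

Checking eof before each call avoids having to catch EOFError just to detect the end of the stream.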
diff --git a/PCbuild/readme.txt b/PCbuild/readme.txt --- a/PCbuild/readme.txt +++ b/PCbuild/readme.txt @@ -112,9 +112,9 @@ pre-built Tcl/Tk in either ..\..\tcltk for 32-bit or ..\..\tcltk64 for 64-bit (relative to this directory). See below for instructions to build Tcl/Tk. -bz2 - Python wrapper for the libbz2 compression library. Homepage - http://sources.redhat.com/bzip2/ +_bz2 + Python wrapper for the libbzip2 compression library. Homepage + http://www.bzip.org/ Download the source from the python.org copy into the dist directory: diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -1233,11 +1233,11 @@ bz2_extra_link_args = ('-Wl,-search_paths_first',) else: bz2_extra_link_args = () - exts.append( Extension('bz2', ['bz2module.c'], + exts.append( Extension('_bz2', ['_bz2module.c'], libraries = ['bz2'], extra_link_args = bz2_extra_link_args) ) else: - missing.append('bz2') + missing.append('_bz2') # Interface to the Expat XML parser # -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 17:09:11 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sun, 03 Apr 2011 17:09:11 +0200 Subject: [Python-checkins] cpython: Fix whitespace Message-ID: http://hg.python.org/cpython/rev/ff105faf1bac changeset: 69113:ff105faf1bac user: Antoine Pitrou date: Sun Apr 03 17:08:49 2011 +0200 summary: Fix whitespace files: Lib/bz2.py | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Lib/bz2.py b/Lib/bz2.py --- a/Lib/bz2.py +++ b/Lib/bz2.py @@ -105,7 +105,7 @@ self._fp.write(self._compressor.flush()) self._compressor = None finally: - try: + try: if self._closefp: self._fp.close() finally: @@ -251,7 +251,7 @@ def readinto(self, b): """Read up to len(b) bytes into b. - + Returns the number of bytes read (0 for EOF). 
""" with self._lock: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 18:20:19 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sun, 03 Apr 2011 18:20:19 +0200 Subject: [Python-checkins] cpython (3.2): Issue #11746: Fix SSLContext.load_cert_chain() to accept elliptic curve private Message-ID: http://hg.python.org/cpython/rev/88ed3de28520 changeset: 69114:88ed3de28520 branch: 3.2 parent: 69109:1fd736395df3 user: Antoine Pitrou date: Sun Apr 03 18:15:34 2011 +0200 summary: Issue #11746: Fix SSLContext.load_cert_chain() to accept elliptic curve private keys. files: Misc/NEWS | 3 +++ Modules/_ssl.c | 2 +- 2 files changed, 4 insertions(+), 1 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -49,6 +49,9 @@ Library ------- +- Issue #11746: Fix SSLContext.load_cert_chain() to accept elliptic curve + private keys. + - sys.getfilesystemencoding() raises a RuntimeError if initfsencoding() was not called yet: detect bootstrap (startup) issues earlier. diff --git a/Modules/_ssl.c b/Modules/_ssl.c --- a/Modules/_ssl.c +++ b/Modules/_ssl.c @@ -1623,7 +1623,7 @@ goto error; } PySSL_BEGIN_ALLOW_THREADS - r = SSL_CTX_use_RSAPrivateKey_file(self->ctx, + r = SSL_CTX_use_PrivateKey_file(self->ctx, PyBytes_AS_STRING(keyfile ? 
keyfile_bytes : certfile_bytes), SSL_FILETYPE_PEM); PySSL_END_ALLOW_THREADS -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 18:20:20 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sun, 03 Apr 2011 18:20:20 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge fix for issue #11746 Message-ID: http://hg.python.org/cpython/rev/c11e05a60d36 changeset: 69115:c11e05a60d36 parent: 69113:ff105faf1bac parent: 69114:88ed3de28520 user: Antoine Pitrou date: Sun Apr 03 18:16:50 2011 +0200 summary: Merge fix for issue #11746 files: Misc/NEWS | 3 +++ Modules/_ssl.c | 2 +- 2 files changed, 4 insertions(+), 1 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -87,6 +87,9 @@ Library ------- +- Issue #11746: Fix SSLContext.load_cert_chain() to accept elliptic curve + private keys. + - Issue #5863: Rewrite BZ2File in pure Python, and allow it to accept file-like objects using a new ``fileobj`` constructor argument. Patch by Nadeem Vawda. diff --git a/Modules/_ssl.c b/Modules/_ssl.c --- a/Modules/_ssl.c +++ b/Modules/_ssl.c @@ -1620,7 +1620,7 @@ goto error; } PySSL_BEGIN_ALLOW_THREADS - r = SSL_CTX_use_RSAPrivateKey_file(self->ctx, + r = SSL_CTX_use_PrivateKey_file(self->ctx, PyBytes_AS_STRING(keyfile ? 
keyfile_bytes : certfile_bytes), SSL_FILETYPE_PEM); PySSL_END_ALLOW_THREADS -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 18:29:49 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sun, 03 Apr 2011 18:29:49 +0200 Subject: [Python-checkins] cpython: Issue #11748: try to fix sporadic failures in test_ftplib Message-ID: http://hg.python.org/cpython/rev/8a2d848244a2 changeset: 69116:8a2d848244a2 user: Antoine Pitrou date: Sun Apr 03 18:29:45 2011 +0200 summary: Issue #11748: try to fix sporadic failures in test_ftplib files: Lib/test/test_ftplib.py | 22 ++++++++++++++++------ 1 files changed, 16 insertions(+), 6 deletions(-) diff --git a/Lib/test/test_ftplib.py b/Lib/test/test_ftplib.py --- a/Lib/test/test_ftplib.py +++ b/Lib/test/test_ftplib.py @@ -611,16 +611,26 @@ def test_source_address(self): self.client.quit() port = support.find_unused_port() - self.client.connect(self.server.host, self.server.port, - source_address=(HOST, port)) - self.assertEqual(self.client.sock.getsockname()[1], port) - self.client.quit() + try: + self.client.connect(self.server.host, self.server.port, + source_address=(HOST, port)) + self.assertEqual(self.client.sock.getsockname()[1], port) + self.client.quit() + except IOError as e: + if e.errno == errno.EADDRINUSE: + self.skipTest("couldn't bind to port %d" % port) + raise def test_source_address_passive_connection(self): port = support.find_unused_port() self.client.source_address = (HOST, port) - with self.client.transfercmd('list') as sock: - self.assertEqual(sock.getsockname()[1], port) + try: + with self.client.transfercmd('list') as sock: + self.assertEqual(sock.getsockname()[1], port) + except IOError as e: + if e.errno == errno.EADDRINUSE: + self.skipTest("couldn't bind to port %d" % port) + raise def test_parse257(self): self.assertEqual(ftplib.parse257('257 "/foo/bar"'), '/foo/bar') -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun 
Apr 3 18:46:26 2011 From: python-checkins at python.org (victor.stinner) Date: Sun, 03 Apr 2011 18:46:26 +0200 Subject: [Python-checkins] cpython: test_faulthandler: fix regex on the check_dump_traceback_threads() traceback Message-ID: http://hg.python.org/cpython/rev/cb169f61785b changeset: 69117:cb169f61785b user: Victor Stinner date: Sun Apr 03 18:41:22 2011 +0200 summary: test_faulthandler: fix regex on the check_dump_traceback_threads() traceback The traceback may contain "_is_owned": Thread 0x40962b90: File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 220 in _is_owned File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 227 in wait File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 421 in wait File "", line 23 in run File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 735 in _bootstrap_inner File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 708 in _bootstrap Current thread XXX: File "", line 10 in dump File "", line 28 in files: Lib/test/test_faulthandler.py | 5 ++--- 1 files changed, 2 insertions(+), 3 deletions(-) diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py --- a/Lib/test/test_faulthandler.py +++ b/Lib/test/test_faulthandler.py @@ -325,9 +325,8 @@ lineno = 10 regex = """ ^Thread 0x[0-9a-f]+: -(?: File ".*threading.py", line [0-9]+ in wait -)? 
File ".*threading.py", line [0-9]+ in wait - File "", line 23 in run +(?: File ".*threading.py", line [0-9]+ in [_a-z]+ +){{1,3}} File "", line 23 in run File ".*threading.py", line [0-9]+ in _bootstrap_inner File ".*threading.py", line [0-9]+ in _bootstrap -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 18:46:28 2011 From: python-checkins at python.org (victor.stinner) Date: Sun, 03 Apr 2011 18:46:28 +0200 Subject: [Python-checkins] cpython: test_faulthandler: improve the test on dump_tracebacks_later(cancel=True) Message-ID: http://hg.python.org/cpython/rev/2d0a855ce30a changeset: 69118:2d0a855ce30a user: Victor Stinner date: Sun Apr 03 18:45:42 2011 +0200 summary: test_faulthandler: improve the test on dump_tracebacks_later(cancel=True) files: Lib/test/test_faulthandler.py | 33 ++++++++++------------ 1 files changed, 15 insertions(+), 18 deletions(-) diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py --- a/Lib/test/test_faulthandler.py +++ b/Lib/test/test_faulthandler.py @@ -358,25 +358,19 @@ import time def func(repeat, cancel, timeout): + if cancel: + faulthandler.cancel_dump_tracebacks_later() + pause = timeout * 2.5 # on Windows XP, b-a gives 1.249931 after sleep(1.25) min_pause = pause * 0.9 a = time.time() time.sleep(pause) + b = time.time() faulthandler.cancel_dump_tracebacks_later() - b = time.time() # Check that sleep() was not interrupted assert (b - a) >= min_pause, "{{}} < {{}}".format(b - a, min_pause) - if cancel: - pause = timeout * 1.5 - min_pause = pause * 0.9 - a = time.time() - time.sleep(pause) - b = time.time() - # Check that sleep() was not interrupted - assert (b - a) >= min_pause, "{{}} < {{}}".format(b - a, min_pause) - timeout = {timeout} repeat = {repeat} cancel = {cancel} @@ -400,13 +394,16 @@ trace, exitcode = self.get_output(code, filename) trace = '\n'.join(trace) - if repeat: - count = 2 + if not cancel: + if repeat: + count = 2 + else: + count = 1 + header = 
'Thread 0x[0-9a-f]+:\n' + regex = expected_traceback(12, 27, header, count=count) + self.assertRegex(trace, regex) else: - count = 1 - header = 'Thread 0x[0-9a-f]+:\n' - regex = expected_traceback(9, 33, header, count=count) - self.assertRegex(trace, regex) + self.assertEqual(trace, '') self.assertEqual(exitcode, 0) @unittest.skipIf(not hasattr(faulthandler, 'dump_tracebacks_later'), @@ -425,8 +422,8 @@ def test_dump_tracebacks_later_repeat(self): self.check_dump_tracebacks_later(repeat=True) - def test_dump_tracebacks_later_repeat_cancel(self): - self.check_dump_tracebacks_later(repeat=True, cancel=True) + def test_dump_tracebacks_later_cancel(self): + self.check_dump_tracebacks_later(cancel=True) def test_dump_tracebacks_later_file(self): self.check_dump_tracebacks_later(file=True) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 3 23:46:44 2011 From: python-checkins at python.org (victor.stinner) Date: Sun, 03 Apr 2011 23:46:44 +0200 Subject: [Python-checkins] cpython: Issue #11727, issue #11753, issue #11755: disable regrtest timeout Message-ID: http://hg.python.org/cpython/rev/394f0ea0d29e changeset: 69119:394f0ea0d29e user: Victor Stinner date: Sun Apr 03 23:46:42 2011 +0200 summary: Issue #11727, issue #11753, issue #11755: disable regrtest timeout Disable regrtest timeout until #11753 and #11755 are fixed files: Lib/test/regrtest.py | 2 +- Misc/NEWS | 5 ++--- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -240,7 +240,7 @@ findleaks=False, use_resources=None, trace=False, coverdir='coverage', runleaks=False, huntrleaks=False, verbose2=False, print_slow=False, random_seed=None, use_mp=None, verbose3=False, forever=False, - header=False, timeout=30*60): + header=False, timeout=None): """Execute a test suite. 
This also parses command-line options and modifies its behavior diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -370,9 +370,8 @@ Tests ----- -- Issue #11727: If a test takes more than 30 minutes, regrtest dumps the - traceback of all threads and exits. Use --timeout option to change the - default timeout or to disable it. +- Issue #11727: Add a --timeout option to regrtest: if a test takes more than + TIMEOUT seconds, dumps the traceback of all threads and exits. - Issue #11653: fix -W with -j in regrtest. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 00:13:06 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 00:13:06 +0200 Subject: [Python-checkins] cpython: Issue #11688: Add sqlite3.Connection.set_trace_callback(). Patch by Torsten Message-ID: http://hg.python.org/cpython/rev/575ee55081dc changeset: 69120:575ee55081dc user: Antoine Pitrou date: Mon Apr 04 00:12:04 2011 +0200 summary: Issue #11688: Add sqlite3.Connection.set_trace_callback(). Patch by Torsten Landschoff. files: Doc/library/sqlite3.rst | 16 ++++++ Lib/sqlite3/test/hooks.py | 48 ++++++++++++++++++- Misc/NEWS | 3 + Modules/_sqlite/connection.c | 62 ++++++++++++++++++++++++ 4 files changed, 128 insertions(+), 1 deletions(-) diff --git a/Doc/library/sqlite3.rst b/Doc/library/sqlite3.rst --- a/Doc/library/sqlite3.rst +++ b/Doc/library/sqlite3.rst @@ -369,6 +369,22 @@ method with :const:`None` for *handler*. +.. method:: Connection.set_trace_callback(trace_callback) + + Registers *trace_callback* to be called for each SQL statement that is + actually executed by the SQLite backend. + + The only argument passed to the callback is the statement (as string) that + is being executed. The return value of the callback is ignored. Note that + the backend does not only run statements passed to the :meth:`Cursor.execute` + methods. 
Other sources include the transaction management of the Python + module and the execution of triggers defined in the current database. + + Passing :const:`None` as *trace_callback* will disable the trace callback. + + .. versionadded:: 3.3 + + .. method:: Connection.enable_load_extension(enabled) This routine allows/disallows the SQLite engine to load SQLite extensions diff --git a/Lib/sqlite3/test/hooks.py b/Lib/sqlite3/test/hooks.py --- a/Lib/sqlite3/test/hooks.py +++ b/Lib/sqlite3/test/hooks.py @@ -175,10 +175,56 @@ con.execute("select 1 union select 2 union select 3").fetchall() self.assertEqual(action, 0, "progress handler was not cleared") +class TraceCallbackTests(unittest.TestCase): + def CheckTraceCallbackUsed(self): + """ + Test that the trace callback is invoked once it is set. + """ + con = sqlite.connect(":memory:") + traced_statements = [] + def trace(statement): + traced_statements.append(statement) + con.set_trace_callback(trace) + con.execute("create table foo(a, b)") + self.assertTrue(traced_statements) + self.assertTrue(any("create table foo" in stmt for stmt in traced_statements)) + + def CheckClearTraceCallback(self): + """ + Test that setting the trace callback to None clears the previously set callback. + """ + con = sqlite.connect(":memory:") + traced_statements = [] + def trace(statement): + traced_statements.append(statement) + con.set_trace_callback(trace) + con.set_trace_callback(None) + con.execute("create table foo(a, b)") + self.assertFalse(traced_statements, "trace callback was not cleared") + + def CheckUnicodeContent(self): + """ + Test that the statement can contain unicode literals. 
+ """ + unicode_value = '\xf6\xe4\xfc\xd6\xc4\xdc\xdf\u20ac' + con = sqlite.connect(":memory:") + traced_statements = [] + def trace(statement): + traced_statements.append(statement) + con.set_trace_callback(trace) + con.execute("create table foo(x)") + con.execute("insert into foo(x) values (?)", (unicode_value,)) + con.commit() + self.assertTrue(any(unicode_value in stmt for stmt in traced_statements), + "Unicode data garbled in trace callback") + + + def suite(): collation_suite = unittest.makeSuite(CollationTests, "Check") progress_suite = unittest.makeSuite(ProgressTests, "Check") - return unittest.TestSuite((collation_suite, progress_suite)) + trace_suite = unittest.makeSuite(TraceCallbackTests, "Check") + return unittest.TestSuite((collation_suite, progress_suite, trace_suite)) def test(): runner = unittest.TextTestRunner() diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -87,6 +87,9 @@ Library ------- +- Issue #11688: Add sqlite3.Connection.set_trace_callback(). Patch by + Torsten Landschoff. + - Issue #11746: Fix SSLContext.load_cert_chain() to accept elliptic curve private keys. 
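Editorial note: the documentation and tests above describe Connection.set_trace_callback(); a minimal usage sketch follows. The helper name collect_statements is hypothetical and not part of the patch. The exact set of traced statements depends on the SQLite version and on the module's implicit transaction handling, so the sketch does not assume a fixed list.

```python
import sqlite3


def collect_statements(sql_statements):
    """Run the given SQL on an in-memory database and return what was traced.

    Hypothetical helper for illustration only.
    """
    traced = []
    con = sqlite3.connect(":memory:")
    try:
        # The callback receives each executed statement as a str;
        # its return value is ignored.
        con.set_trace_callback(traced.append)
        for sql in sql_statements:
            con.execute(sql)
        con.commit()
        # Passing None disables tracing again, so the statement
        # below is not recorded.
        con.set_trace_callback(None)
        con.execute("select 1")
    finally:
        con.close()
    return traced


if __name__ == "__main__":
    for stmt in collect_statements(["create table foo(a, b)",
                                    "insert into foo values (1, 2)"]):
        print(stmt)
```

Besides the statements passed to execute(), the output may include statements issued by the module itself, such as the implicit BEGIN and COMMIT of its transaction management.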
diff --git a/Modules/_sqlite/connection.c b/Modules/_sqlite/connection.c --- a/Modules/_sqlite/connection.c +++ b/Modules/_sqlite/connection.c @@ -904,6 +904,38 @@ return rc; } +static void _trace_callback(void* user_arg, const char* statement_string) +{ + PyObject *py_statement = NULL; + PyObject *ret = NULL; + +#ifdef WITH_THREAD + PyGILState_STATE gilstate; + + gilstate = PyGILState_Ensure(); +#endif + py_statement = PyUnicode_DecodeUTF8(statement_string, + strlen(statement_string), "replace"); + if (py_statement) { + ret = PyObject_CallFunctionObjArgs((PyObject*)user_arg, py_statement, NULL); + Py_DECREF(py_statement); + } + + if (ret) { + Py_DECREF(ret); + } else { + if (_enable_callback_tracebacks) { + PyErr_Print(); + } else { + PyErr_Clear(); + } + } + +#ifdef WITH_THREAD + PyGILState_Release(gilstate); +#endif +} + static PyObject* pysqlite_connection_set_authorizer(pysqlite_Connection* self, PyObject* args, PyObject* kwargs) { PyObject* authorizer_cb; @@ -963,6 +995,34 @@ return Py_None; } +static PyObject* pysqlite_connection_set_trace_callback(pysqlite_Connection* self, PyObject* args, PyObject* kwargs) +{ + PyObject* trace_callback; + + static char *kwlist[] = { "trace_callback", NULL }; + + if (!pysqlite_check_thread(self) || !pysqlite_check_connection(self)) { + return NULL; + } + + if (!PyArg_ParseTupleAndKeywords(args, kwargs, "O:set_trace_callback", + kwlist, &trace_callback)) { + return NULL; + } + + if (trace_callback == Py_None) { + /* None clears the trace callback previously set */ + sqlite3_trace(self->db, 0, (void*)0); + } else { + if (PyDict_SetItem(self->function_pinboard, trace_callback, Py_None) == -1) + return NULL; + sqlite3_trace(self->db, _trace_callback, trace_callback); + } + + Py_INCREF(Py_None); + return Py_None; +} + #ifdef HAVE_LOAD_EXTENSION static PyObject* pysqlite_enable_load_extension(pysqlite_Connection* self, PyObject* args) { @@ -1516,6 +1576,8 @@ #endif {"set_progress_handler", 
(PyCFunction)pysqlite_connection_set_progress_handler, METH_VARARGS|METH_KEYWORDS, PyDoc_STR("Sets progress handler callback. Non-standard.")}, + {"set_trace_callback", (PyCFunction)pysqlite_connection_set_trace_callback, METH_VARARGS|METH_KEYWORDS, + PyDoc_STR("Sets a trace callback called for each SQL statement (passed as unicode). Non-standard.")}, {"execute", (PyCFunction)pysqlite_connection_execute, METH_VARARGS, PyDoc_STR("Executes a SQL statement. Non-standard.")}, {"executemany", (PyCFunction)pysqlite_connection_executemany, METH_VARARGS, -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 00:50:05 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 00:50:05 +0200 Subject: [Python-checkins] cpython: Improve error message in test Message-ID: http://hg.python.org/cpython/rev/23519bc7d752 changeset: 69121:23519bc7d752 user: Antoine Pitrou date: Mon Apr 04 00:50:01 2011 +0200 summary: Improve error message in test files: Lib/sqlite3/test/hooks.py | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/Lib/sqlite3/test/hooks.py b/Lib/sqlite3/test/hooks.py --- a/Lib/sqlite3/test/hooks.py +++ b/Lib/sqlite3/test/hooks.py @@ -216,7 +216,8 @@ con.execute("insert into foo(x) values (?)", (unicode_value,)) con.commit() self.assertTrue(any(unicode_value in stmt for stmt in traced_statements), - "Unicode data garbled in trace callback") + "Unicode data %s garbled in trace callback: %s" + % (ascii(unicode_value), ', '.join(map(ascii, traced_statements)))) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 01:22:34 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 01:22:34 +0200 Subject: [Python-checkins] cpython (3.2): Issue #11749: try to fix transient test_socket failure Message-ID: http://hg.python.org/cpython/rev/68a319ef70fc changeset: 69122:68a319ef70fc branch: 3.2 parent: 69114:88ed3de28520 user: Antoine Pitrou 
date: Mon Apr 04 01:21:37 2011 +0200 summary: Issue #11749: try to fix transient test_socket failure files: Lib/test/test_socket.py | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_socket.py b/Lib/test/test_socket.py --- a/Lib/test/test_socket.py +++ b/Lib/test/test_socket.py @@ -1384,6 +1384,10 @@ self.evt1.set() self.evt2.wait(1.0) first_seg = self.read_file.read(len(self.read_msg) - 3) + if first_seg is None: + # Data not arrived (can happen under Windows), wait a bit + time.sleep(0.5) + first_seg = self.read_file.read(len(self.read_msg) - 3) buf = bytearray(10) n = self.read_file.readinto(buf) self.assertEqual(n, 3) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 01:22:35 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 01:22:35 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #11749: try to fix transient test_socket failure Message-ID: http://hg.python.org/cpython/rev/44fc5f94bc90 changeset: 69123:44fc5f94bc90 parent: 69121:23519bc7d752 parent: 69122:68a319ef70fc user: Antoine Pitrou date: Mon Apr 04 01:22:06 2011 +0200 summary: Issue #11749: try to fix transient test_socket failure files: Lib/test/test_socket.py | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_socket.py b/Lib/test/test_socket.py --- a/Lib/test/test_socket.py +++ b/Lib/test/test_socket.py @@ -1411,6 +1411,10 @@ self.evt1.set() self.evt2.wait(1.0) first_seg = self.read_file.read(len(self.read_msg) - 3) + if first_seg is None: + # Data not arrived (can happen under Windows), wait a bit + time.sleep(0.5) + first_seg = self.read_file.read(len(self.read_msg) - 3) buf = bytearray(10) n = self.read_file.readinto(buf) self.assertEqual(n, 3) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 01:50:56 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 01:50:56 
+0200
Subject: [Python-checkins] cpython: Fix TraceCallbackTests to not use bound parameters (followup to issue #11688)
Message-ID:

http://hg.python.org/cpython/rev/ce37570768f5
changeset:   69124:ce37570768f5
user:        Antoine Pitrou
date:        Mon Apr 04 01:50:50 2011 +0200
summary:
  Fix TraceCallbackTests to not use bound parameters (followup to issue #11688)

files:
  Lib/sqlite3/test/hooks.py |  5 ++++-
  1 files changed, 4 insertions(+), 1 deletions(-)


diff --git a/Lib/sqlite3/test/hooks.py b/Lib/sqlite3/test/hooks.py
--- a/Lib/sqlite3/test/hooks.py
+++ b/Lib/sqlite3/test/hooks.py
@@ -213,7 +213,10 @@
             traced_statements.append(statement)
         con.set_trace_callback(trace)
         con.execute("create table foo(x)")
-        con.execute("insert into foo(x) values (?)", (unicode_value,))
+        # Can't execute bound parameters as their values don't appear
+        # in traced statements before SQLite 3.6.21
+        # (cf. http://www.sqlite.org/draft/releaselog/3_6_21.html)
+        con.execute('insert into foo(x) values ("%s")' % unicode_value)
         con.commit()
         self.assertTrue(any(unicode_value in stmt for stmt in traced_statements),
                         "Unicode data %s garbled in trace callback: %s"

-- 
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Mon Apr 4 02:14:42 2011
From: python-checkins at python.org (steven.bethard)
Date: Mon, 04 Apr 2011 02:14:42 +0200
Subject: [Python-checkins] cpython (2.7): Issue #9347: Fix formatting for tuples in argparse type= error messages.
Message-ID:

http://hg.python.org/cpython/rev/f961e9179998
changeset:   69125:f961e9179998
branch:      2.7
parent:      69080:5e7fc2a42c3c
user:        Steven Bethard
date:        Mon Apr 04 01:47:52 2011 +0200
summary:
  Issue #9347: Fix formatting for tuples in argparse type= error messages.
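Editorial note on the summary above: the underlying pitfall is general to the `%` operator. A tuple on the right-hand side is unpacked as the argument list, so a bare tuple is not formatted as a single value. The sketch below reproduces that behaviour outside argparse; `describe_bad` and `describe_good` are hypothetical helper names chosen for illustration and are not part of argparse.

```python
def describe_bad(value):
    # If value is a tuple, % unpacks it as several arguments, which does
    # not match the single %r placeholder.
    return '%r is not callable' % value


def describe_good(value):
    # Wrapping in a one-element tuple makes value a single argument,
    # whatever its type.
    return '%r is not callable' % (value,)


print(describe_good((int, float)))
try:
    describe_bad((int, float))
except TypeError as exc:
    print('TypeError:', exc)
```

For non-tuple values such as the string `'int'`, both helpers produce the same message, which is probably why this kind of problem tends to go unnoticed until a tuple is passed.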
files: Lib/argparse.py | 4 ++-- Lib/test/test_argparse.py | 2 ++ Misc/NEWS | 2 ++ 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/Lib/argparse.py b/Lib/argparse.py --- a/Lib/argparse.py +++ b/Lib/argparse.py @@ -1277,13 +1277,13 @@ # create the action object, and add it to the parser action_class = self._pop_action_class(kwargs) if not _callable(action_class): - raise ValueError('unknown action "%s"' % action_class) + raise ValueError('unknown action "%s"' % (action_class,)) action = action_class(**kwargs) # raise an error if the action type is not callable type_func = self._registry_get('type', action.type, action.type) if not _callable(type_func): - raise ValueError('%r is not callable' % type_func) + raise ValueError('%r is not callable' % (type_func,)) # raise an error if the metavar does not match the type if hasattr(self, "_get_formatter"): diff --git a/Lib/test/test_argparse.py b/Lib/test/test_argparse.py --- a/Lib/test/test_argparse.py +++ b/Lib/test/test_argparse.py @@ -4016,10 +4016,12 @@ def test_invalid_type(self): self.assertValueError('--foo', type='int') + self.assertValueError('--foo', type=(int, float)) def test_invalid_action(self): self.assertValueError('-x', action='foo') self.assertValueError('foo', action='baz') + self.assertValueError('--foo', action=('store', 'append')) parser = argparse.ArgumentParser() try: parser.add_argument("--foo", action="store-true") diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -257,6 +257,8 @@ - Issue #9026: Fix order of argparse sub-commands in help messages. +- Issue #9347: Fix formatting for tuples in argparse type= error messages. 
+ Extension Modules ----------------- -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 02:14:43 2011 From: python-checkins at python.org (steven.bethard) Date: Mon, 04 Apr 2011 02:14:43 +0200 Subject: [Python-checkins] cpython (3.2): Issue #9347: Fix formatting for tuples in argparse type= error messages. Message-ID: http://hg.python.org/cpython/rev/69ab5251f3f0 changeset: 69126:69ab5251f3f0 branch: 3.2 parent: 69122:68a319ef70fc user: Steven Bethard date: Mon Apr 04 01:53:02 2011 +0200 summary: Issue #9347: Fix formatting for tuples in argparse type= error messages. files: Lib/argparse.py | 4 ++-- Lib/test/test_argparse.py | 2 ++ Misc/NEWS | 2 ++ 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/Lib/argparse.py b/Lib/argparse.py --- a/Lib/argparse.py +++ b/Lib/argparse.py @@ -1287,13 +1287,13 @@ # create the action object, and add it to the parser action_class = self._pop_action_class(kwargs) if not _callable(action_class): - raise ValueError('unknown action "%s"' % action_class) + raise ValueError('unknown action "%s"' % (action_class,)) action = action_class(**kwargs) # raise an error if the action type is not callable type_func = self._registry_get('type', action.type, action.type) if not _callable(type_func): - raise ValueError('%r is not callable' % type_func) + raise ValueError('%r is not callable' % (type_func,)) # raise an error if the metavar does not match the type if hasattr(self, "_get_formatter"): diff --git a/Lib/test/test_argparse.py b/Lib/test/test_argparse.py --- a/Lib/test/test_argparse.py +++ b/Lib/test/test_argparse.py @@ -4051,10 +4051,12 @@ def test_invalid_type(self): self.assertValueError('--foo', type='int') + self.assertValueError('--foo', type=(int, float)) def test_invalid_action(self): self.assertValueError('-x', action='foo') self.assertValueError('foo', action='baz') + self.assertValueError('--foo', action=('store', 'append')) parser = argparse.ArgumentParser() try: 
parser.add_argument("--foo", action="store-true") diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -188,6 +188,8 @@ - Issue #9026: Fix order of argparse sub-commands in help messages. +- Issue #9347: Fix formatting for tuples in argparse type= error messages. + Build ----- -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 02:14:46 2011 From: python-checkins at python.org (steven.bethard) Date: Mon, 04 Apr 2011 02:14:46 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #9347: Fix formatting for tuples in argparse type= error messages. Message-ID: http://hg.python.org/cpython/rev/1f3f6443810a changeset: 69127:1f3f6443810a parent: 69123:44fc5f94bc90 parent: 69126:69ab5251f3f0 user: Steven Bethard date: Mon Apr 04 02:10:40 2011 +0200 summary: Issue #9347: Fix formatting for tuples in argparse type= error messages. files: Lib/argparse.py | 4 ++-- Lib/test/test_argparse.py | 2 ++ Misc/NEWS | 2 ++ 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/Lib/argparse.py b/Lib/argparse.py --- a/Lib/argparse.py +++ b/Lib/argparse.py @@ -1312,13 +1312,13 @@ # create the action object, and add it to the parser action_class = self._pop_action_class(kwargs) if not _callable(action_class): - raise ValueError('unknown action "%s"' % action_class) + raise ValueError('unknown action "%s"' % (action_class,)) action = action_class(**kwargs) # raise an error if the action type is not callable type_func = self._registry_get('type', action.type, action.type) if not _callable(type_func): - raise ValueError('%r is not callable' % type_func) + raise ValueError('%r is not callable' % (type_func,)) # raise an error if the metavar does not match the type if hasattr(self, "_get_formatter"): diff --git a/Lib/test/test_argparse.py b/Lib/test/test_argparse.py --- a/Lib/test/test_argparse.py +++ b/Lib/test/test_argparse.py @@ -4082,10 +4082,12 @@ def test_invalid_type(self): self.assertValueError('--foo', 
type='int') + self.assertValueError('--foo', type=(int, float)) def test_invalid_action(self): self.assertValueError('-x', action='foo') self.assertValueError('foo', action='baz') + self.assertValueError('--foo', action=('store', 'append')) parser = argparse.ArgumentParser() try: parser.add_argument("--foo", action="store-true") diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -349,6 +349,8 @@ - Issue #9026: Fix order of argparse sub-commands in help messages. +- Issue #9347: Fix formatting for tuples in argparse type= error messages. + Build ----- -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 02:14:48 2011 From: python-checkins at python.org (steven.bethard) Date: Mon, 04 Apr 2011 02:14:48 +0200 Subject: [Python-checkins] cpython (merge default -> default): Merge Message-ID: http://hg.python.org/cpython/rev/838e3b07a7f8 changeset: 69128:838e3b07a7f8 parent: 69127:1f3f6443810a parent: 69124:ce37570768f5 user: Steven Bethard date: Mon Apr 04 02:14:25 2011 +0200 summary: Merge files: Lib/sqlite3/test/hooks.py | 5 ++++- 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/Lib/sqlite3/test/hooks.py b/Lib/sqlite3/test/hooks.py --- a/Lib/sqlite3/test/hooks.py +++ b/Lib/sqlite3/test/hooks.py @@ -213,7 +213,10 @@ traced_statements.append(statement) con.set_trace_callback(trace) con.execute("create table foo(x)") - con.execute("insert into foo(x) values (?)", (unicode_value,)) + # Can't execute bound parameters as their values don't appear + # in traced statements before SQLite 3.6.21 + # (cf. 
http://www.sqlite.org/draft/releaselog/3_6_21.html)
+        con.execute('insert into foo(x) values ("%s")' % unicode_value)
         con.commit()
         self.assertTrue(any(unicode_value in stmt for stmt in traced_statements),
                         "Unicode data %s garbled in trace callback: %s"

-- 
Repository URL: http://hg.python.org/cpython

From solipsis at pitrou.net Mon Apr 4 04:55:49 2011
From: solipsis at pitrou.net (solipsis at pitrou.net)
Date: Mon, 04 Apr 2011 04:55:49 +0200
Subject: [Python-checkins] Daily reference leaks (838e3b07a7f8): sum=0
Message-ID:

results for 838e3b07a7f8 on branch "default"
--------------------------------------------

Command line was:
['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogYOwho_', '-x']

From python-checkins at python.org Mon Apr 4 11:05:51 2011
From: python-checkins at python.org (victor.stinner)
Date: Mon, 04 Apr 2011 11:05:51 +0200
Subject: [Python-checkins] cpython: Issue #11753: faulthandler thread uses pthread_sigmask()
Message-ID:

http://hg.python.org/cpython/rev/ebc03d7e7110
changeset:   69129:ebc03d7e7110
user:        Victor Stinner
date:        Mon Apr 04 11:05:21 2011 +0200
summary:
  Issue #11753: faulthandler thread uses pthread_sigmask()

The thread must not receive any signal. If the thread receives a signal,
sem_timedwait() is interrupted and returns EINTR, but in this case,
PyThread_acquire_lock_timed() retries sem_timedwait() and the main thread is
not aware of the signal. The problem is that some tests expect that the main
thread receives the signal, not faulthandler handler, which should be
invisible.

On Linux, the signal looks to be received by the main thread, whereas on
FreeBSD, it can be any thread.
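Editorial note: the idea in the commit message, a helper thread that blocks every signal so delivery falls to other threads, can be sketched from Python as well. This is only an illustration under stated assumptions, not the faulthandler implementation (which is C). It assumes a Unix platform and a Python version that provides `signal.pthread_sigmask()` and `signal.valid_signals()`; `current_mask` and `watchdog` are hypothetical names.

```python
import signal
import threading


def current_mask():
    # SIG_BLOCK with an empty set changes nothing and returns the
    # calling thread's current signal mask.
    return signal.pthread_sigmask(signal.SIG_BLOCK, [])


def watchdog(result):
    # Block every signal in this thread only; signals the OS refuses to
    # block (such as SIGKILL) are silently left out of the mask.
    signal.pthread_sigmask(signal.SIG_SETMASK, signal.valid_signals())
    result["worker"] = current_mask()


result = {}
main_before = current_mask()
worker = threading.Thread(target=watchdog, args=(result,))
worker.start()
worker.join()
main_after = current_mask()

print(signal.SIGINT in result["worker"], main_before == main_after)
```

The mask is per thread, so the main thread's mask is left untouched and it can still receive signals such as SIGINT.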
files: Modules/faulthandler.c | 11 +++++++++++ 1 files changed, 11 insertions(+), 0 deletions(-) diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -399,6 +399,17 @@ const char* errmsg; PyThreadState *current; int ok; +#ifdef HAVE_PTHREAD_H + sigset_t set; + + /* we don't want to receive any signal */ + sigfillset(&set); +#if defined(HAVE_PTHREAD_SIGMASK) && !defined(HAVE_BROKEN_PTHREAD_SIGMASK) + pthread_sigmask(SIG_SETMASK, &set, NULL); +#else + sigprocmask(SIG_SETMASK, &set, NULL); +#endif +#endif do { st = PyThread_acquire_lock_timed(thread.cancel_event, -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 12:54:47 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 04 Apr 2011 12:54:47 +0200 Subject: [Python-checkins] cpython: Reenable regrtest.py timeout (30 min): #11738 and #11753 looks to be fixed Message-ID: http://hg.python.org/cpython/rev/9d59ae98013c changeset: 69130:9d59ae98013c user: Victor Stinner date: Mon Apr 04 12:54:33 2011 +0200 summary: Reenable regrtest.py timeout (30 min): #11738 and #11753 looks to be fixed files: Lib/test/regrtest.py | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -240,7 +240,7 @@ findleaks=False, use_resources=None, trace=False, coverdir='coverage', runleaks=False, huntrleaks=False, verbose2=False, print_slow=False, random_seed=None, use_mp=None, verbose3=False, forever=False, - header=False, timeout=None): + header=False, timeout=30*60): """Execute a test suite. 
This also parses command-line options and modifies its behavior -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 15:33:37 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 15:33:37 +0200 Subject: [Python-checkins] peps: Fix use of "either" and typo in "checking" (Jim Jewett) Message-ID: http://hg.python.org/peps/rev/f65beac56930 changeset: 3854:f65beac56930 user: Antoine Pitrou date: Mon Apr 04 15:33:34 2011 +0200 summary: Fix use of "either" and typo in "checking" (Jim Jewett) files: pep-3151.txt | 18 +++++++++--------- 1 files changed, 9 insertions(+), 9 deletions(-) diff --git a/pep-3151.txt b/pep-3151.txt --- a/pep-3151.txt +++ b/pep-3151.txt @@ -83,12 +83,12 @@ A further proof of the ambiguity of this segmentation is that the standard library itself sometimes has problems deciding. For example, in the -``select`` module, similar failures will raise either ``select.error``, -``OSError`` or ``IOError`` depending on whether you are using select(), -a poll object, a kqueue object, or an epoll object. This makes user code -uselessly complicated since it has to be prepared to catch various -exception types, depending on which exact implementation of a single -primitive it chooses to use at runtime. +``select`` module, similar failures will raise ``select.error``, ``OSError`` +or ``IOError`` depending on whether you are using select(), a poll object, +a kqueue object, or an epoll object. This makes user code uselessly +complicated since it has to be prepared to catch various exception types, +depending on which exact implementation of a single primitive it chooses +to use at runtime. As for WindowsError, it seems to be a pointless distinction. First, it only exists on Windows systems, which requires tedious compatibility code @@ -171,10 +171,10 @@ For this we first must explain what we will call *careful* and *careless* exception handling. 
*Careless* (or "naïve") code is defined as code which
-blindly catches either of ``OSError``, ``IOError``, ``socket.error``,
-``mmap.error``, ``WindowsError``, ``select.error`` without cheking the ``errno``
+blindly catches any of ``OSError``, ``IOError``, ``socket.error``,
+``mmap.error``, ``WindowsError``, ``select.error`` without checking the ``errno``
 attribute. This is because such exception types are much too broad to signify
-anything. Either of them can be raised for error conditions as diverse as: a
+anything. Any of them can be raised for error conditions as diverse as: a
 bad file descriptor (which will usually indicate a programming error), an
 unconnected socket (ditto), a socket timeout, a file type mismatch, an invalid
 argument, a transmission failure, insufficient permissions, a non-existent

-- 
Repository URL: http://hg.python.org/peps

From python-checkins at python.org Mon Apr 4 18:30:23 2011
From: python-checkins at python.org (raymond.hettinger)
Date: Mon, 04 Apr 2011 18:30:23 +0200
Subject: [Python-checkins] cpython: Update timeit to use the new string formatting syntax.
Message-ID:

http://hg.python.org/cpython/rev/81c981ceb83e
changeset:   69131:81c981ceb83e
user:        Raymond Hettinger
date:        Mon Apr 04 09:28:25 2011 -0700
summary:
  Update timeit to use the new string formatting syntax.

files:
  Lib/timeit.py |  8 ++++----
  1 files changed, 4 insertions(+), 4 deletions(-)


diff --git a/Lib/timeit.py b/Lib/timeit.py
--- a/Lib/timeit.py
+++ b/Lib/timeit.py
@@ -79,10 +79,10 @@
 # being indented 8 spaces.
template = """ def inner(_it, _timer): - %(setup)s + {setup} _t0 = _timer() for _i in _it: - %(stmt)s + {stmt} _t1 = _timer() return _t1 - _t0 """ @@ -126,9 +126,9 @@ stmt = reindent(stmt, 8) if isinstance(setup, str): setup = reindent(setup, 4) - src = template % {'stmt': stmt, 'setup': setup} + src = template.format(stmt=stmt, setup=setup) elif hasattr(setup, '__call__'): - src = template % {'stmt': stmt, 'setup': '_setup()'} + src = template.format(stmt=stmt, setup='_setup()') ns['_setup'] = setup else: raise ValueError("setup is neither a string nor callable") -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 19:53:20 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 19:53:20 +0200 Subject: [Python-checkins] cpython (3.1): Issue #11761: make tests for gc.get_count() less fragile Message-ID: http://hg.python.org/cpython/rev/36d92e923a1a changeset: 69132:36d92e923a1a branch: 3.1 parent: 69106:821244a44163 user: Antoine Pitrou date: Mon Apr 04 19:50:42 2011 +0200 summary: Issue #11761: make tests for gc.get_count() less fragile files: Lib/test/test_gc.py | 43 ++++++++++++++++++++------------ 1 files changed, 27 insertions(+), 16 deletions(-) diff --git a/Lib/test/test_gc.py b/Lib/test/test_gc.py --- a/Lib/test/test_gc.py +++ b/Lib/test/test_gc.py @@ -239,30 +239,41 @@ # The following two tests are fragile: # They precisely count the number of allocations, # which is highly implementation-dependent. - # For example: - # - disposed tuples are not freed, but reused - # - the call to assertEqual somehow avoids building its args tuple + # For example, disposed tuples are not freed, but reused. + # To minimize variations, though, we first store the get_count() results + # and check them at the end. 
def test_get_count(self): - # Avoid future allocation of method object - assertEqual = self._baseAssertEqual gc.collect() - assertEqual(gc.get_count(), (0, 0, 0)) - a = dict() - # since gc.collect(), we created two objects: - # the dict, and the tuple returned by get_count() - assertEqual(gc.get_count(), (2, 0, 0)) + a, b, c = gc.get_count() + x = [] + d, e, f = gc.get_count() + self.assertEqual((b, c), (0, 0)) + self.assertEqual((e, f), (0, 0)) + # This is less fragile than asserting that a equals 0. + self.assertLess(a, 5) + # Between the two calls to get_count(), at least one object was + # created (the list). + self.assertGreater(d, a) def test_collect_generations(self): - # Avoid future allocation of method object - assertEqual = self.assertEqual gc.collect() - a = dict() + # This object will "trickle" into generation N + 1 after + # each call to collect(N) + x = [] gc.collect(0) - assertEqual(gc.get_count(), (0, 1, 0)) + # x is now in gen 1 + a, b, c = gc.get_count() gc.collect(1) - assertEqual(gc.get_count(), (0, 0, 1)) + # x is now in gen 2 + d, e, f = gc.get_count() gc.collect(2) - assertEqual(gc.get_count(), (0, 0, 0)) + # x is now in gen 3 + g, h, i = gc.get_count() + # We don't check a, d, g since their exact values depends on + # internal implementation details of the interpreter. 
+ self.assertEqual((b, c), (1, 0)) + self.assertEqual((e, f), (0, 1)) + self.assertEqual((h, i), (0, 0)) def test_trashcan(self): class Ouch: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 19:53:21 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 19:53:21 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Issue #11761: make tests for gc.get_count() less fragile Message-ID: http://hg.python.org/cpython/rev/5daf9a8dc4e8 changeset: 69133:5daf9a8dc4e8 branch: 3.2 parent: 69126:69ab5251f3f0 parent: 69132:36d92e923a1a user: Antoine Pitrou date: Mon Apr 04 19:51:33 2011 +0200 summary: Issue #11761: make tests for gc.get_count() less fragile files: Lib/test/test_gc.py | 43 ++++++++++++++++++++------------ 1 files changed, 27 insertions(+), 16 deletions(-) diff --git a/Lib/test/test_gc.py b/Lib/test/test_gc.py --- a/Lib/test/test_gc.py +++ b/Lib/test/test_gc.py @@ -239,30 +239,41 @@ # The following two tests are fragile: # They precisely count the number of allocations, # which is highly implementation-dependent. - # For example: - # - disposed tuples are not freed, but reused - # - the call to assertEqual somehow avoids building its args tuple + # For example, disposed tuples are not freed, but reused. + # To minimize variations, though, we first store the get_count() results + # and check them at the end. def test_get_count(self): - # Avoid future allocation of method object - assertEqual = self._baseAssertEqual gc.collect() - assertEqual(gc.get_count(), (0, 0, 0)) - a = dict() - # since gc.collect(), we created two objects: - # the dict, and the tuple returned by get_count() - assertEqual(gc.get_count(), (2, 0, 0)) + a, b, c = gc.get_count() + x = [] + d, e, f = gc.get_count() + self.assertEqual((b, c), (0, 0)) + self.assertEqual((e, f), (0, 0)) + # This is less fragile than asserting that a equals 0. 
+ self.assertLess(a, 5) + # Between the two calls to get_count(), at least one object was + # created (the list). + self.assertGreater(d, a) def test_collect_generations(self): - # Avoid future allocation of method object - assertEqual = self.assertEqual gc.collect() - a = dict() + # This object will "trickle" into generation N + 1 after + # each call to collect(N) + x = [] gc.collect(0) - assertEqual(gc.get_count(), (0, 1, 0)) + # x is now in gen 1 + a, b, c = gc.get_count() gc.collect(1) - assertEqual(gc.get_count(), (0, 0, 1)) + # x is now in gen 2 + d, e, f = gc.get_count() gc.collect(2) - assertEqual(gc.get_count(), (0, 0, 0)) + # x is now in gen 3 + g, h, i = gc.get_count() + # We don't check a, d, g since their exact values depends on + # internal implementation details of the interpreter. + self.assertEqual((b, c), (1, 0)) + self.assertEqual((e, f), (0, 1)) + self.assertEqual((h, i), (0, 0)) def test_trashcan(self): class Ouch: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 19:53:23 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 19:53:23 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #11761: make tests for gc.get_count() less fragile Message-ID: http://hg.python.org/cpython/rev/24d4c5fd3bc6 changeset: 69134:24d4c5fd3bc6 parent: 69131:81c981ceb83e parent: 69133:5daf9a8dc4e8 user: Antoine Pitrou date: Mon Apr 04 19:52:56 2011 +0200 summary: Issue #11761: make tests for gc.get_count() less fragile files: Lib/test/test_gc.py | 43 ++++++++++++++++++++------------ 1 files changed, 27 insertions(+), 16 deletions(-) diff --git a/Lib/test/test_gc.py b/Lib/test/test_gc.py --- a/Lib/test/test_gc.py +++ b/Lib/test/test_gc.py @@ -241,32 +241,43 @@ # The following two tests are fragile: # They precisely count the number of allocations, # which is highly implementation-dependent. 
- # For example: - # - disposed tuples are not freed, but reused - # - the call to assertEqual somehow avoids building its args tuple + # For example, disposed tuples are not freed, but reused. + # To minimize variations, though, we first store the get_count() results + # and check them at the end. @refcount_test def test_get_count(self): - # Avoid future allocation of method object - assertEqual = self._baseAssertEqual gc.collect() - assertEqual(gc.get_count(), (0, 0, 0)) - a = dict() - # since gc.collect(), we created two objects: - # the dict, and the tuple returned by get_count() - assertEqual(gc.get_count(), (2, 0, 0)) + a, b, c = gc.get_count() + x = [] + d, e, f = gc.get_count() + self.assertEqual((b, c), (0, 0)) + self.assertEqual((e, f), (0, 0)) + # This is less fragile than asserting that a equals 0. + self.assertLess(a, 5) + # Between the two calls to get_count(), at least one object was + # created (the list). + self.assertGreater(d, a) @refcount_test def test_collect_generations(self): - # Avoid future allocation of method object - assertEqual = self.assertEqual gc.collect() - a = dict() + # This object will "trickle" into generation N + 1 after + # each call to collect(N) + x = [] gc.collect(0) - assertEqual(gc.get_count(), (0, 1, 0)) + # x is now in gen 1 + a, b, c = gc.get_count() gc.collect(1) - assertEqual(gc.get_count(), (0, 0, 1)) + # x is now in gen 2 + d, e, f = gc.get_count() gc.collect(2) - assertEqual(gc.get_count(), (0, 0, 0)) + # x is now in gen 3 + g, h, i = gc.get_count() + # We don't check a, d, g since their exact values depends on + # internal implementation details of the interpreter. 
+ self.assertEqual((b, c), (1, 0)) + self.assertEqual((e, f), (0, 1)) + self.assertEqual((h, i), (0, 0)) def test_trashcan(self): class Ouch: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 20:00:56 2011 From: python-checkins at python.org (brian.curtin) Date: Mon, 04 Apr 2011 20:00:56 +0200 Subject: [Python-checkins] cpython: Add x64-temp to ignore, prepend a forward slash to "build/" to include Message-ID: http://hg.python.org/cpython/rev/4d2575d971bc changeset: 69135:4d2575d971bc user: brian.curtin date: Mon Apr 04 13:00:49 2011 -0500 summary: Add x64-temp to ignore, prepend a forward slash to "build/" to include PCbuild/ changes (for VS project files, etc). files: .hgignore | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/.hgignore b/.hgignore --- a/.hgignore +++ b/.hgignore @@ -5,7 +5,7 @@ Makefile.pre$ TAGS$ autom4te.cache$ -build/ +/build/ buildno$ config.cache config.log @@ -63,4 +63,5 @@ PCbuild/*.ncb PCbuild/*.bsc PCbuild/Win32-temp-* +PCbuild/x64-temp-* __pycache__ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 20:56:06 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 20:56:06 +0200 Subject: [Python-checkins] cpython: Ignore build/ and Doc/build Message-ID: http://hg.python.org/cpython/rev/739bed65e445 changeset: 69136:739bed65e445 user: Antoine Pitrou date: Mon Apr 04 20:52:50 2011 +0200 summary: Ignore build/ and Doc/build files: .hgignore | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/.hgignore b/.hgignore --- a/.hgignore +++ b/.hgignore @@ -5,7 +5,8 @@ Makefile.pre$ TAGS$ autom4te.cache$ -/build/ +^build/ +^Doc/build/ buildno$ config.cache config.log -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 20:56:09 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 20:56:09 +0200 Subject: [Python-checkins] cpython: Ignore 
AMD64 build files under Windows
Message-ID:

http://hg.python.org/cpython/rev/ef97e997aa02
changeset:   69137:ef97e997aa02
user:        Antoine Pitrou
date:        Mon Apr 04 20:55:12 2011 +0200
summary:
  Ignore AMD64 build files under Windows

files:
  .hgignore |  1 +
  1 files changed, 1 insertions(+), 0 deletions(-)


diff --git a/.hgignore b/.hgignore
--- a/.hgignore
+++ b/.hgignore
@@ -33,6 +33,7 @@
 Modules/ld_so_aix$
 Parser/pgen$
 Parser/pgen.stamp$
+PCbuild/amd64/
 ^core
 ^python-gdb.py
 ^python.exe-gdb.py

-- 
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Mon Apr 4 20:56:14 2011
From: python-checkins at python.org (antoine.pitrou)
Date: Mon, 04 Apr 2011 20:56:14 +0200
Subject: [Python-checkins] cpython: Ignore other MSVC by-products
Message-ID:

http://hg.python.org/cpython/rev/100561a0f093
changeset:   69138:100561a0f093
user:        Antoine Pitrou
date:        Mon Apr 04 20:55:48 2011 +0200
summary:
  Ignore other MSVC by-products

files:
  .hgignore |  2 ++
  1 files changed, 2 insertions(+), 0 deletions(-)


diff --git a/.hgignore b/.hgignore
--- a/.hgignore
+++ b/.hgignore
@@ -64,6 +64,8 @@
 PCbuild/*.o
 PCbuild/*.ncb
 PCbuild/*.bsc
+PCbuild/*.user
+PCbuild/*.suo
 PCbuild/Win32-temp-*
 PCbuild/x64-temp-*
 __pycache__

-- 
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Mon Apr 4 21:01:57 2011
From: python-checkins at python.org (antoine.pitrou)
Date: Mon, 04 Apr 2011 21:01:57 +0200
Subject: [Python-checkins] cpython: Issue #10791: Implement missing method GzipFile.read1(), allowing GzipFile
Message-ID:

http://hg.python.org/cpython/rev/9775d67c9af9
changeset:   69139:9775d67c9af9
user:        Antoine Pitrou
date:        Mon Apr 04 21:00:37 2011 +0200
summary:
  Issue #10791: Implement missing method GzipFile.read1(), allowing GzipFile
to be wrapped in a TextIOWrapper. Patch by Nadeem Vawda.
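Editorial note: the use case named in the summary, wrapping a `GzipFile` in an `io.TextIOWrapper` to read text lines, can be sketched as follows. This is a minimal illustration modelled loosely on the test added in this changeset, assuming a Python version where `GzipFile.read1()` is available; `roundtrip_lines` and the sample data are hypothetical.

```python
import gzip
import io
import os
import tempfile


def roundtrip_lines(lines):
    # Write the lines to a temporary gzip file, then read them back as
    # text through a TextIOWrapper layered on top of the GzipFile.
    fd, path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    try:
        with gzip.GzipFile(path, "wb") as f:
            f.write("".join(lines).encode("ascii"))
        with gzip.GzipFile(path, "rb") as f:
            with io.TextIOWrapper(f, encoding="ascii") as t:
                return t.readlines()
    finally:
        os.remove(path)


sample = ["first line\n", "second line\n", "third line\n"]
print(roundtrip_lines(sample) == sample)
```

`TextIOWrapper` expects a buffered binary stream, and its read path relies on the underlying object's `read1()` when present, which is why the missing method mattered here.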
files: Lib/gzip.py | 22 ++++++++++++++++++++++ Lib/test/test_gzip.py | 23 +++++++++++++++++++++++ Misc/NEWS | 3 +++ 3 files changed, 48 insertions(+), 0 deletions(-) diff --git a/Lib/gzip.py b/Lib/gzip.py --- a/Lib/gzip.py +++ b/Lib/gzip.py @@ -348,6 +348,28 @@ self.offset += size return chunk + def read1(self, size=-1): + self._check_closed() + if self.mode != READ: + import errno + raise IOError(errno.EBADF, "read1() on write-only GzipFile object") + + if self.extrasize <= 0 and self.fileobj is None: + return b'' + + try: + self._read() + except EOFError: + pass + if size < 0 or size > self.extrasize: + size = self.extrasize + + offset = self.offset - self.extrastart + chunk = self.extrabuf[offset: offset + size] + self.extrasize -= size + self.offset += size + return chunk + def peek(self, n): if self.mode != READ: import errno diff --git a/Lib/test/test_gzip.py b/Lib/test/test_gzip.py --- a/Lib/test/test_gzip.py +++ b/Lib/test/test_gzip.py @@ -64,6 +64,21 @@ d = f.read() self.assertEqual(d, data1*50) + def test_read1(self): + self.test_write() + blocks = [] + nread = 0 + with gzip.GzipFile(self.filename, 'r') as f: + while True: + d = f.read1() + if not d: + break + blocks.append(d) + nread += len(d) + # Check that position was updated correctly (see issue10791). + self.assertEqual(f.tell(), nread) + self.assertEqual(b''.join(blocks), data1 * 50) + def test_io_on_closed_object(self): # Test that I/O operations on closed GzipFile objects raise a # ValueError, just like the corresponding functions on file objects. @@ -323,6 +338,14 @@ self.assertEqual(f.read(100), b'') self.assertEqual(nread, len(uncompressed)) + def test_textio_readlines(self): + # Issue #10791: TextIOWrapper.readlines() fails when wrapping GzipFile. 
+ lines = (data1 * 50).decode("ascii").splitlines(True) + self.test_write() + with gzip.GzipFile(self.filename, 'r') as f: + with io.TextIOWrapper(f, encoding="ascii") as t: + self.assertEqual(t.readlines(), lines) + # Testing compress/decompress shortcut functions def test_compress(self): diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -87,6 +87,9 @@ Library ------- +- Issue #10791: Implement missing method GzipFile.read1(), allowing GzipFile + to be wrapped in a TextIOWrapper. Patch by Nadeem Vawda. + - Issue #11688: Add sqlite3.Connection.set_trace_callback(). Patch by Torsten Landschoff. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 21:09:09 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 21:09:09 +0200 Subject: [Python-checkins] cpython (3.2): Clarify that GzipFile.read1() isn't implemented. Message-ID: http://hg.python.org/cpython/rev/8a2639fdf433 changeset: 69140:8a2639fdf433 branch: 3.2 parent: 69133:5daf9a8dc4e8 user: Antoine Pitrou date: Mon Apr 04 21:06:20 2011 +0200 summary: Clarify that GzipFile.read1() isn't implemented. files: Doc/library/gzip.rst | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Doc/library/gzip.rst b/Doc/library/gzip.rst --- a/Doc/library/gzip.rst +++ b/Doc/library/gzip.rst @@ -72,7 +72,7 @@ :class:`GzipFile` supports the :class:`io.BufferedIOBase` interface, including iteration and the :keyword:`with` statement. Only the - :meth:`truncate` method isn't implemented. + :meth:`read1` and :meth:`truncate` methods aren't implemented. 
:class:`GzipFile` also provides the following method: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 21:09:10 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 21:09:10 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Clarify that GzipFile.read1() is now implemented Message-ID: http://hg.python.org/cpython/rev/4fa9bfa21a7e changeset: 69141:4fa9bfa21a7e parent: 69139:9775d67c9af9 parent: 69140:8a2639fdf433 user: Antoine Pitrou date: Mon Apr 04 21:09:05 2011 +0200 summary: Clarify that GzipFile.read1() is now implemented files: Doc/library/gzip.rst | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/Doc/library/gzip.rst b/Doc/library/gzip.rst --- a/Doc/library/gzip.rst +++ b/Doc/library/gzip.rst @@ -94,6 +94,9 @@ .. versionchanged:: 3.2 Support for unseekable files was added. + .. versionchanged:: 3.3 + The :meth:`io.BufferedIOBase.read1` method is now implemented. + .. function:: open(filename, mode='rb', compresslevel=9) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 22:00:51 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 22:00:51 +0200 Subject: [Python-checkins] cpython (3.1): Try to fix sporadic failure in test_thread/test_threading Message-ID: http://hg.python.org/cpython/rev/04b5cd2f8c87 changeset: 69142:04b5cd2f8c87 branch: 3.1 parent: 69132:36d92e923a1a user: Antoine Pitrou date: Mon Apr 04 21:59:09 2011 +0200 summary: Try to fix sporadic failure in test_thread/test_threading files: Lib/test/lock_tests.py | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/Lib/test/lock_tests.py b/Lib/test/lock_tests.py --- a/Lib/test/lock_tests.py +++ b/Lib/test/lock_tests.py @@ -141,7 +141,13 @@ # We run many threads in the hope that existing threads ids won't # be recycled. 
Bunch(f, 15).wait_for_finished() - self.assertEqual(n, len(threading.enumerate())) + if len(threading.enumerate()) != n: + # There is a small window during which a Thread instance's + # target function has finished running, but the Thread is still + # alive and registered. Avoid spurious failures by waiting a + # bit more (seen on a buildbot). + time.sleep(0.4) + self.assertEqual(n, len(threading.enumerate())) class LockTests(BaseLockTests): -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 22:00:52 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 22:00:52 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Try to fix sporadic failure in test_thread/test_threading Message-ID: http://hg.python.org/cpython/rev/8d5ea25d79d0 changeset: 69143:8d5ea25d79d0 branch: 3.2 parent: 69140:8a2639fdf433 parent: 69142:04b5cd2f8c87 user: Antoine Pitrou date: Mon Apr 04 22:00:10 2011 +0200 summary: Try to fix sporadic failure in test_thread/test_threading files: Lib/test/lock_tests.py | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/Lib/test/lock_tests.py b/Lib/test/lock_tests.py --- a/Lib/test/lock_tests.py +++ b/Lib/test/lock_tests.py @@ -149,7 +149,13 @@ # We run many threads in the hope that existing threads ids won't # be recycled. Bunch(f, 15).wait_for_finished() - self.assertEqual(n, len(threading.enumerate())) + if len(threading.enumerate()) != n: + # There is a small window during which a Thread instance's + # target function has finished running, but the Thread is still + # alive and registered. Avoid spurious failures by waiting a + # bit more (seen on a buildbot). 
+ time.sleep(0.4) + self.assertEqual(n, len(threading.enumerate())) def test_timeout(self): lock = self.locktype() -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 22:00:58 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 04 Apr 2011 22:00:58 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Try to fix sporadic failure in test_thread/test_threading Message-ID: http://hg.python.org/cpython/rev/877e152c2eee changeset: 69144:877e152c2eee parent: 69141:4fa9bfa21a7e parent: 69143:8d5ea25d79d0 user: Antoine Pitrou date: Mon Apr 04 22:00:45 2011 +0200 summary: Try to fix sporadic failure in test_thread/test_threading files: Lib/test/lock_tests.py | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/Lib/test/lock_tests.py b/Lib/test/lock_tests.py --- a/Lib/test/lock_tests.py +++ b/Lib/test/lock_tests.py @@ -149,7 +149,13 @@ # We run many threads in the hope that existing threads ids won't # be recycled. Bunch(f, 15).wait_for_finished() - self.assertEqual(n, len(threading.enumerate())) + if len(threading.enumerate()) != n: + # There is a small window during which a Thread instance's + # target function has finished running, but the Thread is still + # alive and registered. Avoid spurious failures by waiting a + # bit more (seen on a buildbot). 
+ time.sleep(0.4) + self.assertEqual(n, len(threading.enumerate())) def test_timeout(self): lock = self.locktype() -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 23:13:51 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 04 Apr 2011 23:13:51 +0200 Subject: [Python-checkins] cpython: Issue #11619: _PyImport_LoadDynamicModule() doesn't encode the path to bytes Message-ID: http://hg.python.org/cpython/rev/1b7f484bab6e changeset: 69145:1b7f484bab6e user: Victor Stinner date: Mon Apr 04 23:05:53 2011 +0200 summary: Issue #11619: _PyImport_LoadDynamicModule() doesn't encode the path to bytes on Windows. files: Misc/NEWS | 3 ++ Python/dynload_win.c | 33 +++++++++++++++---------------- Python/importdl.c | 11 ++++++++++ 3 files changed, 30 insertions(+), 17 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,9 @@ Core and Builtins ----------------- +- Issue #11619: _PyImport_LoadDynamicModule() doesn't encode the path to bytes + on Windows. + - Issue #10998: Remove mentions of -Q, sys.flags.division_warning and Py_DivisionWarningFlag left over from Python 2. 
diff --git a/Python/dynload_win.c b/Python/dynload_win.c --- a/Python/dynload_win.c +++ b/Python/dynload_win.c @@ -171,8 +171,8 @@ return NULL; } -dl_funcptr _PyImport_GetDynLoadFunc(const char *shortname, - const char *pathname, FILE *fp) +dl_funcptr _PyImport_GetDynLoadWindows(const char *shortname, + PyObject *pathname, FILE *fp) { dl_funcptr p; char funcname[258], *import_python; @@ -185,8 +185,7 @@ { HINSTANCE hDLL = NULL; - char pathbuf[260]; - LPTSTR dummy; + wchar_t pathbuf[260]; unsigned int old_mode; ULONG_PTR cookie = 0; /* We use LoadLibraryEx so Windows looks for dependent DLLs @@ -198,14 +197,14 @@ /* Don't display a message box when Python can't load a DLL */ old_mode = SetErrorMode(SEM_FAILCRITICALERRORS); - if (GetFullPathName(pathname, - sizeof(pathbuf), - pathbuf, - &dummy)) { + if (GetFullPathNameW(PyUnicode_AS_UNICODE(pathname), + sizeof(pathbuf) / sizeof(pathbuf[0]), + pathbuf, + NULL)) { ULONG_PTR cookie = _Py_ActivateActCtx(); /* XXX This call doesn't exist in Windows CE */ - hDLL = LoadLibraryEx(pathname, NULL, - LOAD_WITH_ALTERED_SEARCH_PATH); + hDLL = LoadLibraryExW(PyUnicode_AS_UNICODE(pathname), NULL, + LOAD_WITH_ALTERED_SEARCH_PATH); _Py_DeactivateActCtx(cookie); } @@ -264,21 +263,21 @@ } else { char buffer[256]; + PyOS_snprintf(buffer, sizeof(buffer), #ifdef _DEBUG - PyOS_snprintf(buffer, sizeof(buffer), "python%d%d_d.dll", + "python%d%d_d.dll", #else - PyOS_snprintf(buffer, sizeof(buffer), "python%d%d.dll", + "python%d%d.dll", #endif PY_MAJOR_VERSION,PY_MINOR_VERSION); import_python = GetPythonImport(hDLL); if (import_python && strcasecmp(buffer,import_python)) { - PyOS_snprintf(buffer, sizeof(buffer), - "Module use of %.150s conflicts " - "with this version of Python.", - import_python); - PyErr_SetString(PyExc_ImportError,buffer); + PyErr_Format(PyExc_ImportError, + "Module use of %.150s conflicts " + "with this version of Python.", + import_python); FreeLibrary(hDLL); return NULL; } diff --git a/Python/importdl.c 
b/Python/importdl.c --- a/Python/importdl.c +++ b/Python/importdl.c @@ -12,8 +12,13 @@ #include "importdl.h" +#ifdef MS_WINDOWS +extern dl_funcptr _PyImport_GetDynLoadWindows(const char *shortname, + PyObject *pathname, FILE *fp); +#else extern dl_funcptr _PyImport_GetDynLoadFunc(const char *shortname, const char *pathname, FILE *fp); +#endif /* name should be ASCII only because the C language doesn't accept non-ASCII identifiers, and dynamic modules are written in C. */ @@ -22,7 +27,9 @@ _PyImport_LoadDynamicModule(PyObject *name, PyObject *path, FILE *fp) { PyObject *m; +#ifndef MS_WINDOWS PyObject *pathbytes; +#endif char *namestr, *lastdot, *shortname, *packagecontext, *oldcontext; dl_funcptr p0; PyObject* (*p)(void); @@ -48,12 +55,16 @@ shortname = lastdot+1; } +#ifdef MS_WINDOWS + p0 = _PyImport_GetDynLoadWindows(shortname, path, fp); +#else pathbytes = PyUnicode_EncodeFSDefault(path); if (pathbytes == NULL) return NULL; p0 = _PyImport_GetDynLoadFunc(shortname, PyBytes_AS_STRING(pathbytes), fp); Py_DECREF(pathbytes); +#endif p = (PyObject*(*)(void))p0; if (PyErr_Occurred()) return NULL; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 4 23:42:34 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 04 Apr 2011 23:42:34 +0200 Subject: [Python-checkins] cpython: Issue #11765: don't test time.sleep() in test_faulthandler Message-ID: http://hg.python.org/cpython/rev/8da8cd1ba9d9 changeset: 69146:8da8cd1ba9d9 user: Victor Stinner date: Mon Apr 04 23:42:30 2011 +0200 summary: Issue #11765: don't test time.sleep() in test_faulthandler time.time() and/or time.sleep() are not accurate on Windows, don't test them in test_faulthandler. Anyway, the check was written for an old implementation of dump_tracebacks_later(), it is not more needed. 
files: Lib/test/test_faulthandler.py | 12 ++---------- 1 files changed, 2 insertions(+), 10 deletions(-) diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py --- a/Lib/test/test_faulthandler.py +++ b/Lib/test/test_faulthandler.py @@ -360,16 +360,8 @@ def func(repeat, cancel, timeout): if cancel: faulthandler.cancel_dump_tracebacks_later() - - pause = timeout * 2.5 - # on Windows XP, b-a gives 1.249931 after sleep(1.25) - min_pause = pause * 0.9 - a = time.time() - time.sleep(pause) - b = time.time() + time.sleep(timeout * 2.5) faulthandler.cancel_dump_tracebacks_later() - # Check that sleep() was not interrupted - assert (b - a) >= min_pause, "{{}} < {{}}".format(b - a, min_pause) timeout = {timeout} repeat = {repeat} @@ -400,7 +392,7 @@ else: count = 1 header = 'Thread 0x[0-9a-f]+:\n' - regex = expected_traceback(12, 27, header, count=count) + regex = expected_traceback(7, 19, header, count=count) self.assertRegex(trace, regex) else: self.assertEqual(trace, '') -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 01:37:19 2011 From: python-checkins at python.org (brett.cannon) Date: Tue, 05 Apr 2011 01:37:19 +0200 Subject: [Python-checkins] peps: Draft of PEP 399: Pure Python/C Accelerator Module Compatibiilty Requirements Message-ID: http://hg.python.org/peps/rev/7b9a5b01b479 changeset: 3855:7b9a5b01b479 user: Brett Cannon date: Mon Apr 04 16:37:07 2011 -0700 summary: Draft of PEP 399: Pure Python/C Accelerator Module Compatibiilty Requirements files: pep-0399.txt | 205 +++++++++++++++++++++++++++++++++++++++ 1 files changed, 205 insertions(+), 0 deletions(-) diff --git a/pep-0399.txt b/pep-0399.txt new file mode 100644 --- /dev/null +++ b/pep-0399.txt @@ -0,0 +1,205 @@ +PEP: 399 +Title: Pure Python/C Accelerator Module Compatibiilty Requirements +Version: $Revision: 88219 $ +Last-Modified: $Date: 2011-01-27 13:47:00 -0800 (Thu, 27 Jan 2011) $ +Author: Brett Cannon +Status: Draft +Type: Informational 
+Content-Type: text/x-rst +Created: 04-Apr-2011 +Python-Version: 3.3 +Post-History: + +Abstract +======== + +The Python standard library under CPython contains various instances +of modules implemented in both pure Python and C. This PEP requires +that in these instances that both the Python and C code *must* be +semantically identical (except in cases where implementation details +of a VM prevents it entirely). It is also required that new C-based +modules lacking a pure Python equivalent implementation get special +permissions to be added to the standard library. + + +Rationale +========= + +Python has grown beyond the CPython virtual machine (VM). IronPython_, +Jython_, and PyPy_ all currently being viable alternatives to the +CPython VM. This VM ecosystem that has sprung up around the Python +programming language has led to Python being used in many different +areas where CPython cannot be used, e.g., Jython allowing Python to be +used in Java applications. + +A problem all of the VMs other than CPython face is handling modules +from the standard library that are implemented in C. Since they do not +typically support the entire `C API of Python`_ they are unable to use +the code used to create the module. Often times this leads these other +VMs to either re-implement the modules in pure Python or in the +programming language used to implement the VM (e.g., in C# for +IronPython). This duplication of effort between CPython, PyPy, Jython, +and IronPython is extremely unfortunate as implementing a module *at +least* in pure Python would help mitigate this duplicate effort. + +The purpose of this PEP is to minimize this duplicate effort by +mandating that all new modules added to Python's standard library +*must* have a pure Python implementation _unless_ special dispensation +is given. This makes sure that a module in the stdlib is available to +all VMs and not just to CPython. 
+ +Re-implementing parts (or all) of a module in C (in the case +of CPython) is still allowed for performance reasons, but any such +accelerated code must semantically match the pure Python equivalent to +prevent divergence. To accomplish this, the pure Python and C code must +be thoroughly tested with the *same* test suite to verify compliance. +This is to prevent users from accidentally relying +on semantics that are specific to the C code and are not reflected in +the pure Python implementation that other VMs rely upon, e.g., in +CPython 3.2.0, ``heapq.heappop()`` raises different exceptions +depending on whether the accelerated C code is used or not:: + + from test.support import import_fresh_module + + c_heapq = import_fresh_module('heapq', fresh=['_heapq']) + py_heapq = import_fresh_module('heapq', blocked=['_heapq']) + + + class Spam: + """Tester class which defines no other magic methods but + __len__().""" + def __len__(self): + return 0 + + + try: + c_heapq.heappop(Spam()) + except TypeError: + # "heap argument must be a list" + pass + + try: + py_heapq.heappop(Spam()) + except AttributeError: + # "'Foo' object has no attribute 'pop'" + pass + +This kind of divergence is a problem for users as they unwittingly +write code that is CPython-specific. This is also an issue for other +VM teams as they have to deal with bug reports from users thinking +that they incorrectly implemented the module when in fact it was +caused by an untested case. + + +Details +======= + +Starting in Python 3.3, any modules added to the standard library must +have a pure Python implementation. This rule can only be ignored if +the Python development team grants a special exemption for the module. +Typically the exemption would be granted only when a module wraps a +specific C-based library (e.g., sqlite3_). 
In granting an exemption it +will be recognized that the module will most likely be considered +exclusive to CPython and not part of Python's standard library that +other VMs are expected to support. Usage of ``ctypes`` to provide an +API for a C library will continue to be frowned upon as ``ctypes`` +lacks compiler guarantees that C code typically relies upon to prevent +certain errors from occurring (e.g., API changes). + +Even though a pure Python implementation is mandated by this PEP, it +does not preclude the use of a companion acceleration module. If an +acceleration module is provided it is to be named the same as the +module it is accelerating with an underscore attached as a prefix, +e.g., ``_warnings`` for ``warnings``. The common pattern to access +the accelerated code from the pure Python implementation is to import +it with an ``import *``, e.g., ``from _warnings import *``. This is +typically done at the end of the module to allow it to overwrite +specific Python objects with their accelerated equivalents. This kind +of import can also be done before the end of the module when needed, +e.g., an accelerated base class is provided but is then subclassed by +Python code. This PEP does not mandate that pre-existing modules in +the stdlib that lack a pure Python equivalent gain such a module. But +if people do volunteer to provide and maintain a pure Python +equivalent (e.g., the PyPy team volunteering their pure Python +implementation of the ``csv`` module and maintaining it) then such +code will be accepted. + +Any accelerated code must be semantically identical to the pure Python +implementation. The only time any semantics are allowed to be +different are when technical details of the VM providing the +accelerated code prevent matching semantics from being possible, e.g., +a class being a ``type`` when implemented in C. 
The semantics +equivalence requirement also dictates that no public API be provided +in accelerated code that does not exist in the pure Python code. +Without this requirement people could accidentally come to rely on a +detail in the acclerated code which is not made available to other VMs +that use the pure Python implementation. To help verify that the +contract of semantic equivalence is being met, a module must be tested +both with and without its accelerated code as thoroughly as possible. + +As an example, to write tests which exercise both the pure Python and +C acclerated versions of a module, a basic idiom can be followed:: + + import collections.abc + from test.support import import_fresh_module, run_unittest + import unittest + + c_heapq = import_fresh_module('heapq', fresh=['_heapq']) + py_heapq = import_fresh_module('heapq', blocked=['_heapq']) + + + class ExampleTest(unittest.TestCase): + + def test_heappop_exc_for_non_MutableSequence(self): + # Raise TypeError when heap is not a + # collections.abc.MutableSequence. + class Spam: + """Test class lacking many ABC-required methods + (e.g., pop()).""" + def __len__(self): + return 0 + + heap = Spam() + self.assertFalse(isinstance(heap, + collections.abc.MutableSequence)) + with self.assertRaises(TypeError): + self.heapq.heappop(heap) + + + class AcceleratedExampleTest(ExampleTest): + + """Test using the acclerated code.""" + + heapq = c_heapq + + + class PyExampleTest(ExampleTest): + + """Test with just the pure Python code.""" + + heapq = py_heapq + + + def test_main(): + run_unittest(AcceleratedExampleTest, PyExampleTest) + + + if __name__ == '__main__': + test_main() + +Thoroughness of the test can be verified using coverage measurements +with branching coverage on the pure Python code to verify that all +possible scenarios are tested using (or not using) accelerator code. + + +Copyright +========= + +This document has been placed in the public domain. + + +.. _IronPython: http://ironpython.net/ +.. 
_Jython: http://www.jython.org/ +.. _PyPy: http://pypy.org/ +.. _C API of Python: http://docs.python.org/py3k/c-api/index.html +.. _sqlite3: http://docs.python.org/py3k/library/sqlite3.html -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Tue Apr 5 01:47:20 2011 From: python-checkins at python.org (brett.cannon) Date: Tue, 05 Apr 2011 01:47:20 +0200 Subject: [Python-checkins] peps: Fix a spelling mistake in the title of PEP 399 Message-ID: http://hg.python.org/peps/rev/359ccf54bc52 changeset: 3856:359ccf54bc52 user: Brett Cannon date: Mon Apr 04 16:47:09 2011 -0700 summary: Fix a spelling mistake in the title of PEP 399 files: pep-0399.txt | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/pep-0399.txt b/pep-0399.txt --- a/pep-0399.txt +++ b/pep-0399.txt @@ -1,5 +1,5 @@ PEP: 399 -Title: Pure Python/C Accelerator Module Compatibiilty Requirements +Title: Pure Python/C Accelerator Module Compatibility Requirements Version: $Revision: 88219 $ Last-Modified: $Date: 2011-01-27 13:47:00 -0800 (Thu, 27 Jan 2011) $ Author: Brett Cannon -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Tue Apr 5 01:48:20 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 05 Apr 2011 01:48:20 +0200 Subject: [Python-checkins] cpython: Issue #10785: Store the filename as Unicode in the Python parser. Message-ID: http://hg.python.org/cpython/rev/6e9dc970ac0e changeset: 69147:6e9dc970ac0e user: Victor Stinner date: Tue Apr 05 00:39:01 2011 +0200 summary: Issue #10785: Store the filename as Unicode in the Python parser. 
files: Include/parsetok.h | 9 +++++- Makefile.pre.in | 7 +++-- Misc/NEWS | 2 + Modules/parsermodule.c | 1 + Parser/parsetok.c | 32 ++++++++++++++++++----- Parser/parsetok_pgen.c | 2 + Parser/tokenizer.c | 35 ++++++++++++++++--------- Parser/tokenizer.h | 8 +++++- Python/pythonrun.c | 40 ++++++++++++++++++------------ 9 files changed, 94 insertions(+), 42 deletions(-) diff --git a/Include/parsetok.h b/Include/parsetok.h --- a/Include/parsetok.h +++ b/Include/parsetok.h @@ -9,7 +9,10 @@ typedef struct { int error; - const char *filename; /* decoded from the filesystem encoding */ +#ifndef PGEN + /* The filename is useless for pgen, see comment in tok_state structure */ + PyObject *filename; +#endif int lineno; int offset; char *text; /* UTF-8-encoded string */ @@ -66,8 +69,10 @@ perrdetail *err_ret, int *flags); -/* Note that he following function is defined in pythonrun.c not parsetok.c. */ +/* Note that the following functions are defined in pythonrun.c, + not in parsetok.c */ PyAPI_FUNC(void) PyParser_SetError(perrdetail *); +PyAPI_FUNC(void) PyParser_ClearError(perrdetail *); #ifdef __cplusplus } diff --git a/Makefile.pre.in b/Makefile.pre.in --- a/Makefile.pre.in +++ b/Makefile.pre.in @@ -238,14 +238,13 @@ Parser/listnode.o \ Parser/node.o \ Parser/parser.o \ - Parser/parsetok.o \ Parser/bitset.o \ Parser/metagrammar.o \ Parser/firstsets.o \ Parser/grammar.o \ Parser/pgen.o -PARSER_OBJS= $(POBJS) Parser/myreadline.o Parser/tokenizer.o +PARSER_OBJS= $(POBJS) Parser/myreadline.o Parser/parsetok.o Parser/tokenizer.o PGOBJS= \ Objects/obmalloc.o \ @@ -254,10 +253,12 @@ Python/pyctype.o \ Parser/tokenizer_pgen.o \ Parser/printgrammar.o \ + Parser/parsetok_pgen.o \ Parser/pgenmain.o PARSER_HEADERS= \ Parser/parser.h \ + Include/parsetok.h \ Parser/tokenizer.h PGENOBJS= $(PGENMAIN) $(POBJS) $(PGOBJS) @@ -593,6 +594,7 @@ Parser/metagrammar.o: $(srcdir)/Parser/metagrammar.c Parser/tokenizer_pgen.o: $(srcdir)/Parser/tokenizer.c +Parser/parsetok_pgen.o: 
$(srcdir)/Parser/parsetok.c Parser/pgenmain.o: $(srcdir)/Include/parsetok.h @@ -700,7 +702,6 @@ Include/objimpl.h \ Include/opcode.h \ Include/osdefs.h \ - Include/parsetok.h \ Include/patchlevel.h \ Include/pgen.h \ Include/pgenheaders.h \ diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,8 @@ Core and Builtins ----------------- +- Issue #10785: Store the filename as Unicode in the Python parser. + - Issue #11619: _PyImport_LoadDynamicModule() doesn't encode the path to bytes on Windows. diff --git a/Modules/parsermodule.c b/Modules/parsermodule.c --- a/Modules/parsermodule.c +++ b/Modules/parsermodule.c @@ -584,6 +584,7 @@ else PyParser_SetError(&err); } + PyParser_ClearError(&err); return (res); } diff --git a/Parser/parsetok.c b/Parser/parsetok.c --- a/Parser/parsetok.c +++ b/Parser/parsetok.c @@ -13,7 +13,7 @@ /* Forward */ static node *parsetok(struct tok_state *, grammar *, int, perrdetail *, int *); -static void initerr(perrdetail *err_ret, const char* filename); +static int initerr(perrdetail *err_ret, const char* filename); /* Parse input coming from a string. Return error code, print some errors. */ node * @@ -48,7 +48,8 @@ struct tok_state *tok; int exec_input = start == file_input; - initerr(err_ret, filename); + if (initerr(err_ret, filename) < 0) + return NULL; if (*flags & PyPARSE_IGNORE_COOKIE) tok = PyTokenizer_FromUTF8(s, exec_input); @@ -59,7 +60,10 @@ return NULL; } - tok->filename = filename ? 
filename : ""; +#ifndef PGEN + Py_INCREF(err_ret->filename); + tok->filename = err_ret->filename; +#endif return parsetok(tok, g, start, err_ret, flags); } @@ -90,13 +94,17 @@ { struct tok_state *tok; - initerr(err_ret, filename); + if (initerr(err_ret, filename) < 0) + return NULL; if ((tok = PyTokenizer_FromFile(fp, (char *)enc, ps1, ps2)) == NULL) { err_ret->error = E_NOMEM; return NULL; } - tok->filename = filename; +#ifndef PGEN + Py_INCREF(err_ret->filename); + tok->filename = err_ret->filename; +#endif return parsetok(tok, g, start, err_ret, flags); } @@ -267,14 +275,24 @@ return n; } -static void +static int initerr(perrdetail *err_ret, const char *filename) { err_ret->error = E_OK; - err_ret->filename = filename; err_ret->lineno = 0; err_ret->offset = 0; err_ret->text = NULL; err_ret->token = -1; err_ret->expected = -1; +#ifndef PGEN + if (filename) + err_ret->filename = PyUnicode_DecodeFSDefault(filename); + else + err_ret->filename = PyUnicode_FromString(""); + if (err_ret->filename == NULL) { + err_ret->error = E_ERROR; + return -1; + } +#endif + return 0; } diff --git a/Parser/parsetok_pgen.c b/Parser/parsetok_pgen.c new file mode 100644 --- /dev/null +++ b/Parser/parsetok_pgen.c @@ -0,0 +1,2 @@ +#define PGEN +#include "parsetok.c" diff --git a/Parser/tokenizer.c b/Parser/tokenizer.c --- a/Parser/tokenizer.c +++ b/Parser/tokenizer.c @@ -128,7 +128,6 @@ tok->prompt = tok->nextprompt = NULL; tok->lineno = 0; tok->level = 0; - tok->filename = NULL; tok->altwarning = 1; tok->alterror = 1; tok->alttabsize = 1; @@ -140,6 +139,7 @@ tok->encoding = NULL; tok->cont_line = 0; #ifndef PGEN + tok->filename = NULL; tok->decoding_readline = NULL; tok->decoding_buffer = NULL; #endif @@ -545,7 +545,6 @@ { char *line = NULL; int badchar = 0; - PyObject *filename; for (;;) { if (tok->decoding_state == STATE_NORMAL) { /* We already have a codec associated with @@ -586,16 +585,12 @@ if (badchar) { /* Need to add 1 to the line number, since this line has not been counted, 
yet. */ - filename = PyUnicode_DecodeFSDefault(tok->filename); - if (filename != NULL) { - PyErr_Format(PyExc_SyntaxError, - "Non-UTF-8 code starting with '\\x%.2x' " - "in file %U on line %i, " - "but no encoding declared; " - "see http://python.org/dev/peps/pep-0263/ for details", - badchar, filename, tok->lineno + 1); - Py_DECREF(filename); - } + PyErr_Format(PyExc_SyntaxError, + "Non-UTF-8 code starting with '\\x%.2x' " + "in file %U on line %i, " + "but no encoding declared; " + "see http://python.org/dev/peps/pep-0263/ for details", + badchar, tok->filename, tok->lineno + 1); return error_ret(tok); } #endif @@ -853,6 +848,7 @@ #ifndef PGEN Py_XDECREF(tok->decoding_readline); Py_XDECREF(tok->decoding_buffer); + Py_XDECREF(tok->filename); #endif if (tok->fp != NULL && tok->buf != NULL) PyMem_FREE(tok->buf); @@ -1247,8 +1243,13 @@ return 1; } if (tok->altwarning) { - PySys_WriteStderr("%s: inconsistent use of tabs and spaces " +#ifdef PGEN + PySys_WriteStderr("inconsistent use of tabs and spaces " + "in indentation\n"); +#else + PySys_FormatStderr("%U: inconsistent use of tabs and spaces " "in indentation\n", tok->filename); +#endif tok->altwarning = 0; } return 0; @@ -1718,6 +1719,11 @@ fclose(fp); return NULL; } +#ifndef PGEN + tok->filename = PyUnicode_FromString(""); + if (tok->filename == NULL) + goto error; +#endif while (tok->lineno < 2 && tok->done == E_OK) { PyTokenizer_Get(tok, &p_start, &p_end); } @@ -1727,6 +1733,9 @@ if (encoding) strcpy(encoding, tok->encoding); } +#ifndef PGEN +error: +#endif PyTokenizer_Free(tok); return encoding; } diff --git a/Parser/tokenizer.h b/Parser/tokenizer.h --- a/Parser/tokenizer.h +++ b/Parser/tokenizer.h @@ -40,7 +40,13 @@ int level; /* () [] {} Parentheses nesting level */ /* Used to allow free continuations inside them */ /* Stuff for checking on different tab sizes */ - const char *filename; /* encoded to the filesystem encoding */ +#ifndef PGEN + /* pgen doesn't have access to Python codecs, it cannot decode the 
input + filename. The bytes filename might be kept, but it is only used by + indenterror() and it is not really needed: pgen only compiles one file + (Grammar/Grammar). */ + PyObject *filename; +#endif int altwarning; /* Issue warning if alternate tabs don't match */ int alterror; /* Issue error if alternate tabs don't match */ int alttabsize; /* Alternate tab spacing */ diff --git a/Python/pythonrun.c b/Python/pythonrun.c --- a/Python/pythonrun.c +++ b/Python/pythonrun.c @@ -62,6 +62,7 @@ static PyObject *run_pyc_file(FILE *, const char *, PyObject *, PyObject *, PyCompilerFlags *); static void err_input(perrdetail *); +static void err_free(perrdetail *); static void initsigs(void); static void call_py_exitfuncs(void); static void wait_for_thread_shutdown(void); @@ -1887,12 +1888,13 @@ flags->cf_flags |= iflags & PyCF_MASK; mod = PyAST_FromNode(n, flags, filename, arena); PyNode_Free(n); - return mod; } else { err_input(&err); - return NULL; + mod = NULL; } + err_free(&err); + return mod; } mod_ty @@ -1917,14 +1919,15 @@ flags->cf_flags |= iflags & PyCF_MASK; mod = PyAST_FromNode(n, flags, filename, arena); PyNode_Free(n); - return mod; } else { err_input(&err); if (errcode) *errcode = err.error; - return NULL; + mod = NULL; } + err_free(&err); + return mod; } /* Simplified interface to parsefile -- return node or set exception */ @@ -1938,6 +1941,7 @@ start, NULL, NULL, &err, flags); if (n == NULL) err_input(&err); + err_free(&err); return n; } @@ -1952,6 +1956,7 @@ start, &err, flags); if (n == NULL) err_input(&err); + err_free(&err); return n; } @@ -1964,6 +1969,7 @@ &_PyParser_Grammar, start, &err, flags); if (n == NULL) err_input(&err); + err_free(&err); return n; } @@ -1977,11 +1983,23 @@ even parser modules. 
*/ void +PyParser_ClearError(perrdetail *err) +{ + err_free(err); +} + +void PyParser_SetError(perrdetail *err) { err_input(err); } +static void +err_free(perrdetail *err) +{ + Py_CLEAR(err->filename); +} + /* Set the error appropriate to the given input error code (see errcode.h) */ static void @@ -1989,7 +2007,6 @@ { PyObject *v, *w, *errtype, *errtext; PyObject *msg_obj = NULL; - PyObject *filename; char *msg = NULL; errtype = PyExc_SyntaxError; @@ -2075,17 +2092,8 @@ errtext = PyUnicode_DecodeUTF8(err->text, strlen(err->text), "replace"); } - if (err->filename != NULL) - filename = PyUnicode_DecodeFSDefault(err->filename); - else { - Py_INCREF(Py_None); - filename = Py_None; - } - if (filename != NULL) - v = Py_BuildValue("(NiiN)", filename, - err->lineno, err->offset, errtext); - else - v = NULL; + v = Py_BuildValue("(OiiN)", err->filename, + err->lineno, err->offset, errtext); if (v != NULL) { if (msg_obj) w = Py_BuildValue("(OO)", msg_obj, v); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 01:48:35 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 05 Apr 2011 01:48:35 +0200 Subject: [Python-checkins] cpython: Issue #9319: Include the filename in "Non-UTF8 code ..." syntax error. Message-ID: http://hg.python.org/cpython/rev/7b8d625eb6e4 changeset: 69148:7b8d625eb6e4 user: Victor Stinner date: Tue Apr 05 01:48:03 2011 +0200 summary: Issue #9319: Include the filename in "Non-UTF8 code ..." syntax error. 
files: Lib/test/test_imp.py | 6 ++++ Misc/NEWS | 2 + Parser/tokenizer.c | 41 +++++++++++++++++++++---------- Parser/tokenizer.h | 1 - Python/import.c | 10 +++--- Python/traceback.c | 6 ++-- 6 files changed, 43 insertions(+), 23 deletions(-) diff --git a/Lib/test/test_imp.py b/Lib/test/test_imp.py --- a/Lib/test/test_imp.py +++ b/Lib/test/test_imp.py @@ -58,6 +58,12 @@ with imp.find_module('module_' + mod, self.test_path)[0] as fd: self.assertEqual(fd.encoding, encoding) + path = [os.path.dirname(__file__)] + self.assertRaisesRegex(SyntaxError, + r"Non-UTF-8 code starting with '\\xf6'" + r" in file .*badsyntax_pep3120.py", + imp.find_module, 'badsyntax_pep3120', path) + def test_issue1267(self): for mod, encoding, _ in self.test_strings: fp, filename, info = imp.find_module('module_' + mod, diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,8 @@ Core and Builtins ----------------- +- Issue #9319: Include the filename in "Non-UTF8 code ..." syntax error. + - Issue #10785: Store the filename as Unicode in the Python parser. - Issue #11619: _PyImport_LoadDynamicModule() doesn't encode the path to bytes diff --git a/Parser/tokenizer.c b/Parser/tokenizer.c --- a/Parser/tokenizer.c +++ b/Parser/tokenizer.c @@ -1690,17 +1690,18 @@ return result; } -/* Get -*- encoding -*- from a Python file. +/* Get the encoding of a Python file. Check for the coding cookie and check if + the file starts with a BOM. - PyTokenizer_FindEncoding returns NULL when it can't find the encoding in - the first or second line of the file (in which case the encoding - should be assumed to be PyUnicode_GetDefaultEncoding()). + PyTokenizer_FindEncodingFilename() returns NULL when it can't find the + encoding in the first or second line of the file (in which case the encoding + should be assumed to be UTF-8). - The char * returned is malloc'ed via PyMem_MALLOC() and thus must be freed - by the caller. 
-*/ + The char* returned is malloc'ed via PyMem_MALLOC() and thus must be freed + by the caller. */ + char * -PyTokenizer_FindEncoding(int fd) +PyTokenizer_FindEncodingFilename(int fd, PyObject *filename) { struct tok_state *tok; FILE *fp; @@ -1720,9 +1721,18 @@ return NULL; } #ifndef PGEN - tok->filename = PyUnicode_FromString(""); - if (tok->filename == NULL) - goto error; + if (filename != NULL) { + Py_INCREF(filename); + tok->filename = filename; + } + else { + tok->filename = PyUnicode_FromString(""); + if (tok->filename == NULL) { + fclose(fp); + PyTokenizer_Free(tok); + return encoding; + } + } #endif while (tok->lineno < 2 && tok->done == E_OK) { PyTokenizer_Get(tok, &p_start, &p_end); @@ -1733,13 +1743,16 @@ if (encoding) strcpy(encoding, tok->encoding); } -#ifndef PGEN -error: -#endif PyTokenizer_Free(tok); return encoding; } +char * +PyTokenizer_FindEncoding(int fd) +{ + return PyTokenizer_FindEncodingFilename(fd, NULL); +} + #ifdef Py_DEBUG void diff --git a/Parser/tokenizer.h b/Parser/tokenizer.h --- a/Parser/tokenizer.h +++ b/Parser/tokenizer.h @@ -75,7 +75,6 @@ extern int PyTokenizer_Get(struct tok_state *, char **, char **); extern char * PyTokenizer_RestoreEncoding(struct tok_state* tok, int len, int *offset); -extern char * PyTokenizer_FindEncoding(int); #ifdef __cplusplus } diff --git a/Python/import.c b/Python/import.c --- a/Python/import.c +++ b/Python/import.c @@ -124,12 +124,12 @@ /* See _PyImport_FixupExtensionObject() below */ static PyObject *extensions = NULL; +/* Function from Parser/tokenizer.c */ +extern char * PyTokenizer_FindEncodingFilename(int, PyObject *); + /* This table is defined in config.c: */ extern struct _inittab _PyImport_Inittab[]; -/* Method from Parser/tokenizer.c */ -extern char * PyTokenizer_FindEncoding(int); - struct _inittab *PyImport_Inittab = _PyImport_Inittab; /* these tables define the module suffixes that Python recognizes */ @@ -3540,9 +3540,9 @@ } if (fd != -1) { if (strchr(fdp->mode, 'b') == NULL) { - /* 
PyTokenizer_FindEncoding() returns PyMem_MALLOC'ed + /* PyTokenizer_FindEncodingFilename() returns PyMem_MALLOC'ed memory. */ - found_encoding = PyTokenizer_FindEncoding(fd); + found_encoding = PyTokenizer_FindEncodingFilename(fd, pathobj); lseek(fd, 0, 0); /* Reset position */ if (found_encoding == NULL && PyErr_Occurred()) { Py_XDECREF(pathobj); diff --git a/Python/traceback.c b/Python/traceback.c --- a/Python/traceback.c +++ b/Python/traceback.c @@ -18,8 +18,8 @@ #define MAX_FRAME_DEPTH 100 #define MAX_NTHREADS 100 -/* Method from Parser/tokenizer.c */ -extern char * PyTokenizer_FindEncoding(int); +/* Function from Parser/tokenizer.c */ +extern char * PyTokenizer_FindEncodingFilename(int, PyObject *); static PyObject * tb_dir(PyTracebackObject *self) @@ -251,7 +251,7 @@ /* use the right encoding to decode the file as unicode */ fd = PyObject_AsFileDescriptor(binary); - found_encoding = PyTokenizer_FindEncoding(fd); + found_encoding = PyTokenizer_FindEncodingFilename(fd, filename); encoding = (found_encoding != NULL) ? 
found_encoding : "utf-8"; lseek(fd, 0, 0); /* Reset position */ fob = PyObject_CallMethod(io, "TextIOWrapper", "Os", binary, encoding); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 02:30:07 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 05 Apr 2011 02:30:07 +0200 Subject: [Python-checkins] cpython: Issue #11768: add debug messages in test_threadsignals.test_signals Message-ID: http://hg.python.org/cpython/rev/d14eac872a46 changeset: 69149:d14eac872a46 user: Victor Stinner date: Tue Apr 05 02:29:30 2011 +0200 summary: Issue #11768: add debug messages in test_threadsignals.test_signals files: Lib/test/test_threadsignals.py | 8 ++++++++ 1 files changed, 8 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_threadsignals.py b/Lib/test/test_threadsignals.py --- a/Lib/test/test_threadsignals.py +++ b/Lib/test/test_threadsignals.py @@ -30,9 +30,14 @@ # a function that will be spawned as a separate thread. def send_signals(): + print("send_signals: enter (thread %s)" % thread.get_ident(), file=sys.stderr) + print("send_signals: raise SIGUSR1", file=sys.stderr) os.kill(process_pid, signal.SIGUSR1) + print("send_signals: raise SIGUSR2", file=sys.stderr) os.kill(process_pid, signal.SIGUSR2) + print("send_signals: release signalled_all", file=sys.stderr) signalled_all.release() + print("send_signals: exit (thread %s)" % thread.get_ident(), file=sys.stderr) class ThreadSignals(unittest.TestCase): @@ -41,9 +46,12 @@ # We spawn a thread, have the thread send two signals, and # wait for it to finish. Check that we got both signals # and that they were run by the main thread. 
+ print("test_signals: acquire lock (thread %s)" % thread.get_ident(), file=sys.stderr) signalled_all.acquire() self.spawnSignallingThread() + print("test_signals: wait lock (thread %s)" % thread.get_ident(), file=sys.stderr) signalled_all.acquire() + print("test_signals: lock acquired", file=sys.stderr) # the signals that we asked the kernel to send # will come back, but we don't know when. # (it might even be after the thread exits -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Tue Apr 5 04:55:50 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Tue, 05 Apr 2011 04:55:50 +0200 Subject: [Python-checkins] Daily reference leaks (d14eac872a46): sum=0 Message-ID: results for d14eac872a46 on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogrsldRs', '-x'] From python-checkins at python.org Tue Apr 5 11:34:06 2011 From: python-checkins at python.org (raymond.hettinger) Date: Tue, 05 Apr 2011 11:34:06 +0200 Subject: [Python-checkins] cpython: Issue #11707: Fast C version of functools.cmp_to_key() Message-ID: http://hg.python.org/cpython/rev/a03fb2fc3ed8 changeset: 69150:a03fb2fc3ed8 user: Raymond Hettinger date: Tue Apr 05 02:33:54 2011 -0700 summary: Issue #11707: Fast C version of functools.cmp_to_key() files: Lib/functools.py | 7 +- Lib/test/test_functools.py | 66 ++++++++++- Misc/NEWS | 3 + Modules/_functoolsmodule.c | 161 +++++++++++++++++++++++++ 4 files changed, 235 insertions(+), 2 deletions(-) diff --git a/Lib/functools.py b/Lib/functools.py --- a/Lib/functools.py +++ b/Lib/functools.py @@ -97,7 +97,7 @@ """Convert a cmp= function into a key= function""" class K(object): __slots__ = ['obj'] - def __init__(self, obj, *args): + def __init__(self, obj): self.obj = obj def __lt__(self, other): return mycmp(self.obj, other.obj) < 0 @@ -115,6 +115,11 @@ raise TypeError('hash not implemented') return K 
+try: + from _functools import cmp_to_key +except ImportError: + pass + _CacheInfo = namedtuple("CacheInfo", "hits misses maxsize currsize") def lru_cache(maxsize=100): diff --git a/Lib/test/test_functools.py b/Lib/test/test_functools.py --- a/Lib/test/test_functools.py +++ b/Lib/test/test_functools.py @@ -435,18 +435,81 @@ self.assertEqual(self.func(add, d), "".join(d.keys())) class TestCmpToKey(unittest.TestCase): + def test_cmp_to_key(self): + def cmp1(x, y): + return (x > y) - (x < y) + key = functools.cmp_to_key(cmp1) + self.assertEqual(key(3), key(3)) + self.assertGreater(key(3), key(1)) + def cmp2(x, y): + return int(x) - int(y) + key = functools.cmp_to_key(cmp2) + self.assertEqual(key(4.0), key('4')) + self.assertLess(key(2), key('35')) + + def test_cmp_to_key_arguments(self): + def cmp1(x, y): + return (x > y) - (x < y) + key = functools.cmp_to_key(mycmp=cmp1) + self.assertEqual(key(obj=3), key(obj=3)) + self.assertGreater(key(obj=3), key(obj=1)) + with self.assertRaises((TypeError, AttributeError)): + key(3) > 1 # rhs is not a K object + with self.assertRaises((TypeError, AttributeError)): + 1 < key(3) # lhs is not a K object + with self.assertRaises(TypeError): + key = functools.cmp_to_key() # too few args + with self.assertRaises(TypeError): + key = functools.cmp_to_key(cmp1, None) # too many args + key = functools.cmp_to_key(cmp1) + with self.assertRaises(TypeError): + key() # too few args + with self.assertRaises(TypeError): + key(None, None) # too many args + + def test_bad_cmp(self): + def cmp1(x, y): + raise ZeroDivisionError + key = functools.cmp_to_key(cmp1) + with self.assertRaises(ZeroDivisionError): + key(3) > key(1) + + class BadCmp: + def __lt__(self, other): + raise ZeroDivisionError + def cmp1(x, y): + return BadCmp() + with self.assertRaises(ZeroDivisionError): + key(3) > key(1) + + def test_obj_field(self): + def cmp1(x, y): + return (x > y) - (x < y) + key = functools.cmp_to_key(mycmp=cmp1) + self.assertEqual(key(50).obj, 50) + + def 
test_sort_int(self): def mycmp(x, y): return y - x self.assertEqual(sorted(range(5), key=functools.cmp_to_key(mycmp)), [4, 3, 2, 1, 0]) + def test_sort_int_str(self): + def mycmp(x, y): + x, y = int(x), int(y) + return (x > y) - (x < y) + values = [5, '3', 7, 2, '0', '1', 4, '10', 1] + values = sorted(values, key=functools.cmp_to_key(mycmp)) + self.assertEqual([int(value) for value in values], + [0, 1, 1, 2, 3, 4, 5, 7, 10]) + def test_hash(self): def mycmp(x, y): return y - x key = functools.cmp_to_key(mycmp) k = key(10) - self.assertRaises(TypeError, hash(k)) + self.assertRaises(TypeError, hash, k) class TestTotalOrdering(unittest.TestCase): @@ -655,6 +718,7 @@ def test_main(verbose=None): test_classes = ( + TestCmpToKey, TestPartial, TestPartialSubclass, TestPythonPartial, diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -97,6 +97,9 @@ - Issue #10791: Implement missing method GzipFile.read1(), allowing GzipFile to be wrapped in a TextIOWrapper. Patch by Nadeem Vawda. +- Issue #11707: Added a fast C version of functools.cmp_to_key(). + Patch by Filip Gruszczyński. + - Issue #11688: Add sqlite3.Connection.set_trace_callback(). Patch by Torsten Landschoff.
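Note for readers of this archive: the NEWS entry above concerns functools.cmp_to_key(), which wraps an old-style three-way comparison function so that it can be passed as a key= argument. The following is a minimal sketch of that usage, modelled on the test_sort_int_str case in the diff; the name compare_as_int is illustrative and is not part of the patch.

```python
import functools

def compare_as_int(x, y):
    """Old-style comparison: negative, zero, or positive result."""
    x, y = int(x), int(y)
    return (x > y) - (x < y)

# Mixed ints and numeric strings, as in the test added by the patch.
values = [5, '3', 7, 2, '0', '1', 4, '10', 1]

# cmp_to_key() turns the comparison function into a key function.
ordered = sorted(values, key=functools.cmp_to_key(compare_as_int))
as_ints = [int(v) for v in ordered]

# The wrapper object exposes the wrapped value as .obj.
wrapped = functools.cmp_to_key(compare_as_int)(50)
```

Whether the C or the pure Python implementation is picked up should not matter to calling code such as this, which is the point of the accelerated version.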
diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c --- a/Modules/_functoolsmodule.c +++ b/Modules/_functoolsmodule.c @@ -330,6 +330,165 @@ }; +/* cmp_to_key ***************************************************************/ + +typedef struct { + PyObject_HEAD; + PyObject *cmp; + PyObject *object; +} keyobject; + +static void +keyobject_dealloc(keyobject *ko) +{ + Py_DECREF(ko->cmp); + Py_XDECREF(ko->object); + PyObject_FREE(ko); +} + +static int +keyobject_traverse(keyobject *ko, visitproc visit, void *arg) +{ + Py_VISIT(ko->cmp); + if (ko->object) + Py_VISIT(ko->object); + return 0; +} + +static PyMemberDef keyobject_members[] = { + {"obj", T_OBJECT, + offsetof(keyobject, object), 0, + PyDoc_STR("Value wrapped by a key function.")}, + {NULL} +}; + +static PyObject * +keyobject_call(keyobject *ko, PyObject *args, PyObject *kw); + +static PyObject * +keyobject_richcompare(PyObject *ko, PyObject *other, int op); + +static PyTypeObject keyobject_type = { + PyVarObject_HEAD_INIT(&PyType_Type, 0) + "functools.KeyWrapper", /* tp_name */ + sizeof(keyobject), /* tp_basicsize */ + 0, /* tp_itemsize */ + /* methods */ + (destructor)keyobject_dealloc, /* tp_dealloc */ + 0, /* tp_print */ + 0, /* tp_getattr */ + 0, /* tp_setattr */ + 0, /* tp_reserved */ + 0, /* tp_repr */ + 0, /* tp_as_number */ + 0, /* tp_as_sequence */ + 0, /* tp_as_mapping */ + 0, /* tp_hash */ + (ternaryfunc)keyobject_call, /* tp_call */ + 0, /* tp_str */ + PyObject_GenericGetAttr, /* tp_getattro */ + 0, /* tp_setattro */ + 0, /* tp_as_buffer */ + Py_TPFLAGS_DEFAULT, /* tp_flags */ + 0, /* tp_doc */ + (traverseproc)keyobject_traverse, /* tp_traverse */ + 0, /* tp_clear */ + keyobject_richcompare, /* tp_richcompare */ + 0, /* tp_weaklistoffset */ + 0, /* tp_iter */ + 0, /* tp_iternext */ + 0, /* tp_methods */ + keyobject_members, /* tp_members */ + 0, /* tp_getset */ +}; + +static PyObject * +keyobject_call(keyobject *ko, PyObject *args, PyObject *kwds) +{ + PyObject *object; + keyobject 
*result; + static char *kwargs[] = {"obj", NULL}; + + if (!PyArg_ParseTupleAndKeywords(args, kwds, "O:K", kwargs, &object)) + return NULL; + result = PyObject_New(keyobject, &keyobject_type); + if (!result) + return NULL; + Py_INCREF(ko->cmp); + result->cmp = ko->cmp; + Py_INCREF(object); + result->object = object; + return (PyObject *)result; +} + +static PyObject * +keyobject_richcompare(PyObject *ko, PyObject *other, int op) +{ + PyObject *res; + PyObject *args; + PyObject *x; + PyObject *y; + PyObject *compare; + PyObject *answer; + static PyObject *zero; + + if (zero == NULL) { + zero = PyLong_FromLong(0); + if (!zero) + return NULL; + } + + if (Py_TYPE(other) != &keyobject_type){ + PyErr_Format(PyExc_TypeError, "other argument must be K instance"); + return NULL; + } + compare = ((keyobject *) ko)->cmp; + assert(compare != NULL); + x = ((keyobject *) ko)->object; + y = ((keyobject *) other)->object; + if (!x || !y){ + PyErr_Format(PyExc_AttributeError, "object"); + return NULL; + } + + /* Call the user's comparison function and translate the 3-way + * result into true or false (or error). 
+ */ + args = PyTuple_New(2); + if (args == NULL) + return NULL; + Py_INCREF(x); + Py_INCREF(y); + PyTuple_SET_ITEM(args, 0, x); + PyTuple_SET_ITEM(args, 1, y); + res = PyObject_Call(compare, args, NULL); + Py_DECREF(args); + if (res == NULL) + return NULL; + answer = PyObject_RichCompare(res, zero, op); + Py_DECREF(res); + return answer; +} + +static PyObject * +functools_cmp_to_key(PyObject *self, PyObject *args, PyObject *kwds){ + PyObject *cmp; + static char *kwargs[] = {"mycmp", NULL}; + + if (!PyArg_ParseTupleAndKeywords(args, kwds, "O:cmp_to_key", kwargs, &cmp)) + return NULL; + keyobject *object = PyObject_New(keyobject, &keyobject_type); + if (!object) + return NULL; + Py_INCREF(cmp); + object->cmp = cmp; + object->object = NULL; + return (PyObject *)object; +} + +PyDoc_STRVAR(functools_cmp_to_key_doc, +"Convert a cmp= function into a key= function."); + /* reduce (used to be a builtin) ********************************************/ static PyObject * @@ -413,6 +572,8 @@ static PyMethodDef module_methods[] = { {"reduce", functools_reduce, METH_VARARGS, functools_reduce_doc}, + {"cmp_to_key", functools_cmp_to_key, METH_VARARGS | METH_KEYWORDS, + functools_cmp_to_key_doc}, {NULL, NULL} /* sentinel */ }; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 12:21:51 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 05 Apr 2011 12:21:51 +0200 Subject: [Python-checkins] cpython: Issue #11707: Fix compilation errors with Visual Studio Message-ID: http://hg.python.org/cpython/rev/76ed6a061ebe changeset: 69151:76ed6a061ebe user: Victor Stinner date: Tue Apr 05 12:21:35 2011 +0200 summary: Issue #11707: Fix compilation errors with Visual Studio Fix also a compiler (gcc) warning. 
files: Modules/_functoolsmodule.c | 14 ++++++++------ 1 files changed, 8 insertions(+), 6 deletions(-) diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c --- a/Modules/_functoolsmodule.c +++ b/Modules/_functoolsmodule.c @@ -333,7 +333,7 @@ /* cmp_to_key ***************************************************************/ typedef struct { - PyObject_HEAD; + PyObject_HEAD PyObject *cmp; PyObject *object; } keyobject; @@ -471,13 +471,15 @@ } static PyObject * -functools_cmp_to_key(PyObject *self, PyObject *args, PyObject *kwds){ - PyObject *cmp; +functools_cmp_to_key(PyObject *self, PyObject *args, PyObject *kwds) +{ + PyObject *cmp; static char *kwargs[] = {"mycmp", NULL}; + keyobject *object; if (!PyArg_ParseTupleAndKeywords(args, kwds, "O:cmp_to_key", kwargs, &cmp)) return NULL; - keyobject *object = PyObject_New(keyobject, &keyobject_type); + object = PyObject_New(keyobject, &keyobject_type); if (!object) return NULL; Py_INCREF(cmp); @@ -572,8 +574,8 @@ static PyMethodDef module_methods[] = { {"reduce", functools_reduce, METH_VARARGS, functools_reduce_doc}, - {"cmp_to_key", functools_cmp_to_key, METH_VARARGS | METH_KEYWORDS, - functools_cmp_to_key_doc}, + {"cmp_to_key", (PyCFunction)functools_cmp_to_key, + METH_VARARGS | METH_KEYWORDS, functools_cmp_to_key_doc}, {NULL, NULL} /* sentinel */ }; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 13:16:12 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 05 Apr 2011 13:16:12 +0200 Subject: [Python-checkins] cpython: Issue #11757: subprocess ensures that select() and poll() timeout >= 0 Message-ID: http://hg.python.org/cpython/rev/3664fc29e867 changeset: 69152:3664fc29e867 user: Victor Stinner date: Tue Apr 05 13:13:08 2011 +0200 summary: Issue #11757: subprocess ensures that select() and poll() timeout >= 0 files: Lib/subprocess.py | 33 +++++++++++++++++++-------------- 1 files changed, 19 insertions(+), 14 deletions(-) diff --git 
a/Lib/subprocess.py b/Lib/subprocess.py --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -817,15 +817,10 @@ if self._communication_started and input: raise ValueError("Cannot send input after starting communication") - if timeout is not None: - endtime = time.time() + timeout - else: - endtime = None - # Optimization: If we are not worried about timeouts, we haven't # started communicating, and we have one or zero pipes, using select() # or threads is unnecessary. - if (endtime is None and not self._communication_started and + if (timeout is None and not self._communication_started and [self.stdin, self.stdout, self.stderr].count(None) >= 2): stdout = None stderr = None @@ -840,14 +835,18 @@ stderr = self.stderr.read() self.stderr.close() self.wait() - return (stdout, stderr) + else: + if timeout is not None: + endtime = time.time() + timeout + else: + endtime = None - try: - stdout, stderr = self._communicate(input, endtime, timeout) - finally: - self._communication_started = True + try: + stdout, stderr = self._communicate(input, endtime, timeout) + finally: + self._communication_started = True - sts = self.wait(timeout=self._remaining_time(endtime)) + sts = self.wait(timeout=self._remaining_time(endtime)) return (stdout, stderr) @@ -1604,8 +1603,11 @@ self._input = self._input.encode(self.stdin.encoding) while self._fd2file: + timeout = self._remaining_time(endtime) + if timeout is not None and timeout < 0: + raise TimeoutExpired(self.args, orig_timeout) try: - ready = poller.poll(self._remaining_time(endtime)) + ready = poller.poll(timeout) except select.error as e: if e.args[0] == errno.EINTR: continue @@ -1664,10 +1666,13 @@ stderr = self._stderr_buff while self._read_set or self._write_set: + timeout = self._remaining_time(endtime) + if timeout is not None and timeout < 0: + raise TimeoutExpired(self.args, orig_timeout) try: (rlist, wlist, xlist) = \ select.select(self._read_set, self._write_set, [], - self._remaining_time(endtime)) + timeout) except 
select.error as e: if e.args[0] == errno.EINTR: continue -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 16:10:14 2011 From: python-checkins at python.org (ross.lagerwall) Date: Tue, 05 Apr 2011 16:10:14 +0200 Subject: [Python-checkins] cpython (2.7): Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. Message-ID: http://hg.python.org/cpython/rev/c10d55c51d81 changeset: 69153:c10d55c51d81 branch: 2.7 parent: 69125:f961e9179998 user: Ross Lagerwall date: Tue Apr 05 15:24:34 2011 +0200 summary: Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. files: Lib/subprocess.py | 45 ++++++++++++++++++------ Lib/test/test_subprocess.py | 18 ++++++++++ Misc/NEWS | 2 + 3 files changed, 54 insertions(+), 11 deletions(-) diff --git a/Lib/subprocess.py b/Lib/subprocess.py --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -396,6 +396,7 @@ import traceback import gc import signal +import errno # Exception classes used by this module. 
class CalledProcessError(Exception): @@ -427,7 +428,6 @@ else: import select _has_poll = hasattr(select, 'poll') - import errno import fcntl import pickle @@ -726,7 +726,11 @@ stderr = None if self.stdin: if input: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE and e.errno != errno.EINVAL: + raise self.stdin.close() elif self.stdout: stdout = self.stdout.read() @@ -956,7 +960,11 @@ if self.stdin: if input is not None: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE: + raise self.stdin.close() if self.stdout: @@ -1336,9 +1344,16 @@ for fd, mode in ready: if mode & select.POLLOUT: chunk = input[input_offset : input_offset + _PIPE_BUF] - input_offset += os.write(fd, chunk) - if input_offset >= len(input): - close_unregister_and_remove(fd) + try: + input_offset += os.write(fd, chunk) + except OSError as e: + if e.errno == errno.EPIPE: + close_unregister_and_remove(fd) + else: + raise + else: + if input_offset >= len(input): + close_unregister_and_remove(fd) elif mode & select_POLLIN_POLLPRI: data = os.read(fd, 4096) if not data: @@ -1377,11 +1392,19 @@ if self.stdin in wlist: chunk = input[input_offset : input_offset + _PIPE_BUF] - bytes_written = os.write(self.stdin.fileno(), chunk) - input_offset += bytes_written - if input_offset >= len(input): - self.stdin.close() - write_set.remove(self.stdin) + try: + bytes_written = os.write(self.stdin.fileno(), chunk) + except OSError as e: + if e.errno == errno.EPIPE: + self.stdin.close() + write_set.remove(self.stdin) + else: + raise + else: + input_offset += bytes_written + if input_offset >= len(input): + self.stdin.close() + write_set.remove(self.stdin) if self.stdout in rlist: data = os.read(self.stdout.fileno(), 1024) diff --git a/Lib/test/test_subprocess.py b/Lib/test/test_subprocess.py --- a/Lib/test/test_subprocess.py +++ b/Lib/test/test_subprocess.py @@ -597,6 +597,24 @@ 
self.assertFalse(os.path.exists(ofname)) self.assertFalse(os.path.exists(efname)) + def test_communicate_epipe(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE) + self.addCleanup(p.stdout.close) + self.addCleanup(p.stderr.close) + self.addCleanup(p.stdin.close) + p.communicate("x" * 2**20) + + def test_communicate_epipe_only_stdin(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE) + self.addCleanup(p.stdin.close) + time.sleep(2) + p.communicate("x" * 2**20) # context manager class _SuppressCoreFiles(object): diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -47,6 +47,8 @@ Library ------- +- Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. + - Issue #11662: Make urllib and urllib2 ignore redirections if the scheme is not HTTP, HTTPS or FTP (CVE-2011-1521). -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 16:10:21 2011 From: python-checkins at python.org (ross.lagerwall) Date: Tue, 05 Apr 2011 16:10:21 +0200 Subject: [Python-checkins] cpython (3.1): Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. Message-ID: http://hg.python.org/cpython/rev/158495d49f58 changeset: 69154:158495d49f58 branch: 3.1 parent: 69142:04b5cd2f8c87 user: Ross Lagerwall date: Tue Apr 05 15:34:00 2011 +0200 summary: Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. files: Lib/subprocess.py | 45 ++++++++++++++++++------ Lib/test/test_subprocess.py | 19 ++++++++++ Misc/NEWS | 2 + 3 files changed, 55 insertions(+), 11 deletions(-) diff --git a/Lib/subprocess.py b/Lib/subprocess.py --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -326,6 +326,7 @@ import traceback import gc import signal +import errno # Exception classes used by this module. 
class CalledProcessError(Exception): @@ -358,7 +359,6 @@ else: import select _has_poll = hasattr(select, 'poll') - import errno import fcntl import pickle @@ -699,7 +699,11 @@ stderr = None if self.stdin: if input: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE and e.errno != errno.EINVAL: + raise self.stdin.close() elif self.stdout: stdout = self.stdout.read() @@ -929,7 +933,11 @@ if self.stdin: if input is not None: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE: + raise self.stdin.close() if self.stdout: @@ -1290,9 +1298,16 @@ for fd, mode in ready: if mode & select.POLLOUT: chunk = input[input_offset : input_offset + _PIPE_BUF] - input_offset += os.write(fd, chunk) - if input_offset >= len(input): - close_unregister_and_remove(fd) + try: + input_offset += os.write(fd, chunk) + except OSError as e: + if e.errno == errno.EPIPE: + close_unregister_and_remove(fd) + else: + raise + else: + if input_offset >= len(input): + close_unregister_and_remove(fd) elif mode & select_POLLIN_POLLPRI: data = os.read(fd, 4096) if not data: @@ -1334,11 +1349,19 @@ if self.stdin in wlist: chunk = input[input_offset : input_offset + _PIPE_BUF] - bytes_written = os.write(self.stdin.fileno(), chunk) - input_offset += bytes_written - if input_offset >= len(input): - self.stdin.close() - write_set.remove(self.stdin) + try: + bytes_written = os.write(self.stdin.fileno(), chunk) + except OSError as e: + if e.errno == errno.EPIPE: + self.stdin.close() + write_set.remove(self.stdin) + else: + raise + else: + input_offset += bytes_written + if input_offset >= len(input): + self.stdin.close() + write_set.remove(self.stdin) if self.stdout in rlist: data = os.read(self.stdout.fileno(), 1024) diff --git a/Lib/test/test_subprocess.py b/Lib/test/test_subprocess.py --- a/Lib/test/test_subprocess.py +++ b/Lib/test/test_subprocess.py @@ -592,6 +592,25 @@ 
self.assertFalse(os.path.exists(ofname)) self.assertFalse(os.path.exists(efname)) + def test_communicate_epipe(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE) + self.addCleanup(p.stdout.close) + self.addCleanup(p.stderr.close) + self.addCleanup(p.stdin.close) + p.communicate(b"x" * 2**20) + + def test_communicate_epipe_only_stdin(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE) + self.addCleanup(p.stdin.close) + time.sleep(2) + p.communicate(b"x" * 2**20) + # # POSIX tests # diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -44,6 +44,8 @@ Library ------- +- Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. + - Issue #11696: Fix ID generation in msilib. - Issue #9696: Fix exception incorrectly raised by xdrlib.Packer.pack_int when -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 16:10:23 2011 From: python-checkins at python.org (ross.lagerwall) Date: Tue, 05 Apr 2011 16:10:23 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Merge with 3.1 Message-ID: http://hg.python.org/cpython/rev/a7363288c8d4 changeset: 69155:a7363288c8d4 branch: 3.2 parent: 69143:8d5ea25d79d0 parent: 69154:158495d49f58 user: Ross Lagerwall date: Tue Apr 05 15:48:47 2011 +0200 summary: Merge with 3.1 files: Lib/subprocess.py | 45 ++++++++++++++++++------ Lib/test/test_subprocess.py | 19 ++++++++++ Misc/NEWS | 2 + Objects/typeslots.inc | 2 +- 4 files changed, 56 insertions(+), 12 deletions(-) diff --git a/Lib/subprocess.py b/Lib/subprocess.py --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -345,6 +345,7 @@ import signal import builtins import warnings +import errno # Exception classes used by this module. 
class CalledProcessError(Exception): @@ -376,7 +377,6 @@ else: import select _has_poll = hasattr(select, 'poll') - import errno import fcntl import pickle @@ -785,7 +785,11 @@ stderr = None if self.stdin: if input: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE and e.errno != errno.EINVAL: + raise self.stdin.close() elif self.stdout: stdout = self.stdout.read() @@ -1019,7 +1023,11 @@ if self.stdin: if input is not None: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE: + raise self.stdin.close() if self.stdout: @@ -1455,9 +1463,16 @@ for fd, mode in ready: if mode & select.POLLOUT: chunk = input[input_offset : input_offset + _PIPE_BUF] - input_offset += os.write(fd, chunk) - if input_offset >= len(input): - close_unregister_and_remove(fd) + try: + input_offset += os.write(fd, chunk) + except OSError as e: + if e.errno == errno.EPIPE: + close_unregister_and_remove(fd) + else: + raise + else: + if input_offset >= len(input): + close_unregister_and_remove(fd) elif mode & select_POLLIN_POLLPRI: data = os.read(fd, 4096) if not data: @@ -1499,11 +1514,19 @@ if self.stdin in wlist: chunk = input[input_offset : input_offset + _PIPE_BUF] - bytes_written = os.write(self.stdin.fileno(), chunk) - input_offset += bytes_written - if input_offset >= len(input): - self.stdin.close() - write_set.remove(self.stdin) + try: + bytes_written = os.write(self.stdin.fileno(), chunk) + except OSError as e: + if e.errno == errno.EPIPE: + self.stdin.close() + write_set.remove(self.stdin) + else: + raise + else: + input_offset += bytes_written + if input_offset >= len(input): + self.stdin.close() + write_set.remove(self.stdin) if self.stdout in rlist: data = os.read(self.stdout.fileno(), 1024) diff --git a/Lib/test/test_subprocess.py b/Lib/test/test_subprocess.py --- a/Lib/test/test_subprocess.py +++ b/Lib/test/test_subprocess.py @@ -626,6 +626,25 @@ 
self.assertFalse(os.path.exists(ofname)) self.assertFalse(os.path.exists(efname)) + def test_communicate_epipe(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE) + self.addCleanup(p.stdout.close) + self.addCleanup(p.stderr.close) + self.addCleanup(p.stdin.close) + p.communicate(b"x" * 2**20) + + def test_communicate_epipe_only_stdin(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE) + self.addCleanup(p.stdin.close) + time.sleep(2) + p.communicate(b"x" * 2**20) + # context manager class _SuppressCoreFiles(object): diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -49,6 +49,8 @@ Library ------- +- Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. + - Issue #11746: Fix SSLContext.load_cert_chain() to accept elliptic curve private keys. 
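Note for readers of this archive: the subprocess change above wraps each write to the child's stdin in a try/except so that a child which has already exited does not make communicate() fail with EPIPE. The following is a rough sketch of that idea outside subprocess, assuming Python 3.3 or later (where IOError is an alias of OSError); the helper name write_ignoring_epipe is hypothetical and does not appear in the patch.

```python
import errno
import os

def write_ignoring_epipe(stream, data):
    """Write data to stream; report a closed reader instead of raising.

    Returns True if the write succeeded and False if the other end of the
    pipe was already closed (EPIPE, or EINVAL as seen on some platforms).
    Any other OSError is re-raised unchanged.
    """
    try:
        stream.write(data)
    except OSError as e:
        if e.errno not in (errno.EPIPE, errno.EINVAL):
            raise
        return False
    return True

# Demonstration: a pipe whose read end is closed before we write to it.
read_fd, write_fd = os.pipe()
os.close(read_fd)
with os.fdopen(write_fd, "wb", buffering=0) as writer:
    delivered = write_ignoring_epipe(writer, b"x" * 1024)
```

As in the patch, only the specific "reader went away" errors are swallowed; everything else still propagates to the caller.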
diff --git a/Objects/typeslots.inc b/Objects/typeslots.inc --- a/Objects/typeslots.inc +++ b/Objects/typeslots.inc @@ -1,4 +1,4 @@ -/* Generated by typeslots.py $Revision: 87806 $ */ +/* Generated by typeslots.py $Revision$ */ 0, 0, offsetof(PyHeapTypeObject, as_mapping.mp_ass_subscript), -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 16:10:25 2011 From: python-checkins at python.org (ross.lagerwall) Date: Tue, 05 Apr 2011 16:10:25 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge with 3.2 Message-ID: http://hg.python.org/cpython/rev/c81ad4361c49 changeset: 69156:c81ad4361c49 parent: 69152:3664fc29e867 parent: 69155:a7363288c8d4 user: Ross Lagerwall date: Tue Apr 05 16:07:49 2011 +0200 summary: Merge with 3.2 files: Lib/subprocess.py | 45 ++++++++++++++++++------ Lib/test/test_subprocess.py | 19 ++++++++++ Misc/NEWS | 2 + 3 files changed, 55 insertions(+), 11 deletions(-) diff --git a/Lib/subprocess.py b/Lib/subprocess.py --- a/Lib/subprocess.py +++ b/Lib/subprocess.py @@ -348,6 +348,7 @@ import signal import builtins import warnings +import errno # Exception classes used by this module. class SubprocessError(Exception): pass @@ -396,7 +397,6 @@ else: import select _has_poll = hasattr(select, 'poll') - import errno import fcntl import pickle @@ -826,7 +826,11 @@ stderr = None if self.stdin: if input: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE and e.errno != errno.EINVAL: + raise self.stdin.close() elif self.stdout: stdout = self.stdout.read() @@ -1104,7 +1108,11 @@ if self.stdin: if input is not None: - self.stdin.write(input) + try: + self.stdin.write(input) + except IOError as e: + if e.errno != errno.EPIPE: + raise self.stdin.close() # Wait for the reader threads, or time out. 
If we time out, the @@ -1621,9 +1629,16 @@ if mode & select.POLLOUT: chunk = self._input[self._input_offset : self._input_offset + _PIPE_BUF] - self._input_offset += os.write(fd, chunk) - if self._input_offset >= len(self._input): - close_unregister_and_remove(fd) + try: + self._input_offset += os.write(fd, chunk) + except OSError as e: + if e.errno == errno.EPIPE: + close_unregister_and_remove(fd) + else: + raise + else: + if self._input_offset >= len(self._input): + close_unregister_and_remove(fd) elif mode & select_POLLIN_POLLPRI: data = os.read(fd, 4096) if not data: @@ -1691,11 +1706,19 @@ if self.stdin in wlist: chunk = self._input[self._input_offset : self._input_offset + _PIPE_BUF] - bytes_written = os.write(self.stdin.fileno(), chunk) - self._input_offset += bytes_written - if self._input_offset >= len(self._input): - self.stdin.close() - self._write_set.remove(self.stdin) + try: + bytes_written = os.write(self.stdin.fileno(), chunk) + except OSError as e: + if e.errno == errno.EPIPE: + self.stdin.close() + self._write_set.remove(self.stdin) + else: + raise + else: + self._input_offset += bytes_written + if self._input_offset >= len(self._input): + self.stdin.close() + self._write_set.remove(self.stdin) if self.stdout in rlist: data = os.read(self.stdout.fileno(), 1024) diff --git a/Lib/test/test_subprocess.py b/Lib/test/test_subprocess.py --- a/Lib/test/test_subprocess.py +++ b/Lib/test/test_subprocess.py @@ -720,6 +720,25 @@ self.assertFalse(os.path.exists(ofname)) self.assertFalse(os.path.exists(efname)) + def test_communicate_epipe(self): + # Issue 10963: communicate() should hide EPIPE + p = subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE) + self.addCleanup(p.stdout.close) + self.addCleanup(p.stderr.close) + self.addCleanup(p.stdin.close) + p.communicate(b"x" * 2**20) + + def test_communicate_epipe_only_stdin(self): + # Issue 10963: communicate() should hide EPIPE + p = 
subprocess.Popen([sys.executable, "-c", 'pass'], + stdin=subprocess.PIPE) + self.addCleanup(p.stdin.close) + time.sleep(2) + p.communicate(b"x" * 2**20) + # context manager class _SuppressCoreFiles(object): diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -94,6 +94,8 @@ Library ------- +- Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. + - Issue #10791: Implement missing method GzipFile.read1(), allowing GzipFile to be wrapped in a TextIOWrapper. Patch by Nadeem Vawda. -- Repository URL: http://hg.python.org/cpython From ezio.melotti at gmail.com Tue Apr 5 16:53:04 2011 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Tue, 05 Apr 2011 17:53:04 +0300 Subject: [Python-checkins] peps: Draft of PEP 399: Pure Python/C Accelerator Module Compatibiilty Requirements In-Reply-To: References: Message-ID: <4D9B2CD0.7060002@gmail.com> Hi, On 05/04/2011 2.37, brett.cannon wrote: > http://hg.python.org/peps/rev/7b9a5b01b479 > changeset: 3855:7b9a5b01b479 > user: Brett Cannon > date: Mon Apr 04 16:37:07 2011 -0700 > summary: > Draft of PEP 399: Pure Python/C Accelerator Module Compatibiilty Requirements > > files: > pep-0399.txt | 205 +++++++++++++++++++++++++++++++++++++++ > 1 files changed, 205 insertions(+), 0 deletions(-) > > > diff --git a/pep-0399.txt b/pep-0399.txt > new file mode 100644 > --- /dev/null > +++ b/pep-0399.txt > @@ -0,0 +1,205 @@ > +PEP: 399 > +Title: Pure Python/C Accelerator Module Compatibiilty Requirements > +Version: $Revision: 88219 $ > +Last-Modified: $Date: 2011-01-27 13:47:00 -0800 (Thu, 27 Jan 2011) $ > +Author: Brett Cannon > +Status: Draft > +Type: Informational > +Content-Type: text/x-rst > +Created: 04-Apr-2011 > +Python-Version: 3.3 > +Post-History: > + > [...] > + > +Any accelerated code must be semantically identical to the pure Python > +implementation. 
The only time any semantics are allowed to be > +different are when technical details of the VM providing the > +accelerated code prevent matching semantics from being possible, e.g., > +a class being a ``type`` when implemented in C. The semantics > +equivalence requirement also dictates that no public API be provided > +in accelerated code that does not exist in the pure Python code. > +Without this requirement people could accidentally come to rely on a > +detail in the acclerated code which is not made available to other VMs s/acclerated/accelerated/ > +that use the pure Python implementation. To help verify that the > +contract of semantic equivalence is being met, a module must be tested > +both with and without its accelerated code as thoroughly as possible. > + > +As an example, to write tests which exercise both the pure Python and > +C acclerated versions of a module, a basic idiom can be followed:: ditto > + > + import collections.abc > + from test.support import import_fresh_module, run_unittest > + import unittest > + > + c_heapq = import_fresh_module('heapq', fresh=['_heapq']) > + py_heapq = import_fresh_module('heapq', blocked=['_heapq']) > + > + > + class ExampleTest(unittest.TestCase): > [...] > +Copyright > +========= > + > +This document has been placed in the public domain. > + > + > +.. _IronPython: http://ironpython.net/ > +.. _Jython: http://www.jython.org/ > +.. _PyPy: http://pypy.org/ > +.. _C API of Python: http://docs.python.org/py3k/c-api/index.html > +.. 
_sqlite3: http://docs.python.org/py3k/library/sqlite3.html > Best Regards, Ezio Melotti From python-checkins at python.org Tue Apr 5 18:13:19 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 05 Apr 2011 18:13:19 +0200 Subject: [Python-checkins] cpython (3.1): Try to fix sporadic test_multiprocessing failure Message-ID: http://hg.python.org/cpython/rev/7a1ef59d765b changeset: 69157:7a1ef59d765b branch: 3.1 parent: 69154:158495d49f58 user: Antoine Pitrou date: Tue Apr 05 18:11:33 2011 +0200 summary: Try to fix sporadic test_multiprocessing failure files: Lib/test/test_multiprocessing.py | 12 +++++++++++- 1 files changed, 11 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_multiprocessing.py b/Lib/test/test_multiprocessing.py --- a/Lib/test/test_multiprocessing.py +++ b/Lib/test/test_multiprocessing.py @@ -12,6 +12,7 @@ import sys import os import gc +import errno import signal import array import socket @@ -1277,7 +1278,16 @@ manager.shutdown() manager = QueueManager( address=addr, authkey=authkey, serializer=SERIALIZER) - manager.start() + try: + manager.start() + except IOError as e: + if e.errno != errno.EADDRINUSE: + raise + # Retry after some time, in case the old socket was lingering + # (sporadic failure on buildbots) + time.sleep(1.0) + manager = QueueManager( + address=addr, authkey=authkey, serializer=SERIALIZER) manager.shutdown() # -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 18:13:20 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 05 Apr 2011 18:13:20 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Try to fix sporadic test_multiprocessing failure Message-ID: http://hg.python.org/cpython/rev/8a65e6aff672 changeset: 69158:8a65e6aff672 branch: 3.2 parent: 69155:a7363288c8d4 parent: 69157:7a1ef59d765b user: Antoine Pitrou date: Tue Apr 05 18:12:15 2011 +0200 summary: Try to fix sporadic test_multiprocessing failure files: 
Lib/test/test_multiprocessing.py | 12 +++++++++++- 1 files changed, 11 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_multiprocessing.py b/Lib/test/test_multiprocessing.py --- a/Lib/test/test_multiprocessing.py +++ b/Lib/test/test_multiprocessing.py @@ -11,6 +11,7 @@ import sys import os import gc +import errno import signal import array import socket @@ -1359,7 +1360,16 @@ manager.shutdown() manager = QueueManager( address=addr, authkey=authkey, serializer=SERIALIZER) - manager.start() + try: + manager.start() + except IOError as e: + if e.errno != errno.EADDRINUSE: + raise + # Retry after some time, in case the old socket was lingering + # (sporadic failure on buildbots) + time.sleep(1.0) + manager = QueueManager( + address=addr, authkey=authkey, serializer=SERIALIZER) manager.shutdown() # -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 18:13:24 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 05 Apr 2011 18:13:24 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Try to fix sporadic test_multiprocessing failure Message-ID: http://hg.python.org/cpython/rev/a9371cf1cc61 changeset: 69159:a9371cf1cc61 parent: 69156:c81ad4361c49 parent: 69158:8a65e6aff672 user: Antoine Pitrou date: Tue Apr 05 18:13:06 2011 +0200 summary: Try to fix sporadic test_multiprocessing failure files: Lib/test/test_multiprocessing.py | 12 +++++++++++- 1 files changed, 11 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_multiprocessing.py b/Lib/test/test_multiprocessing.py --- a/Lib/test/test_multiprocessing.py +++ b/Lib/test/test_multiprocessing.py @@ -11,6 +11,7 @@ import sys import os import gc +import errno import signal import array import socket @@ -1371,7 +1372,16 @@ manager.shutdown() manager = QueueManager( address=addr, authkey=authkey, serializer=SERIALIZER) - manager.start() + try: + manager.start() + except IOError as e: + if e.errno != errno.EADDRINUSE: + raise + # Retry after some 
time, in case the old socket was lingering + # (sporadic failure on buildbots) + time.sleep(1.0) + manager = QueueManager( + address=addr, authkey=authkey, serializer=SERIALIZER) manager.shutdown() # -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 19:40:55 2011 From: python-checkins at python.org (ezio.melotti) Date: Tue, 05 Apr 2011 19:40:55 +0200 Subject: [Python-checkins] cpython (2.7): #7311: fix HTMLParser to accept non-ASCII attribute values. Message-ID: http://hg.python.org/cpython/rev/7d4dea76c476 changeset: 69160:7d4dea76c476 branch: 2.7 parent: 69153:c10d55c51d81 user: Ezio Melotti date: Tue Apr 05 20:40:52 2011 +0300 summary: #7311: fix HTMLParser to accept non-ASCII attribute values. files: Lib/HTMLParser.py | 2 +- Lib/test/test_htmlparser.py | 17 +++++++++++++++++ Misc/NEWS | 2 ++ 3 files changed, 20 insertions(+), 1 deletions(-) diff --git a/Lib/HTMLParser.py b/Lib/HTMLParser.py --- a/Lib/HTMLParser.py +++ b/Lib/HTMLParser.py @@ -26,7 +26,7 @@ tagfind = re.compile('[a-zA-Z][-.a-zA-Z0-9:_]*') attrfind = re.compile( r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*' - r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~@]*))?') + r'(\'[^\']*\'|"[^"]*"|[^\s"\'=<>`]*))?') locatestarttagend = re.compile(r""" <[a-zA-Z][-.a-zA-Z0-9:_]* # tag name diff --git a/Lib/test/test_htmlparser.py b/Lib/test/test_htmlparser.py --- a/Lib/test/test_htmlparser.py +++ b/Lib/test/test_htmlparser.py @@ -208,6 +208,23 @@ ("starttag", "a", [("href", "mailto:xyz at example.com")]), ]) + def test_attr_nonascii(self): + # see issue 7311 + self._run_check(u"\u4e2d\u6587", [ + ("starttag", "img", [("src", "/foo/bar.png"), + ("alt", u"\u4e2d\u6587")]), + ]) + self._run_check(u"", [ + ("starttag", "a", [("title", u"\u30c6\u30b9\u30c8"), + ("href", u"\u30c6\u30b9\u30c8.html")]), + ]) + self._run_check(u'', [ + ("starttag", "a", [("title", u"\u30c6\u30b9\u30c8"), + ("href", u"\u30c6\u30b9\u30c8.html")]), + ]) + def 
test_attr_entity_replacement(self): self._run_check("""""", [ ("starttag", "a", [("b", "&><\"'")]), diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -47,6 +47,8 @@ Library ------- +- Issue #7311: fix HTMLParser to accept non-ASCII attribute values. + - Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. - Issue #11662: Make urllib and urllib2 ignore redirections if the -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 5 20:47:51 2011 From: python-checkins at python.org (barry.warsaw) Date: Tue, 05 Apr 2011 20:47:51 +0200 Subject: [Python-checkins] peps: PEP 396. Message-ID: http://hg.python.org/peps/rev/1857fe1e65ab changeset: 3857:1857fe1e65ab parent: 3854:f65beac56930 user: Barry Warsaw date: Tue Apr 05 14:47:18 2011 -0400 summary: PEP 396. files: pep-0396.txt | 311 +++++++++++++++++++++++++++++++++++++++ 1 files changed, 311 insertions(+), 0 deletions(-) diff --git a/pep-0396.txt b/pep-0396.txt new file mode 100644 --- /dev/null +++ b/pep-0396.txt @@ -0,0 +1,311 @@ +PEP: 396 +Title: Module Version Numbers +Version: $Revision: 65628 $ +Last-Modified: $Date: 2008-08-10 09:59:20 -0400 (Sun, 10 Aug 2008) $ +Author: Barry Warsaw +Status: Draft +Type: Informational +Content-Type: text/x-rst +Created: 2011-03-16 +Post-History: + + +Abstract +======== + +Given that it is useful and common to specify version numbers for +Python modules, and given that different ways of doing this have grown +organically within the Python community, it is useful to establish +standard conventions for module authors to adhere to and reference. +This informational PEP describes best practices for Python module +authors who want to define the version number of their Python module. + +Conformance with this PEP is optional, however other Python tools +(such as ``distutils2`` [1]_) may be adapted to use the conventions +defined here. 
+ + +User Stories +============ + +Alice is writing a new module, called ``alice``, which she wants to +share with other Python developers. ``alice`` is a simple module and +lives in one file, ``alice.py``. Alice wants to specify a version +number so that her users can tell which version they are using. +Because her module lives entirely in one file, she wants to add the +version number to that file. + +Bob has written a module called ``bob`` which he has shared with many +users. ``bob.py`` contains a version number for the convenience of +his users. Bob learns about the Cheeseshop [2]_, and adds some simple +packaging using classic distutils so that he can upload *The Bob +Bundle* to the Cheeseshop. Because ``bob.py`` already specifies a +version number which his users can access programmatically, he wants +the same API to continue to work even though his users now get it from +the Cheeseshop. + +Carole maintains several namespace packages, each of which are +independently developed and distributed. In order for her users to +properly specify dependencies on the right versions of her packages, +she specifies the version numbers in the namespace package's +``setup.py`` file. Because Carol wants to have to update one version +number per package, she specifies the version number in her module and +has the ``setup.py`` extract the module version number when she builds +the *sdist* archive. + +David maintains a package in the standard library, and also produces +standalone versions for other versions of Python. The standard +library copy defines the version number in the module, and this same +version number is used for the standalone distributions as well. + + +Rationale +========= + +Python modules, both in the standard library and available from third +parties, have long included version numbers. There are established +de-facto standards for describing version numbers, and many ad-hoc +ways have grown organically over the years. 
Often, version numbers +can be retrieved from a module programmatically, by importing the +module and inspecting an attribute. Classic Python distutils +``setup()`` functions [3]_ describe a ``version`` argument where the +release's version number can be specified. PEP 8 [4]_ describes the +use of a module attribute called ``__version__`` for recording +"Subversion, CVS, or RCS" version strings using keyword expansion. In +the PEP author's own email archives, the earliest example of the use +of an ``__version__`` module attribute by independent module +developers dates back to 1995. + +Another example of version information is the sqlite3 [5]_ library +with its ``sqlite_version_info``, ``version``, and ``version_info`` +attributes. It may not be immediately obvious which attribute +contains a version number for the module, and which contains a version +number for the underlying SQLite3 library. + +This informational PEP codifies established practice, and recommends +standard ways of describing module version numbers, along with some +use cases for when -- and when *not* -- to include them. Its adoption +by module authors is purely voluntary; packaging tools in the standard +library will provide optional support for the standards defined +herein, and other tools in the Python universe may comply as well. + + +Specification +============= + +#. In general, modules in the standard library SHOULD NOT have version + numbers. They implicitly carry the version number of the Python + release they are included in. + +#. On a case-by-case basis, standard library modules which are also + released in standalone form for other Python versions MAY include a + module version number when included in the standard library, and + SHOULD include a version number when packaged separately. + +#. When a module includes a version number, it SHOULD be available in + the ``__version__`` attribute on that module. + +#. 
For modules which are also packages, the module namespace SHOULD + include the ``__version__`` attribute. + +#. For modules which live inside a namespace package, the sub-package + name SHOULD include the ``__version__`` attribute. The namespace + module itself SHOULD NOT include its own ``__version__`` attribute. + +#. The ``__version__`` attribute's value SHOULD be a string. + +#. Module version numbers SHOULD conform to the normalized version + format specified in PEP 386 [6]_. + +#. Module version numbers SHOULD NOT contain version control system + supplied revision numbers, or any other semantically different + version numbers (e.g. underlying library version number). + +#. Wherever a ``__version__`` attribute exists, a module MAY also + include a ``__version_info__`` attribute, containing a tuple + representation of the module version number, for easy comparisons. + +#. ``__version_info__`` SHOULD be of the format returned by PEP 386's + ``parse_version()`` function. + +#. The ``version`` attribute in a classic distutils ``setup.py`` + file, or the PEP 345 [7]_ ``Version`` metadata field SHOULD be + derived from the ``__version__`` field, or vice versa. 
+ + +Examples +======== + +Retrieving the version number from a third party package:: + + >>> import bzrlib + >>> bzrlib.__version__ + '2.3.0' + +Retrieving the version number from a standard library package that is +also distributed as a standalone module:: + + >>> import email + >>> email.__version__ + '5.1.0' + +Version numbers for namespace packages:: + + >>> import flufl.i18n + >>> import flufl.enum + >>> import flufl.lock + + >>> print flufl.i18n.__version__ + 1.0.4 + >>> print flufl.enum.__version__ + 3.1 + >>> print flufl.lock.__version__ + 2.1 + + >>> import flufl + >>> flufl.__version__ + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + AttributeError: 'module' object has no attribute '__version__' + >>> + + +Deriving +======== + +Module version numbers can appear in at least two places, and +sometimes more. For example, in accordance with this PEP, they are +available programmatically on the module's ``__version__`` attribute. +In a classic distutils ``setup.py`` file, the ``setup()`` function +takes a ``version`` argument, while the distutils2 ``setup.cfg`` file +has a ``version`` key. The version number must also get into the PEP +345 metadata, preferably when the *sdist* archive is built. It's +desirable for module authors to only have to specify the version +number once, and have all the other uses derive from this single +definition. + +While there are any number of ways this could be done, this section +describes one possible approach, for each scenario.
+ +Let's say Elle adds this attribute to her module file ``elle.py``:: + + __version__ = '3.1.1' + + +Classic distutils +----------------- + +In classic distutils, the simplest way to add the version string to +the ``setup()`` function in ``setup.py`` is to do something like +this:: + + from elle import __version__ + setup(name='elle', version=__version__) + +In the PEP author's experience however, this can fail in some cases, +such as when the module uses automatic Python 3 conversion via the +``2to3`` program (because ``setup.py`` is executed by Python 3 before +the ``elle`` module has been converted). + +In that case, it's not much more difficult to write a little code to +parse the ``__version__`` from the file rather than importing it:: + + import re + DEFAULT_VERSION_RE = re.compile(r'(?P<version>\d+\.\d(?:\.\d+)?)') + + def get_version(filename, pattern=None): + if pattern is None: + cre = DEFAULT_VERSION_RE + else: + cre = re.compile(pattern) + with open(filename) as fp: + for line in fp: + if line.startswith('__version__'): + mo = cre.search(line) + assert mo, 'No valid __version__ string found' + return mo.group('version') + raise AssertionError('No __version__ assignment found') + + setup(name='elle', version=get_version('elle.py')) + + +Distutils2 +---------- + +Because the distutils2 style ``setup.cfg`` is declarative, we can't +run any code to extract the ``__version__`` attribute, either via +import or via parsing. This PEP suggests a special key be added to +the ``[metadata]`` section of the ``setup.cfg`` file to indicate "get +the version from this file". Something like this might work:: + + [metadata] + version-from-file: elle.py + +where ``parse`` means to use a parsing method similar to the above, on +the file named after the colon. The exact recipe for doing this will +be discussed in the appropriate distutils2 development forum.
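One way a tool could honor such a key is sketched below. Everything in it is an assumption made for illustration. ``version-from-file`` is only the key proposed above, not an existing distutils2 feature, and ``version_from_setup_cfg`` is a hypothetical helper built on the standard ``configparser`` module that scans the named file line by line for a ``__version__`` assignment.

```python
import configparser
import os
import re

# Matches major.minor with an optional micro part, e.g. '3.1' or '3.1.1'.
VERSION_RE = re.compile(r'(?P<version>\d+\.\d(?:\.\d+)?)')


def version_from_setup_cfg(cfg_path):
    """Hypothetical: resolve the proposed 'version-from-file' key."""
    parser = configparser.ConfigParser()
    with open(cfg_path) as fp:
        parser.read_file(fp)
    # Assumption: the named file is relative to the setup.cfg directory.
    target = os.path.join(os.path.dirname(cfg_path),
                          parser['metadata']['version-from-file'])
    with open(target) as fp:
        for line in fp:
            if line.startswith('__version__'):
                mo = VERSION_RE.search(line)
                if mo is None:
                    raise ValueError('No valid __version__ string found')
                return mo.group('version')
    raise ValueError('No __version__ assignment found')
```

With a ``setup.cfg`` like the one shown above and an ``elle.py`` containing ``__version__ = '3.1.1'``, a call such as ``version_from_setup_cfg('setup.cfg')`` would return the string ``'3.1.1'`` without importing ``elle``.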
+ +An alternative is to only define the version number in ``setup.cfg`` +and use the ``pkgutil`` module [8]_ to make it available +programmatically. E.g. in ``elle.py``:: + + from distutils2._backport import pkgutil + __version__ = pkgutil.get_distribution('elle').metadata['version'] + + +PEP 376 metadata +================ + +PEP 376 [9]_ defines a standard for static metadata, but doesn't +describe the process by which this metadata gets created. It is +highly desirable for the derived version information to be placed into +the PEP 376 ``.dist-info`` metadata at build-time rather than +install-time. This way, the metadata will be available for +introspection even when the code is not installed. + + +References +========== + +.. [1] Distutils2 documentation + (http://distutils2.notmyidea.org/) + +.. [2] The Cheeseshop (Python Package Index) + (http://pypi.python.org) + +.. [3] http://docs.python.org/distutils/setupscript.html + +.. [4] PEP 8, Style Guide for Python Code + (http://www.python.org/dev/peps/pep-0008) + +.. [5] sqlite3 module documentation + (http://docs.python.org/library/sqlite3.html) + +.. [6] PEP 386, Changing the version comparison module in Distutils + (http://www.python.org/dev/peps/pep-0386/) + +.. [7] PEP 345, Metadata for Python Software Packages 1.2 + (http://www.python.org/dev/peps/pep-0345/#version) + +.. [8] pkgutil - Package utilities + (http://distutils2.notmyidea.org/library/pkgutil.html) + +.. [9] PEP 376, Database of Installed Python Distributions + (http://www.python.org/dev/peps/pep-0376/) + + +Copyright +========= + +This document has been placed in the public domain. + + + +.. 
+ Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Tue Apr 5 20:47:52 2011 From: python-checkins at python.org (barry.warsaw) Date: Tue, 05 Apr 2011 20:47:52 +0200 Subject: [Python-checkins] peps (merge default -> default): merge Message-ID: http://hg.python.org/peps/rev/fc65dddc2af3 changeset: 3858:fc65dddc2af3 parent: 3857:1857fe1e65ab parent: 3856:359ccf54bc52 user: Barry Warsaw date: Tue Apr 05 14:47:46 2011 -0400 summary: merge files: pep-0399.txt | 205 +++++++++++++++++++++++++++++++++++++++ 1 files changed, 205 insertions(+), 0 deletions(-) diff --git a/pep-0399.txt b/pep-0399.txt new file mode 100644 --- /dev/null +++ b/pep-0399.txt @@ -0,0 +1,205 @@ +PEP: 399 +Title: Pure Python/C Accelerator Module Compatibility Requirements +Version: $Revision: 88219 $ +Last-Modified: $Date: 2011-01-27 13:47:00 -0800 (Thu, 27 Jan 2011) $ +Author: Brett Cannon +Status: Draft +Type: Informational +Content-Type: text/x-rst +Created: 04-Apr-2011 +Python-Version: 3.3 +Post-History: + +Abstract +======== + +The Python standard library under CPython contains various instances +of modules implemented in both pure Python and C. This PEP requires +that in these instances that both the Python and C code *must* be +semantically identical (except in cases where implementation details +of a VM prevents it entirely). It is also required that new C-based +modules lacking a pure Python equivalent implementation get special +permissions to be added to the standard library. + + +Rationale +========= + +Python has grown beyond the CPython virtual machine (VM). IronPython_, +Jython_, and PyPy_ all currently being viable alternatives to the +CPython VM. 
This VM ecosystem that has sprung up around the Python +programming language has led to Python being used in many different +areas where CPython cannot be used, e.g., Jython allowing Python to be +used in Java applications. + +A problem all of the VMs other than CPython face is handling modules +from the standard library that are implemented in C. Since they do not +typically support the entire `C API of Python`_ they are unable to use +the code used to create the module. Often times this leads these other +VMs to either re-implement the modules in pure Python or in the +programming language used to implement the VM (e.g., in C# for +IronPython). This duplication of effort between CPython, PyPy, Jython, +and IronPython is extremely unfortunate as implementing a module *at +least* in pure Python would help mitigate this duplicate effort. + +The purpose of this PEP is to minimize this duplicate effort by +mandating that all new modules added to Python's standard library +*must* have a pure Python implementation _unless_ special dispensation +is given. This makes sure that a module in the stdlib is available to +all VMs and not just to CPython. + +Re-implementing parts (or all) of a module in C (in the case +of CPython) is still allowed for performance reasons, but any such +accelerated code must semantically match the pure Python equivalent to +prevent divergence. To accomplish this, the pure Python and C code must +be thoroughly tested with the *same* test suite to verify compliance. 
+This is to prevent users from accidentally relying +on semantics that are specific to the C code and are not reflected in +the pure Python implementation that other VMs rely upon, e.g., in +CPython 3.2.0, ``heapq.heappop()`` raises different exceptions +depending on whether the accelerated C code is used or not:: + + from test.support import import_fresh_module + + c_heapq = import_fresh_module('heapq', fresh=['_heapq']) + py_heapq = import_fresh_module('heapq', blocked=['_heapq']) + + + class Spam: + """Tester class which defines no other magic methods but + __len__().""" + def __len__(self): + return 0 + + + try: + c_heapq.heappop(Spam()) + except TypeError: + # "heap argument must be a list" + pass + + try: + py_heapq.heappop(Spam()) + except AttributeError: + # "'Foo' object has no attribute 'pop'" + pass + +This kind of divergence is a problem for users as they unwittingly +write code that is CPython-specific. This is also an issue for other +VM teams as they have to deal with bug reports from users thinking +that they incorrectly implemented the module when in fact it was +caused by an untested case. + + +Details +======= + +Starting in Python 3.3, any modules added to the standard library must +have a pure Python implementation. This rule can only be ignored if +the Python development team grants a special exemption for the module. +Typically the exemption would be granted only when a module wraps a +specific C-based library (e.g., sqlite3_). In granting an exemption it +will be recognized that the module will most likely be considered +exclusive to CPython and not part of Python's standard library that +other VMs are expected to support. Usage of ``ctypes`` to provide an +API for a C library will continue to be frowned upon as ``ctypes`` +lacks compiler guarantees that C code typically relies upon to prevent +certain errors from occurring (e.g., API changes). 
+ +Even though a pure Python implementation is mandated by this PEP, it +does not preclude the use of a companion acceleration module. If an +acceleration module is provided it is to be named the same as the +module it is accelerating with an underscore attached as a prefix, +e.g., ``_warnings`` for ``warnings``. The common pattern to access +the accelerated code from the pure Python implementation is to import +it with an ``import *``, e.g., ``from _warnings import *``. This is +typically done at the end of the module to allow it to overwrite +specific Python objects with their accelerated equivalents. This kind +of import can also be done before the end of the module when needed, +e.g., an accelerated base class is provided but is then subclassed by +Python code. This PEP does not mandate that pre-existing modules in +the stdlib that lack a pure Python equivalent gain such a module. But +if people do volunteer to provide and maintain a pure Python +equivalent (e.g., the PyPy team volunteering their pure Python +implementation of the ``csv`` module and maintaining it) then such +code will be accepted. + +Any accelerated code must be semantically identical to the pure Python +implementation. The only time any semantics are allowed to be +different are when technical details of the VM providing the +accelerated code prevent matching semantics from being possible, e.g., +a class being a ``type`` when implemented in C. The semantics +equivalence requirement also dictates that no public API be provided +in accelerated code that does not exist in the pure Python code. +Without this requirement people could accidentally come to rely on a +detail in the acclerated code which is not made available to other VMs +that use the pure Python implementation. To help verify that the +contract of semantic equivalence is being met, a module must be tested +both with and without its accelerated code as thoroughly as possible. 
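The layout described above, with the accelerator imported at the end of the pure Python module, can be sketched as follows. The names ``spam``, ``_spam`` and ``count_eggs`` are hypothetical. Wrapping the import in ``try``/``except ImportError`` is an assumption of this sketch (the text only mentions the ``import *`` form); it is added so that the module still works on a VM that ships no accelerator.

```python
# spam.py -- hypothetical pure Python module with an optional accelerator.


def count_eggs(items):
    """Pure Python implementation; available on every VM."""
    return sum(1 for item in items if item == 'egg')


# At the end of the module, let accelerated versions (if any) overwrite
# the pure Python objects defined above.  ``_spam`` is a made-up name that
# follows the underscore-prefix naming convention.
try:
    from _spam import *
except ImportError:
    # No accelerator on this VM: keep the pure Python definitions.
    pass
```

Because the import comes last, any name exported by the accelerator replaces the pure Python object of the same name, which is why the two implementations need to stay semantically identical.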
+ +As an example, to write tests which exercise both the pure Python and +C acclerated versions of a module, a basic idiom can be followed:: + + import collections.abc + from test.support import import_fresh_module, run_unittest + import unittest + + c_heapq = import_fresh_module('heapq', fresh=['_heapq']) + py_heapq = import_fresh_module('heapq', blocked=['_heapq']) + + + class ExampleTest(unittest.TestCase): + + def test_heappop_exc_for_non_MutableSequence(self): + # Raise TypeError when heap is not a + # collections.abc.MutableSequence. + class Spam: + """Test class lacking many ABC-required methods + (e.g., pop()).""" + def __len__(self): + return 0 + + heap = Spam() + self.assertFalse(isinstance(heap, + collections.abc.MutableSequence)) + with self.assertRaises(TypeError): + self.heapq.heappop(heap) + + + class AcceleratedExampleTest(ExampleTest): + + """Test using the acclerated code.""" + + heapq = c_heapq + + + class PyExampleTest(ExampleTest): + + """Test with just the pure Python code.""" + + heapq = py_heapq + + + def test_main(): + run_unittest(AcceleratedExampleTest, PyExampleTest) + + + if __name__ == '__main__': + test_main() + +Thoroughness of the test can be verified using coverage measurements +with branching coverage on the pure Python code to verify that all +possible scenarios are tested using (or not using) accelerator code. + + +Copyright +========= + +This document has been placed in the public domain. + + +.. _IronPython: http://ironpython.net/ +.. _Jython: http://www.jython.org/ +.. _PyPy: http://pypy.org/ +.. _C API of Python: http://docs.python.org/py3k/c-api/index.html +.. _sqlite3: http://docs.python.org/py3k/library/sqlite3.html -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Tue Apr 5 20:48:35 2011 From: python-checkins at python.org (barry.warsaw) Date: Tue, 05 Apr 2011 20:48:35 +0200 Subject: [Python-checkins] peps: Added Post-History. 
Message-ID: http://hg.python.org/peps/rev/6d0808c23ad8 changeset: 3859:6d0808c23ad8 user: Barry Warsaw date: Tue Apr 05 14:48:30 2011 -0400 summary: Added Post-History. files: pep-0396.txt | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/pep-0396.txt b/pep-0396.txt --- a/pep-0396.txt +++ b/pep-0396.txt @@ -7,7 +7,7 @@ Type: Informational Content-Type: text/x-rst Created: 2011-03-16 -Post-History: +Post-History: 2011-04-05 Abstract -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Wed Apr 6 00:23:43 2011 From: python-checkins at python.org (benjamin.peterson) Date: Wed, 06 Apr 2011 00:23:43 +0200 Subject: [Python-checkins] cpython: implement tp_clear Message-ID: http://hg.python.org/cpython/rev/7b5d09343929 changeset: 69161:7b5d09343929 parent: 69159:a9371cf1cc61 user: Benjamin Peterson date: Tue Apr 05 17:25:14 2011 -0500 summary: implement tp_clear files: Modules/_functoolsmodule.c | 11 ++++++++++- 1 files changed, 10 insertions(+), 1 deletions(-) diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c --- a/Modules/_functoolsmodule.c +++ b/Modules/_functoolsmodule.c @@ -355,6 +355,15 @@ return 0; } +static int +keyobject_clear(keyobject *ko) +{ + Py_CLEAR(ko->cmp); + if (ko->object) + Py_CLEAR(ko->object); + return 0; +} + static PyMemberDef keyobject_members[] = { {"obj", T_OBJECT, offsetof(keyobject, object), 0, @@ -392,7 +401,7 @@ Py_TPFLAGS_DEFAULT, /* tp_flags */ 0, /* tp_doc */ (traverseproc)keyobject_traverse, /* tp_traverse */ - 0, /* tp_clear */ + (inquiry)keyobject_clear, /* tp_clear */ keyobject_richcompare, /* tp_richcompare */ 0, /* tp_weaklistoffset */ 0, /* tp_iter */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 02:20:05 2011 From: python-checkins at python.org (ned.deily) Date: Wed, 06 Apr 2011 02:20:05 +0200 Subject: [Python-checkins] cpython (2.7): Issue #7108: Fix test_commands to not fail when special attributes ('@' Message-ID: 
http://hg.python.org/cpython/rev/5616cbce0bee changeset: 69162:5616cbce0bee branch: 2.7 parent: 69160:7d4dea76c476 user: Ned Deily date: Tue Apr 05 17:16:09 2011 -0700 summary: Issue #7108: Fix test_commands to not fail when special attributes ('@' or '.') appear in 'ls -l' output. files: Lib/test/test_commands.py | 6 +++++- Misc/NEWS | 3 +++ 2 files changed, 8 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_commands.py b/Lib/test/test_commands.py --- a/Lib/test/test_commands.py +++ b/Lib/test/test_commands.py @@ -49,8 +49,12 @@ # drwxr-xr-x 15 Joe User My Group 4096 Aug 12 12:50 / # Note that the first case above has a space in the group name # while the second one has a space in both names. + # Special attributes supported: + # + = has ACLs + # @ = has Mac OS X extended attributes + # . = has a SELinux security context pat = r'''d......... # It is a directory. - \+? # It may have ACLs. + [.+@]? # It may have special attributes. \s+\d+ # It has some number of links. [^/]* # Skip user, group, size, and date. /\. # and end with the name of the file. diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -315,6 +315,9 @@ Tests ----- +- Issue #7108: Fix test_commands to not fail when special attributes ('@' + or '.') appear in 'ls -l' output. + - Issue #11490: test_subprocess:test_leaking_fds_on_error no longer gives a false positive if the last directory in the path is inaccessible. 
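
[Editorial illustration, not part of the changeset.] The effect of the pattern change above can be sketched as follows. This is a minimal sketch under stated assumptions: the sample `ls -ld` lines are made up for illustration, the names TEMPLATE, OLD, NEW, SAMPLES and matches are hypothetical, and the pattern is assumed to be compiled with re.VERBOSE as the inline comments in the diff suggest.

```python
import re

# Verbose-mode template mirroring the pattern in the diff; %s is the part
# that the changeset replaced. The closing quotes and the use of re.VERBOSE
# are assumptions based on the inline '#' comments in the patched test.
TEMPLATE = r'''d.........   # It is a directory.
               %s           # Optional marker after the permission bits.
               \s+\d+       # It has some number of links.
               [^/]*        # Skip user, group, size, and date.
               /\.          # and end with the name of the file.
            '''

OLD = re.compile(TEMPLATE % r'\+?', re.VERBOSE)     # before: ACL marker only
NEW = re.compile(TEMPLATE % r'[.+@]?', re.VERBOSE)  # after: '.', '+' or '@'

# Hypothetical 'ls -ld /.' output lines, one per marker kind.
SAMPLES = {
    'plain':   'drwxr-xr-x  15 root root  4096 Aug 12 12:50 /.',
    'acl':     'drwxr-xr-x+ 15 root root  4096 Aug 12 12:50 /.',
    'xattr':   'drwxr-xr-x@ 15 root wheel  512 Aug 12 12:50 /.',
    'selinux': 'drwxr-xr-x. 15 root root  4096 Aug 12 12:50 /.',
}


def matches(pattern, line):
    """Return True if the line matches the pattern from its first character."""
    return pattern.match(line) is not None


if __name__ == '__main__':
    for kind, line in SAMPLES.items():
        print(kind, 'old:', matches(OLD, line), 'new:', matches(NEW, line))
```

Because the permission field has a fixed width, the old pattern has no way to skip an '@' or '.' marker before the whitespace, which is consistent with the failure described in the summary.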
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 02:44:00 2011 From: python-checkins at python.org (alexander.belopolsky) Date: Wed, 06 Apr 2011 02:44:00 +0200 Subject: [Python-checkins] cpython (3.2): Issue #11576: Fixed timedelta subtraction glitch on big timedelta values Message-ID: http://hg.python.org/cpython/rev/76180cc853b6 changeset: 69163:76180cc853b6 branch: 3.2 parent: 69158:8a65e6aff672 user: Alexander Belopolsky date: Tue Apr 05 20:07:38 2011 -0400 summary: Issue #11576: Fixed timedelta subtraction glitch on big timedelta values files: Lib/datetime.py | 6 +++++- Lib/test/datetimetester.py | 6 ++++++ Modules/_datetimemodule.c | 15 ++++++++------- 3 files changed, 19 insertions(+), 8 deletions(-) diff --git a/Lib/datetime.py b/Lib/datetime.py --- a/Lib/datetime.py +++ b/Lib/datetime.py @@ -485,7 +485,11 @@ def __sub__(self, other): if isinstance(other, timedelta): - return self + -other + # for CPython compatibility, we cannot use + # our __class__ here, but need a real timedelta + return timedelta(self._days - other._days, + self._seconds - other._seconds, + self._microseconds - other._microseconds) return NotImplemented def __rsub__(self, other): diff --git a/Lib/test/datetimetester.py b/Lib/test/datetimetester.py --- a/Lib/test/datetimetester.py +++ b/Lib/test/datetimetester.py @@ -383,6 +383,12 @@ for i in range(-10, 10): eq((i*us/-3)//us, round(i/-3)) + # Issue #11576 + eq(td(999999999, 86399, 999999) - td(999999999, 86399, 999998), + td(0, 0, 1)) + eq(td(999999999, 1, 1) - td(999999999, 1, 0), + td(0, 0, 1)) + def test_disallowed_computations(self): a = timedelta(42) diff --git a/Modules/_datetimemodule.c b/Modules/_datetimemodule.c --- a/Modules/_datetimemodule.c +++ b/Modules/_datetimemodule.c @@ -1801,13 +1801,14 @@ if (PyDelta_Check(left) && PyDelta_Check(right)) { /* delta - delta */ - PyObject *minus_right = PyNumber_Negative(right); - if (minus_right) { - result = delta_add(left, minus_right); - 
Py_DECREF(minus_right); - } - else - result = NULL; + /* The C-level additions can't overflow because of the + * invariant bounds. + */ + int days = GET_TD_DAYS(left) - GET_TD_DAYS(right); + int seconds = GET_TD_SECONDS(left) - GET_TD_SECONDS(right); + int microseconds = GET_TD_MICROSECONDS(left) - + GET_TD_MICROSECONDS(right); + result = new_delta(days, seconds, microseconds, 1); } if (result == Py_NotImplemented) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 02:44:01 2011 From: python-checkins at python.org (alexander.belopolsky) Date: Wed, 06 Apr 2011 02:44:01 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #11576: Fixed timedelta subtraction glitch on big timedelta values Message-ID: http://hg.python.org/cpython/rev/d492915cf76d changeset: 69164:d492915cf76d parent: 69161:7b5d09343929 parent: 69163:76180cc853b6 user: Alexander Belopolsky date: Tue Apr 05 20:43:15 2011 -0400 summary: Issue #11576: Fixed timedelta subtraction glitch on big timedelta values files: Lib/datetime.py | 6 +++++- Lib/test/datetimetester.py | 6 ++++++ Modules/_datetimemodule.c | 15 ++++++++------- 3 files changed, 19 insertions(+), 8 deletions(-) diff --git a/Lib/datetime.py b/Lib/datetime.py --- a/Lib/datetime.py +++ b/Lib/datetime.py @@ -485,7 +485,11 @@ def __sub__(self, other): if isinstance(other, timedelta): - return self + -other + # for CPython compatibility, we cannot use + # our __class__ here, but need a real timedelta + return timedelta(self._days - other._days, + self._seconds - other._seconds, + self._microseconds - other._microseconds) return NotImplemented def __rsub__(self, other): diff --git a/Lib/test/datetimetester.py b/Lib/test/datetimetester.py --- a/Lib/test/datetimetester.py +++ b/Lib/test/datetimetester.py @@ -383,6 +383,12 @@ for i in range(-10, 10): eq((i*us/-3)//us, round(i/-3)) + # Issue #11576 + eq(td(999999999, 86399, 999999) - td(999999999, 86399, 999998), + td(0, 0, 1)) + 
eq(td(999999999, 1, 1) - td(999999999, 1, 0), + td(0, 0, 1)) + def test_disallowed_computations(self): a = timedelta(42) diff --git a/Modules/_datetimemodule.c b/Modules/_datetimemodule.c --- a/Modules/_datetimemodule.c +++ b/Modules/_datetimemodule.c @@ -1801,13 +1801,14 @@ if (PyDelta_Check(left) && PyDelta_Check(right)) { /* delta - delta */ - PyObject *minus_right = PyNumber_Negative(right); - if (minus_right) { - result = delta_add(left, minus_right); - Py_DECREF(minus_right); - } - else - result = NULL; + /* The C-level additions can't overflow because of the + * invariant bounds. + */ + int days = GET_TD_DAYS(left) - GET_TD_DAYS(right); + int seconds = GET_TD_SECONDS(left) - GET_TD_SECONDS(right); + int microseconds = GET_TD_MICROSECONDS(left) - + GET_TD_MICROSECONDS(right); + result = new_delta(days, seconds, microseconds, 1); } if (result == Py_NotImplemented) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 04:14:27 2011 From: python-checkins at python.org (alexander.belopolsky) Date: Wed, 06 Apr 2011 04:14:27 +0200 Subject: [Python-checkins] cpython (2.7): Issue #11576: Fixed timedelta subtraction glitch on big timedelta values Message-ID: http://hg.python.org/cpython/rev/202a9feb1fd6 changeset: 69165:202a9feb1fd6 branch: 2.7 parent: 69162:5616cbce0bee user: Alexander Belopolsky date: Tue Apr 05 22:12:22 2011 -0400 summary: Issue #11576: Fixed timedelta subtraction glitch on big timedelta values files: Lib/test/test_datetime.py | 7 +++++++ Modules/datetimemodule.c | 15 ++++++++------- 2 files changed, 15 insertions(+), 7 deletions(-) diff --git a/Lib/test/test_datetime.py b/Lib/test/test_datetime.py --- a/Lib/test/test_datetime.py +++ b/Lib/test/test_datetime.py @@ -231,6 +231,13 @@ eq(a//10, td(0, 7*24*360)) eq(a//3600000, td(0, 0, 7*24*1000)) + # Issue #11576 + eq(td(999999999, 86399, 999999) - td(999999999, 86399, 999998), + td(0, 0, 1)) + eq(td(999999999, 1, 1) - td(999999999, 1, 0), + td(0, 0, 1)) + + def 
test_disallowed_computations(self): a = timedelta(42) diff --git a/Modules/datetimemodule.c b/Modules/datetimemodule.c --- a/Modules/datetimemodule.c +++ b/Modules/datetimemodule.c @@ -1737,13 +1737,14 @@ if (PyDelta_Check(left) && PyDelta_Check(right)) { /* delta - delta */ - PyObject *minus_right = PyNumber_Negative(right); - if (minus_right) { - result = delta_add(left, minus_right); - Py_DECREF(minus_right); - } - else - result = NULL; + /* The C-level additions can't overflow because of the + * invariant bounds. + */ + int days = GET_TD_DAYS(left) - GET_TD_DAYS(right); + int seconds = GET_TD_SECONDS(left) - GET_TD_SECONDS(right); + int microseconds = GET_TD_MICROSECONDS(left) - + GET_TD_MICROSECONDS(right); + result = new_delta(days, seconds, microseconds, 1); } if (result == Py_NotImplemented) -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Wed Apr 6 04:56:58 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Wed, 06 Apr 2011 04:56:58 +0200 Subject: [Python-checkins] Daily reference leaks (d492915cf76d): sum=0 Message-ID: results for d492915cf76d on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/refloglZqgHa', '-x'] From python-checkins at python.org Wed Apr 6 08:16:34 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 06 Apr 2011 08:16:34 +0200 Subject: [Python-checkins] cpython (3.1): Issue #10762: Guard against invalid/non-supported format string '%f' on Message-ID: http://hg.python.org/cpython/rev/2ca1bc677a60 changeset: 69166:2ca1bc677a60 branch: 3.1 parent: 69157:7a1ef59d765b user: Senthil Kumaran date: Wed Apr 06 12:54:06 2011 +0800 summary: Issue #10762: Guard against invalid/non-supported format string '%f' on Windows. Patch Santoso Wijaya. 
files: Lib/test/test_time.py | 8 ++++++++ Modules/timemodule.c | 2 +- 2 files changed, 9 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_time.py b/Lib/test/test_time.py --- a/Lib/test/test_time.py +++ b/Lib/test/test_time.py @@ -2,6 +2,7 @@ import time import unittest import locale +import sys class TimeTestCase(unittest.TestCase): @@ -37,6 +38,13 @@ except ValueError: self.fail('conversion specifier: %r failed.' % format) + # Issue #10762: Guard against invalid/non-supported format string + # so that Python don't crash (Windows crashes when the format string + # input to [w]strftime is not kosher. + if sys.platform.startswith('win'): + with self.assertRaises(ValueError): + time.strftime('%f') + def test_strftime_bounds_checking(self): # Make sure that strftime() checks the bounds of the various parts #of the time tuple (0 is valid for *all* values). diff --git a/Modules/timemodule.c b/Modules/timemodule.c --- a/Modules/timemodule.c +++ b/Modules/timemodule.c @@ -549,7 +549,7 @@ if (outbuf[1]=='#') ++outbuf; /* not documented by python, */ if (outbuf[1]=='\0' || - !wcschr(L"aAbBcdfHIjmMpSUwWxXyYzZ%", outbuf[1])) + !wcschr(L"aAbBcdHIjmMpSUwWxXyYzZ%", outbuf[1])) { PyErr_SetString(PyExc_ValueError, "Invalid format string"); return 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 08:16:36 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 06 Apr 2011 08:16:36 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Merge from 3.1 Message-ID: http://hg.python.org/cpython/rev/1accc17055c9 changeset: 69167:1accc17055c9 branch: 3.2 parent: 69163:76180cc853b6 parent: 69166:2ca1bc677a60 user: Senthil Kumaran date: Wed Apr 06 14:11:09 2011 +0800 summary: Merge from 3.1 files: Lib/test/test_time.py | 8 ++++++++ Modules/timemodule.c | 2 +- 2 files changed, 9 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_time.py b/Lib/test/test_time.py --- a/Lib/test/test_time.py +++ 
b/Lib/test/test_time.py @@ -3,6 +3,7 @@ import unittest import locale import sysconfig +import sys import warnings class TimeTestCase(unittest.TestCase): @@ -39,6 +40,13 @@ except ValueError: self.fail('conversion specifier: %r failed.' % format) + # Issue #10762: Guard against invalid/non-supported format string + # so that Python don't crash (Windows crashes when the format string + # input to [w]strftime is not kosher. + if sys.platform.startswith('win'): + with self.assertRaises(ValueError): + time.strftime('%f') + def _bounds_checking(self, func=time.strftime): # Make sure that strftime() checks the bounds of the various parts #of the time tuple (0 is valid for *all* values). diff --git a/Modules/timemodule.c b/Modules/timemodule.c --- a/Modules/timemodule.c +++ b/Modules/timemodule.c @@ -512,7 +512,7 @@ if (outbuf[1]=='#') ++outbuf; /* not documented by python, */ if (outbuf[1]=='\0' || - !wcschr(L"aAbBcdfHIjmMpSUwWxXyYzZ%", outbuf[1])) + !wcschr(L"aAbBcdHIjmMpSUwWxXyYzZ%", outbuf[1])) { PyErr_SetString(PyExc_ValueError, "Invalid format string"); return 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 08:16:37 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 06 Apr 2011 08:16:37 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): merge from 3.2 Message-ID: http://hg.python.org/cpython/rev/dc728ac66c3c changeset: 69168:dc728ac66c3c parent: 69164:d492915cf76d parent: 69167:1accc17055c9 user: Senthil Kumaran date: Wed Apr 06 14:16:08 2011 +0800 summary: merge from 3.2 files: Lib/test/test_time.py | 8 ++++++++ Modules/timemodule.c | 2 +- 2 files changed, 9 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_time.py b/Lib/test/test_time.py --- a/Lib/test/test_time.py +++ b/Lib/test/test_time.py @@ -3,6 +3,7 @@ import unittest import locale import sysconfig +import sys import warnings class TimeTestCase(unittest.TestCase): @@ -39,6 +40,13 @@ except ValueError: 
self.fail('conversion specifier: %r failed.' % format) + # Issue #10762: Guard against invalid/non-supported format string + # so that Python don't crash (Windows crashes when the format string + # input to [w]strftime is not kosher. + if sys.platform.startswith('win'): + with self.assertRaises(ValueError): + time.strftime('%f') + def _bounds_checking(self, func=time.strftime): # Make sure that strftime() checks the bounds of the various parts #of the time tuple (0 is valid for *all* values). diff --git a/Modules/timemodule.c b/Modules/timemodule.c --- a/Modules/timemodule.c +++ b/Modules/timemodule.c @@ -512,7 +512,7 @@ if (outbuf[1]=='#') ++outbuf; /* not documented by python, */ if (outbuf[1]=='\0' || - !wcschr(L"aAbBcdfHIjmMpSUwWxXyYzZ%", outbuf[1])) + !wcschr(L"aAbBcdHIjmMpSUwWxXyYzZ%", outbuf[1])) { PyErr_SetString(PyExc_ValueError, "Invalid format string"); return 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 08:45:33 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 06 Apr 2011 08:45:33 +0200 Subject: [Python-checkins] cpython (2.7): Issue #10762: Guard against invalid/non-supported format string '%f' on Message-ID: http://hg.python.org/cpython/rev/1320f29bcf98 changeset: 69169:1320f29bcf98 branch: 2.7 parent: 69162:5616cbce0bee user: Senthil Kumaran date: Wed Apr 06 14:27:47 2011 +0800 summary: Issue #10762: Guard against invalid/non-supported format string '%f' on Windows. Patch Santoso Wijaya. files: Lib/test/test_time.py | 8 ++++++++ Modules/timemodule.c | 2 +- 2 files changed, 9 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_time.py b/Lib/test/test_time.py --- a/Lib/test/test_time.py +++ b/Lib/test/test_time.py @@ -1,6 +1,7 @@ from test import test_support import time import unittest +import sys class TimeTestCase(unittest.TestCase): @@ -37,6 +38,13 @@ except ValueError: self.fail('conversion specifier: %r failed.' 
% format) + # Issue #10762: Guard against invalid/non-supported format string + # so that Python don't crash (Windows crashes when the format string + # input to [w]strftime is not kosher. + if sys.platform.startswith('win'): + with self.assertRaises(ValueError): + time.strftime('%f') + def test_strftime_bounds_checking(self): # Make sure that strftime() checks the bounds of the various parts #of the time tuple (0 is valid for *all* values). diff --git a/Modules/timemodule.c b/Modules/timemodule.c --- a/Modules/timemodule.c +++ b/Modules/timemodule.c @@ -487,7 +487,7 @@ if (outbuf[1]=='#') ++outbuf; /* not documented by python, */ if (outbuf[1]=='\0' || - !strchr("aAbBcdfHIjmMpSUwWxXyYzZ%", outbuf[1])) + !strchr("aAbBcdHIjmMpSUwWxXyYzZ%", outbuf[1])) { PyErr_SetString(PyExc_ValueError, "Invalid format string"); return 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 08:45:36 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 06 Apr 2011 08:45:36 +0200 Subject: [Python-checkins] cpython (merge 2.7 -> 2.7): hg pull/merge - Changes to accomodate. Message-ID: http://hg.python.org/cpython/rev/da212fa62fea changeset: 69170:da212fa62fea branch: 2.7 parent: 69169:1320f29bcf98 parent: 69165:202a9feb1fd6 user: Senthil Kumaran date: Wed Apr 06 14:41:42 2011 +0800 summary: hg pull/merge - Changes to accomodate. 
files: Lib/test/test_datetime.py | 7 +++++++ Modules/datetimemodule.c | 15 ++++++++------- 2 files changed, 15 insertions(+), 7 deletions(-) diff --git a/Lib/test/test_datetime.py b/Lib/test/test_datetime.py --- a/Lib/test/test_datetime.py +++ b/Lib/test/test_datetime.py @@ -231,6 +231,13 @@ eq(a//10, td(0, 7*24*360)) eq(a//3600000, td(0, 0, 7*24*1000)) + # Issue #11576 + eq(td(999999999, 86399, 999999) - td(999999999, 86399, 999998), + td(0, 0, 1)) + eq(td(999999999, 1, 1) - td(999999999, 1, 0), + td(0, 0, 1)) + + def test_disallowed_computations(self): a = timedelta(42) diff --git a/Modules/datetimemodule.c b/Modules/datetimemodule.c --- a/Modules/datetimemodule.c +++ b/Modules/datetimemodule.c @@ -1737,13 +1737,14 @@ if (PyDelta_Check(left) && PyDelta_Check(right)) { /* delta - delta */ - PyObject *minus_right = PyNumber_Negative(right); - if (minus_right) { - result = delta_add(left, minus_right); - Py_DECREF(minus_right); - } - else - result = NULL; + /* The C-level additions can't overflow because of the + * invariant bounds. + */ + int days = GET_TD_DAYS(left) - GET_TD_DAYS(right); + int seconds = GET_TD_SECONDS(left) - GET_TD_SECONDS(right); + int microseconds = GET_TD_MICROSECONDS(left) - + GET_TD_MICROSECONDS(right); + result = new_delta(days, seconds, microseconds, 1); } if (result == Py_NotImplemented) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 14:16:41 2011 From: python-checkins at python.org (r.david.murray) Date: Wed, 06 Apr 2011 14:16:41 +0200 Subject: [Python-checkins] cpython (3.2): #11605: don't use set/get_payload in feedparser; they do conversions. Message-ID: http://hg.python.org/cpython/rev/b807cf929e26 changeset: 69171:b807cf929e26 branch: 3.2 parent: 69167:1accc17055c9 user: R David Murray date: Wed Apr 06 08:13:02 2011 -0400 summary: #11605: don't use set/get_payload in feedparser; they do conversions. 
Really the whole API needs to be gone over to restore the separation of concerns; but that's what email6 is about. files: Lib/email/feedparser.py | 4 +- Lib/email/test/test_email.py | 47 ++++++++++++++++++++++++ Misc/NEWS | 3 + 3 files changed, 52 insertions(+), 2 deletions(-) diff --git a/Lib/email/feedparser.py b/Lib/email/feedparser.py --- a/Lib/email/feedparser.py +++ b/Lib/email/feedparser.py @@ -368,12 +368,12 @@ end = len(mo.group(0)) self._last.epilogue = epilogue[:-end] else: - payload = self._last.get_payload() + payload = self._last._payload if isinstance(payload, str): mo = NLCRE_eol.search(payload) if mo: payload = payload[:-len(mo.group(0))] - self._last.set_payload(payload) + self._last._payload = payload self._input.pop_eof_matcher() self._pop_message() # Set the multipart up for newline cleansing, which will diff --git a/Lib/email/test/test_email.py b/Lib/email/test/test_email.py --- a/Lib/email/test/test_email.py +++ b/Lib/email/test/test_email.py @@ -3168,6 +3168,53 @@ g = email.generator.BytesGenerator(s) g.flatten(msg, linesep='\r\n') self.assertEqual(s.getvalue(), text) + + def test_8bit_multipart(self): + # Issue 11605 + source = textwrap.dedent("""\ + Date: Fri, 18 Mar 2011 17:15:43 +0100 + To: foo at example.com + From: foodwatch-Newsletter + Subject: Aktuelles zu Japan, Klonfleisch und Smiley-System + Message-ID: <76a486bee62b0d200f33dc2ca08220ad at localhost.localdomain> + MIME-Version: 1.0 + Content-Type: multipart/alternative; + boundary="b1_76a486bee62b0d200f33dc2ca08220ad" + + --b1_76a486bee62b0d200f33dc2ca08220ad + Content-Type: text/plain; charset="utf-8" + Content-Transfer-Encoding: 8bit + + Guten Tag, , + + mit großer Betroffenheit verfolgen auch wir im foodwatch-Team die + Nachrichten aus Japan. + + + --b1_76a486bee62b0d200f33dc2ca08220ad + Content-Type: text/html; charset="utf-8" + Content-Transfer-Encoding: 8bit + + + + + foodwatch - Newsletter + + +

mit großer Betroffenheit verfolgen auch wir im foodwatch-Team + die Nachrichten aus Japan.

+ + + --b1_76a486bee62b0d200f33dc2ca08220ad-- + + """).encode('utf-8') + msg = email.message_from_bytes(source) + s = BytesIO() + g = email.generator.BytesGenerator(s) + g.flatten(msg) + self.assertEqual(s.getvalue(), source) + maxDiff = None diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -49,6 +49,9 @@ Library ------- +- Issue #11605: email.parser.BytesFeedParser was incorrectly converting multipart + subpararts with an 8bit CTE into unicode instead of preserving the bytes. + - Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. - Issue #11746: Fix SSLContext.load_cert_chain() to accept elliptic curve -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 14:16:43 2011 From: python-checkins at python.org (r.david.murray) Date: Wed, 06 Apr 2011 14:16:43 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge #11605: don't use set/get_payload in feedparser; they do conversions. Message-ID: http://hg.python.org/cpython/rev/642c0d6799c5 changeset: 69172:642c0d6799c5 parent: 69168:dc728ac66c3c parent: 69171:b807cf929e26 user: R David Murray date: Wed Apr 06 08:16:13 2011 -0400 summary: Merge #11605: don't use set/get_payload in feedparser; they do conversions. 
files: Lib/email/feedparser.py | 4 +- Lib/test/test_email/test_email.py | 47 +++++++++++++++++++ Misc/NEWS | 3 + 3 files changed, 52 insertions(+), 2 deletions(-) diff --git a/Lib/email/feedparser.py b/Lib/email/feedparser.py --- a/Lib/email/feedparser.py +++ b/Lib/email/feedparser.py @@ -368,12 +368,12 @@ end = len(mo.group(0)) self._last.epilogue = epilogue[:-end] else: - payload = self._last.get_payload() + payload = self._last._payload if isinstance(payload, str): mo = NLCRE_eol.search(payload) if mo: payload = payload[:-len(mo.group(0))] - self._last.set_payload(payload) + self._last._payload = payload self._input.pop_eof_matcher() self._pop_message() # Set the multipart up for newline cleansing, which will diff --git a/Lib/test/test_email/test_email.py b/Lib/test/test_email/test_email.py --- a/Lib/test/test_email/test_email.py +++ b/Lib/test/test_email/test_email.py @@ -3143,6 +3143,53 @@ g = email.generator.BytesGenerator(s) g.flatten(msg, linesep='\r\n') self.assertEqual(s.getvalue(), text) + + def test_8bit_multipart(self): + # Issue 11605 + source = textwrap.dedent("""\ + Date: Fri, 18 Mar 2011 17:15:43 +0100 + To: foo at example.com + From: foodwatch-Newsletter + Subject: Aktuelles zu Japan, Klonfleisch und Smiley-System + Message-ID: <76a486bee62b0d200f33dc2ca08220ad at localhost.localdomain> + MIME-Version: 1.0 + Content-Type: multipart/alternative; + boundary="b1_76a486bee62b0d200f33dc2ca08220ad" + + --b1_76a486bee62b0d200f33dc2ca08220ad + Content-Type: text/plain; charset="utf-8" + Content-Transfer-Encoding: 8bit + + Guten Tag, , + + mit großer Betroffenheit verfolgen auch wir im foodwatch-Team die + Nachrichten aus Japan. + + + --b1_76a486bee62b0d200f33dc2ca08220ad + Content-Type: text/html; charset="utf-8" + Content-Transfer-Encoding: 8bit + + + + + foodwatch - Newsletter + + +

mit großer Betroffenheit verfolgen auch wir im foodwatch-Team + die Nachrichten aus Japan.

+ + + --b1_76a486bee62b0d200f33dc2ca08220ad-- + + """).encode('utf-8') + msg = email.message_from_bytes(source) + s = BytesIO() + g = email.generator.BytesGenerator(s) + g.flatten(msg) + self.assertEqual(s.getvalue(), source) + maxDiff = None diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -94,6 +94,9 @@ Library ------- +- Issue #11605: email.parser.BytesFeedParser was incorrectly converting multipart + subpararts with an 8bit CTE into unicode instead of preserving the bytes. + - Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. - Issue #10791: Implement missing method GzipFile.read1(), allowing GzipFile -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 15:52:23 2011 From: python-checkins at python.org (r.david.murray) Date: Wed, 06 Apr 2011 15:52:23 +0200 Subject: [Python-checkins] cpython: #1690608: make formataddr RFC2047 aware. Message-ID: http://hg.python.org/cpython/rev/184ddd9acd5a changeset: 69173:184ddd9acd5a user: R David Murray date: Wed Apr 06 09:35:57 2011 -0400 summary: #1690608: make formataddr RFC2047 aware. Patch by Torsten Becker. files: Doc/library/email.util.rst | 9 +++- Lib/email/utils.py | 28 ++++++++++-- Lib/test/test_email/test_email.py | 40 +++++++++++++++++++ Misc/ACKS | 1 + Misc/NEWS | 4 + 5 files changed, 75 insertions(+), 7 deletions(-) diff --git a/Doc/library/email.util.rst b/Doc/library/email.util.rst --- a/Doc/library/email.util.rst +++ b/Doc/library/email.util.rst @@ -29,13 +29,20 @@ fails, in which case a 2-tuple of ``('', '')`` is returned. -.. function:: formataddr(pair) +.. function:: formataddr(pair, charset='utf-8') The inverse of :meth:`parseaddr`, this takes a 2-tuple of the form ``(realname, email_address)`` and returns the string value suitable for a :mailheader:`To` or :mailheader:`Cc` header. If the first element of *pair* is false, then the second element is returned unmodified. 
+ Optional *charset* is the character set that will be used in the :rfc:`2047` + encoding of the ``realname`` if the ``realname`` contains non-ASCII + characters. Can be an instance of :class:`str` or a + :class:`~email.charset.Charset`. Defaults to ``utf-8``. + + .. versionchanged: 3.3 added the *charset* option + .. function:: getaddresses(fieldvalues) diff --git a/Lib/email/utils.py b/Lib/email/utils.py --- a/Lib/email/utils.py +++ b/Lib/email/utils.py @@ -42,6 +42,7 @@ # Intrapackage imports from email.encoders import _bencode, _qencode +from email.charset import Charset COMMASPACE = ', ' EMPTYSTRING = '' @@ -56,21 +57,36 @@ # Helpers -def formataddr(pair): +def formataddr(pair, charset='utf-8'): """The inverse of parseaddr(), this takes a 2-tuple of the form (realname, email_address) and returns the string value suitable for an RFC 2822 From, To or Cc header. If the first element of pair is false, then the second element is returned unmodified. + + Optional charset if given is the character set that is used to encode + realname in case realname is not ASCII safe. Can be an instance of str or + a Charset-like object which has a header_encode method. Default is + 'utf-8'. """ name, address = pair + # The address MUST (per RFC) be ascii, so throw a UnicodeError if it isn't. 
+ address.encode('ascii') if name: - quotes = '' - if specialsre.search(name): - quotes = '"' - name = escapesre.sub(r'\\\g<0>', name) - return '%s%s%s <%s>' % (quotes, name, quotes, address) + try: + name.encode('ascii') + except UnicodeEncodeError: + if isinstance(charset, str): + charset = Charset(charset) + encoded_name = charset.header_encode(name) + return "%s <%s>" % (encoded_name, address) + else: + quotes = '' + if specialsre.search(name): + quotes = '"' + name = escapesre.sub(r'\\\g<0>', name) + return '%s%s%s <%s>' % (quotes, name, quotes, address) return address diff --git a/Lib/test/test_email/test_email.py b/Lib/test/test_email/test_email.py --- a/Lib/test/test_email/test_email.py +++ b/Lib/test/test_email/test_email.py @@ -2376,6 +2376,46 @@ b = 'person at dom.ain' self.assertEqual(utils.parseaddr(utils.formataddr((a, b))), (a, b)) + def test_quotes_unicode_names(self): + # issue 1690608. email.utils.formataddr() should be rfc2047 aware. + name = "H\u00e4ns W\u00fcrst" + addr = 'person at dom.ain' + utf8_base64 = "=?utf-8?b?SMOkbnMgV8O8cnN0?= " + latin1_quopri = "=?iso-8859-1?q?H=E4ns_W=FCrst?= " + self.assertEqual(utils.formataddr((name, addr)), utf8_base64) + self.assertEqual(utils.formataddr((name, addr), 'iso-8859-1'), + latin1_quopri) + + def test_accepts_any_charset_like_object(self): + # issue 1690608. email.utils.formataddr() should be rfc2047 aware. + name = "H\u00e4ns W\u00fcrst" + addr = 'person at dom.ain' + utf8_base64 = "=?utf-8?b?SMOkbnMgV8O8cnN0?= " + foobar = "FOOBAR" + class CharsetMock: + def header_encode(self, string): + return foobar + mock = CharsetMock() + mock_expected = "%s <%s>" % (foobar, addr) + self.assertEqual(utils.formataddr((name, addr), mock), mock_expected) + self.assertEqual(utils.formataddr((name, addr), Charset('utf-8')), + utf8_base64) + + def test_invalid_charset_like_object_raises_error(self): + # issue 1690608. email.utils.formataddr() should be rfc2047 aware. 
+ name = "H\u00e4ns W\u00fcrst" + addr = 'person at dom.ain' + # A object without a header_encode method: + bad_charset = object() + self.assertRaises(AttributeError, utils.formataddr, (name, addr), + bad_charset) + + def test_unicode_address_raises_error(self): + # issue 1690608. email.utils.formataddr() should be rfc2047 aware. + addr = 'pers\u00f6n at dom.in' + self.assertRaises(UnicodeError, utils.formataddr, (None, addr)) + self.assertRaises(UnicodeError, utils.formataddr, ("Name", addr)) + def test_name_with_dot(self): x = 'John X. Doe ' y = '"John X. Doe" ' diff --git a/Misc/ACKS b/Misc/ACKS --- a/Misc/ACKS +++ b/Misc/ACKS @@ -979,3 +979,4 @@ Kai Zhu Tarek Ziadé Peter Åstrand +Torsten Becker diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -97,6 +97,10 @@ - Issue #11605: email.parser.BytesFeedParser was incorrectly converting multipart subpararts with an 8bit CTE into unicode instead of preserving the bytes. +- Issue #1690608: email.util.formataddr is now RFC2047 aware: it now has a + charset parameter that defaults utf-8 which is used as the charset for RFC + 2047 encoding when the realname contains non-ASCII characters. + - Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. - Issue #10791: Implement missing method GzipFile.read1(), allowing GzipFile -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 18:38:26 2011 From: python-checkins at python.org (brett.cannon) Date: Wed, 06 Apr 2011 18:38:26 +0200 Subject: [Python-checkins] peps: Fix some spelling mistakes found by Ezio. Message-ID: http://hg.python.org/peps/rev/69662427c7c5 changeset: 3860:69662427c7c5 user: Brett Cannon date: Wed Apr 06 09:38:21 2011 -0700 summary: Fix some spelling mistakes found by Ezio. 
files: pep-0399.txt | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/pep-0399.txt b/pep-0399.txt --- a/pep-0399.txt +++ b/pep-0399.txt @@ -132,13 +132,13 @@ equivalence requirement also dictates that no public API be provided in accelerated code that does not exist in the pure Python code. Without this requirement people could accidentally come to rely on a -detail in the acclerated code which is not made available to other VMs +detail in the accelerated code which is not made available to other VMs that use the pure Python implementation. To help verify that the contract of semantic equivalence is being met, a module must be tested both with and without its accelerated code as thoroughly as possible. As an example, to write tests which exercise both the pure Python and -C acclerated versions of a module, a basic idiom can be followed:: +C accelerated versions of a module, a basic idiom can be followed:: import collections.abc from test.support import import_fresh_module, run_unittest @@ -168,7 +168,7 @@ class AcceleratedExampleTest(ExampleTest): - """Test using the acclerated code.""" + """Test using the accelerated code.""" heapq = c_heapq -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Wed Apr 6 19:05:59 2011 From: python-checkins at python.org (brett.cannon) Date: Wed, 06 Apr 2011 19:05:59 +0200 Subject: [Python-checkins] peps: Explicitly mention accelerator modules can be a subset of functionality. Message-ID: http://hg.python.org/peps/rev/28f8ebde4dc8 changeset: 3861:28f8ebde4dc8 user: Brett Cannon date: Wed Apr 06 10:05:50 2011 -0700 summary: Explicitly mention accelerator modules can be a subset of functionality. 
files: pep-0399.txt | 30 ++++++++++++++++-------------- 1 files changed, 16 insertions(+), 14 deletions(-) diff --git a/pep-0399.txt b/pep-0399.txt --- a/pep-0399.txt +++ b/pep-0399.txt @@ -14,12 +14,13 @@ ======== The Python standard library under CPython contains various instances -of modules implemented in both pure Python and C. This PEP requires -that in these instances that both the Python and C code *must* be -semantically identical (except in cases where implementation details -of a VM prevents it entirely). It is also required that new C-based -modules lacking a pure Python equivalent implementation get special -permissions to be added to the standard library. +of modules implemented in both pure Python and C (either entirely or +partially). This PEP requires that in these instances that both the +Python and C code *must* be semantically identical (except in cases +where implementation details of a VM prevents it entirely). It is also +required that new C-based modules lacking a pure Python equivalent +implementation get special permissions to be added to the standard +library. Rationale @@ -33,14 +34,15 @@ used in Java applications. A problem all of the VMs other than CPython face is handling modules -from the standard library that are implemented in C. Since they do not -typically support the entire `C API of Python`_ they are unable to use -the code used to create the module. Often times this leads these other -VMs to either re-implement the modules in pure Python or in the -programming language used to implement the VM (e.g., in C# for -IronPython). This duplication of effort between CPython, PyPy, Jython, -and IronPython is extremely unfortunate as implementing a module *at -least* in pure Python would help mitigate this duplicate effort. +from the standard library that are implemented (to some extent) in C. +Since they do not typically support the entire `C API of Python`_ they +are unable to use the code used to create the module. 
Often times this +leads these other VMs to either re-implement the modules in pure +Python or in the programming language used to implement the VM +(e.g., in C# for IronPython). This duplication of effort between +CPython, PyPy, Jython, and IronPython is extremely unfortunate as +implementing a module *at least* in pure Python would help mitigate +this duplicate effort. The purpose of this PEP is to minimize this duplicate effort by mandating that all new modules added to Python's standard library -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Wed Apr 6 22:35:36 2011 From: python-checkins at python.org (barry.warsaw) Date: Wed, 06 Apr 2011 22:35:36 +0200 Subject: [Python-checkins] cpython (3.1): Issue 11715: Build extension modules on multiarch Debian and Ubuntu by Message-ID: http://hg.python.org/cpython/rev/7582a78f573b changeset: 69174:7582a78f573b branch: 3.1 parent: 69166:2ca1bc677a60 user: Barry Warsaw date: Wed Apr 06 15:18:12 2011 -0400 summary: Issue 11715: Build extension modules on multiarch Debian and Ubuntu by extending search paths to include multiarch directories. files: setup.py | 21 +++++++++++++++++++++ 1 files changed, 21 insertions(+), 0 deletions(-) diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -339,10 +339,31 @@ return platform return sys.platform + def add_multiarch_paths(self): + # Debian/Ubuntu multiarch support. 
+ # https://wiki.ubuntu.com/MultiarchSpec + tmpfile = os.path.join(self.build_temp, 'multiarch') + if not os.path.exists(self.build_temp): + os.makedirs(self.build_temp) + ret = os.system( + 'dpkg-architecture -qDEB_HOST_MULTIARCH > %s 2> /dev/null' % + tmpfile) + try: + if ret >> 8 == 0: + with open(tmpfile) as fp: + multiarch_path_component = fp.readline().strip() + add_dir_to_list(self.compiler.library_dirs, + '/usr/lib/' + multiarch_path_component) + add_dir_to_list(self.compiler.include_dirs, + '/usr/include/' + multiarch_path_component) + finally: + os.unlink(tmpfile) + def detect_modules(self): # Ensure that /usr/local is always used add_dir_to_list(self.compiler.library_dirs, '/usr/local/lib') add_dir_to_list(self.compiler.include_dirs, '/usr/local/include') + self.add_multiarch_paths() # Add paths specified in the environment variables LDFLAGS and # CPPFLAGS for header and library files. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 22:35:37 2011 From: python-checkins at python.org (barry.warsaw) Date: Wed, 06 Apr 2011 22:35:37 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Issue 11715: Merge multiarch fix from 3.1 branch. Message-ID: http://hg.python.org/cpython/rev/867937dd2279 changeset: 69175:867937dd2279 branch: 3.2 parent: 69171:b807cf929e26 parent: 69174:7582a78f573b user: Barry Warsaw date: Wed Apr 06 15:19:05 2011 -0400 summary: Issue 11715: Merge multiarch fix from 3.1 branch. files: setup.py | 21 +++++++++++++++++++++ 1 files changed, 21 insertions(+), 0 deletions(-) diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -370,12 +370,33 @@ return platform return sys.platform + def add_multiarch_paths(self): + # Debian/Ubuntu multiarch support. 
+ # https://wiki.ubuntu.com/MultiarchSpec + tmpfile = os.path.join(self.build_temp, 'multiarch') + if not os.path.exists(self.build_temp): + os.makedirs(self.build_temp) + ret = os.system( + 'dpkg-architecture -qDEB_HOST_MULTIARCH > %s 2> /dev/null' % + tmpfile) + try: + if ret >> 8 == 0: + with open(tmpfile) as fp: + multiarch_path_component = fp.readline().strip() + add_dir_to_list(self.compiler.library_dirs, + '/usr/lib/' + multiarch_path_component) + add_dir_to_list(self.compiler.include_dirs, + '/usr/include/' + multiarch_path_component) + finally: + os.unlink(tmpfile) + def detect_modules(self): # Ensure that /usr/local is always used, but the local build # directories (i.e. '.' and 'Include') must be first. See issue # 10520. add_dir_to_list(self.compiler.library_dirs, '/usr/local/lib') add_dir_to_list(self.compiler.include_dirs, '/usr/local/include') + self.add_multiarch_paths() # Add paths specified in the environment variables LDFLAGS and # CPPFLAGS for header and library files. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 22:35:39 2011 From: python-checkins at python.org (barry.warsaw) Date: Wed, 06 Apr 2011 22:35:39 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue 11715: Merge multiarch fix from 3.1 branch. Message-ID: http://hg.python.org/cpython/rev/3f00611c3daf changeset: 69176:3f00611c3daf parent: 69173:184ddd9acd5a parent: 69175:867937dd2279 user: Barry Warsaw date: Wed Apr 06 15:19:25 2011 -0400 summary: Issue 11715: Merge multiarch fix from 3.1 branch. files: setup.py | 21 +++++++++++++++++++++ 1 files changed, 21 insertions(+), 0 deletions(-) diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -370,12 +370,33 @@ return platform return sys.platform + def add_multiarch_paths(self): + # Debian/Ubuntu multiarch support. 
+ # https://wiki.ubuntu.com/MultiarchSpec + tmpfile = os.path.join(self.build_temp, 'multiarch') + if not os.path.exists(self.build_temp): + os.makedirs(self.build_temp) + ret = os.system( + 'dpkg-architecture -qDEB_HOST_MULTIARCH > %s 2> /dev/null' % + tmpfile) + try: + if ret >> 8 == 0: + with open(tmpfile) as fp: + multiarch_path_component = fp.readline().strip() + add_dir_to_list(self.compiler.library_dirs, + '/usr/lib/' + multiarch_path_component) + add_dir_to_list(self.compiler.include_dirs, + '/usr/include/' + multiarch_path_component) + finally: + os.unlink(tmpfile) + def detect_modules(self): # Ensure that /usr/local is always used, but the local build # directories (i.e. '.' and 'Include') must be first. See issue # 10520. add_dir_to_list(self.compiler.library_dirs, '/usr/local/lib') add_dir_to_list(self.compiler.include_dirs, '/usr/local/include') + self.add_multiarch_paths() # Add paths specified in the environment variables LDFLAGS and # CPPFLAGS for header and library files. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 22:54:54 2011 From: python-checkins at python.org (antoine.pitrou) Date: Wed, 06 Apr 2011 22:54:54 +0200 Subject: [Python-checkins] cpython (3.2): Issue #11766: increase countdown waiting for a pool of processes to start Message-ID: http://hg.python.org/cpython/rev/c4a514199dba changeset: 69177:c4a514199dba branch: 3.2 parent: 69175:867937dd2279 user: Antoine Pitrou date: Wed Apr 06 22:51:17 2011 +0200 summary: Issue #11766: increase countdown waiting for a pool of processes to start up. Hopefully fixes transient buildbot failures. 
files: Lib/test/test_multiprocessing.py | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_multiprocessing.py b/Lib/test/test_multiprocessing.py --- a/Lib/test/test_multiprocessing.py +++ b/Lib/test/test_multiprocessing.py @@ -1170,7 +1170,8 @@ # Refill the pool p._repopulate_pool() # Wait until all workers are alive - countdown = 5 + # (countdown * DELTA = 5 seconds max startup process time) + countdown = 50 while countdown and not all(w.is_alive() for w in p._pool): countdown -= 1 time.sleep(DELTA) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 22:54:57 2011 From: python-checkins at python.org (antoine.pitrou) Date: Wed, 06 Apr 2011 22:54:57 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #11766: increase countdown waiting for a pool of processes to start Message-ID: http://hg.python.org/cpython/rev/3eac8302a448 changeset: 69178:3eac8302a448 parent: 69176:3f00611c3daf parent: 69177:c4a514199dba user: Antoine Pitrou date: Wed Apr 06 22:54:14 2011 +0200 summary: Issue #11766: increase countdown waiting for a pool of processes to start up. Hopefully fixes transient buildbot failures. 
files: Lib/test/test_multiprocessing.py | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_multiprocessing.py b/Lib/test/test_multiprocessing.py --- a/Lib/test/test_multiprocessing.py +++ b/Lib/test/test_multiprocessing.py @@ -1182,7 +1182,8 @@ # Refill the pool p._repopulate_pool() # Wait until all workers are alive - countdown = 5 + # (countdown * DELTA = 5 seconds max startup process time) + countdown = 50 while countdown and not all(w.is_alive() for w in p._pool): countdown -= 1 time.sleep(DELTA) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 6 22:55:42 2011 From: python-checkins at python.org (raymond.hettinger) Date: Wed, 06 Apr 2011 22:55:42 +0200 Subject: [Python-checkins] peps: Add PEP 207 guidance on rich comparisons to PEP 8. Message-ID: http://hg.python.org/peps/rev/64bda015861d changeset: 3862:64bda015861d user: Raymond Hettinger date: Wed Apr 06 13:53:31 2011 -0700 summary: Add PEP 207 guidance on rich comparisons to PEP 8. files: pep-0008.txt | 15 +++++++++++++++ 1 files changed, 15 insertions(+), 0 deletions(-) diff --git a/pep-0008.txt b/pep-0008.txt --- a/pep-0008.txt +++ b/pep-0008.txt @@ -667,6 +667,21 @@ None was set to some other value. The other value might have a type (such as a container) that could be false in a boolean context! + - When implementing ordering operations with rich comparisons, it is best to + implement all six operations (__eq__, __ne__, __lt__, __le__, __gt__, + __ge__) rather than relying on other code to only exercise a particular + comparison. + + To minimize the effort involved, the functools.total_ordering() decorator + provides a tool to generate missing comparison methods. + + PEP 207 indicates that reflexivity rules *are* assumed by Python. Thus, + the interpreter may swap y>x with x<y, y>=x with x<=y, and may swap the + arguments of x==y and x!=y. 
The sort() and min() operations are + guaranteed to use the < operator and the max() function uses the > + operator. However, it is best to implement all six operations so that + confusion doesn't arise in other contexts. + - Use class-based exceptions. String exceptions in new code are forbidden, because this language -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Wed Apr 6 23:01:40 2011 From: python-checkins at python.org (antoine.pitrou) Date: Wed, 06 Apr 2011 23:01:40 +0200 Subject: [Python-checkins] cpython (2.7): Issue #11766: increase countdown waiting for a pool of processes to start Message-ID: http://hg.python.org/cpython/rev/2e4cdaffe493 changeset: 69179:2e4cdaffe493 branch: 2.7 parent: 69170:da212fa62fea user: Antoine Pitrou date: Wed Apr 06 22:54:14 2011 +0200 summary: Issue #11766: increase countdown waiting for a pool of processes to start up. Hopefully fixes transient buildbot failures. files: Lib/test/test_multiprocessing.py | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_multiprocessing.py b/Lib/test/test_multiprocessing.py --- a/Lib/test/test_multiprocessing.py +++ b/Lib/test/test_multiprocessing.py @@ -1126,7 +1126,8 @@ # Refill the pool p._repopulate_pool() # Wait until all workers are alive - countdown = 5 + # (countdown * DELTA = 5 seconds max startup process time) + countdown = 50 while countdown and not all(w.is_alive() for w in p._pool): countdown -= 1 time.sleep(DELTA) -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Thu Apr 7 04:57:00 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Thu, 07 Apr 2011 04:57:00 +0200 Subject: [Python-checkins] Daily reference leaks (3eac8302a448): sum=0 Message-ID: results for 3eac8302a448 on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/refloguDZg8a', '-x'] From python-checkins 
at python.org Thu Apr 7 06:16:45 2011 From: python-checkins at python.org (alexander.belopolsky) Date: Thu, 07 Apr 2011 06:16:45 +0200 Subject: [Python-checkins] cpython (3.2): Removed 'or long integer' from bin, oct, and hex docstrings. Message-ID: http://hg.python.org/cpython/rev/d29277949ad6 changeset: 69180:d29277949ad6 branch: 3.2 parent: 69177:c4a514199dba user: Alexander Belopolsky date: Thu Apr 07 00:15:33 2011 -0400 summary: Removed 'or long integer' from bin, oct, and hex docstrings. files: Python/bltinmodule.c | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Python/bltinmodule.c b/Python/bltinmodule.c --- a/Python/bltinmodule.c +++ b/Python/bltinmodule.c @@ -307,7 +307,7 @@ PyDoc_STRVAR(bin_doc, "bin(number) -> string\n\ \n\ -Return the binary representation of an integer or long integer."); +Return the binary representation of an integer."); static PyObject * @@ -1192,7 +1192,7 @@ PyDoc_STRVAR(hex_doc, "hex(number) -> string\n\ \n\ -Return the hexadecimal representation of an integer or long integer."); +Return the hexadecimal representation of an integer."); static PyObject * @@ -1380,7 +1380,7 @@ PyDoc_STRVAR(oct_doc, "oct(number) -> string\n\ \n\ -Return the octal representation of an integer or long integer."); +Return the octal representation of an integer."); static PyObject * -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Apr 7 06:16:47 2011 From: python-checkins at python.org (alexander.belopolsky) Date: Thu, 07 Apr 2011 06:16:47 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Removed 'or long integer' from bin, oct, and hex docstrings. Message-ID: http://hg.python.org/cpython/rev/11052e067192 changeset: 69181:11052e067192 parent: 69178:3eac8302a448 parent: 69180:d29277949ad6 user: Alexander Belopolsky date: Thu Apr 07 00:16:22 2011 -0400 summary: Removed 'or long integer' from bin, oct, and hex docstrings. 
files: Python/bltinmodule.c | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Python/bltinmodule.c b/Python/bltinmodule.c --- a/Python/bltinmodule.c +++ b/Python/bltinmodule.c @@ -303,7 +303,7 @@ PyDoc_STRVAR(bin_doc, "bin(number) -> string\n\ \n\ -Return the binary representation of an integer or long integer."); +Return the binary representation of an integer."); static PyObject * @@ -1186,7 +1186,7 @@ PyDoc_STRVAR(hex_doc, "hex(number) -> string\n\ \n\ -Return the hexadecimal representation of an integer or long integer."); +Return the hexadecimal representation of an integer."); static PyObject * @@ -1374,7 +1374,7 @@ PyDoc_STRVAR(oct_doc, "oct(number) -> string\n\ \n\ -Return the octal representation of an integer or long integer."); +Return the octal representation of an integer."); static PyObject * -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Apr 7 11:51:30 2011 From: python-checkins at python.org (victor.stinner) Date: Thu, 07 Apr 2011 11:51:30 +0200 Subject: [Python-checkins] cpython: faulthandler: check PyThreadState_Get() result in dump_tracebacks_later() Message-ID: http://hg.python.org/cpython/rev/7a77a0d9c5b7 changeset: 69182:7a77a0d9c5b7 user: Victor Stinner date: Thu Apr 07 11:37:19 2011 +0200 summary: faulthandler: check PyThreadState_Get() result in dump_tracebacks_later() Cleanup also the code files: Modules/faulthandler.c | 33 +++++++++++++++++++---------- 1 files changed, 21 insertions(+), 12 deletions(-) diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -5,6 +5,9 @@ #include #include +/* Allocate at maximum 100 MB of the stack to raise the stack overflow */ +#define STACK_OVERFLOW_MAX_SIZE (100*1024*1024) + #ifdef WITH_THREAD # define FAULTHANDLER_LATER #endif @@ -16,9 +19,6 @@ # define FAULTHANDLER_USER #endif -/* Allocate at maximum 100 MB of the stack to raise the stack overflow */ -#define 
STACK_OVERFLOW_MAX_SIZE (100*1024*1024) - #define PUTS(fd, str) write(fd, str, strlen(str)) #ifdef HAVE_SIGACTION @@ -451,8 +451,8 @@ } static PyObject* -faulthandler_dump_traceback_later(PyObject *self, - PyObject *args, PyObject *kwargs) +faulthandler_dump_tracebacks_later(PyObject *self, + PyObject *args, PyObject *kwargs) { static char *kwlist[] = {"timeout", "repeat", "file", "exit", NULL}; double timeout; @@ -461,6 +461,7 @@ PyObject *file = NULL; int fd; int exit = 0; + PyThreadState *tstate; if (!PyArg_ParseTupleAndKeywords(args, kwargs, "d|iOi:dump_tracebacks_later", kwlist, @@ -477,6 +478,13 @@ return NULL; } + tstate = PyThreadState_Get(); + if (tstate == NULL) { + PyErr_SetString(PyExc_RuntimeError, + "unable to get the current thread state"); + return NULL; + } + file = faulthandler_get_fileno(file, &fd); if (file == NULL) return NULL; @@ -490,7 +498,7 @@ thread.fd = fd; thread.timeout_ms = timeout_ms; thread.repeat = repeat; - thread.interp = PyThreadState_Get()->interp; + thread.interp = tstate->interp; thread.exit = exit; /* Arm these locks to serve as events when released */ @@ -826,7 +834,7 @@ faulthandler_traverse(PyObject *module, visitproc visit, void *arg) { #ifdef FAULTHANDLER_USER - unsigned int index; + unsigned int signum; #endif #ifdef FAULTHANDLER_LATER @@ -834,8 +842,8 @@ #endif #ifdef FAULTHANDLER_USER if (user_signals != NULL) { - for (index=0; index < NSIG; index++) - Py_VISIT(user_signals[index].file); + for (signum=0; signum < NSIG; signum++) + Py_VISIT(user_signals[signum].file); } #endif Py_VISIT(fatal_error.file); @@ -861,10 +869,11 @@ "if all_threads is True, into file")}, #ifdef FAULTHANDLER_LATER {"dump_tracebacks_later", - (PyCFunction)faulthandler_dump_traceback_later, METH_VARARGS|METH_KEYWORDS, - PyDoc_STR("dump_tracebacks_later(timeout, repeat=False, file=sys.stderr):\n" + (PyCFunction)faulthandler_dump_tracebacks_later, METH_VARARGS|METH_KEYWORDS, + PyDoc_STR("dump_tracebacks_later(timeout, repeat=False, 
file=sys.stderrn, exit=False):\n" "dump the traceback of all threads in timeout seconds,\n" - "or each timeout seconds if repeat is True.")}, + "or each timeout seconds if repeat is True. If exit is True, " + "call _exit(1) which is not safe.")}, {"cancel_dump_tracebacks_later", (PyCFunction)faulthandler_cancel_dump_tracebacks_later_py, METH_NOARGS, PyDoc_STR("cancel_dump_tracebacks_later():\ncancel the previous call " -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Apr 7 11:51:31 2011 From: python-checkins at python.org (victor.stinner) Date: Thu, 07 Apr 2011 11:51:31 +0200 Subject: [Python-checkins] cpython: faulthandler: we don't use (or need) SA_SIGINFO flag of sigaction() Message-ID: http://hg.python.org/cpython/rev/eef9ab5e50db changeset: 69183:eef9ab5e50db user: Victor Stinner date: Thu Apr 07 11:39:03 2011 +0200 summary: faulthandler: we don't use (or need) SA_SIGINFO flag of sigaction() files: Modules/faulthandler.c | 9 ++------- 1 files changed, 2 insertions(+), 7 deletions(-) diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -218,12 +218,7 @@ This function is signal safe and should only call signal safe functions. 
*/ static void -faulthandler_fatal_error( - int signum -#ifdef HAVE_SIGACTION - , siginfo_t *siginfo, void *ucontext -#endif -) +faulthandler_fatal_error(int signum) { const int fd = fatal_error.fd; unsigned int i; @@ -320,7 +315,7 @@ for (i=0; i < faulthandler_nsignals; i++) { handler = &faulthandler_handlers[i]; #ifdef HAVE_SIGACTION - action.sa_sigaction = faulthandler_fatal_error; + action.sa_handler = faulthandler_fatal_error; sigemptyset(&action.sa_mask); /* Do not prevent the signal from being received from within its own signal handler */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Apr 7 11:51:32 2011 From: python-checkins at python.org (victor.stinner) Date: Thu, 07 Apr 2011 11:51:32 +0200 Subject: [Python-checkins] cpython: faulthandler: fix compilating without threads Message-ID: http://hg.python.org/cpython/rev/6adbf5f3dafb changeset: 69184:6adbf5f3dafb user: Victor Stinner date: Thu Apr 07 11:50:25 2011 +0200 summary: faulthandler: fix compilating without threads files: Lib/test/test_faulthandler.py | 7 +++++++ Modules/faulthandler.c | 8 ++++++++ 2 files changed, 15 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py --- a/Lib/test/test_faulthandler.py +++ b/Lib/test/test_faulthandler.py @@ -8,6 +8,12 @@ import tempfile import unittest +try: + import threading + HAVE_THREADS = True +except ImportError: + HAVE_THREADS = False + TIMEOUT = 0.5 try: @@ -279,6 +285,7 @@ with temporary_filename() as filename: self.check_dump_traceback(filename) + @unittest.skipIf(not HAVE_THREADS, 'need threads') def check_dump_traceback_threads(self, filename): """ Call explicitly dump_traceback(all_threads=True) and check the output. 
diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -250,6 +250,7 @@ PUTS(fd, handler->name); PUTS(fd, "\n\n"); +#ifdef WITH_THREAD /* SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL are synchronous signals and so are delivered to the thread that caused the fault. Get the Python thread state of the current thread. @@ -259,6 +260,9 @@ used. Read the thread local storage (TLS) instead: call PyGILState_GetThisThreadState(). */ tstate = PyGILState_GetThisThreadState(); +#else + tstate = PyThreadState_Get(); +#endif if (tstate == NULL) return; @@ -540,10 +544,14 @@ if (!user->enabled) return; +#ifdef WITH_THREAD /* PyThreadState_Get() doesn't give the state of the current thread if the thread doesn't hold the GIL. Read the thread local storage (TLS) instead: call PyGILState_GetThisThreadState(). */ tstate = PyGILState_GetThisThreadState(); +#else + tstate = PyThreadState_Get(); +#endif if (user->all_threads) _Py_DumpTracebackThreads(user->fd, user->interp, tstate); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Apr 7 16:48:41 2011 From: python-checkins at python.org (barry.warsaw) Date: Thu, 07 Apr 2011 16:48:41 +0200 Subject: [Python-checkins] cpython (3.1): Refinement by Stefan Krah (see issue 11715, msg133194) to exit early if the Message-ID: http://hg.python.org/cpython/rev/c8738114b962 changeset: 69185:c8738114b962 branch: 3.1 parent: 69174:7582a78f573b user: Barry Warsaw date: Thu Apr 07 10:40:36 2011 -0400 summary: Refinement by Stefan Krah (see issue 11715, msg133194) to exit early if the dpkg-architecture command is not found on $PATH. This should fix the failures on FreeBSD and Solaris, which do not create the target file via I/O redirection if the command isn't found (unlike Linux and OS X which do). 
files: setup.py | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -342,6 +342,8 @@ def add_multiarch_paths(self): # Debian/Ubuntu multiarch support. # https://wiki.ubuntu.com/MultiarchSpec + if not find_executable('dpkg-architecture'): + return tmpfile = os.path.join(self.build_temp, 'multiarch') if not os.path.exists(self.build_temp): os.makedirs(self.build_temp) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Apr 7 16:48:41 2011 From: python-checkins at python.org (barry.warsaw) Date: Thu, 07 Apr 2011 16:48:41 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Refinement by Stefan Krah (see issue 11715, msg133194) to exit early if the Message-ID: http://hg.python.org/cpython/rev/3d7c9b38fbfd changeset: 69186:3d7c9b38fbfd branch: 3.2 parent: 69180:d29277949ad6 parent: 69185:c8738114b962 user: Barry Warsaw date: Thu Apr 07 10:45:07 2011 -0400 summary: Refinement by Stefan Krah (see issue 11715, msg133194) to exit early if the dpkg-architecture command is not found on $PATH. This should fix the failures on FreeBSD and Solaris, which do not create the target file via I/O redirection if the command isn't found (unlike Linux and OS X which do). files: setup.py | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -373,6 +373,8 @@ def add_multiarch_paths(self): # Debian/Ubuntu multiarch support. 
# https://wiki.ubuntu.com/MultiarchSpec + if not find_executable('dpkg-architecture'): + return tmpfile = os.path.join(self.build_temp, 'multiarch') if not os.path.exists(self.build_temp): os.makedirs(self.build_temp) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Apr 7 16:48:43 2011 From: python-checkins at python.org (barry.warsaw) Date: Thu, 07 Apr 2011 16:48:43 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Refinement by Stefan Krah (see issue 11715, msg133194) to exit early if the Message-ID: http://hg.python.org/cpython/rev/bbfc65d05588 changeset: 69187:bbfc65d05588 parent: 69184:6adbf5f3dafb parent: 69186:3d7c9b38fbfd user: Barry Warsaw date: Thu Apr 07 10:48:29 2011 -0400 summary: Refinement by Stefan Krah (see issue 11715, msg133194) to exit early if the dpkg-architecture command is not found on $PATH. This should fix the failures on FreeBSD and Solaris, which do not create the target file via I/O redirection if the command isn't found (unlike Linux and OS X which do). files: setup.py | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -373,6 +373,8 @@ def add_multiarch_paths(self): # Debian/Ubuntu multiarch support. 
# https://wiki.ubuntu.com/MultiarchSpec + if not find_executable('dpkg-architecture'): + return tmpfile = os.path.join(self.build_temp, 'multiarch') if not os.path.exists(self.build_temp): os.makedirs(self.build_temp) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Apr 7 17:28:50 2011 From: python-checkins at python.org (barry.warsaw) Date: Thu, 07 Apr 2011 17:28:50 +0200 Subject: [Python-checkins] cpython (2.7): Backport for Python 2.7 of issue 11715 support for building Python on Message-ID: http://hg.python.org/cpython/rev/bd0f73a9538e changeset: 69188:bd0f73a9538e branch: 2.7 parent: 69179:2e4cdaffe493 user: Barry Warsaw date: Thu Apr 07 11:28:11 2011 -0400 summary: Backport for Python 2.7 of issue 11715 support for building Python on multiarch Debian/Ubuntu. files: setup.py | 23 +++++++++++++++++++++++ 1 files changed, 23 insertions(+), 0 deletions(-) diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -345,10 +345,33 @@ return platform return sys.platform + def add_multiarch_paths(self): + # Debian/Ubuntu multiarch support. 
+ # https://wiki.ubuntu.com/MultiarchSpec + if not find_executable('dpkg-architecture'): + return + tmpfile = os.path.join(self.build_temp, 'multiarch') + if not os.path.exists(self.build_temp): + os.makedirs(self.build_temp) + ret = os.system( + 'dpkg-architecture -qDEB_HOST_MULTIARCH > %s 2> /dev/null' % + tmpfile) + try: + if ret >> 8 == 0: + with open(tmpfile) as fp: + multiarch_path_component = fp.readline().strip() + add_dir_to_list(self.compiler.library_dirs, + '/usr/lib/' + multiarch_path_component) + add_dir_to_list(self.compiler.include_dirs, + '/usr/include/' + multiarch_path_component) + finally: + os.unlink(tmpfile) + def detect_modules(self): # Ensure that /usr/local is always used add_dir_to_list(self.compiler.library_dirs, '/usr/local/lib') add_dir_to_list(self.compiler.include_dirs, '/usr/local/include') + self.add_multiarch_paths() # Add paths specified in the environment variables LDFLAGS and # CPPFLAGS for header and library files. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Apr 7 21:27:23 2011 From: python-checkins at python.org (ezio.melotti) Date: Thu, 07 Apr 2011 21:27:23 +0200 Subject: [Python-checkins] cpython (3.2): #7311: fix html.parser to accept non-ASCII attribute values. Message-ID: http://hg.python.org/cpython/rev/225400cb6e84 changeset: 69189:225400cb6e84 branch: 3.2 parent: 69186:3d7c9b38fbfd user: Ezio Melotti date: Thu Apr 07 22:03:31 2011 +0300 summary: #7311: fix html.parser to accept non-ASCII attribute values. files: Lib/html/parser.py | 2 +- Lib/test/test_htmlparser.py | 17 +++++++++++++++++ Misc/NEWS | 2 ++ 3 files changed, 20 insertions(+), 1 deletions(-) diff --git a/Lib/html/parser.py b/Lib/html/parser.py --- a/Lib/html/parser.py +++ b/Lib/html/parser.py @@ -28,7 +28,7 @@ # make it correctly strict without breaking backward compatibility. 
attrfind = re.compile( r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*' - r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~@]*))?') + r'(\'[^\']*\'|"[^"]*"|[^\s"\'=<>`]*))?') attrfind_tolerant = re.compile( r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*' r'(\'[^\']*\'|"[^"]*"|[^>\s]*))?') diff --git a/Lib/test/test_htmlparser.py b/Lib/test/test_htmlparser.py --- a/Lib/test/test_htmlparser.py +++ b/Lib/test/test_htmlparser.py @@ -217,6 +217,23 @@ ("starttag", "a", [("href", "mailto:xyz at example.com")]), ]) + def test_attr_nonascii(self): + # see issue 7311 + self._run_check("\u4e2d\u6587", [ + ("starttag", "img", [("src", "/foo/bar.png"), + ("alt", "\u4e2d\u6587")]), + ]) + self._run_check("
", [ + ("starttag", "a", [("title", "\u30c6\u30b9\u30c8"), + ("href", "\u30c6\u30b9\u30c8.html")]), + ]) + self._run_check('', [ + ("starttag", "a", [("title", "\u30c6\u30b9\u30c8"), + ("href", "\u30c6\u30b9\u30c8.html")]), + ]) + def test_attr_entity_replacement(self): self._run_check("""""", [ ("starttag", "a", [("b", "&><\"'")]), diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -49,6 +49,8 @@ Library ------- +- Issue #7311: fix html.parser to accept non-ASCII attribute values. + - Issue #11605: email.parser.BytesFeedParser was incorrectly converting multipart subpararts with an 8bit CTE into unicode instead of preserving the bytes. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Apr 7 21:27:24 2011 From: python-checkins at python.org (ezio.melotti) Date: Thu, 07 Apr 2011 21:27:24 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): #7311: merge with 3.2. Message-ID: http://hg.python.org/cpython/rev/a1dea7cde58f changeset: 69190:a1dea7cde58f parent: 69187:bbfc65d05588 parent: 69189:225400cb6e84 user: Ezio Melotti date: Thu Apr 07 22:27:44 2011 +0300 summary: #7311: merge with 3.2. files: Lib/html/parser.py | 2 +- Lib/test/test_htmlparser.py | 17 +++++++++++++++++ Misc/NEWS | 2 ++ 3 files changed, 20 insertions(+), 1 deletions(-) diff --git a/Lib/html/parser.py b/Lib/html/parser.py --- a/Lib/html/parser.py +++ b/Lib/html/parser.py @@ -28,7 +28,7 @@ # make it correctly strict without breaking backward compatibility. 
attrfind = re.compile( r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*' - r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~@]*))?') + r'(\'[^\']*\'|"[^"]*"|[^\s"\'=<>`]*))?') attrfind_tolerant = re.compile( r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*' r'(\'[^\']*\'|"[^"]*"|[^>\s]*))?') diff --git a/Lib/test/test_htmlparser.py b/Lib/test/test_htmlparser.py --- a/Lib/test/test_htmlparser.py +++ b/Lib/test/test_htmlparser.py @@ -217,6 +217,23 @@ ("starttag", "a", [("href", "mailto:xyz at example.com")]), ]) + def test_attr_nonascii(self): + # see issue 7311 + self._run_check("\u4e2d\u6587", [ + ("starttag", "img", [("src", "/foo/bar.png"), + ("alt", "\u4e2d\u6587")]), + ]) + self._run_check("", [ + ("starttag", "a", [("title", "\u30c6\u30b9\u30c8"), + ("href", "\u30c6\u30b9\u30c8.html")]), + ]) + self._run_check('', [ + ("starttag", "a", [("title", "\u30c6\u30b9\u30c8"), + ("href", "\u30c6\u30b9\u30c8.html")]), + ]) + def test_attr_entity_replacement(self): self._run_check("""""", [ ("starttag", "a", [("b", "&><\"'")]), diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -94,6 +94,8 @@ Library ------- +- Issue #7311: fix html.parser to accept non-ASCII attribute values. + - Issue #11605: email.parser.BytesFeedParser was incorrectly converting multipart subpararts with an 8bit CTE into unicode instead of preserving the bytes. 
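Editorial illustration (not part of the changeset): the effect of the attrfind change for issue #7311 can be seen by running the old and new patterns on the same input. This is a minimal sketch; the two patterns are copied from the Lib/html/parser.py hunk above, while the names OLD_ATTRFIND, NEW_ATTRFIND and SAMPLE, and the sample input itself, are assumptions made up for the demonstration.

```python
import re

# Patterns copied from the Lib/html/parser.py diff in this changeset.
# The constant names are hypothetical; the module itself calls the pattern attrfind.
OLD_ATTRFIND = re.compile(
    r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*'
    r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~@]*))?')
NEW_ATTRFIND = re.compile(
    r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*'
    r'(\'[^\']*\'|"[^"]*"|[^\s"\'=<>`]*))?')

# Hypothetical input: an unquoted, non-ASCII attribute value followed by '>'.
SAMPLE = ' alt=\u4e2d\u6587>'

# Group 3 is the attribute value in both patterns.
old_value = OLD_ATTRFIND.match(SAMPLE).group(3)
new_value = NEW_ATTRFIND.match(SAMPLE).group(3)

print(repr(old_value))  # the ASCII-only class matches zero characters here
print(repr(new_value))  # the negated class consumes the value up to '>'
```

With the old whitelist of ASCII characters the unquoted value comes back empty, whereas the new negated character class accepts any character that is not whitespace, a quote, `=`, `<`, `>` or a backtick. Quoted values were handled by the first two alternatives in both versions, so only unquoted values are affected.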
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Apr 7 23:22:32 2011 From: python-checkins at python.org (antoine.pitrou) Date: Thu, 07 Apr 2011 23:22:32 +0200 Subject: [Python-checkins] cpython: Fix faulthandler timeout to avoid breaking buildbots Message-ID: http://hg.python.org/cpython/rev/567cbddf8678 changeset: 69191:567cbddf8678 user: Antoine Pitrou date: Thu Apr 07 23:22:28 2011 +0200 summary: Fix faulthandler timeout to avoid breaking buildbots files: Lib/test/regrtest.py | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -240,7 +240,7 @@ findleaks=False, use_resources=None, trace=False, coverdir='coverage', runleaks=False, huntrleaks=False, verbose2=False, print_slow=False, random_seed=None, use_mp=None, verbose3=False, forever=False, - header=False, timeout=30*60): + header=False, timeout=60*60): """Execute a test suite. This also parses command-line options and modifies its behavior -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 00:30:58 2011 From: python-checkins at python.org (brian.quinlan) Date: Fri, 08 Apr 2011 00:30:58 +0200 Subject: [Python-checkins] cpython: Issue #11777: Executor.map does not submit futures until iter.next() is called Message-ID: http://hg.python.org/cpython/rev/126353bc7e94 changeset: 69192:126353bc7e94 parent: 69181:11052e067192 user: Brian Quinlan date: Fri Apr 08 08:19:33 2011 +1000 summary: Issue #11777: Executor.map does not submit futures until iter.next() is called files: Lib/concurrent/futures/_base.py | 22 ++++++++++------ Lib/test/test_concurrent_futures.py | 10 ++++++- 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/Lib/concurrent/futures/_base.py b/Lib/concurrent/futures/_base.py --- a/Lib/concurrent/futures/_base.py +++ b/Lib/concurrent/futures/_base.py @@ -536,15 +536,19 @@ fs = [self.submit(fn, 
*args) for args in zip(*iterables)] - try: - for future in fs: - if timeout is None: - yield future.result() - else: - yield future.result(end_time - time.time()) - finally: - for future in fs: - future.cancel() + # Yield must be hidden in closure so that the futures are submitted + # before the first iterator value is required. + def result_iterator(): + try: + for future in fs: + if timeout is None: + yield future.result() + else: + yield future.result(end_time - time.time()) + finally: + for future in fs: + future.cancel() + return result_iterator() def shutdown(self, wait=True): """Clean-up the resources associated with the Executor. diff --git a/Lib/test/test_concurrent_futures.py b/Lib/test/test_concurrent_futures.py --- a/Lib/test/test_concurrent_futures.py +++ b/Lib/test/test_concurrent_futures.py @@ -369,7 +369,15 @@ class ThreadPoolExecutorTest(ThreadPoolMixin, ExecutorTest): - pass + def test_map_submits_without_iteration(self): + """Tests verifying issue 11777.""" + finished = [] + def record_finished(n): + finished.append(n) + + self.executor.map(record_finished, range(10)) + self.executor.shutdown(wait=True) + self.assertCountEqual(finished, range(10)) class ProcessPoolExecutorTest(ProcessPoolMixin, ExecutorTest): -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 00:30:59 2011 From: python-checkins at python.org (brian.quinlan) Date: Fri, 08 Apr 2011 00:30:59 +0200 Subject: [Python-checkins] cpython (merge default -> default): Merge to tip. Message-ID: http://hg.python.org/cpython/rev/9ddba521c3aa changeset: 69193:9ddba521c3aa parent: 69192:126353bc7e94 parent: 69191:567cbddf8678 user: Brian Quinlan date: Fri Apr 08 08:30:41 2011 +1000 summary: Merge to tip. 
files: Lib/html/parser.py | 2 +- Lib/test/regrtest.py | 2 +- Lib/test/test_faulthandler.py | 7 +++ Lib/test/test_htmlparser.py | 17 +++++++ Misc/NEWS | 2 + Modules/faulthandler.c | 50 ++++++++++++++-------- setup.py | 2 + 7 files changed, 61 insertions(+), 21 deletions(-) diff --git a/Lib/html/parser.py b/Lib/html/parser.py --- a/Lib/html/parser.py +++ b/Lib/html/parser.py @@ -28,7 +28,7 @@ # make it correctly strict without breaking backward compatibility. attrfind = re.compile( r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*' - r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~@]*))?') + r'(\'[^\']*\'|"[^"]*"|[^\s"\'=<>`]*))?') attrfind_tolerant = re.compile( r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*' r'(\'[^\']*\'|"[^"]*"|[^>\s]*))?') diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -240,7 +240,7 @@ findleaks=False, use_resources=None, trace=False, coverdir='coverage', runleaks=False, huntrleaks=False, verbose2=False, print_slow=False, random_seed=None, use_mp=None, verbose3=False, forever=False, - header=False, timeout=30*60): + header=False, timeout=60*60): """Execute a test suite. This also parses command-line options and modifies its behavior diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py --- a/Lib/test/test_faulthandler.py +++ b/Lib/test/test_faulthandler.py @@ -8,6 +8,12 @@ import tempfile import unittest +try: + import threading + HAVE_THREADS = True +except ImportError: + HAVE_THREADS = False + TIMEOUT = 0.5 try: @@ -279,6 +285,7 @@ with temporary_filename() as filename: self.check_dump_traceback(filename) + @unittest.skipIf(not HAVE_THREADS, 'need threads') def check_dump_traceback_threads(self, filename): """ Call explicitly dump_traceback(all_threads=True) and check the output. 
diff --git a/Lib/test/test_htmlparser.py b/Lib/test/test_htmlparser.py --- a/Lib/test/test_htmlparser.py +++ b/Lib/test/test_htmlparser.py @@ -217,6 +217,23 @@ ("starttag", "a", [("href", "mailto:xyz at example.com")]), ]) + def test_attr_nonascii(self): + # see issue 7311 + self._run_check("\u4e2d\u6587", [ + ("starttag", "img", [("src", "/foo/bar.png"), + ("alt", "\u4e2d\u6587")]), + ]) + self._run_check("", [ + ("starttag", "a", [("title", "\u30c6\u30b9\u30c8"), + ("href", "\u30c6\u30b9\u30c8.html")]), + ]) + self._run_check('', [ + ("starttag", "a", [("title", "\u30c6\u30b9\u30c8"), + ("href", "\u30c6\u30b9\u30c8.html")]), + ]) + def test_attr_entity_replacement(self): self._run_check("""""", [ ("starttag", "a", [("b", "&><\"'")]), diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -94,6 +94,8 @@ Library ------- +- Issue #7311: fix html.parser to accept non-ASCII attribute values. + - Issue #11605: email.parser.BytesFeedParser was incorrectly converting multipart subpararts with an 8bit CTE into unicode instead of preserving the bytes. diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -5,6 +5,9 @@ #include #include +/* Allocate at maximum 100 MB of the stack to raise the stack overflow */ +#define STACK_OVERFLOW_MAX_SIZE (100*1024*1024) + #ifdef WITH_THREAD # define FAULTHANDLER_LATER #endif @@ -16,9 +19,6 @@ # define FAULTHANDLER_USER #endif -/* Allocate at maximum 100 MB of the stack to raise the stack overflow */ -#define STACK_OVERFLOW_MAX_SIZE (100*1024*1024) - #define PUTS(fd, str) write(fd, str, strlen(str)) #ifdef HAVE_SIGACTION @@ -218,12 +218,7 @@ This function is signal safe and should only call signal safe functions. 
*/ static void -faulthandler_fatal_error( - int signum -#ifdef HAVE_SIGACTION - , siginfo_t *siginfo, void *ucontext -#endif -) +faulthandler_fatal_error(int signum) { const int fd = fatal_error.fd; unsigned int i; @@ -255,6 +250,7 @@ PUTS(fd, handler->name); PUTS(fd, "\n\n"); +#ifdef WITH_THREAD /* SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL are synchronous signals and so are delivered to the thread that caused the fault. Get the Python thread state of the current thread. @@ -264,6 +260,9 @@ used. Read the thread local storage (TLS) instead: call PyGILState_GetThisThreadState(). */ tstate = PyGILState_GetThisThreadState(); +#else + tstate = PyThreadState_Get(); +#endif if (tstate == NULL) return; @@ -320,7 +319,7 @@ for (i=0; i < faulthandler_nsignals; i++) { handler = &faulthandler_handlers[i]; #ifdef HAVE_SIGACTION - action.sa_sigaction = faulthandler_fatal_error; + action.sa_handler = faulthandler_fatal_error; sigemptyset(&action.sa_mask); /* Do not prevent the signal from being received from within its own signal handler */ @@ -451,8 +450,8 @@ } static PyObject* -faulthandler_dump_traceback_later(PyObject *self, - PyObject *args, PyObject *kwargs) +faulthandler_dump_tracebacks_later(PyObject *self, + PyObject *args, PyObject *kwargs) { static char *kwlist[] = {"timeout", "repeat", "file", "exit", NULL}; double timeout; @@ -461,6 +460,7 @@ PyObject *file = NULL; int fd; int exit = 0; + PyThreadState *tstate; if (!PyArg_ParseTupleAndKeywords(args, kwargs, "d|iOi:dump_tracebacks_later", kwlist, @@ -477,6 +477,13 @@ return NULL; } + tstate = PyThreadState_Get(); + if (tstate == NULL) { + PyErr_SetString(PyExc_RuntimeError, + "unable to get the current thread state"); + return NULL; + } + file = faulthandler_get_fileno(file, &fd); if (file == NULL) return NULL; @@ -490,7 +497,7 @@ thread.fd = fd; thread.timeout_ms = timeout_ms; thread.repeat = repeat; - thread.interp = PyThreadState_Get()->interp; + thread.interp = tstate->interp; thread.exit = exit; /* Arm these 
locks to serve as events when released */ @@ -537,10 +544,14 @@ if (!user->enabled) return; +#ifdef WITH_THREAD /* PyThreadState_Get() doesn't give the state of the current thread if the thread doesn't hold the GIL. Read the thread local storage (TLS) instead: call PyGILState_GetThisThreadState(). */ tstate = PyGILState_GetThisThreadState(); +#else + tstate = PyThreadState_Get(); +#endif if (user->all_threads) _Py_DumpTracebackThreads(user->fd, user->interp, tstate); @@ -826,7 +837,7 @@ faulthandler_traverse(PyObject *module, visitproc visit, void *arg) { #ifdef FAULTHANDLER_USER - unsigned int index; + unsigned int signum; #endif #ifdef FAULTHANDLER_LATER @@ -834,8 +845,8 @@ #endif #ifdef FAULTHANDLER_USER if (user_signals != NULL) { - for (index=0; index < NSIG; index++) - Py_VISIT(user_signals[index].file); + for (signum=0; signum < NSIG; signum++) + Py_VISIT(user_signals[signum].file); } #endif Py_VISIT(fatal_error.file); @@ -861,10 +872,11 @@ "if all_threads is True, into file")}, #ifdef FAULTHANDLER_LATER {"dump_tracebacks_later", - (PyCFunction)faulthandler_dump_traceback_later, METH_VARARGS|METH_KEYWORDS, - PyDoc_STR("dump_tracebacks_later(timeout, repeat=False, file=sys.stderr):\n" + (PyCFunction)faulthandler_dump_tracebacks_later, METH_VARARGS|METH_KEYWORDS, + PyDoc_STR("dump_tracebacks_later(timeout, repeat=False, file=sys.stderrn, exit=False):\n" "dump the traceback of all threads in timeout seconds,\n" - "or each timeout seconds if repeat is True.")}, + "or each timeout seconds if repeat is True. If exit is True, " + "call _exit(1) which is not safe.")}, {"cancel_dump_tracebacks_later", (PyCFunction)faulthandler_cancel_dump_tracebacks_later_py, METH_NOARGS, PyDoc_STR("cancel_dump_tracebacks_later():\ncancel the previous call " diff --git a/setup.py b/setup.py --- a/setup.py +++ b/setup.py @@ -373,6 +373,8 @@ def add_multiarch_paths(self): # Debian/Ubuntu multiarch support. 
# https://wiki.ubuntu.com/MultiarchSpec + if not find_executable('dpkg-architecture'): + return tmpfile = os.path.join(self.build_temp, 'multiarch') if not os.path.exists(self.build_temp): os.makedirs(self.build_temp) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 02:35:12 2011 From: python-checkins at python.org (vinay.sajip) Date: Fri, 08 Apr 2011 02:35:12 +0200 Subject: [Python-checkins] cpython (3.2): Updated Formatter documentation. Message-ID: http://hg.python.org/cpython/rev/c760390165dc changeset: 69194:c760390165dc branch: 3.2 parent: 69189:225400cb6e84 user: Vinay Sajip date: Fri Apr 08 01:30:51 2011 +0100 summary: Updated Formatter documentation. files: Doc/library/logging.rst | 11 +++++++++-- 1 files changed, 9 insertions(+), 2 deletions(-) diff --git a/Doc/library/logging.rst b/Doc/library/logging.rst --- a/Doc/library/logging.rst +++ b/Doc/library/logging.rst @@ -405,7 +405,7 @@ :ref:`logrecord-attributes`. -.. class:: Formatter(fmt=None, datefmt=None) +.. class:: Formatter(fmt=None, datefmt=None, style='%') Returns a new instance of the :class:`Formatter` class. The instance is initialized with a format string for the message as a whole, as well as a @@ -413,6 +413,14 @@ specified, ``'%(message)s'`` is used. If no *datefmt* is specified, the ISO8601 date format is used. + The *style* parameter can be one of '%', '{' or '$' and determines how + the format string will be merged with its data: using one of %-formatting, + :meth:`str.format` or :class:`string.Template`. + + .. versionchanged:: 3.2 + The *style* parameter was added. + + .. method:: format(record) The record's attribute dictionary is used as the operand to a string @@ -691,7 +699,6 @@ information into logging calls. For a usage example , see the section on :ref:`adding contextual information to your logging output `. - .. 
class:: LoggerAdapter(logger, extra) Returns an instance of :class:`LoggerAdapter` initialized with an -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 02:35:16 2011 From: python-checkins at python.org (vinay.sajip) Date: Fri, 08 Apr 2011 02:35:16 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merged doc fix in 3.2. Message-ID: http://hg.python.org/cpython/rev/93f6ffe53b99 changeset: 69195:93f6ffe53b99 parent: 69193:9ddba521c3aa parent: 69194:c760390165dc user: Vinay Sajip date: Fri Apr 08 01:32:27 2011 +0100 summary: Merged doc fix in 3.2. files: Doc/library/logging.rst | 11 +++++++++-- 1 files changed, 9 insertions(+), 2 deletions(-) diff --git a/Doc/library/logging.rst b/Doc/library/logging.rst --- a/Doc/library/logging.rst +++ b/Doc/library/logging.rst @@ -405,7 +405,7 @@ :ref:`logrecord-attributes`. -.. class:: Formatter(fmt=None, datefmt=None) +.. class:: Formatter(fmt=None, datefmt=None, style='%') Returns a new instance of the :class:`Formatter` class. The instance is initialized with a format string for the message as a whole, as well as a @@ -413,6 +413,14 @@ specified, ``'%(message)s'`` is used. If no *datefmt* is specified, the ISO8601 date format is used. + The *style* parameter can be one of '%', '{' or '$' and determines how + the format string will be merged with its data: using one of %-formatting, + :meth:`str.format` or :class:`string.Template`. + + .. versionchanged:: 3.2 + The *style* parameter was added. + + .. method:: format(record) The record's attribute dictionary is used as the operand to a string @@ -691,7 +699,6 @@ information into logging calls. For a usage example , see the section on :ref:`adding contextual information to your logging output `. - .. 
class:: LoggerAdapter(logger, extra) Returns an instance of :class:`LoggerAdapter` initialized with an -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 02:35:17 2011 From: python-checkins at python.org (vinay.sajip) Date: Fri, 08 Apr 2011 02:35:17 +0200 Subject: [Python-checkins] cpython (3.2): Normalised whitespace. Message-ID: http://hg.python.org/cpython/rev/4995dfe308e7 changeset: 69196:4995dfe308e7 branch: 3.2 parent: 69194:c760390165dc user: Vinay Sajip date: Fri Apr 08 01:34:20 2011 +0100 summary: Normalised whitespace. files: Doc/library/logging.rst | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Doc/library/logging.rst b/Doc/library/logging.rst --- a/Doc/library/logging.rst +++ b/Doc/library/logging.rst @@ -415,7 +415,7 @@ The *style* parameter can be one of '%', '{' or '$' and determines how the format string will be merged with its data: using one of %-formatting, - :meth:`str.format` or :class:`string.Template`. + :meth:`str.format` or :class:`string.Template`. .. versionchanged:: 3.2 The *style* parameter was added. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 02:35:19 2011 From: python-checkins at python.org (vinay.sajip) Date: Fri, 08 Apr 2011 02:35:19 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merged whitespace fix. Message-ID: http://hg.python.org/cpython/rev/664e065ed3cd changeset: 69197:664e065ed3cd parent: 69195:93f6ffe53b99 parent: 69196:4995dfe308e7 user: Vinay Sajip date: Fri Apr 08 01:35:04 2011 +0100 summary: Merged whitespace fix. 
files: Doc/library/logging.rst | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Doc/library/logging.rst b/Doc/library/logging.rst --- a/Doc/library/logging.rst +++ b/Doc/library/logging.rst @@ -415,7 +415,7 @@ The *style* parameter can be one of '%', '{' or '$' and determines how the format string will be merged with its data: using one of %-formatting, - :meth:`str.format` or :class:`string.Template`. + :meth:`str.format` or :class:`string.Template`. .. versionchanged:: 3.2 The *style* parameter was added. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 02:42:42 2011 From: python-checkins at python.org (r.david.murray) Date: Fri, 08 Apr 2011 02:42:42 +0200 Subject: [Python-checkins] cpython (3.1): Improve test coverage of _split_ascii method. Message-ID: http://hg.python.org/cpython/rev/0e76c8ddd989 changeset: 69198:0e76c8ddd989 branch: 3.1 parent: 69185:c8738114b962 user: R David Murray date: Thu Apr 07 20:37:17 2011 -0400 summary: Improve test coverage of _split_ascii method. 
files: Lib/email/test/test_email.py | 43 ++++++++++++++++++++++++ 1 files changed, 43 insertions(+), 0 deletions(-) diff --git a/Lib/email/test/test_email.py b/Lib/email/test/test_email.py --- a/Lib/email/test/test_email.py +++ b/Lib/email/test/test_email.py @@ -750,6 +750,49 @@ Test""") + def test_last_split_chunk_does_not_fit(self): + eq = self.ndiffAssertEqual + h = Header('Subject: the first part of this is short, but_the_second' + '_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line' + '_all_by_itself') + eq(h.encode(), """\ +Subject: the first part of this is short, + but_the_second_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself""") + + def test_splittable_leading_char_followed_by_overlong_unsplitable(self): + eq = self.ndiffAssertEqual + h = Header(', but_the_second' + '_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line' + '_all_by_itself') + eq(h.encode(), """\ +, + but_the_second_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself""") + + def test_multiple_splittable_leading_char_followed_by_overlong_unsplitable(self): + eq = self.ndiffAssertEqual + h = Header(', , but_the_second' + '_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line' + '_all_by_itself') + eq(h.encode(), """\ +, , + but_the_second_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself""") + + def test_trailing_splitable_on_overlong_unsplitable(self): + eq = self.ndiffAssertEqual + h = Header('this_part_does_not_fit_within_maxlinelen_and_thus_should_' + 'be_on_a_line_all_by_itself;') + eq(h.encode(), "this_part_does_not_fit_within_maxlinelen_and_thus_should_" + "be_on_a_line_all_by_itself;") + + def test_trailing_splitable_on_overlong_unsplitable_with_leading_splitable(self): + eq = self.ndiffAssertEqual + h = Header('; ' + 'this_part_does_not_fit_within_maxlinelen_and_thus_should_' + 'be_on_a_line_all_by_itself;') + eq(h.encode(), """\ +; + 
this_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself;""") + def test_no_split_long_header(self): eq = self.ndiffAssertEqual hstr = 'References: ' + 'x' * 80 -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 02:42:43 2011 From: python-checkins at python.org (r.david.murray) Date: Fri, 08 Apr 2011 02:42:43 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Merge: Improve test coverage of _split_ascii method. Message-ID: http://hg.python.org/cpython/rev/bc1117ace406 changeset: 69199:bc1117ace406 branch: 3.2 parent: 69196:4995dfe308e7 parent: 69198:0e76c8ddd989 user: R David Murray date: Thu Apr 07 20:40:01 2011 -0400 summary: Merge: Improve test coverage of _split_ascii method. files: Lib/email/test/test_email.py | 43 ++++++++++++++++++++++++ 1 files changed, 43 insertions(+), 0 deletions(-) diff --git a/Lib/email/test/test_email.py b/Lib/email/test/test_email.py --- a/Lib/email/test/test_email.py +++ b/Lib/email/test/test_email.py @@ -784,6 +784,49 @@ Test""") + def test_last_split_chunk_does_not_fit(self): + eq = self.ndiffAssertEqual + h = Header('Subject: the first part of this is short, but_the_second' + '_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line' + '_all_by_itself') + eq(h.encode(), """\ +Subject: the first part of this is short, + but_the_second_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself""") + + def test_splittable_leading_char_followed_by_overlong_unsplitable(self): + eq = self.ndiffAssertEqual + h = Header(', but_the_second' + '_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line' + '_all_by_itself') + eq(h.encode(), """\ +, + but_the_second_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself""") + + def test_multiple_splittable_leading_char_followed_by_overlong_unsplitable(self): + eq = self.ndiffAssertEqual + h = Header(', , but_the_second' + 
'_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line' + '_all_by_itself') + eq(h.encode(), """\ +, , + but_the_second_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself""") + + def test_trailing_splitable_on_overlong_unsplitable(self): + eq = self.ndiffAssertEqual + h = Header('this_part_does_not_fit_within_maxlinelen_and_thus_should_' + 'be_on_a_line_all_by_itself;') + eq(h.encode(), "this_part_does_not_fit_within_maxlinelen_and_thus_should_" + "be_on_a_line_all_by_itself;") + + def test_trailing_splitable_on_overlong_unsplitable_with_leading_splitable(self): + eq = self.ndiffAssertEqual + h = Header('; ' + 'this_part_does_not_fit_within_maxlinelen_and_thus_should_' + 'be_on_a_line_all_by_itself;') + eq(h.encode(), """\ +; + this_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself;""") + def test_no_split_long_header(self): eq = self.ndiffAssertEqual hstr = 'References: ' + 'x' * 80 -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 02:42:45 2011 From: python-checkins at python.org (r.david.murray) Date: Fri, 08 Apr 2011 02:42:45 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge: Improve test coverage of _split_ascii method. Message-ID: http://hg.python.org/cpython/rev/d48b886dd750 changeset: 69200:d48b886dd750 parent: 69197:664e065ed3cd parent: 69199:bc1117ace406 user: R David Murray date: Thu Apr 07 20:42:28 2011 -0400 summary: Merge: Improve test coverage of _split_ascii method. 
files: Lib/test/test_email/test_email.py | 43 +++++++++++++++++++ 1 files changed, 43 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_email/test_email.py b/Lib/test/test_email/test_email.py --- a/Lib/test/test_email/test_email.py +++ b/Lib/test/test_email/test_email.py @@ -758,6 +758,49 @@ Test""") + def test_last_split_chunk_does_not_fit(self): + eq = self.ndiffAssertEqual + h = Header('Subject: the first part of this is short, but_the_second' + '_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line' + '_all_by_itself') + eq(h.encode(), """\ +Subject: the first part of this is short, + but_the_second_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself""") + + def test_splittable_leading_char_followed_by_overlong_unsplitable(self): + eq = self.ndiffAssertEqual + h = Header(', but_the_second' + '_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line' + '_all_by_itself') + eq(h.encode(), """\ +, + but_the_second_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself""") + + def test_multiple_splittable_leading_char_followed_by_overlong_unsplitable(self): + eq = self.ndiffAssertEqual + h = Header(', , but_the_second' + '_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line' + '_all_by_itself') + eq(h.encode(), """\ +, , + but_the_second_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself""") + + def test_trailing_splitable_on_overlong_unsplitable(self): + eq = self.ndiffAssertEqual + h = Header('this_part_does_not_fit_within_maxlinelen_and_thus_should_' + 'be_on_a_line_all_by_itself;') + eq(h.encode(), "this_part_does_not_fit_within_maxlinelen_and_thus_should_" + "be_on_a_line_all_by_itself;") + + def test_trailing_splitable_on_overlong_unsplitable_with_leading_splitable(self): + eq = self.ndiffAssertEqual + h = Header('; ' + 'this_part_does_not_fit_within_maxlinelen_and_thus_should_' + 'be_on_a_line_all_by_itself;') + eq(h.encode(), """\ +; + 
this_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself;""") + def test_no_split_long_header(self): eq = self.ndiffAssertEqual hstr = 'References: ' + 'x' * 80 -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 03:01:27 2011 From: python-checkins at python.org (r.david.murray) Date: Fri, 08 Apr 2011 03:01:27 +0200 Subject: [Python-checkins] cpython (3.1): #11492: fix header truncation on folding when there are runs of split chars. Message-ID: http://hg.python.org/cpython/rev/10725fc76e11 changeset: 69201:10725fc76e11 branch: 3.1 parent: 69198:0e76c8ddd989 user: R David Murray date: Thu Apr 07 20:54:03 2011 -0400 summary: #11492: fix header truncation on folding when there are runs of split chars. Not a complete fix for this issue. files: Lib/email/header.py | 7 ++++--- Lib/email/test/test_email.py | 10 ++++++++++ 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/Lib/email/header.py b/Lib/email/header.py --- a/Lib/email/header.py +++ b/Lib/email/header.py @@ -464,12 +464,13 @@ self._current_line.reset(str(holding)) return elif not nextpart: - # There must be some trailing split characters because we + # There must be some trailing or duplicated split characters + # because we # found a split character but no next part. In this case we # must treat the thing to fit as the part + splitpart because # if splitpart is whitespace it's not allowed to be the only # thing on the line, and if it's not whitespace we must split - # after the syntactic break. In either case, we're done. + # after the syntactic break. holding_prelen = len(holding) holding.push(part + splitpart) if len(holding) + len(self._current_line) <= self._maxlen: @@ -484,7 +485,7 @@ self._lines.append(str(self._current_line)) holding.reset(save_part) self._current_line.reset(str(holding)) - return + holding.reset() elif not part: # We're leading with a split character. 
See if the splitpart # and nextpart fits on the current line. diff --git a/Lib/email/test/test_email.py b/Lib/email/test/test_email.py --- a/Lib/email/test/test_email.py +++ b/Lib/email/test/test_email.py @@ -793,6 +793,16 @@ ; this_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself;""") + def test_long_header_with_multiple_sequential_split_chars(self): + # Issue 11492 + + eq = self.ndiffAssertEqual + h = Header('This is a long line that has two whitespaces in a row. ' + 'This used to cause truncation of the header when folded') + eq(h.encode(), """\ +This is a long line that has two whitespaces in a row. This used to cause + truncation of the header when folded""") + def test_no_split_long_header(self): eq = self.ndiffAssertEqual hstr = 'References: ' + 'x' * 80 -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 03:01:28 2011 From: python-checkins at python.org (r.david.murray) Date: Fri, 08 Apr 2011 03:01:28 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Merge #11492: fix header truncation on folding when there are runs of split Message-ID: http://hg.python.org/cpython/rev/74ec64dc3538 changeset: 69202:74ec64dc3538 branch: 3.2 parent: 69199:bc1117ace406 parent: 69201:10725fc76e11 user: R David Murray date: Thu Apr 07 20:56:31 2011 -0400 summary: Merge #11492: fix header truncation on folding when there are runs of split chars. Not a complete fix for this issue. files: Lib/email/header.py | 7 ++++--- Lib/email/test/test_email.py | 10 ++++++++++ 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/Lib/email/header.py b/Lib/email/header.py --- a/Lib/email/header.py +++ b/Lib/email/header.py @@ -483,12 +483,13 @@ self._current_line.reset(str(holding)) return elif not nextpart: - # There must be some trailing split characters because we + # There must be some trailing or duplicated split characters + # because we # found a split character but no next part. 
In this case we # must treat the thing to fit as the part + splitpart because # if splitpart is whitespace it's not allowed to be the only # thing on the line, and if it's not whitespace we must split - # after the syntactic break. In either case, we're done. + # after the syntactic break. holding_prelen = len(holding) holding.push(part + splitpart) if len(holding) + len(self._current_line) <= self._maxlen: @@ -503,7 +504,7 @@ self._lines.append(str(self._current_line)) holding.reset(save_part) self._current_line.reset(str(holding)) - return + holding.reset() elif not part: # We're leading with a split character. See if the splitpart # and nextpart fits on the current line. diff --git a/Lib/email/test/test_email.py b/Lib/email/test/test_email.py --- a/Lib/email/test/test_email.py +++ b/Lib/email/test/test_email.py @@ -827,6 +827,16 @@ ; this_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself;""") + def test_long_header_with_multiple_sequential_split_chars(self): + # Issue 11492 + + eq = self.ndiffAssertEqual + h = Header('This is a long line that has two whitespaces in a row. ' + 'This used to cause truncation of the header when folded') + eq(h.encode(), """\ +This is a long line that has two whitespaces in a row. 
This used to cause + truncation of the header when folded""") + def test_no_split_long_header(self): eq = self.ndiffAssertEqual hstr = 'References: ' + 'x' * 80 -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 03:01:31 2011 From: python-checkins at python.org (r.david.murray) Date: Fri, 08 Apr 2011 03:01:31 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge #11492: fix header truncation on folding when there are runs of split Message-ID: http://hg.python.org/cpython/rev/5ec2695c9c15 changeset: 69203:5ec2695c9c15 parent: 69200:d48b886dd750 parent: 69202:74ec64dc3538 user: R David Murray date: Thu Apr 07 21:00:33 2011 -0400 summary: Merge #11492: fix header truncation on folding when there are runs of split chars. Not a complete fix for this issue. files: Lib/email/header.py | 7 ++++--- Lib/test/test_email/test_email.py | 10 ++++++++++ 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/Lib/email/header.py b/Lib/email/header.py --- a/Lib/email/header.py +++ b/Lib/email/header.py @@ -483,12 +483,13 @@ self._current_line.reset(str(holding)) return elif not nextpart: - # There must be some trailing split characters because we + # There must be some trailing or duplicated split characters + # because we # found a split character but no next part. In this case we # must treat the thing to fit as the part + splitpart because # if splitpart is whitespace it's not allowed to be the only # thing on the line, and if it's not whitespace we must split - # after the syntactic break. In either case, we're done. + # after the syntactic break. holding_prelen = len(holding) holding.push(part + splitpart) if len(holding) + len(self._current_line) <= self._maxlen: @@ -503,7 +504,7 @@ self._lines.append(str(self._current_line)) holding.reset(save_part) self._current_line.reset(str(holding)) - return + holding.reset() elif not part: # We're leading with a split character. 
See if the splitpart # and nextpart fits on the current line. diff --git a/Lib/test/test_email/test_email.py b/Lib/test/test_email/test_email.py --- a/Lib/test/test_email/test_email.py +++ b/Lib/test/test_email/test_email.py @@ -801,6 +801,16 @@ ; this_part_does_not_fit_within_maxlinelen_and_thus_should_be_on_a_line_all_by_itself;""") + def test_long_header_with_multiple_sequential_split_chars(self): + # Issue 11492 + + eq = self.ndiffAssertEqual + h = Header('This is a long line that has two whitespaces in a row. ' + 'This used to cause truncation of the header when folded') + eq(h.encode(), """\ +This is a long line that has two whitespaces in a row. This used to cause + truncation of the header when folded""") + def test_no_split_long_header(self): eq = self.ndiffAssertEqual hstr = 'References: ' + 'x' * 80 -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 03:42:09 2011 From: python-checkins at python.org (ezio.melotti) Date: Fri, 8 Apr 2011 03:42:09 +0200 (CEST) Subject: [Python-checkins] r88811 - tracker/instances/python-dev/html/issue.item.js Message-ID: <3Q5NnY35NLzQ8f@mail.python.org> Author: ezio.melotti Date: Fri Apr 8 03:42:09 2011 New Revision: 88811 Log: #390: Fix regex to allow names with spaces. 
Modified: tracker/instances/python-dev/html/issue.item.js Modified: tracker/instances/python-dev/html/issue.item.js ============================================================================== --- tracker/instances/python-dev/html/issue.item.js (original) +++ tracker/instances/python-dev/html/issue.item.js Fri Apr 8 03:42:09 2011 @@ -23,7 +23,7 @@ function add_to_nosy(user) { var add_me_button = document.getElementById('add_me_to_nosy'); var nosy = document.getElementsByName('nosy')[0]; - var nosy_text = nosy.value.replace(/\s+/g, ''); + var nosy_text = nosy.value.replace(/,\s+/g, ','); if (nosy_text == "") { // nosy_list is empty, add the user nosy.value = user; From python-checkins at python.org Fri Apr 8 04:41:44 2011 From: python-checkins at python.org (ezio.melotti) Date: Fri, 8 Apr 2011 04:41:44 +0200 (CEST) Subject: [Python-checkins] r88812 - tracker/instances/python-dev/html/style.css Message-ID: <3Q5Q6J62zJzRVW@mail.python.org> Author: ezio.melotti Date: Fri Apr 8 04:41:44 2011 New Revision: 88812 Log: #385: remove unnecessary whitespace. 
Modified: tracker/instances/python-dev/html/style.css Modified: tracker/instances/python-dev/html/style.css ============================================================================== --- tracker/instances/python-dev/html/style.css (original) +++ tracker/instances/python-dev/html/style.css Fri Apr 8 04:41:44 2011 @@ -21,7 +21,7 @@ margin: 0px; } -@media print +@media print { .index-controls { display: none;} #searchbox { display: none;} @@ -34,17 +34,29 @@ } -div#searchbox +div#searchbox { float: right; padding-top: 1em; } -div#searchbox input#search-text +div#searchbox input#search-text { width: 10em; } +#body-main { + margin-left: 14em; +} + +#menu +{ + width: 13em; +} +#menu ul.level-one li a +{ + margin-left: 0; +} #menu ul.level-two li { background-image: none; @@ -53,11 +65,11 @@ border: 0; border-top: 1px solid #DDD; padding: 0.1em; - margin: 0 3em 0px 1.5em; + margin: 0; color: #3C4B7B; background: none; width: 11em !important; - width /**/: 3.2em; + width /**/: 3.2em; font-family: Arial, Verdana, Geneva, "Bitstream Vera Sans", Helvetica, sans-serif; text-transform: none; } @@ -72,7 +84,7 @@ color: #5E72A5; background-image: none; width: 10em !important; - width /**/: 11.4em; + width /**/: 11.4em; font-family: Arial, Verdana, Geneva, "Bitstream Vera Sans", Helvetica, sans-serif; font-size: 95%; } @@ -100,25 +112,25 @@ } -td.date, th.date { +td.date, th.date { white-space: nowrap; } -p.ok-message +p.ok-message { background-color: #22bb22; padding: 5px; color: white; font-weight: bold; } -p.error-message +p.error-message { background-color: #bb2222; padding: 5px; color: white; font-weight: bold; } -p.error-message a[href] +p.error-message a[href] { color: white; text-decoration: underline; @@ -143,7 +155,7 @@ padding: 2px; border-spacing: 5px; border-collapse: collapse; -/* background-color: #e0e0e0; */ +/* background-color: #e0e0e0; */ margin: 5px; } @@ -203,7 +215,7 @@ table.list th a[href]:hover { color: #404070 } table.list th a[href]:link {
color: #404070 } table.list th a[href] { color: #404070 } -table.list th.group +table.list th.group { background-color: #e0e0e0; text-align: center; @@ -364,7 +376,7 @@ /* style for class help display */ -table.classhelp { /* the table-layout: fixed; */ +table.classhelp { /* the table-layout: fixed; */ table-layout: fixed; /* compromises quality for speed */ overflow: hidden; font-size: .9em; From python-checkins at python.org Fri Apr 8 04:49:10 2011 From: python-checkins at python.org (ezio.melotti) Date: Fri, 8 Apr 2011 04:49:10 +0200 (CEST) Subject: [Python-checkins] r88813 - tracker/instances/python-dev/html/issue.item.html Message-ID: <3Q5QGt6D2Qz7Lpk@mail.python.org> Author: ezio.melotti Date: Fri Apr 8 04:49:10 2011 New Revision: 88813 Log: #390: Fix the check in the TAL too. Modified: tracker/instances/python-dev/html/issue.item.html Modified: tracker/instances/python-dev/html/issue.item.html ============================================================================== --- tracker/instances/python-dev/html/issue.item.html (original) +++ tracker/instances/python-dev/html/issue.item.html Fri Apr 8 04:49:10 2011 @@ -156,7 +156,7 @@ @@ -273,8 +273,8 @@ link#link# From solipsis at pitrou.net Fri Apr 8 04:56:38 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Fri, 08 Apr 2011 04:56:38 +0200 Subject: [Python-checkins] Daily reference leaks (5ec2695c9c15): sum=0 Message-ID: results for 5ec2695c9c15 on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogq6nYTr', '-x'] From python-checkins at python.org Fri Apr 8 11:49:10 2011 From: python-checkins at python.org (antoine.pitrou) Date: Fri, 08 Apr 2011 11:49:10 +0200 Subject: [Python-checkins] devguide: pydotorg -> python-committers Message-ID: http://hg.python.org/devguide/rev/91956ceea765 changeset: 410:91956ceea765 user: Antoine Pitrou date: Fri Apr 08 11:49:07 2011 +0200 
summary: pydotorg -> python-committers files: coredev.rst | 6 +++--- faq.rst | 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/coredev.rst b/coredev.rst --- a/coredev.rst +++ b/coredev.rst @@ -60,9 +60,9 @@ You need to generate an SSH 2 RSA key to be able to commit code. You may have multiple keys if you wish (e.g., for work and home). Send your key as an -attachment in an email to python-committers (do not paste it in the email as -SSH keys have specific formatting requirements). Help in generating an SSH key -can be found in the :ref:`faq`. +attachment in an email to python-committers at python.org (do not paste it in +the email as SSH keys have specific formatting requirements). Help in +generating an SSH key can be found in the :ref:`faq`. Your SSH key will be set to a username in the form of "first_name.last_name". This should match your username on the issue tracker. diff --git a/faq.rst b/faq.rst --- a/faq.rst +++ b/faq.rst @@ -676,8 +676,8 @@ How do I generate an SSH 2 public key? ------------------------------------------------------------------------------- -All generated SSH keys should be sent to pydotorg for adding to the list of -keys. +All generated SSH keys should be sent to python-committers at python.org for +adding to the list of keys. UNIX ''''''''''''''''''' -- Repository URL: http://hg.python.org/devguide From python-checkins at python.org Fri Apr 8 12:43:03 2011 From: python-checkins at python.org (vinay.sajip) Date: Fri, 08 Apr 2011 12:43:03 +0200 Subject: [Python-checkins] cpython (2.7): Issue #11794: Reorganised logging documentation. Message-ID: http://hg.python.org/cpython/rev/6fb033af9310 changeset: 69204:6fb033af9310 branch: 2.7 parent: 69188:bd0f73a9538e user: Vinay Sajip date: Fri Apr 08 11:40:38 2011 +0100 summary: Issue #11794: Reorganised logging documentation. 
files: Doc/howto/index.rst | 2 + Doc/howto/logging-cookbook.rst | 684 +++ Doc/howto/logging.rst | 998 ++++ Doc/library/allos.rst | 2 + Doc/library/logging.config.rst | 673 +++ Doc/library/logging.handlers.rst | 740 +++ Doc/library/logging.rst | 3875 ++--------------- 7 files changed, 3656 insertions(+), 3318 deletions(-) diff --git a/Doc/howto/index.rst b/Doc/howto/index.rst --- a/Doc/howto/index.rst +++ b/Doc/howto/index.rst @@ -19,6 +19,8 @@ descriptor.rst doanddont.rst functional.rst + logging.rst + logging-cookbook.rst regex.rst sockets.rst sorting.rst diff --git a/Doc/howto/logging-cookbook.rst b/Doc/howto/logging-cookbook.rst new file mode 100644 --- /dev/null +++ b/Doc/howto/logging-cookbook.rst @@ -0,0 +1,684 @@ +.. _logging-cookbook: + +================ +Logging Cookbook +================ + +:Author: Vinay Sajip + +This page contains a number of recipes related to logging, which have been found +useful in the past. + +.. currentmodule:: logging + +Using logging in multiple modules +--------------------------------- + +Multiple calls to ``logging.getLogger('someLogger')`` return a reference to the +same logger object. This is true not only within the same module, but also +across modules as long as it is in the same Python interpreter process. It is +true for references to the same object; additionally, application code can +define and configure a parent logger in one module and create (but not +configure) a child logger in a separate module, and all logger calls to the +child will pass up to the parent. 
Here is a main module:: + + import logging + import auxiliary_module + + # create logger with 'spam_application' + logger = logging.getLogger('spam_application') + logger.setLevel(logging.DEBUG) + # create file handler which logs even debug messages + fh = logging.FileHandler('spam.log') + fh.setLevel(logging.DEBUG) + # create console handler with a higher log level + ch = logging.StreamHandler() + ch.setLevel(logging.ERROR) + # create formatter and add it to the handlers + formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s') + fh.setFormatter(formatter) + ch.setFormatter(formatter) + # add the handlers to the logger + logger.addHandler(fh) + logger.addHandler(ch) + + logger.info('creating an instance of auxiliary_module.Auxiliary') + a = auxiliary_module.Auxiliary() + logger.info('created an instance of auxiliary_module.Auxiliary') + logger.info('calling auxiliary_module.Auxiliary.do_something') + a.do_something() + logger.info('finished auxiliary_module.Auxiliary.do_something') + logger.info('calling auxiliary_module.some_function()') + auxiliary_module.some_function() + logger.info('done with auxiliary_module.some_function()') + +Here is the auxiliary module:: + + import logging + + # create logger + module_logger = logging.getLogger('spam_application.auxiliary') + + class Auxiliary: + def __init__(self): + self.logger = logging.getLogger('spam_application.auxiliary.Auxiliary') + self.logger.info('creating an instance of Auxiliary') + def do_something(self): + self.logger.info('doing something') + a = 1 + 1 + self.logger.info('done doing something') + + def some_function(): + module_logger.info('received a call to "some_function"') + +The output looks like this:: + + 2005-03-23 23:47:11,663 - spam_application - INFO - + creating an instance of auxiliary_module.Auxiliary + 2005-03-23 23:47:11,665 - spam_application.auxiliary.Auxiliary - INFO - + creating an instance of Auxiliary + 2005-03-23 23:47:11,665 - spam_application - INFO 
- + created an instance of auxiliary_module.Auxiliary + 2005-03-23 23:47:11,668 - spam_application - INFO - + calling auxiliary_module.Auxiliary.do_something + 2005-03-23 23:47:11,668 - spam_application.auxiliary.Auxiliary - INFO - + doing something + 2005-03-23 23:47:11,669 - spam_application.auxiliary.Auxiliary - INFO - + done doing something + 2005-03-23 23:47:11,670 - spam_application - INFO - + finished auxiliary_module.Auxiliary.do_something + 2005-03-23 23:47:11,671 - spam_application - INFO - + calling auxiliary_module.some_function() + 2005-03-23 23:47:11,672 - spam_application.auxiliary - INFO - + received a call to 'some_function' + 2005-03-23 23:47:11,673 - spam_application - INFO - + done with auxiliary_module.some_function() + +Multiple handlers and formatters +-------------------------------- + +Loggers are plain Python objects. The :func:`addHandler` method has no minimum +or maximum quota for the number of handlers you may add. Sometimes it will be +beneficial for an application to log all messages of all severities to a text +file while simultaneously logging errors or above to the console. To set this +up, simply configure the appropriate handlers. The logging calls in the +application code will remain unchanged. 
Here is a slight modification to the +previous simple module-based configuration example:: + + import logging + + logger = logging.getLogger('simple_example') + logger.setLevel(logging.DEBUG) + # create file handler which logs even debug messages + fh = logging.FileHandler('spam.log') + fh.setLevel(logging.DEBUG) + # create console handler with a higher log level + ch = logging.StreamHandler() + ch.setLevel(logging.ERROR) + # create formatter and add it to the handlers + formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s') + ch.setFormatter(formatter) + fh.setFormatter(formatter) + # add the handlers to logger + logger.addHandler(ch) + logger.addHandler(fh) + + # 'application' code + logger.debug('debug message') + logger.info('info message') + logger.warn('warn message') + logger.error('error message') + logger.critical('critical message') + +Notice that the 'application' code does not care about multiple handlers. All +that changed was the addition and configuration of a new handler named *fh*. + +The ability to create new handlers with higher- or lower-severity filters can be +very helpful when writing and testing an application. Instead of using many +``print`` statements for debugging, use ``logger.debug``: Unlike the print +statements, which you will have to delete or comment out later, the logger.debug +statements can remain intact in the source code and remain dormant until you +need them again. At that time, the only change that needs to happen is to +modify the severity level of the logger and/or handler to debug. + +.. _multiple-destinations: + +Logging to multiple destinations +-------------------------------- + +Let's say you want to log to console and file with different message formats and +in differing circumstances. Say you want to log messages with levels of DEBUG +and higher to file, and those messages at level INFO and higher to the console. 
+Let's also assume that the file should contain timestamps, but the console +messages should not. Here's how you can achieve this:: + + import logging + + # set up logging to file - see previous section for more details + logging.basicConfig(level=logging.DEBUG, + format='%(asctime)s %(name)-12s %(levelname)-8s %(message)s', + datefmt='%m-%d %H:%M', + filename='/temp/myapp.log', + filemode='w') + # define a Handler which writes INFO messages or higher to the sys.stderr + console = logging.StreamHandler() + console.setLevel(logging.INFO) + # set a format which is simpler for console use + formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s') + # tell the handler to use this format + console.setFormatter(formatter) + # add the handler to the root logger + logging.getLogger('').addHandler(console) + + # Now, we can log to the root logger, or any other logger. First the root... + logging.info('Jackdaws love my big sphinx of quartz.') + + # Now, define a couple of other loggers which might represent areas in your + # application: + + logger1 = logging.getLogger('myapp.area1') + logger2 = logging.getLogger('myapp.area2') + + logger1.debug('Quick zephyrs blow, vexing daft Jim.') + logger1.info('How quickly daft jumping zebras vex.') + logger2.warning('Jail zesty vixen who grabbed pay from quack.') + logger2.error('The five boxing wizards jump quickly.') + +When you run this, on the console you will see :: + + root : INFO Jackdaws love my big sphinx of quartz. + myapp.area1 : INFO How quickly daft jumping zebras vex. + myapp.area2 : WARNING Jail zesty vixen who grabbed pay from quack. + myapp.area2 : ERROR The five boxing wizards jump quickly. + +and in the file you will see something like :: + + 10-22 22:19 root INFO Jackdaws love my big sphinx of quartz. + 10-22 22:19 myapp.area1 DEBUG Quick zephyrs blow, vexing daft Jim. + 10-22 22:19 myapp.area1 INFO How quickly daft jumping zebras vex. 
+ 10-22 22:19 myapp.area2 WARNING Jail zesty vixen who grabbed pay from quack. + 10-22 22:19 myapp.area2 ERROR The five boxing wizards jump quickly. + +As you can see, the DEBUG message only shows up in the file. The other messages +are sent to both destinations. + +This example uses console and file handlers, but you can use any number and +combination of handlers you choose. + + +Configuration server example +---------------------------- + +Here is an example of a module using the logging configuration server:: + + import logging + import logging.config + import time + import os + + # read initial config file + logging.config.fileConfig('logging.conf') + + # create and start listener on port 9999 + t = logging.config.listen(9999) + t.start() + + logger = logging.getLogger('simpleExample') + + try: + # loop through logging calls to see the difference + # new configurations make, until Ctrl+C is pressed + while True: + logger.debug('debug message') + logger.info('info message') + logger.warn('warn message') + logger.error('error message') + logger.critical('critical message') + time.sleep(5) + except KeyboardInterrupt: + # cleanup + logging.config.stopListening() + t.join() + +And here is a script that takes a filename and sends that file to the server, +properly preceded with the binary-encoded length, as the new logging +configuration:: + + #!/usr/bin/env python + import socket, sys, struct + + with open(sys.argv[1], 'rb') as f: + data_to_send = f.read() + + HOST = 'localhost' + PORT = 9999 + s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) + print('connecting...') + s.connect((HOST, PORT)) + print('sending config...') + s.send(struct.pack('>L', len(data_to_send))) + s.send(data_to_send) + s.close() + print('complete') + + +.. _network-logging: + +Sending and receiving logging events across a network +----------------------------------------------------- + +Let's say you want to send logging events across a network, and handle them at +the receiving end. 
A simple way of doing this is attaching a +:class:`SocketHandler` instance to the root logger at the sending end:: + + import logging, logging.handlers + + rootLogger = logging.getLogger('') + rootLogger.setLevel(logging.DEBUG) + socketHandler = logging.handlers.SocketHandler('localhost', + logging.handlers.DEFAULT_TCP_LOGGING_PORT) + # don't bother with a formatter, since a socket handler sends the event as + # an unformatted pickle + rootLogger.addHandler(socketHandler) + + # Now, we can log to the root logger, or any other logger. First the root... + logging.info('Jackdaws love my big sphinx of quartz.') + + # Now, define a couple of other loggers which might represent areas in your + # application: + + logger1 = logging.getLogger('myapp.area1') + logger2 = logging.getLogger('myapp.area2') + + logger1.debug('Quick zephyrs blow, vexing daft Jim.') + logger1.info('How quickly daft jumping zebras vex.') + logger2.warning('Jail zesty vixen who grabbed pay from quack.') + logger2.error('The five boxing wizards jump quickly.') + +At the receiving end, you can set up a receiver using the :mod:`socketserver` +module. Here is a basic working example:: + + import pickle + import logging + import logging.handlers + import socketserver + import struct + + + class LogRecordStreamHandler(socketserver.StreamRequestHandler): + """Handler for a streaming logging request. + + This basically logs the record using whatever logging policy is + configured locally. + """ + + def handle(self): + """ + Handle multiple requests - each expected to be a 4-byte length, + followed by the LogRecord in pickle format. Logs the record + according to whatever policy is configured locally. 
+ """ + while True: + chunk = self.connection.recv(4) + if len(chunk) < 4: + break + slen = struct.unpack('>L', chunk)[0] + chunk = self.connection.recv(slen) + while len(chunk) < slen: + chunk = chunk + self.connection.recv(slen - len(chunk)) + obj = self.unPickle(chunk) + record = logging.makeLogRecord(obj) + self.handleLogRecord(record) + + def unPickle(self, data): + return pickle.loads(data) + + def handleLogRecord(self, record): + # if a name is specified, we use the named logger rather than the one + # implied by the record. + if self.server.logname is not None: + name = self.server.logname + else: + name = record.name + logger = logging.getLogger(name) + # N.B. EVERY record gets logged. This is because Logger.handle + # is normally called AFTER logger-level filtering. If you want + # to do filtering, do it at the client end to save wasting + # cycles and network bandwidth! + logger.handle(record) + + class LogRecordSocketReceiver(socketserver.ThreadingTCPServer): + """ + Simple TCP socket-based logging receiver suitable for testing. + """ + + allow_reuse_address = 1 + + def __init__(self, host='localhost', + port=logging.handlers.DEFAULT_TCP_LOGGING_PORT, + handler=LogRecordStreamHandler): + socketserver.ThreadingTCPServer.__init__(self, (host, port), handler) + self.abort = 0 + self.timeout = 1 + self.logname = None + + def serve_until_stopped(self): + import select + abort = 0 + while not abort: + rd, wr, ex = select.select([self.socket.fileno()], + [], [], + self.timeout) + if rd: + self.handle_request() + abort = self.abort + + def main(): + logging.basicConfig( + format='%(relativeCreated)5d %(name)-15s %(levelname)-8s %(message)s') + tcpserver = LogRecordSocketReceiver() + print('About to start TCP server...') + tcpserver.serve_until_stopped() + + if __name__ == '__main__': + main() + +First run the server, and then the client. 
On the client side, nothing is +printed on the console; on the server side, you should see something like:: + + About to start TCP server... + 59 root INFO Jackdaws love my big sphinx of quartz. + 59 myapp.area1 DEBUG Quick zephyrs blow, vexing daft Jim. + 69 myapp.area1 INFO How quickly daft jumping zebras vex. + 69 myapp.area2 WARNING Jail zesty vixen who grabbed pay from quack. + 69 myapp.area2 ERROR The five boxing wizards jump quickly. + +Note that there are some security issues with pickle in some scenarios. If +these affect you, you can use an alternative serialization scheme by overriding +the :meth:`makePickle` method and implementing your alternative there, as +well as adapting the above script to use your alternative serialization. + + +.. _context-info: + +Adding contextual information to your logging output +---------------------------------------------------- + +Sometimes you want logging output to contain contextual information in +addition to the parameters passed to the logging call. For example, in a +networked application, it may be desirable to log client-specific information +in the log (e.g. remote client's username, or IP address). Although you could +use the *extra* parameter to achieve this, it's not always convenient to pass +the information in this way. While it might be tempting to create +:class:`Logger` instances on a per-connection basis, this is not a good idea +because these instances are not garbage collected. While this is not a problem +in practice, when the number of :class:`Logger` instances is dependent on the +level of granularity you want to use in logging an application, it could +be hard to manage if the number of :class:`Logger` instances becomes +effectively unbounded. 
+ + +Using LoggerAdapters to impart contextual information +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +An easy way in which you can pass contextual information to be output along +with logging event information is to use the :class:`LoggerAdapter` class. +This class is designed to look like a :class:`Logger`, so that you can call +:meth:`debug`, :meth:`info`, :meth:`warning`, :meth:`error`, +:meth:`exception`, :meth:`critical` and :meth:`log`. These methods have the +same signatures as their counterparts in :class:`Logger`, so you can use the +two types of instances interchangeably. + +When you create an instance of :class:`LoggerAdapter`, you pass it a +:class:`Logger` instance and a dict-like object which contains your contextual +information. When you call one of the logging methods on an instance of +:class:`LoggerAdapter`, it delegates the call to the underlying instance of +:class:`Logger` passed to its constructor, and arranges to pass the contextual +information in the delegated call. Here's a snippet from the code of +:class:`LoggerAdapter`:: + + def debug(self, msg, *args, **kwargs): + """ + Delegate a debug call to the underlying logger, after adding + contextual information from this adapter instance. + """ + msg, kwargs = self.process(msg, kwargs) + self.logger.debug(msg, *args, **kwargs) + +The :meth:`process` method of :class:`LoggerAdapter` is where the contextual +information is added to the logging output. It's passed the message and +keyword arguments of the logging call, and it passes back (potentially) +modified versions of these to use in the call to the underlying logger. The +default implementation of this method leaves the message alone, but inserts +an 'extra' key in the keyword argument whose value is the dict-like object +passed to the constructor. Of course, if you had passed an 'extra' keyword +argument in the call to the adapter, it will be silently overwritten. 
+ +The advantage of using 'extra' is that the values in the dict-like object are +merged into the :class:`LogRecord` instance's __dict__, allowing you to use +customized strings with your :class:`Formatter` instances which know about +the keys of the dict-like object. If you need a different method, e.g. if you +want to prepend or append the contextual information to the message string, +you just need to subclass :class:`LoggerAdapter` and override :meth:`process` +to do what you need. Here's an example script which uses this class, which +also illustrates what dict-like behaviour is needed from an arbitrary +'dict-like' object for use in the constructor:: + + import logging + + class ConnInfo: + """ + An example class which shows how an arbitrary class can be used as + the 'extra' context information repository passed to a LoggerAdapter. + """ + + def __getitem__(self, name): + """ + To allow this instance to look like a dict. + """ + from random import choice + if name == 'ip': + result = choice(['127.0.0.1', '192.168.0.1']) + elif name == 'user': + result = choice(['jim', 'fred', 'sheila']) + else: + result = self.__dict__.get(name, '?') + return result + + def __iter__(self): + """ + To allow iteration over keys, which will be merged into + the LogRecord dict before formatting and output. 
+ """ + keys = ['ip', 'user'] + keys.extend(self.__dict__.keys()) + return keys.__iter__() + + if __name__ == '__main__': + from random import choice + levels = (logging.DEBUG, logging.INFO, logging.WARNING, logging.ERROR, logging.CRITICAL) + a1 = logging.LoggerAdapter(logging.getLogger('a.b.c'), + { 'ip' : '123.231.231.123', 'user' : 'sheila' }) + logging.basicConfig(level=logging.DEBUG, + format='%(asctime)-15s %(name)-5s %(levelname)-8s IP: %(ip)-15s User: %(user)-8s %(message)s') + a1.debug('A debug message') + a1.info('An info message with %s', 'some parameters') + a2 = logging.LoggerAdapter(logging.getLogger('d.e.f'), ConnInfo()) + for x in range(10): + lvl = choice(levels) + lvlname = logging.getLevelName(lvl) + a2.log(lvl, 'A message at %s level with %d %s', lvlname, 2, 'parameters') + +When this script is run, the output should look something like this:: + + 2008-01-18 14:49:54,023 a.b.c DEBUG IP: 123.231.231.123 User: sheila A debug message + 2008-01-18 14:49:54,023 a.b.c INFO IP: 123.231.231.123 User: sheila An info message with some parameters + 2008-01-18 14:49:54,023 d.e.f CRITICAL IP: 192.168.0.1 User: jim A message at CRITICAL level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f INFO IP: 192.168.0.1 User: jim A message at INFO level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f ERROR IP: 127.0.0.1 User: fred A message at ERROR level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f ERROR IP: 127.0.0.1 User: sheila A message at ERROR level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: jim A message at WARNING level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f INFO IP: 192.168.0.1 User: fred A message at INFO level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f WARNING IP: 
192.168.0.1 User: sheila A message at WARNING level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f WARNING IP: 127.0.0.1 User: jim A message at WARNING level with 2 parameters + + +.. _filters-contextual: + +Using Filters to impart contextual information +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +You can also add contextual information to log output using a user-defined +:class:`Filter`. ``Filter`` instances are allowed to modify the ``LogRecords`` +passed to them, including adding additional attributes which can then be output +using a suitable format string, or if needed a custom :class:`Formatter`. + +For example in a web application, the request being processed (or at least, +the interesting parts of it) can be stored in a threadlocal +(:class:`threading.local`) variable, and then accessed from a ``Filter`` to +add, say, information from the request - say, the remote IP address and remote +user's username - to the ``LogRecord``, using the attribute names 'ip' and +'user' as in the ``LoggerAdapter`` example above. In that case, the same format +string can be used to get similar output to that shown above. Here's an example +script:: + + import logging + from random import choice + + class ContextFilter(logging.Filter): + """ + This is a filter which injects contextual information into the log. + + Rather than use actual contextual information, we just use random + data in this demo. 
+ """ + + USERS = ['jim', 'fred', 'sheila'] + IPS = ['123.231.231.123', '127.0.0.1', '192.168.0.1'] + + def filter(self, record): + + record.ip = choice(ContextFilter.IPS) + record.user = choice(ContextFilter.USERS) + return True + + if __name__ == '__main__': + levels = (logging.DEBUG, logging.INFO, logging.WARNING, logging.ERROR, logging.CRITICAL) + logging.basicConfig(level=logging.DEBUG, + format='%(asctime)-15s %(name)-5s %(levelname)-8s IP: %(ip)-15s User: %(user)-8s %(message)s') + a1 = logging.getLogger('a.b.c') + a2 = logging.getLogger('d.e.f') + + f = ContextFilter() + a1.addFilter(f) + a2.addFilter(f) + a1.debug('A debug message') + a1.info('An info message with %s', 'some parameters') + for x in range(10): + lvl = choice(levels) + lvlname = logging.getLevelName(lvl) + a2.log(lvl, 'A message at %s level with %d %s', lvlname, 2, 'parameters') + +which, when run, produces something like:: + + 2010-09-06 22:38:15,292 a.b.c DEBUG IP: 123.231.231.123 User: fred A debug message + 2010-09-06 22:38:15,300 a.b.c INFO IP: 192.168.0.1 User: sheila An info message with some parameters + 2010-09-06 22:38:15,300 d.e.f CRITICAL IP: 127.0.0.1 User: sheila A message at CRITICAL level with 2 parameters + 2010-09-06 22:38:15,300 d.e.f ERROR IP: 127.0.0.1 User: jim A message at ERROR level with 2 parameters + 2010-09-06 22:38:15,300 d.e.f DEBUG IP: 127.0.0.1 User: sheila A message at DEBUG level with 2 parameters + 2010-09-06 22:38:15,300 d.e.f ERROR IP: 123.231.231.123 User: fred A message at ERROR level with 2 parameters + 2010-09-06 22:38:15,300 d.e.f CRITICAL IP: 192.168.0.1 User: jim A message at CRITICAL level with 2 parameters + 2010-09-06 22:38:15,300 d.e.f CRITICAL IP: 127.0.0.1 User: sheila A message at CRITICAL level with 2 parameters + 2010-09-06 22:38:15,300 d.e.f DEBUG IP: 192.168.0.1 User: jim A message at DEBUG level with 2 parameters + 2010-09-06 22:38:15,301 d.e.f ERROR IP: 127.0.0.1 User: sheila A message at ERROR level with 2 parameters + 2010-09-06 
22:38:15,301 d.e.f DEBUG IP: 123.231.231.123 User: fred A message at DEBUG level with 2 parameters + 2010-09-06 22:38:15,301 d.e.f INFO IP: 123.231.231.123 User: fred A message at INFO level with 2 parameters + + +.. _multiple-processes: + +Logging to a single file from multiple processes +------------------------------------------------ + +Although logging is thread-safe, and logging to a single file from multiple +threads in a single process *is* supported, logging to a single file from +*multiple processes* is *not* supported, because there is no standard way to +serialize access to a single file across multiple processes in Python. If you +need to log to a single file from multiple processes, one way of doing this is +to have all the processes log to a :class:`SocketHandler`, and have a separate +process which implements a socket server which reads from the socket and logs +to file. (If you prefer, you can dedicate one thread in one of the existing +processes to perform this function.) The following section documents this +approach in more detail and includes a working socket receiver which can be +used as a starting point for you to adapt in your own applications. + +If you are using a recent version of Python which includes the +:mod:`multiprocessing` module, you could write your own handler which uses the +:class:`Lock` class from this module to serialize access to the file from +your processes. The existing :class:`FileHandler` and subclasses do not make +use of :mod:`multiprocessing` at present, though they may do so in the future. +Note that at present, the :mod:`multiprocessing` module does not provide +working lock functionality on all platforms (see +http://bugs.python.org/issue3770). + +.. currentmodule:: logging.handlers + + +Using file rotation +------------------- + +.. sectionauthor:: Doug Hellmann, Vinay Sajip (changes) +.. (see ) + +Sometimes you want to let a log file grow to a certain size, then open a new +file and log to that. 
You may want to keep a certain number of these files, and +when that many files have been created, rotate the files so that the number of +files and the size of the files both remain bounded. For this usage pattern, the +logging package provides a :class:`RotatingFileHandler`:: + + import glob + import logging + import logging.handlers + + LOG_FILENAME = 'logging_rotatingfile_example.out' + + # Set up a specific logger with our desired output level + my_logger = logging.getLogger('MyLogger') + my_logger.setLevel(logging.DEBUG) + + # Add the log message handler to the logger + handler = logging.handlers.RotatingFileHandler( + LOG_FILENAME, maxBytes=20, backupCount=5) + + my_logger.addHandler(handler) + + # Log some messages + for i in range(20): + my_logger.debug('i = %d' % i) + + # See what files are created + logfiles = glob.glob('%s*' % LOG_FILENAME) + + for filename in logfiles: + print(filename) + +The result should be 6 separate files, each with part of the log history for the +application:: + + logging_rotatingfile_example.out + logging_rotatingfile_example.out.1 + logging_rotatingfile_example.out.2 + logging_rotatingfile_example.out.3 + logging_rotatingfile_example.out.4 + logging_rotatingfile_example.out.5 + +The most current file is always :file:`logging_rotatingfile_example.out`, +and each time it reaches the size limit it is renamed with the suffix +``.1``. Each of the existing backup files is renamed to increment the suffix +(``.1`` becomes ``.2``, etc.) and the ``.6`` file is erased. + +Obviously this example sets the log length much much too small as an extreme +example. You would want to set *maxBytes* to an appropriate value. + diff --git a/Doc/howto/logging.rst b/Doc/howto/logging.rst new file mode 100644 --- /dev/null +++ b/Doc/howto/logging.rst @@ -0,0 +1,998 @@ +============= +Logging HOWTO +============= + +:Author: Vinay Sajip + +.. _logging-basic-tutorial: + +.. 
currentmodule:: logging + +Basic Logging Tutorial +---------------------- + +Logging is a means of tracking events that happen when some software runs. The +software's developer adds logging calls to their code to indicate that certain +events have occurred. An event is described by a descriptive message which can +optionally contain variable data (i.e. data that is potentially different for +each occurrence of the event). Events also have an importance which the +developer ascribes to the event; the importance can also be called the *level* +or *severity*. + +When to use logging +^^^^^^^^^^^^^^^^^^^ + +Logging provides a set of convenience functions for simple logging usage. These +are :func:`debug`, :func:`info`, :func:`warning`, :func:`error` and +:func:`critical`. To determine when to use logging, see the table below, which +states, for each of a set of common tasks, the best tool to use for it. + ++-------------------------------------+--------------------------------------+ +| Task you want to perform | The best tool for the task | ++=====================================+======================================+ +| Display console output for ordinary | :func:`print` | +| usage of a command line script or | | +| program | | ++-------------------------------------+--------------------------------------+ +| Report events that occur during | :func:`logging.info` (or | +| normal operation of a program (e.g. 
| :func:`logging.debug` for very | +| for status monitoring or fault | detailed output for diagnostic | +| investigation) | purposes) | ++-------------------------------------+--------------------------------------+ +| Issue a warning regarding a | :func:`warnings.warn` in library | +| particular runtime event | code if the issue is avoidable and | +| | the client application should be | +| | modified to eliminate the warning | +| | | +| | :func:`logging.warning` if there is | +| | nothing the client application can do| +| | about the situation, but the event | +| | should still be noted | ++-------------------------------------+--------------------------------------+ +| Report an error regarding a | Raise an exception | +| particular runtime event | | ++-------------------------------------+--------------------------------------+ +| Report suppression of an error | :func:`logging.error`, | +| without raising an exception (e.g. | :func:`logging.exception` or | +| error handler in a long-running | :func:`logging.critical` as | +| server process) | appropriate for the specific error | +| | and application domain | ++-------------------------------------+--------------------------------------+ + +The logging functions are named after the level or severity of the events +they are used to track. The standard levels and their applicability are +described below (in increasing order of severity): + ++--------------+---------------------------------------------+ +| Level | When it's used | ++==============+=============================================+ +| ``DEBUG`` | Detailed information, typically of interest | +| | only when diagnosing problems. | ++--------------+---------------------------------------------+ +| ``INFO`` | Confirmation that things are working as | +| | expected. 
| ++--------------+---------------------------------------------+ +| ``WARNING`` | An indication that something unexpected | +| | happened, or indicative of some problem in | +| | the near future (e.g. 'disk space low'). | +| | The software is still working as expected. | ++--------------+---------------------------------------------+ +| ``ERROR`` | Due to a more serious problem, the software | +| | has not been able to perform some function. | ++--------------+---------------------------------------------+ +| ``CRITICAL`` | A serious error, indicating that the program| +| | itself may be unable to continue running. | ++--------------+---------------------------------------------+ + +The default level is ``WARNING``, which means that only events of this level +and above will be tracked, unless the logging package is configured to do +otherwise. + +Events that are tracked can be handled in different ways. The simplest way of +handling tracked events is to print them to the console. Another common way +is to write them to a disk file. + + +.. _howto-minimal-example: + +A simple example +^^^^^^^^^^^^^^^^ + +A very simple example is:: + + import logging + logging.warning('Watch out!') # will print a message to the console + logging.info('I told you so') # will not print anything + +If you type these lines into a script and run it, you'll see:: + + WARNING:root:Watch out! + +printed out on the console. The ``INFO`` message doesn't appear because the +default level is ``WARNING``. The printed message includes the indication of +the level and the description of the event provided in the logging call, i.e. +'Watch out!'. Don't worry about the 'root' part for now: it will be explained +later. The actual output can be formatted quite flexibly if you need that; +formatting options will also be explained later. 
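The threshold behaviour described above can be checked without touching the root logger. The following is a minimal sketch, not part of the original tutorial: the logger name ``threshold_demo`` and the helper ``demo_default_threshold`` are hypothetical, and the ``WARNING`` level is set explicitly on a private logger so that the demonstration does not depend on how the root logger happens to be configured.

```python
import io
import logging


def demo_default_threshold():
    """Return the text emitted by a logger whose threshold is WARNING."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    # Same layout as the output shown above: severity:logger name:message
    handler.setFormatter(logging.Formatter('%(levelname)s:%(name)s:%(message)s'))

    logger = logging.getLogger('threshold_demo')  # hypothetical logger name
    logger.setLevel(logging.WARNING)  # made explicit for the demonstration
    logger.propagate = False          # keep the demo away from root handlers
    logger.addHandler(handler)
    try:
        logger.warning('Watch out!')    # at the threshold: emitted
        logger.info('I told you so')    # below the threshold: dropped
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


if __name__ == '__main__':
    print(demo_default_threshold(), end='')
```

Only the ``WARNING`` event reaches the stream; the ``INFO`` event is discarded before any handler sees it.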
+ + +Logging to a file +^^^^^^^^^^^^^^^^^ + +A very common situation is that of recording logging events in a file, so let's +look at that next:: + + import logging + logging.basicConfig(filename='example.log',level=logging.DEBUG) + logging.debug('This message should go to the log file') + logging.info('So should this') + logging.warning('And this, too') + +And now if we open the file and look at what we have, we should find the log +messages:: + + DEBUG:root:This message should go to the log file + INFO:root:So should this + WARNING:root:And this, too + +This example also shows how you can set the logging level which acts as the +threshold for tracking. In this case, because we set the threshold to +``DEBUG``, all of the messages were printed. + +If you want to set the logging level from a command-line option such as:: + + --log=INFO + +and you have the value of the parameter passed for ``--log`` in some variable +*loglevel*, you can use:: + + getattr(logging, loglevel.upper()) + +to get the value which you'll pass to :func:`basicConfig` via the *level* +argument. You may want to error check any user input value, perhaps as in the +following example:: + + # assuming loglevel is bound to the string value obtained from the + # command line argument. Convert to upper case to allow the user to + # specify --log=DEBUG or --log=debug + numeric_level = getattr(logging, loglevel.upper(), None) + if not isinstance(numeric_level, int): + raise ValueError('Invalid log level: %s' % loglevel) + logging.basicConfig(level=numeric_level, ...) + +The call to :func:`basicConfig` should come *before* any calls to :func:`debug`, +:func:`info` etc. As it's intended as a one-off simple configuration facility, +only the first call will actually do anything: subsequent calls are effectively +no-ops. + +If you run the above script several times, the messages from successive runs +are appended to the file *example.log*. 
If you want each run to start afresh, +not remembering the messages from earlier runs, you can specify the *filemode* +argument, by changing the call in the above example to:: + + logging.basicConfig(filename='example.log', filemode='w', level=logging.DEBUG) + +The output will be the same as before, but the log file is no longer appended +to, so the messages from earlier runs are lost. + + +Logging from multiple modules +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If your program consists of multiple modules, here's an example of how you +could organize logging in it:: + + # myapp.py + import logging + import mylib + + def main(): + logging.basicConfig(filename='myapp.log', level=logging.INFO) + logging.info('Started') + mylib.do_something() + logging.info('Finished') + + if __name__ == '__main__': + main() + +:: + + # mylib.py + import logging + + def do_something(): + logging.info('Doing something') + +If you run *myapp.py*, you should see this in *myapp.log*:: + + INFO:root:Started + INFO:root:Doing something + INFO:root:Finished + +which is hopefully what you were expecting to see. You can generalize this to +multiple modules, using the pattern in *mylib.py*. Note that for this simple +usage pattern, you won't know, by looking in the log file, *where* in your +application your messages came from, apart from looking at the event +description. If you want to track the location of your messages, you'll need +to refer to the documentation beyond the tutorial level -- see +:ref:`logging-advanced-tutorial`. + + +Logging variable data +^^^^^^^^^^^^^^^^^^^^^ + +To log variable data, use a format string for the event description message and +append the variable data as arguments. For example:: + + import logging + logging.warning('%s before you %s', 'Look', 'leap!') + +will display:: + + WARNING:root:Look before you leap! + +As you can see, merging of variable data into the event description message +uses the old, %-style of string formatting. 
This is for backwards +compatibility: the logging package pre-dates newer formatting options such as +:meth:`str.format` and :class:`string.Template`. These newer formatting +options *are* supported, but exploring them is outside the scope of this +tutorial. + + +Changing the format of displayed messages +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To change the format which is used to display messages, you need to +specify the format you want to use:: + + import logging + logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.DEBUG) + logging.debug('This message should appear on the console') + logging.info('So should this') + logging.warning('And this, too') + +which would print:: + + DEBUG:This message should appear on the console + INFO:So should this + WARNING:And this, too + +Notice that the 'root' which appeared in earlier examples has disappeared. For +a full set of things that can appear in format strings, you can refer to the +documentation for :ref:`logrecord-attributes`, but for simple usage, you just +need the *levelname* (severity), *message* (event description, including +variable data) and perhaps to display when the event occurred. This is +described in the next section. + + +Displaying the date/time in messages +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To display the date and time of an event, you would place '%(asctime)s' in +your format string:: + + import logging + logging.basicConfig(format='%(asctime)s %(message)s') + logging.warning('is when this event was logged.') + +which should print something like this:: + + 2010-12-12 11:41:42,612 is when this event was logged. + +The default format for date/time display (shown above) is ISO8601. 
If you need +more control over the formatting of the date/time, provide a *datefmt* +argument to ``basicConfig``, as in this example:: + + import logging + logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p') + logging.warning('is when this event was logged.') + +which would display something like this:: + + 12/12/2010 11:46:36 AM is when this event was logged. + +The format of the *datefmt* argument is the same as supported by +:func:`time.strftime`. + + +Next Steps +^^^^^^^^^^ + +That concludes the basic tutorial. It should be enough to get you up and +running with logging. There's a lot more that the logging package offers, but +to get the best out of it, you'll need to invest a little more of your time in +reading the following sections. If you're ready for that, grab some of your +favourite beverage and carry on. + +If your logging needs are simple, then use the above examples to incorporate +logging into your own scripts, and if you run into problems or don't +understand something, please post a question on the comp.lang.python Usenet +group (available at http://groups.google.com/group/comp.lang.python) and you +should receive help before too long. + +Still here? You can carry on reading the next few sections, which provide a +slightly more advanced/in-depth tutorial than the basic one above. After that, +you can take a look at the :ref:`logging-cookbook`. + +.. _logging-advanced-tutorial: + + +Advanced Logging Tutorial +------------------------- + +The logging library takes a modular approach and offers several categories +of components: loggers, handlers, filters, and formatters. + +* Loggers expose the interface that application code directly uses. +* Handlers send the log records (created by loggers) to the appropriate + destination. +* Filters provide a finer grained facility for determining which log records + to output. +* Formatters specify the layout of log records in the final output. 
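To make the division of labour between the four component types concrete, here is a small sketch that wires one of each together. It is an illustration under stated assumptions rather than code from the logging package: the ``DropHeartbeats`` filter and the ``build_pipeline`` helper are hypothetical names, and an in-memory stream stands in for a real destination.

```python
import io
import logging


class DropHeartbeats(logging.Filter):
    """Hypothetical filter: reject records whose message mentions 'heartbeat'."""

    def filter(self, record):
        return 'heartbeat' not in record.getMessage()


def build_pipeline():
    """Wire a logger, a handler, a filter and a formatter together."""
    stream = io.StringIO()                       # stands in for a real destination
    handler = logging.StreamHandler(stream)      # handler: where records go
    handler.setFormatter(                        # formatter: how records look
        logging.Formatter('%(name)s [%(levelname)s] %(message)s'))
    handler.addFilter(DropHeartbeats())          # filter: which records get out

    logger = logging.getLogger('scan.text')      # logger: what application code calls
    logger.setLevel(logging.DEBUG)
    logger.propagate = False                     # keep the sketch self-contained
    logger.addHandler(handler)
    return logger, handler, stream


if __name__ == '__main__':
    logger, handler, stream = build_pipeline()
    logger.info('parsed %d lines', 3)
    logger.debug('heartbeat')                    # rejected by the filter
    logger.removeHandler(handler)
    print(stream.getvalue(), end='')
```

The application code only ever touches the logger; the handler, filter and formatter can be swapped without changing any logging call.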
+ +Logging is performed by calling methods on instances of the :class:`Logger` +class (hereafter called :dfn:`loggers`). Each instance has a name, and they are +conceptually arranged in a namespace hierarchy using dots (periods) as +separators. For example, a logger named 'scan' is the parent of loggers +'scan.text', 'scan.html' and 'scan.pdf'. Logger names can be anything you want, +and indicate the area of an application in which a logged message originates. + +A good convention to use when naming loggers is to use a module-level logger, +in each module which uses logging, named as follows:: + + logger = logging.getLogger(__name__) + +This means that logger names track the package/module hierarchy, and it's +intuitively obvious where events are logged just from the logger name. + +The root of the hierarchy of loggers is called the root logger. That's the +logger used by the functions :func:`debug`, :func:`info`, :func:`warning`, +:func:`error` and :func:`critical`, which just call the same-named method of +the root logger. The functions and the methods have the same signatures. The +root logger's name is printed as 'root' in the logged output. + +It is, of course, possible to log messages to different destinations. Support +is included in the package for writing log messages to files, HTTP GET/POST +locations, email via SMTP, generic sockets, or OS-specific logging mechanisms +such as syslog or the Windows NT event log. Destinations are served by +:dfn:`handler` classes. You can create your own log destination class if you +have special requirements not met by any of the built-in handler classes. + +By default, no destination is set for any logging messages. You can specify +a destination (such as console or file) by using :func:`basicConfig` as in the +tutorial examples. 
If you call the functions :func:`debug`, :func:`info`, +:func:`warning`, :func:`error` and :func:`critical`, they will check whether +a destination is set; if none is set, they will set a destination +of the console (``sys.stderr``) and a default format for the displayed +message before delegating to the root logger to do the actual message output. + +The default format set by :func:`basicConfig` for messages is:: + + severity:logger name:message + +You can change this by passing a format string to :func:`basicConfig` with the +*format* keyword argument. For all options regarding how a format string is +constructed, see :ref:`formatter-objects`. + + +Loggers +^^^^^^^ + +:class:`Logger` objects have a threefold job. First, they expose several +methods to application code so that applications can log messages at runtime. +Second, logger objects determine which log messages to act upon based upon +severity (the default filtering facility) or filter objects. Third, logger +objects pass along relevant log messages to all interested log handlers. + +The most widely used methods on logger objects fall into two categories: +configuration and message sending. + +These are the most common configuration methods: + +* :meth:`Logger.setLevel` specifies the lowest-severity log message a logger + will handle, where debug is the lowest built-in severity level and critical + is the highest built-in severity. For example, if the severity level is + INFO, the logger will handle only INFO, WARNING, ERROR, and CRITICAL messages + and will ignore DEBUG messages. + +* :meth:`Logger.addHandler` and :meth:`Logger.removeHandler` add and remove + handler objects from the logger object. Handlers are covered in more detail + in :ref:`handler-basic`. + +* :meth:`Logger.addFilter` and :meth:`Logger.removeFilter` add and remove filter + objects from the logger object. Filters are covered in more detail in + :ref:`filter`. 
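The configuration methods listed above can be exercised as in the sketch below. The logger name ``config_demo`` and the helper ``configure_and_inspect`` are hypothetical; the sketch only inspects the logger's state before and after, using :meth:`Logger.isEnabledFor` and the ``handlers`` and ``filters`` attributes.

```python
import logging


def configure_and_inspect():
    """Apply the common configuration methods and report the resulting state."""
    logger = logging.getLogger('config_demo')     # hypothetical logger name
    handler = logging.NullHandler()               # a handler that outputs nothing
    name_filter = logging.Filter('config_demo')   # passes this logger and its children

    logger.setLevel(logging.INFO)                 # INFO and above will be handled
    logger.addHandler(handler)
    logger.addFilter(name_filter)
    configured = (
        logger.isEnabledFor(logging.DEBUG),       # below the threshold
        logger.isEnabledFor(logging.INFO),        # at the threshold
        handler in logger.handlers,
        name_filter in logger.filters,
    )

    # Undo the configuration again.
    logger.removeFilter(name_filter)
    logger.removeHandler(handler)
    removed = (handler in logger.handlers, name_filter in logger.filters)
    return configured, removed


if __name__ == '__main__':
    print(configure_and_inspect())
```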
+ +You don't need to always call these methods on every logger you create. See the +last two paragraphs in this section. + +With the logger object configured, the following methods create log messages: + +* :meth:`Logger.debug`, :meth:`Logger.info`, :meth:`Logger.warning`, + :meth:`Logger.error`, and :meth:`Logger.critical` all create log records with + a message and a level that corresponds to their respective method names. The + message is actually a format string, which may contain the standard string + substitution syntax of :const:`%s`, :const:`%d`, :const:`%f`, and so on. The + rest of their arguments is a list of objects that correspond with the + substitution fields in the message. With regard to :const:`**kwargs`, the + logging methods care only about a keyword of :const:`exc_info` and use it to + determine whether to log exception information. + +* :meth:`Logger.exception` creates a log message similar to + :meth:`Logger.error`. The difference is that :meth:`Logger.exception` dumps a + stack trace along with it. Call this method only from an exception handler. + +* :meth:`Logger.log` takes a log level as an explicit argument. This is a + little more verbose for logging messages than using the log level convenience + methods listed above, but this is how to log at custom log levels. + +:func:`getLogger` returns a reference to a logger instance with the specified +name if it is provided, or ``root`` if not. The names are period-separated +hierarchical structures. Multiple calls to :func:`getLogger` with the same name +will return a reference to the same logger object. Loggers that are further +down in the hierarchical list are children of loggers higher up in the list. +For example, given a logger with a name of ``foo``, loggers with names of +``foo.bar``, ``foo.bar.baz``, and ``foo.bam`` are all descendants of ``foo``. + +Loggers have a concept of *effective level*. 
If a level is not explicitly set +on a logger, the level of its parent is used instead as its effective level. +If the parent has no explicit level set, *its* parent is examined, and so on - +all ancestors are searched until an explicitly set level is found. The root +logger always has an explicit level set (``WARNING`` by default). When deciding +whether to process an event, the effective level of the logger is used to +determine whether the event is passed to the logger's handlers. + +Child loggers propagate messages up to the handlers associated with their +ancestor loggers. Because of this, it is unnecessary to define and configure +handlers for all the loggers an application uses. It is sufficient to +configure handlers for a top-level logger and create child loggers as needed. +(You can, however, turn off propagation by setting the *propagate* +attribute of a logger to *False*.) + + +.. _handler-basic: + +Handlers +^^^^^^^^ + +:class:`~logging.Handler` objects are responsible for dispatching the +appropriate log messages (based on the log messages' severity) to the handler's +specified destination. Logger objects can add zero or more handler objects to +themselves with an :func:`addHandler` method. As an example scenario, an +application may want to send all log messages to a log file, all log messages +of error or higher to stdout, and all messages of critical severity to an email +address. This scenario requires three individual handlers where each handler is +responsible for sending messages of a specific severity to a specific location. + +The standard library includes quite a few handler types (see +:ref:`useful-handlers`); the tutorials use mainly :class:`StreamHandler` and +:class:`FileHandler` in their examples. + +There are very few methods in a handler for application developers to concern +themselves with. 
The only handler methods that seem relevant for application +developers who are using the built-in handler objects (that is, not creating +custom handlers) are the following configuration methods: + +* The :meth:`Handler.setLevel` method, just as in logger objects, specifies the + lowest severity that will be dispatched to the appropriate destination. Why + are there two :func:`setLevel` methods? The level set in the logger + determines which severity of messages it will pass to its handlers. The level + set in each handler determines which messages that handler will send on. + +* :func:`setFormatter` selects a Formatter object for this handler to use. + +* :func:`addFilter` and :func:`removeFilter` respectively configure and + deconfigure filter objects on handlers. + +Application code should not directly instantiate and use instances of +:class:`Handler`. Instead, the :class:`Handler` class is a base class that +defines the interface that all handlers should have and establishes some +default behavior that child classes can use (or override). + + +Formatters +^^^^^^^^^^ + +Formatter objects configure the final order, structure, and contents of the log +message. Unlike the base :class:`logging.Handler` class, application code may +instantiate formatter classes, although you could likely subclass the formatter +if your application needs special behavior. The constructor takes two +optional arguments -- a message format string and a date format string. + +.. method:: logging.Formatter.__init__(fmt=None, datefmt=None) + +If there is no message format string, the default is to use the +raw message. If there is no date format string, the default date format is:: + + %Y-%m-%d %H:%M:%S + +with the milliseconds tacked on at the end. + +The message format string uses ``%()s`` styled string +substitution; the possible keys are documented in :ref:`logrecord-attributes`. 
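As a rough illustration of the two constructor arguments, the sketch below formats a hand-built record. The helper name ``format_sample_record``, the record's contents and the creation time of ``0.0`` are assumptions chosen for the demonstration; the ``converter`` attribute is set to :func:`time.gmtime` only so that the result does not depend on the local time zone.

```python
import logging
import time


def format_sample_record():
    """Format one hand-built record with an explicit fmt and datefmt."""
    formatter = logging.Formatter(
        fmt='%(asctime)s %(name)s %(levelname)s: %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S')
    # Assumption for reproducibility: render times in UTC rather than local time.
    formatter.converter = time.gmtime

    record = logging.LogRecord(
        name='demo', level=logging.WARNING, pathname='demo.py', lineno=1,
        msg='disk space %s', args=('low',), exc_info=None)
    record.created = 0.0  # pin the creation time to the Unix epoch
    return formatter.format(record)


if __name__ == '__main__':
    print(format_sample_record())
```

Because a *datefmt* is supplied, no milliseconds are appended to the time; they appear only with the default date format.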
+ +The following message format string will log the time in a human-readable +format, the severity of the message, and the contents of the message, in that +order:: + + '%(asctime)s - %(levelname)s - %(message)s' + +Formatters use a user-configurable function to convert the creation time of a +record to a tuple. By default, :func:`time.localtime` is used; to change this +for a particular formatter instance, set the ``converter`` attribute of the +instance to a function with the same signature as :func:`time.localtime` or +:func:`time.gmtime`. To change it for all formatters, for example if you want +all logging times to be shown in GMT, set the ``converter`` attribute in the +Formatter class (to ``time.gmtime`` for GMT display). + + +Configuring Logging +^^^^^^^^^^^^^^^^^^^ + +.. currentmodule:: logging.config + +Programmers can configure logging in three ways: + +1. Creating loggers, handlers, and formatters explicitly using Python + code that calls the configuration methods listed above. +2. Creating a logging config file and reading it using the :func:`fileConfig` + function. +3. Creating a dictionary of configuration information and passing it + to the :func:`dictConfig` function. + +For the reference documentation on the last two options, see +:ref:`logging-config-api`. 
The following example configures a very simple +logger, a console handler, and a simple formatter using Python code:: + + import logging + + # create logger + logger = logging.getLogger('simple_example') + logger.setLevel(logging.DEBUG) + + # create console handler and set level to debug + ch = logging.StreamHandler() + ch.setLevel(logging.DEBUG) + + # create formatter + formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s') + + # add formatter to ch + ch.setFormatter(formatter) + + # add ch to logger + logger.addHandler(ch) + + # 'application' code + logger.debug('debug message') + logger.info('info message') + logger.warning('warn message') + logger.error('error message') + logger.critical('critical message') + +Running this module from the command line produces the following output:: + + $ python simple_logging_module.py + 2005-03-19 15:10:26,618 - simple_example - DEBUG - debug message + 2005-03-19 15:10:26,620 - simple_example - INFO - info message + 2005-03-19 15:10:26,695 - simple_example - WARNING - warn message + 2005-03-19 15:10:26,697 - simple_example - ERROR - error message + 2005-03-19 15:10:26,773 - simple_example - CRITICAL - critical message + +The following Python module creates a logger, handler, and formatter nearly +identical to those in the example listed above, with the only difference being +the names of the objects:: + + import logging + import logging.config + + logging.config.fileConfig('logging.conf') + + # create logger + logger = logging.getLogger('simpleExample') + + # 'application' code + logger.debug('debug message') + logger.info('info message') + logger.warning('warn message') + logger.error('error message') + logger.critical('critical message') + +Here is the logging.conf file:: + + [loggers] + keys=root,simpleExample + + [handlers] + keys=consoleHandler + + [formatters] + keys=simpleFormatter + + [logger_root] + level=DEBUG + handlers=consoleHandler + + [logger_simpleExample] + level=DEBUG + 
handlers=consoleHandler + qualname=simpleExample + propagate=0 + + [handler_consoleHandler] + class=StreamHandler + level=DEBUG + formatter=simpleFormatter + args=(sys.stdout,) + + [formatter_simpleFormatter] + format=%(asctime)s - %(name)s - %(levelname)s - %(message)s + datefmt= + +The output is nearly identical to that of the non-config-file-based example:: + + $ python simple_logging_config.py + 2005-03-19 15:38:55,977 - simpleExample - DEBUG - debug message + 2005-03-19 15:38:55,979 - simpleExample - INFO - info message + 2005-03-19 15:38:56,054 - simpleExample - WARNING - warn message + 2005-03-19 15:38:56,055 - simpleExample - ERROR - error message + 2005-03-19 15:38:56,130 - simpleExample - CRITICAL - critical message + +You can see that the config file approach has a few advantages over the Python +code approach, mainly separation of configuration and code and the ability of +noncoders to easily modify the logging properties. + +.. currentmodule:: logging + +Note that the class names referenced in config files need to be either relative +to the logging module, or absolute values which can be resolved using normal +import mechanisms. Thus, you could use either +:class:`~logging.handlers.WatchedFileHandler` (relative to the logging module) or +``mypackage.mymodule.MyHandler`` (for a class defined in package ``mypackage`` +and module ``mymodule``, where ``mypackage`` is available on the Python import +path). + +In Python 2.7, a new means of configuring logging has been introduced, using +dictionaries to hold configuration information. This provides a superset of the +functionality of the config-file-based approach outlined above, and is the +recommended configuration method for new applications and deployments. Because +a Python dictionary is used to hold configuration information, and since you +can populate that dictionary using different means, you have more options for +configuration. 
For example, you can use a configuration file in JSON format, +or, if you have access to YAML processing functionality, a file in YAML +format, to populate the configuration dictionary. Or, of course, you can +construct the dictionary in Python code, receive it in pickled form over a +socket, or use whatever approach makes sense for your application. + +Here's an example of the same configuration as above, in YAML format for +the new dictionary-based approach:: + + version: 1 + formatters: + simple: + format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s' + handlers: + console: + class: logging.StreamHandler + level: DEBUG + formatter: simple + stream: ext://sys.stdout + loggers: + simpleExample: + level: DEBUG + handlers: [console] + propagate: no + root: + level: DEBUG + handlers: [console] + +For more information about logging using a dictionary, see +:ref:`logging-config-api`. + +What happens if no configuration is provided +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If no logging configuration is provided, it is possible to have a situation +where a logging event needs to be output, but no handlers can be found to +output the event. The behaviour of the logging package in these +circumstances is dependent on the Python version. + +For Python 2.x, the behaviour is as follows: + +* If *logging.raiseExceptions* is *False* (production mode), the event is + silently dropped. + +* If *logging.raiseExceptions* is *True* (development mode), a message + 'No handlers could be found for logger X.Y.Z' is printed once. + +.. _library-config: + +Configuring Logging for a Library +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +When developing a library which uses logging, you should take care to +document how the library uses logging - for example, the names of loggers +used. Some consideration also needs to be given to its logging configuration.
+If the using application does not use logging, and library code makes logging +calls, then (as described in the previous section) events of severity +``WARNING`` and greater will be printed to ``sys.stderr``. This is regarded as +the best default behaviour. + +If for some reason you *don't* want these messages printed in the absence of +any logging configuration, you can attach a do-nothing handler to the top-level +logger for your library. This avoids the message being printed, since a handler +will always be found for the library's events: it just doesn't produce any +output. If the library user configures logging for application use, presumably +that configuration will add some handlers, and if levels are suitably +configured then logging calls made in library code will send output to those +handlers, as normal. + +A do-nothing handler is included in the logging package: +:class:`~logging.NullHandler` (since Python 2.7). An instance of this handler +could be added to the top-level logger of the logging namespace used by the +library (*if* you want to prevent your library's logged events being output to +``sys.stderr`` in the absence of logging configuration). If all logging by a +library *foo* is done using loggers with names matching 'foo.x', 'foo.x.y', +etc. then the code:: + + import logging + logging.getLogger('foo').addHandler(logging.NullHandler()) + +should have the desired effect. If an organisation produces a number of +libraries, then the logger name specified can be 'orgname.foo' rather than +just 'foo'. + +**PLEASE NOTE:** It is strongly advised that you *do not add any handlers other +than* :class:`~logging.NullHandler` *to your library's loggers*. This is +because the configuration of handlers is the prerogative of the application +developer who uses your library.
The application developer knows their target +audience and what handlers are most appropriate for their application: if you +add handlers 'under the hood', you might well interfere with their ability to +carry out unit tests and deliver logs which suit their requirements. + + +Logging Levels +-------------- + +The numeric values of logging levels are given in the following table. These are +primarily of interest if you want to define your own levels, and need them to +have specific values relative to the predefined levels. If you define a level +with the same numeric value, it overwrites the predefined value; the predefined +name is lost. + ++--------------+---------------+ +| Level | Numeric value | ++==============+===============+ +| ``CRITICAL`` | 50 | ++--------------+---------------+ +| ``ERROR`` | 40 | ++--------------+---------------+ +| ``WARNING`` | 30 | ++--------------+---------------+ +| ``INFO`` | 20 | ++--------------+---------------+ +| ``DEBUG`` | 10 | ++--------------+---------------+ +| ``NOTSET`` | 0 | ++--------------+---------------+ + +Levels can also be associated with loggers, being set either by the developer or +through loading a saved logging configuration. When a logging method is called +on a logger, the logger compares its own level with the level associated with +the method call. If the logger's level is higher than the method call's, no +logging message is actually generated. This is the basic mechanism controlling +the verbosity of logging output. + +Logging messages are encoded as instances of the :class:`~logging.LogRecord` +class. When a logger decides to actually log an event, a +:class:`~logging.LogRecord` instance is created from the logging message. + +Logging messages are subjected to a dispatch mechanism through the use of +:dfn:`handlers`, which are instances of subclasses of the :class:`Handler` +class. 
Handlers are responsible for ensuring that a logged message (in the form +of a :class:`LogRecord`) ends up in a particular location (or set of locations) +which is useful for the target audience for that message (such as end users, +support desk staff, system administrators, developers). Handlers are passed +:class:`LogRecord` instances intended for particular destinations. Each logger +can have zero, one or more handlers associated with it (via the +:meth:`~Logger.addHandler` method of :class:`Logger`). In addition to any +handlers directly associated with a logger, *all handlers associated with all +ancestors of the logger* are called to dispatch the message (unless the +*propagate* flag for a logger is set to a false value, at which point the +passing to ancestor handlers stops). + +Just as for loggers, handlers can have levels associated with them. A handler's +level acts as a filter in the same way as a logger's level does. If a handler +decides to actually dispatch an event, the :meth:`~Handler.emit` method is used +to send the message to its destination. Most user-defined subclasses of +:class:`Handler` will need to override this :meth:`~Handler.emit`. + +.. _custom-levels: + +Custom Levels +^^^^^^^^^^^^^ + +Defining your own levels is possible, but should not be necessary, as the +existing levels have been chosen on the basis of practical experience. +However, if you are convinced that you need custom levels, great care should +be exercised when doing this, and it is possibly *a very bad idea to define +custom levels if you are developing a library*. That's because if multiple +library authors all define their own custom levels, there is a chance that +the logging output from such multiple libraries used together will be +difficult for the using developer to control and/or interpret, because a +given numeric value might mean different things for different libraries. + +.. 
_useful-handlers: + +Useful Handlers +--------------- + +In addition to the base :class:`Handler` class, many useful subclasses are +provided: + +#. :class:`StreamHandler` instances send messages to streams (file-like + objects). + +#. :class:`FileHandler` instances send messages to disk files. + +#. :class:`~handlers.BaseRotatingHandler` is the base class for handlers that + rotate log files at a certain point. It is not meant to be instantiated + directly. Instead, use :class:`~handlers.RotatingFileHandler` or + :class:`~handlers.TimedRotatingFileHandler`. + +#. :class:`~handlers.RotatingFileHandler` instances send messages to disk + files, with support for maximum log file sizes and log file rotation. + +#. :class:`~handlers.TimedRotatingFileHandler` instances send messages to + disk files, rotating the log file at certain timed intervals. + +#. :class:`~handlers.SocketHandler` instances send messages to TCP/IP + sockets. + +#. :class:`~handlers.DatagramHandler` instances send messages to UDP + sockets. + +#. :class:`~handlers.SMTPHandler` instances send messages to a designated + email address. + +#. :class:`~handlers.SysLogHandler` instances send messages to a Unix + syslog daemon, possibly on a remote machine. + +#. :class:`~handlers.NTEventLogHandler` instances send messages to a + Windows NT/2000/XP event log. + +#. :class:`~handlers.MemoryHandler` instances send messages to a buffer + in memory, which is flushed whenever specific criteria are met. + +#. :class:`~handlers.HTTPHandler` instances send messages to an HTTP + server using either ``GET`` or ``POST`` semantics. + +#. :class:`~handlers.WatchedFileHandler` instances watch the file they are + logging to. If the file changes, it is closed and reopened using the file + name. This handler is only useful on Unix-like systems; Windows does not + support the underlying mechanism used. + +#. :class:`NullHandler` instances do nothing with error messages. 
They are used + by library developers who want to use logging, but want to avoid the 'No + handlers could be found for logger XXX' message which can be displayed if + the library user has not configured logging. See :ref:`library-config` for + more information. + +.. versionadded:: 2.7 + The :class:`NullHandler` class. + +The :class:`NullHandler`, :class:`StreamHandler` and :class:`FileHandler` +classes are defined in the core logging package. The other handlers are +defined in a sub-module, :mod:`logging.handlers`. (There is also another +sub-module, :mod:`logging.config`, for configuration functionality.) + +Logged messages are formatted for presentation through instances of the +:class:`Formatter` class. They are initialized with a format string suitable for +use with the % operator and a dictionary. + +For formatting multiple messages in a batch, instances of +:class:`BufferingFormatter` can be used. In addition to the format string (which +is applied to each message in the batch), there is provision for header and +trailer format strings. + +When filtering based on logger level and/or handler level is not enough, +instances of :class:`Filter` can be added to both :class:`Logger` and +:class:`Handler` instances (through their :meth:`addFilter` method). Before +deciding to process a message further, both loggers and handlers consult all +their filters for permission. If any filter returns a false value, the message +is not processed further. + +The basic :class:`Filter` functionality allows filtering by specific logger +name. If this feature is used, messages sent to the named logger and its +children are allowed through the filter, and all others dropped. + + +.. _logging-exceptions: + +Exceptions raised during logging +-------------------------------- + +The logging package is designed to swallow exceptions which occur while logging +in production.
This is so that errors which occur while handling logging events +- such as logging misconfiguration, network or other similar errors - do not +cause the application using logging to terminate prematurely. + +:class:`SystemExit` and :class:`KeyboardInterrupt` exceptions are never +swallowed. Other exceptions which occur during the :meth:`emit` method of a +:class:`Handler` subclass are passed to its :meth:`handleError` method. + +The default implementation of :meth:`handleError` in :class:`Handler` checks +to see if a module-level variable, :data:`raiseExceptions`, is set. If set, a +traceback is printed to :data:`sys.stderr`. If not set, the exception is swallowed. + +**Note:** The default value of :data:`raiseExceptions` is ``True``. This is because +during development, you typically want to be notified of any exceptions that +occur. It's advised that you set :data:`raiseExceptions` to ``False`` for production +usage. + +.. currentmodule:: logging + +.. _arbitrary-object-messages: + +Using arbitrary objects as messages +----------------------------------- + +In the preceding sections and examples, it has been assumed that the message +passed when logging the event is a string. However, this is not the only +possibility. You can pass an arbitrary object as a message, and its +:meth:`__str__` method will be called when the logging system needs to convert +it to a string representation. In fact, if you want to, you can avoid +computing a string representation altogether - for example, the +:class:`SocketHandler` emits an event by pickling it and sending it over the +wire. + + +Optimization +------------ + +Formatting of message arguments is deferred until it cannot be avoided. +However, computing the arguments passed to the logging method can also be +expensive, and you may want to avoid doing it if the logger will just throw +away your event. 
To decide what to do, you can call the :meth:`isEnabledFor` +method which takes a level argument and returns true if the event would be +created by the Logger for that level of call. You can write code like this:: + + if logger.isEnabledFor(logging.DEBUG): + logger.debug('Message with %s, %s', expensive_func1(), + expensive_func2()) + +so that if the logger's threshold is set above ``DEBUG``, the calls to +:func:`expensive_func1` and :func:`expensive_func2` are never made. + +There are other optimizations which can be made for specific applications which +need more precise control over what logging information is collected. Here's a +list of things you can do to avoid processing during logging which you don't +need: + ++-----------------------------------------------+----------------------------------------+ +| What you don't want to collect | How to avoid collecting it | ++===============================================+========================================+ +| Information about where calls were made from. | Set ``logging._srcfile`` to ``None``. | ++-----------------------------------------------+----------------------------------------+ +| Threading information. | Set ``logging.logThreads`` to ``0``. | ++-----------------------------------------------+----------------------------------------+ +| Process information. | Set ``logging.logProcesses`` to ``0``. | ++-----------------------------------------------+----------------------------------------+ + +Also note that the core logging module only includes the basic handlers. If +you don't import :mod:`logging.handlers` and :mod:`logging.config`, they won't +take up any memory. + +.. seealso:: + + Module :mod:`logging` + API reference for the logging module. + + Module :mod:`logging.config` + Configuration API for the logging module. + + Module :mod:`logging.handlers` + Useful handlers included with the logging module. 
+ + :ref:`A logging cookbook ` + diff --git a/Doc/library/allos.rst b/Doc/library/allos.rst --- a/Doc/library/allos.rst +++ b/Doc/library/allos.rst @@ -20,6 +20,8 @@ optparse.rst getopt.rst logging.rst + logging.config.rst + logging.handlers.rst getpass.rst curses.rst curses.ascii.rst diff --git a/Doc/library/logging.config.rst b/Doc/library/logging.config.rst new file mode 100644 --- /dev/null +++ b/Doc/library/logging.config.rst @@ -0,0 +1,673 @@ +:mod:`logging.config` --- Logging configuration +=============================================== + +.. module:: logging.config + :synopsis: Configuration of the logging module. + + +.. moduleauthor:: Vinay Sajip +.. sectionauthor:: Vinay Sajip + +.. sidebar:: Important + + This page contains only reference information. For tutorials, + please see + + * :ref:`Basic Tutorial ` + * :ref:`Advanced Tutorial ` + * :ref:`Logging Cookbook ` + +This section describes the API for configuring the logging module. + +.. _logging-config-api: + +Configuration functions +^^^^^^^^^^^^^^^^^^^^^^^ + +The following functions configure the logging module. They are located in the +:mod:`logging.config` module. Their use is optional --- you can configure the +logging module using these functions or by making calls to the main API (defined +in :mod:`logging` itself) and defining handlers which are declared either in +:mod:`logging` or :mod:`logging.handlers`. + +.. function:: dictConfig(config) + + Takes the logging configuration from a dictionary. The contents of + this dictionary are described in :ref:`logging-config-dictschema` + below. + + If an error is encountered during configuration, this function will + raise a :exc:`ValueError`, :exc:`TypeError`, :exc:`AttributeError` + or :exc:`ImportError` with a suitably descriptive message. The + following is a (possibly incomplete) list of conditions which will + raise an error: + + * A ``level`` which is not a string or which is a string not + corresponding to an actual logging level. 
+ * A ``propagate`` value which is not a boolean. + * An id which does not have a corresponding destination. + * A non-existent handler id found during an incremental call. + * An invalid logger name. + * Inability to resolve to an internal or external object. + + Parsing is performed by the :class:`DictConfigurator` class, whose + constructor is passed the dictionary used for configuration, and + has a :meth:`configure` method. The :mod:`logging.config` module + has a callable attribute :attr:`dictConfigClass` + which is initially set to :class:`DictConfigurator`. + You can replace the value of :attr:`dictConfigClass` with a + suitable implementation of your own. + + :func:`dictConfig` calls :attr:`dictConfigClass` passing + the specified dictionary, and then calls the :meth:`configure` method on + the returned object to put the configuration into effect:: + + def dictConfig(config): + dictConfigClass(config).configure() + + For example, a subclass of :class:`DictConfigurator` could call + ``DictConfigurator.__init__()`` in its own :meth:`__init__()`, then + set up custom prefixes which would be usable in the subsequent + :meth:`configure` call. :attr:`dictConfigClass` would be bound to + this new subclass, and then :func:`dictConfig` could be called exactly as + in the default, uncustomized state. + + .. versionadded:: 2.7 + +.. function:: fileConfig(fname[, defaults]) + + Reads the logging configuration from a :mod:`configparser`\-format file named + *fname*. This function can be called several times from an application, + allowing an end user to select from various pre-canned + configurations (if the developer provides a mechanism to present the choices + and load the chosen configuration). Defaults to be passed to the ConfigParser + can be specified in the *defaults* argument. + + +.. function:: listen(port=DEFAULT_LOGGING_CONFIG_PORT) + + Starts up a socket server on the specified port, and listens for new + configurations. 
If no port is specified, the module's default + :const:`DEFAULT_LOGGING_CONFIG_PORT` is used. Logging configurations will be + sent as a file suitable for processing by :func:`fileConfig`. Returns a + :class:`Thread` instance on which you can call :meth:`start` to start the + server, and which you can :meth:`join` when appropriate. To stop the server, + call :func:`stopListening`. + + To send a configuration to the socket, read in the configuration file and + send it to the socket as a string of bytes preceded by a four-byte length + string packed in binary using ``struct.pack('>L', n)``. + + +.. function:: stopListening() + + Stops the listening server which was created with a call to :func:`listen`. + This is typically called before calling :meth:`join` on the return value from + :func:`listen`. + + +.. _logging-config-dictschema: + +Configuration dictionary schema +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Describing a logging configuration requires listing the various +objects to create and the connections between them; for example, you +may create a handler named 'console' and then say that the logger +named 'startup' will send its messages to the 'console' handler. +These objects aren't limited to those provided by the :mod:`logging` +module because you might write your own formatter or handler class. +The parameters to these classes may also need to include external +objects such as ``sys.stderr``. The syntax for describing these +objects and connections is defined in :ref:`logging-config-dict-connections` +below. + +Dictionary Schema Details +""""""""""""""""""""""""" + +The dictionary passed to :func:`dictConfig` must contain the following +keys: + +* *version* - to be set to an integer value representing the schema + version. The only valid value at present is 1, but having this key + allows the schema to evolve while still preserving backwards + compatibility. + +All other keys are optional, but if present they will be interpreted +as described below. 
In all cases below where a 'configuring dict' is +mentioned, it will be checked for the special ``'()'`` key to see if a +custom instantiation is required. If so, the mechanism described in +:ref:`logging-config-dict-userdef` below is used to create an instance; +otherwise, the context is used to determine what to instantiate. + +* *formatters* - the corresponding value will be a dict in which each + key is a formatter id and each value is a dict describing how to + configure the corresponding Formatter instance. + + The configuring dict is searched for keys ``format`` and ``datefmt`` + (with defaults of ``None``) and these are used to construct a + :class:`logging.Formatter` instance. + +* *filters* - the corresponding value will be a dict in which each key + is a filter id and each value is a dict describing how to configure + the corresponding Filter instance. + + The configuring dict is searched for the key ``name`` (defaulting to the + empty string) and this is used to construct a :class:`logging.Filter` + instance. + +* *handlers* - the corresponding value will be a dict in which each + key is a handler id and each value is a dict describing how to + configure the corresponding Handler instance. + + The configuring dict is searched for the following keys: + + * ``class`` (mandatory). This is the fully qualified name of the + handler class. + + * ``level`` (optional). The level of the handler. + + * ``formatter`` (optional). The id of the formatter for this + handler. + + * ``filters`` (optional). A list of ids of the filters for this + handler. + + All *other* keys are passed through as keyword arguments to the + handler's constructor. 
For example, given the snippet:: + + handlers: + console: + class : logging.StreamHandler + formatter: brief + level : INFO + filters: [allow_foo] + stream : ext://sys.stdout + file: + class : logging.handlers.RotatingFileHandler + formatter: precise + filename: logconfig.log + maxBytes: 1024 + backupCount: 3 + + the handler with id ``console`` is instantiated as a + :class:`logging.StreamHandler`, using ``sys.stdout`` as the underlying + stream. The handler with id ``file`` is instantiated as a + :class:`logging.handlers.RotatingFileHandler` with the keyword arguments + ``filename='logconfig.log', maxBytes=1024, backupCount=3``. + +* *loggers* - the corresponding value will be a dict in which each key + is a logger name and each value is a dict describing how to + configure the corresponding Logger instance. + + The configuring dict is searched for the following keys: + + * ``level`` (optional). The level of the logger. + + * ``propagate`` (optional). The propagation setting of the logger. + + * ``filters`` (optional). A list of ids of the filters for this + logger. + + * ``handlers`` (optional). A list of ids of the handlers for this + logger. + + The specified loggers will be configured according to the level, + propagation, filters and handlers specified. + +* *root* - this will be the configuration for the root logger. + Processing of the configuration will be as for any logger, except + that the ``propagate`` setting will not be applicable. + +* *incremental* - whether the configuration is to be interpreted as + incremental to the existing configuration. This value defaults to + ``False``, which means that the specified configuration replaces the + existing configuration with the same semantics as used by the + existing :func:`fileConfig` API. + + If the specified value is ``True``, the configuration is processed + as described in the section on :ref:`logging-config-dict-incremental`. 
+ +* *disable_existing_loggers* - whether any existing loggers are to be + disabled. This setting mirrors the parameter of the same name in + :func:`fileConfig`. If absent, this parameter defaults to ``True``. + This value is ignored if *incremental* is ``True``. + +.. _logging-config-dict-incremental: + +Incremental Configuration +""""""""""""""""""""""""" + +It is difficult to provide complete flexibility for incremental +configuration. For example, because objects such as filters +and formatters are anonymous, once a configuration is set up, it is +not possible to refer to such anonymous objects when augmenting a +configuration. + +Furthermore, there is not a compelling case for arbitrarily altering +the object graph of loggers, handlers, filters, formatters at +run-time, once a configuration is set up; the verbosity of loggers and +handlers can be controlled just by setting levels (and, in the case of +loggers, propagation flags). Changing the object graph arbitrarily in +a safe way is problematic in a multi-threaded environment; while not +impossible, the benefits are not worth the complexity it adds to the +implementation. + +Thus, when the ``incremental`` key of a configuration dict is present +and is ``True``, the system will completely ignore any ``formatters`` and +``filters`` entries, and process only the ``level`` +settings in the ``handlers`` entries, and the ``level`` and +``propagate`` settings in the ``loggers`` and ``root`` entries. + +Using a value in the configuration dict lets configurations be sent +over the wire as pickled dicts to a socket listener. Thus, the logging +verbosity of a long-running application can be altered over time with +no need to stop and restart the application. + +.. _logging-config-dict-connections: + +Object connections +"""""""""""""""""" + +The schema describes a set of logging objects - loggers, +handlers, formatters, filters - which are connected to each other in +an object graph.
Thus, the schema needs to represent connections +between the objects. For example, say that, once configured, a +particular logger has attached to it a particular handler. For the +purposes of this discussion, we can say that the logger represents the +source, and the handler the destination, of a connection between the +two. Of course in the configured objects this is represented by the +logger holding a reference to the handler. In the configuration dict, +this is done by giving each destination object an id which identifies +it unambiguously, and then using the id in the source object's +configuration to indicate that a connection exists between the source +and the destination object with that id. + +So, for example, consider the following YAML snippet:: + + formatters: + brief: + # configuration for formatter with id 'brief' goes here + precise: + # configuration for formatter with id 'precise' goes here + handlers: + h1: #This is an id + # configuration of handler with id 'h1' goes here + formatter: brief + h2: #This is another id + # configuration of handler with id 'h2' goes here + formatter: precise + loggers: + foo.bar.baz: + # other configuration for logger 'foo.bar.baz' + handlers: [h1, h2] + +(Note: YAML is used here because it's a little more readable than the +equivalent Python source form for the dictionary.) + +The ids for loggers are the logger names which would be used +programmatically to obtain a reference to those loggers, e.g. +``foo.bar.baz``. The ids for Formatters and Filters can be any string +value (such as ``brief``, ``precise`` above) and they are transient, +in that they are only meaningful for processing the configuration +dictionary and used to determine connections between objects, and are +not persisted anywhere when the configuration call is complete. + +The above snippet indicates that the logger named ``foo.bar.baz`` should +have two handlers attached to it, which are described by the handler +ids ``h1`` and ``h2``.
The formatter for ``h1`` is that described by id +``brief``, and the formatter for ``h2`` is that described by id +``precise``. + + +.. _logging-config-dict-userdef: + +User-defined objects +"""""""""""""""""""" + +The schema supports user-defined objects for handlers, filters and +formatters. (Loggers do not need to have different types for +different instances, so there is no support in this configuration +schema for user-defined logger classes.) + +Objects to be configured are described by dictionaries +which detail their configuration. In some places, the logging system +will be able to infer from the context how an object is to be +instantiated, but when a user-defined object is to be instantiated, +the system will not know how to do this. In order to provide complete +flexibility for user-defined object instantiation, the user needs +to provide a 'factory' - a callable which is called with a +configuration dictionary and which returns the instantiated object. +This is signalled by an absolute import path to the factory being +made available under the special key ``'()'``. Here's a concrete +example:: + + formatters: + brief: + format: '%(message)s' + default: + format: '%(asctime)s %(levelname)-8s %(name)-15s %(message)s' + datefmt: '%Y-%m-%d %H:%M:%S' + custom: + (): my.package.customFormatterFactory + bar: baz + spam: 99.9 + answer: 42 + +The above YAML snippet defines three formatters. The first, with id +``brief``, is a standard :class:`logging.Formatter` instance with the +specified format string. The second, with id ``default``, has a +longer format and also defines the time format explicitly, and will +result in a :class:`logging.Formatter` initialized with those two format +strings. 
Shown in Python source form, the ``brief`` and ``default`` +formatters have configuration sub-dictionaries:: + + { + 'format' : '%(message)s' + } + +and:: + + { + 'format' : '%(asctime)s %(levelname)-8s %(name)-15s %(message)s', + 'datefmt' : '%Y-%m-%d %H:%M:%S' + } + +respectively, and as these dictionaries do not contain the special key +``'()'``, the instantiation is inferred from the context: as a result, +standard :class:`logging.Formatter` instances are created. The +configuration sub-dictionary for the third formatter, with id +``custom``, is:: + + { + '()' : 'my.package.customFormatterFactory', + 'bar' : 'baz', + 'spam' : 99.9, + 'answer' : 42 + } + +and this contains the special key ``'()'``, which means that +user-defined instantiation is wanted. In this case, the specified +factory callable will be used. If it is an actual callable it will be +used directly - otherwise, if you specify a string (as in the example) +the actual callable will be located using normal import mechanisms. +The callable will be called with the **remaining** items in the +configuration sub-dictionary as keyword arguments. In the above +example, the formatter with id ``custom`` will be assumed to be +returned by the call:: + + my.package.customFormatterFactory(bar='baz', spam=99.9, answer=42) + +The key ``'()'`` has been used as the special key because it is not a +valid keyword parameter name, and so will not clash with the names of +the keyword arguments used in the call. The ``'()'`` also serves as a +mnemonic that the corresponding value is a callable. + + +.. _logging-config-dict-externalobj: + +Access to external objects +"""""""""""""""""""""""""" + +There are times where a configuration needs to refer to objects +external to the configuration, for example ``sys.stderr``. If the +configuration dict is constructed using Python code, this is +straightforward, but a problem arises when the configuration is +provided via a text file (e.g. JSON, YAML). 
In a text file, there is +no standard way to distinguish ``sys.stderr`` from the literal string +``'sys.stderr'``. To facilitate this distinction, the configuration +system looks for certain special prefixes in string values and +treats them specially. For example, if the literal string +``'ext://sys.stderr'`` is provided as a value in the configuration, +then the ``ext://`` will be stripped off and the remainder of the +value processed using normal import mechanisms. + +The handling of such prefixes is done in a way analogous to protocol +handling: there is a generic mechanism to look for prefixes which +match the regular expression ``^(?P<prefix>[a-z]+)://(?P<suffix>.*)$`` +whereby, if the ``prefix`` is recognised, the ``suffix`` is processed +in a prefix-dependent manner and the result of the processing replaces +the string value. If the prefix is not recognised, then the string +value will be left as-is. + + +.. _logging-config-dict-internalobj: + +Access to internal objects +"""""""""""""""""""""""""" + +As well as external objects, there is sometimes also a need to refer +to objects in the configuration. This will be done implicitly by the +configuration system for things that it knows about. For example, the +string value ``'DEBUG'`` for a ``level`` in a logger or handler will +automatically be converted to the value ``logging.DEBUG``, and the +``handlers``, ``filters`` and ``formatter`` entries will take an +object id and resolve to the appropriate destination object. + +However, a more generic mechanism is needed for user-defined +objects which are not known to the :mod:`logging` module. For +example, consider :class:`logging.handlers.MemoryHandler`, which takes +a ``target`` argument which is another handler to delegate to. Since +the system already knows about this class, then in the configuration, +the given ``target`` just needs to be the object id of the relevant +target handler, and the system will resolve to the handler from the +id.
If, however, a user defines a ``my.package.MyHandler`` which has +an ``alternate`` handler, the configuration system would not know that +the ``alternate`` referred to a handler. To cater for this, a generic +resolution system allows the user to specify:: + + handlers: + file: + # configuration of file handler goes here + + custom: + (): my.package.MyHandler + alternate: cfg://handlers.file + +The literal string ``'cfg://handlers.file'`` will be resolved in an +analogous way to strings with the ``ext://`` prefix, but looking +in the configuration itself rather than the import namespace. The +mechanism allows access by dot or by index, in a similar way to +that provided by ``str.format``. Thus, given the following snippet:: + + handlers: + email: + class: logging.handlers.SMTPHandler + mailhost: localhost + fromaddr: my_app@domain.tld + toaddrs: + - support_team@domain.tld + - dev_team@domain.tld + subject: Houston, we have a problem. + +in the configuration, the string ``'cfg://handlers'`` would resolve to +the dict with key ``handlers``, the string ``'cfg://handlers.email'`` +would resolve to the dict with key ``email`` in the ``handlers`` dict, +and so on. The string ``'cfg://handlers.email.toaddrs[1]'`` would +resolve to ``'dev_team@domain.tld'`` and the string +``'cfg://handlers.email.toaddrs[0]'`` would resolve to the value +``'support_team@domain.tld'``. The ``subject`` value could be accessed +using either ``'cfg://handlers.email.subject'`` or, equivalently, +``'cfg://handlers.email[subject]'``. The latter form only needs to be +used if the key contains spaces or non-alphanumeric characters. If an +index value consists only of decimal digits, access will be attempted +using the corresponding integer value, falling back to the string +value if needed. + +Given a string ``cfg://handlers.myhandler.mykey.123``, this will +resolve to ``config_dict['handlers']['myhandler']['mykey']['123']``.
+If the string is specified as ``cfg://handlers.myhandler.mykey[123]``, +the system will attempt to retrieve the value from +``config_dict['handlers']['myhandler']['mykey'][123]``, and fall back +to ``config_dict['handlers']['myhandler']['mykey']['123']`` if that +fails. + +.. _logging-config-fileformat: + +Configuration file format +^^^^^^^^^^^^^^^^^^^^^^^^^ + +The configuration file format understood by :func:`fileConfig` is based on +:mod:`configparser` functionality. The file must contain sections called +``[loggers]``, ``[handlers]`` and ``[formatters]`` which identify by name the +entities of each type which are defined in the file. For each such entity, there +is a separate section which identifies how that entity is configured. Thus, for +a logger named ``log01`` in the ``[loggers]`` section, the relevant +configuration details are held in a section ``[logger_log01]``. Similarly, a +handler called ``hand01`` in the ``[handlers]`` section will have its +configuration held in a section called ``[handler_hand01]``, while a formatter +called ``form01`` in the ``[formatters]`` section will have its configuration +specified in a section called ``[formatter_form01]``. The root logger +configuration must be specified in a section called ``[logger_root]``. + +Examples of these sections in the file are given below. :: + + [loggers] + keys=root,log02,log03,log04,log05,log06,log07 + + [handlers] + keys=hand01,hand02,hand03,hand04,hand05,hand06,hand07,hand08,hand09 + + [formatters] + keys=form01,form02,form03,form04,form05,form06,form07,form08,form09 + +The root logger must specify a level and a list of handlers. An example of a +root logger section is given below. :: + + [logger_root] + level=NOTSET + handlers=hand01 + +The ``level`` entry can be one of ``DEBUG, INFO, WARNING, ERROR, CRITICAL`` or +``NOTSET``. For the root logger only, ``NOTSET`` means that all messages will be +logged. 
Level values are :func:`eval`\ uated in the context of the ``logging`` +package's namespace. + +The ``handlers`` entry is a comma-separated list of handler names. These names +must appear in the ``[handlers]`` section and have corresponding sections in the +configuration file. + +For loggers other than the root logger, some additional information is required. +This is illustrated by the following example. :: + + [logger_parser] + level=DEBUG + handlers=hand01 + propagate=1 + qualname=compiler.parser + +The ``level`` and ``handlers`` entries are interpreted as for the root logger, +except that if a non-root logger's level is specified as ``NOTSET``, the system +consults loggers higher up the hierarchy to determine the effective level of the +logger. The ``propagate`` entry is set to 1 to indicate that messages must +propagate to handlers higher up the logger hierarchy from this logger, or 0 to +indicate that messages are **not** propagated to handlers up the hierarchy. The +``qualname`` entry is the hierarchical channel name of the logger, that is to +say the name used by the application to get the logger. + +Sections which specify handler configuration are exemplified by the following. +:: + + [handler_hand01] + class=StreamHandler + level=NOTSET + formatter=form01 + args=(sys.stdout,) + +The ``class`` entry indicates the handler's class (as determined by :func:`eval` +in the ``logging`` package's namespace). The ``level`` is interpreted as for +loggers, and ``NOTSET`` is taken to mean 'log everything'. + +.. versionchanged:: 2.6 + Added support for resolving the handler's class as a dotted module and + class name. + +The ``formatter`` entry indicates the key name of the formatter for this +handler. If blank, a default formatter (``logging._defaultFormatter``) is used. +If a name is specified, it must appear in the ``[formatters]`` section and have +a corresponding section in the configuration file.
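As an illustration of how the sections described so far fit together, here is a minimal sketch (not part of the original documentation) that writes a small configuration file and loads it with :func:`fileConfig`. The temporary file, the section contents and the choice of ``propagate=0`` (made only so that each message is printed once rather than a second time via the root logger) are assumptions for this example.

```python
import logging
import logging.config
import os
import tempfile

# Illustrative configuration: one root logger, one named logger,
# one handler and one formatter, using the section naming rules
# described above ([logger_<name>], [handler_<name>], [formatter_<name>]).
CONFIG_TEXT = """\
[loggers]
keys=root,parser

[handlers]
keys=hand01

[formatters]
keys=form01

[logger_root]
level=NOTSET
handlers=hand01

[logger_parser]
level=DEBUG
handlers=hand01
propagate=0
qualname=compiler.parser

[handler_hand01]
class=StreamHandler
level=NOTSET
formatter=form01
args=(sys.stdout,)

[formatter_form01]
format=F1 %(asctime)s %(levelname)s %(message)s
datefmt=
class=logging.Formatter
"""

# Write the configuration to a temporary file (hypothetical location).
fd, config_path = tempfile.mkstemp(suffix=".ini")
with os.fdopen(fd, "w") as config_file:
    config_file.write(CONFIG_TEXT)

try:
    # Keep loggers created elsewhere enabled while loading this file.
    logging.config.fileConfig(config_path, disable_existing_loggers=False)
finally:
    os.remove(config_path)

# The qualname entry is the name the application uses to get the logger.
parser_logger = logging.getLogger("compiler.parser")
parser_logger.debug("parsed %d tokens", 3)
```

The logger is retrieved by its ``qualname`` (``compiler.parser``), not by the key ``parser`` used in the ``[loggers]`` section.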
+ +The ``args`` entry, when :func:`eval`\ uated in the context of the ``logging`` +package's namespace, is the list of arguments to the constructor for the handler +class. Refer to the constructors for the relevant handlers, or to the examples +below, to see how typical entries are constructed. :: + + [handler_hand02] + class=FileHandler + level=DEBUG + formatter=form02 + args=('python.log', 'w') + + [handler_hand03] + class=handlers.SocketHandler + level=INFO + formatter=form03 + args=('localhost', handlers.DEFAULT_TCP_LOGGING_PORT) + + [handler_hand04] + class=handlers.DatagramHandler + level=WARN + formatter=form04 + args=('localhost', handlers.DEFAULT_UDP_LOGGING_PORT) + + [handler_hand05] + class=handlers.SysLogHandler + level=ERROR + formatter=form05 + args=(('localhost', handlers.SYSLOG_UDP_PORT), handlers.SysLogHandler.LOG_USER) + + [handler_hand06] + class=handlers.NTEventLogHandler + level=CRITICAL + formatter=form06 + args=('Python Application', '', 'Application') + + [handler_hand07] + class=handlers.SMTPHandler + level=WARN + formatter=form07 + args=('localhost', 'from@abc', ['user1@abc', 'user2@xyz'], 'Logger Subject') + + [handler_hand08] + class=handlers.MemoryHandler + level=NOTSET + formatter=form08 + target= + args=(10, ERROR) + + [handler_hand09] + class=handlers.HTTPHandler + level=NOTSET + formatter=form09 + args=('localhost:9022', '/log', 'GET') + +Sections which specify formatter configuration are typified by the following. :: + + [formatter_form01] + format=F1 %(asctime)s %(levelname)s %(message)s + datefmt= + class=logging.Formatter + +The ``format`` entry is the overall format string, and the ``datefmt`` entry is +the :func:`strftime`\ -compatible date/time format string. If empty, the +package substitutes ISO8601 format date/times, which is almost equivalent to +specifying the date format string ``'%Y-%m-%d %H:%M:%S'``.
The ISO8601 format +also specifies milliseconds, which are appended to the result of using the above +format string, with a comma separator. An example time in ISO8601 format is +``2003-01-23 00:29:50,411``. + +The ``class`` entry is optional. It indicates the name of the formatter's class +(as a dotted module and class name.) This option is useful for instantiating a +:class:`Formatter` subclass. Subclasses of :class:`Formatter` can present +exception tracebacks in an expanded or condensed format. + +.. seealso:: + + Module :mod:`logging` + API reference for the logging module. + + Module :mod:`logging.handlers` + Useful handlers included with the logging module. + + diff --git a/Doc/library/logging.handlers.rst b/Doc/library/logging.handlers.rst new file mode 100644 --- /dev/null +++ b/Doc/library/logging.handlers.rst @@ -0,0 +1,740 @@ +:mod:`logging.handlers` --- Logging handlers +============================================ + +.. module:: logging.handlers + :synopsis: Handlers for the logging module. + + +.. moduleauthor:: Vinay Sajip +.. sectionauthor:: Vinay Sajip + +.. sidebar:: Important + + This page contains only reference information. For tutorials, + please see + + * :ref:`Basic Tutorial ` + * :ref:`Advanced Tutorial ` + * :ref:`Logging Cookbook ` + +.. currentmodule:: logging + +The following useful handlers are provided in the package. Note that three of +the handlers (:class:`StreamHandler`, :class:`FileHandler` and +:class:`NullHandler`) are actually defined in the :mod:`logging` module itself, +but have been documented here along with the other handlers. + +.. _stream-handler: + +StreamHandler +^^^^^^^^^^^^^ + +The :class:`StreamHandler` class, located in the core :mod:`logging` package, +sends logging output to streams such as *sys.stdout*, *sys.stderr* or any +file-like object (or, more precisely, any object which supports :meth:`write` +and :meth:`flush` methods). + + +.. 
class:: StreamHandler(stream=None) + + Returns a new instance of the :class:`StreamHandler` class. If *stream* is + specified, the instance will use it for logging output; otherwise, *sys.stderr* + will be used. + + + .. method:: emit(record) + + If a formatter is specified, it is used to format the record. The record + is then written to the stream with a newline terminator. If exception + information is present, it is formatted using + :func:`traceback.print_exception` and appended to the stream. + + + .. method:: flush() + + Flushes the stream by calling its :meth:`flush` method. Note that the + :meth:`close` method is inherited from :class:`Handler` and so does + no output, so an explicit :meth:`flush` call may be needed at times. + +.. _file-handler: + +FileHandler +^^^^^^^^^^^ + +The :class:`FileHandler` class, located in the core :mod:`logging` package, +sends logging output to a disk file. It inherits the output functionality from +:class:`StreamHandler`. + + +.. class:: FileHandler(filename, mode='a', encoding=None, delay=False) + + Returns a new instance of the :class:`FileHandler` class. The specified file is + opened and used as the stream for logging. If *mode* is not specified, + :const:`'a'` is used. If *encoding* is not *None*, it is used to open the file + with that encoding. If *delay* is true, then file opening is deferred until the + first call to :meth:`emit`. By default, the file grows indefinitely. + + .. versionchanged:: 2.6 + *delay* was added. + + .. method:: close() + + Closes the file. + + + .. method:: emit(record) + + Outputs the record to the file. + + +.. _null-handler: + +NullHandler +^^^^^^^^^^^ + +.. versionadded:: 2.7 + +The :class:`NullHandler` class, located in the core :mod:`logging` package, +does not do any formatting or output. It is essentially a 'no-op' handler +for use by library developers. + +.. class:: NullHandler() + + Returns a new instance of the :class:`NullHandler` class. + + .. 
method:: emit(record) + + This method does nothing. + + .. method:: handle(record) + + This method does nothing. + + .. method:: createLock() + + This method returns ``None`` for the lock, since there is no + underlying I/O to which access needs to be serialized. + + +See :ref:`library-config` for more information on how to use +:class:`NullHandler`. + +.. _watched-file-handler: + +WatchedFileHandler +^^^^^^^^^^^^^^^^^^ + +.. currentmodule:: logging.handlers + +.. versionadded:: 2.6 + +The :class:`WatchedFileHandler` class, located in the :mod:`logging.handlers` +module, is a :class:`FileHandler` which watches the file it is logging to. If +the file changes, it is closed and reopened using the file name. + +A file change can happen because of usage of programs such as *newsyslog* and +*logrotate* which perform log file rotation. This handler, intended for use +under Unix/Linux, watches the file to see if it has changed since the last emit. +(A file is deemed to have changed if its device or inode have changed.) If the +file has changed, the old file stream is closed, and the file opened to get a +new stream. + +This handler is not appropriate for use under Windows, because under Windows +open log files cannot be moved or renamed - logging opens the files with +exclusive locks - and so there is no need for such a handler. Furthermore, +*ST_INO* is not supported under Windows; :func:`stat` always returns zero for +this value. + + +.. class:: WatchedFileHandler(filename[,mode[, encoding[, delay]]]) + + Returns a new instance of the :class:`WatchedFileHandler` class. The specified + file is opened and used as the stream for logging. If *mode* is not specified, + :const:`'a'` is used. If *encoding* is not *None*, it is used to open the file + with that encoding. If *delay* is true, then file opening is deferred until the + first call to :meth:`emit`. By default, the file grows indefinitely. + + + .. 
method:: emit(record) + + Outputs the record to the file, but first checks to see if the file has + changed. If it has, the existing stream is flushed and closed and the + file opened again, before outputting the record to the file. + +.. _rotating-file-handler: + +RotatingFileHandler +^^^^^^^^^^^^^^^^^^^ + +The :class:`RotatingFileHandler` class, located in the :mod:`logging.handlers` +module, supports rotation of disk log files. + + +.. class:: RotatingFileHandler(filename, mode='a', maxBytes=0, backupCount=0, encoding=None, delay=0) + + Returns a new instance of the :class:`RotatingFileHandler` class. The specified + file is opened and used as the stream for logging. If *mode* is not specified, + ``'a'`` is used. If *encoding* is not *None*, it is used to open the file + with that encoding. If *delay* is true, then file opening is deferred until the + first call to :meth:`emit`. By default, the file grows indefinitely. + + You can use the *maxBytes* and *backupCount* values to allow the file to + :dfn:`rollover` at a predetermined size. When the size is about to be exceeded, + the file is closed and a new file is silently opened for output. Rollover occurs + whenever the current log file is nearly *maxBytes* in length; if *maxBytes* is + zero, rollover never occurs. If *backupCount* is non-zero, the system will save + old log files by appending the extensions '.1', '.2' etc., to the filename. For + example, with a *backupCount* of 5 and a base file name of :file:`app.log`, you + would get :file:`app.log`, :file:`app.log.1`, :file:`app.log.2`, up to + :file:`app.log.5`. The file being written to is always :file:`app.log`. When + this file is filled, it is closed and renamed to :file:`app.log.1`, and if files + :file:`app.log.1`, :file:`app.log.2`, etc. exist, then they are renamed to + :file:`app.log.2`, :file:`app.log.3` etc. respectively. + + .. versionchanged:: 2.6 + *delay* was added. + + + .. method:: doRollover() + + Does a rollover, as described above. 
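The size-based rollover of :class:`RotatingFileHandler` described above can be observed with a minimal sketch like the following. The temporary directory, the logger name ``rotation_demo`` and the deliberately tiny *maxBytes* value are assumptions chosen only to force several rollovers quickly.

```python
import logging
import logging.handlers
import os
import tempfile

# Hypothetical log location: a fresh temporary directory.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

# Roll over once the file is about to reach 200 bytes; keep two backups.
handler = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=200, backupCount=2)

logger = logging.getLogger("rotation_demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger
logger.addHandler(handler)

# Write far more than 200 bytes so that rollover happens repeatedly.
for i in range(50):
    logger.info("message number %d with some padding text", i)

logger.removeHandler(handler)
handler.close()

# Only the current file and *backupCount* backups remain.
files = sorted(os.listdir(log_dir))
sizes = [os.path.getsize(os.path.join(log_dir, name)) for name in files]
print(files)
```

Older backups beyond ``app.log.2`` are discarded as new rollovers occur, so the directory never holds more than *backupCount* + 1 files.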
+ + + .. method:: emit(record) + + Outputs the record to the file, catering for rollover as described + previously. + +.. _timed-rotating-file-handler: + +TimedRotatingFileHandler +^^^^^^^^^^^^^^^^^^^^^^^^ + +The :class:`TimedRotatingFileHandler` class, located in the +:mod:`logging.handlers` module, supports rotation of disk log files at certain +timed intervals. + + +.. class:: TimedRotatingFileHandler(filename, when='h', interval=1, backupCount=0, encoding=None, delay=False, utc=False) + + Returns a new instance of the :class:`TimedRotatingFileHandler` class. The + specified file is opened and used as the stream for logging. On rotating it also + sets the filename suffix. Rotating happens based on the product of *when* and + *interval*. + + You can use the *when* to specify the type of *interval*. The list of possible + values is below. Note that they are not case sensitive. + + +----------------+-----------------------+ + | Value | Type of interval | + +================+=======================+ + | ``'S'`` | Seconds | + +----------------+-----------------------+ + | ``'M'`` | Minutes | + +----------------+-----------------------+ + | ``'H'`` | Hours | + +----------------+-----------------------+ + | ``'D'`` | Days | + +----------------+-----------------------+ + | ``'W'`` | Week day (0=Monday) | + +----------------+-----------------------+ + | ``'midnight'`` | Roll over at midnight | + +----------------+-----------------------+ + + The system will save old log files by appending extensions to the filename. + The extensions are date-and-time based, using the strftime format + ``%Y-%m-%d_%H-%M-%S`` or a leading portion thereof, depending on the + rollover interval. + + When computing the next rollover time for the first time (when the handler + is created), the last modification time of an existing log file, or else + the current time, is used to compute when the next rotation will occur. 
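To make the relationship between *when* and the backup file extension more concrete, here is a small sketch, assuming the CPython implementation, in which the handler stores the chosen strftime pattern in a ``suffix`` attribute. That attribute is an implementation detail rather than documented API, and the temporary path and parameter values are illustrative only.

```python
import logging.handlers
import os
import tempfile

# Hypothetical log location; delay=True means the file is never opened here.
log_path = os.path.join(tempfile.mkdtemp(), "timed.log")

suffixes = {}
for when in ("S", "H", "midnight"):
    handler = logging.handlers.TimedRotatingFileHandler(
        log_path, when=when, interval=1, backupCount=7,
        delay=True, utc=True)
    # 'suffix' holds the strftime pattern appended to rotated file names
    # (CPython implementation detail, used here only for inspection).
    suffixes[when] = handler.suffix
    handler.close()

for when, suffix in sorted(suffixes.items()):
    print(when, suffix)
```

Each pattern is a leading portion of ``%Y-%m-%d_%H-%M-%S``, truncated to the granularity of the rollover interval.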
+ + If the *utc* argument is true, times in UTC will be used; otherwise + local time is used. + + If *backupCount* is nonzero, at most *backupCount* files + will be kept, and if more would be created when rollover occurs, the oldest + one is deleted. The deletion logic uses the interval to determine which + files to delete, so changing the interval may leave old files lying around. + + If *delay* is true, then file opening is deferred until the first call to + :meth:`emit`. + + .. versionchanged:: 2.6 + *delay* was added. + + .. versionchanged:: 2.7 + *utc* was added. + + + .. method:: doRollover() + + Does a rollover, as described above. + + + .. method:: emit(record) + + Outputs the record to the file, catering for rollover as described above. + + +.. _socket-handler: + +SocketHandler +^^^^^^^^^^^^^ + +The :class:`SocketHandler` class, located in the :mod:`logging.handlers` module, +sends logging output to a network socket. The base class uses a TCP socket. + + +.. class:: SocketHandler(host, port) + + Returns a new instance of the :class:`SocketHandler` class intended to + communicate with a remote machine whose address is given by *host* and *port*. + + + .. method:: close() + + Closes the socket. + + + .. method:: emit() + + Pickles the record's attribute dictionary and writes it to the socket in + binary format. If there is an error with the socket, silently drops the + packet. If the connection was previously lost, re-establishes the + connection. To unpickle the record at the receiving end into a + :class:`LogRecord`, use the :func:`makeLogRecord` function. + + + .. method:: handleError() + + Handles an error which has occurred during :meth:`emit`. The most likely + cause is a lost connection. Closes the socket so that we can retry on the + next event. + + + .. method:: makeSocket() + + This is a factory method which allows subclasses to define the precise + type of socket they want. 
The default implementation creates a TCP socket + (:const:`socket.SOCK_STREAM`). + + + .. method:: makePickle(record) + + Pickles the record's attribute dictionary in binary format with a length + prefix, and returns it ready for transmission across the socket. + + Note that pickles aren't completely secure. If you are concerned about + security, you may want to override this method to implement a more secure + mechanism. For example, you can sign pickles using HMAC and then verify + them on the receiving end, or alternatively you can disable unpickling of + global objects on the receiving end. + + + .. method:: send(packet) + + Send a pickled string *packet* to the socket. This function allows for + partial sends which can happen when the network is busy. + + + .. method:: createSocket() + + Tries to create a socket; on failure, uses an exponential back-off + algorithm. On initial failure, the handler will drop the message it was + trying to send. When subsequent messages are handled by the same + instance, it will not try connecting until some time has passed. The + default parameters are such that the initial delay is one second, and if + after that delay the connection still can't be made, the handler will + double the delay each time up to a maximum of 30 seconds. + + This behaviour is controlled by the following handler attributes: + + * ``retryStart`` (initial delay, defaulting to 1.0 seconds). + * ``retryFactor`` (multiplier, defaulting to 2.0). + * ``retryMax`` (maximum delay, defaulting to 30.0 seconds). + + This means that if the remote listener starts up *after* the handler has + been used, you could lose messages (since the handler won't even attempt + a connection until the delay has elapsed, but just silently drop messages + during the delay period). + + +.. 
_datagram-handler: + +DatagramHandler +^^^^^^^^^^^^^^^ + +The :class:`DatagramHandler` class, located in the :mod:`logging.handlers` +module, inherits from :class:`SocketHandler` to support sending logging messages +over UDP sockets. + + +.. class:: DatagramHandler(host, port) + + Returns a new instance of the :class:`DatagramHandler` class intended to + communicate with a remote machine whose address is given by *host* and *port*. + + + .. method:: emit() + + Pickles the record's attribute dictionary and writes it to the socket in + binary format. If there is an error with the socket, silently drops the + packet. To unpickle the record at the receiving end into a + :class:`LogRecord`, use the :func:`makeLogRecord` function. + + + .. method:: makeSocket() + + The factory method of :class:`SocketHandler` is here overridden to create + a UDP socket (:const:`socket.SOCK_DGRAM`). + + + .. method:: send(s) + + Send a pickled string to a socket. + + +.. _syslog-handler: + +SysLogHandler +^^^^^^^^^^^^^ + +The :class:`SysLogHandler` class, located in the :mod:`logging.handlers` module, +supports sending logging messages to a remote or local Unix syslog. + + +.. class:: SysLogHandler(address=('localhost', SYSLOG_UDP_PORT), facility=LOG_USER, socktype=socket.SOCK_DGRAM) + + Returns a new instance of the :class:`SysLogHandler` class intended to + communicate with a remote Unix machine whose address is given by *address* in + the form of a ``(host, port)`` tuple. If *address* is not specified, + ``('localhost', 514)`` is used. The address is used to open a socket. An + alternative to providing a ``(host, port)`` tuple is providing an address as a + string, for example '/dev/log'. In this case, a Unix domain socket is used to + send the message to the syslog. If *facility* is not specified, + :const:`LOG_USER` is used. The type of socket opened depends on the + *socktype* argument, which defaults to :const:`socket.SOCK_DGRAM` and thus + opens a UDP socket. 
To open a TCP socket (for use with the newer syslog + daemons such as rsyslog), specify a value of :const:`socket.SOCK_STREAM`. + + Note that if your server is not listening on UDP port 514, + :class:`SysLogHandler` may appear not to work. In that case, check what + address you should be using for a domain socket - it's system dependent. + For example, on Linux it's usually '/dev/log' but on OS/X it's + '/var/run/syslog'. You'll need to check your platform and use the + appropriate address (you may need to do this check at runtime if your + application needs to run on several platforms). On Windows, you pretty + much have to use the UDP option. + + .. versionchanged:: 2.7 + *socktype* was added. + + + .. method:: close() + + Closes the socket to the remote host. + + + .. method:: emit(record) + + The record is formatted, and then sent to the syslog server. If exception + information is present, it is *not* sent to the server. + + + .. method:: encodePriority(facility, priority) + + Encodes the facility and priority into an integer. You can pass in strings + or integers - if strings are passed, internal mapping dictionaries are + used to convert them to integers. + + The symbolic ``LOG_`` values are defined in :class:`SysLogHandler` and + mirror the values defined in the ``sys/syslog.h`` header file. 
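As a rough illustration of ``encodePriority``, the sketch below encodes one facility/priority pair given as strings and one given as the symbolic ``LOG_`` constants. It assumes a UDP handler pointed at ``127.0.0.1`` purely so that an instance can be created; no message is sent and no syslog daemon needs to be listening.

```python
import logging.handlers

# A UDP handler can be created without a listening server; the address
# used here is only a placeholder for the demonstration.
handler = logging.handlers.SysLogHandler(
    address=("127.0.0.1", logging.handlers.SYSLOG_UDP_PORT))
try:
    # Strings are looked up in the handler's internal mapping dictionaries.
    by_name = handler.encodePriority("user", "info")
    # Integers (the symbolic LOG_ values) are used as they are.
    by_value = handler.encodePriority(handler.LOG_LOCAL0, handler.LOG_ERR)
finally:
    handler.close()

# The result is (facility << 3) | priority.
print(by_name, by_value)
```

With the standard syslog values (``LOG_USER`` is 1, ``LOG_INFO`` is 6, ``LOG_LOCAL0`` is 16, ``LOG_ERR`` is 3) this prints ``14 131``.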
+ + **Priorities** + + +--------------------------+---------------+ + | Name (string) | Symbolic value| + +==========================+===============+ + | ``alert`` | LOG_ALERT | + +--------------------------+---------------+ + | ``crit`` or ``critical`` | LOG_CRIT | + +--------------------------+---------------+ + | ``debug`` | LOG_DEBUG | + +--------------------------+---------------+ + | ``emerg`` or ``panic`` | LOG_EMERG | + +--------------------------+---------------+ + | ``err`` or ``error`` | LOG_ERR | + +--------------------------+---------------+ + | ``info`` | LOG_INFO | + +--------------------------+---------------+ + | ``notice`` | LOG_NOTICE | + +--------------------------+---------------+ + | ``warn`` or ``warning`` | LOG_WARNING | + +--------------------------+---------------+ + + **Facilities** + + +---------------+---------------+ + | Name (string) | Symbolic value| + +===============+===============+ + | ``auth`` | LOG_AUTH | + +---------------+---------------+ + | ``authpriv`` | LOG_AUTHPRIV | + +---------------+---------------+ + | ``cron`` | LOG_CRON | + +---------------+---------------+ + | ``daemon`` | LOG_DAEMON | + +---------------+---------------+ + | ``ftp`` | LOG_FTP | + +---------------+---------------+ + | ``kern`` | LOG_KERN | + +---------------+---------------+ + | ``lpr`` | LOG_LPR | + +---------------+---------------+ + | ``mail`` | LOG_MAIL | + +---------------+---------------+ + | ``news`` | LOG_NEWS | + +---------------+---------------+ + | ``syslog`` | LOG_SYSLOG | + +---------------+---------------+ + | ``user`` | LOG_USER | + +---------------+---------------+ + | ``uucp`` | LOG_UUCP | + +---------------+---------------+ + | ``local0`` | LOG_LOCAL0 | + +---------------+---------------+ + | ``local1`` | LOG_LOCAL1 | + +---------------+---------------+ + | ``local2`` | LOG_LOCAL2 | + +---------------+---------------+ + | ``local3`` | LOG_LOCAL3 | + +---------------+---------------+ + | ``local4`` | LOG_LOCAL4 | + 
+---------------+---------------+ + | ``local5`` | LOG_LOCAL5 | + +---------------+---------------+ + | ``local6`` | LOG_LOCAL6 | + +---------------+---------------+ + | ``local7`` | LOG_LOCAL7 | + +---------------+---------------+ + + .. method:: mapPriority(levelname) + + Maps a logging level name to a syslog priority name. + You may need to override this if you are using custom levels, or + if the default algorithm is not suitable for your needs. The + default algorithm maps ``DEBUG``, ``INFO``, ``WARNING``, ``ERROR`` and + ``CRITICAL`` to the equivalent syslog names, and all other level + names to 'warning'. + +.. _nt-eventlog-handler: + +NTEventLogHandler +^^^^^^^^^^^^^^^^^ + +The :class:`NTEventLogHandler` class, located in the :mod:`logging.handlers` +module, supports sending logging messages to a local Windows NT, Windows 2000 or +Windows XP event log. Before you can use it, you need Mark Hammond's Win32 +extensions for Python installed. + + +.. class:: NTEventLogHandler(appname, dllname=None, logtype='Application') + + Returns a new instance of the :class:`NTEventLogHandler` class. The *appname* is + used to define the application name as it appears in the event log. An + appropriate registry entry is created using this name. The *dllname* should give + the fully qualified pathname of a .dll or .exe which contains message + definitions to hold in the log (if not specified, ``'win32service.pyd'`` is used + - this is installed with the Win32 extensions and contains some basic + placeholder message definitions. Note that use of these placeholders will make + your event logs big, as the entire message source is held in the log. If you + want slimmer logs, you have to pass in the name of your own .dll or .exe which + contains the message definitions you want to use in the event log). The + *logtype* is one of ``'Application'``, ``'System'`` or ``'Security'``, and + defaults to ``'Application'``. + + + .. 
method:: close() + + At this point, you can remove the application name from the registry as a + source of event log entries. However, if you do this, you will not be able + to see the events as you intended in the Event Log Viewer - it needs to be + able to access the registry to get the .dll name. The current version does + not do this. + + + .. method:: emit(record) + + Determines the message ID, event category and event type, and then logs + the message in the NT event log. + + + .. method:: getEventCategory(record) + + Returns the event category for the record. Override this if you want to + specify your own categories. This version returns 0. + + + .. method:: getEventType(record) + + Returns the event type for the record. Override this if you want to + specify your own types. This version does a mapping using the handler's + typemap attribute, which is set up in :meth:`__init__` to a dictionary + which contains mappings for :const:`DEBUG`, :const:`INFO`, + :const:`WARNING`, :const:`ERROR` and :const:`CRITICAL`. If you are using + your own levels, you will either need to override this method or place a + suitable dictionary in the handler's *typemap* attribute. + + + .. method:: getMessageID(record) + + Returns the message ID for the record. If you are using your own messages, + you could do this by having the *msg* passed to the logger being an ID + rather than a format string. Then, in here, you could use a dictionary + lookup to get the message ID. This version returns 1, which is the base + message ID in :file:`win32service.pyd`. + +.. _smtp-handler: + +SMTPHandler +^^^^^^^^^^^ + +The :class:`SMTPHandler` class, located in the :mod:`logging.handlers` module, +supports sending logging messages to an email address via SMTP. + + +.. class:: SMTPHandler(mailhost, fromaddr, toaddrs, subject, credentials=None, secure=None) + + Returns a new instance of the :class:`SMTPHandler` class. 
The instance is + initialized with the from and to addresses and subject line of the email. + The *toaddrs* should be a list of strings. To specify a non-standard SMTP + port, use the (host, port) tuple format for the *mailhost* argument. If you + use a string, the standard SMTP port is used. If your SMTP server requires + authentication, you can specify a (username, password) tuple for the + *credentials* argument. If *secure* is True, then the handler will attempt + to use TLS for the email transmission. + + .. versionchanged:: 2.6 + *credentials* was added. + + .. versionchanged:: 2.7 + *secure* was added. + + + .. method:: emit(record) + + Formats the record and sends it to the specified addressees. + + + .. method:: getSubject(record) + + If you want to specify a subject line which is record-dependent, override + this method. + +.. _memory-handler: + +MemoryHandler +^^^^^^^^^^^^^ + +The :class:`MemoryHandler` class, located in the :mod:`logging.handlers` module, +supports buffering of logging records in memory, periodically flushing them to a +:dfn:`target` handler. Flushing occurs whenever the buffer is full, or when an +event of a certain severity or greater is seen. + +:class:`MemoryHandler` is a subclass of the more general +:class:`BufferingHandler`, which is an abstract class. This buffers logging +records in memory. Whenever a record is added to the buffer, a check is made +by calling :meth:`shouldFlush` to see if the buffer should be flushed. If it +should, then :meth:`flush` is expected to do the flushing. + + +.. class:: BufferingHandler(capacity) + + Initializes the handler with a buffer of the specified capacity. + + + .. method:: emit(record) + + Appends the record to the buffer. If :meth:`shouldFlush` returns true, + calls :meth:`flush` to process the buffer. + + + .. method:: flush() + + You can override this to implement custom flushing behavior. This version + just empties the buffer. + + + .. 
method:: shouldFlush(record) + + Returns true if the buffer is up to capacity. This method can be + overridden to implement custom flushing strategies. + + +.. class:: MemoryHandler(capacity, flushLevel=ERROR, target=None) + + Returns a new instance of the :class:`MemoryHandler` class. The instance is + initialized with a buffer size of *capacity*. If *flushLevel* is not specified, + :const:`ERROR` is used. If no *target* is specified, the target will need to be + set using :meth:`setTarget` before this handler does anything useful. + + + .. method:: close() + + Calls :meth:`flush`, sets the target to :const:`None` and clears the + buffer. + + + .. method:: flush() + + For a :class:`MemoryHandler`, flushing means just sending the buffered + records to the target, if there is one. The buffer is also cleared when + this happens. Override if you want different behavior. + + + .. method:: setTarget(target) + + Sets the target handler for this handler. + + + .. method:: shouldFlush(record) + + Checks for buffer full or a record at the *flushLevel* or higher. + + +.. _http-handler: + +HTTPHandler +^^^^^^^^^^^ + +The :class:`HTTPHandler` class, located in the :mod:`logging.handlers` module, +supports sending logging messages to a Web server, using either ``GET`` or +``POST`` semantics. + + +.. class:: HTTPHandler(host, url, method='GET') + + Returns a new instance of the :class:`HTTPHandler` class. The *host* can be + of the form ``host:port``, should you need to use a specific port number. + If no *method* is specified, ``GET`` is used. + + + .. method:: emit(record) + + Sends the record to the Web server as a percent-encoded dictionary. + + +.. seealso:: + + Module :mod:`logging` + API reference for the logging module. + + Module :mod:`logging.config` + Configuration API for the logging module. 
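Editorial sketch (not part of the committed patch): the MemoryHandler behaviour documented above can be illustrated roughly as follows. The class CollectingHandler, the logger name 'memory_sketch' and the variable names are hypothetical and exist only for this example; it assumes the default flush-on-ERROR behaviour described in the text.

```python
import logging
import logging.handlers


class CollectingHandler(logging.Handler):
    """Hypothetical target handler that keeps messages in a list."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.seen = []

    def emit(self, record):
        self.seen.append(record.getMessage())


target = CollectingHandler()
# Buffer up to 10 records; flush early when a record at ERROR or above arrives.
memory = logging.handlers.MemoryHandler(
    capacity=10, flushLevel=logging.ERROR, target=target)

logger = logging.getLogger('memory_sketch')
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the example's output away from root handlers
logger.addHandler(memory)

logger.debug('step 1')
logger.info('step 2')
buffered_before_error = list(target.seen)  # still buffered, nothing at the target

logger.error('step 3 failed')              # reaches flushLevel, buffer is flushed
flushed_after_error = list(target.seen)

memory.close()  # flushes again (buffer is empty here) and drops the target
```

The point of the design is that low-severity context is retained cheaply and only written out when something serious happens.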
+ + diff --git a/Doc/library/logging.rst b/Doc/library/logging.rst --- a/Doc/library/logging.rst +++ b/Doc/library/logging.rst @@ -2,7 +2,7 @@ ============================================== .. module:: logging - :synopsis: Flexible error logging system for applications. + :synopsis: Flexible event logging system for applications. .. moduleauthor:: Vinay Sajip @@ -11,690 +11,645 @@ .. index:: pair: Errors; logging +.. sidebar:: Important + + This page contains the API reference information. For tutorial + information and discussion of more advanced topics, see + + * :ref:`Basic Tutorial ` + * :ref:`Advanced Tutorial ` + * :ref:`Logging Cookbook ` + + .. versionadded:: 2.3 -This module defines functions and classes which implement a flexible error -logging system for applications. - -Logging is performed by calling methods on instances of the :class:`Logger` -class (hereafter called :dfn:`loggers`). Each instance has a name, and they are -conceptually arranged in a namespace hierarchy using dots (periods) as -separators. For example, a logger named "scan" is the parent of loggers -"scan.text", "scan.html" and "scan.pdf". Logger names can be anything you want, -and indicate the area of an application in which a logged message originates. - -Logged messages also have levels of importance associated with them. The default -levels provided are :const:`DEBUG`, :const:`INFO`, :const:`WARNING`, -:const:`ERROR` and :const:`CRITICAL`. As a convenience, you indicate the -importance of a logged message by calling an appropriate method of -:class:`Logger`. The methods are :meth:`debug`, :meth:`info`, :meth:`warning`, -:meth:`error` and :meth:`critical`, which mirror the default levels. You are not -constrained to use these levels: you can specify your own and use a more general -:class:`Logger` method, :meth:`log`, which takes an explicit level argument. 
- - -Logging tutorial ----------------- +This module defines functions and classes which implement a flexible event +logging system for applications and libraries. The key benefit of having the logging API provided by a standard library module is that all Python modules can participate in logging, so your application log -can include messages from third-party modules. +can include your own messages integrated with messages from third-party +modules. -It is, of course, possible to log messages with different verbosity levels or to -different destinations. Support for writing log messages to files, HTTP -GET/POST locations, email via SMTP, generic sockets, or OS-specific logging -mechanisms are all supported by the standard module. You can also create your -own log destination class if you have special requirements not met by any of the -built-in classes. +The module provides a lot of functionality and flexibility. If you are +unfamiliar with logging, the best way to get to grips with it is to see the +tutorials (see the links on the right). -Simple examples -^^^^^^^^^^^^^^^ +The basic classes defined by the module, together with their functions, are +listed below. -.. sectionauthor:: Doug Hellmann -.. (see ) +* Loggers expose the interface that application code directly uses. +* Handlers send the log records (created by loggers) to the appropriate + destination. +* Filters provide a finer grained facility for determining which log records + to output. +* Formatters specify the layout of log records in the final output. -Most applications are probably going to want to log to a file, so let's start -with that case. 
Using the :func:`basicConfig` function, we can set up the -default handler so that debug messages are written to a file (in the example, -we assume that you have the appropriate permissions to create a file called -*example.log* in the current directory):: - import logging - LOG_FILENAME = 'example.log' - logging.basicConfig(filename=LOG_FILENAME,level=logging.DEBUG) +.. _logger: - logging.debug('This message should go to the log file') +Logger Objects +-------------- -And now if we open the file and look at what we have, we should find the log -message:: +Loggers have the following attributes and methods. Note that Loggers are never +instantiated directly, but always through the module-level function +``logging.getLogger(name)``. - DEBUG:root:This message should go to the log file +.. class:: Logger -If you run the script repeatedly, the additional log messages are appended to -the file. To create a new file each time, you can pass a *filemode* argument to -:func:`basicConfig` with a value of ``'w'``. Rather than managing the file size -yourself, though, it is simpler to use a :class:`RotatingFileHandler`:: +.. attribute:: Logger.propagate - import glob - import logging - import logging.handlers + If this evaluates to false, logging messages are not passed by this logger or by + its child loggers to the handlers of higher level (ancestor) loggers. The + constructor sets this attribute to 1. - LOG_FILENAME = 'logging_rotatingfile_example.out' - # Set up a specific logger with our desired output level - my_logger = logging.getLogger('MyLogger') - my_logger.setLevel(logging.DEBUG) +.. method:: Logger.setLevel(lvl) - # Add the log message handler to the logger - handler = logging.handlers.RotatingFileHandler( - LOG_FILENAME, maxBytes=20, backupCount=5) + Sets the threshold for this logger to *lvl*. Logging messages which are less + severe than *lvl* will be ignored. 
When a logger is created, the level is set to + :const:`NOTSET` (which causes all messages to be processed when the logger is + the root logger, or delegation to the parent when the logger is a non-root + logger). Note that the root logger is created with level :const:`WARNING`. - my_logger.addHandler(handler) + The term 'delegation to the parent' means that if a logger has a level of + NOTSET, its chain of ancestor loggers is traversed until either an ancestor with + a level other than NOTSET is found, or the root is reached. - # Log some messages - for i in range(20): - my_logger.debug('i = %d' % i) + If an ancestor is found with a level other than NOTSET, then that ancestor's + level is treated as the effective level of the logger where the ancestor search + began, and is used to determine how a logging event is handled. - # See what files are created - logfiles = glob.glob('%s*' % LOG_FILENAME) + If the root is reached, and it has a level of NOTSET, then all messages will be + processed. Otherwise, the root's level will be used as the effective level. - for filename in logfiles: - print filename -The result should be 6 separate files, each with part of the log history for the -application:: +.. method:: Logger.isEnabledFor(lvl) - logging_rotatingfile_example.out - logging_rotatingfile_example.out.1 - logging_rotatingfile_example.out.2 - logging_rotatingfile_example.out.3 - logging_rotatingfile_example.out.4 - logging_rotatingfile_example.out.5 + Indicates if a message of severity *lvl* would be processed by this logger. + This method checks first the module-level level set by + ``logging.disable(lvl)`` and then the logger's effective level as determined + by :meth:`getEffectiveLevel`. -The most current file is always :file:`logging_rotatingfile_example.out`, -and each time it reaches the size limit it is renamed with the suffix -``.1``. Each of the existing backup files is renamed to increment the suffix -(``.1`` becomes ``.2``, etc.) 
and the ``.6`` file is erased. -Obviously this example sets the log length much much too small as an extreme -example. You would want to set *maxBytes* to an appropriate value. +.. method:: Logger.getEffectiveLevel() -Another useful feature of the logging API is the ability to produce different -messages at different log levels. This allows you to instrument your code with -debug messages, for example, but turning the log level down so that those debug -messages are not written for your production system. The default levels are -``NOTSET``, ``DEBUG``, ``INFO``, ``WARNING``, ``ERROR`` and ``CRITICAL``. + Indicates the effective level for this logger. If a value other than + :const:`NOTSET` has been set using :meth:`setLevel`, it is returned. Otherwise, + the hierarchy is traversed towards the root until a value other than + :const:`NOTSET` is found, and that value is returned. -The logger, handler, and log message call each specify a level. The log message -is only emitted if the handler and logger are configured to emit messages of -that level or lower. For example, if a message is ``CRITICAL``, and the logger -is set to ``ERROR``, the message is emitted. If a message is a ``WARNING``, and -the logger is set to produce only ``ERROR``\s, the message is not emitted:: - import logging - import sys +.. method:: Logger.getChild(suffix) - LEVELS = {'debug': logging.DEBUG, - 'info': logging.INFO, - 'warning': logging.WARNING, - 'error': logging.ERROR, - 'critical': logging.CRITICAL} + Returns a logger which is a descendant to this logger, as determined by the suffix. + Thus, ``logging.getLogger('abc').getChild('def.ghi')`` would return the same + logger as would be returned by ``logging.getLogger('abc.def.ghi')``. This is a + convenience method, useful when the parent logger is named using e.g. ``__name__`` + rather than a literal string. 
- if len(sys.argv) > 1: - level_name = sys.argv[1] - level = LEVELS.get(level_name, logging.NOTSET) - logging.basicConfig(level=level) + .. versionadded:: 2.7 - logging.debug('This is a debug message') - logging.info('This is an info message') - logging.warning('This is a warning message') - logging.error('This is an error message') - logging.critical('This is a critical error message') -Run the script with an argument like 'debug' or 'warning' to see which messages -show up at different levels:: +.. method:: Logger.debug(msg, *args, **kwargs) - $ python logging_level_example.py debug - DEBUG:root:This is a debug message - INFO:root:This is an info message - WARNING:root:This is a warning message - ERROR:root:This is an error message - CRITICAL:root:This is a critical error message + Logs a message with level :const:`DEBUG` on this logger. The *msg* is the + message format string, and the *args* are the arguments which are merged into + *msg* using the string formatting operator. (Note that this means that you can + use keywords in the format string, together with a single dictionary argument.) - $ python logging_level_example.py info - INFO:root:This is an info message - WARNING:root:This is a warning message - ERROR:root:This is an error message - CRITICAL:root:This is a critical error message + There are two keyword arguments in *kwargs* which are inspected: *exc_info* + which, if it does not evaluate as false, causes exception information to be + added to the logging message. If an exception tuple (in the format returned by + :func:`sys.exc_info`) is provided, it is used; otherwise, :func:`sys.exc_info` + is called to get the exception information. -You will notice that these log messages all have ``root`` embedded in them. The -logging module supports a hierarchy of loggers with different names. An easy -way to tell where a specific log message comes from is to use a separate logger -object for each of your modules. 
Each new logger "inherits" the configuration -of its parent, and log messages sent to a logger include the name of that -logger. Optionally, each logger can be configured differently, so that messages -from different modules are handled in different ways. Let's look at a simple -example of how to log from different modules so it is easy to trace the source -of the message:: + The second keyword argument is *extra* which can be used to pass a + dictionary which is used to populate the __dict__ of the LogRecord created for + the logging event with user-defined attributes. These custom attributes can then + be used as you like. For example, they could be incorporated into logged + messages. For example:: - import logging + FORMAT = '%(asctime)-15s %(clientip)s %(user)-8s %(message)s' + logging.basicConfig(format=FORMAT) + d = { 'clientip' : '192.168.0.1', 'user' : 'fbloggs' } + logger = logging.getLogger('tcpserver') + logger.warning('Protocol problem: %s', 'connection reset', extra=d) - logging.basicConfig(level=logging.WARNING) + would print something like :: - logger1 = logging.getLogger('package1.module1') - logger2 = logging.getLogger('package2.module2') + 2006-02-08 22:20:02,165 192.168.0.1 fbloggs Protocol problem: connection reset - logger1.warning('This message comes from one module') - logger2.warning('And this message comes from another module') + The keys in the dictionary passed in *extra* should not clash with the keys used + by the logging system. (See the :class:`Formatter` documentation for more + information on which keys are used by the logging system.) -And the output:: + If you choose to use these attributes in logged messages, you need to exercise + some care. In the above example, for instance, the :class:`Formatter` has been + set up with a format string which expects 'clientip' and 'user' in the attribute + dictionary of the LogRecord. If these are missing, the message will not be + logged because a string formatting exception will occur. 
So in this case, you + always need to pass the *extra* dictionary with these keys. - $ python logging_modules_example.py - WARNING:package1.module1:This message comes from one module - WARNING:package2.module2:And this message comes from another module + While this might be annoying, this feature is intended for use in specialized + circumstances, such as multi-threaded servers where the same code executes in + many contexts, and interesting conditions which arise are dependent on this + context (such as remote client IP address and authenticated user name, in the + above example). In such circumstances, it is likely that specialized + :class:`Formatter`\ s would be used with particular :class:`Handler`\ s. -There are many more options for configuring logging, including different log -message formatting options, having messages delivered to multiple destinations, -and changing the configuration of a long-running application on the fly using a -socket interface. All of these options are covered in depth in the library -module documentation. -Loggers -^^^^^^^ +.. method:: Logger.info(msg, *args, **kwargs) -The logging library takes a modular approach and offers the several categories -of components: loggers, handlers, filters, and formatters. Loggers expose the -interface that application code directly uses. Handlers send the log records to -the appropriate destination. Filters provide a finer grained facility for -determining which log records to send on to a handler. Formatters specify the -layout of the resultant log record. + Logs a message with level :const:`INFO` on this logger. The arguments are + interpreted as for :meth:`debug`. -:class:`Logger` objects have a threefold job. First, they expose several -methods to application code so that applications can log messages at runtime. -Second, logger objects determine which log messages to act upon based upon -severity (the default filtering facility) or filter objects. 
Third, logger -objects pass along relevant log messages to all interested log handlers. -The most widely used methods on logger objects fall into two categories: -configuration and message sending. +.. method:: Logger.warning(msg, *args, **kwargs) -* :meth:`Logger.setLevel` specifies the lowest-severity log message a logger - will handle, where debug is the lowest built-in severity level and critical is - the highest built-in severity. For example, if the severity level is info, - the logger will handle only info, warning, error, and critical messages and - will ignore debug messages. + Logs a message with level :const:`WARNING` on this logger. The arguments are + interpreted as for :meth:`debug`. -* :meth:`Logger.addFilter` and :meth:`Logger.removeFilter` add and remove filter - objects from the logger object. This tutorial does not address filters. -With the logger object configured, the following methods create log messages: +.. method:: Logger.error(msg, *args, **kwargs) -* :meth:`Logger.debug`, :meth:`Logger.info`, :meth:`Logger.warning`, - :meth:`Logger.error`, and :meth:`Logger.critical` all create log records with - a message and a level that corresponds to their respective method names. The - message is actually a format string, which may contain the standard string - substitution syntax of :const:`%s`, :const:`%d`, :const:`%f`, and so on. The - rest of their arguments is a list of objects that correspond with the - substitution fields in the message. With regard to :const:`**kwargs`, the - logging methods care only about a keyword of :const:`exc_info` and use it to - determine whether to log exception information. + Logs a message with level :const:`ERROR` on this logger. The arguments are + interpreted as for :meth:`debug`. -* :meth:`Logger.exception` creates a log message similar to - :meth:`Logger.error`. The difference is that :meth:`Logger.exception` dumps a - stack trace along with it. Call this method only from an exception handler. 
-* :meth:`Logger.log` takes a log level as an explicit argument. This is a - little more verbose for logging messages than using the log level convenience - methods listed above, but this is how to log at custom log levels. +.. method:: Logger.critical(msg, *args, **kwargs) -:func:`getLogger` returns a reference to a logger instance with the specified -name if it is provided, or ``root`` if not. The names are period-separated -hierarchical structures. Multiple calls to :func:`getLogger` with the same name -will return a reference to the same logger object. Loggers that are further -down in the hierarchical list are children of loggers higher up in the list. -For example, given a logger with a name of ``foo``, loggers with names of -``foo.bar``, ``foo.bar.baz``, and ``foo.bam`` are all descendants of ``foo``. -Child loggers propagate messages up to the handlers associated with their -ancestor loggers. Because of this, it is unnecessary to define and configure -handlers for all the loggers an application uses. It is sufficient to -configure handlers for a top-level logger and create child loggers as needed. + Logs a message with level :const:`CRITICAL` on this logger. The arguments are + interpreted as for :meth:`debug`. -Handlers -^^^^^^^^ +.. method:: Logger.log(lvl, msg, *args, **kwargs) -:class:`Handler` objects are responsible for dispatching the appropriate log -messages (based on the log messages' severity) to the handler's specified -destination. Logger objects can add zero or more handler objects to themselves -with an :func:`addHandler` method. As an example scenario, an application may -want to send all log messages to a log file, all log messages of error or higher -to stdout, and all messages of critical to an email address. This scenario -requires three individual handlers where each handler is responsible for sending -messages of a specific severity to a specific location. + Logs a message with integer level *lvl* on this logger. 
The other arguments are + interpreted as for :meth:`debug`. -The standard library includes quite a few handler types; this tutorial uses only -:class:`StreamHandler` and :class:`FileHandler` in its examples. -There are very few methods in a handler for application developers to concern -themselves with. The only handler methods that seem relevant for application -developers who are using the built-in handler objects (that is, not creating -custom handlers) are the following configuration methods: +.. method:: Logger.exception(msg, *args) -* The :meth:`Handler.setLevel` method, just as in logger objects, specifies the - lowest severity that will be dispatched to the appropriate destination. Why - are there two :func:`setLevel` methods? The level set in the logger - determines which severity of messages it will pass to its handlers. The level - set in each handler determines which messages that handler will send on. + Logs a message with level :const:`ERROR` on this logger. The arguments are + interpreted as for :meth:`debug`. Exception info is added to the logging + message. This method should only be called from an exception handler. -* :func:`setFormatter` selects a Formatter object for this handler to use. -* :func:`addFilter` and :func:`removeFilter` respectively configure and - deconfigure filter objects on handlers. +.. method:: Logger.addFilter(filt) -Application code should not directly instantiate and use instances of -:class:`Handler`. Instead, the :class:`Handler` class is a base class that -defines the interface that all handlers should have and establishes some -default behavior that child classes can use (or override). + Adds the specified filter *filt* to this logger. -Formatters -^^^^^^^^^^ +.. method:: Logger.removeFilter(filt) -Formatter objects configure the final order, structure, and contents of the log -message. 
Unlike the base :class:`logging.Handler` class, application code may -instantiate formatter classes, although you could likely subclass the formatter -if your application needs special behavior. The constructor takes two optional -arguments: a message format string and a date format string. If there is no -message format string, the default is to use the raw message. If there is no -date format string, the default date format is:: + Removes the specified filter *filt* from this logger. - %Y-%m-%d %H:%M:%S -with the milliseconds tacked on at the end. +.. method:: Logger.filter(record) -The message format string uses ``%()s`` styled string -substitution; the possible keys are documented in :ref:`formatter`. + Applies this logger's filters to the record and returns a true value if the + record is to be processed. -The following message format string will log the time in a human-readable -format, the severity of the message, and the contents of the message, in that -order:: - "%(asctime)s - %(levelname)s - %(message)s" +.. method:: Logger.addHandler(hdlr) -Formatters use a user-configurable function to convert the creation time of a -record to a tuple. By default, :func:`time.localtime` is used; to change this -for a particular formatter instance, set the ``converter`` attribute of the -instance to a function with the same signature as :func:`time.localtime` or -:func:`time.gmtime`. To change it for all formatters, for example if you want -all logging times to be shown in GMT, set the ``converter`` attribute in the -Formatter class (to ``time.gmtime`` for GMT display). + Adds the specified handler *hdlr* to this logger. -Configuring Logging -^^^^^^^^^^^^^^^^^^^ +.. method:: Logger.removeHandler(hdlr) -Programmers can configure logging in three ways: + Removes the specified handler *hdlr* from this logger. -1. Creating loggers, handlers, and formatters explicitly using Python - code that calls the configuration methods listed above. -2. 
Creating a logging config file and reading it using the :func:`fileConfig` - function. -3. Creating a dictionary of configuration information and passing it - to the :func:`dictConfig` function. -The following example configures a very simple logger, a console -handler, and a simple formatter using Python code:: +.. method:: Logger.findCaller() - import logging + Finds the caller's source filename and line number. Returns the filename, line + number and function name as a 3-element tuple. - # create logger - logger = logging.getLogger("simple_example") - logger.setLevel(logging.DEBUG) + .. versionchanged:: 2.4 + The function name was added. In earlier versions, the filename and line + number were returned as a 2-element tuple. - # create console handler and set level to debug - ch = logging.StreamHandler() - ch.setLevel(logging.DEBUG) +.. method:: Logger.handle(record) - # create formatter - formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s") + Handles a record by passing it to all handlers associated with this logger and + its ancestors (until a false value of *propagate* is found). This method is used + for unpickled records received from a socket, as well as those created locally. + Logger-level filtering is applied using :meth:`~Logger.filter`. - # add formatter to ch - ch.setFormatter(formatter) - # add ch to logger - logger.addHandler(ch) +.. method:: Logger.makeRecord(name, lvl, fn, lno, msg, args, exc_info, func=None, extra=None) - # "application" code - logger.debug("debug message") - logger.info("info message") - logger.warn("warn message") - logger.error("error message") - logger.critical("critical message") + This is a factory method which can be overridden in subclasses to create + specialized :class:`LogRecord` instances. -Running this module from the command line produces the following output:: + .. versionchanged:: 2.5 + *func* and *extra* were added. 
- $ python simple_logging_module.py - 2005-03-19 15:10:26,618 - simple_example - DEBUG - debug message - 2005-03-19 15:10:26,620 - simple_example - INFO - info message - 2005-03-19 15:10:26,695 - simple_example - WARNING - warn message - 2005-03-19 15:10:26,697 - simple_example - ERROR - error message - 2005-03-19 15:10:26,773 - simple_example - CRITICAL - critical message +.. _handler: -The following Python module creates a logger, handler, and formatter nearly -identical to those in the example listed above, with the only difference being -the names of the objects:: +Handler Objects +--------------- - import logging - import logging.config +Handlers have the following attributes and methods. Note that :class:`Handler` +is never instantiated directly; this class acts as a base for more useful +subclasses. However, the :meth:`__init__` method in subclasses needs to call +:meth:`Handler.__init__`. - logging.config.fileConfig("logging.conf") - # create logger - logger = logging.getLogger("simpleExample") +.. method:: Handler.__init__(level=NOTSET) - # "application" code - logger.debug("debug message") - logger.info("info message") - logger.warn("warn message") - logger.error("error message") - logger.critical("critical message") + Initializes the :class:`Handler` instance by setting its level, setting the list + of filters to the empty list and creating a lock (using :meth:`createLock`) for + serializing access to an I/O mechanism. -Here is the logging.conf file:: - [loggers] - keys=root,simpleExample +.. method:: Handler.createLock() - [handlers] - keys=consoleHandler + Initializes a thread lock which can be used to serialize access to underlying + I/O functionality which may not be threadsafe. - [formatters] - keys=simpleFormatter - [logger_root] - level=DEBUG - handlers=consoleHandler +.. 
method:: Handler.acquire() - [logger_simpleExample] - level=DEBUG - handlers=consoleHandler - qualname=simpleExample - propagate=0 + Acquires the thread lock created with :meth:`createLock`. - [handler_consoleHandler] - class=StreamHandler - level=DEBUG - formatter=simpleFormatter - args=(sys.stdout,) - [formatter_simpleFormatter] - format=%(asctime)s - %(name)s - %(levelname)s - %(message)s - datefmt= +.. method:: Handler.release() -The output is nearly identical to that of the non-config-file-based example:: + Releases the thread lock acquired with :meth:`acquire`. - $ python simple_logging_config.py - 2005-03-19 15:38:55,977 - simpleExample - DEBUG - debug message - 2005-03-19 15:38:55,979 - simpleExample - INFO - info message - 2005-03-19 15:38:56,054 - simpleExample - WARNING - warn message - 2005-03-19 15:38:56,055 - simpleExample - ERROR - error message - 2005-03-19 15:38:56,130 - simpleExample - CRITICAL - critical message -You can see that the config file approach has a few advantages over the Python -code approach, mainly separation of configuration and code and the ability of -noncoders to easily modify the logging properties. +.. method:: Handler.setLevel(lvl) -Note that the class names referenced in config files need to be either relative -to the logging module, or absolute values which can be resolved using normal -import mechanisms. Thus, you could use either :class:`handlers.WatchedFileHandler` -(relative to the logging module) or :class:`mypackage.mymodule.MyHandler` (for a -class defined in package :mod:`mypackage` and module :mod:`mymodule`, where -:mod:`mypackage` is available on the Python import path). + Sets the threshold for this handler to *lvl*. Logging messages which are less + severe than *lvl* will be ignored. When a handler is created, the level is set + to :const:`NOTSET` (which causes all messages to be processed). + + +.. method:: Handler.setFormatter(form) + + Sets the :class:`Formatter` for this handler to *form*. + + +.. 
method:: Handler.addFilter(filt) + + Adds the specified filter *filt* to this handler. + + +.. method:: Handler.removeFilter(filt) + + Removes the specified filter *filt* from this handler. + + +.. method:: Handler.filter(record) + + Applies this handler's filters to the record and returns a true value if the + record is to be processed. + + +.. method:: Handler.flush() + + Ensure all logging output has been flushed. This version does nothing and is + intended to be implemented by subclasses. + + +.. method:: Handler.close() + + Tidy up any resources used by the handler. This version does no output but + removes the handler from an internal list of handlers which is closed when + :func:`shutdown` is called. Subclasses should ensure that this gets called + from overridden :meth:`close` methods. + + +.. method:: Handler.handle(record) + + Conditionally emits the specified logging record, depending on filters which may + have been added to the handler. Wraps the actual emission of the record with + acquisition/release of the I/O thread lock. + + +.. method:: Handler.handleError(record) + + This method should be called from handlers when an exception is encountered + during an :meth:`emit` call. By default it does nothing, which means that + exceptions get silently ignored. This is what is mostly wanted for a logging + system - most users will not care about errors in the logging system, they are + more interested in application errors. You could, however, replace this with a + custom handler if you wish. The specified record is the one which was being + processed when the exception occurred. + + +.. method:: Handler.format(record) + + Do formatting for a record - if a formatter is set, use it. Otherwise, use the + default formatter for the module. + + +.. method:: Handler.emit(record) + + Do whatever it takes to actually log the specified logging record. This version + is intended to be implemented by subclasses and so raises a + :exc:`NotImplementedError`. 
+ +For a list of handlers included as standard, see :mod:`logging.handlers`. + +.. _formatter-objects: + +Formatter Objects +----------------- + +.. currentmodule:: logging + +:class:`Formatter` objects have the following attributes and methods. They are +responsible for converting a :class:`LogRecord` to (usually) a string which can +be interpreted by either a human or an external system. The base +:class:`Formatter` allows a formatting string to be specified. If none is +supplied, the default value of ``'%(message)s'`` is used. + +A Formatter can be initialized with a format string which makes use of knowledge +of the :class:`LogRecord` attributes - such as the default value mentioned above +making use of the fact that the user's message and arguments are pre-formatted +into a :class:`LogRecord`'s *message* attribute. This format string contains +standard Python %-style mapping keys. See section :ref:`string-formatting` +for more information on string formatting. + +The useful mapping keys in a :class:`LogRecord` are given in the section on +:ref:`logrecord-attributes`. + + +.. class:: Formatter(fmt=None, datefmt=None) + + Returns a new instance of the :class:`Formatter` class. The instance is + initialized with a format string for the message as a whole, as well as a + format string for the date/time portion of a message. If no *fmt* is + specified, ``'%(message)s'`` is used. If no *datefmt* is specified, the + ISO8601 date format is used. + + .. method:: format(record) + + The record's attribute dictionary is used as the operand to a string + formatting operation. Returns the resulting string. Before formatting the + dictionary, a couple of preparatory steps are carried out. The *message* + attribute of the record is computed using *msg* % *args*. If the + formatting string contains ``'(asctime)'``, :meth:`formatTime` is called + to format the event time. 
If there is exception information, it is + formatted using :meth:`formatException` and appended to the message. Note + that the formatted exception information is cached in attribute + *exc_text*. This is useful because the exception information can be + pickled and sent across the wire, but you should be careful if you have + more than one :class:`Formatter` subclass which customizes the formatting + of exception information. In this case, you will have to clear the cached + value after a formatter has done its formatting, so that the next + formatter to handle the event doesn't use the cached value but + recalculates it afresh. + + + .. method:: formatTime(record, datefmt=None) + + This method should be called from :meth:`format` by a formatter which + wants to make use of a formatted time. This method can be overridden in + formatters to provide for any specific requirement, but the basic behavior + is as follows: if *datefmt* (a string) is specified, it is used with + :func:`time.strftime` to format the creation time of the + record. Otherwise, the ISO8601 format is used. The resulting string is + returned. + + + .. method:: formatException(exc_info) + + Formats the specified exception information (a standard exception tuple as + returned by :func:`sys.exc_info`) as a string. This default implementation + just uses :func:`traceback.print_exception`. The resulting string is + returned. + +.. _filter: + +Filter Objects +-------------- + +``Filters`` can be used by ``Handlers`` and ``Loggers`` for more sophisticated +filtering than is provided by levels. The base filter class only allows events +which are below a certain point in the logger hierarchy. For example, a filter +initialized with 'A.B' will allow events logged by loggers 'A.B', 'A.B.C', +'A.B.C.D', 'A.B.D' etc. but not 'A.BB', 'B.A.B' etc. If initialized with the +empty string, all events are passed. + + +.. class:: Filter(name='') + + Returns an instance of the :class:`Filter` class. 
If *name* is specified, it + names a logger which, together with its children, will have its events allowed + through the filter. If *name* is the empty string, allows every event. + + + .. method:: filter(record) + + Is the specified record to be logged? Returns zero for no, nonzero for + yes. If deemed appropriate, the record may be modified in-place by this + method. + +Note that filters attached to handlers are consulted whenever an event is +emitted by the handler, whereas filters attached to loggers are consulted +whenever an event is logged to the handler (using :meth:`debug`, :meth:`info`, +etc.) This means that events which have been generated by descendant loggers +will not be filtered by a logger's filter setting, unless the filter has also +been applied to those descendant loggers. + +You don't actually need to subclass ``Filter``: you can pass any instance +which has a ``filter`` method with the same semantics. + +Although filters are used primarily to filter records based on more +sophisticated criteria than levels, they get to see every record which is +processed by the handler or logger they're attached to: this can be useful if +you want to do things like counting how many records were processed by a +particular logger or handler, or adding, changing or removing attributes in +the LogRecord being processed. Obviously changing the LogRecord needs to be +done with some care, but it does allow the injection of contextual information +into logs (see :ref:`filters-contextual`). + +.. _log-record: + +LogRecord Objects +----------------- + +:class:`LogRecord` instances are created automatically by the :class:`Logger` +every time something is logged, and can be created manually via +:func:`makeLogRecord` (for example, from a pickled event received over the +wire). + + +.. class:: LogRecord(name, level, pathname, lineno, msg, args, exc_info, func=None) + + Contains all the information pertinent to the event being logged. 
+ + The primary information is passed in :attr:`msg` and :attr:`args`, which + are combined using ``msg % args`` to create the :attr:`message` field of the + record. + + :param name: The name of the logger used to log the event represented by + this LogRecord. + :param level: The numeric level of the logging event (one of DEBUG, INFO etc.) + :param pathname: The full pathname of the source file where the logging call + was made. + :param lineno: The line number in the source file where the logging call was + made. + :param msg: The event description message, possibly a format string with + placeholders for variable data. + :param args: Variable data to merge into the *msg* argument to obtain the + event description. + :param exc_info: An exception tuple with the current exception information, + or *None* if no exception information is available. + :param func: The name of the function or method from which the logging call + was invoked. + + .. versionchanged:: 2.5 + *func* was added. + + .. method:: getMessage() + + Returns the message for this :class:`LogRecord` instance after merging any + user-supplied arguments with the message. If the user-supplied message + argument to the logging call is not a string, :func:`str` is called on it to + convert it to a string. This allows use of user-defined classes as + messages, whose ``__str__`` method can return the actual format string to + be used. + + +.. _logrecord-attributes: + +LogRecord attributes +-------------------- + +The LogRecord has a number of attributes, most of which are derived from the +parameters to the constructor. (Note that the names do not always correspond +exactly between the LogRecord constructor parameters and the LogRecord +attributes.) These attributes can be used to merge data from the record into +the format string. The following table lists (in alphabetical order) the +attribute names, their meanings and the corresponding placeholder in a %-style +format string. 
+ ++----------------+-------------------------+-----------------------------------------------+ +| Attribute name | Format | Description | ++================+=========================+===============================================+ +| args | You shouldn't need to | The tuple of arguments merged into ``msg`` to | +| | format this yourself. | produce ``message``. | ++----------------+-------------------------+-----------------------------------------------+ +| asctime | ``%(asctime)s`` | Human-readable time when the | +| | | :class:`LogRecord` was created. By default | +| | | this is of the form '2003-07-08 16:49:45,896' | +| | | (the numbers after the comma are millisecond | +| | | portion of the time). | ++----------------+-------------------------+-----------------------------------------------+ +| created | ``%(created)f`` | Time when the :class:`LogRecord` was created | +| | | (as returned by :func:`time.time`). | ++----------------+-------------------------+-----------------------------------------------+ +| exc_info | You shouldn't need to | Exception tuple (à la ``sys.exc_info``) or, | +| | format this yourself. | if no exception has occurred, *None*. | ++----------------+-------------------------+-----------------------------------------------+ +| filename | ``%(filename)s`` | Filename portion of ``pathname``. | ++----------------+-------------------------+-----------------------------------------------+ +| funcName | ``%(funcName)s`` | Name of function containing the logging call. | ++----------------+-------------------------+-----------------------------------------------+ +| levelname | ``%(levelname)s`` | Text logging level for the message | +| | | (``'DEBUG'``, ``'INFO'``, ``'WARNING'``, | +| | | ``'ERROR'``, ``'CRITICAL'``). 
| ++----------------+-------------------------+-----------------------------------------------+ +| levelno | ``%(levelno)s`` | Numeric logging level for the message | +| | | (:const:`DEBUG`, :const:`INFO`, | +| | | :const:`WARNING`, :const:`ERROR`, | +| | | :const:`CRITICAL`). | ++----------------+-------------------------+-----------------------------------------------+ +| lineno | ``%(lineno)d`` | Source line number where the logging call was | +| | | issued (if available). | ++----------------+-------------------------+-----------------------------------------------+ +| module | ``%(module)s`` | Module (name portion of ``filename``). | ++----------------+-------------------------+-----------------------------------------------+ +| msecs | ``%(msecs)d`` | Millisecond portion of the time when the | +| | | :class:`LogRecord` was created. | ++----------------+-------------------------+-----------------------------------------------+ +| message | ``%(message)s`` | The logged message, computed as ``msg % | +| | | args``. This is set when | +| | | :meth:`Formatter.format` is invoked. | ++----------------+-------------------------+-----------------------------------------------+ +| msg | You shouldn't need to | The format string passed in the original | +| | format this yourself. | logging call. Merged with ``args`` to | +| | | produce ``message``, or an arbitrary object | +| | | (see :ref:`arbitrary-object-messages`). | ++----------------+-------------------------+-----------------------------------------------+ +| name | ``%(name)s`` | Name of the logger used to log the call. | ++----------------+-------------------------+-----------------------------------------------+ +| pathname | ``%(pathname)s`` | Full pathname of the source file where the | +| | | logging call was issued (if available). | ++----------------+-------------------------+-----------------------------------------------+ +| process | ``%(process)d`` | Process ID (if available). 
| ++----------------+-------------------------+-----------------------------------------------+ +| processName | ``%(processName)s`` | Process name (if available). | ++----------------+-------------------------+-----------------------------------------------+ +| relativeCreated| ``%(relativeCreated)d`` | Time in milliseconds when the LogRecord was | +| | | created, relative to the time the logging | +| | | module was loaded. | ++----------------+-------------------------+-----------------------------------------------+ +| thread | ``%(thread)d`` | Thread ID (if available). | ++----------------+-------------------------+-----------------------------------------------+ +| threadName | ``%(threadName)s`` | Thread name (if available). | ++----------------+-------------------------+-----------------------------------------------+ + +.. versionchanged:: 2.5 + *funcName* was added. + +.. _logger-adapter: + +LoggerAdapter Objects +--------------------- + +:class:`LoggerAdapter` instances are used to conveniently pass contextual +information into logging calls. For a usage example , see the section on +:ref:`adding contextual information to your logging output `. + +.. versionadded:: 2.6 + + +.. class:: LoggerAdapter(logger, extra) + + Returns an instance of :class:`LoggerAdapter` initialized with an + underlying :class:`Logger` instance and a dict-like object. + + .. method:: process(msg, kwargs) + + Modifies the message and/or keyword arguments passed to a logging call in + order to insert contextual information. This implementation takes the object + passed as *extra* to the constructor and adds it to *kwargs* using key + 'extra'. The return value is a (*msg*, *kwargs*) tuple which has the + (possibly modified) versions of the arguments passed in. + +In addition to the above, :class:`LoggerAdapter` supports the following +methods of :class:`Logger`, i.e. 
:meth:`debug`, :meth:`info`, :meth:`warning`, +:meth:`error`, :meth:`exception`, :meth:`critical`, :meth:`log`, +:meth:`isEnabledFor`, :meth:`getEffectiveLevel`, :meth:`setLevel`, +:meth:`hasHandlers`. These methods have the same signatures as their +counterparts in :class:`Logger`, so you can use the two types of instances +interchangeably. .. versionchanged:: 2.7 + The :meth:`isEnabledFor` method was added to :class:`LoggerAdapter`. This + method delegates to the underlying logger. -In Python 2.7, a new means of configuring logging has been introduced, using -dictionaries to hold configuration information. This provides a superset of the -functionality of the config-file-based approach outlined above, and is the -recommended configuration method for new applications and deployments. Because -a Python dictionary is used to hold configuration information, and since you -can populate that dictionary using different means, you have more options for -configuration. For example, you can use a configuration file in JSON format, -or, if you have access to YAML processing functionality, a file in YAML -format, to populate the configuration dictionary. Or, of course, you can -construct the dictionary in Python code, receive it in pickled form over a -socket, or use whatever approach makes sense for your application. -Here's an example of the same configuration as above, in YAML format for -the new dictionary-based approach:: +Thread Safety +------------- - version: 1 - formatters: - simple: - format: format=%(asctime)s - %(name)s - %(levelname)s - %(message)s - handlers: - console: - class: logging.StreamHandler - level: DEBUG - formatter: simple - stream: ext://sys.stdout - loggers: - simpleExample: - level: DEBUG - handlers: [console] - propagate: no - root: - level: DEBUG - handlers: [console] +The logging module is intended to be thread-safe without any special work +needing to be done by its clients. 
It achieves this though using threading +locks; there is one lock to serialize access to the module's shared data, and +each handler also creates a lock to serialize access to its underlying I/O. -For more information about logging using a dictionary, see -:ref:`logging-config-api`. +If you are implementing asynchronous signal handlers using the :mod:`signal` +module, you may not be able to use logging from within such handlers. This is +because lock implementations in the :mod:`threading` module are not always +re-entrant, and so cannot be invoked from such signal handlers. -.. _library-config: - -Configuring Logging for a Library -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -When developing a library which uses logging, some consideration needs to be -given to its configuration. If the using application does not use logging, and -library code makes logging calls, then a one-off message "No handlers could be -found for logger X.Y.Z" is printed to the console. This message is intended -to catch mistakes in logging configuration, but will confuse an application -developer who is not aware of logging by the library. - -In addition to documenting how a library uses logging, a good way to configure -library logging so that it does not cause a spurious message is to add a -handler which does nothing. This avoids the message being printed, since a -handler will be found: it just doesn't produce any output. If the library user -configures logging for application use, presumably that configuration will add -some handlers, and if levels are suitably configured then logging calls made -in library code will send output to those handlers, as normal. - -A do-nothing handler can be simply defined as follows:: - - import logging - - class NullHandler(logging.Handler): - def emit(self, record): - pass - -An instance of this handler should be added to the top-level logger of the -logging namespace used by the library. 
If all logging by a library *foo* is -done using loggers with names matching "foo.x.y", then the code:: - - import logging - - h = NullHandler() - logging.getLogger("foo").addHandler(h) - -should have the desired effect. If an organisation produces a number of -libraries, then the logger name specified can be "orgname.foo" rather than -just "foo". - -**PLEASE NOTE:** It is strongly advised that you *do not add any handlers other -than* :class:`NullHandler` *to your library's loggers*. This is because the -configuration of handlers is the prerogative of the application developer who -uses your library. The application developer knows their target audience and -what handlers are most appropriate for their application: if you add handlers -"under the hood", you might well interfere with their ability to carry out -unit tests and deliver logs which suit their requirements. - -.. versionadded:: 2.7 - The :class:`NullHandler` class. - - -Logging Levels --------------- - -The numeric values of logging levels are given in the following table. These are -primarily of interest if you want to define your own levels, and need them to -have specific values relative to the predefined levels. If you define a level -with the same numeric value, it overwrites the predefined value; the predefined -name is lost. - -+--------------+---------------+ -| Level | Numeric value | -+==============+===============+ -| ``CRITICAL`` | 50 | -+--------------+---------------+ -| ``ERROR`` | 40 | -+--------------+---------------+ -| ``WARNING`` | 30 | -+--------------+---------------+ -| ``INFO`` | 20 | -+--------------+---------------+ -| ``DEBUG`` | 10 | -+--------------+---------------+ -| ``NOTSET`` | 0 | -+--------------+---------------+ - -Levels can also be associated with loggers, being set either by the developer or -through loading a saved logging configuration. 
When a logging method is called -on a logger, the logger compares its own level with the level associated with -the method call. If the logger's level is higher than the method call's, no -logging message is actually generated. This is the basic mechanism controlling -the verbosity of logging output. - -Logging messages are encoded as instances of the :class:`LogRecord` class. When -a logger decides to actually log an event, a :class:`LogRecord` instance is -created from the logging message. - -Logging messages are subjected to a dispatch mechanism through the use of -:dfn:`handlers`, which are instances of subclasses of the :class:`Handler` -class. Handlers are responsible for ensuring that a logged message (in the form -of a :class:`LogRecord`) ends up in a particular location (or set of locations) -which is useful for the target audience for that message (such as end users, -support desk staff, system administrators, developers). Handlers are passed -:class:`LogRecord` instances intended for particular destinations. Each logger -can have zero, one or more handlers associated with it (via the -:meth:`addHandler` method of :class:`Logger`). In addition to any handlers -directly associated with a logger, *all handlers associated with all ancestors -of the logger* are called to dispatch the message (unless the *propagate* flag -for a logger is set to a false value, at which point the passing to ancestor -handlers stops). - -Just as for loggers, handlers can have levels associated with them. A handler's -level acts as a filter in the same way as a logger's level does. If a handler -decides to actually dispatch an event, the :meth:`emit` method is used to send -the message to its destination. Most user-defined subclasses of :class:`Handler` -will need to override this :meth:`emit`. - -.. 
_custom-levels: - -Custom Levels -^^^^^^^^^^^^^ - -Defining your own levels is possible, but should not be necessary, as the -existing levels have been chosen on the basis of practical experience. -However, if you are convinced that you need custom levels, great care should -be exercised when doing this, and it is possibly *a very bad idea to define -custom levels if you are developing a library*. That's because if multiple -library authors all define their own custom levels, there is a chance that -the logging output from such multiple libraries used together will be -difficult for the using developer to control and/or interpret, because a -given numeric value might mean different things for different libraries. - - -Useful Handlers ---------------- - -In addition to the base :class:`Handler` class, many useful subclasses are -provided: - -#. :ref:`stream-handler` instances send error messages to streams (file-like - objects). - -#. :ref:`file-handler` instances send error messages to disk files. - -#. :class:`BaseRotatingHandler` is the base class for handlers that - rotate log files at a certain point. It is not meant to be instantiated - directly. Instead, use :ref:`rotating-file-handler` or - :ref:`timed-rotating-file-handler`. - -#. :ref:`rotating-file-handler` instances send error messages to disk - files, with support for maximum log file sizes and log file rotation. - -#. :ref:`timed-rotating-file-handler` instances send error messages to - disk files, rotating the log file at certain timed intervals. - -#. :ref:`socket-handler` instances send error messages to TCP/IP - sockets. - -#. :ref:`datagram-handler` instances send error messages to UDP - sockets. - -#. :ref:`smtp-handler` instances send error messages to a designated - email address. - -#. :ref:`syslog-handler` instances send error messages to a Unix - syslog daemon, possibly on a remote machine. - -#. :ref:`nt-eventlog-handler` instances send error messages to a - Windows NT/2000/XP event log. 
- -#. :ref:`memory-handler` instances send error messages to a buffer - in memory, which is flushed whenever specific criteria are met. - -#. :ref:`http-handler` instances send error messages to an HTTP - server using either ``GET`` or ``POST`` semantics. - -#. :ref:`watched-file-handler` instances watch the file they are - logging to. If the file changes, it is closed and reopened using the file - name. This handler is only useful on Unix-like systems; Windows does not - support the underlying mechanism used. - -#. :ref:`null-handler` instances do nothing with error messages. They are used - by library developers who want to use logging, but want to avoid the "No - handlers could be found for logger XXX" message which can be displayed if - the library user has not configured logging. See :ref:`library-config` for - more information. - -.. versionadded:: 2.7 - The :class:`NullHandler` class. - -The :class:`NullHandler`, :class:`StreamHandler` and :class:`FileHandler` -classes are defined in the core logging package. The other handlers are -defined in a sub- module, :mod:`logging.handlers`. (There is also another -sub-module, :mod:`logging.config`, for configuration functionality.) - -Logged messages are formatted for presentation through instances of the -:class:`Formatter` class. They are initialized with a format string suitable for -use with the % operator and a dictionary. - -For formatting multiple messages in a batch, instances of -:class:`BufferingFormatter` can be used. In addition to the format string (which -is applied to each message in the batch), there is provision for header and -trailer format strings. - -When filtering based on logger level and/or handler level is not enough, -instances of :class:`Filter` can be added to both :class:`Logger` and -:class:`Handler` instances (through their :meth:`addFilter` method). Before -deciding to process a message further, both loggers and handlers consult all -their filters for permission. 
If any filter returns a false value, the message -is not processed further. - -The basic :class:`Filter` functionality allows filtering by specific logger -name. If this feature is used, messages sent to the named logger and its -children are allowed through the filter, and all others dropped. Module-Level Functions ---------------------- @@ -928,9 +883,36 @@ which need to use custom logger behavior. +Integration with the warnings module +------------------------------------ + +The :func:`captureWarnings` function can be used to integrate :mod:`logging` +with the :mod:`warnings` module. + +.. function:: captureWarnings(capture) + + This function is used to turn the capture of warnings by logging on and + off. + + If *capture* is ``True``, warnings issued by the :mod:`warnings` module will + be redirected to the logging system. Specifically, a warning will be + formatted using :func:`warnings.formatwarning` and the resulting string + logged to a logger named 'py.warnings' with a severity of `WARNING`. + + If *capture* is ``False``, the redirection of warnings to the logging system + will stop, and warnings will be redirected to their original destinations + (i.e. those in effect before `captureWarnings(True)` was called). + + .. seealso:: + Module :mod:`logging.config` + Configuration API for the logging module. + + Module :mod:`logging.handlers` + Useful handlers included with the logging module. + :pep:`282` - A Logging System The proposal which described this feature for inclusion in the Python standard library. @@ -941,2767 +923,3 @@ and 2.2.x, which do not include the :mod:`logging` package in the standard library. -.. _logger: - -Logger Objects --------------- - -Loggers have the following attributes and methods. Note that Loggers are never -instantiated directly, but always through the module-level function -``logging.getLogger(name)``. - - -.. 
attribute:: Logger.propagate - - If this evaluates to false, logging messages are not passed by this logger or by - its child loggers to the handlers of higher level (ancestor) loggers. The - constructor sets this attribute to 1. - - -.. method:: Logger.setLevel(lvl) - - Sets the threshold for this logger to *lvl*. Logging messages which are less - severe than *lvl* will be ignored. When a logger is created, the level is set to - :const:`NOTSET` (which causes all messages to be processed when the logger is - the root logger, or delegation to the parent when the logger is a non-root - logger). Note that the root logger is created with level :const:`WARNING`. - - The term "delegation to the parent" means that if a logger has a level of - NOTSET, its chain of ancestor loggers is traversed until either an ancestor with - a level other than NOTSET is found, or the root is reached. - - If an ancestor is found with a level other than NOTSET, then that ancestor's - level is treated as the effective level of the logger where the ancestor search - began, and is used to determine how a logging event is handled. - - If the root is reached, and it has a level of NOTSET, then all messages will be - processed. Otherwise, the root's level will be used as the effective level. - - -.. method:: Logger.isEnabledFor(lvl) - - Indicates if a message of severity *lvl* would be processed by this logger. - This method checks first the module-level level set by - ``logging.disable(lvl)`` and then the logger's effective level as determined - by :meth:`getEffectiveLevel`. - - -.. method:: Logger.getEffectiveLevel() - - Indicates the effective level for this logger. If a value other than - :const:`NOTSET` has been set using :meth:`setLevel`, it is returned. Otherwise, - the hierarchy is traversed towards the root until a value other than - :const:`NOTSET` is found, and that value is returned. - - -.. 
method:: Logger.getChild(suffix) - - Returns a logger which is a descendant to this logger, as determined by the suffix. - Thus, ``logging.getLogger('abc').getChild('def.ghi')`` would return the same - logger as would be returned by ``logging.getLogger('abc.def.ghi')``. This is a - convenience method, useful when the parent logger is named using e.g. ``__name__`` - rather than a literal string. - - .. versionadded:: 2.7 - -.. method:: Logger.debug(msg[, *args[, **kwargs]]) - - Logs a message with level :const:`DEBUG` on this logger. The *msg* is the - message format string, and the *args* are the arguments which are merged into - *msg* using the string formatting operator. (Note that this means that you can - use keywords in the format string, together with a single dictionary argument.) - - There are two keyword arguments in *kwargs* which are inspected: *exc_info* - which, if it does not evaluate as false, causes exception information to be - added to the logging message. If an exception tuple (in the format returned by - :func:`sys.exc_info`) is provided, it is used; otherwise, :func:`sys.exc_info` - is called to get the exception information. - - The other optional keyword argument is *extra* which can be used to pass a - dictionary which is used to populate the __dict__ of the LogRecord created for - the logging event with user-defined attributes. These custom attributes can then - be used as you like. For example, they could be incorporated into logged - messages. 
For example:: - - FORMAT = "%(asctime)-15s %(clientip)s %(user)-8s %(message)s" - logging.basicConfig(format=FORMAT) - d = { 'clientip' : '192.168.0.1', 'user' : 'fbloggs' } - logger = logging.getLogger("tcpserver") - logger.warning("Protocol problem: %s", "connection reset", extra=d) - - would print something like :: - - 2006-02-08 22:20:02,165 192.168.0.1 fbloggs Protocol problem: connection reset - - The keys in the dictionary passed in *extra* should not clash with the keys used - by the logging system. (See the :class:`Formatter` documentation for more - information on which keys are used by the logging system.) - - If you choose to use these attributes in logged messages, you need to exercise - some care. In the above example, for instance, the :class:`Formatter` has been - set up with a format string which expects 'clientip' and 'user' in the attribute - dictionary of the LogRecord. If these are missing, the message will not be - logged because a string formatting exception will occur. So in this case, you - always need to pass the *extra* dictionary with these keys. - - While this might be annoying, this feature is intended for use in specialized - circumstances, such as multi-threaded servers where the same code executes in - many contexts, and interesting conditions which arise are dependent on this - context (such as remote client IP address and authenticated user name, in the - above example). In such circumstances, it is likely that specialized - :class:`Formatter`\ s would be used with particular :class:`Handler`\ s. - - .. versionchanged:: 2.5 - *extra* was added. - - -.. method:: Logger.info(msg[, *args[, **kwargs]]) - - Logs a message with level :const:`INFO` on this logger. The arguments are - interpreted as for :meth:`debug`. - - -.. method:: Logger.warning(msg[, *args[, **kwargs]]) - - Logs a message with level :const:`WARNING` on this logger. The arguments are - interpreted as for :meth:`debug`. - - -.. 
method:: Logger.error(msg[, *args[, **kwargs]]) - - Logs a message with level :const:`ERROR` on this logger. The arguments are - interpreted as for :meth:`debug`. - - -.. method:: Logger.critical(msg[, *args[, **kwargs]]) - - Logs a message with level :const:`CRITICAL` on this logger. The arguments are - interpreted as for :meth:`debug`. - - -.. method:: Logger.log(lvl, msg[, *args[, **kwargs]]) - - Logs a message with integer level *lvl* on this logger. The other arguments are - interpreted as for :meth:`debug`. - - -.. method:: Logger.exception(msg[, *args]) - - Logs a message with level :const:`ERROR` on this logger. The arguments are - interpreted as for :meth:`debug`. Exception info is added to the logging - message. This method should only be called from an exception handler. - - -.. method:: Logger.addFilter(filt) - - Adds the specified filter *filt* to this logger. - - -.. method:: Logger.removeFilter(filt) - - Removes the specified filter *filt* from this logger. - - -.. method:: Logger.filter(record) - - Applies this logger's filters to the record and returns a true value if the - record is to be processed. - - -.. method:: Logger.addHandler(hdlr) - - Adds the specified handler *hdlr* to this logger. - - -.. method:: Logger.removeHandler(hdlr) - - Removes the specified handler *hdlr* from this logger. - - -.. method:: Logger.findCaller() - - Finds the caller's source filename and line number. Returns the filename, line - number and function name as a 3-element tuple. - - .. versionchanged:: 2.4 - The function name was added. In earlier versions, the filename and line number - were returned as a 2-element tuple.. - - -.. method:: Logger.handle(record) - - Handles a record by passing it to all handlers associated with this logger and - its ancestors (until a false value of *propagate* is found). This method is used - for unpickled records received from a socket, as well as those created locally. 
- Logger-level filtering is applied using :meth:`~Logger.filter`. - - -.. method:: Logger.makeRecord(name, lvl, fn, lno, msg, args, exc_info [, func, extra]) - - This is a factory method which can be overridden in subclasses to create - specialized :class:`LogRecord` instances. - - .. versionchanged:: 2.5 - *func* and *extra* were added. - - -.. _minimal-example: - -Basic example -------------- - -.. versionchanged:: 2.4 - formerly :func:`basicConfig` did not take any keyword arguments. - -The :mod:`logging` package provides a lot of flexibility, and its configuration -can appear daunting. This section demonstrates that simple use of the logging -package is possible. - -The simplest example shows logging to the console:: - - import logging - - logging.debug('A debug message') - logging.info('Some information') - logging.warning('A shot across the bows') - -If you run the above script, you'll see this:: - - WARNING:root:A shot across the bows - -Because no particular logger was specified, the system used the root logger. The -debug and info messages didn't appear because by default, the root logger is -configured to only handle messages with a severity of WARNING or above. The -message format is also a configuration default, as is the output destination of -the messages - ``sys.stderr``. 
The severity level, the message format and -destination can be easily changed, as shown in the example below:: - - import logging - - logging.basicConfig(level=logging.DEBUG, - format='%(asctime)s %(levelname)s %(message)s', - filename='myapp.log', - filemode='w') - logging.debug('A debug message') - logging.info('Some information') - logging.warning('A shot across the bows') - -The :meth:`basicConfig` method is used to change the configuration defaults, -which results in output (written to ``myapp.log``) which should look -something like the following:: - - 2004-07-02 13:00:08,743 DEBUG A debug message - 2004-07-02 13:00:08,743 INFO Some information - 2004-07-02 13:00:08,743 WARNING A shot across the bows - -This time, all messages with a severity of DEBUG or above were handled, and the -format of the messages was also changed, and output went to the specified file -rather than the console. - -Formatting uses standard Python string formatting - see section -:ref:`string-formatting`. The format string takes the following common -specifiers. For a complete list of specifiers, consult the :class:`Formatter` -documentation. - -+-------------------+-----------------------------------------------+ -| Format | Description | -+===================+===============================================+ -| ``%(name)s`` | Name of the logger (logging channel). | -+-------------------+-----------------------------------------------+ -| ``%(levelname)s`` | Text logging level for the message | -| | (``'DEBUG'``, ``'INFO'``, ``'WARNING'``, | -| | ``'ERROR'``, ``'CRITICAL'``). | -+-------------------+-----------------------------------------------+ -| ``%(asctime)s`` | Human-readable time when the | -| | :class:`LogRecord` was created. By default | -| | this is of the form "2003-07-08 16:49:45,896" | -| | (the numbers after the comma are millisecond | -| | portion of the time). 
| -+-------------------+-----------------------------------------------+ -| ``%(message)s`` | The logged message. | -+-------------------+-----------------------------------------------+ - -To change the date/time format, you can pass an additional keyword parameter, -*datefmt*, as in the following:: - - import logging - - logging.basicConfig(level=logging.DEBUG, - format='%(asctime)s %(levelname)-8s %(message)s', - datefmt='%a, %d %b %Y %H:%M:%S', - filename='/temp/myapp.log', - filemode='w') - logging.debug('A debug message') - logging.info('Some information') - logging.warning('A shot across the bows') - -which would result in output like :: - - Fri, 02 Jul 2004 13:06:18 DEBUG A debug message - Fri, 02 Jul 2004 13:06:18 INFO Some information - Fri, 02 Jul 2004 13:06:18 WARNING A shot across the bows - -The date format string follows the requirements of :func:`strftime` - see the -documentation for the :mod:`time` module. - -If, instead of sending logging output to the console or a file, you'd rather use -a file-like object which you have created separately, you can pass it to -:func:`basicConfig` using the *stream* keyword argument. Note that if both -*stream* and *filename* keyword arguments are passed, the *stream* argument is -ignored. - -Of course, you can put variable information in your output. To do this, simply -have the message be a format string and pass in additional arguments containing -the variable information, as in the following example:: - - import logging - - logging.basicConfig(level=logging.DEBUG, - format='%(asctime)s %(levelname)-8s %(message)s', - datefmt='%a, %d %b %Y %H:%M:%S', - filename='/temp/myapp.log', - filemode='w') - logging.error('Pack my box with %d dozen %s', 5, 'liquor jugs') - -which would result in :: - - Wed, 21 Jul 2004 15:35:16 ERROR Pack my box with 5 dozen liquor jugs - - -.. 
_multiple-destinations: - -Logging to multiple destinations --------------------------------- - -Let's say you want to log to console and file with different message formats and -in differing circumstances. Say you want to log messages with levels of DEBUG -and higher to file, and those messages at level INFO and higher to the console. -Let's also assume that the file should contain timestamps, but the console -messages should not. Here's how you can achieve this:: - - import logging - - # set up logging to file - see previous section for more details - logging.basicConfig(level=logging.DEBUG, - format='%(asctime)s %(name)-12s %(levelname)-8s %(message)s', - datefmt='%m-%d %H:%M', - filename='/temp/myapp.log', - filemode='w') - # define a Handler which writes INFO messages or higher to the sys.stderr - console = logging.StreamHandler() - console.setLevel(logging.INFO) - # set a format which is simpler for console use - formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s') - # tell the handler to use this format - console.setFormatter(formatter) - # add the handler to the root logger - logging.getLogger('').addHandler(console) - - # Now, we can log to the root logger, or any other logger. First the root... - logging.info('Jackdaws love my big sphinx of quartz.') - - # Now, define a couple of other loggers which might represent areas in your - # application: - - logger1 = logging.getLogger('myapp.area1') - logger2 = logging.getLogger('myapp.area2') - - logger1.debug('Quick zephyrs blow, vexing daft Jim.') - logger1.info('How quickly daft jumping zebras vex.') - logger2.warning('Jail zesty vixen who grabbed pay from quack.') - logger2.error('The five boxing wizards jump quickly.') - -When you run this, on the console you will see :: - - root : INFO Jackdaws love my big sphinx of quartz. - myapp.area1 : INFO How quickly daft jumping zebras vex. - myapp.area2 : WARNING Jail zesty vixen who grabbed pay from quack. 
- myapp.area2 : ERROR The five boxing wizards jump quickly. - -and in the file you will see something like :: - - 10-22 22:19 root INFO Jackdaws love my big sphinx of quartz. - 10-22 22:19 myapp.area1 DEBUG Quick zephyrs blow, vexing daft Jim. - 10-22 22:19 myapp.area1 INFO How quickly daft jumping zebras vex. - 10-22 22:19 myapp.area2 WARNING Jail zesty vixen who grabbed pay from quack. - 10-22 22:19 myapp.area2 ERROR The five boxing wizards jump quickly. - -As you can see, the DEBUG message only shows up in the file. The other messages -are sent to both destinations. - -This example uses console and file handlers, but you can use any number and -combination of handlers you choose. - -.. _logging-exceptions: - -Exceptions raised during logging --------------------------------- - -The logging package is designed to swallow exceptions which occur while logging -in production. This is so that errors which occur while handling logging events -- such as logging misconfiguration, network or other similar errors - do not -cause the application using logging to terminate prematurely. - -:class:`SystemExit` and :class:`KeyboardInterrupt` exceptions are never -swallowed. Other exceptions which occur during the :meth:`emit` method of a -:class:`Handler` subclass are passed to its :meth:`handleError` method. - -The default implementation of :meth:`handleError` in :class:`Handler` checks -to see if a module-level variable, :data:`raiseExceptions`, is set. If set, a -traceback is printed to :data:`sys.stderr`. If not set, the exception is swallowed. - -**Note:** The default value of :data:`raiseExceptions` is ``True``. This is because -during development, you typically want to be notified of any exceptions that -occur. It's advised that you set :data:`raiseExceptions` to ``False`` for production -usage. - -.. 
_context-info: - -Adding contextual information to your logging output ----------------------------------------------------- - -Sometimes you want logging output to contain contextual information in -addition to the parameters passed to the logging call. For example, in a -networked application, it may be desirable to log client-specific information -in the log (e.g. remote client's username, or IP address). Although you could -use the *extra* parameter to achieve this, it's not always convenient to pass -the information in this way. While it might be tempting to create -:class:`Logger` instances on a per-connection basis, this is not a good idea -because these instances are not garbage collected. While this is not a problem -in practice, when the number of :class:`Logger` instances is dependent on the -level of granularity you want to use in logging an application, it could -be hard to manage if the number of :class:`Logger` instances becomes -effectively unbounded. - - -Using LoggerAdapters to impart contextual information -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -An easy way in which you can pass contextual information to be output along -with logging event information is to use the :class:`LoggerAdapter` class. -This class is designed to look like a :class:`Logger`, so that you can call -:meth:`debug`, :meth:`info`, :meth:`warning`, :meth:`error`, -:meth:`exception`, :meth:`critical` and :meth:`log`. These methods have the -same signatures as their counterparts in :class:`Logger`, so you can use the -two types of instances interchangeably. - -When you create an instance of :class:`LoggerAdapter`, you pass it a -:class:`Logger` instance and a dict-like object which contains your contextual -information. 
When you call one of the logging methods on an instance of -:class:`LoggerAdapter`, it delegates the call to the underlying instance of -:class:`Logger` passed to its constructor, and arranges to pass the contextual -information in the delegated call. Here's a snippet from the code of -:class:`LoggerAdapter`:: - - def debug(self, msg, *args, **kwargs): - """ - Delegate a debug call to the underlying logger, after adding - contextual information from this adapter instance. - """ - msg, kwargs = self.process(msg, kwargs) - self.logger.debug(msg, *args, **kwargs) - -The :meth:`process` method of :class:`LoggerAdapter` is where the contextual -information is added to the logging output. It's passed the message and -keyword arguments of the logging call, and it passes back (potentially) -modified versions of these to use in the call to the underlying logger. The -default implementation of this method leaves the message alone, but inserts -an "extra" key in the keyword argument whose value is the dict-like object -passed to the constructor. Of course, if you had passed an "extra" keyword -argument in the call to the adapter, it will be silently overwritten. - -The advantage of using "extra" is that the values in the dict-like object are -merged into the :class:`LogRecord` instance's __dict__, allowing you to use -customized strings with your :class:`Formatter` instances which know about -the keys of the dict-like object. If you need a different method, e.g. if you -want to prepend or append the contextual information to the message string, -you just need to subclass :class:`LoggerAdapter` and override :meth:`process` -to do what you need. 
Here's an example script which uses this class, which -also illustrates what dict-like behaviour is needed from an arbitrary -"dict-like" object for use in the constructor:: - - import logging - - class ConnInfo: - """ - An example class which shows how an arbitrary class can be used as - the 'extra' context information repository passed to a LoggerAdapter. - """ - - def __getitem__(self, name): - """ - To allow this instance to look like a dict. - """ - from random import choice - if name == "ip": - result = choice(["127.0.0.1", "192.168.0.1"]) - elif name == "user": - result = choice(["jim", "fred", "sheila"]) - else: - result = self.__dict__.get(name, "?") - return result - - def __iter__(self): - """ - To allow iteration over keys, which will be merged into - the LogRecord dict before formatting and output. - """ - keys = ["ip", "user"] - keys.extend(self.__dict__.keys()) - return keys.__iter__() - - if __name__ == "__main__": - from random import choice - levels = (logging.DEBUG, logging.INFO, logging.WARNING, logging.ERROR, logging.CRITICAL) - a1 = logging.LoggerAdapter(logging.getLogger("a.b.c"), - { "ip" : "123.231.231.123", "user" : "sheila" }) - logging.basicConfig(level=logging.DEBUG, - format="%(asctime)-15s %(name)-5s %(levelname)-8s IP: %(ip)-15s User: %(user)-8s %(message)s") - a1.debug("A debug message") - a1.info("An info message with %s", "some parameters") - a2 = logging.LoggerAdapter(logging.getLogger("d.e.f"), ConnInfo()) - for x in range(10): - lvl = choice(levels) - lvlname = logging.getLevelName(lvl) - a2.log(lvl, "A message at %s level with %d %s", lvlname, 2, "parameters") - -When this script is run, the output should look something like this:: - - 2008-01-18 14:49:54,023 a.b.c DEBUG IP: 123.231.231.123 User: sheila A debug message - 2008-01-18 14:49:54,023 a.b.c INFO IP: 123.231.231.123 User: sheila An info message with some parameters - 2008-01-18 14:49:54,023 d.e.f CRITICAL IP: 192.168.0.1 User: jim A message at CRITICAL level with 2 
parameters - 2008-01-18 14:49:54,033 d.e.f INFO IP: 192.168.0.1 User: jim A message at INFO level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f ERROR IP: 127.0.0.1 User: fred A message at ERROR level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f ERROR IP: 127.0.0.1 User: sheila A message at ERROR level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: jim A message at WARNING level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f INFO IP: 192.168.0.1 User: fred A message at INFO level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f WARNING IP: 127.0.0.1 User: jim A message at WARNING level with 2 parameters - -.. versionadded:: 2.6 - -The :class:`LoggerAdapter` class was not present in previous versions. - -.. _filters-contextual: - -Using Filters to impart contextual information -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -You can also add contextual information to log output using a user-defined -:class:`Filter`. ``Filter`` instances are allowed to modify the ``LogRecords`` -passed to them, including adding additional attributes which can then be output -using a suitable format string, or if needed a custom :class:`Formatter`. - -For example in a web application, the request being processed (or at least, -the interesting parts of it) can be stored in a threadlocal -(:class:`threading.local`) variable, and then accessed from a ``Filter`` to -add, say, information from the request - say, the remote IP address and remote -user's username - to the ``LogRecord``, using the attribute names 'ip' and -'user' as in the ``LoggerAdapter`` example above. 
In that case, the same format -string can be used to get similar output to that shown above. Here's an example -script:: - - import logging - from random import choice - - class ContextFilter(logging.Filter): - """ - This is a filter which injects contextual information into the log. - - Rather than use actual contextual information, we just use random - data in this demo. - """ - - USERS = ['jim', 'fred', 'sheila'] - IPS = ['123.231.231.123', '127.0.0.1', '192.168.0.1'] - - def filter(self, record): - - record.ip = choice(ContextFilter.IPS) - record.user = choice(ContextFilter.USERS) - return True - - if __name__ == "__main__": - levels = (logging.DEBUG, logging.INFO, logging.WARNING, logging.ERROR, logging.CRITICAL) - a1 = logging.LoggerAdapter(logging.getLogger("a.b.c"), - { "ip" : "123.231.231.123", "user" : "sheila" }) - logging.basicConfig(level=logging.DEBUG, - format="%(asctime)-15s %(name)-5s %(levelname)-8s IP: %(ip)-15s User: %(user)-8s %(message)s") - a1 = logging.getLogger("a.b.c") - a2 = logging.getLogger("d.e.f") - - f = ContextFilter() - a1.addFilter(f) - a2.addFilter(f) - a1.debug("A debug message") - a1.info("An info message with %s", "some parameters") - for x in range(10): - lvl = choice(levels) - lvlname = logging.getLevelName(lvl) - a2.log(lvl, "A message at %s level with %d %s", lvlname, 2, "parameters") - -which, when run, produces something like:: - - 2010-09-06 22:38:15,292 a.b.c DEBUG IP: 123.231.231.123 User: fred A debug message - 2010-09-06 22:38:15,300 a.b.c INFO IP: 192.168.0.1 User: sheila An info message with some parameters - 2010-09-06 22:38:15,300 d.e.f CRITICAL IP: 127.0.0.1 User: sheila A message at CRITICAL level with 2 parameters - 2010-09-06 22:38:15,300 d.e.f ERROR IP: 127.0.0.1 User: jim A message at ERROR level with 2 parameters - 2010-09-06 22:38:15,300 d.e.f DEBUG IP: 127.0.0.1 User: sheila A message at DEBUG level with 2 parameters - 2010-09-06 22:38:15,300 d.e.f ERROR IP: 123.231.231.123 User: fred A message at ERROR 
level with 2 parameters - 2010-09-06 22:38:15,300 d.e.f CRITICAL IP: 192.168.0.1 User: jim A message at CRITICAL level with 2 parameters - 2010-09-06 22:38:15,300 d.e.f CRITICAL IP: 127.0.0.1 User: sheila A message at CRITICAL level with 2 parameters - 2010-09-06 22:38:15,300 d.e.f DEBUG IP: 192.168.0.1 User: jim A message at DEBUG level with 2 parameters - 2010-09-06 22:38:15,301 d.e.f ERROR IP: 127.0.0.1 User: sheila A message at ERROR level with 2 parameters - 2010-09-06 22:38:15,301 d.e.f DEBUG IP: 123.231.231.123 User: fred A message at DEBUG level with 2 parameters - 2010-09-06 22:38:15,301 d.e.f INFO IP: 123.231.231.123 User: fred A message at INFO level with 2 parameters - - -.. _multiple-processes: - -Logging to a single file from multiple processes ------------------------------------------------- - -Although logging is thread-safe, and logging to a single file from multiple -threads in a single process *is* supported, logging to a single file from -*multiple processes* is *not* supported, because there is no standard way to -serialize access to a single file across multiple processes in Python. If you -need to log to a single file from multiple processes, the best way of doing -this is to have all the processes log to a :class:`SocketHandler`, and have a -separate process which implements a socket server which reads from the socket -and logs to file. (If you prefer, you can dedicate one thread in one of the -existing processes to perform this function.) The following section documents -this approach in more detail and includes a working socket receiver which can -be used as a starting point for you to adapt in your own applications. - -If you are using a recent version of Python which includes the -:mod:`multiprocessing` module, you can write your own handler which uses the -:class:`Lock` class from this module to serialize access to the file from -your processes. 
The existing :class:`FileHandler` and subclasses do not make -use of :mod:`multiprocessing` at present, though they may do so in the future. -Note that at present, the :mod:`multiprocessing` module does not provide -working lock functionality on all platforms (see -http://bugs.python.org/issue3770). - -.. _network-logging: - -Sending and receiving logging events across a network ------------------------------------------------------ - -Let's say you want to send logging events across a network, and handle them at -the receiving end. A simple way of doing this is attaching a -:class:`SocketHandler` instance to the root logger at the sending end:: - - import logging, logging.handlers - - rootLogger = logging.getLogger('') - rootLogger.setLevel(logging.DEBUG) - socketHandler = logging.handlers.SocketHandler('localhost', - logging.handlers.DEFAULT_TCP_LOGGING_PORT) - # don't bother with a formatter, since a socket handler sends the event as - # an unformatted pickle - rootLogger.addHandler(socketHandler) - - # Now, we can log to the root logger, or any other logger. First the root... - logging.info('Jackdaws love my big sphinx of quartz.') - - # Now, define a couple of other loggers which might represent areas in your - # application: - - logger1 = logging.getLogger('myapp.area1') - logger2 = logging.getLogger('myapp.area2') - - logger1.debug('Quick zephyrs blow, vexing daft Jim.') - logger1.info('How quickly daft jumping zebras vex.') - logger2.warning('Jail zesty vixen who grabbed pay from quack.') - logger2.error('The five boxing wizards jump quickly.') - -At the receiving end, you can set up a receiver using the :mod:`SocketServer` -module. Here is a basic working example:: - - import cPickle - import logging - import logging.handlers - import SocketServer - import struct - - - class LogRecordStreamHandler(SocketServer.StreamRequestHandler): - """Handler for a streaming logging request. 
- - This basically logs the record using whatever logging policy is - configured locally. - """ - - def handle(self): - """ - Handle multiple requests - each expected to be a 4-byte length, - followed by the LogRecord in pickle format. Logs the record - according to whatever policy is configured locally. - """ - while 1: - chunk = self.connection.recv(4) - if len(chunk) < 4: - break - slen = struct.unpack(">L", chunk)[0] - chunk = self.connection.recv(slen) - while len(chunk) < slen: - chunk = chunk + self.connection.recv(slen - len(chunk)) - obj = self.unPickle(chunk) - record = logging.makeLogRecord(obj) - self.handleLogRecord(record) - - def unPickle(self, data): - return cPickle.loads(data) - - def handleLogRecord(self, record): - # if a name is specified, we use the named logger rather than the one - # implied by the record. - if self.server.logname is not None: - name = self.server.logname - else: - name = record.name - logger = logging.getLogger(name) - # N.B. EVERY record gets logged. This is because Logger.handle - # is normally called AFTER logger-level filtering. If you want - # to do filtering, do it at the client end to save wasting - # cycles and network bandwidth! - logger.handle(record) - - class LogRecordSocketReceiver(SocketServer.ThreadingTCPServer): - """simple TCP socket-based logging receiver suitable for testing. 
- """ - - allow_reuse_address = 1 - - def __init__(self, host='localhost', - port=logging.handlers.DEFAULT_TCP_LOGGING_PORT, - handler=LogRecordStreamHandler): - SocketServer.ThreadingTCPServer.__init__(self, (host, port), handler) - self.abort = 0 - self.timeout = 1 - self.logname = None - - def serve_until_stopped(self): - import select - abort = 0 - while not abort: - rd, wr, ex = select.select([self.socket.fileno()], - [], [], - self.timeout) - if rd: - self.handle_request() - abort = self.abort - - def main(): - logging.basicConfig( - format="%(relativeCreated)5d %(name)-15s %(levelname)-8s %(message)s") - tcpserver = LogRecordSocketReceiver() - print "About to start TCP server..." - tcpserver.serve_until_stopped() - - if __name__ == "__main__": - main() - -First run the server, and then the client. On the client side, nothing is -printed on the console; on the server side, you should see something like:: - - About to start TCP server... - 59 root INFO Jackdaws love my big sphinx of quartz. - 59 myapp.area1 DEBUG Quick zephyrs blow, vexing daft Jim. - 69 myapp.area1 INFO How quickly daft jumping zebras vex. - 69 myapp.area2 WARNING Jail zesty vixen who grabbed pay from quack. - 69 myapp.area2 ERROR The five boxing wizards jump quickly. - -Note that there are some security issues with pickle in some scenarios. If -these affect you, you can use an alternative serialization scheme by overriding -the :meth:`makePickle` method and implementing your alternative there, as -well as adapting the above script to use your alternative serialization. - -.. _arbitrary-object-messages: - -Using arbitrary objects as messages ------------------------------------ - -In the preceding sections and examples, it has been assumed that the message -passed when logging the event is a string. However, this is not the only -possibility. 
You can pass an arbitrary object as a message, and its -:meth:`__str__` method will be called when the logging system needs to convert -it to a string representation. In fact, if you want to, you can avoid -computing a string representation altogether - for example, the -:class:`SocketHandler` emits an event by pickling it and sending it over the -wire. - -Optimization ------------- - -Formatting of message arguments is deferred until it cannot be avoided. -However, computing the arguments passed to the logging method can also be -expensive, and you may want to avoid doing it if the logger will just throw -away your event. To decide what to do, you can call the :meth:`isEnabledFor` -method which takes a level argument and returns true if the event would be -created by the Logger for that level of call. You can write code like this:: - - if logger.isEnabledFor(logging.DEBUG): - logger.debug("Message with %s, %s", expensive_func1(), - expensive_func2()) - -so that if the logger's threshold is set above ``DEBUG``, the calls to -:func:`expensive_func1` and :func:`expensive_func2` are never made. - -There are other optimizations which can be made for specific applications which -need more precise control over what logging information is collected. Here's a -list of things you can do to avoid processing during logging which you don't -need: - -+-----------------------------------------------+----------------------------------------+ -| What you don't want to collect | How to avoid collecting it | -+===============================================+========================================+ -| Information about where calls were made from. | Set ``logging._srcfile`` to ``None``. | -+-----------------------------------------------+----------------------------------------+ -| Threading information. | Set ``logging.logThreads`` to ``0``. | -+-----------------------------------------------+----------------------------------------+ -| Process information. 
| Set ``logging.logProcesses`` to ``0``. | -+-----------------------------------------------+----------------------------------------+ - -Also note that the core logging module only includes the basic handlers. If -you don't import :mod:`logging.handlers` and :mod:`logging.config`, they won't -take up any memory. - -.. _handler: - -Handler Objects ---------------- - -Handlers have the following attributes and methods. Note that :class:`Handler` -is never instantiated directly; this class acts as a base for more useful -subclasses. However, the :meth:`__init__` method in subclasses needs to call -:meth:`Handler.__init__`. - - -.. method:: Handler.__init__(level=NOTSET) - - Initializes the :class:`Handler` instance by setting its level, setting the list - of filters to the empty list and creating a lock (using :meth:`createLock`) for - serializing access to an I/O mechanism. - - -.. method:: Handler.createLock() - - Initializes a thread lock which can be used to serialize access to underlying - I/O functionality which may not be threadsafe. - - -.. method:: Handler.acquire() - - Acquires the thread lock created with :meth:`createLock`. - - -.. method:: Handler.release() - - Releases the thread lock acquired with :meth:`acquire`. - - -.. method:: Handler.setLevel(lvl) - - Sets the threshold for this handler to *lvl*. Logging messages which are less - severe than *lvl* will be ignored. When a handler is created, the level is set - to :const:`NOTSET` (which causes all messages to be processed). - - -.. method:: Handler.setFormatter(form) - - Sets the :class:`Formatter` for this handler to *form*. - - -.. method:: Handler.addFilter(filt) - - Adds the specified filter *filt* to this handler. - - -.. method:: Handler.removeFilter(filt) - - Removes the specified filter *filt* from this handler. - - -.. method:: Handler.filter(record) - - Applies this handler's filters to the record and returns a true value if the - record is to be processed. - - -.. 
method:: Handler.flush() - - Ensure all logging output has been flushed. This version does nothing and is - intended to be implemented by subclasses. - - -.. method:: Handler.close() - - Tidy up any resources used by the handler. This version does no output but - removes the handler from an internal list of handlers which is closed when - :func:`shutdown` is called. Subclasses should ensure that this gets called - from overridden :meth:`close` methods. - - -.. method:: Handler.handle(record) - - Conditionally emits the specified logging record, depending on filters which may - have been added to the handler. Wraps the actual emission of the record with - acquisition/release of the I/O thread lock. - - -.. method:: Handler.handleError(record) - - This method should be called from handlers when an exception is encountered - during an :meth:`emit` call. By default it does nothing, which means that - exceptions get silently ignored. This is what is mostly wanted for a logging - system - most users will not care about errors in the logging system, they are - more interested in application errors. You could, however, replace this with a - custom handler if you wish. The specified record is the one which was being - processed when the exception occurred. - - -.. method:: Handler.format(record) - - Do formatting for a record - if a formatter is set, use it. Otherwise, use the - default formatter for the module. - - -.. method:: Handler.emit(record) - - Do whatever it takes to actually log the specified logging record. This version - is intended to be implemented by subclasses and so raises a - :exc:`NotImplementedError`. - - -.. _stream-handler: - -StreamHandler -^^^^^^^^^^^^^ - -The :class:`StreamHandler` class, located in the core :mod:`logging` package, -sends logging output to streams such as *sys.stdout*, *sys.stderr* or any -file-like object (or, more precisely, any object which supports :meth:`write` -and :meth:`flush` methods). - - -.. currentmodule:: logging - -.. 
class:: StreamHandler([stream]) - - Returns a new instance of the :class:`StreamHandler` class. If *stream* is - specified, the instance will use it for logging output; otherwise, *sys.stderr* - will be used. - - .. versionchanged:: 2.7 - The ``stream`` parameter was called ``strm`` in earlier versions. - - .. method:: emit(record) - - If a formatter is specified, it is used to format the record. The record - is then written to the stream with a trailing newline. If exception - information is present, it is formatted using - :func:`traceback.print_exception` and appended to the stream. - - - .. method:: flush() - - Flushes the stream by calling its :meth:`flush` method. Note that the - :meth:`close` method is inherited from :class:`Handler` and so does - no output, so an explicit :meth:`flush` call may be needed at times. - - -.. _file-handler: - -FileHandler -^^^^^^^^^^^ - -The :class:`FileHandler` class, located in the core :mod:`logging` package, -sends logging output to a disk file. It inherits the output functionality from -:class:`StreamHandler`. - - -.. class:: FileHandler(filename[, mode[, encoding[, delay]]]) - - Returns a new instance of the :class:`FileHandler` class. The specified file is - opened and used as the stream for logging. If *mode* is not specified, - :const:`'a'` is used. If *encoding* is not *None*, it is used to open the file - with that encoding. If *delay* is true, then file opening is deferred until the - first call to :meth:`emit`. By default, the file grows indefinitely. - - .. versionchanged:: 2.6 - *delay* was added. - - .. method:: close() - - Closes the file. - - - .. method:: emit(record) - - Outputs the record to the file. - - -.. _null-handler: - -NullHandler -^^^^^^^^^^^ - -.. versionadded:: 2.7 - -The :class:`NullHandler` class, located in the core :mod:`logging` package, -does not do any formatting or output. It is essentially a "no-op" handler -for use by library developers. - -.. 
class:: NullHandler() - - Returns a new instance of the :class:`NullHandler` class. - - .. method:: emit(record) - - This method does nothing. - - .. method:: handle(record) - - This method does nothing. - - .. method:: createLock() - - This method returns `None` for the lock, since there is no - underlying I/O to which access needs to be serialized. - - -See :ref:`library-config` for more information on how to use -:class:`NullHandler`. - -.. _watched-file-handler: - -WatchedFileHandler -^^^^^^^^^^^^^^^^^^ - -.. versionadded:: 2.6 - -.. currentmodule:: logging.handlers - -The :class:`WatchedFileHandler` class, located in the :mod:`logging.handlers` -module, is a :class:`FileHandler` which watches the file it is logging to. If -the file changes, it is closed and reopened using the file name. - -A file change can happen because of usage of programs such as *newsyslog* and -*logrotate* which perform log file rotation. This handler, intended for use -under Unix/Linux, watches the file to see if it has changed since the last emit. -(A file is deemed to have changed if its device or inode have changed.) If the -file has changed, the old file stream is closed, and the file opened to get a -new stream. - -This handler is not appropriate for use under Windows, because under Windows -open log files cannot be moved or renamed - logging opens the files with -exclusive locks - and so there is no need for such a handler. Furthermore, -*ST_INO* is not supported under Windows; :func:`stat` always returns zero for -this value. - - -.. class:: WatchedFileHandler(filename[,mode[, encoding[, delay]]]) - - Returns a new instance of the :class:`WatchedFileHandler` class. The specified - file is opened and used as the stream for logging. If *mode* is not specified, - :const:`'a'` is used. If *encoding* is not *None*, it is used to open the file - with that encoding. If *delay* is true, then file opening is deferred until the - first call to :meth:`emit`. 
By default, the file grows indefinitely. - - .. versionchanged:: 2.6 - *delay* was added. - - - .. method:: emit(record) - - Outputs the record to the file, but first checks to see if the file has - changed. If it has, the existing stream is flushed and closed and the - file opened again, before outputting the record to the file. - -.. _rotating-file-handler: - -RotatingFileHandler -^^^^^^^^^^^^^^^^^^^ - -The :class:`RotatingFileHandler` class, located in the :mod:`logging.handlers` -module, supports rotation of disk log files. - - -.. class:: RotatingFileHandler(filename[, mode[, maxBytes[, backupCount[, encoding[, delay]]]]]) - - Returns a new instance of the :class:`RotatingFileHandler` class. The specified - file is opened and used as the stream for logging. If *mode* is not specified, - ``'a'`` is used. If *encoding* is not *None*, it is used to open the file - with that encoding. If *delay* is true, then file opening is deferred until the - first call to :meth:`emit`. By default, the file grows indefinitely. - - You can use the *maxBytes* and *backupCount* values to allow the file to - :dfn:`rollover` at a predetermined size. When the size is about to be exceeded, - the file is closed and a new file is silently opened for output. Rollover occurs - whenever the current log file is nearly *maxBytes* in length; if *maxBytes* is - zero, rollover never occurs. If *backupCount* is non-zero, the system will save - old log files by appending the extensions ".1", ".2" etc., to the filename. For - example, with a *backupCount* of 5 and a base file name of :file:`app.log`, you - would get :file:`app.log`, :file:`app.log.1`, :file:`app.log.2`, up to - :file:`app.log.5`. The file being written to is always :file:`app.log`. When - this file is filled, it is closed and renamed to :file:`app.log.1`, and if files - :file:`app.log.1`, :file:`app.log.2`, etc. exist, then they are renamed to - :file:`app.log.2`, :file:`app.log.3` etc. respectively. - - .. 
versionchanged:: 2.6 - *delay* was added. - - .. method:: doRollover() - - Does a rollover, as described above. - - - .. method:: emit(record) - - Outputs the record to the file, catering for rollover as described - previously. - -.. _timed-rotating-file-handler: - -TimedRotatingFileHandler -^^^^^^^^^^^^^^^^^^^^^^^^ - -The :class:`TimedRotatingFileHandler` class, located in the -:mod:`logging.handlers` module, supports rotation of disk log files at certain -timed intervals. - - -.. class:: TimedRotatingFileHandler(filename [,when [,interval [,backupCount[, encoding[, delay[, utc]]]]]]) - - Returns a new instance of the :class:`TimedRotatingFileHandler` class. The - specified file is opened and used as the stream for logging. On rotating it also - sets the filename suffix. Rotating happens based on the product of *when* and - *interval*. - - You can use the *when* to specify the type of *interval*. The list of possible - values is below. Note that they are not case sensitive. - - +----------------+-----------------------+ - | Value | Type of interval | - +================+=======================+ - | ``'S'`` | Seconds | - +----------------+-----------------------+ - | ``'M'`` | Minutes | - +----------------+-----------------------+ - | ``'H'`` | Hours | - +----------------+-----------------------+ - | ``'D'`` | Days | - +----------------+-----------------------+ - | ``'W'`` | Week day (0=Monday) | - +----------------+-----------------------+ - | ``'midnight'`` | Roll over at midnight | - +----------------+-----------------------+ - - The system will save old log files by appending extensions to the filename. - The extensions are date-and-time based, using the strftime format - ``%Y-%m-%d_%H-%M-%S`` or a leading portion thereof, depending on the - rollover interval. 
- - When computing the next rollover time for the first time (when the handler - is created), the last modification time of an existing log file, or else - the current time, is used to compute when the next rotation will occur. - - If the *utc* argument is true, times in UTC will be used; otherwise - local time is used. - - If *backupCount* is nonzero, at most *backupCount* files - will be kept, and if more would be created when rollover occurs, the oldest - one is deleted. The deletion logic uses the interval to determine which - files to delete, so changing the interval may leave old files lying around. - - If *delay* is true, then file opening is deferred until the first call to - :meth:`emit`. - - .. versionchanged:: 2.6 - *delay* was added. - - .. method:: doRollover() - - Does a rollover, as described above. - - - .. method:: emit(record) - - Outputs the record to the file, catering for rollover as described above. - - -.. _socket-handler: - -SocketHandler -^^^^^^^^^^^^^ - -The :class:`SocketHandler` class, located in the :mod:`logging.handlers` module, -sends logging output to a network socket. The base class uses a TCP socket. - - -.. class:: SocketHandler(host, port) - - Returns a new instance of the :class:`SocketHandler` class intended to - communicate with a remote machine whose address is given by *host* and *port*. - - - .. method:: close() - - Closes the socket. - - - .. method:: emit() - - Pickles the record's attribute dictionary and writes it to the socket in - binary format. If there is an error with the socket, silently drops the - packet. If the connection was previously lost, re-establishes the - connection. To unpickle the record at the receiving end into a - :class:`LogRecord`, use the :func:`makeLogRecord` function. - - - .. method:: handleError() - - Handles an error which has occurred during :meth:`emit`. The most likely - cause is a lost connection. Closes the socket so that we can retry on the - next event. - - - .. 
method:: makeSocket() - - This is a factory method which allows subclasses to define the precise - type of socket they want. The default implementation creates a TCP socket - (:const:`socket.SOCK_STREAM`). - - - .. method:: makePickle(record) - - Pickles the record's attribute dictionary in binary format with a length - prefix, and returns it ready for transmission across the socket. - - Note that pickles aren't completely secure. If you are concerned about - security, you may want to override this method to implement a more secure - mechanism. For example, you can sign pickles using HMAC and then verify - them on the receiving end, or alternatively you can disable unpickling of - global objects on the receiving end. - - .. method:: send(packet) - - Send a pickled string *packet* to the socket. This function allows for - partial sends which can happen when the network is busy. - - -.. _datagram-handler: - -DatagramHandler -^^^^^^^^^^^^^^^ - -The :class:`DatagramHandler` class, located in the :mod:`logging.handlers` -module, inherits from :class:`SocketHandler` to support sending logging messages -over UDP sockets. - - -.. class:: DatagramHandler(host, port) - - Returns a new instance of the :class:`DatagramHandler` class intended to - communicate with a remote machine whose address is given by *host* and *port*. - - - .. method:: emit() - - Pickles the record's attribute dictionary and writes it to the socket in - binary format. If there is an error with the socket, silently drops the - packet. To unpickle the record at the receiving end into a - :class:`LogRecord`, use the :func:`makeLogRecord` function. - - - .. method:: makeSocket() - - The factory method of :class:`SocketHandler` is here overridden to create - a UDP socket (:const:`socket.SOCK_DGRAM`). - - - .. method:: send(s) - - Send a pickled string to a socket. - - -.. 
_syslog-handler: - -SysLogHandler -^^^^^^^^^^^^^ - -The :class:`SysLogHandler` class, located in the :mod:`logging.handlers` module, -supports sending logging messages to a remote or local Unix syslog. - - -.. class:: SysLogHandler([address[, facility[, socktype]]]) - - Returns a new instance of the :class:`SysLogHandler` class intended to - communicate with a remote Unix machine whose address is given by *address* in - the form of a ``(host, port)`` tuple. If *address* is not specified, - ``('localhost', 514)`` is used. The address is used to open a socket. An - alternative to providing a ``(host, port)`` tuple is providing an address as a - string, for example "/dev/log". In this case, a Unix domain socket is used to - send the message to the syslog. If *facility* is not specified, - :const:`LOG_USER` is used. The type of socket opened depends on the - *socktype* argument, which defaults to :const:`socket.SOCK_DGRAM` and thus - opens a UDP socket. To open a TCP socket (for use with the newer syslog - daemons such as rsyslog), specify a value of :const:`socket.SOCK_STREAM`. - - .. versionchanged:: 2.7 - *socktype* was added. - - - .. method:: close() - - Closes the socket to the remote host. - - - .. method:: emit(record) - - The record is formatted, and then sent to the syslog server. If exception - information is present, it is *not* sent to the server. - - - .. method:: encodePriority(facility, priority) - - Encodes the facility and priority into an integer. You can pass in strings - or integers - if strings are passed, internal mapping dictionaries are - used to convert them to integers. - - The symbolic ``LOG_`` values are defined in :class:`SysLogHandler` and - mirror the values defined in the ``sys/syslog.h`` header file. 
- - **Priorities** - - +--------------------------+---------------+ - | Name (string) | Symbolic value| - +==========================+===============+ - | ``alert`` | LOG_ALERT | - +--------------------------+---------------+ - | ``crit`` or ``critical`` | LOG_CRIT | - +--------------------------+---------------+ - | ``debug`` | LOG_DEBUG | - +--------------------------+---------------+ - | ``emerg`` or ``panic`` | LOG_EMERG | - +--------------------------+---------------+ - | ``err`` or ``error`` | LOG_ERR | - +--------------------------+---------------+ - | ``info`` | LOG_INFO | - +--------------------------+---------------+ - | ``notice`` | LOG_NOTICE | - +--------------------------+---------------+ - | ``warn`` or ``warning`` | LOG_WARNING | - +--------------------------+---------------+ - - **Facilities** - - +---------------+---------------+ - | Name (string) | Symbolic value| - +===============+===============+ - | ``auth`` | LOG_AUTH | - +---------------+---------------+ - | ``authpriv`` | LOG_AUTHPRIV | - +---------------+---------------+ - | ``cron`` | LOG_CRON | - +---------------+---------------+ - | ``daemon`` | LOG_DAEMON | - +---------------+---------------+ - | ``ftp`` | LOG_FTP | - +---------------+---------------+ - | ``kern`` | LOG_KERN | - +---------------+---------------+ - | ``lpr`` | LOG_LPR | - +---------------+---------------+ - | ``mail`` | LOG_MAIL | - +---------------+---------------+ - | ``news`` | LOG_NEWS | - +---------------+---------------+ - | ``syslog`` | LOG_SYSLOG | - +---------------+---------------+ - | ``user`` | LOG_USER | - +---------------+---------------+ - | ``uucp`` | LOG_UUCP | - +---------------+---------------+ - | ``local0`` | LOG_LOCAL0 | - +---------------+---------------+ - | ``local1`` | LOG_LOCAL1 | - +---------------+---------------+ - | ``local2`` | LOG_LOCAL2 | - +---------------+---------------+ - | ``local3`` | LOG_LOCAL3 | - +---------------+---------------+ - | ``local4`` | LOG_LOCAL4 | - 
+---------------+---------------+ - | ``local5`` | LOG_LOCAL5 | - +---------------+---------------+ - | ``local6`` | LOG_LOCAL6 | - +---------------+---------------+ - | ``local7`` | LOG_LOCAL7 | - +---------------+---------------+ - - .. method:: mapPriority(levelname) - - Maps a logging level name to a syslog priority name. - You may need to override this if you are using custom levels, or - if the default algorithm is not suitable for your needs. The - default algorithm maps ``DEBUG``, ``INFO``, ``WARNING``, ``ERROR`` and - ``CRITICAL`` to the equivalent syslog names, and all other level - names to "warning". - -.. _nt-eventlog-handler: - -NTEventLogHandler -^^^^^^^^^^^^^^^^^ - -The :class:`NTEventLogHandler` class, located in the :mod:`logging.handlers` -module, supports sending logging messages to a local Windows NT, Windows 2000 or -Windows XP event log. Before you can use it, you need Mark Hammond's Win32 -extensions for Python installed. - - -.. class:: NTEventLogHandler(appname[, dllname[, logtype]]) - - Returns a new instance of the :class:`NTEventLogHandler` class. The *appname* is - used to define the application name as it appears in the event log. An - appropriate registry entry is created using this name. The *dllname* should give - the fully qualified pathname of a .dll or .exe which contains message - definitions to hold in the log (if not specified, ``'win32service.pyd'`` is used - - this is installed with the Win32 extensions and contains some basic - placeholder message definitions. Note that use of these placeholders will make - your event logs big, as the entire message source is held in the log. If you - want slimmer logs, you have to pass in the name of your own .dll or .exe which - contains the message definitions you want to use in the event log). The - *logtype* is one of ``'Application'``, ``'System'`` or ``'Security'``, and - defaults to ``'Application'``. - - - .. 
method:: close() - - At this point, you can remove the application name from the registry as a - source of event log entries. However, if you do this, you will not be able - to see the events as you intended in the Event Log Viewer - it needs to be - able to access the registry to get the .dll name. The current version does - not do this. - - - .. method:: emit(record) - - Determines the message ID, event category and event type, and then logs - the message in the NT event log. - - - .. method:: getEventCategory(record) - - Returns the event category for the record. Override this if you want to - specify your own categories. This version returns 0. - - - .. method:: getEventType(record) - - Returns the event type for the record. Override this if you want to - specify your own types. This version does a mapping using the handler's - typemap attribute, which is set up in :meth:`__init__` to a dictionary - which contains mappings for :const:`DEBUG`, :const:`INFO`, - :const:`WARNING`, :const:`ERROR` and :const:`CRITICAL`. If you are using - your own levels, you will either need to override this method or place a - suitable dictionary in the handler's *typemap* attribute. - - - .. method:: getMessageID(record) - - Returns the message ID for the record. If you are using your own messages, - you could do this by having the *msg* passed to the logger being an ID - rather than a format string. Then, in here, you could use a dictionary - lookup to get the message ID. This version returns 1, which is the base - message ID in :file:`win32service.pyd`. - -.. _smtp-handler: - -SMTPHandler -^^^^^^^^^^^ - -The :class:`SMTPHandler` class, located in the :mod:`logging.handlers` module, -supports sending logging messages to an email address via SMTP. - - -.. class:: SMTPHandler(mailhost, fromaddr, toaddrs, subject[, credentials]) - - Returns a new instance of the :class:`SMTPHandler` class. The instance is - initialized with the from and to addresses and subject line of the email. 
The - *toaddrs* should be a list of strings. To specify a non-standard SMTP port, use - the (host, port) tuple format for the *mailhost* argument. If you use a string, - the standard SMTP port is used. If your SMTP server requires authentication, you - can specify a (username, password) tuple for the *credentials* argument. - - .. versionchanged:: 2.6 - *credentials* was added. - - - .. method:: emit(record) - - Formats the record and sends it to the specified addressees. - - - .. method:: getSubject(record) - - If you want to specify a subject line which is record-dependent, override - this method. - -.. _memory-handler: - -MemoryHandler -^^^^^^^^^^^^^ - -The :class:`MemoryHandler` class, located in the :mod:`logging.handlers` module, -supports buffering of logging records in memory, periodically flushing them to a -:dfn:`target` handler. Flushing occurs whenever the buffer is full, or when an -event of a certain severity or greater is seen. - -:class:`MemoryHandler` is a subclass of the more general -:class:`BufferingHandler`, which is an abstract class. This buffers logging -records in memory. Whenever each record is added to the buffer, a check is made -by calling :meth:`shouldFlush` to see if the buffer should be flushed. If it -should, then :meth:`flush` is expected to do the needful. - - -.. class:: BufferingHandler(capacity) - - Initializes the handler with a buffer of the specified capacity. - - - .. method:: emit(record) - - Appends the record to the buffer. If :meth:`shouldFlush` returns true, - calls :meth:`flush` to process the buffer. - - - .. method:: flush() - - You can override this to implement custom flushing behavior. This version - just zaps the buffer to empty. - - - .. method:: shouldFlush(record) - - Returns true if the buffer is up to capacity. This method can be - overridden to implement custom flushing strategies. - - -.. 
class:: MemoryHandler(capacity[, flushLevel [, target]]) - - Returns a new instance of the :class:`MemoryHandler` class. The instance is - initialized with a buffer size of *capacity*. If *flushLevel* is not specified, - :const:`ERROR` is used. If no *target* is specified, the target will need to be - set using :meth:`setTarget` before this handler does anything useful. - - - .. method:: close() - - Calls :meth:`flush`, sets the target to :const:`None` and clears the - buffer. - - - .. method:: flush() - - For a :class:`MemoryHandler`, flushing means just sending the buffered - records to the target, if there is one. Override if you want different - behavior. - - - .. method:: setTarget(target) - - Sets the target handler for this handler. - - - .. method:: shouldFlush(record) - - Checks for buffer full or a record at the *flushLevel* or higher. - - -.. _http-handler: - -HTTPHandler -^^^^^^^^^^^ - -The :class:`HTTPHandler` class, located in the :mod:`logging.handlers` module, -supports sending logging messages to a Web server, using either ``GET`` or -``POST`` semantics. - - -.. class:: HTTPHandler(host, url[, method]) - - Returns a new instance of the :class:`HTTPHandler` class. The instance is - initialized with a host address, url and HTTP method. The *host* can be of the - form ``host:port``, should you need to use a specific port number. If no - *method* is specified, ``GET`` is used. - - - .. method:: emit(record) - - Sends the record to the Web server as a percent-encoded dictionary. - - -.. _formatter: - -Formatter Objects ------------------ - -.. currentmodule:: logging - -:class:`Formatter`\ s have the following attributes and methods. They are -responsible for converting a :class:`LogRecord` to (usually) a string which can -be interpreted by either a human or an external system. The base -:class:`Formatter` allows a formatting string to be specified. If none is -supplied, the default value of ``'%(message)s'`` is used. 
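As a minimal sketch of the paragraph above, the following compares the default format string with a custom one. The logger name, level and message are illustrative values, and the record is built by hand with :func:`makeLogRecord` only so that the example needs no handler.

```python
import logging

# Build a record by hand; the attribute values here are illustrative only.
record = logging.makeLogRecord({
    "name": "app.db",
    "levelno": logging.WARNING,
    "levelname": "WARNING",
    "msg": "disk at %d%%",
    "args": (91,),
})

# No format string supplied, so the default '%(message)s' is used.
default_fmt = logging.Formatter()
# A format string that refers to LogRecord attributes by mapping key.
custom_fmt = logging.Formatter("%(levelname)s:%(name)s:%(message)s")

default_text = default_fmt.format(record)
custom_text = custom_fmt.format(record)
print(default_text)
print(custom_text)
```

In both cases the *message* attribute is computed first from ``msg % args``; the format string only decides which other record attributes surround it.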
- -A Formatter can be initialized with a format string which makes use of knowledge -of the :class:`LogRecord` attributes - such as the default value mentioned above -making use of the fact that the user's message and arguments are pre-formatted -into a :class:`LogRecord`'s *message* attribute. This format string contains -standard Python %-style mapping keys. See section :ref:`string-formatting` -for more information on string formatting. - -Currently, the useful mapping keys in a :class:`LogRecord` are: - -+-------------------------+-----------------------------------------------+ -| Format | Description | -+=========================+===============================================+ -| ``%(name)s`` | Name of the logger (logging channel). | -+-------------------------+-----------------------------------------------+ -| ``%(levelno)s`` | Numeric logging level for the message | -| | (:const:`DEBUG`, :const:`INFO`, | -| | :const:`WARNING`, :const:`ERROR`, | -| | :const:`CRITICAL`). | -+-------------------------+-----------------------------------------------+ -| ``%(levelname)s`` | Text logging level for the message | -| | (``'DEBUG'``, ``'INFO'``, ``'WARNING'``, | -| | ``'ERROR'``, ``'CRITICAL'``). | -+-------------------------+-----------------------------------------------+ -| ``%(pathname)s`` | Full pathname of the source file where the | -| | logging call was issued (if available). | -+-------------------------+-----------------------------------------------+ -| ``%(filename)s`` | Filename portion of pathname. | -+-------------------------+-----------------------------------------------+ -| ``%(module)s`` | Module (name portion of filename). | -+-------------------------+-----------------------------------------------+ -| ``%(funcName)s`` | Name of function containing the logging call. | -+-------------------------+-----------------------------------------------+ -| ``%(lineno)d`` | Source line number where the logging call was | -| | issued (if available). 
| -+-------------------------+-----------------------------------------------+ -| ``%(created)f`` | Time when the :class:`LogRecord` was created | -| | (as returned by :func:`time.time`). | -+-------------------------+-----------------------------------------------+ -| ``%(relativeCreated)d`` | Time in milliseconds when the LogRecord was | -| | created, relative to the time the logging | -| | module was loaded. | -+-------------------------+-----------------------------------------------+ -| ``%(asctime)s`` | Human-readable time when the | -| | :class:`LogRecord` was created. By default | -| | this is of the form "2003-07-08 16:49:45,896" | -| | (the numbers after the comma are millisecond | -| | portion of the time). | -+-------------------------+-----------------------------------------------+ -| ``%(msecs)d`` | Millisecond portion of the time when the | -| | :class:`LogRecord` was created. | -+-------------------------+-----------------------------------------------+ -| ``%(thread)d`` | Thread ID (if available). | -+-------------------------+-----------------------------------------------+ -| ``%(threadName)s`` | Thread name (if available). | -+-------------------------+-----------------------------------------------+ -| ``%(process)d`` | Process ID (if available). | -+-------------------------+-----------------------------------------------+ -| ``%(processName)s`` | Process name (if available). | -+-------------------------+-----------------------------------------------+ -| ``%(message)s`` | The logged message, computed as ``msg % | -| | args``. | -+-------------------------+-----------------------------------------------+ - -.. versionchanged:: 2.5 - *funcName* was added. - -.. versionchanged:: 2.6 - *processName* was added. - - -.. class:: Formatter([fmt[, datefmt]]) - - Returns a new instance of the :class:`Formatter` class. 
The instance is - initialized with a format string for the message as a whole, as well as a - format string for the date/time portion of a message. If no *fmt* is - specified, ``'%(message)s'`` is used. If no *datefmt* is specified, the - ISO8601 date format is used. - - .. method:: format(record) - - The record's attribute dictionary is used as the operand to a string - formatting operation. Returns the resulting string. Before formatting the - dictionary, a couple of preparatory steps are carried out. The *message* - attribute of the record is computed using *msg* % *args*. If the - formatting string contains ``'(asctime)'``, :meth:`formatTime` is called - to format the event time. If there is exception information, it is - formatted using :meth:`formatException` and appended to the message. Note - that the formatted exception information is cached in attribute - *exc_text*. This is useful because the exception information can be - pickled and sent across the wire, but you should be careful if you have - more than one :class:`Formatter` subclass which customizes the formatting - of exception information. In this case, you will have to clear the cached - value after a formatter has done its formatting, so that the next - formatter to handle the event doesn't use the cached value but - recalculates it afresh. - - - .. method:: formatTime(record[, datefmt]) - - This method should be called from :meth:`format` by a formatter which - wants to make use of a formatted time. This method can be overridden in - formatters to provide for any specific requirement, but the basic behavior - is as follows: if *datefmt* (a string) is specified, it is used with - :func:`time.strftime` to format the creation time of the - record. Otherwise, the ISO8601 format is used. The resulting string is - returned. - - - .. method:: formatException(exc_info) - - Formats the specified exception information (a standard exception tuple as - returned by :func:`sys.exc_info`) as a string. 
This default implementation - just uses :func:`traceback.print_exception`. The resulting string is - returned. - -.. _filter: - -Filter Objects --------------- - -:class:`Filter`\ s can be used by :class:`Handler`\ s and :class:`Logger`\ s for -more sophisticated filtering than is provided by levels. The base filter class -only allows events which are below a certain point in the logger hierarchy. For -example, a filter initialized with "A.B" will allow events logged by loggers -"A.B", "A.B.C", "A.B.C.D", "A.B.D" etc. but not "A.BB", "B.A.B" etc. If -initialized with the empty string, all events are passed. - - -.. class:: Filter([name]) - - Returns an instance of the :class:`Filter` class. If *name* is specified, it - names a logger which, together with its children, will have its events allowed - through the filter. If *name* is the empty string, allows every event. - - - .. method:: filter(record) - - Is the specified record to be logged? Returns zero for no, nonzero for - yes. If deemed appropriate, the record may be modified in-place by this - method. - -Note that filters attached to handlers are consulted whenever an event is -emitted by the handler, whereas filters attached to loggers are consulted -whenever an event is logged to the handler (using :meth:`debug`, :meth:`info`, -etc.) This means that events which have been generated by descendant loggers -will not be filtered by a logger's filter setting, unless the filter has also -been applied to those descendant loggers. - -You don't actually need to subclass ``Filter``: you can pass any instance -which has a ``filter`` method with the same semantics. 
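The last point can be sketched as follows. ``CountingFilter`` and ``ListHandler`` are hypothetical names invented for this example, and the ``request_id`` attribute is an assumption used only to show a filter annotating a record; none of them is part of the logging package.

```python
import logging


class CountingFilter(object):
    """Filter-like object: it has a filter() method but no Filter base class."""

    def __init__(self):
        self.count = 0

    def filter(self, record):
        self.count += 1                  # count every record seen
        record.request_id = "req-42"     # illustrative contextual attribute
        return True                      # a true value lets the record through


class ListHandler(logging.Handler):
    """Hypothetical handler that collects formatted output in a list."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


logger = logging.getLogger("filter_sketch")
logger.setLevel(logging.DEBUG)
logger.propagate = False                 # keep output out of the root logger

handler = ListHandler()
handler.setFormatter(logging.Formatter("%(request_id)s %(message)s"))
counter = CountingFilter()
handler.addFilter(counter)
logger.addHandler(handler)

logger.info("first")
logger.debug("second")
```

Because the filter is attached to the handler, it runs before :meth:`emit`, so the attribute it adds is available to the handler's formatter.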
- -Other uses for filters -^^^^^^^^^^^^^^^^^^^^^^ - -Although filters are used primarily to filter records based on more -sophisticated criteria than levels, they get to see every record which is -processed by the handler or logger they're attached to: this can be useful if -you want to do things like counting how many records were processed by a -particular logger or handler, or adding, changing or removing attributes in -the LogRecord being processed. Obviously changing the LogRecord needs to be -done with some care, but it does allow the injection of contextual information -into logs (see :ref:`filters-contextual`). - -.. _log-record: - -LogRecord Objects ------------------ - -:class:`LogRecord` instances are created automatically by the :class:`Logger` -every time something is logged, and can be created manually via -:func:`makeLogRecord` (for example, from a pickled event received over the -wire). - - -.. class:: - LogRecord(name, lvl, pathname, lineno, msg, args, exc_info [, func=None]) - - Contains all the information pertinent to the event being logged. - - The primary information is passed in :attr:`msg` and :attr:`args`, which - are combined using ``msg % args`` to create the :attr:`message` field of the - record. - - .. attribute:: args - - Tuple of arguments to be used in formatting :attr:`msg`. - - .. attribute:: exc_info - - Exception tuple (à la `sys.exc_info`) or `None` if no exception - information is available. - - .. attribute:: func - - Name of the function of origin (i.e. in which the logging call was made). - - .. attribute:: lineno - - Line number in the source file of origin. - - .. attribute:: lvl - - Numeric logging level. - - .. attribute:: message - - Bound to the result of :meth:`getMessage` when - :meth:`Formatter.format(record)` is invoked. - - .. attribute:: msg - - User-supplied :ref:`format string` or arbitrary object - (see :ref:`arbitrary-object-messages`) used in :meth:`getMessage`. - - .. 
attribute:: name - - Name of the logger that emitted the record. - - .. attribute:: pathname - - Absolute pathname of the source file of origin. - - .. method:: getMessage() - - Returns the message for this :class:`LogRecord` instance after merging any - user-supplied arguments with the message. If the user-supplied message - argument to the logging call is not a string, :func:`str` is called on it to - convert it to a string. This allows use of user-defined classes as - messages, whose ``__str__`` method can return the actual format string to - be used. - - .. versionchanged:: 2.5 - *func* was added. - - -.. _logger-adapter: - -LoggerAdapter Objects ---------------------- - -.. versionadded:: 2.6 - -:class:`LoggerAdapter` instances are used to conveniently pass contextual -information into logging calls. For a usage example , see the section on -:ref:`adding contextual information to your logging output `. - - -.. class:: LoggerAdapter(logger, extra) - - Returns an instance of :class:`LoggerAdapter` initialized with an - underlying :class:`Logger` instance and a dict-like object. - - .. method:: process(msg, kwargs) - - Modifies the message and/or keyword arguments passed to a logging call in - order to insert contextual information. This implementation takes the object - passed as *extra* to the constructor and adds it to *kwargs* using key - 'extra'. The return value is a (*msg*, *kwargs*) tuple which has the - (possibly modified) versions of the arguments passed in. - -In addition to the above, :class:`LoggerAdapter` supports all the logging -methods of :class:`Logger`, i.e. :meth:`debug`, :meth:`info`, :meth:`warning`, -:meth:`error`, :meth:`exception`, :meth:`critical` and :meth:`log`. These -methods have the same signatures as their counterparts in :class:`Logger`, so -you can use the two types of instances interchangeably. - -.. versionchanged:: 2.7 - -The :meth:`isEnabledFor` method was added to :class:`LoggerAdapter`. 
This method -delegates to the underlying logger. - - -Thread Safety -------------- - -The logging module is intended to be thread-safe without any special work -needing to be done by its clients. It achieves this through using threading -locks; there is one lock to serialize access to the module's shared data, and -each handler also creates a lock to serialize access to its underlying I/O. - -If you are implementing asynchronous signal handlers using the :mod:`signal` -module, you may not be able to use logging from within such handlers. This is -because lock implementations in the :mod:`threading` module are not always -re-entrant, and so cannot be invoked from such signal handlers. - - -Integration with the warnings module ------------------------------------- - -The :func:`captureWarnings` function can be used to integrate :mod:`logging` -with the :mod:`warnings` module. - -.. function:: captureWarnings(capture) - - This function is used to turn the capture of warnings by logging on and - off. - - If *capture* is ``True``, warnings issued by the :mod:`warnings` module - will be redirected to the logging system. Specifically, a warning will be - formatted using :func:`warnings.formatwarning` and the resulting string - logged to a logger named "py.warnings" with a severity of ``WARNING``. - - If *capture* is ``False``, the redirection of warnings to the logging system - will stop, and warnings will be redirected to their original destinations - (i.e. those in effect before ``captureWarnings(True)`` was called). - - -Configuration -------------- - - -.. _logging-config-api: - -Configuration functions -^^^^^^^^^^^^^^^^^^^^^^^ - -The following functions configure the logging module. They are located in the -:mod:`logging.config` module. 
Their use is optional --- you can configure the -logging module using these functions or by making calls to the main API (defined -in :mod:`logging` itself) and defining handlers which are declared either in -:mod:`logging` or :mod:`logging.handlers`. - -.. currentmodule:: logging.config - -.. function:: dictConfig(config) - - Takes the logging configuration from a dictionary. The contents of - this dictionary are described in :ref:`logging-config-dictschema` - below. - - If an error is encountered during configuration, this function will - raise a :exc:`ValueError`, :exc:`TypeError`, :exc:`AttributeError` - or :exc:`ImportError` with a suitably descriptive message. The - following is a (possibly incomplete) list of conditions which will - raise an error: - - * A ``level`` which is not a string or which is a string not - corresponding to an actual logging level. - * A ``propagate`` value which is not a boolean. - * An id which does not have a corresponding destination. - * A non-existent handler id found during an incremental call. - * An invalid logger name. - * Inability to resolve to an internal or external object. - - Parsing is performed by the :class:`DictConfigurator` class, whose - constructor is passed the dictionary used for configuration, and - has a :meth:`configure` method. The :mod:`logging.config` module - has a callable attribute :attr:`dictConfigClass` - which is initially set to :class:`DictConfigurator`. - You can replace the value of :attr:`dictConfigClass` with a - suitable implementation of your own. 
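A minimal sketch of replacing ``dictConfigClass``, assuming only what the paragraph above states (the attribute lives in :mod:`logging.config` and the configurator has a ``configure`` method). ``RecordingConfigurator`` is a hypothetical name; it does nothing beyond noting that configuration ran, and the original value is restored afterwards so the sketch has no lasting effect.

```python
import logging.config


class RecordingConfigurator(logging.config.DictConfigurator):
    """Hypothetical subclass that records when configure() has been called."""

    configured = False

    def configure(self):
        # Delegate to the stock implementation, then note that it ran.
        logging.config.DictConfigurator.configure(self)
        RecordingConfigurator.configured = True


original = logging.config.dictConfigClass
logging.config.dictConfigClass = RecordingConfigurator
try:
    # dictConfig() looks up dictConfigClass at call time, so the subclass
    # is used here. 'version' is the only mandatory key in the schema.
    logging.config.dictConfig({"version": 1, "disable_existing_loggers": False})
finally:
    # Put the default back so later configuration is unaffected.
    logging.config.dictConfigClass = original
```

A real replacement would normally do more than set a flag, for instance registering extra value converters before delegating to the base class.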
- - :func:`dictConfig` calls :attr:`dictConfigClass` passing - the specified dictionary, and then calls the :meth:`configure` method on - the returned object to put the configuration into effect:: - - def dictConfig(config): - dictConfigClass(config).configure() - - For example, a subclass of :class:`DictConfigurator` could call - ``DictConfigurator.__init__()`` in its own :meth:`__init__()`, then - set up custom prefixes which would be usable in the subsequent - :meth:`configure` call. :attr:`dictConfigClass` would be bound to - this new subclass, and then :func:`dictConfig` could be called exactly as - in the default, uncustomized state. - - .. versionadded:: 2.7 - -.. function:: fileConfig(fname[, defaults]) - - Reads the logging configuration from a :mod:`ConfigParser`\-format file named - *fname*. This function can be called several times from an application, - allowing an end user to select from various pre-canned - configurations (if the developer provides a mechanism to present the choices - and load the chosen configuration). Defaults to be passed to the ConfigParser - can be specified in the *defaults* argument. - -.. function:: listen([port]) - - Starts up a socket server on the specified port, and listens for new - configurations. If no port is specified, the module's default - :const:`DEFAULT_LOGGING_CONFIG_PORT` is used. Logging configurations will be - sent as a file suitable for processing by :func:`fileConfig`. Returns a - :class:`Thread` instance on which you can call :meth:`start` to start the - server, and which you can :meth:`join` when appropriate. To stop the server, - call :func:`stopListening`. - - To send a configuration to the socket, read in the configuration file and - send it to the socket as a string of bytes preceded by a four-byte length - string packed in binary using ``struct.pack('>L', n)``. - - -.. function:: stopListening() - - Stops the listening server which was created with a call to :func:`listen`. 
- This is typically called before calling :meth:`join` on the return value from - :func:`listen`. - -.. currentmodule:: logging - -.. _logging-config-dictschema: - -Configuration dictionary schema -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Describing a logging configuration requires listing the various -objects to create and the connections between them; for example, you -may create a handler named "console" and then say that the logger -named "startup" will send its messages to the "console" handler. -These objects aren't limited to those provided by the :mod:`logging` -module because you might write your own formatter or handler class. -The parameters to these classes may also need to include external -objects such as ``sys.stderr``. The syntax for describing these -objects and connections is defined in :ref:`logging-config-dict-connections` -below. - -Dictionary Schema Details -""""""""""""""""""""""""" - -The dictionary passed to :func:`dictConfig` must contain the following -keys: - -* `version` - to be set to an integer value representing the schema - version. The only valid value at present is 1, but having this key - allows the schema to evolve while still preserving backwards - compatibility. - -All other keys are optional, but if present they will be interpreted -as described below. In all cases below where a 'configuring dict' is -mentioned, it will be checked for the special ``'()'`` key to see if a -custom instantiation is required. If so, the mechanism described in -:ref:`logging-config-dict-userdef` below is used to create an instance; -otherwise, the context is used to determine what to instantiate. - -* `formatters` - the corresponding value will be a dict in which each - key is a formatter id and each value is a dict describing how to - configure the corresponding Formatter instance. 
- - The configuring dict is searched for keys ``format`` and ``datefmt`` - (with defaults of ``None``) and these are used to construct a - :class:`logging.Formatter` instance. - -* `filters` - the corresponding value will be a dict in which each key - is a filter id and each value is a dict describing how to configure - the corresponding Filter instance. - - The configuring dict is searched for the key ``name`` (defaulting to the - empty string) and this is used to construct a :class:`logging.Filter` - instance. - -* `handlers` - the corresponding value will be a dict in which each - key is a handler id and each value is a dict describing how to - configure the corresponding Handler instance. - - The configuring dict is searched for the following keys: - - * ``class`` (mandatory). This is the fully qualified name of the - handler class. - - * ``level`` (optional). The level of the handler. - - * ``formatter`` (optional). The id of the formatter for this - handler. - - * ``filters`` (optional). A list of ids of the filters for this - handler. - - All *other* keys are passed through as keyword arguments to the - handler's constructor. For example, given the snippet:: - - handlers: - console: - class : logging.StreamHandler - formatter: brief - level : INFO - filters: [allow_foo] - stream : ext://sys.stdout - file: - class : logging.handlers.RotatingFileHandler - formatter: precise - filename: logconfig.log - maxBytes: 1024 - backupCount: 3 - - the handler with id ``console`` is instantiated as a - :class:`logging.StreamHandler`, using ``sys.stdout`` as the underlying - stream. The handler with id ``file`` is instantiated as a - :class:`logging.handlers.RotatingFileHandler` with the keyword arguments - ``filename='logconfig.log', maxBytes=1024, backupCount=3``. - -* `loggers` - the corresponding value will be a dict in which each key - is a logger name and each value is a dict describing how to - configure the corresponding Logger instance. 
- - The configuring dict is searched for the following keys: - - * ``level`` (optional). The level of the logger. - - * ``propagate`` (optional). The propagation setting of the logger. - - * ``filters`` (optional). A list of ids of the filters for this - logger. - - * ``handlers`` (optional). A list of ids of the handlers for this - logger. - - The specified loggers will be configured according to the level, - propagation, filters and handlers specified. - -* `root` - this will be the configuration for the root logger. - Processing of the configuration will be as for any logger, except - that the ``propagate`` setting will not be applicable. - -* `incremental` - whether the configuration is to be interpreted as - incremental to the existing configuration. This value defaults to - ``False``, which means that the specified configuration replaces the - existing configuration with the same semantics as used by the - existing :func:`fileConfig` API. - - If the specified value is ``True``, the configuration is processed - as described in the section on :ref:`logging-config-dict-incremental`. - -* `disable_existing_loggers` - whether any existing loggers are to be - disabled. This setting mirrors the parameter of the same name in - :func:`fileConfig`. If absent, this parameter defaults to ``True``. - This value is ignored if `incremental` is ``True``. - -.. _logging-config-dict-incremental: - -Incremental Configuration -""""""""""""""""""""""""" - -It is difficult to provide complete flexibility for incremental -configuration. For example, because objects such as filters -and formatters are anonymous, once a configuration is set up, it is -not possible to refer to such anonymous objects when augmenting a -configuration. 
- -Furthermore, there is not a compelling case for arbitrarily altering -the object graph of loggers, handlers, filters, formatters at -run-time, once a configuration is set up; the verbosity of loggers and -handlers can be controlled just by setting levels (and, in the case of -loggers, propagation flags). Changing the object graph arbitrarily in -a safe way is problematic in a multi-threaded environment; while not -impossible, the benefits are not worth the complexity it adds to the -implementation. - -Thus, when the ``incremental`` key of a configuration dict is present -and is ``True``, the system will completely ignore any ``formatters`` and -``filters`` entries, and process only the ``level`` -settings in the ``handlers`` entries, and the ``level`` and -``propagate`` settings in the ``loggers`` and ``root`` entries. - -Using a value in the configuration dict lets configurations be sent -over the wire as pickled dicts to a socket listener. Thus, the logging -verbosity of a long-running application can be altered over time with -no need to stop and restart the application. - -.. _logging-config-dict-connections: - -Object connections -"""""""""""""""""" - -The schema describes a set of logging objects - loggers, -handlers, formatters, filters - which are connected to each other in -an object graph. Thus, the schema needs to represent connections -between the objects. For example, say that, once configured, a -particular logger has attached to it a particular handler. For the -purposes of this discussion, we can say that the logger represents the -source, and the handler the destination, of a connection between the -two. Of course in the configured objects this is represented by the -logger holding a reference to the handler. 
In the configuration dict, -this is done by giving each destination object an id which identifies -it unambiguously, and then using the id in the source object's -configuration to indicate that a connection exists between the source -and the destination object with that id. - -So, for example, consider the following YAML snippet:: - - formatters: - brief: - # configuration for formatter with id 'brief' goes here - precise: - # configuration for formatter with id 'precise' goes here - handlers: - h1: #This is an id - # configuration of handler with id 'h1' goes here - formatter: brief - h2: #This is another id - # configuration of handler with id 'h2' goes here - formatter: precise - loggers: - foo.bar.baz: - # other configuration for logger 'foo.bar.baz' - handlers: [h1, h2] - -(Note: YAML used here because it's a little more readable than the -equivalent Python source form for the dictionary.) - -The ids for loggers are the logger names which would be used -programmatically to obtain a reference to those loggers, e.g. -``foo.bar.baz``. The ids for Formatters and Filters can be any string -value (such as ``brief``, ``precise`` above) and they are transient, -in that they are only meaningful for processing the configuration -dictionary and used to determine connections between objects, and are -not persisted anywhere when the configuration call is complete. - -The above snippet indicates that logger named ``foo.bar.baz`` should -have two handlers attached to it, which are described by the handler -ids ``h1`` and ``h2``. The formatter for ``h1`` is that described by id -``brief``, and the formatter for ``h2`` is that described by id -``precise``. - - -.. _logging-config-dict-userdef: - -User-defined objects -"""""""""""""""""""" - -The schema supports user-defined objects for handlers, filters and -formatters. (Loggers do not need to have different types for -different instances, so there is no support in this configuration -schema for user-defined logger classes.) 
- -Objects to be configured are described by dictionaries -which detail their configuration. In some places, the logging system -will be able to infer from the context how an object is to be -instantiated, but when a user-defined object is to be instantiated, -the system will not know how to do this. In order to provide complete -flexibility for user-defined object instantiation, the user needs -to provide a 'factory' - a callable which is called with a -configuration dictionary and which returns the instantiated object. -This is signalled by an absolute import path to the factory being -made available under the special key ``'()'``. Here's a concrete -example:: - - formatters: - brief: - format: '%(message)s' - default: - format: '%(asctime)s %(levelname)-8s %(name)-15s %(message)s' - datefmt: '%Y-%m-%d %H:%M:%S' - custom: - (): my.package.customFormatterFactory - bar: baz - spam: 99.9 - answer: 42 - -The above YAML snippet defines three formatters. The first, with id -``brief``, is a standard :class:`logging.Formatter` instance with the -specified format string. The second, with id ``default``, has a -longer format and also defines the time format explicitly, and will -result in a :class:`logging.Formatter` initialized with those two format -strings. Shown in Python source form, the ``brief`` and ``default`` -formatters have configuration sub-dictionaries:: - - { - 'format' : '%(message)s' - } - -and:: - - { - 'format' : '%(asctime)s %(levelname)-8s %(name)-15s %(message)s', - 'datefmt' : '%Y-%m-%d %H:%M:%S' - } - -respectively, and as these dictionaries do not contain the special key -``'()'``, the instantiation is inferred from the context: as a result, -standard :class:`logging.Formatter` instances are created. 
The -configuration sub-dictionary for the third formatter, with id -``custom``, is:: - - { - '()' : 'my.package.customFormatterFactory', - 'bar' : 'baz', - 'spam' : 99.9, - 'answer' : 42 - } - -and this contains the special key ``'()'``, which means that -user-defined instantiation is wanted. In this case, the specified -factory callable will be used. If it is an actual callable it will be -used directly - otherwise, if you specify a string (as in the example) -the actual callable will be located using normal import mechanisms. -The callable will be called with the **remaining** items in the -configuration sub-dictionary as keyword arguments. In the above -example, the formatter with id ``custom`` will be assumed to be -returned by the call:: - - my.package.customFormatterFactory(bar='baz', spam=99.9, answer=42) - -The key ``'()'`` has been used as the special key because it is not a -valid keyword parameter name, and so will not clash with the names of -the keyword arguments used in the call. The ``'()'`` also serves as a -mnemonic that the corresponding value is a callable. - - -.. _logging-config-dict-externalobj: - -Access to external objects -"""""""""""""""""""""""""" - -There are times where a configuration needs to refer to objects -external to the configuration, for example ``sys.stderr``. If the -configuration dict is constructed using Python code, this is -straightforward, but a problem arises when the configuration is -provided via a text file (e.g. JSON, YAML). In a text file, there is -no standard way to distinguish ``sys.stderr`` from the literal string -``'sys.stderr'``. To facilitate this distinction, the configuration -system looks for certain special prefixes in string values and -treats them specially. For example, if the literal string -``'ext://sys.stderr'`` is provided as a value in the configuration, -then the ``ext://`` will be stripped off and the remainder of the -value processed using normal import mechanisms. 
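The ``ext://`` behaviour described above can be sketched with a small configuration dict. The handler id ``console`` and the logger name ``ext_demo`` are hypothetical and chosen only for this illustration; the ``stream`` key is passed through to the :class:`logging.StreamHandler` constructor as a keyword argument.

```python
import logging
import logging.config
import sys

config = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            # A plain 'sys.stderr' would stay a string; the ext:// prefix
            # asks the configurator to import and resolve the object.
            "stream": "ext://sys.stderr",
        },
    },
    "loggers": {
        "ext_demo": {"handlers": ["console"], "level": "INFO"},
    },
}

logging.config.dictConfig(config)

# The handler now holds the actual stream object, not the string.
console = logging.getLogger("ext_demo").handlers[0]
```

The same dict could be loaded from a JSON or YAML file, which is the case the prefix is designed for.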
- -The handling of such prefixes is done in a way analogous to protocol -handling: there is a generic mechanism to look for prefixes which -match the regular expression ``^(?P<prefix>[a-z]+)://(?P<suffix>.*)$`` -whereby, if the ``prefix`` is recognised, the ``suffix`` is processed -in a prefix-dependent manner and the result of the processing replaces -the string value. If the prefix is not recognised, then the string -value will be left as-is. - - -.. _logging-config-dict-internalobj: - -Access to internal objects -"""""""""""""""""""""""""" - -As well as external objects, there is sometimes also a need to refer -to objects in the configuration. This will be done implicitly by the -configuration system for things that it knows about. For example, the -string value ``'DEBUG'`` for a ``level`` in a logger or handler will -automatically be converted to the value ``logging.DEBUG``, and the -``handlers``, ``filters`` and ``formatter`` entries will take an -object id and resolve to the appropriate destination object. - -However, a more generic mechanism is needed for user-defined -objects which are not known to the :mod:`logging` module. For -example, consider :class:`logging.handlers.MemoryHandler`, which takes -a ``target`` argument which is another handler to delegate to. Since -the system already knows about this class, then in the configuration, -the given ``target`` just needs to be the object id of the relevant -target handler, and the system will resolve to the handler from the -id. If, however, a user defines a ``my.package.MyHandler`` which has -an ``alternate`` handler, the configuration system would not know that -the ``alternate`` referred to a handler. 
To cater for this, a generic -resolution system allows the user to specify:: - - handlers: - file: - # configuration of file handler goes here - - custom: - (): my.package.MyHandler - alternate: cfg://handlers.file - -The literal string ``'cfg://handlers.file'`` will be resolved in an -analogous way to strings with the ``ext://`` prefix, but looking -in the configuration itself rather than the import namespace. The -mechanism allows access by dot or by index, in a similar way to -that provided by ``str.format``. Thus, given the following snippet:: - - handlers: - email: - class: logging.handlers.SMTPHandler - mailhost: localhost - fromaddr: my_app at domain.tld - toaddrs: - - support_team at domain.tld - - dev_team at domain.tld - subject: Houston, we have a problem. - -in the configuration, the string ``'cfg://handlers'`` would resolve to -the dict with key ``handlers``, the string ``'cfg://handlers.email'`` -would resolve to the dict with key ``email`` in the ``handlers`` dict, -and so on. The string ``'cfg://handlers.email.toaddrs[1]'`` would -resolve to ``'dev_team at domain.tld'`` and the string -``'cfg://handlers.email.toaddrs[0]'`` would resolve to the value -``'support_team at domain.tld'``. The ``subject`` value could be accessed -using either ``'cfg://handlers.email.subject'`` or, equivalently, -``'cfg://handlers.email[subject]'``. The latter form only needs to be -used if the key contains spaces or non-alphanumeric characters. If an -index value consists only of decimal digits, access will be attempted -using the corresponding integer value, falling back to the string -value if needed. - -Given a string ``cfg://handlers.myhandler.mykey.123``, this will -resolve to ``config_dict['handlers']['myhandler']['mykey']['123']``. 
-If the string is specified as ``cfg://handlers.myhandler.mykey[123]``, -the system will attempt to retrieve the value from -``config_dict['handlers']['myhandler']['mykey'][123]``, and fall back -to ``config_dict['handlers']['myhandler']['mykey']['123']`` if that -fails. - -.. _logging-config-fileformat: - -Configuration file format -^^^^^^^^^^^^^^^^^^^^^^^^^ - -The configuration file format understood by :func:`fileConfig` is based on -:mod:`ConfigParser` functionality. The file must contain sections called -``[loggers]``, ``[handlers]`` and ``[formatters]`` which identify by name the -entities of each type which are defined in the file. For each such entity, -there is a separate section which identifies how that entity is configured. -Thus, for a logger named ``log01`` in the ``[loggers]`` section, the relevant -configuration details are held in a section ``[logger_log01]``. Similarly, a -handler called ``hand01`` in the ``[handlers]`` section will have its -configuration held in a section called ``[handler_hand01]``, while a formatter -called ``form01`` in the ``[formatters]`` section will have its configuration -specified in a section called ``[formatter_form01]``. The root logger -configuration must be specified in a section called ``[logger_root]``. - -Examples of these sections in the file are given below. :: - - [loggers] - keys=root,log02,log03,log04,log05,log06,log07 - - [handlers] - keys=hand01,hand02,hand03,hand04,hand05,hand06,hand07,hand08,hand09 - - [formatters] - keys=form01,form02,form03,form04,form05,form06,form07,form08,form09 - -The root logger must specify a level and a list of handlers. An example of a -root logger section is given below. :: - - [logger_root] - level=NOTSET - handlers=hand01 - -The ``level`` entry can be one of ``DEBUG, INFO, WARNING, ERROR, CRITICAL`` or -``NOTSET``. For the root logger only, ``NOTSET`` means that all messages will be -logged. 
Level values are :func:`eval`\ uated in the context of the ``logging`` -package's namespace. - -The ``handlers`` entry is a comma-separated list of handler names, which must -appear in the ``[handlers]`` section. These names must appear in the -``[handlers]`` section and have corresponding sections in the configuration -file. - -For loggers other than the root logger, some additional information is required. -This is illustrated by the following example. :: - - [logger_parser] - level=DEBUG - handlers=hand01 - propagate=1 - qualname=compiler.parser - -The ``level`` and ``handlers`` entries are interpreted as for the root logger, -except that if a non-root logger's level is specified as ``NOTSET``, the system -consults loggers higher up the hierarchy to determine the effective level of the -logger. The ``propagate`` entry is set to 1 to indicate that messages must -propagate to handlers higher up the logger hierarchy from this logger, or 0 to -indicate that messages are **not** propagated to handlers up the hierarchy. The -``qualname`` entry is the hierarchical channel name of the logger, that is to -say the name used by the application to get the logger. - -Sections which specify handler configuration are exemplified by the following. -:: - - [handler_hand01] - class=StreamHandler - level=NOTSET - formatter=form01 - args=(sys.stdout,) - -The ``class`` entry indicates the handler's class (as determined by :func:`eval` -in the ``logging`` package's namespace). The ``level`` is interpreted as for -loggers, and ``NOTSET`` is taken to mean "log everything". - -.. versionchanged:: 2.6 - Added support for resolving the handler's class as a dotted module and class - name. - -The ``formatter`` entry indicates the key name of the formatter for this -handler. If blank, a default formatter (``logging._defaultFormatter``) is used. -If a name is specified, it must appear in the ``[formatters]`` section and have -a corresponding section in the configuration file. 
- -The ``args`` entry, when :func:`eval`\ uated in the context of the ``logging`` -package's namespace, is the list of arguments to the constructor for the handler -class. Refer to the constructors for the relevant handlers, or to the examples -below, to see how typical entries are constructed. :: - - [handler_hand02] - class=FileHandler - level=DEBUG - formatter=form02 - args=('python.log', 'w') - - [handler_hand03] - class=handlers.SocketHandler - level=INFO - formatter=form03 - args=('localhost', handlers.DEFAULT_TCP_LOGGING_PORT) - - [handler_hand04] - class=handlers.DatagramHandler - level=WARN - formatter=form04 - args=('localhost', handlers.DEFAULT_UDP_LOGGING_PORT) - - [handler_hand05] - class=handlers.SysLogHandler - level=ERROR - formatter=form05 - args=(('localhost', handlers.SYSLOG_UDP_PORT), handlers.SysLogHandler.LOG_USER) - - [handler_hand06] - class=handlers.NTEventLogHandler - level=CRITICAL - formatter=form06 - args=('Python Application', '', 'Application') - - [handler_hand07] - class=handlers.SMTPHandler - level=WARN - formatter=form07 - args=('localhost', 'from at abc', ['user1 at abc', 'user2 at xyz'], 'Logger Subject') - - [handler_hand08] - class=handlers.MemoryHandler - level=NOTSET - formatter=form08 - target= - args=(10, ERROR) - - [handler_hand09] - class=handlers.HTTPHandler - level=NOTSET - formatter=form09 - args=('localhost:9022', '/log', 'GET') - -Sections which specify formatter configuration are typified by the following. :: - - [formatter_form01] - format=F1 %(asctime)s %(levelname)s %(message)s - datefmt= - class=logging.Formatter - -The ``format`` entry is the overall format string, and the ``datefmt`` entry is -the :func:`strftime`\ -compatible date/time format string. If empty, the -package substitutes ISO8601 format date/times, which is almost equivalent to -specifying the date format string ``"%Y-%m-%d %H:%M:%S"``. 
The ISO8601 format -also specifies milliseconds, which are appended to the result of using the above -format string, with a comma separator. An example time in ISO8601 format is -``2003-01-23 00:29:50,411``. - -The ``class`` entry is optional. It indicates the name of the formatter's class -(as a dotted module and class name.) This option is useful for instantiating a -:class:`Formatter` subclass. Subclasses of :class:`Formatter` can present -exception tracebacks in an expanded or condensed format. - - -Configuration server example -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Here is an example of a module using the logging configuration server:: - - import logging - import logging.config - import time - import os - - # read initial config file - logging.config.fileConfig("logging.conf") - - # create and start listener on port 9999 - t = logging.config.listen(9999) - t.start() - - logger = logging.getLogger("simpleExample") - - try: - # loop through logging calls to see the difference - # new configurations make, until Ctrl+C is pressed - while True: - logger.debug("debug message") - logger.info("info message") - logger.warn("warn message") - logger.error("error message") - logger.critical("critical message") - time.sleep(5) - except KeyboardInterrupt: - # cleanup - logging.config.stopListening() - t.join() - -And here is a script that takes a filename and sends that file to the server, -properly preceded with the binary-encoded length, as the new logging -configuration:: - - #!/usr/bin/env python - import socket, sys, struct - - data_to_send = open(sys.argv[1], "r").read() - - HOST = 'localhost' - PORT = 9999 - s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) - print "connecting..." - s.connect((HOST, PORT)) - print "sending config..." 
- s.send(struct.pack(">L", len(data_to_send))) - s.send(data_to_send) - s.close() - print "complete" - - -More examples -------------- - -Multiple handlers and formatters -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Loggers are plain Python objects. The :func:`addHandler` method has no minimum -or maximum quota for the number of handlers you may add. Sometimes it will be -beneficial for an application to log all messages of all severities to a text -file while simultaneously logging errors or above to the console. To set this -up, simply configure the appropriate handlers. The logging calls in the -application code will remain unchanged. Here is a slight modification to the -previous simple module-based configuration example:: - - import logging - - logger = logging.getLogger("simple_example") - logger.setLevel(logging.DEBUG) - # create file handler which logs even debug messages - fh = logging.FileHandler("spam.log") - fh.setLevel(logging.DEBUG) - # create console handler with a higher log level - ch = logging.StreamHandler() - ch.setLevel(logging.ERROR) - # create formatter and add it to the handlers - formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s") - ch.setFormatter(formatter) - fh.setFormatter(formatter) - # add the handlers to logger - logger.addHandler(ch) - logger.addHandler(fh) - - # "application" code - logger.debug("debug message") - logger.info("info message") - logger.warn("warn message") - logger.error("error message") - logger.critical("critical message") - -Notice that the "application" code does not care about multiple handlers. All -that changed was the addition and configuration of a new handler named *fh*. - -The ability to create new handlers with higher- or lower-severity filters can be -very helpful when writing and testing an application. 
Instead of using many -``print`` statements for debugging, use ``logger.debug``: Unlike the print -statements, which you will have to delete or comment out later, the logger.debug -statements can remain intact in the source code and remain dormant until you -need them again. At that time, the only change that needs to happen is to -modify the severity level of the logger and/or handler to debug. - - -Using logging in multiple modules -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -It was mentioned above that multiple calls to -``logging.getLogger('someLogger')`` return a reference to the same logger -object. This is true not only within the same module, but also across modules -as long as it is in the same Python interpreter process. It is true for -references to the same object; additionally, application code can define and -configure a parent logger in one module and create (but not configure) a child -logger in a separate module, and all logger calls to the child will pass up to -the parent. Here is a main module:: - - import logging - import auxiliary_module - - # create logger with "spam_application" - logger = logging.getLogger("spam_application") - logger.setLevel(logging.DEBUG) - # create file handler which logs even debug messages - fh = logging.FileHandler("spam.log") - fh.setLevel(logging.DEBUG) - # create console handler with a higher log level - ch = logging.StreamHandler() - ch.setLevel(logging.ERROR) - # create formatter and add it to the handlers - formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s") - fh.setFormatter(formatter) - ch.setFormatter(formatter) - # add the handlers to the logger - logger.addHandler(fh) - logger.addHandler(ch) - - logger.info("creating an instance of auxiliary_module.Auxiliary") - a = auxiliary_module.Auxiliary() - logger.info("created an instance of auxiliary_module.Auxiliary") - logger.info("calling auxiliary_module.Auxiliary.do_something") - a.do_something() - logger.info("finished 
auxiliary_module.Auxiliary.do_something") - logger.info("calling auxiliary_module.some_function()") - auxiliary_module.some_function() - logger.info("done with auxiliary_module.some_function()") - -Here is the auxiliary module:: - - import logging - - # create logger - module_logger = logging.getLogger("spam_application.auxiliary") - - class Auxiliary: - def __init__(self): - self.logger = logging.getLogger("spam_application.auxiliary.Auxiliary") - self.logger.info("creating an instance of Auxiliary") - def do_something(self): - self.logger.info("doing something") - a = 1 + 1 - self.logger.info("done doing something") - - def some_function(): - module_logger.info("received a call to \"some_function\"") - -The output looks like this:: - - 2005-03-23 23:47:11,663 - spam_application - INFO - - creating an instance of auxiliary_module.Auxiliary - 2005-03-23 23:47:11,665 - spam_application.auxiliary.Auxiliary - INFO - - creating an instance of Auxiliary - 2005-03-23 23:47:11,665 - spam_application - INFO - - created an instance of auxiliary_module.Auxiliary - 2005-03-23 23:47:11,668 - spam_application - INFO - - calling auxiliary_module.Auxiliary.do_something - 2005-03-23 23:47:11,668 - spam_application.auxiliary.Auxiliary - INFO - - doing something - 2005-03-23 23:47:11,669 - spam_application.auxiliary.Auxiliary - INFO - - done doing something - 2005-03-23 23:47:11,670 - spam_application - INFO - - finished auxiliary_module.Auxiliary.do_something - 2005-03-23 23:47:11,671 - spam_application - INFO - - calling auxiliary_module.some_function() - 2005-03-23 23:47:11,672 - spam_application.auxiliary - INFO - - received a call to "some_function" - 2005-03-23 23:47:11,673 - spam_application - INFO - - done with auxiliary_module.some_function() - -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 12:59:13 2011 From: python-checkins at python.org (victor.stinner) Date: Fri, 08 Apr 2011 12:59:13 +0200 Subject: [Python-checkins] cpython: 
faulthandler: fix unregister() if it is called before register()
Message-ID:

http://hg.python.org/cpython/rev/145d0a56c8bd
changeset: 69205:145d0a56c8bd
parent: 69203:5ec2695c9c15
user: Victor Stinner
date: Fri Apr 08 12:48:15 2011 +0200
summary:
  faulthandler: fix unregister() if it is called before register()

Fix a crash: don't read from NULL.

files:
  Doc/library/faulthandler.rst | 3 ++-
  Modules/faulthandler.c | 3 +++
  2 files changed, 5 insertions(+), 1 deletions(-)


diff --git a/Doc/library/faulthandler.rst b/Doc/library/faulthandler.rst
--- a/Doc/library/faulthandler.rst
+++ b/Doc/library/faulthandler.rst
@@ -97,7 +97,8 @@
 .. function:: unregister(signum)

    Unregister a user signal: uninstall the handler of the *signum* signal
-   installed by :func:`register`.
+   installed by :func:`register`. Return ``True`` if the signal was registered,
+   ``False`` otherwise.

    Not available on Windows.

diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c
--- a/Modules/faulthandler.c
+++ b/Modules/faulthandler.c
@@ -694,6 +694,9 @@
     if (!check_signum(signum))
         return NULL;

+    if (user_signals == NULL)
+        Py_RETURN_FALSE;
+
     user = &user_signals[signum];
     change = faulthandler_unregister(user, signum);
     return PyBool_FromLong(change);

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Fri Apr 8 12:59:15 2011
From: python-checkins at python.org (victor.stinner)
Date: Fri, 08 Apr 2011 12:59:15 +0200
Subject: [Python-checkins] cpython: faulthandler: one more time,
 fix usage of locks in the watchdog thread
Message-ID:

http://hg.python.org/cpython/rev/e409d9beaf6b
changeset: 69206:e409d9beaf6b
user: Victor Stinner
date: Fri Apr 08 12:57:06 2011 +0200
summary:
  faulthandler: one more time, fix usage of locks in the watchdog thread

 * Write a new test to ensure that dump_tracebacks_later() still works if
   it was already called and then cancelled before
 * Don't use a variable to check the status of the thread, only rely on locks
 * The thread only releases
cancel_event if it was able to acquire it (if the timer was interrupted) * The main thread always hold this lock. It is only released when faulthandler_thread() is interrupted until this thread exits, or at Python exit. files: Lib/test/test_faulthandler.py | 45 ++++++++++------- Modules/faulthandler.c | 57 +++++++++++----------- 2 files changed, 55 insertions(+), 47 deletions(-) diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py --- a/Lib/test/test_faulthandler.py +++ b/Lib/test/test_faulthandler.py @@ -352,7 +352,7 @@ with temporary_filename() as filename: self.check_dump_traceback_threads(filename) - def _check_dump_tracebacks_later(self, repeat, cancel, filename): + def _check_dump_tracebacks_later(self, repeat, cancel, filename, loops): """ Check how many times the traceback is written in timeout x 2.5 seconds, or timeout x 3.5 seconds if cancel is True: 1, 2 or 3 times depending @@ -364,42 +364,43 @@ import faulthandler import time -def func(repeat, cancel, timeout): - if cancel: +def func(timeout, repeat, cancel, file, loops): + for loop in range(loops): + faulthandler.dump_tracebacks_later(timeout, repeat=repeat, file=file) + if cancel: + faulthandler.cancel_dump_tracebacks_later() + time.sleep(timeout * 2.5) faulthandler.cancel_dump_tracebacks_later() - time.sleep(timeout * 2.5) - faulthandler.cancel_dump_tracebacks_later() timeout = {timeout} repeat = {repeat} cancel = {cancel} +loops = {loops} if {has_filename}: file = open({filename}, "wb") else: file = None -faulthandler.dump_tracebacks_later(timeout, - repeat=repeat, file=file) -func(repeat, cancel, timeout) +func(timeout, repeat, cancel, file, loops) if file is not None: file.close() """.strip() code = code.format( - filename=repr(filename), - has_filename=bool(filename), + timeout=TIMEOUT, repeat=repeat, cancel=cancel, - timeout=TIMEOUT, + loops=loops, + has_filename=bool(filename), + filename=repr(filename), ) trace, exitcode = self.get_output(code, filename) trace = 
'\n'.join(trace) if not cancel: + count = loops if repeat: - count = 2 - else: - count = 1 + count *= 2 header = 'Thread 0x[0-9a-f]+:\n' - regex = expected_traceback(7, 19, header, count=count) + regex = expected_traceback(9, 20, header, count=count) self.assertRegex(trace, regex) else: self.assertEqual(trace, '') @@ -408,12 +409,17 @@ @unittest.skipIf(not hasattr(faulthandler, 'dump_tracebacks_later'), 'need faulthandler.dump_tracebacks_later()') def check_dump_tracebacks_later(self, repeat=False, cancel=False, - file=False): + file=False, twice=False): + if twice: + loops = 2 + else: + loops = 1 if file: with temporary_filename() as filename: - self._check_dump_tracebacks_later(repeat, cancel, filename) + self._check_dump_tracebacks_later(repeat, cancel, + filename, loops) else: - self._check_dump_tracebacks_later(repeat, cancel, None) + self._check_dump_tracebacks_later(repeat, cancel, None, loops) def test_dump_tracebacks_later(self): self.check_dump_tracebacks_later() @@ -427,6 +433,9 @@ def test_dump_tracebacks_later_file(self): self.check_dump_tracebacks_later(file=True) + def test_dump_tracebacks_later_twice(self): + self.check_dump_tracebacks_later(twice=True) + @unittest.skipIf(not hasattr(faulthandler, "register"), "need faulthandler.register") def check_register(self, filename=False, all_threads=False, diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -48,13 +48,14 @@ int fd; PY_TIMEOUT_T timeout_ms; /* timeout in microseconds */ int repeat; - int running; PyInterpreterState *interp; int exit; - /* released by parent thread when cancel request */ + /* The main thread always hold this lock. It is only released when + faulthandler_thread() is interrupted until this thread exits, or at + Python exit. 
*/ PyThread_type_lock cancel_event; /* released by child thread when joined */ - PyThread_type_lock join_event; + PyThread_type_lock running; } thread; #endif @@ -414,7 +415,7 @@ st = PyThread_acquire_lock_timed(thread.cancel_event, thread.timeout_ms, 0); if (st == PY_LOCK_ACQUIRED) { - /* Cancelled by user */ + PyThread_release_lock(thread.cancel_event); break; } /* Timeout => dump traceback */ @@ -431,21 +432,22 @@ } while (ok && thread.repeat); /* The only way out */ - PyThread_release_lock(thread.cancel_event); - PyThread_release_lock(thread.join_event); + PyThread_release_lock(thread.running); } static void -faulthandler_cancel_dump_tracebacks_later(void) +cancel_dump_tracebacks_later(void) { - if (thread.running) { - /* Notify cancellation */ - PyThread_release_lock(thread.cancel_event); - } + /* notify cancellation */ + PyThread_release_lock(thread.cancel_event); + /* Wait for thread to join */ - PyThread_acquire_lock(thread.join_event, 1); - PyThread_release_lock(thread.join_event); - thread.running = 0; + PyThread_acquire_lock(thread.running, 1); + PyThread_release_lock(thread.running); + + /* The main thread should always hold the cancel_event lock */ + PyThread_acquire_lock(thread.cancel_event, 1); + Py_CLEAR(thread.file); } @@ -489,7 +491,7 @@ return NULL; /* Cancel previous thread, if running */ - faulthandler_cancel_dump_tracebacks_later(); + cancel_dump_tracebacks_later(); Py_XDECREF(thread.file); Py_INCREF(file); @@ -501,14 +503,10 @@ thread.exit = exit; /* Arm these locks to serve as events when released */ - PyThread_acquire_lock(thread.join_event, 1); - PyThread_acquire_lock(thread.cancel_event, 1); + PyThread_acquire_lock(thread.running, 1); - thread.running = 1; if (PyThread_start_new_thread(faulthandler_thread, NULL) == -1) { - thread.running = 0; - PyThread_release_lock(thread.join_event); - PyThread_release_lock(thread.cancel_event); + PyThread_release_lock(thread.running); Py_CLEAR(thread.file); PyErr_SetString(PyExc_RuntimeError, "unable 
to start watchdog thread"); @@ -521,7 +519,7 @@ static PyObject* faulthandler_cancel_dump_tracebacks_later_py(PyObject *self) { - faulthandler_cancel_dump_tracebacks_later(); + cancel_dump_tracebacks_later(); Py_RETURN_NONE; } #endif /* FAULTHANDLER_LATER */ @@ -1001,15 +999,15 @@ } #endif #ifdef FAULTHANDLER_LATER - thread.running = 0; thread.file = NULL; thread.cancel_event = PyThread_allocate_lock(); - thread.join_event = PyThread_allocate_lock(); - if (!thread.cancel_event || !thread.join_event) { + thread.running = PyThread_allocate_lock(); + if (!thread.cancel_event || !thread.running) { PyErr_SetString(PyExc_RuntimeError, "could not allocate locks for faulthandler"); return -1; } + PyThread_acquire_lock(thread.cancel_event, 1); #endif return faulthandler_env_options(); @@ -1023,14 +1021,15 @@ #ifdef FAULTHANDLER_LATER /* later */ - faulthandler_cancel_dump_tracebacks_later(); + cancel_dump_tracebacks_later(); if (thread.cancel_event) { + PyThread_release_lock(thread.cancel_event); PyThread_free_lock(thread.cancel_event); thread.cancel_event = NULL; } - if (thread.join_event) { - PyThread_free_lock(thread.join_event); - thread.join_event = NULL; + if (thread.running) { + PyThread_free_lock(thread.running); + thread.running = NULL; } #endif -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Apr 8 13:40:08 2011 From: python-checkins at python.org (victor.stinner) Date: Fri, 08 Apr 2011 13:40:08 +0200 Subject: [Python-checkins] cpython: faulthandler: fix variable name, timeout_ms => timeout_us Message-ID: http://hg.python.org/cpython/rev/0c39e067f35a changeset: 69207:0c39e067f35a user: Victor Stinner date: Fri Apr 08 13:00:31 2011 +0200 summary: faulthandler: fix variable name, timeout_ms => timeout_us The comment was already correct. 
files:
  Modules/faulthandler.c | 12 ++++++------
  1 files changed, 6 insertions(+), 6 deletions(-)


diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c
--- a/Modules/faulthandler.c
+++ b/Modules/faulthandler.c
@@ -46,7 +46,7 @@
 static struct {
     PyObject *file;
     int fd;
-    PY_TIMEOUT_T timeout_ms; /* timeout in microseconds */
+    PY_TIMEOUT_T timeout_us; /* timeout in microseconds */
     int repeat;
     PyInterpreterState *interp;
     int exit;
@@ -413,7 +413,7 @@

     do {
         st = PyThread_acquire_lock_timed(thread.cancel_event,
-                                         thread.timeout_ms, 0);
+                                         thread.timeout_us, 0);
         if (st == PY_LOCK_ACQUIRED) {
             PyThread_release_lock(thread.cancel_event);
             break;
@@ -457,7 +457,7 @@
 {
     static char *kwlist[] = {"timeout", "repeat", "file", "exit", NULL};
     double timeout;
-    PY_TIMEOUT_T timeout_ms;
+    PY_TIMEOUT_T timeout_us;
     int repeat = 0;
     PyObject *file = NULL;
     int fd;
@@ -473,8 +473,8 @@
         PyErr_SetString(PyExc_OverflowError, "timeout value is too large");
         return NULL;
     }
-    timeout_ms = (PY_TIMEOUT_T)timeout;
-    if (timeout_ms <= 0) {
+    timeout_us = (PY_TIMEOUT_T)timeout;
+    if (timeout_us <= 0) {
         PyErr_SetString(PyExc_ValueError, "timeout must be greater than 0");
         return NULL;
     }
@@ -497,7 +497,7 @@
     Py_INCREF(file);
     thread.file = file;
     thread.fd = fd;
-    thread.timeout_ms = timeout_ms;
+    thread.timeout_us = timeout_us;
     thread.repeat = repeat;
     thread.interp = tstate->interp;
     thread.exit = exit;

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Fri Apr 8 13:40:17 2011
From: python-checkins at python.org (victor.stinner)
Date: Fri, 08 Apr 2011 13:40:17 +0200
Subject: [Python-checkins] cpython: faulthandler: dump_tracebacks_later()
 displays also the timeout
Message-ID:

http://hg.python.org/cpython/rev/7af470b0fa5e
changeset: 69208:7af470b0fa5e
user: Victor Stinner
date: Fri Apr 08 13:39:59 2011 +0200
summary:
  faulthandler: dump_tracebacks_later() displays also the timeout

files:
  Lib/test/test_faulthandler.py | 4 +-
  Modules/faulthandler.c | 52 +++++++++++++++++++++-
  2
files changed, 52 insertions(+), 4 deletions(-) diff --git a/Lib/test/test_faulthandler.py b/Lib/test/test_faulthandler.py --- a/Lib/test/test_faulthandler.py +++ b/Lib/test/test_faulthandler.py @@ -1,4 +1,5 @@ from contextlib import contextmanager +import datetime import faulthandler import re import signal @@ -360,6 +361,7 @@ Raise an error if the output doesn't match the expect format. """ + timeout_str = str(datetime.timedelta(seconds=TIMEOUT)) code = """ import faulthandler import time @@ -399,7 +401,7 @@ count = loops if repeat: count *= 2 - header = 'Thread 0x[0-9a-f]+:\n' + header = r'Timeout \(%s\)!\nThread 0x[0-9a-f]+:\n' % timeout_str regex = expected_traceback(9, 20, header, count=count) self.assertRegex(trace, regex) else: diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -50,6 +50,8 @@ int repeat; PyInterpreterState *interp; int exit; + char *header; + size_t header_len; /* The main thread always hold this lock. It is only released when faulthandler_thread() is interrupted until this thread exits, or at Python exit. 
*/ @@ -424,6 +426,8 @@ /* get the thread holding the GIL, NULL if no thread hold the GIL */ current = _Py_atomic_load_relaxed(&_PyThreadState_Current); + write(thread.fd, thread.header, thread.header_len); + errmsg = _Py_DumpTracebackThreads(thread.fd, thread.interp, current); ok = (errmsg == NULL); @@ -449,6 +453,37 @@ PyThread_acquire_lock(thread.cancel_event, 1); Py_CLEAR(thread.file); + if (thread.header) { + free(thread.header); + thread.header = NULL; + } +} + +static char* +format_timeout(double timeout) +{ + unsigned long us, sec, min, hour; + double intpart, fracpart; + char buffer[100]; + + fracpart = modf(timeout, &intpart); + sec = (unsigned long)intpart; + us = (unsigned long)(fracpart * 1e6); + min = sec / 60; + sec %= 60; + hour = min / 60; + min %= 60; + + if (us != 0) + PyOS_snprintf(buffer, sizeof(buffer), + "Timeout (%lu:%02lu:%02lu.%06lu)!\n", + hour, min, sec, us); + else + PyOS_snprintf(buffer, sizeof(buffer), + "Timeout (%lu:%02lu:%02lu)!\n", + hour, min, sec); + + return strdup(buffer); } static PyObject* @@ -463,17 +498,18 @@ int fd; int exit = 0; PyThreadState *tstate; + char *header; + size_t header_len; if (!PyArg_ParseTupleAndKeywords(args, kwargs, "d|iOi:dump_tracebacks_later", kwlist, &timeout, &repeat, &file, &exit)) return NULL; - timeout *= 1e6; - if (timeout >= (double) PY_TIMEOUT_MAX) { + if ((timeout * 1e6) >= (double) PY_TIMEOUT_MAX) { PyErr_SetString(PyExc_OverflowError, "timeout value is too large"); return NULL; } - timeout_us = (PY_TIMEOUT_T)timeout; + timeout_us = (PY_TIMEOUT_T)(timeout * 1e6); if (timeout_us <= 0) { PyErr_SetString(PyExc_ValueError, "timeout must be greater than 0"); return NULL; @@ -490,6 +526,12 @@ if (file == NULL) return NULL; + /* format the timeout */ + header = format_timeout(timeout); + if (header == NULL) + return PyErr_NoMemory(); + header_len = strlen(header); + /* Cancel previous thread, if running */ cancel_dump_tracebacks_later(); @@ -501,6 +543,8 @@ thread.repeat = repeat; thread.interp = 
tstate->interp; thread.exit = exit; + thread.header = header; + thread.header_len = header_len; /* Arm these locks to serve as events when released */ PyThread_acquire_lock(thread.running, 1); @@ -508,6 +552,8 @@ if (PyThread_start_new_thread(faulthandler_thread, NULL) == -1) { PyThread_release_lock(thread.running); Py_CLEAR(thread.file); + free(header); + thread.header = NULL; PyErr_SetString(PyExc_RuntimeError, "unable to start watchdog thread"); return NULL; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 00:51:27 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 09 Apr 2011 00:51:27 +0200 Subject: [Python-checkins] cpython: Improve faulthandler.enable(all_threads=True) Message-ID: http://hg.python.org/cpython/rev/78a66c98288d changeset: 69209:78a66c98288d user: Victor Stinner date: Sat Apr 09 00:47:23 2011 +0200 summary: Improve faulthandler.enable(all_threads=True) faulthandler.enable(all_threads=True) dumps the tracebacks even if it is not possible to get the state of the current thread Create also the get_thread_state() subfunction to factorize the code. files: Modules/faulthandler.c | 54 +++++++++++++++++------------ 1 files changed, 32 insertions(+), 22 deletions(-) diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -40,6 +40,7 @@ PyObject *file; int fd; int all_threads; + PyInterpreterState *interp; } fatal_error = {0, NULL, -1, 0}; #ifdef FAULTHANDLER_LATER @@ -165,6 +166,20 @@ return file; } +/* Get the state of the current thread: only call this function if the current + thread holds the GIL. Raise an exception on error. 
*/ +static PyThreadState* +get_thread_state(void) +{ + PyThreadState *tstate = PyThreadState_Get(); + if (tstate == NULL) { + PyErr_SetString(PyExc_RuntimeError, + "unable to get the current thread state"); + return NULL; + } + return tstate; +} + static PyObject* faulthandler_dump_traceback_py(PyObject *self, PyObject *args, PyObject *kwargs) @@ -185,13 +200,9 @@ if (file == NULL) return NULL; - /* The caller holds the GIL and so PyThreadState_Get() can be used */ - tstate = PyThreadState_Get(); - if (tstate == NULL) { - PyErr_SetString(PyExc_RuntimeError, - "unable to get the current thread state"); + tstate = get_thread_state(); + if (tstate == NULL) return NULL; - } if (all_threads) { errmsg = _Py_DumpTracebackThreads(fd, tstate->interp, tstate); @@ -266,13 +277,13 @@ #else tstate = PyThreadState_Get(); #endif - if (tstate == NULL) - return; if (fatal_error.all_threads) - _Py_DumpTracebackThreads(fd, tstate->interp, tstate); - else - _Py_DumpTraceback(fd, tstate); + _Py_DumpTracebackThreads(fd, fatal_error.interp, tstate); + else { + if (tstate != NULL) + _Py_DumpTraceback(fd, tstate); + } #ifdef MS_WINDOWS if (signum == SIGSEGV) { @@ -301,6 +312,7 @@ #endif int err; int fd; + PyThreadState *tstate; if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|Oi:enable", kwlist, &file, &all_threads)) @@ -310,11 +322,16 @@ if (file == NULL) return NULL; + tstate = get_thread_state(); + if (tstate == NULL) + return NULL; + Py_XDECREF(fatal_error.file); Py_INCREF(file); fatal_error.file = file; fatal_error.fd = fd; fatal_error.all_threads = all_threads; + fatal_error.interp = tstate->interp; if (!fatal_error.enabled) { fatal_error.enabled = 1; @@ -515,12 +532,9 @@ return NULL; } - tstate = PyThreadState_Get(); - if (tstate == NULL) { - PyErr_SetString(PyExc_RuntimeError, - "unable to get the current thread state"); + tstate = get_thread_state(); + if (tstate == NULL) return NULL; - } file = faulthandler_get_fileno(file, &fd); if (file == NULL) @@ -652,13 +666,9 @@ if 
(!check_signum(signum)) return NULL; - /* The caller holds the GIL and so PyThreadState_Get() can be used */ - tstate = PyThreadState_Get(); - if (tstate == NULL) { - PyErr_SetString(PyExc_RuntimeError, - "unable to get the current thread state"); + tstate = get_thread_state(); + if (tstate == NULL) return NULL; - } file = faulthandler_get_fileno(file, &fd); if (file == NULL) -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Sat Apr 9 04:57:07 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Sat, 09 Apr 2011 04:57:07 +0200 Subject: [Python-checkins] Daily reference leaks (78a66c98288d): sum=0 Message-ID: results for 78a66c98288d on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogmwFbNS', '-x'] From python-checkins at python.org Sat Apr 9 16:02:02 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 09 Apr 2011 16:02:02 +0200 Subject: [Python-checkins] cpython (3.1): Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted Message-ID: http://hg.python.org/cpython/rev/2222f343ac51 changeset: 69210:2222f343ac51 branch: 3.1 parent: 69201:10725fc76e11 user: Victor Stinner date: Sat Apr 09 15:55:44 2011 +0200 summary: Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch written by Charles-Francois Natali. files: Misc/NEWS | 4 ++++ Parser/myreadline.c | 9 +++++---- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,10 @@ Core and Builtins ----------------- +- Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted + (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch + written by Charles-Francois Natali. 
+ - Issue #8651: PyArg_Parse*() functions raise an OverflowError if the file doesn't have PY_SSIZE_T_CLEAN define and the size doesn't fit in an int (length bigger than 2^31-1 bytes). diff --git a/Parser/myreadline.c b/Parser/myreadline.c --- a/Parser/myreadline.c +++ b/Parser/myreadline.c @@ -36,7 +36,7 @@ my_fgets(char *buf, int len, FILE *fp) { char *p; - for (;;) { + while (1) { if (PyOS_InputHook != NULL) (void)(PyOS_InputHook)(); errno = 0; @@ -85,9 +85,10 @@ #ifdef WITH_THREAD PyEval_SaveThread(); #endif - if (s < 0) { - return 1; - } + if (s < 0) + return 1; + /* try again */ + continue; } #endif if (PyOS_InterruptOccurred()) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 16:02:03 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 09 Apr 2011 16:02:03 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): (Merge 3.1) Issue #11650: PyOS_StdioReadline() retries fgets() if it was Message-ID: http://hg.python.org/cpython/rev/fc2f251e660a changeset: 69211:fc2f251e660a branch: 3.2 parent: 69202:74ec64dc3538 parent: 69210:2222f343ac51 user: Victor Stinner date: Sat Apr 09 15:59:25 2011 +0200 summary: (Merge 3.1) Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch written by Charles-Francois Natali. files: Misc/NEWS | 4 + Parser/myreadline.c | 106 ++++++++++++++++--------------- 2 files changed, 59 insertions(+), 51 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,10 @@ Core and Builtins ----------------- +- Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted + (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch + written by Charles-Francois Natali. + - Issue #11395: io.FileIO().write() clamps the data length to 32,767 bytes on Windows if the file is a TTY to workaround a Windows bug. 
The Windows console returns an error (12: not enough space error) on writing into stdout if diff --git a/Parser/myreadline.c b/Parser/myreadline.c --- a/Parser/myreadline.c +++ b/Parser/myreadline.c @@ -36,63 +36,67 @@ my_fgets(char *buf, int len, FILE *fp) { char *p; - if (PyOS_InputHook != NULL) - (void)(PyOS_InputHook)(); - errno = 0; - p = fgets(buf, len, fp); - if (p != NULL) - return 0; /* No error */ + while (1) { + if (PyOS_InputHook != NULL) + (void)(PyOS_InputHook)(); + errno = 0; + p = fgets(buf, len, fp); + if (p != NULL) + return 0; /* No error */ #ifdef MS_WINDOWS - /* In the case of a Ctrl+C or some other external event - interrupting the operation: - Win2k/NT: ERROR_OPERATION_ABORTED is the most recent Win32 - error code (and feof() returns TRUE). - Win9x: Ctrl+C seems to have no effect on fgets() returning - early - the signal handler is called, but the fgets() - only returns "normally" (ie, when Enter hit or feof()) - */ - if (GetLastError()==ERROR_OPERATION_ABORTED) { - /* Signals come asynchronously, so we sleep a brief - moment before checking if the handler has been - triggered (we cant just return 1 before the - signal handler has been called, as the later - signal may be treated as a separate interrupt). + /* In the case of a Ctrl+C or some other external event + interrupting the operation: + Win2k/NT: ERROR_OPERATION_ABORTED is the most recent Win32 + error code (and feof() returns TRUE). + Win9x: Ctrl+C seems to have no effect on fgets() returning + early - the signal handler is called, but the fgets() + only returns "normally" (ie, when Enter hit or feof()) */ - Sleep(1); + if (GetLastError()==ERROR_OPERATION_ABORTED) { + /* Signals come asynchronously, so we sleep a brief + moment before checking if the handler has been + triggered (we cant just return 1 before the + signal handler has been called, as the later + signal may be treated as a separate interrupt). 
+ */ + Sleep(1); + if (PyOS_InterruptOccurred()) { + return 1; /* Interrupt */ + } + /* Either the sleep wasn't long enough (need a + short loop retrying?) or not interrupted at all + (in which case we should revisit the whole thing!) + Logging some warning would be nice. assert is not + viable as under the debugger, the various dialogs + mean the condition is not true. + */ + } +#endif /* MS_WINDOWS */ + if (feof(fp)) { + return -1; /* EOF */ + } +#ifdef EINTR + if (errno == EINTR) { + int s; +#ifdef WITH_THREAD + PyEval_RestoreThread(_PyOS_ReadlineTState); +#endif + s = PyErr_CheckSignals(); +#ifdef WITH_THREAD + PyEval_SaveThread(); +#endif + if (s < 0) + return 1; + /* try again */ + continue; + } +#endif if (PyOS_InterruptOccurred()) { return 1; /* Interrupt */ } - /* Either the sleep wasn't long enough (need a - short loop retrying?) or not interrupted at all - (in which case we should revisit the whole thing!) - Logging some warning would be nice. assert is not - viable as under the debugger, the various dialogs - mean the condition is not true. 
- */ + return -2; /* Error */ } -#endif /* MS_WINDOWS */ - if (feof(fp)) { - return -1; /* EOF */ - } -#ifdef EINTR - if (errno == EINTR) { - int s; -#ifdef WITH_THREAD - PyEval_RestoreThread(_PyOS_ReadlineTState); -#endif - s = PyErr_CheckSignals(); -#ifdef WITH_THREAD - PyEval_SaveThread(); -#endif - if (s < 0) { - return 1; - } - } -#endif - if (PyOS_InterruptOccurred()) { - return 1; /* Interrupt */ - } - return -2; /* Error */ + /* NOTREACHED */ } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 16:02:06 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 09 Apr 2011 16:02:06 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): (Merge 3.2) Issue #11650: PyOS_StdioReadline() retries fgets() if it was Message-ID: http://hg.python.org/cpython/rev/64de1ded0744 changeset: 69212:64de1ded0744 parent: 69209:78a66c98288d parent: 69211:fc2f251e660a user: Victor Stinner date: Sat Apr 09 16:01:55 2011 +0200 summary: (Merge 3.2) Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch written by Charles-Francois Natali. files: Misc/NEWS | 4 + Parser/myreadline.c | 106 ++++++++++++++++--------------- 2 files changed, 59 insertions(+), 51 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,10 @@ Core and Builtins ----------------- +- Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted + (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch + written by Charles-Francois Natali. + - Issue #9319: Include the filename in "Non-UTF8 code ..." syntax error. - Issue #10785: Store the filename as Unicode in the Python parser. 
diff --git a/Parser/myreadline.c b/Parser/myreadline.c --- a/Parser/myreadline.c +++ b/Parser/myreadline.c @@ -36,63 +36,67 @@ my_fgets(char *buf, int len, FILE *fp) { char *p; - if (PyOS_InputHook != NULL) - (void)(PyOS_InputHook)(); - errno = 0; - p = fgets(buf, len, fp); - if (p != NULL) - return 0; /* No error */ + while (1) { + if (PyOS_InputHook != NULL) + (void)(PyOS_InputHook)(); + errno = 0; + p = fgets(buf, len, fp); + if (p != NULL) + return 0; /* No error */ #ifdef MS_WINDOWS - /* In the case of a Ctrl+C or some other external event - interrupting the operation: - Win2k/NT: ERROR_OPERATION_ABORTED is the most recent Win32 - error code (and feof() returns TRUE). - Win9x: Ctrl+C seems to have no effect on fgets() returning - early - the signal handler is called, but the fgets() - only returns "normally" (ie, when Enter hit or feof()) - */ - if (GetLastError()==ERROR_OPERATION_ABORTED) { - /* Signals come asynchronously, so we sleep a brief - moment before checking if the handler has been - triggered (we cant just return 1 before the - signal handler has been called, as the later - signal may be treated as a separate interrupt). + /* In the case of a Ctrl+C or some other external event + interrupting the operation: + Win2k/NT: ERROR_OPERATION_ABORTED is the most recent Win32 + error code (and feof() returns TRUE). + Win9x: Ctrl+C seems to have no effect on fgets() returning + early - the signal handler is called, but the fgets() + only returns "normally" (ie, when Enter hit or feof()) */ - Sleep(1); + if (GetLastError()==ERROR_OPERATION_ABORTED) { + /* Signals come asynchronously, so we sleep a brief + moment before checking if the handler has been + triggered (we cant just return 1 before the + signal handler has been called, as the later + signal may be treated as a separate interrupt). + */ + Sleep(1); + if (PyOS_InterruptOccurred()) { + return 1; /* Interrupt */ + } + /* Either the sleep wasn't long enough (need a + short loop retrying?) 
or not interrupted at all + (in which case we should revisit the whole thing!) + Logging some warning would be nice. assert is not + viable as under the debugger, the various dialogs + mean the condition is not true. + */ + } +#endif /* MS_WINDOWS */ + if (feof(fp)) { + return -1; /* EOF */ + } +#ifdef EINTR + if (errno == EINTR) { + int s; +#ifdef WITH_THREAD + PyEval_RestoreThread(_PyOS_ReadlineTState); +#endif + s = PyErr_CheckSignals(); +#ifdef WITH_THREAD + PyEval_SaveThread(); +#endif + if (s < 0) + return 1; + /* try again */ + continue; + } +#endif if (PyOS_InterruptOccurred()) { return 1; /* Interrupt */ } - /* Either the sleep wasn't long enough (need a - short loop retrying?) or not interrupted at all - (in which case we should revisit the whole thing!) - Logging some warning would be nice. assert is not - viable as under the debugger, the various dialogs - mean the condition is not true. - */ + return -2; /* Error */ } -#endif /* MS_WINDOWS */ - if (feof(fp)) { - return -1; /* EOF */ - } -#ifdef EINTR - if (errno == EINTR) { - int s; -#ifdef WITH_THREAD - PyEval_RestoreThread(_PyOS_ReadlineTState); -#endif - s = PyErr_CheckSignals(); -#ifdef WITH_THREAD - PyEval_SaveThread(); -#endif - if (s < 0) { - return 1; - } - } -#endif - if (PyOS_InterruptOccurred()) { - return 1; /* Interrupt */ - } - return -2; /* Error */ + /* NOTREACHED */ } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 16:09:17 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 09 Apr 2011 16:09:17 +0200 Subject: [Python-checkins] cpython (2.7): (Merge 3.1) Issue #11650: PyOS_StdioReadline() retries fgets() if it was Message-ID: http://hg.python.org/cpython/rev/7febd5ef7619 changeset: 69213:7febd5ef7619 branch: 2.7 parent: 69204:6fb033af9310 user: Victor Stinner date: Sat Apr 09 16:09:08 2011 +0200 summary: (Merge 3.1) Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted (EINTR), for example if the program 
is stopped with CTRL+z on Mac OS X. Patch written by Charles-Francois Natali. files: Misc/NEWS | 4 + Parser/myreadline.c | 106 ++++++++++++++++--------------- 2 files changed, 59 insertions(+), 51 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -9,6 +9,10 @@ Core and Builtins ----------------- +- Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted + (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch + written by Charles-Francois Natali. + - Issue #11144: Ensure that int(a_float) returns an int whenever possible. Previously, there were some corner cases where a long was returned even though the result was within the range of an int. diff --git a/Parser/myreadline.c b/Parser/myreadline.c --- a/Parser/myreadline.c +++ b/Parser/myreadline.c @@ -40,63 +40,67 @@ my_fgets(char *buf, int len, FILE *fp) { char *p; - if (PyOS_InputHook != NULL) - (void)(PyOS_InputHook)(); - errno = 0; - p = fgets(buf, len, fp); - if (p != NULL) - return 0; /* No error */ + while (1) { + if (PyOS_InputHook != NULL) + (void)(PyOS_InputHook)(); + errno = 0; + p = fgets(buf, len, fp); + if (p != NULL) + return 0; /* No error */ #ifdef MS_WINDOWS - /* In the case of a Ctrl+C or some other external event - interrupting the operation: - Win2k/NT: ERROR_OPERATION_ABORTED is the most recent Win32 - error code (and feof() returns TRUE). - Win9x: Ctrl+C seems to have no effect on fgets() returning - early - the signal handler is called, but the fgets() - only returns "normally" (ie, when Enter hit or feof()) - */ - if (GetLastError()==ERROR_OPERATION_ABORTED) { - /* Signals come asynchronously, so we sleep a brief - moment before checking if the handler has been - triggered (we cant just return 1 before the - signal handler has been called, as the later - signal may be treated as a separate interrupt). 
+ /* In the case of a Ctrl+C or some other external event + interrupting the operation: + Win2k/NT: ERROR_OPERATION_ABORTED is the most recent Win32 + error code (and feof() returns TRUE). + Win9x: Ctrl+C seems to have no effect on fgets() returning + early - the signal handler is called, but the fgets() + only returns "normally" (ie, when Enter hit or feof()) */ - Sleep(1); + if (GetLastError()==ERROR_OPERATION_ABORTED) { + /* Signals come asynchronously, so we sleep a brief + moment before checking if the handler has been + triggered (we cant just return 1 before the + signal handler has been called, as the later + signal may be treated as a separate interrupt). + */ + Sleep(1); + if (PyOS_InterruptOccurred()) { + return 1; /* Interrupt */ + } + /* Either the sleep wasn't long enough (need a + short loop retrying?) or not interrupted at all + (in which case we should revisit the whole thing!) + Logging some warning would be nice. assert is not + viable as under the debugger, the various dialogs + mean the condition is not true. + */ + } +#endif /* MS_WINDOWS */ + if (feof(fp)) { + return -1; /* EOF */ + } +#ifdef EINTR + if (errno == EINTR) { + int s; +#ifdef WITH_THREAD + PyEval_RestoreThread(_PyOS_ReadlineTState); +#endif + s = PyErr_CheckSignals(); +#ifdef WITH_THREAD + PyEval_SaveThread(); +#endif + if (s < 0) + return 1; + /* try again */ + continue; + } +#endif if (PyOS_InterruptOccurred()) { return 1; /* Interrupt */ } - /* Either the sleep wasn't long enough (need a - short loop retrying?) or not interrupted at all - (in which case we should revisit the whole thing!) - Logging some warning would be nice. assert is not - viable as under the debugger, the various dialogs - mean the condition is not true. 
- */ + return -2; /* Error */ } -#endif /* MS_WINDOWS */ - if (feof(fp)) { - return -1; /* EOF */ - } -#ifdef EINTR - if (errno == EINTR) { - int s; -#ifdef WITH_THREAD - PyEval_RestoreThread(_PyOS_ReadlineTState); -#endif - s = PyErr_CheckSignals(); -#ifdef WITH_THREAD - PyEval_SaveThread(); -#endif - if (s < 0) { - return 1; - } - } -#endif - if (PyOS_InterruptOccurred()) { - return 1; /* Interrupt */ - } - return -2; /* Error */ + /* NOTREACHED */ } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 20:42:00 2011 From: python-checkins at python.org (ross.lagerwall) Date: Sat, 09 Apr 2011 20:42:00 +0200 Subject: [Python-checkins] cpython (3.1): Issue #11719: Fix message about unexpected test_msilib skip. Message-ID: http://hg.python.org/cpython/rev/d4730f14b6c0 changeset: 69214:d4730f14b6c0 branch: 3.1 parent: 69210:2222f343ac51 user: Ross Lagerwall date: Sat Apr 09 19:30:03 2011 +0200 summary: Issue #11719: Fix message about unexpected test_msilib skip. Patch by Nadeem Vawda. files: Lib/test/regrtest.py | 2 +- Misc/NEWS | 3 +++ 2 files changed, 4 insertions(+), 1 deletions(-) diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -1239,7 +1239,7 @@ # is distributed with Python WIN_ONLY = ["test_unicode_file", "test_winreg", "test_winsound", "test_startfile", - "test_sqlite"] + "test_sqlite", "test_msilib"] for skip in WIN_ONLY: self.expected.add(skip) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -285,6 +285,9 @@ Tests ----- +- Issue #11719: Fix message about unexpected test_msilib skip on non-Windows + platforms. Patch by Nadeem Vawda. + - Issue #11490: test_subprocess:test_leaking_fds_on_error no longer gives a false positive if the last directory in the path is inaccessible. 
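Editorial note on the Issue #11650 change to my_fgets() shown earlier in this digest: the patch wraps the read in a loop, and when fgets() fails with EINTR it runs the pending signal handlers and then retries instead of reporting an error. The following is a minimal Python sketch of that control flow only, not the CPython implementation. The names read_with_retry, read_once and check_signals are hypothetical stand-ins (for the loop, fgets() and PyErr_CheckSignals() respectively), and InterruptedError is used here to model errno == EINTR.

```python
def read_with_retry(read_once, check_signals):
    """Call read_once() until it returns, retrying after EINTR-style interruptions.

    Both callables are hypothetical stand-ins: read_once plays the role of
    fgets(), check_signals the role of PyErr_CheckSignals().
    """
    while True:
        try:
            return read_once()
        except InterruptedError:
            # Equivalent of errno == EINTR: let pending signal handlers run.
            # If a handler raises (for example KeyboardInterrupt), the
            # exception propagates and the loop ends, like "return 1" in C.
            check_signals()
            # Otherwise fall through and try the read again.


# Usage: a fake reader that is interrupted twice before delivering a line.
attempts = []

def flaky_read():
    attempts.append(1)
    if len(attempts) < 3:
        raise InterruptedError("simulated EINTR")
    return "hello\n"

checks = []
line = read_with_retry(flaky_read, lambda: checks.append(1))
```

The sketch keeps the loop unbounded, as the patch does; a handler that raises is the only way out other than a successful read or a different error.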
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 20:42:00 2011 From: python-checkins at python.org (ross.lagerwall) Date: Sat, 09 Apr 2011 20:42:00 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Merge with 3.1 Message-ID: http://hg.python.org/cpython/rev/677e9a9beac2 changeset: 69215:677e9a9beac2 branch: 3.2 parent: 69211:fc2f251e660a parent: 69214:d4730f14b6c0 user: Ross Lagerwall date: Sat Apr 09 20:05:04 2011 +0200 summary: Merge with 3.1 files: Lib/test/regrtest.py | 2 +- Misc/NEWS | 3 +++ 2 files changed, 4 insertions(+), 1 deletions(-) diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -1480,7 +1480,7 @@ # is distributed with Python WIN_ONLY = {"test_unicode_file", "test_winreg", "test_winsound", "test_startfile", - "test_sqlite"} + "test_sqlite", "test_msilib"} self.expected |= WIN_ONLY if sys.platform != 'sunos5': diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -223,6 +223,9 @@ Tests ----- +- Issue #11719: Fix message about unexpected test_msilib skip on non-Windows + platforms. Patch by Nadeem Vawda. + - Issue #11653: fix -W with -j in regrtest. - Issue #11577: improve test coverage of binhex.py. Patch by Arkady Koplyarov. 
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 20:42:01 2011 From: python-checkins at python.org (ross.lagerwall) Date: Sat, 09 Apr 2011 20:42:01 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge with 3.2 Message-ID: http://hg.python.org/cpython/rev/cdffe73e3218 changeset: 69216:cdffe73e3218 parent: 69212:64de1ded0744 parent: 69215:677e9a9beac2 user: Ross Lagerwall date: Sat Apr 09 20:12:43 2011 +0200 summary: Merge with 3.2 files: Lib/test/regrtest.py | 2 +- Misc/NEWS | 3 +++ 2 files changed, 4 insertions(+), 1 deletions(-) diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -1530,7 +1530,7 @@ # is distributed with Python WIN_ONLY = {"test_unicode_file", "test_winreg", "test_winsound", "test_startfile", - "test_sqlite"} + "test_sqlite", "test_msilib"} self.expected |= WIN_ONLY if sys.platform != 'sunos5': diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -403,6 +403,9 @@ Tests ----- +- Issue #11719: Fix message about unexpected test_msilib skip on non-Windows + platforms. Patch by Nadeem Vawda. + - Issue #11727: Add a --timeout option to regrtest: if a test takes more than TIMEOUT seconds, dumps the traceback of all threads and exits. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 20:42:08 2011 From: python-checkins at python.org (ross.lagerwall) Date: Sat, 09 Apr 2011 20:42:08 +0200 Subject: [Python-checkins] cpython (2.7): Issue #11719: Fix message about unexpected test_msilib skip. Message-ID: http://hg.python.org/cpython/rev/8b146103d29e changeset: 69217:8b146103d29e branch: 2.7 parent: 69213:7febd5ef7619 user: Ross Lagerwall date: Sat Apr 09 20:39:50 2011 +0200 summary: Issue #11719: Fix message about unexpected test_msilib skip. Patch by Nadeem Vawda. 
files: Lib/test/regrtest.py | 2 +- Misc/NEWS | 3 +++ 2 files changed, 4 insertions(+), 1 deletions(-) diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -1470,7 +1470,7 @@ # is distributed with Python WIN_ONLY = ["test_unicode_file", "test_winreg", "test_winsound", "test_startfile", - "test_sqlite"] + "test_sqlite", "test_msilib"] for skip in WIN_ONLY: self.expected.add(skip) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -319,6 +319,9 @@ Tests ----- +- Issue #11719: Fix message about unexpected test_msilib skip on non-Windows + platforms. Patch by Nadeem Vawda. + - Issue #7108: Fix test_commands to not fail when special attributes ('@' or '.') appear in 'ls -l' output. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 21:48:29 2011 From: python-checkins at python.org (ned.deily) Date: Sat, 09 Apr 2011 21:48:29 +0200 Subject: [Python-checkins] cpython (2.7): Issue #9670: Increase the default stack size for secondary threads on Message-ID: http://hg.python.org/cpython/rev/b0d2b696da19 changeset: 69218:b0d2b696da19 branch: 2.7 user: Ned Deily date: Sat Apr 09 12:29:58 2011 -0700 summary: Issue #9670: Increase the default stack size for secondary threads on Mac OS X and FreeBSD to reduce the chances of a crash instead of a "maximum recursion depth" RuntimeError exception. 
(Patch by Ronald Oussoren) files: Lib/test/test_threading.py | 30 ++++++++++++++++++++++++++ Misc/NEWS | 5 ++++ Python/thread_pthread.h | 12 ++++++++++ 3 files changed, 47 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_threading.py b/Lib/test/test_threading.py --- a/Lib/test/test_threading.py +++ b/Lib/test/test_threading.py @@ -666,6 +666,36 @@ thread.start() self.assertRaises(RuntimeError, setattr, thread, "daemon", True) + def test_recursion_limit(self): + # Issue 9670 + # test that excessive recursion within a non-main thread causes + # an exception rather than crashing the interpreter on platforms + # like Mac OS X or FreeBSD which have small default stack sizes + # for threads + script = """if True: + import threading + + def recurse(): + return recurse() + + def outer(): + try: + recurse() + except RuntimeError: + pass + + w = threading.Thread(target=outer) + w.start() + w.join() + print('end of main thread') + """ + expected_output = "end of main thread\n" + p = subprocess.Popen([sys.executable, "-c", script], + stdout=subprocess.PIPE) + stdout, stderr = p.communicate() + data = stdout.decode().replace('\r', '') + self.assertEqual(p.returncode, 0, "Unexpected error") + self.assertEqual(data, expected_output) class LockTests(lock_tests.LockTests): locktype = staticmethod(threading.Lock) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -9,6 +9,11 @@ Core and Builtins ----------------- +- Issue #9670: Increase the default stack size for secondary threads on + Mac OS X and FreeBSD to reduce the chances of a crash instead of a + "maximum recursion depth" RuntimeError exception. + (original patch by Ronald Oussoren) + - Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch written by Charles-Francois Natali. 
diff --git a/Python/thread_pthread.h b/Python/thread_pthread.h --- a/Python/thread_pthread.h +++ b/Python/thread_pthread.h @@ -18,6 +18,18 @@ #ifndef THREAD_STACK_SIZE #define THREAD_STACK_SIZE 0 /* use default stack size */ #endif + +#if (defined(__APPLE__) || defined(__FreeBSD__)) && defined(THREAD_STACK_SIZE) && THREAD_STACK_SIZE == 0 + /* The default stack size for new threads on OSX is small enough that + * we'll get hard crashes instead of 'maximum recursion depth exceeded' + * exceptions. + * + * The default stack size below is the minimal stack size where a + * simple recursive function doesn't cause a hard crash. + */ +#undef THREAD_STACK_SIZE +#define THREAD_STACK_SIZE 0x100000 +#endif /* for safety, ensure a viable minimum stacksize */ #define THREAD_STACK_MIN 0x8000 /* 32kB */ #else /* !_POSIX_THREAD_ATTR_STACKSIZE */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 21:48:30 2011 From: python-checkins at python.org (ned.deily) Date: Sat, 09 Apr 2011 21:48:30 +0200 Subject: [Python-checkins] cpython (3.1): Issue #9670: Increase the default stack size for secondary threads on Message-ID: http://hg.python.org/cpython/rev/378b40d71175 changeset: 69219:378b40d71175 branch: 3.1 parent: 69214:d4730f14b6c0 user: Ned Deily date: Sat Apr 09 12:32:12 2011 -0700 summary: Issue #9670: Increase the default stack size for secondary threads on Mac OS X and FreeBSD to reduce the chances of a crash instead of a "maximum recursion depth" RuntimeError exception. 
(Patch by Ronald Oussoren) files: Lib/test/test_threading.py | 30 ++++++++++++++++++++++++++ Misc/NEWS | 5 ++++ Python/thread_pthread.h | 12 ++++++++++ 3 files changed, 47 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_threading.py b/Lib/test/test_threading.py --- a/Lib/test/test_threading.py +++ b/Lib/test/test_threading.py @@ -650,6 +650,36 @@ thread.start() self.assertRaises(RuntimeError, setattr, thread, "daemon", True) + def test_recursion_limit(self): + # Issue 9670 + # test that excessive recursion within a non-main thread causes + # an exception rather than crashing the interpreter on platforms + # like Mac OS X or FreeBSD which have small default stack sizes + # for threads + script = """if True: + import threading + + def recurse(): + return recurse() + + def outer(): + try: + recurse() + except RuntimeError: + pass + + w = threading.Thread(target=outer) + w.start() + w.join() + print('end of main thread') + """ + expected_output = "end of main thread\n" + p = subprocess.Popen([sys.executable, "-c", script], + stdout=subprocess.PIPE) + stdout, stderr = p.communicate() + data = stdout.decode().replace('\r', '') + self.assertEqual(p.returncode, 0, "Unexpected error") + self.assertEqual(data, expected_output) class LockTests(lock_tests.LockTests): locktype = staticmethod(threading.Lock) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,11 @@ Core and Builtins ----------------- +- Issue #9670: Increase the default stack size for secondary threads on + Mac OS X and FreeBSD to reduce the chances of a crash instead of a + "maximum recursion depth" RuntimeError exception. + (original patch by Ronald Oussoren) + - Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch written by Charles-Francois Natali. 
diff --git a/Python/thread_pthread.h b/Python/thread_pthread.h --- a/Python/thread_pthread.h +++ b/Python/thread_pthread.h @@ -18,6 +18,18 @@ #ifndef THREAD_STACK_SIZE #define THREAD_STACK_SIZE 0 /* use default stack size */ #endif + +#if (defined(__APPLE__) || defined(__FreeBSD__)) && defined(THREAD_STACK_SIZE) && THREAD_STACK_SIZE == 0 + /* The default stack size for new threads on OSX is small enough that + * we'll get hard crashes instead of 'maximum recursion depth exceeded' + * exceptions. + * + * The default stack size below is the minimal stack size where a + * simple recursive function doesn't cause a hard crash. + */ +#undef THREAD_STACK_SIZE +#define THREAD_STACK_SIZE 0x100000 +#endif /* for safety, ensure a viable minimum stacksize */ #define THREAD_STACK_MIN 0x8000 /* 32kB */ #else /* !_POSIX_THREAD_ATTR_STACKSIZE */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 21:48:31 2011 From: python-checkins at python.org (ned.deily) Date: Sat, 09 Apr 2011 21:48:31 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Issue #9670: merge with 3.2 Message-ID: http://hg.python.org/cpython/rev/3fe8fd2fd1d0 changeset: 69220:3fe8fd2fd1d0 branch: 3.2 parent: 69215:677e9a9beac2 parent: 69219:378b40d71175 user: Ned Deily date: Sat Apr 09 12:37:55 2011 -0700 summary: Issue #9670: merge with 3.2 files: Lib/test/test_threading.py | 30 ++++++++++++++++++++++++++ Misc/NEWS | 5 ++++ Python/thread_pthread.h | 12 ++++++++++ 3 files changed, 47 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_threading.py b/Lib/test/test_threading.py --- a/Lib/test/test_threading.py +++ b/Lib/test/test_threading.py @@ -677,6 +677,36 @@ thread.start() self.assertRaises(RuntimeError, setattr, thread, "daemon", True) + def test_recursion_limit(self): + # Issue 9670 + # test that excessive recursion within a non-main thread causes + # an exception rather than crashing the interpreter on platforms + # like Mac OS X or FreeBSD which have 
small default stack sizes + # for threads + script = """if True: + import threading + + def recurse(): + return recurse() + + def outer(): + try: + recurse() + except RuntimeError: + pass + + w = threading.Thread(target=outer) + w.start() + w.join() + print('end of main thread') + """ + expected_output = "end of main thread\n" + p = subprocess.Popen([sys.executable, "-c", script], + stdout=subprocess.PIPE) + stdout, stderr = p.communicate() + data = stdout.decode().replace('\r', '') + self.assertEqual(p.returncode, 0, "Unexpected error") + self.assertEqual(data, expected_output) class LockTests(lock_tests.LockTests): locktype = staticmethod(threading.Lock) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,11 @@ Core and Builtins ----------------- +- Issue #9670: Increase the default stack size for secondary threads on + Mac OS X and FreeBSD to reduce the chances of a crash instead of a + "maximum recursion depth" RuntimeError exception. + (original patch by Ronald Oussoren) + - Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch written by Charles-Francois Natali. diff --git a/Python/thread_pthread.h b/Python/thread_pthread.h --- a/Python/thread_pthread.h +++ b/Python/thread_pthread.h @@ -18,6 +18,18 @@ #ifndef THREAD_STACK_SIZE #define THREAD_STACK_SIZE 0 /* use default stack size */ #endif + +#if (defined(__APPLE__) || defined(__FreeBSD__)) && defined(THREAD_STACK_SIZE) && THREAD_STACK_SIZE == 0 + /* The default stack size for new threads on OSX is small enough that + * we'll get hard crashes instead of 'maximum recursion depth exceeded' + * exceptions. + * + * The default stack size below is the minimal stack size where a + * simple recursive function doesn't cause a hard crash. 
+ */ +#undef THREAD_STACK_SIZE +#define THREAD_STACK_SIZE 0x100000 +#endif /* for safety, ensure a viable minimum stacksize */ #define THREAD_STACK_MIN 0x8000 /* 32kB */ #else /* !_POSIX_THREAD_ATTR_STACKSIZE */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 21:48:36 2011 From: python-checkins at python.org (ned.deily) Date: Sat, 09 Apr 2011 21:48:36 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #9670: merge with current Message-ID: http://hg.python.org/cpython/rev/4c750091d8c5 changeset: 69221:4c750091d8c5 parent: 69216:cdffe73e3218 parent: 69220:3fe8fd2fd1d0 user: Ned Deily date: Sat Apr 09 12:47:12 2011 -0700 summary: Issue #9670: merge with current files: Lib/test/test_threading.py | 30 ++++++++++++++++++++++++++ Misc/NEWS | 5 ++++ Python/thread_pthread.h | 12 ++++++++++ 3 files changed, 47 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_threading.py b/Lib/test/test_threading.py --- a/Lib/test/test_threading.py +++ b/Lib/test/test_threading.py @@ -689,6 +689,36 @@ lock = threading.Lock() self.assertRaises(RuntimeError, lock.release) + def test_recursion_limit(self): + # Issue 9670 + # test that excessive recursion within a non-main thread causes + # an exception rather than crashing the interpreter on platforms + # like Mac OS X or FreeBSD which have small default stack sizes + # for threads + script = """if True: + import threading + + def recurse(): + return recurse() + + def outer(): + try: + recurse() + except RuntimeError: + pass + + w = threading.Thread(target=outer) + w.start() + w.join() + print('end of main thread') + """ + expected_output = "end of main thread\n" + p = subprocess.Popen([sys.executable, "-c", script], + stdout=subprocess.PIPE) + stdout, stderr = p.communicate() + data = stdout.decode().replace('\r', '') + self.assertEqual(p.returncode, 0, "Unexpected error") + self.assertEqual(data, expected_output) class LockTests(lock_tests.LockTests): locktype = 
staticmethod(threading.Lock) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,11 @@ Core and Builtins ----------------- +- Issue #9670: Increase the default stack size for secondary threads on + Mac OS X and FreeBSD to reduce the chances of a crash instead of a + "maximum recursion depth" RuntimeError exception. + (original patch by Ronald Oussoren) + - Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch written by Charles-Francois Natali. diff --git a/Python/thread_pthread.h b/Python/thread_pthread.h --- a/Python/thread_pthread.h +++ b/Python/thread_pthread.h @@ -18,6 +18,18 @@ #ifndef THREAD_STACK_SIZE #define THREAD_STACK_SIZE 0 /* use default stack size */ #endif + +#if (defined(__APPLE__) || defined(__FreeBSD__)) && defined(THREAD_STACK_SIZE) && THREAD_STACK_SIZE == 0 + /* The default stack size for new threads on OSX is small enough that + * we'll get hard crashes instead of 'maximum recursion depth exceeded' + * exceptions. + * + * The default stack size below is the minimal stack size where a + * simple recursive function doesn't cause a hard crash. 
+ */ +#undef THREAD_STACK_SIZE +#define THREAD_STACK_SIZE 0x100000 +#endif /* for safety, ensure a viable minimum stacksize */ #define THREAD_STACK_MIN 0x8000 /* 32kB */ #else /* !_POSIX_THREAD_ATTR_STACKSIZE */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 22:00:25 2011 From: python-checkins at python.org (raymond.hettinger) Date: Sat, 09 Apr 2011 22:00:25 +0200 Subject: [Python-checkins] cpython: Fix nit (make spelling consistent in prototype) Message-ID: http://hg.python.org/cpython/rev/a593a1030f2c changeset: 69222:a593a1030f2c user: Raymond Hettinger date: Sat Apr 09 12:57:00 2011 -0700 summary: Fix nit (make spelling consistent in prototype) files: Modules/_functoolsmodule.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c --- a/Modules/_functoolsmodule.c +++ b/Modules/_functoolsmodule.c @@ -372,7 +372,7 @@ }; static PyObject * -keyobject_call(keyobject *ko, PyObject *args, PyObject *kw); +keyobject_call(keyobject *ko, PyObject *args, PyObject *kwds); static PyObject * keyobject_richcompare(PyObject *ko, PyObject *other, int op); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 22:00:25 2011 From: python-checkins at python.org (raymond.hettinger) Date: Sat, 09 Apr 2011 22:00:25 +0200 Subject: [Python-checkins] cpython: Replace constant tuple with constant set. Message-ID: http://hg.python.org/cpython/rev/a1235865dd54 changeset: 69223:a1235865dd54 user: Raymond Hettinger date: Sat Apr 09 13:00:17 2011 -0700 summary: Replace constant tuple with constant set. 
files: Lib/difflib.py | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Lib/difflib.py b/Lib/difflib.py --- a/Lib/difflib.py +++ b/Lib/difflib.py @@ -1267,7 +1267,7 @@ yield '*** %d,%d ****%s' % (group[0][1]+1, group[-1][2], lineterm) else: yield '*** %d ****%s' % (group[-1][2], lineterm) - visiblechanges = [e for e in group if e[0] in ('replace', 'delete')] + visiblechanges = [e for e in group if e[0] in {'replace', 'delete'}] if visiblechanges: for tag, i1, i2, _, _ in group: if tag != 'insert': @@ -1278,7 +1278,7 @@ yield '--- %d,%d ----%s' % (group[0][3]+1, group[-1][4], lineterm) else: yield '--- %d ----%s' % (group[-1][4], lineterm) - visiblechanges = [e for e in group if e[0] in ('replace', 'insert')] + visiblechanges = [e for e in group if e[0] in {'replace', 'insert'}] if visiblechanges: for tag, _, _, j1, j2 in group: if tag != 'delete': -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Apr 9 23:50:02 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sat, 09 Apr 2011 23:50:02 +0200 Subject: [Python-checkins] cpython: Issue #11757: select.select() now raises ValueError when a negative timeout Message-ID: http://hg.python.org/cpython/rev/3982be773b54 changeset: 69224:3982be773b54 user: Antoine Pitrou date: Sat Apr 09 23:49:58 2011 +0200 summary: Issue #11757: select.select() now raises ValueError when a negative timeout is passed (previously, a select.error with EINVAL would be raised). Patch by Charles-François Natali.
files: Lib/test/test_select.py | 1 + Misc/NEWS | 4 ++++ Modules/selectmodule.c | 5 +++++ 3 files changed, 10 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_select.py b/Lib/test/test_select.py --- a/Lib/test/test_select.py +++ b/Lib/test/test_select.py @@ -20,6 +20,7 @@ self.assertRaises(TypeError, select.select, [self.Nope()], [], []) self.assertRaises(TypeError, select.select, [self.Almost()], [], []) self.assertRaises(TypeError, select.select, [], [], [], "not a number") + self.assertRaises(ValueError, select.select, [], [], [], -1) def test_returned_list_identity(self): # See issue #8329 diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -103,6 +103,10 @@ Library ------- +- Issue #11757: select.select() now raises ValueError when a negative timeout + is passed (previously, a select.error with EINVAL would be raised). Patch + by Charles-François Natali. + - Issue #7311: fix html.parser to accept non-ASCII attribute values. - Issue #11605: email.parser.BytesFeedParser was incorrectly converting multipart diff --git a/Modules/selectmodule.c b/Modules/selectmodule.c --- a/Modules/selectmodule.c +++ b/Modules/selectmodule.c @@ -234,6 +234,11 @@ "timeout period too long"); return NULL; } + if (timeout < 0) { + PyErr_SetString(PyExc_ValueError, + "timeout must be non-negative"); + return NULL; + } seconds = (long)timeout; timeout = timeout - (double)seconds; tv.tv_sec = seconds; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 10 00:01:20 2011 From: python-checkins at python.org (ned.deily) Date: Sun, 10 Apr 2011 00:01:20 +0200 Subject: [Python-checkins] cpython (3.1): Issue9670: Back out changeset 378b40d71175; test fails on other platforms Message-ID: http://hg.python.org/cpython/rev/42d5001e5845 changeset: 69225:42d5001e5845 branch: 3.1 parent: 69219:378b40d71175 user: Ned Deily date: Sat Apr 09 14:50:59 2011 -0700 summary: Issue9670: Back out changeset 378b40d71175; test fails on other platforms
and on OS X with pydebug. files: Lib/test/test_threading.py | 30 -------------------------- Misc/NEWS | 5 ---- Python/thread_pthread.h | 12 ---------- 3 files changed, 0 insertions(+), 47 deletions(-) diff --git a/Lib/test/test_threading.py b/Lib/test/test_threading.py --- a/Lib/test/test_threading.py +++ b/Lib/test/test_threading.py @@ -650,36 +650,6 @@ thread.start() self.assertRaises(RuntimeError, setattr, thread, "daemon", True) - def test_recursion_limit(self): - # Issue 9670 - # test that excessive recursion within a non-main thread causes - # an exception rather than crashing the interpreter on platforms - # like Mac OS X or FreeBSD which have small default stack sizes - # for threads - script = """if True: - import threading - - def recurse(): - return recurse() - - def outer(): - try: - recurse() - except RuntimeError: - pass - - w = threading.Thread(target=outer) - w.start() - w.join() - print('end of main thread') - """ - expected_output = "end of main thread\n" - p = subprocess.Popen([sys.executable, "-c", script], - stdout=subprocess.PIPE) - stdout, stderr = p.communicate() - data = stdout.decode().replace('\r', '') - self.assertEqual(p.returncode, 0, "Unexpected error") - self.assertEqual(data, expected_output) class LockTests(lock_tests.LockTests): locktype = staticmethod(threading.Lock) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,11 +10,6 @@ Core and Builtins ----------------- -- Issue #9670: Increase the default stack size for secondary threads on - Mac OS X and FreeBSD to reduce the chances of a crash instead of a - "maximum recursion depth" RuntimeError exception. - (original patch by Ronald Oussoren) - - Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch written by Charles-Francois Natali. 
diff --git a/Python/thread_pthread.h b/Python/thread_pthread.h --- a/Python/thread_pthread.h +++ b/Python/thread_pthread.h @@ -18,18 +18,6 @@ #ifndef THREAD_STACK_SIZE #define THREAD_STACK_SIZE 0 /* use default stack size */ #endif - -#if (defined(__APPLE__) || defined(__FreeBSD__)) && defined(THREAD_STACK_SIZE) && THREAD_STACK_SIZE == 0 - /* The default stack size for new threads on OSX is small enough that - * we'll get hard crashes instead of 'maximum recursion depth exceeded' - * exceptions. - * - * The default stack size below is the minimal stack size where a - * simple recursive function doesn't cause a hard crash. - */ -#undef THREAD_STACK_SIZE -#define THREAD_STACK_SIZE 0x100000 -#endif /* for safety, ensure a viable minimum stacksize */ #define THREAD_STACK_MIN 0x8000 /* 32kB */ #else /* !_POSIX_THREAD_ATTR_STACKSIZE */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 10 00:01:21 2011 From: python-checkins at python.org (ned.deily) Date: Sun, 10 Apr 2011 00:01:21 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Issue9670: Merge backout to 3.2. Message-ID: http://hg.python.org/cpython/rev/54edabf2846d changeset: 69226:54edabf2846d branch: 3.2 parent: 69220:3fe8fd2fd1d0 parent: 69225:42d5001e5845 user: Ned Deily date: Sat Apr 09 14:53:47 2011 -0700 summary: Issue9670: Merge backout to 3.2. 
files: Lib/test/test_threading.py | 30 -------------------------- Misc/NEWS | 5 ---- Python/thread_pthread.h | 12 ---------- 3 files changed, 0 insertions(+), 47 deletions(-) diff --git a/Lib/test/test_threading.py b/Lib/test/test_threading.py --- a/Lib/test/test_threading.py +++ b/Lib/test/test_threading.py @@ -677,36 +677,6 @@ thread.start() self.assertRaises(RuntimeError, setattr, thread, "daemon", True) - def test_recursion_limit(self): - # Issue 9670 - # test that excessive recursion within a non-main thread causes - # an exception rather than crashing the interpreter on platforms - # like Mac OS X or FreeBSD which have small default stack sizes - # for threads - script = """if True: - import threading - - def recurse(): - return recurse() - - def outer(): - try: - recurse() - except RuntimeError: - pass - - w = threading.Thread(target=outer) - w.start() - w.join() - print('end of main thread') - """ - expected_output = "end of main thread\n" - p = subprocess.Popen([sys.executable, "-c", script], - stdout=subprocess.PIPE) - stdout, stderr = p.communicate() - data = stdout.decode().replace('\r', '') - self.assertEqual(p.returncode, 0, "Unexpected error") - self.assertEqual(data, expected_output) class LockTests(lock_tests.LockTests): locktype = staticmethod(threading.Lock) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,11 +10,6 @@ Core and Builtins ----------------- -- Issue #9670: Increase the default stack size for secondary threads on - Mac OS X and FreeBSD to reduce the chances of a crash instead of a - "maximum recursion depth" RuntimeError exception. - (original patch by Ronald Oussoren) - - Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch written by Charles-Francois Natali. 
diff --git a/Python/thread_pthread.h b/Python/thread_pthread.h --- a/Python/thread_pthread.h +++ b/Python/thread_pthread.h @@ -18,18 +18,6 @@ #ifndef THREAD_STACK_SIZE #define THREAD_STACK_SIZE 0 /* use default stack size */ #endif - -#if (defined(__APPLE__) || defined(__FreeBSD__)) && defined(THREAD_STACK_SIZE) && THREAD_STACK_SIZE == 0 - /* The default stack size for new threads on OSX is small enough that - * we'll get hard crashes instead of 'maximum recursion depth exceeded' - * exceptions. - * - * The default stack size below is the minimal stack size where a - * simple recursive function doesn't cause a hard crash. - */ -#undef THREAD_STACK_SIZE -#define THREAD_STACK_SIZE 0x100000 -#endif /* for safety, ensure a viable minimum stacksize */ #define THREAD_STACK_MIN 0x8000 /* 32kB */ #else /* !_POSIX_THREAD_ATTR_STACKSIZE */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 10 00:01:26 2011 From: python-checkins at python.org (ned.deily) Date: Sun, 10 Apr 2011 00:01:26 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue9670: Merge backout from 3.2. Message-ID: http://hg.python.org/cpython/rev/b7456c1b4aa4 changeset: 69227:b7456c1b4aa4 parent: 69224:3982be773b54 parent: 69226:54edabf2846d user: Ned Deily date: Sat Apr 09 14:58:04 2011 -0700 summary: Issue9670: Merge backout from 3.2. 
files: Lib/test/test_threading.py | 30 -------------------------- Misc/NEWS | 5 ---- Python/thread_pthread.h | 12 ---------- 3 files changed, 0 insertions(+), 47 deletions(-) diff --git a/Lib/test/test_threading.py b/Lib/test/test_threading.py --- a/Lib/test/test_threading.py +++ b/Lib/test/test_threading.py @@ -689,36 +689,6 @@ lock = threading.Lock() self.assertRaises(RuntimeError, lock.release) - def test_recursion_limit(self): - # Issue 9670 - # test that excessive recursion within a non-main thread causes - # an exception rather than crashing the interpreter on platforms - # like Mac OS X or FreeBSD which have small default stack sizes - # for threads - script = """if True: - import threading - - def recurse(): - return recurse() - - def outer(): - try: - recurse() - except RuntimeError: - pass - - w = threading.Thread(target=outer) - w.start() - w.join() - print('end of main thread') - """ - expected_output = "end of main thread\n" - p = subprocess.Popen([sys.executable, "-c", script], - stdout=subprocess.PIPE) - stdout, stderr = p.communicate() - data = stdout.decode().replace('\r', '') - self.assertEqual(p.returncode, 0, "Unexpected error") - self.assertEqual(data, expected_output) class LockTests(lock_tests.LockTests): locktype = staticmethod(threading.Lock) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,11 +10,6 @@ Core and Builtins ----------------- -- Issue #9670: Increase the default stack size for secondary threads on - Mac OS X and FreeBSD to reduce the chances of a crash instead of a - "maximum recursion depth" RuntimeError exception. - (original patch by Ronald Oussoren) - - Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch written by Charles-Francois Natali. 
diff --git a/Python/thread_pthread.h b/Python/thread_pthread.h --- a/Python/thread_pthread.h +++ b/Python/thread_pthread.h @@ -18,18 +18,6 @@ #ifndef THREAD_STACK_SIZE #define THREAD_STACK_SIZE 0 /* use default stack size */ #endif - -#if (defined(__APPLE__) || defined(__FreeBSD__)) && defined(THREAD_STACK_SIZE) && THREAD_STACK_SIZE == 0 - /* The default stack size for new threads on OSX is small enough that - * we'll get hard crashes instead of 'maximum recursion depth exceeded' - * exceptions. - * - * The default stack size below is the minimal stack size where a - * simple recursive function doesn't cause a hard crash. - */ -#undef THREAD_STACK_SIZE -#define THREAD_STACK_SIZE 0x100000 -#endif /* for safety, ensure a viable minimum stacksize */ #define THREAD_STACK_MIN 0x8000 /* 32kB */ #else /* !_POSIX_THREAD_ATTR_STACKSIZE */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 10 00:01:27 2011 From: python-checkins at python.org (ned.deily) Date: Sun, 10 Apr 2011 00:01:27 +0200 Subject: [Python-checkins] cpython (2.7): Issue9670: Back out changeset b0d2b696da19; test fails on other platforms Message-ID: http://hg.python.org/cpython/rev/3630bc3d5a88 changeset: 69228:3630bc3d5a88 branch: 2.7 parent: 69218:b0d2b696da19 user: Ned Deily date: Sat Apr 09 14:59:30 2011 -0700 summary: Issue9670: Back out changeset b0d2b696da19; test fails on other platforms and on OS X with pydebug. 
files: Lib/test/test_threading.py | 30 -------------------------- Misc/NEWS | 5 ---- Python/thread_pthread.h | 12 ---------- 3 files changed, 0 insertions(+), 47 deletions(-) diff --git a/Lib/test/test_threading.py b/Lib/test/test_threading.py --- a/Lib/test/test_threading.py +++ b/Lib/test/test_threading.py @@ -666,36 +666,6 @@ thread.start() self.assertRaises(RuntimeError, setattr, thread, "daemon", True) - def test_recursion_limit(self): - # Issue 9670 - # test that excessive recursion within a non-main thread causes - # an exception rather than crashing the interpreter on platforms - # like Mac OS X or FreeBSD which have small default stack sizes - # for threads - script = """if True: - import threading - - def recurse(): - return recurse() - - def outer(): - try: - recurse() - except RuntimeError: - pass - - w = threading.Thread(target=outer) - w.start() - w.join() - print('end of main thread') - """ - expected_output = "end of main thread\n" - p = subprocess.Popen([sys.executable, "-c", script], - stdout=subprocess.PIPE) - stdout, stderr = p.communicate() - data = stdout.decode().replace('\r', '') - self.assertEqual(p.returncode, 0, "Unexpected error") - self.assertEqual(data, expected_output) class LockTests(lock_tests.LockTests): locktype = staticmethod(threading.Lock) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -9,11 +9,6 @@ Core and Builtins ----------------- -- Issue #9670: Increase the default stack size for secondary threads on - Mac OS X and FreeBSD to reduce the chances of a crash instead of a - "maximum recursion depth" RuntimeError exception. - (original patch by Ronald Oussoren) - - Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch written by Charles-Francois Natali. 
diff --git a/Python/thread_pthread.h b/Python/thread_pthread.h --- a/Python/thread_pthread.h +++ b/Python/thread_pthread.h @@ -18,18 +18,6 @@ #ifndef THREAD_STACK_SIZE #define THREAD_STACK_SIZE 0 /* use default stack size */ #endif - -#if (defined(__APPLE__) || defined(__FreeBSD__)) && defined(THREAD_STACK_SIZE) && THREAD_STACK_SIZE == 0 - /* The default stack size for new threads on OSX is small enough that - * we'll get hard crashes instead of 'maximum recursion depth exceeded' - * exceptions. - * - * The default stack size below is the minimal stack size where a - * simple recursive function doesn't cause a hard crash. - */ -#undef THREAD_STACK_SIZE -#define THREAD_STACK_SIZE 0x100000 -#endif /* for safety, ensure a viable minimum stacksize */ #define THREAD_STACK_MIN 0x8000 /* 32kB */ #else /* !_POSIX_THREAD_ATTR_STACKSIZE */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 10 04:41:47 2011 From: python-checkins at python.org (raymond.hettinger) Date: Sun, 10 Apr 2011 04:41:47 +0200 Subject: [Python-checkins] cpython (3.2): Beautify and modernize the SequenceMatcher example Message-ID: http://hg.python.org/cpython/rev/35f4e0d1d64b changeset: 69229:35f4e0d1d64b branch: 3.2 parent: 69226:54edabf2846d user: Raymond Hettinger date: Sat Apr 09 19:41:00 2011 -0700 summary: Beautify and modernize the SequenceMatcher example files: Doc/library/difflib.rst | 16 +++++++++------- 1 files changed, 9 insertions(+), 7 deletions(-) diff --git a/Doc/library/difflib.rst b/Doc/library/difflib.rst --- a/Doc/library/difflib.rst +++ b/Doc/library/difflib.rst @@ -483,13 +483,15 @@ >>> b = "abycdf" >>> s = SequenceMatcher(None, a, b) >>> for tag, i1, i2, j1, j2 in s.get_opcodes(): - ... print(("%7s a[%d:%d] (%s) b[%d:%d] (%s)" % - ... 
(tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2]))) - delete a[0:1] (q) b[0:0] () - equal a[1:3] (ab) b[0:2] (ab) - replace a[3:4] (x) b[2:3] (y) - equal a[4:6] (cd) b[3:5] (cd) - insert a[6:6] () b[5:6] (f) + print('{:7} a[{}:{}] --> b[{}:{}] {!r:>8} --> {!r}'.format( + tag, i1, i2, j1, j2, a[i1:i2], b[j1:j2])) + + + delete a[0:1] --> b[0:0] 'q' --> '' + equal a[1:3] --> b[0:2] 'ab' --> 'ab' + replace a[3:4] --> b[2:3] 'x' --> 'y' + equal a[4:6] --> b[3:5] 'cd' --> 'cd' + insert a[6:6] --> b[5:6] '' --> 'f' .. method:: get_grouped_opcodes(n=3) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 10 04:41:48 2011 From: python-checkins at python.org (raymond.hettinger) Date: Sun, 10 Apr 2011 04:41:48 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Beautify and modernize the SequenceMatcher example Message-ID: http://hg.python.org/cpython/rev/aff9bcbd2fc7 changeset: 69230:aff9bcbd2fc7 parent: 69227:b7456c1b4aa4 parent: 69229:35f4e0d1d64b user: Raymond Hettinger date: Sat Apr 09 19:41:31 2011 -0700 summary: Beautify and modernize the SequenceMatcher example files: Doc/library/difflib.rst | 16 +++++++++------- 1 files changed, 9 insertions(+), 7 deletions(-) diff --git a/Doc/library/difflib.rst b/Doc/library/difflib.rst --- a/Doc/library/difflib.rst +++ b/Doc/library/difflib.rst @@ -483,13 +483,15 @@ >>> b = "abycdf" >>> s = SequenceMatcher(None, a, b) >>> for tag, i1, i2, j1, j2 in s.get_opcodes(): - ... print(("%7s a[%d:%d] (%s) b[%d:%d] (%s)" % - ... 
(tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2]))) - delete a[0:1] (q) b[0:0] () - equal a[1:3] (ab) b[0:2] (ab) - replace a[3:4] (x) b[2:3] (y) - equal a[4:6] (cd) b[3:5] (cd) - insert a[6:6] () b[5:6] (f) + print('{:7} a[{}:{}] --> b[{}:{}] {!r:>8} --> {!r}'.format( + tag, i1, i2, j1, j2, a[i1:i2], b[j1:j2])) + + + delete a[0:1] --> b[0:0] 'q' --> '' + equal a[1:3] --> b[0:2] 'ab' --> 'ab' + replace a[3:4] --> b[2:3] 'x' --> 'y' + equal a[4:6] --> b[3:5] 'cd' --> 'cd' + insert a[6:6] --> b[5:6] '' --> 'f' .. method:: get_grouped_opcodes(n=3) -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Sun Apr 10 04:56:54 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Sun, 10 Apr 2011 04:56:54 +0200 Subject: [Python-checkins] Daily reference leaks (b7456c1b4aa4): sum=0 Message-ID: results for b7456c1b4aa4 on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogS4QsUQ', '-x'] From python-checkins at python.org Sun Apr 10 06:33:06 2011 From: python-checkins at python.org (eli.bendersky) Date: Sun, 10 Apr 2011 06:33:06 +0200 Subject: [Python-checkins] cpython: Issue #9904: fix and clarify some comments + fix indentation in symtable code Message-ID: http://hg.python.org/cpython/rev/b6fe63c914e4 changeset: 69231:b6fe63c914e4 user: Eli Bendersky date: Sun Apr 10 07:37:26 2011 +0300 summary: Issue #9904: fix and clarify some comments + fix indentation in symtable code files: Include/symtable.h | 11 +++++++---- Python/symtable.c | 8 ++++---- 2 files changed, 11 insertions(+), 8 deletions(-) diff --git a/Include/symtable.h b/Include/symtable.h --- a/Include/symtable.h +++ b/Include/symtable.h @@ -23,10 +23,13 @@ PyObject *st_blocks; /* dict: map AST node addresses * to symbol table entries */ PyObject *st_stack; /* list: stack of namespace info */ - PyObject *st_global; /* borrowed ref to st_top->st_symbols */ - int st_nblocks; 
/* number of blocks used */ + PyObject *st_global; /* borrowed ref to st_top->ste_symbols */ + int st_nblocks; /* number of blocks used. kept for + consistency with the corresponding + compiler structure */ PyObject *st_private; /* name of current class or NULL */ - PyFutureFeatures *st_future; /* module's future features */ + PyFutureFeatures *st_future; /* module's future features that affect + the symbol table */ }; typedef struct _symtable_entry { @@ -34,7 +37,7 @@ PyObject *ste_id; /* int: key in ste_table->st_blocks */ PyObject *ste_symbols; /* dict: variable names to flags */ PyObject *ste_name; /* string: name of current block */ - PyObject *ste_varnames; /* list of variable names */ + PyObject *ste_varnames; /* list of function parameters */ PyObject *ste_children; /* list of child blocks */ _Py_block_ty ste_type; /* module, class, or function */ int ste_unoptimized; /* false if namespace is optimized */ diff --git a/Python/symtable.c b/Python/symtable.c --- a/Python/symtable.c +++ b/Python/symtable.c @@ -750,7 +750,7 @@ goto error; } - /* Recursively call analyze_block() on each child block. + /* Recursively call analyze_child_block() on each child block. newbound, newglobal now contain the names visible in nested blocks. The free variables in the children will @@ -1205,9 +1205,9 @@ case Raise_kind: if (s->v.Raise.exc) { VISIT(st, expr, s->v.Raise.exc); - if (s->v.Raise.cause) { - VISIT(st, expr, s->v.Raise.cause); - } + if (s->v.Raise.cause) { + VISIT(st, expr, s->v.Raise.cause); + } } break; case TryExcept_kind: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 10 08:20:15 2011 From: python-checkins at python.org (georg.brandl) Date: Sun, 10 Apr 2011 08:20:15 +0200 Subject: [Python-checkins] devguide: Add Nadeem. Message-ID: http://hg.python.org/devguide/rev/aff894ab4c20 changeset: 411:aff894ab4c20 user: Georg Brandl date: Sun Apr 10 08:20:13 2011 +0200 summary: Add Nadeem. 
files: developers.rst | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/developers.rst b/developers.rst --- a/developers.rst +++ b/developers.rst @@ -24,6 +24,9 @@ Permissions History ------------------- +- Nadeem Vawda was given push privileges on Apr 10 2011 by GFB, for + general contributions, on recommendation by Antoine Pitrou. + - Carl Friedrich Bolz was given push privileges on Mar 21 2011 by BAC, for stdlib compatibility work for PyPy. -- Repository URL: http://hg.python.org/devguide From python-checkins at python.org Sun Apr 10 09:03:17 2011 From: python-checkins at python.org (georg.brandl) Date: Sun, 10 Apr 2011 09:03:17 +0200 Subject: [Python-checkins] devguide: Update address for sending SSH keys to. Message-ID: http://hg.python.org/devguide/rev/56b895308730 changeset: 412:56b895308730 user: Georg Brandl date: Sun Apr 10 09:03:14 2011 +0200 summary: Update address for sending SSH keys to. files: coredev.rst | 2 +- faq.rst | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/coredev.rst b/coredev.rst --- a/coredev.rst +++ b/coredev.rst @@ -60,7 +60,7 @@ You need to generate an SSH 2 RSA key to be able to commit code. You may have multiple keys if you wish (e.g., for work and home). Send your key as an -attachment in an email to python-committers at python.org (do not paste it in +attachment in an email to hgaccounts at python.org (do not paste it in the email as SSH keys have specific formatting requirements). Help in generating an SSH key can be found in the :ref:`faq`. diff --git a/faq.rst b/faq.rst --- a/faq.rst +++ b/faq.rst @@ -676,7 +676,7 @@ How do I generate an SSH 2 public key? ------------------------------------------------------------------------------- -All generated SSH keys should be sent to python-committers at python.org for +All generated SSH keys should be sent to hgaccounts at python.org for adding to the list of keys. 
UNIX -- Repository URL: http://hg.python.org/devguide From python-checkins at python.org Sun Apr 10 09:41:47 2011 From: python-checkins at python.org (ross.lagerwall) Date: Sun, 10 Apr 2011 09:41:47 +0200 Subject: [Python-checkins] cpython (3.2): Issue #11818: Fix tempfile examples for Python 3. Message-ID: http://hg.python.org/cpython/rev/87d89f767b23 changeset: 69232:87d89f767b23 branch: 3.2 parent: 69229:35f4e0d1d64b user: Ross Lagerwall date: Sun Apr 10 09:30:04 2011 +0200 summary: Issue #11818: Fix tempfile examples for Python 3. files: Doc/library/tempfile.rst | 10 +++++----- Misc/NEWS | 5 +++++ 2 files changed, 10 insertions(+), 5 deletions(-) diff --git a/Doc/library/tempfile.rst b/Doc/library/tempfile.rst --- a/Doc/library/tempfile.rst +++ b/Doc/library/tempfile.rst @@ -242,26 +242,26 @@ # create a temporary file and write some data to it >>> fp = tempfile.TemporaryFile() - >>> fp.write('Hello world!') + >>> fp.write(b'Hello world!') # read data from file >>> fp.seek(0) >>> fp.read() - 'Hello world!' + b'Hello world!' # close the file, it will be removed >>> fp.close() # create a temporary file using a context manager >>> with tempfile.TemporaryFile() as fp: - ... fp.write('Hello world!') + ... fp.write(b'Hello world!') ... fp.seek(0) ... fp.read() - 'Hello world!' + b'Hello world!' >>> # file is now closed and removed # create a temporary directory using the context manager >>> with tempfile.TemporaryDirectory() as tmpdirname: - ... print 'created temporary directory', tmpdirname + ... print('created temporary directory', tmpdirname) >>> # directory and contents have been removed diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -257,6 +257,11 @@ - Issue #10826: Prevent sporadic failure in test_subprocess on Solaris due to open door files. +Documentation +------------- + +- Issue #11818: Fix tempfile examples for Python 3. + What's New in Python 3.2? 
========================= -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 10 09:41:48 2011 From: python-checkins at python.org (ross.lagerwall) Date: Sun, 10 Apr 2011 09:41:48 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge with 3.2 Message-ID: http://hg.python.org/cpython/rev/655c8849937f changeset: 69233:655c8849937f parent: 69231:b6fe63c914e4 parent: 69232:87d89f767b23 user: Ross Lagerwall date: Sun Apr 10 09:34:35 2011 +0200 summary: Merge with 3.2 files: Doc/library/tempfile.rst | 10 +++++----- Misc/NEWS | 5 +++++ 2 files changed, 10 insertions(+), 5 deletions(-) diff --git a/Doc/library/tempfile.rst b/Doc/library/tempfile.rst --- a/Doc/library/tempfile.rst +++ b/Doc/library/tempfile.rst @@ -242,26 +242,26 @@ # create a temporary file and write some data to it >>> fp = tempfile.TemporaryFile() - >>> fp.write('Hello world!') + >>> fp.write(b'Hello world!') # read data from file >>> fp.seek(0) >>> fp.read() - 'Hello world!' + b'Hello world!' # close the file, it will be removed >>> fp.close() # create a temporary file using a context manager >>> with tempfile.TemporaryFile() as fp: - ... fp.write('Hello world!') + ... fp.write(b'Hello world!') ... fp.seek(0) ... fp.read() - 'Hello world!' + b'Hello world!' >>> # file is now closed and removed # create a temporary directory using the context manager >>> with tempfile.TemporaryDirectory() as tmpdirname: - ... print 'created temporary directory', tmpdirname + ... print('created temporary directory', tmpdirname) >>> # directory and contents have been removed diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -466,6 +466,11 @@ - PY_PATCHLEVEL_REVISION has been removed, since it's meaningless with Mercurial. +Documentation +------------- + +- Issue #11818: Fix tempfile examples for Python 3. + What's New in Python 3.2? 
========================= -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 10 11:33:19 2011 From: python-checkins at python.org (nadeem.vawda) Date: Sun, 10 Apr 2011 11:33:19 +0200 Subject: [Python-checkins] devguide: Add myself to the Experts Index for bz2. Message-ID: http://hg.python.org/devguide/rev/5a8ced2895c7 changeset: 413:5a8ced2895c7 user: Nadeem Vawda date: Sun Apr 10 11:31:10 2011 +0200 summary: Add myself to the Experts Index for bz2. files: experts.rst | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/experts.rst b/experts.rst --- a/experts.rst +++ b/experts.rst @@ -63,7 +63,7 @@ binhex bisect rhettinger builtins -bz2 +bz2 nadeem.vawda calendar rhettinger cgi cgitb -- Repository URL: http://hg.python.org/devguide From python-checkins at python.org Sun Apr 10 11:59:35 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 10 Apr 2011 11:59:35 +0200 Subject: [Python-checkins] cpython: #2650: re.escape() no longer escapes the "_". Message-ID: http://hg.python.org/cpython/rev/dda33191f7f5 changeset: 69234:dda33191f7f5 user: Ezio Melotti date: Sun Apr 10 12:59:16 2011 +0300 summary: #2650: re.escape() no longer escapes the "_". files: Doc/library/re.rst | 9 ++++++--- Lib/re.py | 8 +++++--- Lib/test/test_re.py | 4 ++-- Misc/NEWS | 2 ++ 4 files changed, 15 insertions(+), 8 deletions(-) diff --git a/Doc/library/re.rst b/Doc/library/re.rst --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -689,9 +689,12 @@ .. function:: escape(string) - Return *string* with all non-alphanumerics backslashed; this is useful if you - want to match an arbitrary literal string that may have regular expression - metacharacters in it. + Escape all the characters in pattern except ASCII letters, numbers and ``'_'``. + This is useful if you want to match an arbitrary literal string that may + have regular expression metacharacters in it. + + .. versionchanged:: 3.3 + The ``'_'`` character is no longer escaped. 
.. function:: purge() diff --git a/Lib/re.py b/Lib/re.py --- a/Lib/re.py +++ b/Lib/re.py @@ -215,12 +215,14 @@ return _compile(pattern, flags|T) _alphanum_str = frozenset( - "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890") + "_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890") _alphanum_bytes = frozenset( - b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890") + b"_abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890") def escape(pattern): - "Escape all non-alphanumeric characters in pattern." + """ + Escape all the characters in pattern except ASCII letters, numbers and '_'. + """ if isinstance(pattern, str): alphanum = _alphanum_str s = list(pattern) diff --git a/Lib/test/test_re.py b/Lib/test/test_re.py --- a/Lib/test/test_re.py +++ b/Lib/test/test_re.py @@ -428,7 +428,7 @@ self.assertEqual(m.span(), span) def test_re_escape(self): - alnum_chars = string.ascii_letters + string.digits + alnum_chars = string.ascii_letters + string.digits + '_' p = ''.join(chr(i) for i in range(256)) for c in p: if c in alnum_chars: @@ -441,7 +441,7 @@ self.assertMatch(re.escape(p), p) def test_re_escape_byte(self): - alnum_chars = (string.ascii_letters + string.digits).encode('ascii') + alnum_chars = (string.ascii_letters + string.digits + '_').encode('ascii') p = bytes(range(256)) for i in p: b = bytes([i]) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -98,6 +98,8 @@ Library ------- +- Issue #2650: re.escape() no longer escapes the '_'. + - Issue #11757: select.select() now raises ValueError when a negative timeout is passed (previously, a select.error with EINVAL would be raised). Patch by Charles-François Natali.
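The behaviour described by the re.escape() change above can be checked with a short example. This is a sketch rather than part of the changeset; the sample strings are made up for illustration, and `re.fullmatch()` is assumed to be available (Python 3.4 or later).

```python
import re

literal = "my_var[0].value"
pattern = re.escape(literal)

# Letters, digits and '_' pass through untouched; metacharacters are escaped.
print(pattern)

# The escaped pattern matches the literal text and nothing else.
assert re.fullmatch(pattern, literal) is not None
assert re.fullmatch(pattern, "my_var0Xvalue") is None

# Used unescaped, '[0]' is a character class and '.' matches any character,
# so the same string is wrongly accepted.
assert re.fullmatch(literal, "my_var0Xvalue") is not None

# A string made only of letters, digits and '_' comes back unchanged.
assert re.escape("snake_case_42") == "snake_case_42"
```

Before this change the underscore would have been backslash-escaped as well, which was harmless for matching but made escaped identifiers harder to read.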
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Apr 10 14:05:54 2011 From: python-checkins at python.org (nick.coghlan) Date: Sun, 10 Apr 2011 14:05:54 +0200 Subject: [Python-checkins] peps: Updated given statement torture test to show why renaming strategies are flawed Message-ID: http://hg.python.org/peps/rev/fc2aa3ef6d34 changeset: 3863:fc2aa3ef6d34 user: Nick Coghlan date: Sun Apr 10 22:05:30 2011 +1000 summary: Updated given statement torture test to show why renaming strategies are flawed files: pep-3150.txt | 29 ++++++++++++++++++++++++++--- 1 files changed, 26 insertions(+), 3 deletions(-) diff --git a/pep-3150.txt b/pep-3150.txt --- a/pep-3150.txt +++ b/pep-3150.txt @@ -314,16 +314,25 @@ assert d[42] == 42 given: d = b assert "d" not in locals() + y = y given: + x = 42 + def f(): pass + y = locals("x"), f.__name__ + assert "x" not in locals() + assert "f" not in locals() + assert y == (42, "f") Most naive implementations will choke on the first complex assignment, while less naive but still broken implementations will fail when -the torture test is executed at class scope. +the torture test is executed at class scope. Renaming based strategies +struggle to support ``locals()`` correctly and also have problems with +class and function ``__name__`` attributes. And yes, that's a perfectly well-defined assignment statement. Insane, you might rightly say, but legal:: >>> def f(x): return x - ... + ... >>> x = 42 >>> b = {} >>> a = b[f(a)] = x @@ -349,6 +358,10 @@ * Return-based semantics struggle with complex assignment statements like the one in the torture test +The second thought is generally some kind of hidden renaming strategy. This +also creates problems, as Python exposes variables names via the ``locals()`` +dictionary and class and function ``__name__`` attributes. 
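The objection to renaming can be seen with ordinary functions. In the sketch below, the `_hidden_` prefix stands in for a hypothetical mangling scheme (it is not something the PEP proposes); the point is only that any such scheme is observable through ``locals()`` and ``__name__``.

```python
def as_written():
    x = 42
    def f(): pass
    return sorted(locals()), f.__name__


def after_hypothetical_renaming():
    # What a compiler might emit if it hid the inner names by mangling them.
    _hidden_x = 42
    def _hidden_f(): pass
    return sorted(locals()), _hidden_f.__name__


print(as_written())                   # (['f', 'x'], 'f')
print(after_hypothetical_renaming())  # (['_hidden_f', '_hidden_x'], '_hidden_f')
```

Code that inspects either value would see the mangled names, so a renaming strategy cannot be kept invisible to the program being compiled.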
+ The most promising approach is one based on symtable analysis and copy-in-copy-out referencing semantics to move any required name bindings between the inner and outer scopes. The torture test above @@ -371,6 +384,16 @@ # Nothing to copy out (not an assignment) _anon2() assert "d" not in locals() + def _anon3() # Nothing to copy in (no references to other variables) + x = 42 + def f(): pass + y = locals("x"), f.__name__ + y = y # Assuming no optimisation of special cases + return y # 'y' reference copied out + y = _anon3() + assert "x" not in locals() + assert "f" not in locals() + assert y == (42, "f") However, as noted in the abstract, an actual implementation of this idea has never been tried. @@ -417,7 +440,7 @@ This document has been placed in the public domain. - + .. Local Variables: mode: indented-text -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Sun Apr 10 21:28:44 2011 From: python-checkins at python.org (r.david.murray) Date: Sun, 10 Apr 2011 21:28:44 +0200 Subject: [Python-checkins] cpython: Use stock assertEqual instead of custom ndiffAssertEqual. Message-ID: http://hg.python.org/cpython/rev/00bfad341323 changeset: 69235:00bfad341323 user: R David Murray date: Sun Apr 10 15:28:29 2011 -0400 summary: Use stock assertEqual instead of custom ndiffAssertEqual. Eventually I'll actually replace the calls in the tests themselves. 
files: Lib/test/test_email/__init__.py | 10 +--------- 1 files changed, 1 insertions(+), 9 deletions(-) diff --git a/Lib/test/test_email/__init__.py b/Lib/test/test_email/__init__.py --- a/Lib/test/test_email/__init__.py +++ b/Lib/test/test_email/__init__.py @@ -29,15 +29,7 @@ super().__init__(*args, **kw) self.addTypeEqualityFunc(bytes, self.assertBytesEqual) - def ndiffAssertEqual(self, first, second): - """Like assertEqual except use ndiff for readable output.""" - if first != second: - sfirst = str(first) - ssecond = str(second) - rfirst = [repr(line) for line in sfirst.splitlines()] - rsecond = [repr(line) for line in ssecond.splitlines()] - diff = difflib.ndiff(rfirst, rsecond) - raise self.failureException(NL + NL.join(diff)) + ndiffAssertEqual = unittest.TestCase.assertEqual def _msgobj(self, filename): with openfile(filename) as fp: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 00:23:29 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 11 Apr 2011 00:23:29 +0200 Subject: [Python-checkins] cpython (3.2): Issue #8428: Fix a race condition in multiprocessing.Pool when terminating Message-ID: http://hg.python.org/cpython/rev/d5e43afeede6 changeset: 69236:d5e43afeede6 branch: 3.2 parent: 69232:87d89f767b23 user: Antoine Pitrou date: Mon Apr 11 00:18:59 2011 +0200 summary: Issue #8428: Fix a race condition in multiprocessing.Pool when terminating worker processes: new processes would be spawned while the pool is being shut down. Patch by Charles-Fran?ois Natali. 
files: Lib/multiprocessing/pool.py | 9 +++++++-- Misc/NEWS | 4 ++++ 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/Lib/multiprocessing/pool.py b/Lib/multiprocessing/pool.py --- a/Lib/multiprocessing/pool.py +++ b/Lib/multiprocessing/pool.py @@ -322,6 +322,8 @@ while pool._worker_handler._state == RUN and pool._state == RUN: pool._maintain_pool() time.sleep(0.1) + # send sentinel to stop workers + pool._taskqueue.put(None) debug('worker handler exiting') @staticmethod @@ -440,7 +442,6 @@ if self._state == RUN: self._state = CLOSE self._worker_handler._state = CLOSE - self._taskqueue.put(None) def terminate(self): debug('terminating pool') @@ -474,7 +475,6 @@ worker_handler._state = TERMINATE task_handler._state = TERMINATE - taskqueue.put(None) # sentinel debug('helping task handler/workers to finish') cls._help_stuff_finish(inqueue, task_handler, len(pool)) @@ -484,6 +484,11 @@ result_handler._state = TERMINATE outqueue.put(None) # sentinel + # We must wait for the worker handler to exit before terminating + # workers because we don't want workers to be restarted behind our back. + debug('joining worker handler') + worker_handler.join() + # Terminate workers which haven't already finished. if pool and hasattr(pool[0], 'terminate'): debug('terminating workers') diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -53,6 +53,10 @@ Library ------- +- Issue #8428: Fix a race condition in multiprocessing.Pool when terminating + worker processes: new processes would be spawned while the pool is being + shut down. Patch by Charles-Fran?ois Natali. + - Issue #7311: fix html.parser to accept non-ASCII attribute values. 
- Issue #11605: email.parser.BytesFeedParser was incorrectly converting multipart -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 00:23:31 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 11 Apr 2011 00:23:31 +0200 Subject: [Python-checkins] cpython (3.2): Issue #11814: Fix likely typo in multiprocessing.Pool._terminate(). Message-ID: http://hg.python.org/cpython/rev/c046b7e1087b changeset: 69237:c046b7e1087b branch: 3.2 user: Antoine Pitrou date: Mon Apr 11 00:20:23 2011 +0200 summary: Issue #11814: Fix likely typo in multiprocessing.Pool._terminate(). files: Lib/multiprocessing/pool.py | 2 +- Misc/NEWS | 2 ++ 2 files changed, 3 insertions(+), 1 deletions(-) diff --git a/Lib/multiprocessing/pool.py b/Lib/multiprocessing/pool.py --- a/Lib/multiprocessing/pool.py +++ b/Lib/multiprocessing/pool.py @@ -500,7 +500,7 @@ task_handler.join() debug('joining result handler') - task_handler.join() + result_handler.join() if pool and hasattr(pool[0], 'terminate'): debug('joining pool workers') diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -53,6 +53,8 @@ Library ------- +- Issue #11814: Fix likely typo in multiprocessing.Pool._terminate(). + - Issue #8428: Fix a race condition in multiprocessing.Pool when terminating worker processes: new processes would be spawned while the pool is being shut down. Patch by Charles-François Natali.
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 00:23:38 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 11 Apr 2011 00:23:38 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge from 3.2 (issue #11814, issue #8428) Message-ID: http://hg.python.org/cpython/rev/76a3fc180ce0 changeset: 69238:76a3fc180ce0 parent: 69235:00bfad341323 parent: 69237:c046b7e1087b user: Antoine Pitrou date: Mon Apr 11 00:22:08 2011 +0200 summary: Merge from 3.2 (issue #11814, issue #8428) files: Lib/multiprocessing/pool.py | 11 ++++++++--- Misc/NEWS | 6 ++++++ 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/Lib/multiprocessing/pool.py b/Lib/multiprocessing/pool.py --- a/Lib/multiprocessing/pool.py +++ b/Lib/multiprocessing/pool.py @@ -322,6 +322,8 @@ while pool._worker_handler._state == RUN and pool._state == RUN: pool._maintain_pool() time.sleep(0.1) + # send sentinel to stop workers + pool._taskqueue.put(None) debug('worker handler exiting') @staticmethod @@ -440,7 +442,6 @@ if self._state == RUN: self._state = CLOSE self._worker_handler._state = CLOSE - self._taskqueue.put(None) def terminate(self): debug('terminating pool') @@ -474,7 +475,6 @@ worker_handler._state = TERMINATE task_handler._state = TERMINATE - taskqueue.put(None) # sentinel debug('helping task handler/workers to finish') cls._help_stuff_finish(inqueue, task_handler, len(pool)) @@ -484,6 +484,11 @@ result_handler._state = TERMINATE outqueue.put(None) # sentinel + # We must wait for the worker handler to exit before terminating + # workers because we don't want workers to be restarted behind our back. + debug('joining worker handler') + worker_handler.join() + # Terminate workers which haven't already finished. 
if pool and hasattr(pool[0], 'terminate'): debug('terminating workers') @@ -495,7 +500,7 @@ task_handler.join() debug('joining result handler') - task_handler.join() + result_handler.join() if pool and hasattr(pool[0], 'terminate'): debug('joining pool workers') diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -98,6 +98,12 @@ Library ------- +- Issue #11814: Fix likely typo in multiprocessing.Pool._terminate(). + +- Issue #8428: Fix a race condition in multiprocessing.Pool when terminating + worker processes: new processes would be spawned while the pool is being + shut down. Patch by Charles-Fran?ois Natali. + - Issue #2650: re.escape() no longer escapes the '_'. - Issue #11757: select.select() now raises ValueError when a negative timeout -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 00:28:24 2011 From: python-checkins at python.org (antoine.pitrou) Date: Mon, 11 Apr 2011 00:28:24 +0200 Subject: [Python-checkins] cpython (2.7): Issue #8428: Fix a race condition in multiprocessing.Pool when terminating Message-ID: http://hg.python.org/cpython/rev/dfc61dc14f59 changeset: 69239:dfc61dc14f59 branch: 2.7 parent: 69228:3630bc3d5a88 user: Antoine Pitrou date: Mon Apr 11 00:26:42 2011 +0200 summary: Issue #8428: Fix a race condition in multiprocessing.Pool when terminating worker processes: new processes would be spawned while the pool is being shut down. Patch by Charles-Fran?ois Natali. 
files: Lib/multiprocessing/pool.py | 9 +++++++-- Misc/NEWS | 4 ++++ 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/Lib/multiprocessing/pool.py b/Lib/multiprocessing/pool.py --- a/Lib/multiprocessing/pool.py +++ b/Lib/multiprocessing/pool.py @@ -295,6 +295,8 @@ while pool._worker_handler._state == RUN and pool._state == RUN: pool._maintain_pool() time.sleep(0.1) + # send sentinel to stop workers + pool._taskqueue.put(None) debug('worker handler exiting') @staticmethod @@ -413,7 +415,6 @@ if self._state == RUN: self._state = CLOSE self._worker_handler._state = CLOSE - self._taskqueue.put(None) def terminate(self): debug('terminating pool') @@ -447,7 +448,6 @@ worker_handler._state = TERMINATE task_handler._state = TERMINATE - taskqueue.put(None) # sentinel debug('helping task handler/workers to finish') cls._help_stuff_finish(inqueue, task_handler, len(pool)) @@ -457,6 +457,11 @@ result_handler._state = TERMINATE outqueue.put(None) # sentinel + # We must wait for the worker handler to exit before terminating + # workers because we don't want workers to be restarted behind our back. + debug('joining worker handler') + worker_handler.join() + # Terminate workers which haven't already finished. if pool and hasattr(pool[0], 'terminate'): debug('terminating workers') diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -51,6 +51,10 @@ Library ------- +- Issue #8428: Fix a race condition in multiprocessing.Pool when terminating + worker processes: new processes would be spawned while the pool is being + shut down. Patch by Charles-Fran?ois Natali. + - Issue #7311: fix HTMLParser to accept non-ASCII attribute values. - Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. 
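Editorial sketch (not part of the changeset): the ordering that the Issue #8428 patch enforces can be illustrated with a toy model. The class below is hypothetical and uses threads rather than processes; MiniPool, _maintain and stop_event are illustrative names, not multiprocessing API. It only shows why the handler that respawns workers is joined before the workers themselves are stopped.

```python
import threading
import time

RUN, TERMINATE = 0, 2  # illustrative state constants


class MiniPool:
    """Toy stand-in for a pool whose handler thread respawns workers."""

    def __init__(self, size=2):
        self.state = RUN
        self.size = size
        self.workers = []
        self.stop_event = threading.Event()  # workers exit once this is set
        self.handler = threading.Thread(target=self._maintain)
        self.handler.daemon = True
        self.handler.start()

    def _worker(self):
        # A worker simply blocks until it is told to stop.
        self.stop_event.wait()

    def _maintain(self):
        # Keep the pool at full size for as long as it is running.
        while self.state == RUN:
            self.workers = [w for w in self.workers if w.is_alive()]
            while len(self.workers) < self.size:
                w = threading.Thread(target=self._worker)
                w.daemon = True
                w.start()
                self.workers.append(w)
            time.sleep(0.01)

    def terminate(self):
        self.state = TERMINATE
        # Wait for the handler to exit first, so that no worker can be
        # restarted behind our back while we are shutting down.
        self.handler.join()
        self.stop_event.set()
        for w in self.workers:
            w.join()
```

If terminate() stopped the workers before joining the handler, the handler could observe the dead workers and start replacements during shutdown, which is the race described in the summary above.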
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 02:24:41 2011 From: python-checkins at python.org (raymond.hettinger) Date: Mon, 11 Apr 2011 02:24:41 +0200 Subject: [Python-checkins] cpython (3.2): Cleanup and modernize code prior to working on Issue 11747. Message-ID: http://hg.python.org/cpython/rev/36648097fcd4 changeset: 69240:36648097fcd4 branch: 3.2 parent: 69237:c046b7e1087b user: Raymond Hettinger date: Sun Apr 10 17:14:56 2011 -0700 summary: Cleanup and modernize code prior to working on Issue 11747. files: Lib/difflib.py | 53 +++++++++++++++++++------------------ 1 files changed, 27 insertions(+), 26 deletions(-) diff --git a/Lib/difflib.py b/Lib/difflib.py --- a/Lib/difflib.py +++ b/Lib/difflib.py @@ -1188,22 +1188,23 @@ started = False for group in SequenceMatcher(None,a,b).get_grouped_opcodes(n): if not started: - fromdate = '\t%s' % fromfiledate if fromfiledate else '' - todate = '\t%s' % tofiledate if tofiledate else '' - yield '--- %s%s%s' % (fromfile, fromdate, lineterm) - yield '+++ %s%s%s' % (tofile, todate, lineterm) started = True - i1, i2, j1, j2 = group[0][1], group[-1][2], group[0][3], group[-1][4] - yield "@@ -%d,%d +%d,%d @@%s" % (i1+1, i2-i1, j1+1, j2-j1, lineterm) + fromdate = '\t{}'.format(fromfiledate) if fromfiledate else '' + todate = '\t{}'.format(tofiledate) if tofiledate else '' + yield '--- {}{}{}'.format(fromfile, fromdate, lineterm) + yield '+++ {}{}{}'.format(tofile, todate, lineterm) + first, last = group[0], group[-1] + i1, i2, j1, j2 = first[1], last[2], first[3], last[4] + yield '@@ -{},{} +{},{} @@{}'.format(i1+1, i2-i1, j1+1, j2-j1, lineterm) for tag, i1, i2, j1, j2 in group: if tag == 'equal': for line in a[i1:i2]: yield ' ' + line continue - if tag == 'replace' or tag == 'delete': + if tag in {'replace', 'delete'}: for line in a[i1:i2]: yield '-' + line - if tag == 'replace' or tag == 'insert': + if tag in {'replace', 'insert'}: for line in b[j1:j2]: yield '+' + line @@ 
-1252,38 +1253,38 @@ four """ + prefix = dict(insert='+ ', delete='- ', replace='! ', equal=' ') started = False - prefixmap = {'insert':'+ ', 'delete':'- ', 'replace':'! ', 'equal':' '} for group in SequenceMatcher(None,a,b).get_grouped_opcodes(n): if not started: - fromdate = '\t%s' % fromfiledate if fromfiledate else '' - todate = '\t%s' % tofiledate if tofiledate else '' - yield '*** %s%s%s' % (fromfile, fromdate, lineterm) - yield '--- %s%s%s' % (tofile, todate, lineterm) started = True + fromdate = '\t{}'.format(fromfiledate) if fromfiledate else '' + todate = '\t{}'.format(tofiledate) if tofiledate else '' + yield '*** {}{}{}'.format(fromfile, fromdate, lineterm) + yield '--- {}{}{}'.format(tofile, todate, lineterm) - yield '***************%s' % (lineterm,) - if group[-1][2] - group[0][1] >= 2: - yield '*** %d,%d ****%s' % (group[0][1]+1, group[-1][2], lineterm) + first, last = group[0], group[-1] + yield '***************{}'.format(lineterm) + + if last[2] - first[1] > 1: + yield '*** {},{} ****{}'.format(first[1]+1, last[2], lineterm) else: - yield '*** %d ****%s' % (group[-1][2], lineterm) - visiblechanges = [e for e in group if e[0] in ('replace', 'delete')] - if visiblechanges: + yield '*** {} ****{}'.format(last[2], lineterm) + if any(tag in {'replace', 'delete'} for tag, _, _, _, _ in group): for tag, i1, i2, _, _ in group: if tag != 'insert': for line in a[i1:i2]: - yield prefixmap[tag] + line + yield prefix[tag] + line - if group[-1][4] - group[0][3] >= 2: - yield '--- %d,%d ----%s' % (group[0][3]+1, group[-1][4], lineterm) + if last[4] - first[3] > 1: + yield '--- {},{} ----{}'.format(first[3]+1, last[4], lineterm) else: - yield '--- %d ----%s' % (group[-1][4], lineterm) - visiblechanges = [e for e in group if e[0] in ('replace', 'insert')] - if visiblechanges: + yield '--- {} ----{}'.format(last[4], lineterm) + if any(tag in {'replace', 'insert'} for tag, _, _, _, _ in group): for tag, _, _, j1, j2 in group: if tag != 'delete': for line in 
b[j1:j2]: - yield prefixmap[tag] + line + yield prefix[tag] + line def ndiff(a, b, linejunk=None, charjunk=IS_CHARACTER_JUNK): r""" -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 02:24:43 2011 From: python-checkins at python.org (raymond.hettinger) Date: Mon, 11 Apr 2011 02:24:43 +0200 Subject: [Python-checkins] cpython: Cleanup and modernize code prior to working on Issue 11747. Message-ID: http://hg.python.org/cpython/rev/58a3bfcc70f7 changeset: 69241:58a3bfcc70f7 parent: 69238:76a3fc180ce0 user: Raymond Hettinger date: Sun Apr 10 17:23:32 2011 -0700 summary: Cleanup and modernize code prior to working on Issue 11747. files: Lib/difflib.py | 53 +++++++++++++++++++------------------ 1 files changed, 27 insertions(+), 26 deletions(-) diff --git a/Lib/difflib.py b/Lib/difflib.py --- a/Lib/difflib.py +++ b/Lib/difflib.py @@ -1188,22 +1188,23 @@ started = False for group in SequenceMatcher(None,a,b).get_grouped_opcodes(n): if not started: - fromdate = '\t%s' % fromfiledate if fromfiledate else '' - todate = '\t%s' % tofiledate if tofiledate else '' - yield '--- %s%s%s' % (fromfile, fromdate, lineterm) - yield '+++ %s%s%s' % (tofile, todate, lineterm) started = True - i1, i2, j1, j2 = group[0][1], group[-1][2], group[0][3], group[-1][4] - yield "@@ -%d,%d +%d,%d @@%s" % (i1+1, i2-i1, j1+1, j2-j1, lineterm) + fromdate = '\t{}'.format(fromfiledate) if fromfiledate else '' + todate = '\t{}'.format(tofiledate) if tofiledate else '' + yield '--- {}{}{}'.format(fromfile, fromdate, lineterm) + yield '+++ {}{}{}'.format(tofile, todate, lineterm) + first, last = group[0], group[-1] + i1, i2, j1, j2 = first[1], last[2], first[3], last[4] + yield '@@ -{},{} +{},{} @@{}'.format(i1+1, i2-i1, j1+1, j2-j1, lineterm) for tag, i1, i2, j1, j2 in group: if tag == 'equal': for line in a[i1:i2]: yield ' ' + line continue - if tag == 'replace' or tag == 'delete': + if tag in {'replace', 'delete'}: for line in a[i1:i2]: yield '-' + line - if tag 
== 'replace' or tag == 'insert': + if tag in {'replace', 'insert'}: for line in b[j1:j2]: yield '+' + line @@ -1252,38 +1253,38 @@ four """ + prefix = dict(insert='+ ', delete='- ', replace='! ', equal=' ') started = False - prefixmap = {'insert':'+ ', 'delete':'- ', 'replace':'! ', 'equal':' '} for group in SequenceMatcher(None,a,b).get_grouped_opcodes(n): if not started: - fromdate = '\t%s' % fromfiledate if fromfiledate else '' - todate = '\t%s' % tofiledate if tofiledate else '' - yield '*** %s%s%s' % (fromfile, fromdate, lineterm) - yield '--- %s%s%s' % (tofile, todate, lineterm) started = True + fromdate = '\t{}'.format(fromfiledate) if fromfiledate else '' + todate = '\t{}'.format(tofiledate) if tofiledate else '' + yield '*** {}{}{}'.format(fromfile, fromdate, lineterm) + yield '--- {}{}{}'.format(tofile, todate, lineterm) - yield '***************%s' % (lineterm,) - if group[-1][2] - group[0][1] >= 2: - yield '*** %d,%d ****%s' % (group[0][1]+1, group[-1][2], lineterm) + first, last = group[0], group[-1] + yield '***************{}'.format(lineterm) + + if last[2] - first[1] > 1: + yield '*** {},{} ****{}'.format(first[1]+1, last[2], lineterm) else: - yield '*** %d ****%s' % (group[-1][2], lineterm) - visiblechanges = [e for e in group if e[0] in {'replace', 'delete'}] - if visiblechanges: + yield '*** {} ****{}'.format(last[2], lineterm) + if any(tag in {'replace', 'delete'} for tag, _, _, _, _ in group): for tag, i1, i2, _, _ in group: if tag != 'insert': for line in a[i1:i2]: - yield prefixmap[tag] + line + yield prefix[tag] + line - if group[-1][4] - group[0][3] >= 2: - yield '--- %d,%d ----%s' % (group[0][3]+1, group[-1][4], lineterm) + if last[4] - first[3] > 1: + yield '--- {},{} ----{}'.format(first[3]+1, last[4], lineterm) else: - yield '--- %d ----%s' % (group[-1][4], lineterm) - visiblechanges = [e for e in group if e[0] in {'replace', 'insert'}] - if visiblechanges: + yield '--- {} ----{}'.format(last[4], lineterm) + if any(tag in {'replace', 
'insert'} for tag, _, _, _, _ in group): for tag, _, _, j1, j2 in group: if tag != 'delete': for line in b[j1:j2]: - yield prefixmap[tag] + line + yield prefix[tag] + line def ndiff(a, b, linejunk=None, charjunk=IS_CHARACTER_JUNK): r""" -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 02:24:48 2011 From: python-checkins at python.org (raymond.hettinger) Date: Mon, 11 Apr 2011 02:24:48 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge Message-ID: http://hg.python.org/cpython/rev/3644a1e3a289 changeset: 69242:3644a1e3a289 parent: 69241:58a3bfcc70f7 parent: 69240:36648097fcd4 user: Raymond Hettinger date: Sun Apr 10 17:24:14 2011 -0700 summary: Merge files: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 02:45:55 2011 From: python-checkins at python.org (ezio.melotti) Date: Mon, 11 Apr 2011 02:45:55 +0200 Subject: [Python-checkins] cpython (2.7): #4877: Fix a segfault in xml.parsers.expat while attempting to parse a closed Message-ID: http://hg.python.org/cpython/rev/28705a7987c5 changeset: 69243:28705a7987c5 branch: 2.7 parent: 69239:dfc61dc14f59 user: Ezio Melotti date: Mon Apr 11 03:44:28 2011 +0300 summary: #4877: Fix a segfault in xml.parsers.expat while attempting to parse a closed file. files: Lib/test/test_pyexpat.py | 11 ++++++++ Misc/NEWS | 5 ++- Modules/pyexpat.c | 38 ++++++++------------------- 3 files changed, 27 insertions(+), 27 deletions(-) diff --git a/Lib/test/test_pyexpat.py b/Lib/test/test_pyexpat.py --- a/Lib/test/test_pyexpat.py +++ b/Lib/test/test_pyexpat.py @@ -6,6 +6,7 @@ from xml.parsers import expat +from test import test_support from test.test_support import sortdict, run_unittest @@ -217,6 +218,16 @@ self.assertEqual(op[15], "External entity ref: (None, u'entity.file', None)") self.assertEqual(op[16], "End element: u'root'") + # Issue 4877: expat.ParseFile causes segfault on a closed file. 
+ fp = open(test_support.TESTFN, 'wb') + try: + fp.close() + parser = expat.ParserCreate() + with self.assertRaises(ValueError): + parser.ParseFile(fp) + finally: + test_support.unlink(test_support.TESTFN) + class NamespaceSeparatorTest(unittest.TestCase): def test_legal(self): diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -51,11 +51,14 @@ Library ------- +- Issue #4877: Fix a segfault in xml.parsers.expat while attempting to parse + a closed file. + - Issue #8428: Fix a race condition in multiprocessing.Pool when terminating worker processes: new processes would be spawned while the pool is being shut down. Patch by Charles-Fran?ois Natali. -- Issue #7311: fix HTMLParser to accept non-ASCII attribute values. +- Issue #7311: Fix HTMLParser to accept non-ASCII attribute values. - Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. diff --git a/Modules/pyexpat.c b/Modules/pyexpat.c --- a/Modules/pyexpat.c +++ b/Modules/pyexpat.c @@ -962,21 +962,15 @@ xmlparse_ParseFile(xmlparseobject *self, PyObject *f) { int rv = 1; - FILE *fp; PyObject *readmethod = NULL; - if (PyFile_Check(f)) { - fp = PyFile_AsFile(f); - } - else { - fp = NULL; - readmethod = PyObject_GetAttrString(f, "read"); - if (readmethod == NULL) { - PyErr_Clear(); - PyErr_SetString(PyExc_TypeError, - "argument must have 'read' attribute"); - return NULL; - } + readmethod = PyObject_GetAttrString(f, "read"); + if (readmethod == NULL) { + PyErr_Clear(); + PyErr_SetString(PyExc_TypeError, + "argument must have 'read' attribute"); + return NULL; + } for (;;) { int bytes_read; @@ -986,20 +980,12 @@ return PyErr_NoMemory(); } - if (fp) { - bytes_read = fread(buf, sizeof(char), BUF_SIZE, fp); - if (bytes_read < 0) { - PyErr_SetFromErrno(PyExc_IOError); - return NULL; - } + bytes_read = readinst(buf, BUF_SIZE, readmethod); + if (bytes_read < 0) { + Py_XDECREF(readmethod); + return NULL; } - else { - bytes_read = readinst(buf, BUF_SIZE, readmethod); - if (bytes_read < 0) { 
- Py_XDECREF(readmethod); - return NULL; - } - } + rv = XML_ParseBuffer(self->itself, bytes_read, bytes_read == 0); if (PyErr_Occurred()) { Py_XDECREF(readmethod); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 02:45:56 2011 From: python-checkins at python.org (ezio.melotti) Date: Mon, 11 Apr 2011 02:45:56 +0200 Subject: [Python-checkins] cpython (2.7): Remove unnecessary call to PyErr_Clear. Message-ID: http://hg.python.org/cpython/rev/ba699cf9bdbb changeset: 69244:ba699cf9bdbb branch: 2.7 user: Ezio Melotti date: Mon Apr 11 03:45:25 2011 +0300 summary: Remove unnecessary call to PyErr_Clear. files: Modules/pyexpat.c | 1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/Modules/pyexpat.c b/Modules/pyexpat.c --- a/Modules/pyexpat.c +++ b/Modules/pyexpat.c @@ -966,7 +966,6 @@ readmethod = PyObject_GetAttrString(f, "read"); if (readmethod == NULL) { - PyErr_Clear(); PyErr_SetString(PyExc_TypeError, "argument must have 'read' attribute"); return NULL; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 02:51:35 2011 From: python-checkins at python.org (ezio.melotti) Date: Mon, 11 Apr 2011 02:51:35 +0200 Subject: [Python-checkins] cpython (3.2): Remove unnecessary call to PyErr_Clear. Message-ID: http://hg.python.org/cpython/rev/6b4467e71872 changeset: 69245:6b4467e71872 branch: 3.2 parent: 69240:36648097fcd4 user: Ezio Melotti date: Mon Apr 11 03:48:57 2011 +0300 summary: Remove unnecessary call to PyErr_Clear. 
files: Modules/pyexpat.c | 1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/Modules/pyexpat.c b/Modules/pyexpat.c --- a/Modules/pyexpat.c +++ b/Modules/pyexpat.c @@ -846,7 +846,6 @@ readmethod = PyObject_GetAttrString(f, "read"); if (readmethod == NULL) { - PyErr_Clear(); PyErr_SetString(PyExc_TypeError, "argument must have 'read' attribute"); return NULL; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 02:51:36 2011 From: python-checkins at python.org (ezio.melotti) Date: Mon, 11 Apr 2011 02:51:36 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge with 3.2. Message-ID: http://hg.python.org/cpython/rev/2d1d9759d3a4 changeset: 69246:2d1d9759d3a4 parent: 69242:3644a1e3a289 parent: 69245:6b4467e71872 user: Ezio Melotti date: Mon Apr 11 03:51:14 2011 +0300 summary: Merge with 3.2. files: Modules/pyexpat.c | 1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/Modules/pyexpat.c b/Modules/pyexpat.c --- a/Modules/pyexpat.c +++ b/Modules/pyexpat.c @@ -847,7 +847,6 @@ readmethod = PyObject_GetAttrString(f, "read"); if (readmethod == NULL) { - PyErr_Clear(); PyErr_SetString(PyExc_TypeError, "argument must have 'read' attribute"); return NULL; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 04:25:36 2011 From: python-checkins at python.org (reid.kleckner) Date: Mon, 11 Apr 2011 04:25:36 +0200 Subject: [Python-checkins] cpython: Add Misc/NEWS "What's New" entry for subprocess timeouts. Message-ID: http://hg.python.org/cpython/rev/9140f2363623 changeset: 69247:9140f2363623 user: Reid Kleckner date: Sun Apr 10 22:23:08 2011 -0400 summary: Add Misc/NEWS "What's New" entry for subprocess timeouts. 
files: Misc/NEWS | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,11 @@ Core and Builtins ----------------- +- Issue #5673: Added a `timeout` keyword argument to subprocess.Popen.wait, + subprocess.Popen.communicated, subprocess.call, subprocess.check_call, and + subprocess.check_output. If the blocking operation takes more than `timeout` + seconds, the `subprocess.TimeoutExpired` exception is raised. + - Issue #11650: PyOS_StdioReadline() retries fgets() if it was interrupted (EINTR), for example if the program is stopped with CTRL+z on Mac OS X. Patch written by Charles-Francois Natali. -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Mon Apr 11 05:01:13 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Mon, 11 Apr 2011 05:01:13 +0200 Subject: [Python-checkins] Daily reference leaks (2d1d9759d3a4): sum=0 Message-ID: results for 2d1d9759d3a4 on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogl1EJSG', '-x'] From python-checkins at python.org Mon Apr 11 09:44:03 2011 From: python-checkins at python.org (vinay.sajip) Date: Mon, 11 Apr 2011 09:44:03 +0200 Subject: [Python-checkins] cpython: Added 'handlers' argument to logging.basicConfig. Message-ID: http://hg.python.org/cpython/rev/fbf5e4b1e52a changeset: 69248:fbf5e4b1e52a user: Vinay Sajip date: Mon Apr 11 08:42:07 2011 +0100 summary: Added 'handlers' argument to logging.basicConfig. 
files: Doc/library/logging.rst | 17 ++++++++++- Lib/logging/__init__.py | 42 ++++++++++++++++++++++------ Lib/test/test_logging.py | 20 +++++++++++++ Misc/NEWS | 4 ++ 4 files changed, 73 insertions(+), 10 deletions(-) diff --git a/Doc/library/logging.rst b/Doc/library/logging.rst --- a/Doc/library/logging.rst +++ b/Doc/library/logging.rst @@ -983,12 +983,27 @@ | ``stream`` | Use the specified stream to initialize the | | | StreamHandler. Note that this argument is | | | incompatible with 'filename' - if both are | - | | present, 'stream' is ignored. | + | | present, a ``ValueError`` is raised. | + +--------------+---------------------------------------------+ + | ``handlers`` | If specified, this should be an iterable of | + | | already created handlers to add to the root | + | | logger. Any handlers which don't already | + | | have a formatter set will be assigned the | + | | default formatter created in this function. | + | | Note that this argument is incompatible | + | | with 'filename' or 'stream' - if both are | + | | present, a ``ValueError`` is raised. | +--------------+---------------------------------------------+ .. versionchanged:: 3.2 The ``style`` argument was added. + .. versionchanged:: 3.3 + The ``handlers`` argument was added. Additional checks were added to + catch situations where incompatible arguments are specified (e.g. + ``handlers`` together with ``stream`` or ``filename``, or ``stream`` + together with ``filename``). + .. function:: shutdown() diff --git a/Lib/logging/__init__.py b/Lib/logging/__init__.py --- a/Lib/logging/__init__.py +++ b/Lib/logging/__init__.py @@ -1650,6 +1650,10 @@ stream Use the specified stream to initialize the StreamHandler. Note that this argument is incompatible with 'filename' - if both are present, 'stream' is ignored. + handlers If specified, this should be an iterable of already created + handlers, which will be added to the root handler. 
Any handler + in the list which does not have a formatter assigned will be + assigned the formatter created in this function. Note that you could specify a stream created using open(filename, mode) rather than passing the filename and mode in. However, it should be @@ -1657,27 +1661,47 @@ using sys.stdout or sys.stderr), whereas FileHandler closes its stream when the handler is closed. - .. versionchanged: 3.2 + .. versionchanged:: 3.2 Added the ``style`` parameter. + + .. versionchanged:: 3.3 + Added the ``handlers`` parameter. A ``ValueError`` is now thrown for + incompatible arguments (e.g. ``handlers`` specified together with + ``filename``/``filemode``, or ``filename``/``filemode`` specified + together with ``stream``, or ``handlers`` specified together with + ``stream``. """ # Add thread safety in case someone mistakenly calls # basicConfig() from multiple threads _acquireLock() try: if len(root.handlers) == 0: - filename = kwargs.get("filename") - if filename: - mode = kwargs.get("filemode", 'a') - hdlr = FileHandler(filename, mode) + handlers = kwargs.get("handlers") + if handlers is None: + if "stream" in kwargs and "filename" in kwargs: + raise ValueError("'stream' and 'filename' should not be " + "specified together") else: - stream = kwargs.get("stream") - hdlr = StreamHandler(stream) + if "stream" in kwargs or "filename" in kwargs: + raise ValueError("'stream' or 'filename' should not be " + "specified together with 'handlers'") + if handlers is None: + filename = kwargs.get("filename") + if filename: + mode = kwargs.get("filemode", 'a') + h = FileHandler(filename, mode) + else: + stream = kwargs.get("stream") + h = StreamHandler(stream) + handlers = [h] fs = kwargs.get("format", BASIC_FORMAT) dfs = kwargs.get("datefmt", None) style = kwargs.get("style", '%') fmt = Formatter(fs, dfs, style) - hdlr.setFormatter(fmt) - root.addHandler(hdlr) + for h in handlers: + if h.formatter is None: + h.setFormatter(fmt) + root.addHandler(h) level = 
kwargs.get("level") if level is not None: root.setLevel(level) diff --git a/Lib/test/test_logging.py b/Lib/test/test_logging.py --- a/Lib/test/test_logging.py +++ b/Lib/test/test_logging.py @@ -2482,6 +2482,26 @@ logging.basicConfig(level=57) self.assertEqual(logging.root.level, 57) + def test_incompatible(self): + assertRaises = self.assertRaises + handlers = [logging.StreamHandler()] + stream = sys.stderr + assertRaises(ValueError, logging.basicConfig, filename='test.log', + stream=stream) + assertRaises(ValueError, logging.basicConfig, filename='test.log', + handlers=handlers) + assertRaises(ValueError, logging.basicConfig, stream=stream, + handlers=handlers) + + def test_handlers(self): + handlers = [logging.StreamHandler(), logging.StreamHandler(sys.stdout)] + logging.basicConfig(handlers=handlers) + self.assertIs(handlers[0], logging.root.handlers[0]) + self.assertIs(handlers[1], logging.root.handlers[1]) + self.assertIsNotNone(handlers[0].formatter) + self.assertIsNotNone(handlers[1].formatter) + self.assertIs(handlers[0].formatter, handlers[1].formatter) + def _test_log(self, method, level=None): # logging.root has no handlers so basicConfig should be called called = [] diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -103,6 +103,10 @@ Library ------- +- logging.basicConfig now supports an optional 'handlers' argument taking an + iterable of handlers to be added to the root logger. Additional parameter + checks were also added to basicConfig. + - Issue #11814: Fix likely typo in multiprocessing.Pool._terminate(). - Issue #8428: Fix a race condition in multiprocessing.Pool when terminating -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 09:44:03 2011 From: python-checkins at python.org (vinay.sajip) Date: Mon, 11 Apr 2011 09:44:03 +0200 Subject: [Python-checkins] cpython: Whitespace normalized. 
Message-ID: http://hg.python.org/cpython/rev/c9e9142d82d6 changeset: 69249:c9e9142d82d6 user: Vinay Sajip date: Mon Apr 11 08:43:52 2011 +0100 summary: Whitespace normalized. files: Lib/logging/__init__.py | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Lib/logging/__init__.py b/Lib/logging/__init__.py --- a/Lib/logging/__init__.py +++ b/Lib/logging/__init__.py @@ -1663,7 +1663,7 @@ .. versionchanged:: 3.2 Added the ``style`` parameter. - + .. versionchanged:: 3.3 Added the ``handlers`` parameter. A ``ValueError`` is now thrown for incompatible arguments (e.g. ``handlers`` specified together with -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 22:11:11 2011 From: python-checkins at python.org (raymond.hettinger) Date: Mon, 11 Apr 2011 22:11:11 +0200 Subject: [Python-checkins] cpython (3.2): Issue #11747: Fix range formatting in context and unified diffs. Message-ID: http://hg.python.org/cpython/rev/a2ee967de44f changeset: 69250:a2ee967de44f branch: 3.2 parent: 69245:6b4467e71872 user: Raymond Hettinger date: Mon Apr 11 12:40:58 2011 -0700 summary: Issue #11747: Fix range formatting in context and unified diffs. 
files: Lib/difflib.py | 34 ++++++++++++++++++--------- Lib/test/test_difflib.py | 16 +++++++++++++ Misc/NEWS | 3 ++ 3 files changed, 42 insertions(+), 11 deletions(-) diff --git a/Lib/difflib.py b/Lib/difflib.py --- a/Lib/difflib.py +++ b/Lib/difflib.py @@ -1144,6 +1144,17 @@ return ch in ws +def _format_range(start, stop): + 'Convert range to the "ed" format' + # Per the diff spec at http://www.unix.org/single_unix_specification/ + beginning = start + 1 # lines start numbering with one + length = stop - start + if length == 1: + return '{}'.format(beginning) + if not length: + beginning -= 1 # empty ranges begin at line just before the range + return '{},{}'.format(beginning, length) + def unified_diff(a, b, fromfile='', tofile='', fromfiledate='', tofiledate='', n=3, lineterm='\n'): r""" @@ -1193,9 +1204,12 @@ todate = '\t{}'.format(tofiledate) if tofiledate else '' yield '--- {}{}{}'.format(fromfile, fromdate, lineterm) yield '+++ {}{}{}'.format(tofile, todate, lineterm) + first, last = group[0], group[-1] - i1, i2, j1, j2 = first[1], last[2], first[3], last[4] - yield '@@ -{},{} +{},{} @@{}'.format(i1+1, i2-i1, j1+1, j2-j1, lineterm) + file1_range = _format_range(first[1], last[2]) + file2_range = _format_range(first[3], last[4]) + yield '@@ -{} +{} @@{}'.format(file1_range, file2_range, lineterm) + for tag, i1, i2, j1, j2 in group: if tag == 'equal': for line in a[i1:i2]: @@ -1264,22 +1278,20 @@ yield '--- {}{}{}'.format(tofile, todate, lineterm) first, last = group[0], group[-1] - yield '***************{}'.format(lineterm) + yield '***************' + lineterm - if last[2] - first[1] > 1: - yield '*** {},{} ****{}'.format(first[1]+1, last[2], lineterm) - else: - yield '*** {} ****{}'.format(last[2], lineterm) + file1_range = _format_range(first[1], last[2]) + yield '*** {} ****{}'.format(file1_range, lineterm) + if any(tag in {'replace', 'delete'} for tag, _, _, _, _ in group): for tag, i1, i2, _, _ in group: if tag != 'insert': for line in a[i1:i2]: yield 
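Editorial sketch (not part of the changeset): the range rule introduced by this patch can be tried on its own. The function below restates the private helper from the diff above under the hypothetical public name format_range, so that it runs without depending on difflib internals. The inputs are the same half-open (start, stop) pairs exercised by test_range_format.

```python
def format_range(start, stop):
    """Format a half-open line range the way the patched difflib does."""
    beginning = start + 1  # diff output numbers lines from one
    length = stop - start
    if length == 1:
        # A one-line range is written as the line number alone.
        return '{}'.format(beginning)
    if not length:
        # An empty range begins at the line just before the range.
        beginning -= 1
    return '{},{}'.format(beginning, length)


for start, stop in [(3, 3), (3, 4), (3, 5), (0, 0)]:
    print((start, stop), '->', format_range(start, stop))
```

The single-line and empty-range cases are the ones the previous formatting got wrong: a range covering one line drops the ",1" suffix, and an empty range at the start of a file is reported as "0,0".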
prefix[tag] + line - if last[4] - first[3] > 1: - yield '--- {},{} ----{}'.format(first[3]+1, last[4], lineterm) - else: - yield '--- {} ----{}'.format(last[4], lineterm) + file2_range = _format_range(first[3], last[4]) + yield '--- {} ----{}'.format(file2_range, lineterm) + if any(tag in {'replace', 'insert'} for tag, _, _, _, _ in group): for tag, _, _, j1, j2 in group: if tag != 'delete': diff --git a/Lib/test/test_difflib.py b/Lib/test/test_difflib.py --- a/Lib/test/test_difflib.py +++ b/Lib/test/test_difflib.py @@ -236,6 +236,22 @@ cd = difflib.context_diff(*args, lineterm='') self.assertEqual(list(cd)[0:2], ["*** Original", "--- Current"]) + def test_range_format(self): + # Per the diff spec at http://www.unix.org/single_unix_specification/ + spec = '''\ + Each field shall be of the form: + %1d", if the range contains exactly one line, + and: + "%1d,%1d", , otherwise. + If a range is empty, its beginning line number shall be the number of + the line just before the range, or 0 if the empty range starts the file. + ''' + fmt = difflib._format_range + self.assertEqual(fmt(3,3), '3,0') + self.assertEqual(fmt(3,4), '4') + self.assertEqual(fmt(3,5), '4,2') + self.assertEqual(fmt(3,6), '4,3') + self.assertEqual(fmt(0,0), '0,0') def test_main(): difflib.HtmlDiff._default_prefix = 0 diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -55,6 +55,9 @@ - Issue #11814: Fix likely typo in multiprocessing.Pool._terminate(). +- Issue #11747: Fix range formatting in difflib.context_diff() and + difflib.unified_diff(). + - Issue #8428: Fix a race condition in multiprocessing.Pool when terminating worker processes: new processes would be spawned while the pool is being shut down. Patch by Charles-François Natali.
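Note on the range format in the changeset above: the following is a minimal standalone sketch, not the committed code. It copies the logic of _format_range from the diff into a local function named format_range, and adds a small helper, unified_header, to show how the two ranges end up in an "@@" hunk line. Both names are assumptions made for illustration, so the snippet does not rely on any private difflib attribute. The checked values are the ones from test_range_format in the diff.

```python
def format_range(start, stop):
    """Convert a half-open [start, stop) range to the "ed" style used in diffs."""
    beginning = start + 1          # diff line numbers start at one
    length = stop - start
    if length == 1:
        # A single-line range is written as just its line number.
        return '{}'.format(beginning)
    if not length:
        # An empty range begins at the line just before the range.
        beginning -= 1
    return '{},{}'.format(beginning, length)


def unified_header(i1, i2, j1, j2):
    """Hypothetical helper: build a unified-diff hunk header from two ranges."""
    return '@@ -{} +{} @@'.format(format_range(i1, i2), format_range(j1, j2))


# Values taken from test_range_format in the changeset.
print(format_range(3, 3))   # empty range
print(format_range(3, 4))   # exactly one line
print(format_range(3, 5))   # two lines
print(format_range(0, 0))   # empty range at the start of the file
print(unified_header(0, 1, 0, 2))
```

The single-line case drops the ",1" suffix and the empty case steps back one line; those are the two situations the old inline formatting in unified_diff() and context_diff() did not handle.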
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Apr 11 22:11:12 2011 From: python-checkins at python.org (raymond.hettinger) Date: Mon, 11 Apr 2011 22:11:12 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #11747: Fix range formatting in context and unified diffs. Message-ID: http://hg.python.org/cpython/rev/1e5e3bb3e1f1 changeset: 69251:1e5e3bb3e1f1 parent: 69249:c9e9142d82d6 parent: 69250:a2ee967de44f user: Raymond Hettinger date: Mon Apr 11 12:42:59 2011 -0700 summary: Issue #11747: Fix range formatting in context and unified diffs. files: Lib/difflib.py | 34 ++++++++++++++++++--------- Lib/test/test_difflib.py | 16 +++++++++++++ Misc/NEWS | 3 ++ 3 files changed, 42 insertions(+), 11 deletions(-) diff --git a/Lib/difflib.py b/Lib/difflib.py --- a/Lib/difflib.py +++ b/Lib/difflib.py @@ -1144,6 +1144,17 @@ return ch in ws +def _format_range(start, stop): + 'Convert range to the "ed" format' + # Per the diff spec at http://www.unix.org/single_unix_specification/ + beginning = start + 1 # lines start numbering with one + length = stop - start + if length == 1: + return '{}'.format(beginning) + if not length: + beginning -= 1 # empty ranges begin at line just before the range + return '{},{}'.format(beginning, length) + def unified_diff(a, b, fromfile='', tofile='', fromfiledate='', tofiledate='', n=3, lineterm='\n'): r""" @@ -1193,9 +1204,12 @@ todate = '\t{}'.format(tofiledate) if tofiledate else '' yield '--- {}{}{}'.format(fromfile, fromdate, lineterm) yield '+++ {}{}{}'.format(tofile, todate, lineterm) + first, last = group[0], group[-1] - i1, i2, j1, j2 = first[1], last[2], first[3], last[4] - yield '@@ -{},{} +{},{} @@{}'.format(i1+1, i2-i1, j1+1, j2-j1, lineterm) + file1_range = _format_range(first[1], last[2]) + file2_range = _format_range(first[3], last[4]) + yield '@@ -{} +{} @@{}'.format(file1_range, file2_range, lineterm) + for tag, i1, i2, j1, j2 in group: if tag == 'equal': for line in 
a[i1:i2]: @@ -1264,22 +1278,20 @@ yield '--- {}{}{}'.format(tofile, todate, lineterm) first, last = group[0], group[-1] - yield '***************{}'.format(lineterm) + yield '***************' + lineterm - if last[2] - first[1] > 1: - yield '*** {},{} ****{}'.format(first[1]+1, last[2], lineterm) - else: - yield '*** {} ****{}'.format(last[2], lineterm) + file1_range = _format_range(first[1], last[2]) + yield '*** {} ****{}'.format(file1_range, lineterm) + if any(tag in {'replace', 'delete'} for tag, _, _, _, _ in group): for tag, i1, i2, _, _ in group: if tag != 'insert': for line in a[i1:i2]: yield prefix[tag] + line - if last[4] - first[3] > 1: - yield '--- {},{} ----{}'.format(first[3]+1, last[4], lineterm) - else: - yield '--- {} ----{}'.format(last[4], lineterm) + file2_range = _format_range(first[3], last[4]) + yield '--- {} ----{}'.format(file2_range, lineterm) + if any(tag in {'replace', 'insert'} for tag, _, _, _, _ in group): for tag, _, _, j1, j2 in group: if tag != 'delete': diff --git a/Lib/test/test_difflib.py b/Lib/test/test_difflib.py --- a/Lib/test/test_difflib.py +++ b/Lib/test/test_difflib.py @@ -236,6 +236,22 @@ cd = difflib.context_diff(*args, lineterm='') self.assertEqual(list(cd)[0:2], ["*** Original", "--- Current"]) + def test_range_format(self): + # Per the diff spec at http://www.unix.org/single_unix_specification/ + spec = '''\ + Each field shall be of the form: + %1d", if the range contains exactly one line, + and: + "%1d,%1d", , otherwise. + If a range is empty, its beginning line number shall be the number of + the line just before the range, or 0 if the empty range starts the file. 
+ ''' + fmt = difflib._format_range + self.assertEqual(fmt(3,3), '3,0') + self.assertEqual(fmt(3,4), '4') + self.assertEqual(fmt(3,5), '4,2') + self.assertEqual(fmt(3,6), '4,3') + self.assertEqual(fmt(0,0), '0,0') def test_main(): difflib.HtmlDiff._default_prefix = 0 diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -109,6 +109,9 @@ - Issue #11814: Fix likely typo in multiprocessing.Pool._terminate(). +- Issue #11747: Fix range formatting in difflib.context_diff() and + difflib.unified_diff(). + - Issue #8428: Fix a race condition in multiprocessing.Pool when terminating worker processes: new processes would be spawned while the pool is being shut down. Patch by Charles-François Natali. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 01:01:36 2011 From: python-checkins at python.org (brian.curtin) Date: Tue, 12 Apr 2011 01:01:36 +0200 Subject: [Python-checkins] cpython (3.1): Fix #5162. Allow child spawning from Windows services (via pywin32). Message-ID: http://hg.python.org/cpython/rev/1f41b1ab8924 changeset: 69252:1f41b1ab8924 branch: 3.1 parent: 69225:42d5001e5845 user: brian.curtin date: Mon Apr 11 17:56:23 2011 -0500 summary: Fix #5162. Allow child spawning from Windows services (via pywin32). files: Lib/multiprocessing/forking.py | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/Lib/multiprocessing/forking.py b/Lib/multiprocessing/forking.py --- a/Lib/multiprocessing/forking.py +++ b/Lib/multiprocessing/forking.py @@ -195,6 +195,7 @@ TERMINATE = 0x10000 WINEXE = (sys.platform == 'win32' and getattr(sys, 'frozen', False)) + WINSERVICE = sys.executable.lower().endswith("pythonservice.exe") exit = win32.ExitProcess close = win32.CloseHandle @@ -204,7 +205,7 @@ # People embedding Python want to modify it.
# - if sys.executable.lower().endswith('pythonservice.exe'): + if WINSERVICE: _python_exe = os.path.join(sys.exec_prefix, 'python.exe') else: _python_exe = sys.executable @@ -394,7 +395,7 @@ if _logger is not None: d['log_level'] = _logger.getEffectiveLevel() - if not WINEXE: + if not WINEXE and not WINSERVICE: main_path = getattr(sys.modules['__main__'], '__file__', None) if not main_path and sys.argv[0] not in ('', '-c'): main_path = sys.argv[0] -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 01:01:38 2011 From: python-checkins at python.org (brian.curtin) Date: Tue, 12 Apr 2011 01:01:38 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Fix #5162. Allow child spawning from Windows services (via pywin32). Message-ID: http://hg.python.org/cpython/rev/184ae02e3221 changeset: 69253:184ae02e3221 branch: 3.2 parent: 69250:a2ee967de44f parent: 69252:1f41b1ab8924 user: brian.curtin date: Mon Apr 11 17:57:59 2011 -0500 summary: Fix #5162. Allow child spawning from Windows services (via pywin32). files: Lib/multiprocessing/forking.py | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/Lib/multiprocessing/forking.py b/Lib/multiprocessing/forking.py --- a/Lib/multiprocessing/forking.py +++ b/Lib/multiprocessing/forking.py @@ -195,6 +195,7 @@ TERMINATE = 0x10000 WINEXE = (sys.platform == 'win32' and getattr(sys, 'frozen', False)) + WINSERVICE = sys.executable.lower().endswith("pythonservice.exe") exit = win32.ExitProcess close = win32.CloseHandle @@ -204,7 +205,7 @@ # People embedding Python want to modify it. 
# - if sys.executable.lower().endswith('pythonservice.exe'): + if WINSERVICE: _python_exe = os.path.join(sys.exec_prefix, 'python.exe') else: _python_exe = sys.executable @@ -394,7 +395,7 @@ if _logger is not None: d['log_level'] = _logger.getEffectiveLevel() - if not WINEXE: + if not WINEXE and not WINSERVICE: main_path = getattr(sys.modules['__main__'], '__file__', None) if not main_path and sys.argv[0] not in ('', '-c'): main_path = sys.argv[0] -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 01:01:40 2011 From: python-checkins at python.org (brian.curtin) Date: Tue, 12 Apr 2011 01:01:40 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Fix #5162. Allow child spawning from Windows services (via pywin32). Message-ID: http://hg.python.org/cpython/rev/3c2bdea18b5c changeset: 69254:3c2bdea18b5c parent: 69251:1e5e3bb3e1f1 parent: 69253:184ae02e3221 user: brian.curtin date: Mon Apr 11 17:59:01 2011 -0500 summary: Fix #5162. Allow child spawning from Windows services (via pywin32). files: Lib/multiprocessing/forking.py | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/Lib/multiprocessing/forking.py b/Lib/multiprocessing/forking.py --- a/Lib/multiprocessing/forking.py +++ b/Lib/multiprocessing/forking.py @@ -195,6 +195,7 @@ TERMINATE = 0x10000 WINEXE = (sys.platform == 'win32' and getattr(sys, 'frozen', False)) + WINSERVICE = sys.executable.lower().endswith("pythonservice.exe") exit = win32.ExitProcess close = win32.CloseHandle @@ -204,7 +205,7 @@ # People embedding Python want to modify it. 
# - if sys.executable.lower().endswith('pythonservice.exe'): + if WINSERVICE: _python_exe = os.path.join(sys.exec_prefix, 'python.exe') else: _python_exe = sys.executable @@ -394,7 +395,7 @@ if _logger is not None: d['log_level'] = _logger.getEffectiveLevel() - if not WINEXE: + if not WINEXE and not WINSERVICE: main_path = getattr(sys.modules['__main__'], '__file__', None) if not main_path and sys.argv[0] not in ('', '-c'): main_path = sys.argv[0] -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 01:01:42 2011 From: python-checkins at python.org (brian.curtin) Date: Tue, 12 Apr 2011 01:01:42 +0200 Subject: [Python-checkins] cpython (2.7): Fix #5162. Allow child spawning from Windows services (via pywin32). Message-ID: http://hg.python.org/cpython/rev/6507a5ba5c27 changeset: 69255:6507a5ba5c27 branch: 2.7 parent: 69244:ba699cf9bdbb user: brian.curtin date: Mon Apr 11 18:00:59 2011 -0500 summary: Fix #5162. Allow child spawning from Windows services (via pywin32). files: Lib/multiprocessing/forking.py | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/Lib/multiprocessing/forking.py b/Lib/multiprocessing/forking.py --- a/Lib/multiprocessing/forking.py +++ b/Lib/multiprocessing/forking.py @@ -198,6 +198,7 @@ TERMINATE = 0x10000 WINEXE = (sys.platform == 'win32' and getattr(sys, 'frozen', False)) + WINSERVICE = sys.executable.lower().endswith("pythonservice.exe") exit = win32.ExitProcess close = win32.CloseHandle @@ -207,7 +208,7 @@ # People embedding Python want to modify it. 
# - if sys.executable.lower().endswith('pythonservice.exe'): + if WINSERVICE: _python_exe = os.path.join(sys.exec_prefix, 'python.exe') else: _python_exe = sys.executable @@ -397,7 +398,7 @@ if _logger is not None: d['log_level'] = _logger.getEffectiveLevel() - if not WINEXE: + if not WINEXE and not WINSERVICE: main_path = getattr(sys.modules['__main__'], '__file__', None) if not main_path and sys.argv[0] not in ('', '-c'): main_path = sys.argv[0] -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 01:06:56 2011 From: python-checkins at python.org (brian.curtin) Date: Tue, 12 Apr 2011 01:06:56 +0200 Subject: [Python-checkins] cpython (2.7): Add NEWS item for #5162. Message-ID: http://hg.python.org/cpython/rev/a280672d3d8d changeset: 69256:a280672d3d8d branch: 2.7 user: brian.curtin date: Mon Apr 11 18:05:33 2011 -0500 summary: Add NEWS item for #5162. files: Misc/NEWS | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -51,6 +51,9 @@ Library ------- +- Issue #5162: Treat services like frozen executables to allow child spawning + from multiprocessing.forking on Windows. + - Issue #4877: Fix a segfault in xml.parsers.expat while attempting to parse a closed file. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 01:20:07 2011 From: python-checkins at python.org (brian.curtin) Date: Tue, 12 Apr 2011 01:20:07 +0200 Subject: [Python-checkins] cpython (3.1): Add NEWS item for #5162. Message-ID: http://hg.python.org/cpython/rev/c26474c6504a changeset: 69257:c26474c6504a branch: 3.1 parent: 69252:1f41b1ab8924 user: brian.curtin date: Mon Apr 11 18:09:24 2011 -0500 summary: Add NEWS item for #5162. 
files: Misc/NEWS | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -48,6 +48,9 @@ Library ------- + - Issue #5162: Treat services like frozen executables to allow child spawning + from multiprocessing.forking on Windows. + - Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. - Issue #11696: Fix ID generation in msilib. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 01:20:08 2011 From: python-checkins at python.org (brian.curtin) Date: Tue, 12 Apr 2011 01:20:08 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Add NEWS item for #5162. Message-ID: http://hg.python.org/cpython/rev/68ef2bf1aa99 changeset: 69258:68ef2bf1aa99 branch: 3.2 parent: 69253:184ae02e3221 parent: 69257:c26474c6504a user: brian.curtin date: Mon Apr 11 18:18:20 2011 -0500 summary: Add NEWS item for #5162. files: Misc/NEWS | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -53,6 +53,9 @@ Library ------- +- Issue #5162: Treat services like frozen executables to allow child spawning + from multiprocessing.forking on Windows. + - Issue #11814: Fix likely typo in multiprocessing.Pool._terminate(). - Issue #11747: Fix range formatting in difflib.context_diff() and -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 01:20:09 2011 From: python-checkins at python.org (brian.curtin) Date: Tue, 12 Apr 2011 01:20:09 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Add NEWS item for #5162. Message-ID: http://hg.python.org/cpython/rev/2c4043070f05 changeset: 69259:2c4043070f05 parent: 69254:3c2bdea18b5c parent: 69258:68ef2bf1aa99 user: brian.curtin date: Mon Apr 11 18:19:38 2011 -0500 summary: Add NEWS item for #5162. 
files: Misc/NEWS | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -103,6 +103,9 @@ Library ------- +- Issue #5162: Treat services like frozen executables to allow child spawning + from multiprocessing.forking on Windows. + - logging.basicConfig now supports an optional 'handlers' argument taking an iterable of handlers to be added to the root logger. Additional parameter checks were also added to basicConfig. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 01:37:27 2011 From: python-checkins at python.org (brian.curtin) Date: Tue, 12 Apr 2011 01:37:27 +0200 Subject: [Python-checkins] cpython (3.1): Correct leading spaces in my NEWS entry. Message-ID: http://hg.python.org/cpython/rev/33b54387cc2a changeset: 69260:33b54387cc2a branch: 3.1 parent: 69257:c26474c6504a user: Brian Curtin date: Mon Apr 11 18:35:18 2011 -0500 summary: Correct leading spaces in my NEWS entry. files: Misc/NEWS | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -48,8 +48,8 @@ Library ------- - - Issue #5162: Treat services like frozen executables to allow child spawning - from multiprocessing.forking on Windows. +- Issue #5162: Treat services like frozen executables to allow child spawning + from multiprocessing.forking on Windows. - Issue #10963: Ensure that subprocess.communicate() never raises EPIPE. 
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 01:37:27 2011 From: python-checkins at python.org (brian.curtin) Date: Tue, 12 Apr 2011 01:37:27 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Dummy merge Message-ID: http://hg.python.org/cpython/rev/b166c972cb5e changeset: 69261:b166c972cb5e branch: 3.2 parent: 69258:68ef2bf1aa99 parent: 69260:33b54387cc2a user: Brian Curtin date: Mon Apr 11 18:36:25 2011 -0500 summary: Dummy merge files: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 01:37:28 2011 From: python-checkins at python.org (brian.curtin) Date: Tue, 12 Apr 2011 01:37:28 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Dummy merge Message-ID: http://hg.python.org/cpython/rev/5062b5286eba changeset: 69262:5062b5286eba parent: 69259:2c4043070f05 parent: 69261:b166c972cb5e user: Brian Curtin date: Mon Apr 11 18:36:59 2011 -0500 summary: Dummy merge files: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 02:28:02 2011 From: python-checkins at python.org (raymond.hettinger) Date: Tue, 12 Apr 2011 02:28:02 +0200 Subject: [Python-checkins] cpython (2.7): Issue #11830: Remove unnecessary introspection code in the decimal module. Message-ID: http://hg.python.org/cpython/rev/b4b1f557d563 changeset: 69263:b4b1f557d563 branch: 2.7 parent: 69256:a280672d3d8d user: Raymond Hettinger date: Mon Apr 11 17:27:42 2011 -0700 summary: Issue #11830: Remove unnecessary introspection code in the decimal module. It was causing a failed import in the Turkish locale where the locale sensitive str.upper() method caused a name mismatch. 
files: Lib/decimal.py | 25 +++++++++++-------------- Misc/NEWS | 4 ++++ 2 files changed, 15 insertions(+), 14 deletions(-) diff --git a/Lib/decimal.py b/Lib/decimal.py --- a/Lib/decimal.py +++ b/Lib/decimal.py @@ -1723,8 +1723,6 @@ # here self was representable to begin with; return unchanged return Decimal(self) - _pick_rounding_function = {} - # for each of the rounding functions below: # self is a finite, nonzero Decimal # prec is an integer satisfying 0 <= prec < len(self._int) @@ -1791,6 +1789,17 @@ else: return -self._round_down(prec) + _pick_rounding_function = dict( + ROUND_DOWN = '_round_down', + ROUND_UP = '_round_up', + ROUND_HALF_UP = '_round_half_up', + ROUND_HALF_DOWN = '_round_half_down', + ROUND_HALF_EVEN = '_round_half_even', + ROUND_CEILING = '_round_ceiling', + ROUND_FLOOR = '_round_floor', + ROUND_05UP = '_round_05up', + ) + def fma(self, other, third, context=None): """Fused multiply-add. @@ -3708,18 +3717,6 @@ ##### Context class ####################################################### - -# get rounding method function: -rounding_functions = [name for name in Decimal.__dict__.keys() - if name.startswith('_round_')] -for name in rounding_functions: - # name is like _round_half_even, goes to the global ROUND_HALF_EVEN value. - globalname = name[1:].upper() - val = globals()[globalname] - Decimal._pick_rounding_function[val] = name - -del name, val, globalname, rounding_functions - class _ContextManager(object): """Context manager class to support localcontext(). diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -57,6 +57,10 @@ - Issue #4877: Fix a segfault in xml.parsers.expat while attempting to parse a closed file. +- Issue #11830: Remove unnecessary introspection code in the decimal module. + It was causing a failed import in the Turkish locale where the locale + sensitive str.upper() method caused a name mismatch. 
+ - Issue #8428: Fix a race condition in multiprocessing.Pool when terminating worker processes: new processes would be spawned while the pool is being shut down. Patch by Charles-François Natali. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 02:45:14 2011 From: python-checkins at python.org (raymond.hettinger) Date: Tue, 12 Apr 2011 02:45:14 +0200 Subject: [Python-checkins] cpython (2.7): Use floor division operator instead of deprecated division operator. Message-ID: http://hg.python.org/cpython/rev/7b71872fb66a changeset: 69264:7b71872fb66a branch: 2.7 user: Raymond Hettinger date: Mon Apr 11 17:45:01 2011 -0700 summary: Use floor division operator instead of deprecated division operator. files: Lib/trace.py | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Lib/trace.py b/Lib/trace.py --- a/Lib/trace.py +++ b/Lib/trace.py @@ -335,7 +335,7 @@ lnotab, count) if summary and n_lines: - percent = int(100 * n_hits / n_lines) + percent = 100 * n_hits // n_lines sums[modulename] = n_lines, percent, modulename, filename if summary and sums: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 02:57:50 2011 From: python-checkins at python.org (raymond.hettinger) Date: Tue, 12 Apr 2011 02:57:50 +0200 Subject: [Python-checkins] cpython: Fix markup. Message-ID: http://hg.python.org/cpython/rev/5e87dd117f74 changeset: 69265:5e87dd117f74 parent: 69262:5062b5286eba user: Raymond Hettinger date: Mon Apr 11 17:57:21 2011 -0700 summary: Fix markup. files: Doc/library/collections.rst | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Doc/library/collections.rst b/Doc/library/collections.rst --- a/Doc/library/collections.rst +++ b/Doc/library/collections.rst @@ -60,9 +60,9 @@ Lookups search the underlying mappings successively until a key is found. In contrast, writes, updates, and deletions only operate on the first mapping.
- A class:`ChainMap` incorporates the underlying mappings by reference. So, if + A :class:`ChainMap` incorporates the underlying mappings by reference. So, if one of the underlying mappings gets updated, those changes will be reflected - in class:`ChainMap`. + in :class:`ChainMap`. All of the usual dictionary methods are supported. In addition, there is a *maps* attribute, a method for creating new subcontexts, and a property for -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Tue Apr 12 05:00:07 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Tue, 12 Apr 2011 05:00:07 +0200 Subject: [Python-checkins] Daily reference leaks (5e87dd117f74): sum=-56 Message-ID: results for 5e87dd117f74 on branch "default" -------------------------------------------- test_pyexpat leaked [0, 0, -56] references, sum=-56 Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/refloghaBAAf', '-x'] From nnorwitz at gmail.com Tue Apr 12 12:49:28 2011 From: nnorwitz at gmail.com (Neal Norwitz) Date: Tue, 12 Apr 2011 06:49:28 -0400 Subject: [Python-checkins] Python Regression Test Failures refleak (1) Message-ID: <20110412104928.GA11036@kbk-i386-bb.dyndns.org> More important issues: ---------------------- test_bz2 leaked [-84, 0, 0] references, sum=-84 Less important issues: ---------------------- From python-checkins at python.org Tue Apr 12 15:07:27 2011 From: python-checkins at python.org (ezio.melotti) Date: Tue, 12 Apr 2011 15:07:27 +0200 Subject: [Python-checkins] cpython (2.7): #9233: skip _json-specific tests when _json is not available. Message-ID: http://hg.python.org/cpython/rev/500063f6ae5a changeset: 69266:500063f6ae5a branch: 2.7 parent: 69264:7b71872fb66a user: Ezio Melotti date: Tue Apr 12 15:59:50 2011 +0300 summary: #9233: skip _json-specific tests when _json is not available. 
files: Lib/json/tests/test_scanstring.py | 8 +++++++- Lib/json/tests/test_speedups.py | 8 +++++++- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/Lib/json/tests/test_scanstring.py b/Lib/json/tests/test_scanstring.py --- a/Lib/json/tests/test_scanstring.py +++ b/Lib/json/tests/test_scanstring.py @@ -1,14 +1,20 @@ import sys import decimal -from unittest import TestCase +from unittest import TestCase, skipUnless import json import json.decoder +try: + import _json +except ImportError: + _json = None + class TestScanString(TestCase): def test_py_scanstring(self): self._test_scanstring(json.decoder.py_scanstring) + @skipUnless(_json, 'test requires the _json module') def test_c_scanstring(self): self._test_scanstring(json.decoder.c_scanstring) diff --git a/Lib/json/tests/test_speedups.py b/Lib/json/tests/test_speedups.py --- a/Lib/json/tests/test_speedups.py +++ b/Lib/json/tests/test_speedups.py @@ -1,8 +1,14 @@ import decimal -from unittest import TestCase +from unittest import TestCase, skipUnless from json import decoder, encoder, scanner +try: + import _json +except ImportError: + _json = None + +@skipUnless(_json, 'test requires the _json module') class TestSpeedups(TestCase): def test_scanstring(self): self.assertEqual(decoder.scanstring.__module__, "_json") -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 15:07:29 2011 From: python-checkins at python.org (ezio.melotti) Date: Tue, 12 Apr 2011 15:07:29 +0200 Subject: [Python-checkins] cpython (2.7): Remove unnecessary imports and use assertIs instead of assertTrue. Message-ID: http://hg.python.org/cpython/rev/61f25c9e7fa1 changeset: 69267:61f25c9e7fa1 branch: 2.7 user: Ezio Melotti date: Tue Apr 12 16:06:43 2011 +0300 summary: Remove unnecessary imports and use assertIs instead of assertTrue.
files: Lib/json/tests/test_scanstring.py | 1 - Lib/json/tests/test_speedups.py | 7 +++---- 2 files changed, 3 insertions(+), 5 deletions(-) diff --git a/Lib/json/tests/test_scanstring.py b/Lib/json/tests/test_scanstring.py --- a/Lib/json/tests/test_scanstring.py +++ b/Lib/json/tests/test_scanstring.py @@ -1,5 +1,4 @@ import sys -import decimal from unittest import TestCase, skipUnless import json diff --git a/Lib/json/tests/test_speedups.py b/Lib/json/tests/test_speedups.py --- a/Lib/json/tests/test_speedups.py +++ b/Lib/json/tests/test_speedups.py @@ -1,4 +1,3 @@ -import decimal from unittest import TestCase, skipUnless from json import decoder, encoder, scanner @@ -12,12 +11,12 @@ class TestSpeedups(TestCase): def test_scanstring(self): self.assertEqual(decoder.scanstring.__module__, "_json") - self.assertTrue(decoder.scanstring is decoder.c_scanstring) + self.assertIs(decoder.scanstring, decoder.c_scanstring) def test_encode_basestring_ascii(self): self.assertEqual(encoder.encode_basestring_ascii.__module__, "_json") - self.assertTrue(encoder.encode_basestring_ascii is - encoder.c_encode_basestring_ascii) + self.assertIs(encoder.encode_basestring_ascii, + encoder.c_encode_basestring_ascii) class TestDecode(TestCase): def test_make_scanner(self): -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 17:50:27 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 12 Apr 2011 17:50:27 +0200 Subject: [Python-checkins] cpython (3.2): Issue #11815: Remove dead code in concurrent.futures (since a blocking Queue Message-ID: http://hg.python.org/cpython/rev/bfc586c558ed changeset: 69268:bfc586c558ed branch: 3.2 parent: 69261:b166c972cb5e user: Antoine Pitrou date: Tue Apr 12 17:48:46 2011 +0200 summary: Issue #11815: Remove dead code in concurrent.futures (since a blocking Queue cannot raise queue.Empty). 
files: Lib/concurrent/futures/process.py | 67 ++++++------------ Lib/concurrent/futures/thread.py | 12 +-- 2 files changed, 28 insertions(+), 51 deletions(-) diff --git a/Lib/concurrent/futures/process.py b/Lib/concurrent/futures/process.py --- a/Lib/concurrent/futures/process.py +++ b/Lib/concurrent/futures/process.py @@ -104,7 +104,7 @@ self.args = args self.kwargs = kwargs -def _process_worker(call_queue, result_queue, shutdown): +def _process_worker(call_queue, result_queue): """Evaluates calls from call_queue and places the results in result_queue. This worker is run in a separate process. @@ -118,24 +118,19 @@ worker that it should exit when call_queue is empty. """ while True: + call_item = call_queue.get(block=True) + if call_item is None: + # Wake up queue management thread + result_queue.put(None) + return try: - call_item = call_queue.get(block=True) - except queue.Empty: - if shutdown.is_set(): - return + r = call_item.fn(*call_item.args, **call_item.kwargs) + except BaseException as e: + result_queue.put(_ResultItem(call_item.work_id, + exception=e)) else: - if call_item is None: - # Wake up queue management thread - result_queue.put(None) - return - try: - r = call_item.fn(*call_item.args, **call_item.kwargs) - except BaseException as e: - result_queue.put(_ResultItem(call_item.work_id, - exception=e)) - else: - result_queue.put(_ResultItem(call_item.work_id, - result=r)) + result_queue.put(_ResultItem(call_item.work_id, + result=r)) def _add_call_item_to_queue(pending_work_items, work_ids, @@ -179,8 +174,7 @@ pending_work_items, work_ids_queue, call_queue, - result_queue, - shutdown_process_event): + result_queue): """Manages the communication between this process and the worker processes. This function is run in a local thread. @@ -198,9 +192,6 @@ derived from _WorkItems for processing by the process workers. result_queue: A multiprocessing.Queue of _ResultItems generated by the process workers. 
- shutdown_process_event: A multiprocessing.Event used to signal the - process workers that they should exit when their work queue is - empty. """ nb_shutdown_processes = 0 def shutdown_one_process(): @@ -213,20 +204,16 @@ work_ids_queue, call_queue) - try: - result_item = result_queue.get(block=True) - except queue.Empty: - pass - else: - if result_item is not None: - work_item = pending_work_items[result_item.work_id] - del pending_work_items[result_item.work_id] + result_item = result_queue.get(block=True) + if result_item is not None: + work_item = pending_work_items[result_item.work_id] + del pending_work_items[result_item.work_id] - if result_item.exception: - work_item.future.set_exception(result_item.exception) - else: - work_item.future.set_result(result_item.result) - continue + if result_item.exception: + work_item.future.set_exception(result_item.exception) + else: + work_item.future.set_result(result_item.result) + continue # If we come here, we either got a timeout or were explicitly woken up. # In either case, check whether we should start shutting down. executor = executor_reference() @@ -238,8 +225,6 @@ # Since no new work items can be added, it is safe to shutdown # this thread if there are no pending work items. if not pending_work_items: - shutdown_process_event.set() - while nb_shutdown_processes < len(processes): shutdown_one_process() # If .join() is not called on the created processes then @@ -306,7 +291,6 @@ # Shutdown is a two-step process. 
self._shutdown_thread = False - self._shutdown_process_event = multiprocessing.Event() self._shutdown_lock = threading.Lock() self._queue_count = 0 self._pending_work_items = {} @@ -324,8 +308,7 @@ self._pending_work_items, self._work_ids, self._call_queue, - self._result_queue, - self._shutdown_process_event)) + self._result_queue)) self._queue_management_thread.daemon = True self._queue_management_thread.start() _threads_queues[self._queue_management_thread] = self._result_queue @@ -335,8 +318,7 @@ p = multiprocessing.Process( target=_process_worker, args=(self._call_queue, - self._result_queue, - self._shutdown_process_event)) + self._result_queue)) p.start() self._processes.add(p) @@ -372,7 +354,6 @@ self._queue_management_thread = None self._call_queue = None self._result_queue = None - self._shutdown_process_event = None self._processes = None shutdown.__doc__ = _base.Executor.shutdown.__doc__ diff --git a/Lib/concurrent/futures/thread.py b/Lib/concurrent/futures/thread.py --- a/Lib/concurrent/futures/thread.py +++ b/Lib/concurrent/futures/thread.py @@ -60,14 +60,10 @@ def _worker(executor_reference, work_queue): try: while True: - try: - work_item = work_queue.get(block=True) - except queue.Empty: - pass - else: - if work_item is not None: - work_item.run() - continue + work_item = work_queue.get(block=True) + if work_item is not None: + work_item.run() + continue executor = executor_reference() # Exit if: # - The interpreter is shutting down OR -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 17:50:31 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 12 Apr 2011 17:50:31 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Issue #11815: Remove dead code in concurrent.futures (since a blocking Queue Message-ID: http://hg.python.org/cpython/rev/eb751e3cb753 changeset: 69269:eb751e3cb753 parent: 69265:5e87dd117f74 parent: 69268:bfc586c558ed user: Antoine Pitrou date: Tue Apr 12 
17:50:20 2011 +0200 summary: Issue #11815: Remove dead code in concurrent.futures (since a blocking Queue cannot raise queue.Empty). files: Lib/concurrent/futures/process.py | 67 ++++++------------ Lib/concurrent/futures/thread.py | 12 +-- 2 files changed, 28 insertions(+), 51 deletions(-) diff --git a/Lib/concurrent/futures/process.py b/Lib/concurrent/futures/process.py --- a/Lib/concurrent/futures/process.py +++ b/Lib/concurrent/futures/process.py @@ -104,7 +104,7 @@ self.args = args self.kwargs = kwargs -def _process_worker(call_queue, result_queue, shutdown): +def _process_worker(call_queue, result_queue): """Evaluates calls from call_queue and places the results in result_queue. This worker is run in a separate process. @@ -118,24 +118,19 @@ worker that it should exit when call_queue is empty. """ while True: + call_item = call_queue.get(block=True) + if call_item is None: + # Wake up queue management thread + result_queue.put(None) + return try: - call_item = call_queue.get(block=True) - except queue.Empty: - if shutdown.is_set(): - return + r = call_item.fn(*call_item.args, **call_item.kwargs) + except BaseException as e: + result_queue.put(_ResultItem(call_item.work_id, + exception=e)) else: - if call_item is None: - # Wake up queue management thread - result_queue.put(None) - return - try: - r = call_item.fn(*call_item.args, **call_item.kwargs) - except BaseException as e: - result_queue.put(_ResultItem(call_item.work_id, - exception=e)) - else: - result_queue.put(_ResultItem(call_item.work_id, - result=r)) + result_queue.put(_ResultItem(call_item.work_id, + result=r)) def _add_call_item_to_queue(pending_work_items, work_ids, @@ -179,8 +174,7 @@ pending_work_items, work_ids_queue, call_queue, - result_queue, - shutdown_process_event): + result_queue): """Manages the communication between this process and the worker processes. This function is run in a local thread. @@ -198,9 +192,6 @@ derived from _WorkItems for processing by the process workers. 
result_queue: A multiprocessing.Queue of _ResultItems generated by the process workers. - shutdown_process_event: A multiprocessing.Event used to signal the - process workers that they should exit when their work queue is - empty. """ nb_shutdown_processes = 0 def shutdown_one_process(): @@ -213,20 +204,16 @@ work_ids_queue, call_queue) - try: - result_item = result_queue.get(block=True) - except queue.Empty: - pass - else: - if result_item is not None: - work_item = pending_work_items[result_item.work_id] - del pending_work_items[result_item.work_id] + result_item = result_queue.get(block=True) + if result_item is not None: + work_item = pending_work_items[result_item.work_id] + del pending_work_items[result_item.work_id] - if result_item.exception: - work_item.future.set_exception(result_item.exception) - else: - work_item.future.set_result(result_item.result) - continue + if result_item.exception: + work_item.future.set_exception(result_item.exception) + else: + work_item.future.set_result(result_item.result) + continue # If we come here, we either got a timeout or were explicitly woken up. # In either case, check whether we should start shutting down. executor = executor_reference() @@ -238,8 +225,6 @@ # Since no new work items can be added, it is safe to shutdown # this thread if there are no pending work items. if not pending_work_items: - shutdown_process_event.set() - while nb_shutdown_processes < len(processes): shutdown_one_process() # If .join() is not called on the created processes then @@ -306,7 +291,6 @@ # Shutdown is a two-step process. 
self._shutdown_thread = False - self._shutdown_process_event = multiprocessing.Event() self._shutdown_lock = threading.Lock() self._queue_count = 0 self._pending_work_items = {} @@ -324,8 +308,7 @@ self._pending_work_items, self._work_ids, self._call_queue, - self._result_queue, - self._shutdown_process_event)) + self._result_queue)) self._queue_management_thread.daemon = True self._queue_management_thread.start() _threads_queues[self._queue_management_thread] = self._result_queue @@ -335,8 +318,7 @@ p = multiprocessing.Process( target=_process_worker, args=(self._call_queue, - self._result_queue, - self._shutdown_process_event)) + self._result_queue)) p.start() self._processes.add(p) @@ -372,7 +354,6 @@ self._queue_management_thread = None self._call_queue = None self._result_queue = None - self._shutdown_process_event = None self._processes = None shutdown.__doc__ = _base.Executor.shutdown.__doc__ diff --git a/Lib/concurrent/futures/thread.py b/Lib/concurrent/futures/thread.py --- a/Lib/concurrent/futures/thread.py +++ b/Lib/concurrent/futures/thread.py @@ -60,14 +60,10 @@ def _worker(executor_reference, work_queue): try: while True: - try: - work_item = work_queue.get(block=True) - except queue.Empty: - pass - else: - if work_item is not None: - work_item.run() - continue + work_item = work_queue.get(block=True) + if work_item is not None: + work_item.run() + continue executor = executor_reference() # Exit if: # - The interpreter is shutting down OR -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 17:58:16 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 12 Apr 2011 17:58:16 +0200 Subject: [Python-checkins] cpython: Issue #11815: Use a light-weight SimpleQueue for the result queue in Message-ID: http://hg.python.org/cpython/rev/c26d015cbde8 changeset: 69270:c26d015cbde8 user: Antoine Pitrou date: Tue Apr 12 17:58:11 2011 +0200 summary: Issue #11815: Use a light-weight SimpleQueue for the result 
queue in concurrent.futures.ProcessPoolExecutor.

files:
  Lib/concurrent/futures/process.py |  5 +++--
  Misc/NEWS                         |  3 +++
  2 files changed, 6 insertions(+), 2 deletions(-)


diff --git a/Lib/concurrent/futures/process.py b/Lib/concurrent/futures/process.py
--- a/Lib/concurrent/futures/process.py
+++ b/Lib/concurrent/futures/process.py
@@ -49,6 +49,7 @@
 from concurrent.futures import _base
 import queue
 import multiprocessing
+from multiprocessing.queues import SimpleQueue
 import threading
 import weakref
 
@@ -204,7 +205,7 @@
                                 work_ids_queue,
                                 call_queue)
 
-        result_item = result_queue.get(block=True)
+        result_item = result_queue.get()
         if result_item is not None:
             work_item = pending_work_items[result_item.work_id]
             del pending_work_items[result_item.work_id]
@@ -284,7 +285,7 @@
         # because futures in the call queue cannot be cancelled.
         self._call_queue = multiprocessing.Queue(self._max_workers +
                                                  EXTRA_QUEUED_CALLS)
-        self._result_queue = multiprocessing.Queue()
+        self._result_queue = SimpleQueue()
         self._work_ids = queue.Queue()
         self._queue_management_thread = None
         self._processes = set()
diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -103,6 +103,9 @@
 Library
 -------
 
+- Issue #11815: Use a light-weight SimpleQueue for the result queue in
+  concurrent.futures.ProcessPoolExecutor.
+
 - Issue #5162: Treat services like frozen executables to allow child spawning
   from multiprocessing.forking on Windows.
 
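Editorial note: the Issue #11815 changesets rest on the observation that a blocking get() with no timeout cannot raise queue.Empty, so the try/except around it was dead code and a None sentinel is enough to wake a consumer. The following is a minimal sketch of that loop shape, not the code from concurrent.futures. It assumes a plain queue.Queue and a thread rather than multiprocessing, and the names worker, work_queue and results are hypothetical.

```python
import queue
import threading


def worker(work_queue, results):
    """Consume items until a None sentinel arrives (illustrative only)."""
    while True:
        # With block=True and no timeout, get() waits until an item is
        # available, so queue.Empty cannot be raised and needs no handler.
        item = work_queue.get(block=True)
        if item is None:
            # Sentinel: the producer asked this worker to exit.
            return
        results.append(item * 2)


work_queue = queue.Queue()
results = []
t = threading.Thread(target=worker, args=(work_queue, results))
t.start()

for n in (1, 2, 3):
    work_queue.put(n)
work_queue.put(None)  # wake the worker one last time and let it return
t.join()

print(results)
```

A single consumer on a FIFO queue sees the items in insertion order, so the sentinel is always processed last.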
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 18:06:10 2011 From: python-checkins at python.org (raymond.hettinger) Date: Tue, 12 Apr 2011 18:06:10 +0200 Subject: [Python-checkins] cpython (2.7): Neaten-up the fix to issue 11830 Message-ID: http://hg.python.org/cpython/rev/f4adc2926bf5 changeset: 69271:f4adc2926bf5 branch: 2.7 parent: 69267:61f25c9e7fa1 user: Raymond Hettinger date: Tue Apr 12 09:06:01 2011 -0700 summary: Neaten-up the fix to issue 11830 files: Lib/decimal.py | 22 +++++++++++----------- 1 files changed, 11 insertions(+), 11 deletions(-) diff --git a/Lib/decimal.py b/Lib/decimal.py --- a/Lib/decimal.py +++ b/Lib/decimal.py @@ -1683,7 +1683,7 @@ self = _dec_from_triple(self._sign, '1', exp_min-1) digits = 0 rounding_method = self._pick_rounding_function[context.rounding] - changed = getattr(self, rounding_method)(digits) + changed = rounding_method(self, digits) coeff = self._int[:digits] or '0' if changed > 0: coeff = str(int(coeff)+1) @@ -1790,14 +1790,14 @@ return -self._round_down(prec) _pick_rounding_function = dict( - ROUND_DOWN = '_round_down', - ROUND_UP = '_round_up', - ROUND_HALF_UP = '_round_half_up', - ROUND_HALF_DOWN = '_round_half_down', - ROUND_HALF_EVEN = '_round_half_even', - ROUND_CEILING = '_round_ceiling', - ROUND_FLOOR = '_round_floor', - ROUND_05UP = '_round_05up', + ROUND_DOWN = _round_down, + ROUND_UP = _round_up, + ROUND_HALF_UP = _round_half_up, + ROUND_HALF_DOWN = _round_half_down, + ROUND_HALF_EVEN = _round_half_even, + ROUND_CEILING = _round_ceiling, + ROUND_FLOOR = _round_floor, + ROUND_05UP = _round_05up, ) def fma(self, other, third, context=None): @@ -2504,8 +2504,8 @@ if digits < 0: self = _dec_from_triple(self._sign, '1', exp-1) digits = 0 - this_function = getattr(self, self._pick_rounding_function[rounding]) - changed = this_function(digits) + this_function = self._pick_rounding_function[rounding] + changed = this_function(self, digits) coeff = self._int[:digits] or 
'0' if changed == 1: coeff = str(int(coeff)+1) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 21:03:14 2011 From: python-checkins at python.org (r.david.murray) Date: Tue, 12 Apr 2011 21:03:14 +0200 Subject: [Python-checkins] cpython (3.1): Add maxlinelen to docstring, delete obsolete wording Message-ID: http://hg.python.org/cpython/rev/f9bd0add9732 changeset: 69272:f9bd0add9732 branch: 3.1 parent: 69260:33b54387cc2a user: R David Murray date: Tue Apr 12 15:00:44 2011 -0400 summary: Add maxlinelen to docstring, delete obsolete wording files: Lib/email/header.py | 12 ++++++------ 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/Lib/email/header.py b/Lib/email/header.py --- a/Lib/email/header.py +++ b/Lib/email/header.py @@ -281,12 +281,12 @@ 75-character length limit on any given encoded header field, so line-wrapping must be performed, even with double-byte character sets. - This method will do its best to convert the string to the correct - character set used in email, and encode and line wrap it safely with - the appropriate scheme for that character set. - - If the given charset is not known or an error occurs during - conversion, this function will return the header untouched. + Optional maxlinelen specifies the maxiumum length of each generated + line, exclusive of the linesep string. Individual lines may be longer + than maxlinelen if a folding point cannot be found. The first line + will be shorter by the length of the header name plus ": " if a header + name was specified at Header construction time. The default value for + maxlinelen is determined at header construction time. 
Optional splitchars is a string containing characters to split long ASCII lines on, in rough support of RFC 2822's `highest level -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 21:03:17 2011 From: python-checkins at python.org (r.david.murray) Date: Tue, 12 Apr 2011 21:03:17 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Merge: Add maxlinelen to docstring, delete obsolete wording Message-ID: http://hg.python.org/cpython/rev/4254085bae84 changeset: 69273:4254085bae84 branch: 3.2 parent: 69268:bfc586c558ed parent: 69272:f9bd0add9732 user: R David Murray date: Tue Apr 12 15:01:28 2011 -0400 summary: Merge: Add maxlinelen to docstring, delete obsolete wording files: Lib/email/header.py | 12 ++++++------ 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/Lib/email/header.py b/Lib/email/header.py --- a/Lib/email/header.py +++ b/Lib/email/header.py @@ -292,12 +292,12 @@ 75-character length limit on any given encoded header field, so line-wrapping must be performed, even with double-byte character sets. - This method will do its best to convert the string to the correct - character set used in email, and encode and line wrap it safely with - the appropriate scheme for that character set. - - If the given charset is not known or an error occurs during - conversion, this function will return the header untouched. + Optional maxlinelen specifies the maxiumum length of each generated + line, exclusive of the linesep string. Individual lines may be longer + than maxlinelen if a folding point cannot be found. The first line + will be shorter by the length of the header name plus ": " if a header + name was specified at Header construction time. The default value for + maxlinelen is determined at header construction time. 
Optional splitchars is a string containing characters to split long ASCII lines on, in rough support of RFC 2822's `highest level -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 21:03:18 2011 From: python-checkins at python.org (r.david.murray) Date: Tue, 12 Apr 2011 21:03:18 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge: Add maxlinelen to docstring, delete obsolete wording Message-ID: http://hg.python.org/cpython/rev/66069ee12007 changeset: 69274:66069ee12007 parent: 69270:c26d015cbde8 parent: 69273:4254085bae84 user: R David Murray date: Tue Apr 12 15:02:07 2011 -0400 summary: Merge: Add maxlinelen to docstring, delete obsolete wording files: Lib/email/header.py | 12 ++++++------ 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/Lib/email/header.py b/Lib/email/header.py --- a/Lib/email/header.py +++ b/Lib/email/header.py @@ -292,12 +292,12 @@ 75-character length limit on any given encoded header field, so line-wrapping must be performed, even with double-byte character sets. - This method will do its best to convert the string to the correct - character set used in email, and encode and line wrap it safely with - the appropriate scheme for that character set. - - If the given charset is not known or an error occurs during - conversion, this function will return the header untouched. + Optional maxlinelen specifies the maxiumum length of each generated + line, exclusive of the linesep string. Individual lines may be longer + than maxlinelen if a folding point cannot be found. The first line + will be shorter by the length of the header name plus ": " if a header + name was specified at Header construction time. The default value for + maxlinelen is determined at header construction time. 
Optional splitchars is a string containing characters to split long ASCII lines on, in rough support of RFC 2822's `highest level -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 23:05:40 2011 From: python-checkins at python.org (nadeem.vawda) Date: Tue, 12 Apr 2011 23:05:40 +0200 Subject: [Python-checkins] cpython: Fix 64-bit safety issue in BZ2Compressor and BZ2Decompressor. Message-ID: http://hg.python.org/cpython/rev/0010cc5f22d4 changeset: 69275:0010cc5f22d4 user: Nadeem Vawda date: Tue Apr 12 23:02:42 2011 +0200 summary: Fix 64-bit safety issue in BZ2Compressor and BZ2Decompressor. files: Lib/test/test_bz2.py | 36 +++++++++++++++++++++++++++++++- Modules/_bz2module.c | 33 +++++++++++++++++++++-------- 2 files changed, 59 insertions(+), 10 deletions(-) diff --git a/Lib/test/test_bz2.py b/Lib/test/test_bz2.py --- a/Lib/test/test_bz2.py +++ b/Lib/test/test_bz2.py @@ -1,10 +1,11 @@ #!/usr/bin/env python3 from test import support -from test.support import TESTFN +from test.support import TESTFN, precisionbigmemtest, _4G import unittest from io import BytesIO import os +import random import subprocess import sys @@ -415,6 +416,23 @@ data += bz2c.flush() self.assertEqual(self.decompress(data), self.TEXT) + @precisionbigmemtest(size=_4G + 100, memuse=2) + def testCompress4G(self, size): + # "Test BZ2Compressor.compress()/flush() with >4GiB input" + bz2c = BZ2Compressor() + data = b"x" * size + try: + compressed = bz2c.compress(data) + compressed += bz2c.flush() + finally: + data = None # Release memory + data = bz2.decompress(compressed) + try: + self.assertEqual(len(data), size) + self.assertEqual(len(data.strip(b"x")), 0) + finally: + data = None + class BZ2DecompressorTest(BaseTest): def test_Constructor(self): self.assertRaises(TypeError, BZ2Decompressor, 42) @@ -453,6 +471,22 @@ text = bz2d.decompress(self.DATA) self.assertRaises(EOFError, bz2d.decompress, b"anything") + @precisionbigmemtest(size=_4G + 100, memuse=3) 
+ def testDecompress4G(self, size): + # "Test BZ2Decompressor.decompress() with >4GiB input" + blocksize = 10 * 1024 * 1024 + block = random.getrandbits(blocksize * 8).to_bytes(blocksize, 'little') + try: + data = block * (size // blocksize + 1) + compressed = bz2.compress(data) + bz2d = BZ2Decompressor() + decompressed = bz2d.decompress(compressed) + self.assertTrue(decompressed == data) + finally: + data = None + compressed = None + decompressed = None + class FuncTest(BaseTest): "Test module functions" diff --git a/Modules/_bz2module.c b/Modules/_bz2module.c --- a/Modules/_bz2module.c +++ b/Modules/_bz2module.c @@ -36,6 +36,8 @@ #define RELEASE_LOCK(obj) #endif +#define MIN(X, Y) (((X) < (Y)) ? (X) : (Y)) + typedef struct { PyObject_HEAD @@ -145,8 +147,10 @@ if (result == NULL) return NULL; c->bzs.next_in = data; - /* FIXME This is not 64-bit clean - avail_in is an int. */ - c->bzs.avail_in = len; + /* On a 64-bit system, len might not fit in avail_in (an unsigned int). + Do compression in chunks of no more than UINT_MAX bytes each. */ + c->bzs.avail_in = MIN(len, UINT_MAX); + len -= c->bzs.avail_in; c->bzs.next_out = PyBytes_AS_STRING(result); c->bzs.avail_out = PyBytes_GET_SIZE(result); for (;;) { @@ -161,6 +165,11 @@ if (catch_bz2_error(bzerror)) goto error; + if (c->bzs.avail_in == 0 && len > 0) { + c->bzs.avail_in = MIN(len, UINT_MAX); + len -= c->bzs.avail_in; + } + /* In regular compression mode, stop when input data is exhausted. In flushing mode, stop when all buffered data has been flushed. */ if ((action == BZ_RUN && c->bzs.avail_in == 0) || @@ -354,8 +363,10 @@ if (result == NULL) return result; d->bzs.next_in = data; - /* FIXME This is not 64-bit clean - avail_in is an int. */ - d->bzs.avail_in = len; + /* On a 64-bit system, len might not fit in avail_in (an unsigned int). + Do decompression in chunks of no more than UINT_MAX bytes each. 
*/ + d->bzs.avail_in = MIN(len, UINT_MAX); + len -= d->bzs.avail_in; d->bzs.next_out = PyBytes_AS_STRING(result); d->bzs.avail_out = PyBytes_GET_SIZE(result); for (;;) { @@ -371,17 +382,21 @@ goto error; if (bzerror == BZ_STREAM_END) { d->eof = 1; - if (d->bzs.avail_in > 0) { /* Save leftover input to unused_data */ + len += d->bzs.avail_in; + if (len > 0) { /* Save leftover input to unused_data */ Py_CLEAR(d->unused_data); - d->unused_data = PyBytes_FromStringAndSize(d->bzs.next_in, - d->bzs.avail_in); + d->unused_data = PyBytes_FromStringAndSize(d->bzs.next_in, len); if (d->unused_data == NULL) goto error; } break; } - if (d->bzs.avail_in == 0) - break; + if (d->bzs.avail_in == 0) { + if (len == 0) + break; + d->bzs.avail_in = MIN(len, UINT_MAX); + len -= d->bzs.avail_in; + } if (d->bzs.avail_out == 0) { if (grow_buffer(&result) < 0) goto error; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Apr 12 23:44:44 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 12 Apr 2011 23:44:44 +0200 Subject: [Python-checkins] cpython: Issue #11186: pydoc ignores a module if its name contains a surrogate character Message-ID: http://hg.python.org/cpython/rev/506cab8fc329 changeset: 69276:506cab8fc329 user: Victor Stinner date: Tue Apr 12 23:41:50 2011 +0200 summary: Issue #11186: pydoc ignores a module if its name contains a surrogate character in the index of modules. 
files:
  Lib/pydoc.py |  3 +++
  Misc/NEWS    |  3 +++
  2 files changed, 6 insertions(+), 0 deletions(-)


diff --git a/Lib/pydoc.py b/Lib/pydoc.py
--- a/Lib/pydoc.py
+++ b/Lib/pydoc.py
@@ -952,6 +952,9 @@
         modpkgs = []
         if shadowed is None: shadowed = {}
         for importer, name, ispkg in pkgutil.iter_modules([dir]):
+            if any((0xD800 <= ord(ch) <= 0xDFFF) for ch in name):
+                # ignore a module if its name contains a surrogate character
+                continue
             modpkgs.append((name, '', ispkg, name in shadowed))
             shadowed[name] = 1
 
diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -103,6 +103,9 @@
 Library
 -------
 
+- Issue #11186: pydoc ignores a module if its name contains a surrogate
+  character in the index of modules.
+
 - Issue #11815: Use a light-weight SimpleQueue for the result queue in
   concurrent.futures.ProcessPoolExecutor.
 

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Tue Apr 12 23:59:53 2011
From: python-checkins at python.org (brett.cannon)
Date: Tue, 12 Apr 2011 23:59:53 +0200
Subject: [Python-checkins] peps: Drop the propcheck target as that was a svn-specific thing.
Message-ID:

http://hg.python.org/peps/rev/7415e03f11cf
changeset:   3864:7415e03f11cf
user:        Brett Cannon
date:        Tue Apr 12 14:30:10 2011 -0700
summary:
  Drop the propcheck target as that was a svn-specific thing.

files:
  Makefile |  2 --
  1 files changed, 0 insertions(+), 2 deletions(-)


diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -31,5 +31,3 @@
 update:
 	hg pull --update http://hg.python.org/peps
 
-propcheck:
-	$(PYTHON) propcheck.py

--
Repository URL: http://hg.python.org/peps

From python-checkins at python.org Tue Apr 12 23:59:53 2011
From: python-checkins at python.org (brett.cannon)
Date: Tue, 12 Apr 2011 23:59:53 +0200
Subject: [Python-checkins] peps: Update PEP 399 to include comments from python-dev.
Message-ID: http://hg.python.org/peps/rev/24d68b6329e9 changeset: 3865:24d68b6329e9 user: Brett Cannon date: Tue Apr 12 14:59:45 2011 -0700 summary: Update PEP 399 to include comments from python-dev. files: pep-0399.txt | 94 +++++++++++++++++++++++++-------------- 1 files changed, 59 insertions(+), 35 deletions(-) diff --git a/pep-0399.txt b/pep-0399.txt --- a/pep-0399.txt +++ b/pep-0399.txt @@ -8,19 +8,19 @@ Content-Type: text/x-rst Created: 04-Apr-2011 Python-Version: 3.3 -Post-History: +Post-History: 04-Apr-2011, 12-Apr-2011 Abstract ======== The Python standard library under CPython contains various instances of modules implemented in both pure Python and C (either entirely or -partially). This PEP requires that in these instances that both the -Python and C code *must* be semantically identical (except in cases -where implementation details of a VM prevents it entirely). It is also -required that new C-based modules lacking a pure Python equivalent -implementation get special permissions to be added to the standard -library. +partially). This PEP requires that in these instances that the +C code *must* pass the test suite used for the pure Python code +so as to act as much as a drop-in replacement as possible +(C- and VM-specific tests are exempt). It is also required that new +C-based modules lacking a pure Python equivalent implementation get +special permissions to be added to the standard library. Rationale @@ -48,18 +48,22 @@ mandating that all new modules added to Python's standard library *must* have a pure Python implementation _unless_ special dispensation is given. This makes sure that a module in the stdlib is available to -all VMs and not just to CPython. +all VMs and not just to CPython (pre-existing modules that do not meet +this requirement are exempt, although there is nothing preventing +someone from adding in a pure Python implementation retroactively). 
Re-implementing parts (or all) of a module in C (in the case of CPython) is still allowed for performance reasons, but any such -accelerated code must semantically match the pure Python equivalent to -prevent divergence. To accomplish this, the pure Python and C code must -be thoroughly tested with the *same* test suite to verify compliance. +accelerated code must pass the same test suite (sans VM- or C-specific +tests) to verify semantics and prevent divergence. To accomplish this, +the test suite for the module must have 100% branch coverage of the +pure Python implementation before the acceleration code may be added. This is to prevent users from accidentally relying on semantics that are specific to the C code and are not reflected in -the pure Python implementation that other VMs rely upon, e.g., in -CPython 3.2.0, ``heapq.heappop()`` raises different exceptions -depending on whether the accelerated C code is used or not:: +the pure Python implementation that other VMs rely upon. For example, +in CPython 3.2.0, ``heapq.heappop()`` does an explicit type +check in its accelerated C code while the Python code uses duck +typing:: from test.support import import_fresh_module @@ -77,13 +81,13 @@ try: c_heapq.heappop(Spam()) except TypeError: - # "heap argument must be a list" + # Explicit type check failure: "heap argument must be a list" pass try: py_heapq.heappop(Spam()) except AttributeError: - # "'Foo' object has no attribute 'pop'" + # Duck typing failure: "'Foo' object has no attribute 'pop'" pass This kind of divergence is a problem for users as they unwittingly @@ -99,11 +103,11 @@ Starting in Python 3.3, any modules added to the standard library must have a pure Python implementation. This rule can only be ignored if the Python development team grants a special exemption for the module. 
-Typically the exemption would be granted only when a module wraps a +Typically the exemption will be granted only when a module wraps a specific C-based library (e.g., sqlite3_). In granting an exemption it -will be recognized that the module will most likely be considered -exclusive to CPython and not part of Python's standard library that -other VMs are expected to support. Usage of ``ctypes`` to provide an +will be recognized that the module will be considered exclusive to +CPython and not part of Python's standard library that other VMs are +expected to support. Usage of ``ctypes`` to provide an API for a C library will continue to be frowned upon as ``ctypes`` lacks compiler guarantees that C code typically relies upon to prevent certain errors from occurring (e.g., API changes). @@ -126,18 +130,34 @@ implementation of the ``csv`` module and maintaining it) then such code will be accepted. -Any accelerated code must be semantically identical to the pure Python -implementation. The only time any semantics are allowed to be -different are when technical details of the VM providing the -accelerated code prevent matching semantics from being possible, e.g., -a class being a ``type`` when implemented in C. The semantics -equivalence requirement also dictates that no public API be provided -in accelerated code that does not exist in the pure Python code. -Without this requirement people could accidentally come to rely on a -detail in the accelerated code which is not made available to other VMs -that use the pure Python implementation. To help verify that the -contract of semantic equivalence is being met, a module must be tested -both with and without its accelerated code as thoroughly as possible. +This requirement does not apply to modules already existing as only C +code in the standard library. 
It is acceptable to retroactively add a +pure Python implementation of a module implemented entirely in C, but +in those instances the C version is considered the reference +implementation in terms of expected semantics. + +Any new accelerated code must act as a drop-in replacement as close +to the pure Python implementation as reasonable. Technical details of +the VM providing the accelerated code are allowed to differ as +necessary, e.g., a class being a ``type`` when implemented in C. To +verify that the Python and equivalent C code operate as similarly as +possible, both code bases must be tested using the same tests which +apply to the pure Python code (tests specific to the C code or any VM +do not follow under this requirement). To make sure that the test +suite is thorough enough to cover all relevant semantics, the tests +must have 100% branch coverage for the Python code being replaced by +C code. This will make sure that the new acceleration code will +operate as much like a drop-in replacement for the Python code is as +possible. + +Acting as a drop-in replacement also dictates that no public API be +provided in accelerated code that does not exist in the pure Python +code. Without this requirement people could accidentally come to rely +on a detail in the accelerated code which is not made available to +other VMs that use the pure Python implementation. To help verify +that the contract of semantic equivalence is being met, a module must +be tested both with and without its accelerated code as thoroughly as +possible. As an example, to write tests which exercise both the pure Python and C accelerated versions of a module, a basic idiom can be followed:: @@ -189,9 +209,13 @@ if __name__ == '__main__': test_main() -Thoroughness of the test can be verified using coverage measurements -with branching coverage on the pure Python code to verify that all -possible scenarios are tested using (or not using) accelerator code. 
+ +If this test were to provide 100% branch coverage for +``heapq.heappop()`` in the pure Python implementation then the +accelerated C code would be allowed to be added to CPython's standard +library. If it did not, then the test suite would need to be updated +until 100% branch coverage was provided before the accelerated C code +could be added. Copyright -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Wed Apr 13 00:04:01 2011 From: python-checkins at python.org (brett.cannon) Date: Wed, 13 Apr 2011 00:04:01 +0200 Subject: [Python-checkins] peps: Mention that tests that cover C-specific issues are a good thing. Message-ID: http://hg.python.org/peps/rev/cb4815d2aba0 changeset: 3866:cb4815d2aba0 user: Brett Cannon date: Tue Apr 12 15:03:54 2011 -0700 summary: Mention that tests that cover C-specific issues are a good thing. files: pep-0399.txt | 6 +++++- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/pep-0399.txt b/pep-0399.txt --- a/pep-0399.txt +++ b/pep-0399.txt @@ -148,7 +148,11 @@ must have 100% branch coverage for the Python code being replaced by C code. This will make sure that the new acceleration code will operate as much like a drop-in replacement for the Python code is as -possible. +possible. Testing should still be done for issues that come up when +working with C code even if it is not explicitly required to meet the +coverage requirement, e.g., Tests should be aware that C code typically +has special paths for things such as built-in types, subclasses of +built-in types, etc. Acting as a drop-in replacement also dictates that no public API be provided in accelerated code that does not exist in the pure Python -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Wed Apr 13 00:26:10 2011 From: python-checkins at python.org (raymond.hettinger) Date: Wed, 13 Apr 2011 00:26:10 +0200 Subject: [Python-checkins] cpython (3.1): Issue 11747: Fix output format for context diffs. 
Message-ID: http://hg.python.org/cpython/rev/707078ca0a77 changeset: 69277:707078ca0a77 branch: 3.1 parent: 69272:f9bd0add9732 user: Raymond Hettinger date: Tue Apr 12 15:14:12 2011 -0700 summary: Issue 11747: Fix output format for context diffs. files: Lib/difflib.py | 108 ++++++++++++++++++--------- Lib/test/test_difflib.py | 65 ++++++++++++++++- 2 files changed, 136 insertions(+), 37 deletions(-) diff --git a/Lib/difflib.py b/Lib/difflib.py --- a/Lib/difflib.py +++ b/Lib/difflib.py @@ -1140,6 +1140,21 @@ return ch in ws +######################################################################## +### Unified Diff +######################################################################## + +def _format_range_unified(start, stop): + 'Convert range to the "ed" format' + # Per the diff spec at http://www.unix.org/single_unix_specification/ + beginning = start + 1 # lines start numbering with one + length = stop - start + if length == 1: + return '{}'.format(beginning) + if not length: + beginning -= 1 # empty ranges begin at line just before the range + return '{},{}'.format(beginning, length) + def unified_diff(a, b, fromfile='', tofile='', fromfiledate='', tofiledate='', n=3, lineterm='\n'): r""" @@ -1160,18 +1175,18 @@ The unidiff format normally has a header for filenames and modification times. Any or all of these may be specified using strings for - 'fromfile', 'tofile', 'fromfiledate', and 'tofiledate'. The modification - times are normally expressed in the format returned by time.ctime(). + 'fromfile', 'tofile', 'fromfiledate', and 'tofiledate'. + The modification times are normally expressed in the ISO 8601 format. Example: >>> for line in unified_diff('one two three four'.split(), ... 'zero one tree four'.split(), 'Original', 'Current', - ... 'Sat Jan 26 23:30:50 1991', 'Fri Jun 06 10:20:52 2003', + ... '2005-01-26 23:30:50', '2010-04-02 10:20:52', ... lineterm=''): - ... 
print(line) - --- Original Sat Jan 26 23:30:50 1991 - +++ Current Fri Jun 06 10:20:52 2003 + ... print(line) # doctest: +NORMALIZE_WHITESPACE + --- Original 2005-01-26 23:30:50 + +++ Current 2010-04-02 10:20:52 @@ -1,4 +1,4 @@ +zero one @@ -1184,23 +1199,45 @@ started = False for group in SequenceMatcher(None,a,b).get_grouped_opcodes(n): if not started: - yield '--- %s %s%s' % (fromfile, fromfiledate, lineterm) - yield '+++ %s %s%s' % (tofile, tofiledate, lineterm) started = True - i1, i2, j1, j2 = group[0][1], group[-1][2], group[0][3], group[-1][4] - yield "@@ -%d,%d +%d,%d @@%s" % (i1+1, i2-i1, j1+1, j2-j1, lineterm) + fromdate = '\t{}'.format(fromfiledate) if fromfiledate else '' + todate = '\t{}'.format(tofiledate) if tofiledate else '' + yield '--- {}{}{}'.format(fromfile, fromdate, lineterm) + yield '+++ {}{}{}'.format(tofile, todate, lineterm) + + first, last = group[0], group[-1] + file1_range = _format_range_unified(first[1], last[2]) + file2_range = _format_range_unified(first[3], last[4]) + yield '@@ -{} +{} @@{}'.format(file1_range, file2_range, lineterm) + for tag, i1, i2, j1, j2 in group: if tag == 'equal': for line in a[i1:i2]: yield ' ' + line continue - if tag == 'replace' or tag == 'delete': + if tag in {'replace', 'delete'}: for line in a[i1:i2]: yield '-' + line - if tag == 'replace' or tag == 'insert': + if tag in {'replace', 'insert'}: for line in b[j1:j2]: yield '+' + line + +######################################################################## +### Context Diff +######################################################################## + +def _format_range_context(start, stop): + 'Convert range to the "ed" format' + # Per the diff spec at http://www.unix.org/single_unix_specification/ + beginning = start + 1 # lines start numbering with one + length = stop - start + if not length: + beginning -= 1 # empty ranges begin at line just before the range + if length <= 1: + return '{}'.format(beginning) + return '{},{}'.format(beginning, 
beginning + length - 1) + # See http://www.unix.org/single_unix_specification/ def context_diff(a, b, fromfile='', tofile='', fromfiledate='', tofiledate='', n=3, lineterm='\n'): @@ -1223,17 +1260,16 @@ The context diff format normally has a header for filenames and modification times. Any or all of these may be specified using strings for 'fromfile', 'tofile', 'fromfiledate', and 'tofiledate'. - The modification times are normally expressed in the format returned - by time.ctime(). If not specified, the strings default to blanks. + The modification times are normally expressed in the ISO 8601 format. + If not specified, the strings default to blanks. Example: >>> print(''.join(context_diff('one\ntwo\nthree\nfour\n'.splitlines(1), - ... 'zero\none\ntree\nfour\n'.splitlines(1), 'Original', 'Current', - ... 'Sat Jan 26 23:30:50 1991', 'Fri Jun 06 10:22:46 2003')), + ... 'zero\none\ntree\nfour\n'.splitlines(1), 'Original', 'Current')), ... end="") - *** Original Sat Jan 26 23:30:50 1991 - --- Current Fri Jun 06 10:22:46 2003 + *** Original + --- Current *************** *** 1,4 **** one @@ -1247,36 +1283,36 @@ four """ + prefix = dict(insert='+ ', delete='- ', replace='! ', equal=' ') started = False - prefixmap = {'insert':'+ ', 'delete':'- ', 'replace':'! 
', 'equal':' '} for group in SequenceMatcher(None,a,b).get_grouped_opcodes(n): if not started: - yield '*** %s %s%s' % (fromfile, fromfiledate, lineterm) - yield '--- %s %s%s' % (tofile, tofiledate, lineterm) started = True + fromdate = '\t{}'.format(fromfiledate) if fromfiledate else '' + todate = '\t{}'.format(tofiledate) if tofiledate else '' + yield '*** {}{}{}'.format(fromfile, fromdate, lineterm) + yield '--- {}{}{}'.format(tofile, todate, lineterm) - yield '***************%s' % (lineterm,) - if group[-1][2] - group[0][1] >= 2: - yield '*** %d,%d ****%s' % (group[0][1]+1, group[-1][2], lineterm) - else: - yield '*** %d ****%s' % (group[-1][2], lineterm) - visiblechanges = [e for e in group if e[0] in ('replace', 'delete')] - if visiblechanges: + first, last = group[0], group[-1] + yield '***************' + lineterm + + file1_range = _format_range_context(first[1], last[2]) + yield '*** {} ****{}'.format(file1_range, lineterm) + + if any(tag in {'replace', 'delete'} for tag, _, _, _, _ in group): for tag, i1, i2, _, _ in group: if tag != 'insert': for line in a[i1:i2]: - yield prefixmap[tag] + line + yield prefix[tag] + line - if group[-1][4] - group[0][3] >= 2: - yield '--- %d,%d ----%s' % (group[0][3]+1, group[-1][4], lineterm) - else: - yield '--- %d ----%s' % (group[-1][4], lineterm) - visiblechanges = [e for e in group if e[0] in ('replace', 'insert')] - if visiblechanges: + file2_range = _format_range_context(first[3], last[4]) + yield '--- {} ----{}'.format(file2_range, lineterm) + + if any(tag in {'replace', 'insert'} for tag, _, _, _, _ in group): for tag, _, _, j1, j2 in group: if tag != 'delete': for line in b[j1:j2]: - yield prefixmap[tag] + line + yield prefix[tag] + line def ndiff(a, b, linejunk=None, charjunk=IS_CHARACTER_JUNK): r""" diff --git a/Lib/test/test_difflib.py b/Lib/test/test_difflib.py --- a/Lib/test/test_difflib.py +++ b/Lib/test/test_difflib.py @@ -159,10 +159,73 @@ difflib.SequenceMatcher(None, old, new).get_opcodes() +class 
TestOutputFormat(unittest.TestCase): + def test_tab_delimiter(self): + args = ['one', 'two', 'Original', 'Current', + '2005-01-26 23:30:50', '2010-04-02 10:20:52'] + ud = difflib.unified_diff(*args, lineterm='') + self.assertEqual(list(ud)[0:2], [ + "--- Original\t2005-01-26 23:30:50", + "+++ Current\t2010-04-02 10:20:52"]) + cd = difflib.context_diff(*args, lineterm='') + self.assertEqual(list(cd)[0:2], [ + "*** Original\t2005-01-26 23:30:50", + "--- Current\t2010-04-02 10:20:52"]) + + def test_no_trailing_tab_on_empty_filedate(self): + args = ['one', 'two', 'Original', 'Current'] + ud = difflib.unified_diff(*args, lineterm='') + self.assertEqual(list(ud)[0:2], ["--- Original", "+++ Current"]) + + cd = difflib.context_diff(*args, lineterm='') + self.assertEqual(list(cd)[0:2], ["*** Original", "--- Current"]) + + def test_range_format_unified(self): + # Per the diff spec at http://www.unix.org/single_unix_specification/ + spec = '''\ + Each <range> field shall be of the form: + %1d", <beginning line number> if the range contains exactly one line, + and: + "%1d,%1d", <beginning line number>, <number of lines> otherwise. + If a range is empty, its beginning line number shall be the number of + the line just before the range, or 0 if the empty range starts the file. + ''' + fmt = difflib._format_range_unified + self.assertEqual(fmt(3,3), '3,0') + self.assertEqual(fmt(3,4), '4') + self.assertEqual(fmt(3,5), '4,2') + self.assertEqual(fmt(3,6), '4,3') + self.assertEqual(fmt(0,0), '0,0') + + def test_range_format_context(self): + # Per the diff spec at http://www.unix.org/single_unix_specification/ + spec = '''\ + The range of lines in file1 shall be written in the following format + if the range contains two or more lines: + "*** %d,%d ****\n", <beginning line number>, <ending line number> + and the following format otherwise: + "*** %d ****\n", <ending line number> + The ending line number of an empty range shall be the number of the preceding line, + or 0 if the range is at the start of the file.
+ + Next, the range of lines in file2 shall be written in the following format + if the range contains two or more lines: + "--- %d,%d ----\n", <beginning line number>, <ending line number> + and the following format otherwise: + "--- %d ----\n", <ending line number> + ''' + fmt = difflib._format_range_context + self.assertEqual(fmt(3,3), '3') + self.assertEqual(fmt(3,4), '4') + self.assertEqual(fmt(3,5), '4,5') + self.assertEqual(fmt(3,6), '4,6') + self.assertEqual(fmt(0,0), '0') + + def test_main(): difflib.HtmlDiff._default_prefix = 0 Doctests = doctest.DocTestSuite(difflib) - run_unittest(TestSFpatches, TestSFbugs, Doctests) + run_unittest(TestSFpatches, TestSFbugs, Doctests, TestOutputFormat) if __name__ == '__main__': test_main() -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 13 00:26:13 2011 From: python-checkins at python.org (raymond.hettinger) Date: Wed, 13 Apr 2011 00:26:13 +0200 Subject: [Python-checkins] cpython (3.2): Issue 11747: Fix output format for context diffs. Message-ID: http://hg.python.org/cpython/rev/e3387295a24f changeset: 69278:e3387295a24f branch: 3.2 parent: 69273:4254085bae84 user: Raymond Hettinger date: Tue Apr 12 15:19:33 2011 -0700 summary: Issue 11747: Fix output format for context diffs.
files: Lib/difflib.py | 30 +++++++++++++++++++++++---- Lib/test/test_difflib.py | 29 +++++++++++++++++++++++++- 2 files changed, 52 insertions(+), 7 deletions(-) diff --git a/Lib/difflib.py b/Lib/difflib.py --- a/Lib/difflib.py +++ b/Lib/difflib.py @@ -1144,7 +1144,11 @@ return ch in ws -def _format_range(start, stop): +######################################################################## +### Unified Diff +######################################################################## + +def _format_range_unified(start, stop): 'Convert range to the "ed" format' # Per the diff spec at http://www.unix.org/single_unix_specification/ beginning = start + 1 # lines start numbering with one @@ -1206,8 +1210,8 @@ yield '+++ {}{}{}'.format(tofile, todate, lineterm) first, last = group[0], group[-1] - file1_range = _format_range(first[1], last[2]) - file2_range = _format_range(first[3], last[4]) + file1_range = _format_range_unified(first[1], last[2]) + file2_range = _format_range_unified(first[3], last[4]) yield '@@ -{} +{} @@{}'.format(file1_range, file2_range, lineterm) for tag, i1, i2, j1, j2 in group: @@ -1222,6 +1226,22 @@ for line in b[j1:j2]: yield '+' + line + +######################################################################## +### Context Diff +######################################################################## + +def _format_range_context(start, stop): + 'Convert range to the "ed" format' + # Per the diff spec at http://www.unix.org/single_unix_specification/ + beginning = start + 1 # lines start numbering with one + length = stop - start + if not length: + beginning -= 1 # empty ranges begin at line just before the range + if length <= 1: + return '{}'.format(beginning) + return '{},{}'.format(beginning, beginning + length - 1) + # See http://www.unix.org/single_unix_specification/ def context_diff(a, b, fromfile='', tofile='', fromfiledate='', tofiledate='', n=3, lineterm='\n'): @@ -1280,7 +1300,7 @@ first, last = group[0], group[-1] yield 
'***************' + lineterm - file1_range = _format_range(first[1], last[2]) + file1_range = _format_range_context(first[1], last[2]) yield '*** {} ****{}'.format(file1_range, lineterm) if any(tag in {'replace', 'delete'} for tag, _, _, _, _ in group): @@ -1289,7 +1309,7 @@ for line in a[i1:i2]: yield prefix[tag] + line - file2_range = _format_range(first[3], last[4]) + file2_range = _format_range_context(first[3], last[4]) yield '--- {} ----{}'.format(file2_range, lineterm) if any(tag in {'replace', 'insert'} for tag, _, _, _, _ in group): diff --git a/Lib/test/test_difflib.py b/Lib/test/test_difflib.py --- a/Lib/test/test_difflib.py +++ b/Lib/test/test_difflib.py @@ -236,7 +236,7 @@ cd = difflib.context_diff(*args, lineterm='') self.assertEqual(list(cd)[0:2], ["*** Original", "--- Current"]) - def test_range_format(self): + def test_range_format_unified(self): # Per the diff spec at http://www.unix.org/single_unix_specification/ spec = '''\ Each <range> field shall be of the form: @@ -246,13 +246,38 @@ If a range is empty, its beginning line number shall be the number of the line just before the range, or 0 if the empty range starts the file. ''' - fmt = difflib._format_range + fmt = difflib._format_range_unified self.assertEqual(fmt(3,3), '3,0') self.assertEqual(fmt(3,4), '4') self.assertEqual(fmt(3,5), '4,2') self.assertEqual(fmt(3,6), '4,3') self.assertEqual(fmt(0,0), '0,0') + def test_range_format_context(self): + # Per the diff spec at http://www.unix.org/single_unix_specification/ + spec = '''\ + The range of lines in file1 shall be written in the following format + if the range contains two or more lines: + "*** %d,%d ****\n", <beginning line number>, <ending line number> + and the following format otherwise: + "*** %d ****\n", <ending line number> + The ending line number of an empty range shall be the number of the preceding line, + or 0 if the range is at the start of the file.
+ + Next, the range of lines in file2 shall be written in the following format + if the range contains two or more lines: + "--- %d,%d ----\n", <beginning line number>, <ending line number> + and the following format otherwise: + "--- %d ----\n", <ending line number> + ''' + fmt = difflib._format_range_context + self.assertEqual(fmt(3,3), '3') + self.assertEqual(fmt(3,4), '4') + self.assertEqual(fmt(3,5), '4,5') + self.assertEqual(fmt(3,6), '4,6') + self.assertEqual(fmt(0,0), '0') + + def test_main(): difflib.HtmlDiff._default_prefix = 0 Doctests = doctest.DocTestSuite(difflib) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 13 00:26:18 2011 From: python-checkins at python.org (raymond.hettinger) Date: Wed, 13 Apr 2011 00:26:18 +0200 Subject: [Python-checkins] cpython (merge 3.1 -> 3.2): Merge Message-ID: http://hg.python.org/cpython/rev/8733fa6d3cfa changeset: 69279:8733fa6d3cfa branch: 3.2 parent: 69278:e3387295a24f parent: 69277:707078ca0a77 user: Raymond Hettinger date: Tue Apr 12 15:24:39 2011 -0700 summary: Merge files: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 13 00:26:20 2011 From: python-checkins at python.org (raymond.hettinger) Date: Wed, 13 Apr 2011 00:26:20 +0200 Subject: [Python-checkins] cpython: Issue 11747: Fix output format for context diffs. Message-ID: http://hg.python.org/cpython/rev/fbfd5435889c changeset: 69280:fbfd5435889c parent: 69276:506cab8fc329 user: Raymond Hettinger date: Tue Apr 12 15:25:30 2011 -0700 summary: Issue 11747: Fix output format for context diffs.
files: Lib/difflib.py | 30 +++++++++++++++++++++++---- Lib/test/test_difflib.py | 29 +++++++++++++++++++++++++- 2 files changed, 52 insertions(+), 7 deletions(-) diff --git a/Lib/difflib.py b/Lib/difflib.py --- a/Lib/difflib.py +++ b/Lib/difflib.py @@ -1144,7 +1144,11 @@ return ch in ws -def _format_range(start, stop): +######################################################################## +### Unified Diff +######################################################################## + +def _format_range_unified(start, stop): 'Convert range to the "ed" format' # Per the diff spec at http://www.unix.org/single_unix_specification/ beginning = start + 1 # lines start numbering with one @@ -1206,8 +1210,8 @@ yield '+++ {}{}{}'.format(tofile, todate, lineterm) first, last = group[0], group[-1] - file1_range = _format_range(first[1], last[2]) - file2_range = _format_range(first[3], last[4]) + file1_range = _format_range_unified(first[1], last[2]) + file2_range = _format_range_unified(first[3], last[4]) yield '@@ -{} +{} @@{}'.format(file1_range, file2_range, lineterm) for tag, i1, i2, j1, j2 in group: @@ -1222,6 +1226,22 @@ for line in b[j1:j2]: yield '+' + line + +######################################################################## +### Context Diff +######################################################################## + +def _format_range_context(start, stop): + 'Convert range to the "ed" format' + # Per the diff spec at http://www.unix.org/single_unix_specification/ + beginning = start + 1 # lines start numbering with one + length = stop - start + if not length: + beginning -= 1 # empty ranges begin at line just before the range + if length <= 1: + return '{}'.format(beginning) + return '{},{}'.format(beginning, beginning + length - 1) + # See http://www.unix.org/single_unix_specification/ def context_diff(a, b, fromfile='', tofile='', fromfiledate='', tofiledate='', n=3, lineterm='\n'): @@ -1280,7 +1300,7 @@ first, last = group[0], group[-1] yield 
'***************' + lineterm - file1_range = _format_range(first[1], last[2]) + file1_range = _format_range_context(first[1], last[2]) yield '*** {} ****{}'.format(file1_range, lineterm) if any(tag in {'replace', 'delete'} for tag, _, _, _, _ in group): @@ -1289,7 +1309,7 @@ for line in a[i1:i2]: yield prefix[tag] + line - file2_range = _format_range(first[3], last[4]) + file2_range = _format_range_context(first[3], last[4]) yield '--- {} ----{}'.format(file2_range, lineterm) if any(tag in {'replace', 'insert'} for tag, _, _, _, _ in group): diff --git a/Lib/test/test_difflib.py b/Lib/test/test_difflib.py --- a/Lib/test/test_difflib.py +++ b/Lib/test/test_difflib.py @@ -236,7 +236,7 @@ cd = difflib.context_diff(*args, lineterm='') self.assertEqual(list(cd)[0:2], ["*** Original", "--- Current"]) - def test_range_format(self): + def test_range_format_unified(self): # Per the diff spec at http://www.unix.org/single_unix_specification/ spec = '''\ Each <range> field shall be of the form: @@ -246,13 +246,38 @@ If a range is empty, its beginning line number shall be the number of the line just before the range, or 0 if the empty range starts the file. ''' - fmt = difflib._format_range + fmt = difflib._format_range_unified self.assertEqual(fmt(3,3), '3,0') self.assertEqual(fmt(3,4), '4') self.assertEqual(fmt(3,5), '4,2') self.assertEqual(fmt(3,6), '4,3') self.assertEqual(fmt(0,0), '0,0') + def test_range_format_context(self): + # Per the diff spec at http://www.unix.org/single_unix_specification/ + spec = '''\ + The range of lines in file1 shall be written in the following format + if the range contains two or more lines: + "*** %d,%d ****\n", <beginning line number>, <ending line number> + and the following format otherwise: + "*** %d ****\n", <ending line number> + The ending line number of an empty range shall be the number of the preceding line, + or 0 if the range is at the start of the file.
+ + Next, the range of lines in file2 shall be written in the following format + if the range contains two or more lines: + "--- %d,%d ----\n", <beginning line number>, <ending line number> + and the following format otherwise: + "--- %d ----\n", <ending line number> + ''' + fmt = difflib._format_range_context + self.assertEqual(fmt(3,3), '3') + self.assertEqual(fmt(3,4), '4') + self.assertEqual(fmt(3,5), '4,5') + self.assertEqual(fmt(3,6), '4,6') + self.assertEqual(fmt(0,0), '0') + + def test_main(): difflib.HtmlDiff._default_prefix = 0 Doctests = doctest.DocTestSuite(difflib) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 13 00:26:22 2011 From: python-checkins at python.org (raymond.hettinger) Date: Wed, 13 Apr 2011 00:26:22 +0200 Subject: [Python-checkins] cpython (merge 3.2 -> default): Merge Message-ID: http://hg.python.org/cpython/rev/69573aab3ea7 changeset: 69281:69573aab3ea7 parent: 69280:fbfd5435889c parent: 69279:8733fa6d3cfa user: Raymond Hettinger date: Tue Apr 12 15:25:44 2011 -0700 summary: Merge files: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Apr 13 00:48:34 2011 From: python-checkins at python.org (raymond.hettinger) Date: Wed, 13 Apr 2011 00:48:34 +0200 Subject: [Python-checkins] cpython (2.7): Issue 11747: Fix output format for context diffs. Message-ID: http://hg.python.org/cpython/rev/09459397f807 changeset: 69282:09459397f807 branch: 2.7 parent: 69271:f4adc2926bf5 user: Raymond Hettinger date: Tue Apr 12 15:48:25 2011 -0700 summary: Issue 11747: Fix output format for context diffs.
files: Lib/difflib.py | 89 +++++++++++++++++++-------- Lib/test/test_difflib.py | 41 ++++++++++++ 2 files changed, 102 insertions(+), 28 deletions(-) diff --git a/Lib/difflib.py b/Lib/difflib.py --- a/Lib/difflib.py +++ b/Lib/difflib.py @@ -1140,6 +1140,21 @@ return ch in ws +######################################################################## +### Unified Diff +######################################################################## + +def _format_range_unified(start, stop): + 'Convert range to the "ed" format' + # Per the diff spec at http://www.unix.org/single_unix_specification/ + beginning = start + 1 # lines start numbering with one + length = stop - start + if length == 1: + return '{}'.format(beginning) + if not length: + beginning -= 1 # empty ranges begin at line just before the range + return '{},{}'.format(beginning, length) + def unified_diff(a, b, fromfile='', tofile='', fromfiledate='', tofiledate='', n=3, lineterm='\n'): r""" @@ -1184,25 +1199,45 @@ started = False for group in SequenceMatcher(None,a,b).get_grouped_opcodes(n): if not started: - fromdate = '\t%s' % fromfiledate if fromfiledate else '' - todate = '\t%s' % tofiledate if tofiledate else '' - yield '--- %s%s%s' % (fromfile, fromdate, lineterm) - yield '+++ %s%s%s' % (tofile, todate, lineterm) started = True - i1, i2, j1, j2 = group[0][1], group[-1][2], group[0][3], group[-1][4] - yield "@@ -%d,%d +%d,%d @@%s" % (i1+1, i2-i1, j1+1, j2-j1, lineterm) + fromdate = '\t{}'.format(fromfiledate) if fromfiledate else '' + todate = '\t{}'.format(tofiledate) if tofiledate else '' + yield '--- {}{}{}'.format(fromfile, fromdate, lineterm) + yield '+++ {}{}{}'.format(tofile, todate, lineterm) + + first, last = group[0], group[-1] + file1_range = _format_range_unified(first[1], last[2]) + file2_range = _format_range_unified(first[3], last[4]) + yield '@@ -{} +{} @@{}'.format(file1_range, file2_range, lineterm) + for tag, i1, i2, j1, j2 in group: if tag == 'equal': for line in a[i1:i2]: yield ' ' + 
line continue - if tag == 'replace' or tag == 'delete': + if tag in ('replace', 'delete'): for line in a[i1:i2]: yield '-' + line - if tag == 'replace' or tag == 'insert': + if tag in ('replace', 'insert'): for line in b[j1:j2]: yield '+' + line + +######################################################################## +### Context Diff +######################################################################## + +def _format_range_context(start, stop): + 'Convert range to the "ed" format' + # Per the diff spec at http://www.unix.org/single_unix_specification/ + beginning = start + 1 # lines start numbering with one + length = stop - start + if not length: + beginning -= 1 # empty ranges begin at line just before the range + if length <= 1: + return '{}'.format(beginning) + return '{},{}'.format(beginning, beginning + length - 1) + # See http://www.unix.org/single_unix_specification/ def context_diff(a, b, fromfile='', tofile='', fromfiledate='', tofiledate='', n=3, lineterm='\n'): @@ -1247,38 +1282,36 @@ four """ + prefix = dict(insert='+ ', delete='- ', replace='! ', equal=' ') started = False - prefixmap = {'insert':'+ ', 'delete':'- ', 'replace':'! ', 'equal':' '} for group in SequenceMatcher(None,a,b).get_grouped_opcodes(n): if not started: - fromdate = '\t%s' % fromfiledate if fromfiledate else '' - todate = '\t%s' % tofiledate if tofiledate else '' - yield '*** %s%s%s' % (fromfile, fromdate, lineterm) - yield '--- %s%s%s' % (tofile, todate, lineterm) started = True + fromdate = '\t{}'.format(fromfiledate) if fromfiledate else '' + todate = '\t{}'.format(tofiledate) if tofiledate else '' + yi