From python-checkins at python.org Sat Oct 1 01:05:28 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 01:05:28 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_pyexat_uses_the_new_Unicode?= =?utf8?q?_API?= Message-ID: http://hg.python.org/cpython/rev/a1be34457ccf changeset: 72548:a1be34457ccf user: Victor Stinner date: Sat Oct 01 01:05:40 2011 +0200 summary: pyexat uses the new Unicode API files: Modules/pyexpat.c | 12 +++++++----- 1 files changed, 7 insertions(+), 5 deletions(-) diff --git a/Modules/pyexpat.c b/Modules/pyexpat.c --- a/Modules/pyexpat.c +++ b/Modules/pyexpat.c @@ -1234,11 +1234,13 @@ static PyObject * xmlparse_getattro(xmlparseobject *self, PyObject *nameobj) { - const Py_UNICODE *name; + Py_UCS4 first_char; int handlernum = -1; if (!PyUnicode_Check(nameobj)) goto generic; + if (PyUnicode_READY(nameobj)) + return NULL; handlernum = handlername2int(nameobj); @@ -1250,8 +1252,8 @@ return result; } - name = PyUnicode_AS_UNICODE(nameobj); - if (name[0] == 'E') { + first_char = PyUnicode_READ_CHAR(nameobj, 0); + if (first_char == 'E') { if (PyUnicode_CompareWithASCIIString(nameobj, "ErrorCode") == 0) return PyLong_FromLong((long) XML_GetErrorCode(self->itself)); @@ -1265,7 +1267,7 @@ return PyLong_FromLong((long) XML_GetErrorByteIndex(self->itself)); } - if (name[0] == 'C') { + if (first_char == 'C') { if (PyUnicode_CompareWithASCIIString(nameobj, "CurrentLineNumber") == 0) return PyLong_FromLong((long) XML_GetCurrentLineNumber(self->itself)); @@ -1276,7 +1278,7 @@ return PyLong_FromLong((long) XML_GetCurrentByteIndex(self->itself)); } - if (name[0] == 'b') { + if (first_char == 'b') { if (PyUnicode_CompareWithASCIIString(nameobj, "buffer_size") == 0) return PyLong_FromLong((long) self->buffer_size); if (PyUnicode_CompareWithASCIIString(nameobj, "buffer_text") == 0) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 01:08:37 2011 From: python-checkins at python.org 
(guido.van.rossum) Date: Sat, 01 Oct 2011 01:08:37 +0200 Subject: [Python-checkins] =?utf8?q?peps=3A_Update_posted_by_Greg_Ewing_to?= =?utf8?q?_python-ideas_today?= Message-ID: http://hg.python.org/peps/rev/31aa6576828d changeset: 3953:31aa6576828d user: Guido van Rossum date: Fri Sep 30 16:08:30 2011 -0700 summary: Update posted by Greg Ewing to python-ideas today files: pep-0335.txt | 329 +++++++++++++++++++++++++++++++++++++- 1 files changed, 317 insertions(+), 12 deletions(-) diff --git a/pep-0335.txt b/pep-0335.txt --- a/pep-0335.txt +++ b/pep-0335.txt @@ -2,13 +2,13 @@ Title: Overloadable Boolean Operators Version: $Revision$ Last-Modified: $Date$ -Author: Greg Ewing +Author: Gregory Ewing Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 29-Aug-2004 -Python-Version: 2.4 -Post-History: 05-Sep-2004 +Python-Version: 3.3 +Post-History: 05-Sep-2004, 30-Sep-2011 Abstract @@ -45,7 +45,7 @@ operators excluded from those able to be customised can be inconvenient. Examples include: -1. Numeric/Numarray, in which almost all the operators are defined on +1. NumPy, in which almost all the operators are defined on arrays so as to perform the appropriate operation between corresponding elements, and return an array of the results. For consistency, one would expect a boolean operation between two @@ -54,7 +54,7 @@ There is a precedent for an extension of this kind: comparison operators were originally restricted to returning boolean results, - and rich comparisons were added so that comparisons of Numeric + and rich comparisons were added so that comparisons of NumPy arrays could return arrays of booleans. 2. A symbolic algebra system, in which a Python expression is @@ -87,7 +87,7 @@ 2. There must not be any appreciable loss of speed in the default case. -3. If possible, the customisation mechanism should allow the object to +3. 
Ideally, the customisation mechanism should allow the object to provide either short-circuiting or non-short-circuiting semantics, at its discretion. @@ -131,14 +131,14 @@ To permit short-circuiting, processing of the 'and' and 'or' operators is split into two phases. Phase 1 occurs after evaluation of the first operand but before the second. If the first operand defines the -appropriate phase 1 method, it is called with the first operand as +relevant phase 1 method, it is called with the first operand as argument. If that method can determine the result without needing the second operand, it returns the result, and further processing is skipped. If the phase 1 method determines that the second operand is needed, it returns the special value NeedOtherOperand. This triggers the -evaluation of the second operand, and the calling of an appropriate +evaluation of the second operand, and the calling of a relevant phase 2 method. During phase 2, the __and2__/__rand2__ and __or2__/__ror2__ method pairs work as for other binary operators. @@ -149,7 +149,7 @@ no corresponding phase 1 method, the second operand is always evaluated and the phase 2 method called. This allows an object which does not want short-circuiting semantics to simply implement the -relevant phase 2 methods and ignore phase 1. +phase 2 methods and ignore phase 1. Bytecodes @@ -182,9 +182,9 @@ Type Slots ---------- -A the C level, the new special methods are manifested as five new +At the C level, the new special methods are manifested as five new slots in the type object. In the patch, they are added to the -tp_as_number substructure, since this allowed making use of some +tp_as_number substructure, since this allows making use of some existing code for dealing with unary and binary operators. Their existence is signalled by a new type flag, Py_TPFLAGS_HAVE_BOOLEAN_OVERLOAD. 
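[Editor's note] The two-phase protocol described in the PEP text above can be emulated in plain Python with a sentinel object and a lazily evaluated second operand. The names `NeedOtherOperand`, `__and1__` and `__and2__` come from the PEP itself; the `logical_and` driver function and the `ShortCircuiting` class are hypothetical stand-ins for the compiler/interpreter support the PEP proposes, not part of its C implementation.

```python
# Sketch of PEP 335's two-phase 'and' protocol, emulated in pure Python.
# NeedOtherOperand, __and1__ and __and2__ are the PEP's names; the
# logical_and() driver is a hypothetical stand-in for compiler support.

NeedOtherOperand = object()

class ShortCircuiting:
    """Object that can answer 'and' in phase 1 when it is falsy."""
    def __init__(self, value):
        self.value = value

    def __and1__(self):
        # Phase 1: if the result can be decided from this operand alone,
        # return it and skip evaluation of the second operand.
        if not self.value:
            return self
        return NeedOtherOperand  # request phase 2

    def __and2__(self, other):
        # Phase 2: combine with the (now evaluated) second operand.
        return other

def logical_and(first, second_thunk):
    """Evaluate 'first and second' under the proposed protocol.

    second_thunk is a zero-argument callable, so the second operand is
    only evaluated when phase 1 asks for it (short-circuiting)."""
    phase1 = getattr(type(first), '__and1__', None)
    if phase1 is not None:
        result = phase1(first)
        if result is not NeedOtherOperand:
            return result  # short-circuited
        return first.__and2__(second_thunk())
    # No phase 1 method: default short-circuit semantics.
    return first if not first else second_thunk()

a = ShortCircuiting(0)
b = ShortCircuiting(1)
assert logical_and(a, lambda: b) is a   # falsy first operand short-circuits
assert logical_and(b, lambda: a) is a   # phase 2 returns the second operand
```

Note how the thunk makes the deferred evaluation explicit: in the real proposal the compiler arranges this by splitting the operation across two bytecodes instead.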
@@ -208,7 +208,312 @@ PyObject *PyObject_LogicalAnd1(PyObject *); PyObject *PyObject_LogicalOr1(PyObject *); PyObject *PyObject_LogicalAnd2(PyObject *, PyObject *); - PyObject *PyObject_LogicalOr2(PyObject *, PyObject *); + + + +Alternatives and Optimisations +============================== + +This section discusses some possible variations on the proposal, +and ways in which the bytecode sequences generated for boolean +expressions could be optimised. + +Reduced special method set +-------------------------- + +For completeness, the full version of this proposal includes a +mechanism for types to define their own customised short-circuiting +behaviour. However, the full mechanism is not needed to address the +main use cases put forward here, and it would be possible to +define a simplified version that only includes the phase 2 +methods. There would then only be 5 new special methods (__and2__, +__rand2__, __or2__, __ror2__, __not__) with 3 associated type slots +and 3 API functions. + +This simplified version could be expanded to the full version +later if desired. + +Additional bytecodes +-------------------- + +As defined here, the bytecode sequence for code that branches on +the result of a boolean expression would be slightly longer than +it currently is. For example, in Python 2.7, + +:: + + if a and b: + statement1 + else: + statement2 + +generates + +:: + + LOAD_GLOBAL a + POP_JUMP_IF_FALSE false_branch + LOAD_GLOBAL b + POP_JUMP_IF_FALSE false_branch + + JUMP_FORWARD end_branch + false_branch: + + end_branch: + +Under this proposal as described so far, it would become something like + +:: + + LOAD_GLOBAL a + LOGICAL_AND_1 test + LOAD_GLOBAL b + LOGICAL_AND_2 + test: + POP_JUMP_IF_FALSE false_branch + + JUMP_FORWARD end_branch + false_branch: + + end_branch: + +This involves executing one extra bytecode in the short-circuiting +case and two extra bytecodes in the non-short-circuiting case. 
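[Editor's note] The Python 2.7 bytecode quoted above can be checked against a current interpreter with the `dis` module; the exact opcode names vary between CPython versions, so this only illustrates that branching on `a and b` is still compiled to conditional jumps, which is what the proposal's extra LOGICAL_* opcodes would have to interoperate with.

```python
import dis

# Disassemble branching on 'a and b'. In every modern CPython the
# short-circuit is compiled as conditional-jump opcodes, though their
# exact names differ between versions (e.g. POP_JUMP_IF_FALSE).
code = compile("if a and b:\n    x = 1\nelse:\n    x = 2\n", "<pep335>", "exec")
ops = [ins.opname for ins in dis.get_instructions(code)]
assert any("JUMP" in op for op in ops)
dis.dis(code)
```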
+ +However, by introducing extra bytecodes that combine the logical +operations with testing and branching on the result, it can be +reduced to the same number of bytecodes as the original: + +:: + + LOAD_GLOBAL a + AND1_JUMP true_branch, false_branch + LOAD_GLOBAL b + AND2_JUMP_IF_FALSE false_branch + true_branch: + + JUMP_FORWARD end_branch + false_branch: + + end_branch: + +Here, AND1_JUMP performs phase 1 processing as above, +and then examines the result. If there is a result, it is popped +from the stack, its truth value is tested and a branch taken to +one of two locations. + +Otherwise, the first operand is left on the stack and execution +continues to the next bytecode. The AND2_JUMP_IF_FALSE bytecode +performs phase 2 processing, pops the result and branches if +it tests false + +For the 'or' operator, there would be corresponding OR1_JUMP +and OR2_JUMP_IF_TRUE bytecodes. + +If the simplified version without phase 1 methods is used, then +early exiting can only occur if the first operand is false for +'and' and true for 'or'. Consequently, the two-target AND1_JUMP and +OR1_JUMP bytecodes can be replaced with AND1_JUMP_IF_FALSE and +OR1_JUMP_IF_TRUE, these being ordinary branch instructions with +only one target. + +Optimisation of 'not' +--------------------- + +Recent versions of Python implement a simple optimisation in +which branching on a negated boolean expression is implemented +by reversing the sense of the branch, saving a UNARY_NOT opcode. + +Taking a strict view, this optimisation should no longer be +performed, because the 'not' operator may be overridden to produce +quite different results from usual. However, in typical use cases, +it is not envisaged that expressions involving customised boolean +operations will be used for branching -- it is much more likely +that the result will be used in some other way. 
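[Editor's note] For ordinary truth values, the identities that make the branch-reversal optimisation (and a De Morgan extension of it) safe are easy to spot-check; this short property test over all truth-value pairs is an editorial illustration, not part of the PEP.

```python
from itertools import product

# Spot-check the boolean identities the compiler relies on when it
# reverses the sense of a branch or applies De Morgan's laws.
for a, b in product([False, True], repeat=2):
    assert (not (a and b)) == ((not a) or (not b))   # De Morgan
    assert (not (a or b)) == ((not a) and (not b))   # De Morgan
    # Branching on 'not a' is equivalent to branching on 'a'
    # with the two branch targets swapped:
    branch_on_not = (1 if not a else 2)
    reversed_branch = (2 if a else 1)
    assert branch_on_not == reversed_branch
```

These identities hold only when the operators have their default semantics, which is exactly why the PEP notes that a strict reading would forbid the optimisation for overridden operators.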
+ +Therefore, it would probably do little harm to specify that the +compiler is allowed to use the laws of boolean algebra to +simplify any expression that appears directly in a boolean +context. If this is inconvenient, the result can always be assigned +to a temporary name first. + +This would allow the existing 'not' optimisation to remain, and +would permit future extensions of it such as using De Morgan's laws +to extend it deeper into the expression. + + +Usage Examples +============== + +Example 1: NumPy Arrays +----------------------- + +:: + + #----------------------------------------------------------------- + # + # This example creates a subclass of numpy array to which + # 'and', 'or' and 'not' can be applied, producing an array + # of booleans. + # + #----------------------------------------------------------------- + + from numpy import array, ndarray + + class BArray(ndarray): + + def __str__(self): + return "barray(%s)" % ndarray.__str__(self) + + def __and2__(self, other): + return (self & other) + + def __or2__(self, other): + return (self & other) + + def __not__(self): + return (self == 0) + + def barray(*args, **kwds): + return array(*args, **kwds).view(type = BArray) + + a0 = barray([0, 1, 2, 4]) + a1 = barray([1, 2, 3, 4]) + a2 = barray([5, 6, 3, 4]) + a3 = barray([5, 1, 2, 4]) + + print "a0:", a0 + print "a1:", a1 + print "a2:", a2 + print "a3:", a3 + print "not a0:", not a0 + print "a0 == a1 and a2 == a3:", a0 == a1 and a2 == a3 + print "a0 == a1 or a2 == a3:", a0 == a1 or a2 == a3 + +Example 1 Output +---------------- + +:: + + a0: barray([0 1 2 4]) + a1: barray([1 2 3 4]) + a2: barray([5 6 3 4]) + a3: barray([5 1 2 4]) + not a0: barray([ True False False False]) + a0 == a1 and a2 == a3: barray([False False False True]) + a0 == a1 or a2 == a3: barray([False False False True]) + + +Example 2: Database Queries +--------------------------- + +:: + + #----------------------------------------------------------------- + # + # This example 
demonstrates the creation of a DSL for database + # queries allowing 'and' and 'or' operators to be used to + # formulate the query. + # + #----------------------------------------------------------------- + + class SQLNode(object): + + def __and2__(self, other): + return SQLBinop("and", self, other) + + def __rand2__(self, other): + return SQLBinop("and", other, self) + + def __eq__(self, other): + return SQLBinop("=", self, other) + + + class Table(SQLNode): + + def __init__(self, name): + self.__tablename__ = name + + def __getattr__(self, name): + return SQLAttr(self, name) + + def __sql__(self): + return self.__tablename__ + + + class SQLBinop(SQLNode): + + def __init__(self, op, opnd1, opnd2): + self.op = op.upper() + self.opnd1 = opnd1 + self.opnd2 = opnd2 + + def __sql__(self): + return "(%s %s %s)" % (sql(self.opnd1), self.op, sql(self.opnd2)) + + + class SQLAttr(SQLNode): + + def __init__(self, table, name): + self.table = table + self.name = name + + def __sql__(self): + return "%s.%s" % (sql(self.table), self.name) + + + class SQLSelect(SQLNode): + + def __init__(self, targets): + self.targets = targets + self.where_clause = None + + def where(self, expr): + self.where_clause = expr + return self + + def __sql__(self): + result = "SELECT %s" % ", ".join([sql(target) for target in self.targets]) + if self.where_clause: + result = "%s WHERE %s" % (result, sql(self.where_clause)) + return result + + + def sql(expr): + if isinstance(expr, SQLNode): + return expr.__sql__() + elif isinstance(expr, str): + return "'%s'" % expr.replace("'", "''") + else: + return str(expr) + + + def select(*targets): + return SQLSelect(targets) + + +#-------------------------------------------------------------------------------- + + dishes = Table("dishes") + customers = Table("customers") + orders = Table("orders") + + query = select(customers.name, dishes.price, orders.amount).where( + customers.cust_id == orders.cust_id and orders.dish_id == dishes.dish_id + and dishes.name 
== "Spam, Eggs, Sausages and Spam") + + print repr(query) + print sql(query) + +Example 2 Output +---------------- + +:: + + <__main__.SQLSelect object at 0x1cc830> + SELECT customers.name, dishes.price, orders.amount WHERE + (((customers.cust_id = orders.cust_id) AND (orders.dish_id = + dishes.dish_id)) AND (dishes.name = 'Spam, Eggs, Sausages and Spam')) Copyright -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Sat Oct 1 02:49:27 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 02:49:27 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FFromObject=28?= =?utf8?q?=29_reuses_PyUnicode=5FCopy=28=29?= Message-ID: http://hg.python.org/cpython/rev/d94b0b371878 changeset: 72549:d94b0b371878 user: Victor Stinner date: Sat Oct 01 01:16:59 2011 +0200 summary: PyUnicode_FromObject() reuses PyUnicode_Copy() * PyUnicode_Copy() is faster than substring() * Fix also a compiler warning files: Objects/unicodeobject.c | 6 ++---- 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -2052,9 +2052,7 @@ if (PyUnicode_Check(obj)) { /* For a Unicode subtype that's not a Unicode object, return a true Unicode object with the same data. 
*/ - if (PyUnicode_READY(obj) == -1) - return NULL; - return substring((PyUnicodeObject *)obj, 0, PyUnicode_GET_LENGTH(obj)); + return PyUnicode_Copy(obj); } PyErr_Format(PyExc_TypeError, "Can't convert '%.100s' object to str implicitly", @@ -11465,7 +11463,7 @@ return (PyObject*) self; } else - return PyUnicode_Copy(self); + return PyUnicode_Copy((PyObject*)self); } fill = width - _PyUnicode_LENGTH(self); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 02:49:27 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 02:49:27 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Remove_commented_code=3A_st?= =?utf8?q?r+=3Dstr_is_no_more_super-optimized?= Message-ID: http://hg.python.org/cpython/rev/df6deb7bb772 changeset: 72550:df6deb7bb772 user: Victor Stinner date: Sat Oct 01 01:26:08 2011 +0200 summary: Remove commented code: str+=str is no more super-optimized files: Python/ceval.c | 118 +----------------------------------- 1 files changed, 6 insertions(+), 112 deletions(-) diff --git a/Python/ceval.c b/Python/ceval.c --- a/Python/ceval.c +++ b/Python/ceval.c @@ -136,8 +136,6 @@ static int import_all_from(PyObject *, PyObject *); static void format_exc_check_arg(PyObject *, const char *, PyObject *); static void format_exc_unbound(PyCodeObject *co, int oparg); -static PyObject * unicode_concatenate(PyObject *, PyObject *, - PyFrameObject *, unsigned char *); static PyObject * special_lookup(PyObject *, char *, PyObject **); #define NAME_ERROR_MSG \ @@ -1509,17 +1507,11 @@ TARGET(BINARY_ADD) w = POP(); v = TOP(); - if (PyUnicode_CheckExact(v) && - PyUnicode_CheckExact(w)) { - x = unicode_concatenate(v, w, f, next_instr); - /* unicode_concatenate consumed the ref to v */ - goto skip_decref_vx; - } - else { + if (PyUnicode_Check(v) && PyUnicode_Check(w)) + x = PyUnicode_Concat(v, w); + else x = PyNumber_Add(v, w); - } Py_DECREF(v); - skip_decref_vx: Py_DECREF(w); SET_TOP(x); if (x != NULL) 
DISPATCH(); @@ -1670,17 +1662,11 @@ TARGET(INPLACE_ADD) w = POP(); v = TOP(); - if (PyUnicode_CheckExact(v) && - PyUnicode_CheckExact(w)) { - x = unicode_concatenate(v, w, f, next_instr); - /* unicode_concatenate consumed the ref to v */ - goto skip_decref_v; - } - else { + if (PyUnicode_Check(v) && PyUnicode_Check(w)) + x = PyUnicode_Concat(v, w); + else x = PyNumber_InPlaceAdd(v, w); - } Py_DECREF(v); - skip_decref_v: Py_DECREF(w); SET_TOP(x); if (x != NULL) DISPATCH(); @@ -4515,98 +4501,6 @@ } } -static PyObject * -unicode_concatenate(PyObject *v, PyObject *w, - PyFrameObject *f, unsigned char *next_instr) -{ - /* This function implements 'variable += expr' when both arguments - are (Unicode) strings. */ - - w = PyUnicode_Concat(v, w); - Py_DECREF(v); - return w; - - /* XXX: This optimization is currently disabled as unicode objects in the - new flexible representation are not in-place resizable anymore. */ -#if 0 - Py_ssize_t v_len = PyUnicode_GET_SIZE(v); - Py_ssize_t w_len = PyUnicode_GET_SIZE(w); - Py_ssize_t new_len = v_len + w_len; - if (new_len < 0) { - PyErr_SetString(PyExc_OverflowError, - "strings are too large to concat"); - return NULL; - } - - if (Py_REFCNT(v) == 2) { - /* In the common case, there are 2 references to the value - * stored in 'variable' when the += is performed: one on the - * value stack (in 'v') and one still stored in the - * 'variable'. We try to delete the variable now to reduce - * the refcnt to 1. 
- */ - switch (*next_instr) { - case STORE_FAST: - { - int oparg = PEEKARG(); - PyObject **fastlocals = f->f_localsplus; - if (GETLOCAL(oparg) == v) - SETLOCAL(oparg, NULL); - break; - } - case STORE_DEREF: - { - PyObject **freevars = (f->f_localsplus + - f->f_code->co_nlocals); - PyObject *c = freevars[PEEKARG()]; - if (PyCell_GET(c) == v) - PyCell_Set(c, NULL); - break; - } - case STORE_NAME: - { - PyObject *names = f->f_code->co_names; - PyObject *name = GETITEM(names, PEEKARG()); - PyObject *locals = f->f_locals; - if (PyDict_CheckExact(locals) && - PyDict_GetItem(locals, name) == v) { - if (PyDict_DelItem(locals, name) != 0) { - PyErr_Clear(); - } - } - break; - } - } - } - - if (Py_REFCNT(v) == 1 && !PyUnicode_CHECK_INTERNED(v) && - !PyUnicode_IS_COMPACT((PyUnicodeObject *)v)) { - /* Now we own the last reference to 'v', so we can resize it - * in-place. - */ - if (PyUnicode_Resize(&v, new_len) != 0) { - /* XXX if PyUnicode_Resize() fails, 'v' has been - * deallocated so it cannot be put back into - * 'variable'. The MemoryError is raised when there - * is no value in 'variable', which might (very - * remotely) be a cause of incompatibilities. - */ - return NULL; - } - /* copy 'w' into the newly allocated area of 'v' */ - memcpy(PyUnicode_AS_UNICODE(v) + v_len, - PyUnicode_AS_UNICODE(w), w_len*sizeof(Py_UNICODE)); - return v; - } - else { - /* When in-place resizing is not an option. 
*/ - w = PyUnicode_Concat(v, w); - Py_DECREF(v); - return w; - } -#endif -} - #ifdef DYNAMIC_EXECUTION_PROFILE static PyObject * -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 02:49:28 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 02:49:28 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Optimize_PyUnicode=5FCopy?= =?utf8?q?=28=29=3A_don=27t_recompute_maximum_character?= Message-ID: http://hg.python.org/cpython/rev/b47e8c50a6a0 changeset: 72551:b47e8c50a6a0 user: Victor Stinner date: Sat Oct 01 01:34:32 2011 +0200 summary: Optimize PyUnicode_Copy(): don't recompute maximum character files: Objects/unicodeobject.c | 31 ++++++++++++++++++++++++++-- 1 files changed, 28 insertions(+), 3 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1212,15 +1212,40 @@ PyObject* PyUnicode_Copy(PyObject *unicode) { + Py_ssize_t size; + PyObject *copy; + void *data; + if (!PyUnicode_Check(unicode)) { PyErr_BadInternalCall(); return NULL; } if (PyUnicode_READY(unicode)) return NULL; - return PyUnicode_FromKindAndData(PyUnicode_KIND(unicode), - PyUnicode_DATA(unicode), - PyUnicode_GET_LENGTH(unicode)); + + size = PyUnicode_GET_LENGTH(unicode); + copy = PyUnicode_New(size, PyUnicode_MAX_CHAR_VALUE(unicode)); + if (!copy) + return NULL; + assert(PyUnicode_KIND(copy) == PyUnicode_KIND(unicode)); + + data = PyUnicode_DATA(unicode); + switch (PyUnicode_KIND(unicode)) + { + case PyUnicode_1BYTE_KIND: + memcpy(PyUnicode_1BYTE_DATA(copy), data, size); + break; + case PyUnicode_2BYTE_KIND: + memcpy(PyUnicode_2BYTE_DATA(copy), data, sizeof(Py_UCS2) * size); + break; + case PyUnicode_4BYTE_KIND: + memcpy(PyUnicode_4BYTE_DATA(copy), data, sizeof(Py_UCS4) * size); + break; + default: + assert(0); + break; + } + return copy; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 02:49:29 
2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 02:49:29 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Remove_private_substring=28?= =?utf8?q?=29_function=2C_reuse_public_PyUnicode=5FSubstring=28=29?= Message-ID: http://hg.python.org/cpython/rev/6a98d9bde900 changeset: 72552:6a98d9bde900 user: Victor Stinner date: Sat Oct 01 01:53:49 2011 +0200 summary: Remove private substring() function, reuse public PyUnicode_Substring() * PyUnicode_Substring() now fails if start or end is invalid * PyUnicode_Substring() reuses PyUnicode_Copy() for non-exact strings files: Objects/unicodeobject.c | 90 ++++++++-------------------- 1 files changed, 25 insertions(+), 65 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -283,9 +283,6 @@ /* --- Unicode Object ----------------------------------------------------- */ static PyObject * -substring(PyUnicodeObject *self, Py_ssize_t start, Py_ssize_t len); - -static PyObject * fixup(PyUnicodeObject *self, Py_UCS4 (*fixfct)(PyUnicodeObject *s)); Py_LOCAL_INLINE(char *) findchar(void *s, int kind, @@ -10445,51 +10442,7 @@ j++; } - if (i == 0 && j == len && PyUnicode_CheckExact(self)) { - Py_INCREF(self); - return (PyObject*)self; - } - else - return PyUnicode_Substring((PyObject*)self, i, j); -} - -/* Assumes an already ready self string. 
*/ - -static PyObject * -substring(PyUnicodeObject *self, Py_ssize_t start, Py_ssize_t len) -{ - const int kind = PyUnicode_KIND(self); - void *data = PyUnicode_DATA(self); - Py_UCS4 maxchar = 0; - Py_ssize_t i; - PyObject *unicode; - - if (start < 0 || len < 0 || (start + len) > PyUnicode_GET_LENGTH(self)) { - PyErr_BadInternalCall(); - return NULL; - } - - if (len == PyUnicode_GET_LENGTH(self) && PyUnicode_CheckExact(self)) { - Py_INCREF(self); - return (PyObject*)self; - } - - for (i = 0; i < len; ++i) { - const Py_UCS4 ch = PyUnicode_READ(kind, data, start + i); - if (ch > maxchar) - maxchar = ch; - } - - unicode = PyUnicode_New(len, maxchar); - if (unicode == NULL) - return NULL; - if (PyUnicode_CopyCharacters(unicode, 0, - (PyObject*)self, start, len) < 0) - { - Py_DECREF(unicode); - return NULL; - } - return unicode; + return PyUnicode_Substring((PyObject*)self, i, j); } PyObject* @@ -10497,24 +10450,34 @@ { unsigned char *data; int kind; - - if (start == 0 && end == PyUnicode_GET_LENGTH(self) - && PyUnicode_CheckExact(self)) + Py_ssize_t length; + + if (start == 0 && end == PyUnicode_GET_LENGTH(self)) { - Py_INCREF(self); - return (PyObject *)self; - } - - if ((end - start) == 1) + if (PyUnicode_CheckExact(self)) { + Py_INCREF(self); + return self; + } + else + return PyUnicode_Copy(self); + } + + length = end - start; + if (length == 1) return unicode_getitem((PyUnicodeObject*)self, start); + if (start < 0 || end < 0 || end > PyUnicode_GET_LENGTH(self)) { + PyErr_SetString(PyExc_IndexError, "string index out of range"); + return NULL; + } + if (PyUnicode_READY(self) == -1) return NULL; kind = PyUnicode_KIND(self); data = PyUnicode_1BYTE_DATA(self); return PyUnicode_FromKindAndData(kind, data + PyUnicode_KIND_SIZE(kind, start), - end-start); + length); } static PyObject * @@ -10546,12 +10509,7 @@ j++; } - if (i == 0 && j == len && PyUnicode_CheckExact(self)) { - Py_INCREF(self); - return (PyObject*)self; - } - else - return substring(self, i, j-i); + return 
PyUnicode_Substring((PyObject*)self, i, j); } @@ -11814,7 +11772,8 @@ Py_INCREF(self); return (PyObject *)self; } else if (step == 1) { - return substring(self, start, slicelength); + return PyUnicode_Substring((PyObject*)self, + start, start + slicelength); } else { source_buf = PyUnicode_AS_UNICODE((PyObject*)self); result_buf = (Py_UNICODE *)PyObject_MALLOC(slicelength* @@ -12051,7 +12010,8 @@ "incomplete format key"); goto onError; } - key = substring(uformat, keystart, keylen); + key = PyUnicode_Substring((PyObject*)uformat, + keystart, keystart + keylen); if (key == NULL) goto onError; if (args_owned) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 02:49:30 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 02:49:30 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_usage_of_PyUnicode=5FRE?= =?utf8?q?ADY_in_unicodeobject=2Ec?= Message-ID: http://hg.python.org/cpython/rev/beaa42dcbaec changeset: 72553:beaa42dcbaec user: Victor Stinner date: Sat Oct 01 02:14:59 2011 +0200 summary: Fix usage of PyUnicode_READY in unicodeobject.c files: Objects/unicodeobject.c | 35 +++++++++++----------------- 1 files changed, 14 insertions(+), 21 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -7933,7 +7933,7 @@ if (!str_obj || PyUnicode_READY(str_obj) == -1) return -1; sub_obj = (PyUnicodeObject*) PyUnicode_FromObject(substr); - if (!sub_obj || PyUnicode_READY(str_obj) == -1) { + if (!sub_obj || PyUnicode_READY(sub_obj) == -1) { Py_DECREF(str_obj); return -1; } @@ -8460,7 +8460,7 @@ if (separator == NULL) { /* fall back to a blank space separator */ sep = PyUnicode_FromOrdinal(' '); - if (!sep || PyUnicode_READY(sep) == -1) + if (!sep) goto onError; } else { @@ -9190,10 +9190,6 @@ Py_DECREF(uniobj); return 0; } - if (PyUnicode_READY(uniobj)) { - Py_DECREF(uniobj); - return 0; - } *fillcharloc = 
PyUnicode_READ_CHAR(uniobj, 0); Py_DECREF(uniobj); return 1; @@ -9212,12 +9208,12 @@ Py_ssize_t width; Py_UCS4 fillchar = ' '; + if (!PyArg_ParseTuple(args, "n|O&:center", &width, convert_uc, &fillchar)) + return NULL; + if (PyUnicode_READY(self) == -1) return NULL; - if (!PyArg_ParseTuple(args, "n|O&:center", &width, convert_uc, &fillchar)) - return NULL; - if (_PyUnicode_LENGTH(self) >= width && PyUnicode_CheckExact(self)) { Py_INCREF(self); return (PyObject*) self; @@ -9437,7 +9433,7 @@ return -1; str = PyUnicode_FromObject(container); - if (!str || PyUnicode_READY(container) == -1) { + if (!str || PyUnicode_READY(str) == -1) { Py_DECREF(sub); return -1; } @@ -9515,9 +9511,6 @@ return v; } - if (PyUnicode_READY(u) == -1 || PyUnicode_READY(v) == -1) - goto onError; - maxchar = PyUnicode_MAX_CHAR_VALUE(u); maxchar = Py_MAX(maxchar, PyUnicode_MAX_CHAR_VALUE(v)); @@ -10662,15 +10655,15 @@ PyObject *result; self = PyUnicode_FromObject(obj); - if (self == NULL || PyUnicode_READY(obj) == -1) + if (self == NULL || PyUnicode_READY(self) == -1) return NULL; str1 = PyUnicode_FromObject(subobj); - if (str1 == NULL || PyUnicode_READY(obj) == -1) { + if (str1 == NULL || PyUnicode_READY(str1) == -1) { Py_DECREF(self); return NULL; } str2 = PyUnicode_FromObject(replobj); - if (str2 == NULL || PyUnicode_READY(obj)) { + if (str2 == NULL || PyUnicode_READY(str2)) { Py_DECREF(self); Py_DECREF(str1); return NULL; @@ -10705,7 +10698,7 @@ if (str1 == NULL || PyUnicode_READY(str1) == -1) return NULL; str2 = PyUnicode_FromObject(str2); - if (str2 == NULL || PyUnicode_READY(str1) == -1) { + if (str2 == NULL || PyUnicode_READY(str2) == -1) { Py_DECREF(str1); return NULL; } @@ -10958,12 +10951,12 @@ Py_ssize_t width; Py_UCS4 fillchar = ' '; + if (!PyArg_ParseTuple(args, "n|O&:rjust", &width, convert_uc, &fillchar)) + return NULL; + if (PyUnicode_READY(self) == -1) return NULL; - if (!PyArg_ParseTuple(args, "n|O&:rjust", &width, convert_uc, &fillchar)) - return NULL; - if 
(_PyUnicode_LENGTH(self) >= width && PyUnicode_CheckExact(self)) { Py_INCREF(self); return (PyObject*) self; @@ -11032,7 +11025,7 @@ Py_ssize_t len1, len2; str_obj = PyUnicode_FromObject(str_in); - if (!str_obj || PyUnicode_READY(str_in) == -1) + if (!str_obj || PyUnicode_READY(str_obj) == -1) return NULL; sep_obj = PyUnicode_FromObject(sep_in); if (!sep_obj || PyUnicode_READY(sep_obj) == -1) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 02:49:30 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 02:49:30 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FCHARACTER=5FSIZ?= =?utf8?q?E=28=29=3A_add_a_reference_to_PyUnicode=5FKIND=5FSIZE=28=29?= Message-ID: http://hg.python.org/cpython/rev/2e9fb59a1484 changeset: 72554:2e9fb59a1484 user: Victor Stinner date: Sat Oct 01 02:39:37 2011 +0200 summary: PyUnicode_CHARACTER_SIZE(): add a reference to PyUnicode_KIND_SIZE() files: Include/unicodeobject.h | 7 +++++-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -338,7 +338,9 @@ /* Return the number of bytes the string uses to represent single characters, - this can be 1, 2 or 4. */ + this can be 1, 2 or 4. + + See also PyUnicode_KIND_SIZE(). */ #define PyUnicode_CHARACTER_SIZE(op) \ (1 << (PyUnicode_KIND(op) - 1)) @@ -378,8 +380,9 @@ _PyUnicode_NONCOMPACT_DATA(op)) /* Compute (index * char_size) where char_size is 2 ** (kind - 1). + The index is a character index, the result is a size in bytes. - The index is a character index, the result is a size in bytes. */ + See also PyUnicode_CHARACTER_SIZE(). */ #define PyUnicode_KIND_SIZE(kind, index) ((index) << ((kind) - 1)) /* In the access macros below, "kind" may be evaluated more than once. 
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 02:49:31 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 02:49:31 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_I_want_a_super_fast_=27a=27?= =?utf8?q?_*_n!?= Message-ID: http://hg.python.org/cpython/rev/ba3e9f5bcbf6 changeset: 72555:ba3e9f5bcbf6 user: Victor Stinner date: Sat Oct 01 02:47:29 2011 +0200 summary: I want a super fast 'a' * n! * Optimize unicode_repeat() for a special case with memset() * Simplify integer overflow checking; remove the second check because PyUnicode_New() already does it and uses a smaller limit (Py_ssize_t vs size_t) files: Objects/unicodeobject.c | 25 ++++++++++--------------- 1 files changed, 10 insertions(+), 15 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -10583,7 +10583,6 @@ { PyUnicodeObject *u; Py_ssize_t nchars, n; - size_t nbytes, char_size; if (len < 1) { Py_INCREF(unicode_empty); @@ -10599,32 +10598,28 @@ if (PyUnicode_READY(str) == -1) return NULL; - /* ensure # of chars needed doesn't overflow int and # of bytes - * needed doesn't overflow size_t - */ - nchars = len * PyUnicode_GET_LENGTH(str); - if (nchars / len != PyUnicode_GET_LENGTH(str)) { + if (len > PY_SSIZE_T_MAX / PyUnicode_GET_LENGTH(str)) { PyErr_SetString(PyExc_OverflowError, "repeated string is too long"); return NULL; } - char_size = PyUnicode_CHARACTER_SIZE(str); - nbytes = (nchars + 1) * char_size; - if (nbytes / char_size != (size_t)(nchars + 1)) { - PyErr_SetString(PyExc_OverflowError, - "repeated string is too long"); - return NULL; - } + nchars = len * PyUnicode_GET_LENGTH(str); + u = (PyUnicodeObject *)PyUnicode_New(nchars, PyUnicode_MAX_CHAR_VALUE(str)); if (!u) return NULL; + assert(PyUnicode_KIND(u) == PyUnicode_KIND(str)); if (PyUnicode_GET_LENGTH(str) == 1) { const int kind = PyUnicode_KIND(str); const Py_UCS4 fill_char = 
PyUnicode_READ(kind, PyUnicode_DATA(str), 0); void *to = PyUnicode_DATA(u); - for (n = 0; n < len; ++n) - PyUnicode_WRITE(kind, to, n, fill_char); + if (kind == PyUnicode_1BYTE_KIND) + memset(to, (unsigned char)fill_char, len); + else { + for (n = 0; n < len; ++n) + PyUnicode_WRITE(kind, to, n, fill_char); + } } else { /* number of characters copied this far */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 03:09:44 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 03:09:44 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FFromObject=28?= =?utf8?q?=29_ensures_that_its_output_is_a_ready_string?= Message-ID: http://hg.python.org/cpython/rev/58e3373f06d1 changeset: 72556:58e3373f06d1 user: Victor Stinner date: Sat Oct 01 03:09:33 2011 +0200 summary: PyUnicode_FromObject() ensures that its output is a ready string files: Objects/unicodeobject.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -2068,6 +2068,8 @@ /* XXX Perhaps we should make this API an alias of PyObject_Str() instead ?! 
*/ if (PyUnicode_CheckExact(obj)) { + if (PyUnicode_READY(obj)) + return NULL; Py_INCREF(obj); return obj; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 03:09:45 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 03:09:45 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Ooops=2C_avoid_a_division_b?= =?utf8?q?y_zero_in_unicode=5Frepeat=28=29?= Message-ID: http://hg.python.org/cpython/rev/17aba77fa99c changeset: 72557:17aba77fa99c user: Victor Stinner date: Sat Oct 01 03:09:58 2011 +0200 summary: Ooops, avoid a division by zero in unicode_repeat() files: Objects/unicodeobject.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -10600,7 +10600,7 @@ if (PyUnicode_READY(str) == -1) return NULL; - if (len > PY_SSIZE_T_MAX / PyUnicode_GET_LENGTH(str)) { + if (PyUnicode_GET_LENGTH(str) > PY_SSIZE_T_MAX / len) { PyErr_SetString(PyExc_OverflowError, "repeated string is too long"); return NULL; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 03:31:30 2011 From: python-checkins at python.org (benjamin.peterson) Date: Sat, 01 Oct 2011 03:31:30 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_remove_=22fast-path=22_for_?= =?utf8?q?=28i=29adding_strings?= Message-ID: http://hg.python.org/cpython/rev/6da962d77eb3 changeset: 72558:6da962d77eb3 user: Benjamin Peterson date: Fri Sep 30 21:31:21 2011 -0400 summary: remove "fast-path" for (i)adding strings These were just an artifact of the old unicode concatenation hack and likely just penalized other kinds of adding. Also, this fixes __(i)add__ on string subclasses. 
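The subclass fix can be demonstrated from pure Python; this is a minimal sketch mirroring the test_subclass_add test this commit adds to Lib/test/test_unicode.py. With the ceval fast path removed, `+` and `+=` on two str operands go through PyNumber_Add / PyNumber_InPlaceAdd and so honor subclass operator hooks instead of calling PyUnicode_Concat directly:

```python
class S(str):
    def __add__(self, other):
        # With the old BINARY_ADD fast path, str + str went straight to
        # PyUnicode_Concat and this override was silently skipped.
        return "3"

class T(str):
    def __iadd__(self, other):
        return "3"

assert S("4") + S("5") == "3"   # __add__ is now dispatched

t = T("1")
t += "4"                        # __iadd__ is now dispatched
assert t == "3"
```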
files: Lib/test/test_unicode.py | 12 ++++++++++++ Python/ceval.c | 10 ++-------- 2 files changed, 14 insertions(+), 8 deletions(-) diff --git a/Lib/test/test_unicode.py b/Lib/test/test_unicode.py --- a/Lib/test/test_unicode.py +++ b/Lib/test/test_unicode.py @@ -1760,6 +1760,18 @@ self.assertEqual(size, nchar) self.assertEqual(wchar, nonbmp + '\0') + def test_subclass_add(self): + class S(str): + def __add__(self, o): + return "3" + self.assertEqual(S("4") + S("5"), "3") + class S(str): + def __iadd__(self, o): + return "3" + s = S("1") + s += "4" + self.assertEqual(s, "3") + class StringModuleTest(unittest.TestCase): def test_formatter_parser(self): diff --git a/Python/ceval.c b/Python/ceval.c --- a/Python/ceval.c +++ b/Python/ceval.c @@ -1507,10 +1507,7 @@ TARGET(BINARY_ADD) w = POP(); v = TOP(); - if (PyUnicode_Check(v) && PyUnicode_Check(w)) - x = PyUnicode_Concat(v, w); - else - x = PyNumber_Add(v, w); + x = PyNumber_Add(v, w); Py_DECREF(v); Py_DECREF(w); SET_TOP(x); @@ -1662,10 +1659,7 @@ TARGET(INPLACE_ADD) w = POP(); v = TOP(); - if (PyUnicode_Check(v) && PyUnicode_Check(w)) - x = PyUnicode_Concat(v, w); - else - x = PyNumber_InPlaceAdd(v, w); + x = PyNumber_InPlaceAdd(v, w); Py_DECREF(v); Py_DECREF(w); SET_TOP(x); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 04:02:21 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 04:02:21 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_=5FPyUnicode=5FAsKind=28=29?= =?utf8?q?_is_*not*_part_of_the_stable_ABI?= Message-ID: http://hg.python.org/cpython/rev/fc7d2c6db61b changeset: 72559:fc7d2c6db61b user: Victor Stinner date: Sat Oct 01 03:57:28 2011 +0200 summary: _PyUnicode_AsKind() is *not* part of the stable ABI files: Include/unicodeobject.h | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -795,7 +795,9 @@ 
Py_ssize_t *size /* number of characters of the result */ ); +#ifndef Py_LIMITED_API PyAPI_FUNC(void*) _PyUnicode_AsKind(PyObject *s, unsigned int kind); +#endif #endif -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 04:02:21 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 04:02:21 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FSubstring=28=29?= =?utf8?q?_now_accepts_end_bigger_than_string_length?= Message-ID: http://hg.python.org/cpython/rev/45f1de829d70 changeset: 72560:45f1de829d70 user: Victor Stinner date: Sat Oct 01 03:55:54 2011 +0200 summary: PyUnicode_Substring() now accepts end bigger than string length Fix also a bug: call PyUnicode_READY() before reading string length. files: Objects/unicodeobject.c | 9 ++++++--- 1 files changed, 6 insertions(+), 3 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -10447,6 +10447,11 @@ int kind; Py_ssize_t length; + if (PyUnicode_READY(self) == -1) + return NULL; + + end = Py_MIN(end, PyUnicode_GET_LENGTH(self)); + if (start == 0 && end == PyUnicode_GET_LENGTH(self)) { if (PyUnicode_CheckExact(self)) { @@ -10461,13 +10466,11 @@ if (length == 1) return unicode_getitem((PyUnicodeObject*)self, start); - if (start < 0 || end < 0 || end > PyUnicode_GET_LENGTH(self)) { + if (start < 0 || end < 0) { PyErr_SetString(PyExc_IndexError, "string index out of range"); return NULL; } - if (PyUnicode_READY(self) == -1) - return NULL; kind = PyUnicode_KIND(self); data = PyUnicode_1BYTE_DATA(self); return PyUnicode_FromKindAndData(kind, -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Sat Oct 1 05:26:42 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Sat, 01 Oct 2011 05:26:42 +0200 Subject: [Python-checkins] Daily reference leaks (17aba77fa99c): sum=0 Message-ID: results for 17aba77fa99c on branch "default" 
-------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogZEmHtw', '-x'] From python-checkins at python.org Sat Oct 1 06:12:28 2011 From: python-checkins at python.org (benjamin.peterson) Date: Sat, 01 Oct 2011 06:12:28 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_remove_reference_to_non-exi?= =?utf8?q?stent_file?= Message-ID: http://hg.python.org/cpython/rev/bb0264220858 changeset: 72561:bb0264220858 parent: 72558:6da962d77eb3 user: Benjamin Peterson date: Sat Oct 01 00:11:09 2011 -0400 summary: remove reference to non-existent file files: Objects/unicodeobject.c | 3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1,8 +1,7 @@ /* Unicode implementation based on original code by Fredrik Lundh, -modified by Marc-Andre Lemburg according to the -Unicode Integration Proposal (see file Misc/unicode.txt). +modified by Marc-Andre Lemburg . Major speed upgrades to the method implementations at the Reykjavik NeedForSpeed sprint, by Fredrik Lundh and Andrew Dalke. 
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 06:12:29 2011 From: python-checkins at python.org (benjamin.peterson) Date: Sat, 01 Oct 2011 06:12:29 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_default_-=3E_default?= =?utf8?q?=29=3A_merge_heads?= Message-ID: http://hg.python.org/cpython/rev/cb9334cdff18 changeset: 72562:cb9334cdff18 parent: 72561:bb0264220858 parent: 72560:45f1de829d70 user: Benjamin Peterson date: Sat Oct 01 00:12:20 2011 -0400 summary: merge heads files: Include/unicodeobject.h | 2 ++ Objects/unicodeobject.c | 9 ++++++--- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -795,7 +795,9 @@ Py_ssize_t *size /* number of characters of the result */ ); +#ifndef Py_LIMITED_API PyAPI_FUNC(void*) _PyUnicode_AsKind(PyObject *s, unsigned int kind); +#endif #endif diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -10446,6 +10446,11 @@ int kind; Py_ssize_t length; + if (PyUnicode_READY(self) == -1) + return NULL; + + end = Py_MIN(end, PyUnicode_GET_LENGTH(self)); + if (start == 0 && end == PyUnicode_GET_LENGTH(self)) { if (PyUnicode_CheckExact(self)) { @@ -10460,13 +10465,11 @@ if (length == 1) return unicode_getitem((PyUnicodeObject*)self, start); - if (start < 0 || end < 0 || end > PyUnicode_GET_LENGTH(self)) { + if (start < 0 || end < 0) { PyErr_SetString(PyExc_IndexError, "string index out of range"); return NULL; } - if (PyUnicode_READY(self) == -1) - return NULL; kind = PyUnicode_KIND(self); data = PyUnicode_1BYTE_DATA(self); return PyUnicode_FromKindAndData(kind, -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 16:35:48 2011 From: python-checkins at python.org (martin.v.loewis) Date: Sat, 01 Oct 2011 16:35:48 +0200 Subject: [Python-checkins] 
=?utf8?q?cpython=3A_Issue_13085=3A_Fix_some_mem?= =?utf8?q?ory_leaks=2E_Patch_by_Stefan_Krah=2E?= Message-ID: http://hg.python.org/cpython/rev/1b203e741fb2 changeset: 72563:1b203e741fb2 user: Martin v. Löwis date: Sat Oct 01 16:35:40 2011 +0200 summary: Issue 13085: Fix some memory leaks. Patch by Stefan Krah. files: Objects/unicodeobject.c | 1 + Python/import.c | 30 ++++++++++++++++++---------- 2 files changed, 20 insertions(+), 11 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -9075,6 +9075,7 @@ PyUnicode_KIND_SIZE(rkind, slen-i)); } u = PyUnicode_FromKindAndData(rkind, res, new_size); + PyMem_Free(res); } if (srelease) PyMem_FREE(sbuf); diff --git a/Python/import.c b/Python/import.c --- a/Python/import.c +++ b/Python/import.c @@ -1564,8 +1564,10 @@ if (py == NULL) goto error; - if (_Py_stat(py, &statbuf) == 0 && S_ISREG(statbuf.st_mode)) + if (_Py_stat(py, &statbuf) == 0 && S_ISREG(statbuf.st_mode)) { + PyMem_Free(fileuni); return py; + } Py_DECREF(py); goto unchanged; @@ -3074,7 +3076,7 @@ Py_ssize_t len; Py_UCS4 *p; PyObject *fullname, *name, *result, *mark_name; - const Py_UCS4 *nameuni; + Py_UCS4 *nameuni; *p_outputname = NULL; @@ -3095,7 +3097,7 @@ if (len == 0) { PyErr_SetString(PyExc_ValueError, "Empty module name"); - return NULL; + goto error; } } else @@ -3104,7 +3106,7 @@ if (*p_buflen+len+1 >= bufsize) { PyErr_SetString(PyExc_ValueError, "Module name too long"); - return NULL; + goto error; } p = buf + *p_buflen; @@ -3119,12 +3121,12 @@ fullname = PyUnicode_FromKindAndData(PyUnicode_4BYTE_KIND, buf, *p_buflen); if (fullname == NULL) - return NULL; + goto error; name = PyUnicode_FromKindAndData(PyUnicode_4BYTE_KIND, p, len); if (name == NULL) { Py_DECREF(fullname); - return NULL; + goto error; } result = import_submodule(mod, name, fullname); Py_DECREF(fullname); @@ -3138,12 +3140,12 @@ buf, *p_buflen); if (mark_name == NULL) { Py_DECREF(result); - return NULL;
+ goto error; } if (mark_miss(mark_name) != 0) { Py_DECREF(result); Py_DECREF(mark_name); - return NULL; + goto error; } Py_DECREF(mark_name); Py_UCS4_strncpy(buf, nameuni, len); @@ -3154,13 +3156,13 @@ else Py_DECREF(name); if (result == NULL) - return NULL; + goto error; if (result == Py_None) { Py_DECREF(result); PyErr_Format(PyExc_ImportError, "No module named %R", inputname); - return NULL; + goto error; } if (dot != NULL) { @@ -3168,11 +3170,17 @@ dot+1, Py_UCS4_strlen(dot+1)); if (*p_outputname == NULL) { Py_DECREF(result); - return NULL; + goto error; } } +out: + PyMem_Free(nameuni); return result; + +error: + PyMem_Free(nameuni); + return NULL; } static int -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 16:45:25 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sat, 01 Oct 2011 16:45:25 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Backout_of_changeset_228fd2?= =?utf8?q?bd83a5_by_Nadeem_Vawda_in_branch_=27default=27=3A?= Message-ID: http://hg.python.org/cpython/rev/7fabd75a6ae4 changeset: 72564:7fabd75a6ae4 user: Antoine Pitrou date: Sat Oct 01 16:41:48 2011 +0200 summary: Backout of changeset 228fd2bd83a5 by Nadeem Vawda in branch 'default': Issue #12804: Prevent "make test" from using network resources. 
files: Tools/scripts/run_tests.py | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Tools/scripts/run_tests.py b/Tools/scripts/run_tests.py --- a/Tools/scripts/run_tests.py +++ b/Tools/scripts/run_tests.py @@ -37,7 +37,7 @@ if not any(is_multiprocess_flag(arg) for arg in regrtest_args): args.extend(['-j', '0']) # Use all CPU cores if not any(is_resource_use_flag(arg) for arg in regrtest_args): - args.extend(['-u', 'all,-largefile,-network,-urlfetch,-audio,-gui']) + args.extend(['-u', 'all,-largefile,-audio,-gui']) args.extend(regrtest_args) print(' '.join(args)) os.execv(sys.executable, args) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 16:53:44 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 16:53:44 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Add_=5FPyUnicode=5FUTF8=28?= =?utf8?q?=29_and_=5FPyUnicode=5FUTF8=5FLENGTH=28=29_macros?= Message-ID: http://hg.python.org/cpython/rev/4afab01f5374 changeset: 72565:4afab01f5374 user: Victor Stinner date: Sat Oct 01 16:48:13 2011 +0200 summary: Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros * Rename existing _PyUnicode_UTF8() macro to PyUnicode_UTF8() * Rename existing _PyUnicode_UTF8_LENGTH() macro to PyUnicode_UTF8_LENGTH() * PyUnicode_UTF8() and PyUnicode_UTF8_LENGTH() are more strict files: Objects/unicodeobject.c | 97 ++++++++++++++++------------ 1 files changed, 55 insertions(+), 42 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -104,14 +104,22 @@ } \ } while (0) -#define _PyUnicode_UTF8(op) \ - (PyUnicode_IS_COMPACT_ASCII(op) ? 
\ - ((char*)((PyASCIIObject*)(op) + 1)) : \ - ((PyCompactUnicodeObject*)(op))->utf8) +#define _PyUnicode_UTF8(op) \ + (((PyCompactUnicodeObject*)(op))->utf8) +#define PyUnicode_UTF8(op) \ + (assert(PyUnicode_Check(op)), \ + assert(PyUnicode_IS_READY(op)), \ + PyUnicode_IS_COMPACT_ASCII(op) ? \ + ((char*)((PyASCIIObject*)(op) + 1)) : \ + _PyUnicode_UTF8(op)) #define _PyUnicode_UTF8_LENGTH(op) \ - (PyUnicode_IS_COMPACT_ASCII(op) ? \ - ((PyASCIIObject*)(op))->length : \ - ((PyCompactUnicodeObject*)(op))->utf8_length) + (((PyCompactUnicodeObject*)(op))->utf8_length) +#define PyUnicode_UTF8_LENGTH(op) \ + (assert(PyUnicode_Check(op)), \ + assert(PyUnicode_IS_READY(op)), \ + PyUnicode_IS_COMPACT_ASCII(op) ? \ + ((PyASCIIObject*)(op))->length : \ + _PyUnicode_UTF8_LENGTH(op)) #define _PyUnicode_WSTR(op) (((PyASCIIObject*)(op))->wstr) #define _PyUnicode_WSTR_LENGTH(op) (((PyCompactUnicodeObject*)(op))->wstr_length) #define _PyUnicode_LENGTH(op) (((PyASCIIObject *)(op))->length) @@ -353,11 +361,11 @@ reset: if (unicode->data.any != NULL) { PyObject_FREE(unicode->data.any); - if (unicode->_base.utf8 && unicode->_base.utf8 != unicode->data.any) { - PyObject_FREE(unicode->_base.utf8); - } - unicode->_base.utf8 = NULL; - unicode->_base.utf8_length = 0; + if (_PyUnicode_UTF8(unicode) && _PyUnicode_UTF8(unicode) != unicode->data.any) { + PyObject_FREE(_PyUnicode_UTF8(unicode)); + } + _PyUnicode_UTF8(unicode) = NULL; + _PyUnicode_UTF8_LENGTH(unicode) = 0; unicode->data.any = NULL; _PyUnicode_LENGTH(unicode) = 0; _PyUnicode_STATE(unicode).interned = _PyUnicode_STATE(unicode).interned; @@ -435,8 +443,8 @@ _PyUnicode_STATE(unicode).ascii = 0; unicode->data.any = NULL; _PyUnicode_LENGTH(unicode) = 0; - unicode->_base.utf8 = NULL; - unicode->_base.utf8_length = 0; + _PyUnicode_UTF8(unicode) = NULL; + _PyUnicode_UTF8_LENGTH(unicode) = 0; return unicode; onError: @@ -452,7 +460,7 @@ /* Functions wrapping macros for use in debugger */ char *_PyUnicode_utf8(void *unicode){ - return 
_PyUnicode_UTF8(unicode); + return PyUnicode_UTF8(unicode); } void *_PyUnicode_compact_data(void *unicode) { @@ -799,7 +807,7 @@ assert(_PyUnicode_KIND(obj) == PyUnicode_WCHAR_KIND); assert(_PyUnicode_WSTR(unicode) != NULL); assert(unicode->data.any == NULL); - assert(unicode->_base.utf8 == NULL); + assert(_PyUnicode_UTF8(unicode) == NULL); /* Actually, it should neither be interned nor be anything else: */ assert(_PyUnicode_STATE(unicode).interned == SSTATE_NOT_INTERNED); @@ -825,12 +833,12 @@ _PyUnicode_LENGTH(unicode) = _PyUnicode_WSTR_LENGTH(unicode); _PyUnicode_STATE(unicode).kind = PyUnicode_1BYTE_KIND; if (maxchar < 128) { - unicode->_base.utf8 = unicode->data.any; - unicode->_base.utf8_length = _PyUnicode_WSTR_LENGTH(unicode); + _PyUnicode_UTF8(unicode) = unicode->data.any; + _PyUnicode_UTF8_LENGTH(unicode) = _PyUnicode_WSTR_LENGTH(unicode); } else { - unicode->_base.utf8 = NULL; - unicode->_base.utf8_length = 0; + _PyUnicode_UTF8(unicode) = NULL; + _PyUnicode_UTF8_LENGTH(unicode) = 0; } PyObject_FREE(_PyUnicode_WSTR(unicode)); _PyUnicode_WSTR(unicode) = NULL; @@ -848,8 +856,8 @@ PyUnicode_2BYTE_DATA(unicode)[_PyUnicode_WSTR_LENGTH(unicode)] = '\0'; _PyUnicode_LENGTH(unicode) = _PyUnicode_WSTR_LENGTH(unicode); _PyUnicode_STATE(unicode).kind = PyUnicode_2BYTE_KIND; - unicode->_base.utf8 = NULL; - unicode->_base.utf8_length = 0; + _PyUnicode_UTF8(unicode) = NULL; + _PyUnicode_UTF8_LENGTH(unicode) = 0; #else /* sizeof(wchar_t) == 4 */ unicode->data.any = PyObject_MALLOC( @@ -864,8 +872,8 @@ PyUnicode_2BYTE_DATA(unicode)[_PyUnicode_WSTR_LENGTH(unicode)] = '\0'; _PyUnicode_LENGTH(unicode) = _PyUnicode_WSTR_LENGTH(unicode); _PyUnicode_STATE(unicode).kind = PyUnicode_2BYTE_KIND; - unicode->_base.utf8 = NULL; - unicode->_base.utf8_length = 0; + _PyUnicode_UTF8(unicode) = NULL; + _PyUnicode_UTF8_LENGTH(unicode) = 0; PyObject_FREE(_PyUnicode_WSTR(unicode)); _PyUnicode_WSTR(unicode) = NULL; _PyUnicode_WSTR_LENGTH(unicode) = 0; @@ -884,8 +892,8 @@ } 
_PyUnicode_LENGTH(unicode) = length_wo_surrogates; _PyUnicode_STATE(unicode).kind = PyUnicode_4BYTE_KIND; - unicode->_base.utf8 = NULL; - unicode->_base.utf8_length = 0; + _PyUnicode_UTF8(unicode) = NULL; + _PyUnicode_UTF8_LENGTH(unicode) = 0; if (unicode_convert_wchar_to_ucs4(_PyUnicode_WSTR(unicode), end, unicode) < 0) { assert(0 && "ConvertWideCharToUCS4 failed"); @@ -899,8 +907,8 @@ unicode->data.any = _PyUnicode_WSTR(unicode); _PyUnicode_LENGTH(unicode) = _PyUnicode_WSTR_LENGTH(unicode); - unicode->_base.utf8 = NULL; - unicode->_base.utf8_length = 0; + _PyUnicode_UTF8(unicode) = NULL; + _PyUnicode_UTF8_LENGTH(unicode) = 0; _PyUnicode_STATE(unicode).kind = PyUnicode_4BYTE_KIND; #endif PyUnicode_4BYTE_DATA(unicode)[_PyUnicode_LENGTH(unicode)] = '\0'; @@ -935,8 +943,10 @@ (!PyUnicode_IS_READY(unicode) || _PyUnicode_WSTR(unicode) != PyUnicode_DATA(unicode))) PyObject_DEL(_PyUnicode_WSTR(unicode)); - if (_PyUnicode_UTF8(unicode) && _PyUnicode_UTF8(unicode) != PyUnicode_DATA(unicode)) - PyObject_DEL(unicode->_base.utf8); + if (!PyUnicode_IS_COMPACT_ASCII(unicode) + && _PyUnicode_UTF8(unicode) + && _PyUnicode_UTF8(unicode) != PyUnicode_DATA(unicode)) + PyObject_DEL(_PyUnicode_UTF8(unicode)); if (PyUnicode_IS_COMPACT(unicode)) { Py_TYPE(unicode)->tp_free((PyObject *)unicode); @@ -2648,23 +2658,24 @@ if (PyUnicode_READY(u) == -1) return NULL; - if (_PyUnicode_UTF8(unicode) == NULL) { + if (PyUnicode_UTF8(unicode) == NULL) { + assert(!PyUnicode_IS_COMPACT_ASCII(unicode)); bytes = _PyUnicode_AsUTF8String(unicode, "strict"); if (bytes == NULL) return NULL; - u->_base.utf8 = PyObject_MALLOC(PyBytes_GET_SIZE(bytes) + 1); - if (u->_base.utf8 == NULL) { + _PyUnicode_UTF8(u) = PyObject_MALLOC(PyBytes_GET_SIZE(bytes) + 1); + if (_PyUnicode_UTF8(u) == NULL) { Py_DECREF(bytes); return NULL; } - u->_base.utf8_length = PyBytes_GET_SIZE(bytes); - Py_MEMCPY(u->_base.utf8, PyBytes_AS_STRING(bytes), u->_base.utf8_length + 1); + _PyUnicode_UTF8_LENGTH(u) = PyBytes_GET_SIZE(bytes); + 
Py_MEMCPY(_PyUnicode_UTF8(u), PyBytes_AS_STRING(bytes), _PyUnicode_UTF8_LENGTH(u) + 1); Py_DECREF(bytes); } if (psize) - *psize = _PyUnicode_UTF8_LENGTH(unicode); - return _PyUnicode_UTF8(unicode); + *psize = PyUnicode_UTF8_LENGTH(unicode); + return PyUnicode_UTF8(unicode); } char* @@ -3997,9 +4008,9 @@ if (PyUnicode_READY(unicode) == -1) return NULL; - if (_PyUnicode_UTF8(unicode)) - return PyBytes_FromStringAndSize(_PyUnicode_UTF8(unicode), - _PyUnicode_UTF8_LENGTH(unicode)); + if (PyUnicode_UTF8(unicode)) + return PyBytes_FromStringAndSize(PyUnicode_UTF8(unicode), + PyUnicode_UTF8_LENGTH(unicode)); kind = PyUnicode_KIND(unicode); data = PyUnicode_DATA(unicode); @@ -11625,8 +11636,10 @@ (!PyUnicode_IS_READY(v) || (PyUnicode_DATA(v) != _PyUnicode_WSTR(v)))) size += (PyUnicode_WSTR_LENGTH(v) + 1) * sizeof(wchar_t); - if (_PyUnicode_UTF8(v) && _PyUnicode_UTF8(v) != PyUnicode_DATA(v)) - size += _PyUnicode_UTF8_LENGTH(v) + 1; + if (!PyUnicode_IS_COMPACT_ASCII(v) + && _PyUnicode_UTF8(v) + && _PyUnicode_UTF8(v) != PyUnicode_DATA(v)) + size += PyUnicode_UTF8_LENGTH(v) + 1; return PyLong_FromSsize_t(size); } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 16:53:45 2011 From: python-checkins at python.org (victor.stinner) Date: Sat, 01 Oct 2011 16:53:45 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Optimize_unicode=5Fsubtype?= =?utf8?q?=5Fnew=28=29=3A_don=27t_encode_to_wchar=5Ft_and_decode_from_wcha?= =?utf8?b?cl90?= Message-ID: http://hg.python.org/cpython/rev/756001a37949 changeset: 72566:756001a37949 user: Victor Stinner date: Sat Oct 01 16:16:43 2011 +0200 summary: Optimize unicode_subtype_new(): don't encode to wchar_t and decode from wchar_t Rewrite unicode_subtype_new(): allocate directly the right type. 
files: Lib/test/test_unicode.py | 11 +- Objects/unicodeobject.c | 121 +++++++++++++++++--------- 2 files changed, 85 insertions(+), 47 deletions(-) diff --git a/Lib/test/test_unicode.py b/Lib/test/test_unicode.py --- a/Lib/test/test_unicode.py +++ b/Lib/test/test_unicode.py @@ -1010,10 +1010,13 @@ class UnicodeSubclass(str): pass - self.assertEqual( - str(UnicodeSubclass('unicode subclass becomes unicode')), - 'unicode subclass becomes unicode' - ) + for text in ('ascii', '\xe9', '\u20ac', '\U0010FFFF'): + subclass = UnicodeSubclass(text) + self.assertEqual(str(subclass), text) + self.assertEqual(len(subclass), len(text)) + if text == 'ascii': + self.assertEqual(subclass.encode('ascii'), b'ascii') + self.assertEqual(subclass.encode('utf-8'), b'ascii') self.assertEqual( str('strings are converted to unicode'), diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -12410,56 +12410,91 @@ static PyObject * unicode_subtype_new(PyTypeObject *type, PyObject *args, PyObject *kwds) { - PyUnicodeObject *tmp, *pnew; - Py_ssize_t n; - PyObject *err = NULL; + PyUnicodeObject *unicode, *self; + Py_ssize_t length, char_size; + int share_wstr, share_utf8; + unsigned int kind; + void *data; assert(PyType_IsSubtype(type, &PyUnicode_Type)); - tmp = (PyUnicodeObject *)unicode_new(&PyUnicode_Type, args, kwds); - if (tmp == NULL) - return NULL; - assert(PyUnicode_Check(tmp)); - // TODO: Verify the PyUnicode_GET_SIZE does the right thing. - // it seems kind of strange that tp_alloc gets passed the size - // of the unicode string because there will follow another - // malloc. 
- pnew = (PyUnicodeObject *) type->tp_alloc(type, - n = PyUnicode_GET_SIZE(tmp)); - if (pnew == NULL) { - Py_DECREF(tmp); - return NULL; - } - _PyUnicode_WSTR(pnew) = (Py_UNICODE*) PyObject_MALLOC(sizeof(Py_UNICODE) * (n+1)); - if (_PyUnicode_WSTR(pnew) == NULL) { - err = PyErr_NoMemory(); + + unicode = (PyUnicodeObject *)unicode_new(&PyUnicode_Type, args, kwds); + if (unicode == NULL) + return NULL; + assert(PyUnicode_Check(unicode)); + if (PyUnicode_READY(unicode)) + return NULL; + + self = (PyUnicodeObject *) type->tp_alloc(type, 0); + if (self == NULL) { + Py_DECREF(unicode); + return NULL; + } + kind = PyUnicode_KIND(unicode); + length = PyUnicode_GET_LENGTH(unicode); + + _PyUnicode_LENGTH(self) = length; + _PyUnicode_HASH(self) = _PyUnicode_HASH(unicode); + _PyUnicode_STATE(self).interned = 0; + _PyUnicode_STATE(self).kind = kind; + _PyUnicode_STATE(self).compact = 0; + _PyUnicode_STATE(self).ascii = 0; + _PyUnicode_STATE(self).ready = 1; + _PyUnicode_WSTR(self) = NULL; + _PyUnicode_UTF8_LENGTH(self) = 0; + _PyUnicode_UTF8(self) = NULL; + _PyUnicode_WSTR_LENGTH(self) = 0; + self->data.any = NULL; + + share_utf8 = 0; + share_wstr = 0; + if (kind == PyUnicode_1BYTE_KIND) { + char_size = 1; + if (PyUnicode_MAX_CHAR_VALUE(unicode) < 128) + share_utf8 = 1; + } + else if (kind == PyUnicode_2BYTE_KIND) { + char_size = 2; + if (sizeof(wchar_t) == 2) + share_wstr = 1; + } + else { + assert(kind == PyUnicode_4BYTE_KIND); + char_size = 4; + if (sizeof(wchar_t) == 4) + share_wstr = 1; + } + + /* Ensure we won't overflow the length. 
*/ + if (length > (PY_SSIZE_T_MAX / char_size - 1)) { + PyErr_NoMemory(); goto onError; } - Py_UNICODE_COPY(_PyUnicode_WSTR(pnew), PyUnicode_AS_UNICODE(tmp), n+1); - _PyUnicode_WSTR_LENGTH(pnew) = n; - _PyUnicode_HASH(pnew) = _PyUnicode_HASH(tmp); - _PyUnicode_STATE(pnew).interned = 0; - _PyUnicode_STATE(pnew).kind = 0; - _PyUnicode_STATE(pnew).compact = 0; - _PyUnicode_STATE(pnew).ready = 0; - _PyUnicode_STATE(pnew).ascii = 0; - pnew->data.any = NULL; - _PyUnicode_LENGTH(pnew) = 0; - pnew->_base.utf8 = NULL; - pnew->_base.utf8_length = 0; - - if (PyUnicode_READY(pnew) == -1) { - PyObject_FREE(_PyUnicode_WSTR(pnew)); + data = PyObject_MALLOC((length + 1) * char_size); + if (data == NULL) { + PyErr_NoMemory(); goto onError; } - Py_DECREF(tmp); - return (PyObject *)pnew; - - onError: - _Py_ForgetReference((PyObject *)pnew); - PyObject_Del(pnew); - Py_DECREF(tmp); - return err; + self->data.any = data; + if (share_utf8) { + _PyUnicode_UTF8_LENGTH(self) = length; + _PyUnicode_UTF8(self) = data; + } + if (share_wstr) { + _PyUnicode_WSTR_LENGTH(self) = length; + _PyUnicode_WSTR(self) = (wchar_t *)data; + } + + Py_MEMCPY(data, PyUnicode_DATA(unicode), + PyUnicode_KIND_SIZE(kind, length + 1)); + Py_DECREF(unicode); + return (PyObject *)self; + +onError: + Py_DECREF(unicode); + Py_DECREF(self); + return NULL; } PyDoc_STRVAR(unicode_doc, -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 19:26:02 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sat, 01 Oct 2011 19:26:02 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogSXNzdWUgIzEzMDM0?= =?utf8?q?=3A_When_decoding_some_SSL_certificates=2C_the_subjectAltName_ex?= =?utf8?q?tension?= Message-ID: http://hg.python.org/cpython/rev/65e7f40fefd4 changeset: 72567:65e7f40fefd4 branch: 3.2 parent: 72531:160b52c9e8b3 user: Antoine Pitrou date: Sat Oct 01 19:20:25 2011 +0200 summary: Issue #13034: When decoding some SSL certificates, the subjectAltName extension could 
be unreported. files: Lib/test/nokia.pem | 31 +++++++++++++++++++++++++++++++ Lib/test/test_ssl.py | 26 ++++++++++++++++++++++++++ Misc/NEWS | 3 +++ Modules/_ssl.c | 2 +- 4 files changed, 61 insertions(+), 1 deletions(-) diff --git a/Lib/test/nokia.pem b/Lib/test/nokia.pem new file mode 100644 --- /dev/null +++ b/Lib/test/nokia.pem @@ -0,0 +1,31 @@ +# Certificate for projects.developer.nokia.com:443 (see issue 13034) +-----BEGIN CERTIFICATE----- +MIIFLDCCBBSgAwIBAgIQLubqdkCgdc7lAF9NfHlUmjANBgkqhkiG9w0BAQUFADCB +vDELMAkGA1UEBhMCVVMxFzAVBgNVBAoTDlZlcmlTaWduLCBJbmMuMR8wHQYDVQQL +ExZWZXJpU2lnbiBUcnVzdCBOZXR3b3JrMTswOQYDVQQLEzJUZXJtcyBvZiB1c2Ug +YXQgaHR0cHM6Ly93d3cudmVyaXNpZ24uY29tL3JwYSAoYykxMDE2MDQGA1UEAxMt +VmVyaVNpZ24gQ2xhc3MgMyBJbnRlcm5hdGlvbmFsIFNlcnZlciBDQSAtIEczMB4X +DTExMDkyMTAwMDAwMFoXDTEyMDkyMDIzNTk1OVowcTELMAkGA1UEBhMCRkkxDjAM +BgNVBAgTBUVzcG9vMQ4wDAYDVQQHFAVFc3BvbzEOMAwGA1UEChQFTm9raWExCzAJ +BgNVBAsUAkJJMSUwIwYDVQQDFBxwcm9qZWN0cy5kZXZlbG9wZXIubm9raWEuY29t +MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCr92w1bpHYSYxUEx8N/8Iddda2 +lYi+aXNtQfV/l2Fw9Ykv3Ipw4nLeGTj18FFlAZgMdPRlgrzF/NNXGw/9l3/qKdow +CypkQf8lLaxb9Ze1E/KKmkRJa48QTOqvo6GqKuTI6HCeGlG1RxDb8YSKcQWLiytn +yj3Wp4MgRQO266xmMQIDAQABo4IB9jCCAfIwQQYDVR0RBDowOIIccHJvamVjdHMu +ZGV2ZWxvcGVyLm5va2lhLmNvbYIYcHJvamVjdHMuZm9ydW0ubm9raWEuY29tMAkG +A1UdEwQCMAAwCwYDVR0PBAQDAgWgMEEGA1UdHwQ6MDgwNqA0oDKGMGh0dHA6Ly9T +VlJJbnRsLUczLWNybC52ZXJpc2lnbi5jb20vU1ZSSW50bEczLmNybDBEBgNVHSAE +PTA7MDkGC2CGSAGG+EUBBxcDMCowKAYIKwYBBQUHAgEWHGh0dHBzOi8vd3d3LnZl +cmlzaWduLmNvbS9ycGEwKAYDVR0lBCEwHwYJYIZIAYb4QgQBBggrBgEFBQcDAQYI +KwYBBQUHAwIwcgYIKwYBBQUHAQEEZjBkMCQGCCsGAQUFBzABhhhodHRwOi8vb2Nz +cC52ZXJpc2lnbi5jb20wPAYIKwYBBQUHMAKGMGh0dHA6Ly9TVlJJbnRsLUczLWFp +YS52ZXJpc2lnbi5jb20vU1ZSSW50bEczLmNlcjBuBggrBgEFBQcBDARiMGChXqBc +MFowWDBWFglpbWFnZS9naWYwITAfMAcGBSsOAwIaBBRLa7kolgYMu9BSOJsprEsH +iyEFGDAmFiRodHRwOi8vbG9nby52ZXJpc2lnbi5jb20vdnNsb2dvMS5naWYwDQYJ +KoZIhvcNAQEFBQADggEBACQuPyIJqXwUyFRWw9x5yDXgMW4zYFopQYOw/ItRY522 
+O5BsySTh56BWS6mQB07XVfxmYUGAvRQDA5QHpmY8jIlNwSmN3s8RKo+fAtiNRlcL +x/mWSfuMs3D/S6ev3D6+dpEMZtjrhOdctsarMKp8n/hPbwhAbg5hVjpkW5n8vz2y +0KxvvkA1AxpLwpVv7OlK17ttzIHw8bp9HTlHBU5s8bKz4a565V/a5HI0CSEv/+0y +ko4/ghTnZc1CkmUngKKeFMSah/mT/xAh8XnE2l1AazFa8UKuYki1e+ArHaGZc4ix +UYOtiRphwfuYQhRZ7qX9q2MMkCMI65XNK/SaFrAbbG0= +-----END CERTIFICATE----- diff --git a/Lib/test/test_ssl.py b/Lib/test/test_ssl.py --- a/Lib/test/test_ssl.py +++ b/Lib/test/test_ssl.py @@ -51,6 +51,7 @@ BADCERT = data_file("badcert.pem") WRONGCERT = data_file("XXXnonexisting.pem") BADKEY = data_file("badkey.pem") +NOKIACERT = data_file("nokia.pem") def handle_error(prefix): @@ -117,6 +118,31 @@ p = ssl._ssl._test_decode_cert(CERTFILE) if support.verbose: sys.stdout.write("\n" + pprint.pformat(p) + "\n") + self.assertEqual(p['issuer'], + ((('countryName', 'XY'),), + (('localityName', 'Castle Anthrax'),), + (('organizationName', 'Python Software Foundation'),), + (('commonName', 'localhost'),)) + ) + self.assertEqual(p['notAfter'], 'Oct 5 23:01:56 2020 GMT') + self.assertEqual(p['notBefore'], 'Oct 8 23:01:56 2010 GMT') + self.assertEqual(p['serialNumber'], 'D7C7381919AFC24E') + self.assertEqual(p['subject'], + ((('countryName', 'XY'),), + (('localityName', 'Castle Anthrax'),), + (('organizationName', 'Python Software Foundation'),), + (('commonName', 'localhost'),)) + ) + self.assertEqual(p['subjectAltName'], (('DNS', 'localhost'),)) + # Issue #13034: the subjectAltName in some certificates + # (notably projects.developer.nokia.com:443) wasn't parsed + p = ssl._ssl._test_decode_cert(NOKIACERT) + if support.verbose: + sys.stdout.write("\n" + pprint.pformat(p) + "\n") + self.assertEqual(p['subjectAltName'], + (('DNS', 'projects.developer.nokia.com'), + ('DNS', 'projects.forum.nokia.com')) + ) def test_DER_to_PEM(self): with open(SVN_PYTHON_ORG_ROOT_CERT, 'r') as f: diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -36,6 +36,9 @@ Library ------- +- Issue #13034: When decoding some SSL 
certificates, the subjectAltName + extension could be unreported. + - Issue #9871: Prevent IDLE 3 crash when given byte stings with invalid hex escape sequences, like b'\x0'. (Original patch by Claudiu Popa.) diff --git a/Modules/_ssl.c b/Modules/_ssl.c --- a/Modules/_ssl.c +++ b/Modules/_ssl.c @@ -578,7 +578,7 @@ /* get a memory buffer */ biobuf = BIO_new(BIO_s_mem()); - i = 0; + i = -1; while ((i = X509_get_ext_by_NID( certificate, NID_subject_alt_name, i)) >= 0) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 19:26:03 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sat, 01 Oct 2011 19:26:03 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Issue_=2313034=3A_When_decoding_some_SSL_certificates=2C_the?= =?utf8?q?_subjectAltName_extension?= Message-ID: http://hg.python.org/cpython/rev/90a06fbb1f85 changeset: 72568:90a06fbb1f85 parent: 72566:756001a37949 parent: 72567:65e7f40fefd4 user: Antoine Pitrou date: Sat Oct 01 19:22:30 2011 +0200 summary: Issue #13034: When decoding some SSL certificates, the subjectAltName extension could be unreported. 
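The one-line C fix in this changeset (`i = 0` changed to `i = -1`) works because `X509_get_ext_by_NID(cert, nid, lastpos)` returns the index of the first matching extension *strictly after* `lastpos` (OpenSSL documents `-1` as "start from the beginning"). Seeded with 0, a subjectAltName stored at extension index 0 is never visited. A small Python model of that search semantics (hypothetical helper names, not OpenSSL itself) makes the off-by-one visible:

```python
def get_ext_by_nid(extensions, nid, lastpos):
    """Model of X509_get_ext_by_NID: index of the first extension
    matching `nid` strictly after `lastpos`, or -1 if none."""
    for i in range(lastpos + 1, len(extensions)):
        if extensions[i] == nid:
            return i
    return -1

def collect(extensions, nid, start):
    """Gather all matching indices, mirroring the while loop in _ssl.c."""
    found, i = [], start
    while True:
        i = get_ext_by_nid(extensions, nid, i)
        if i < 0:
            break
        found.append(i)
    return found

exts = ["subjectAltName", "basicConstraints"]  # SAN sits at index 0
print(collect(exts, "subjectAltName", 0))    # buggy seed: finds nothing
print(collect(exts, "subjectAltName", -1))   # fixed seed: finds index 0
```

With the buggy seed the loop reports no subjectAltName at all when the extension happens to be stored first, which is exactly the "could be unreported" symptom of issue #13034.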
files: Lib/test/nokia.pem | 31 +++++++++++++++++++++++++++++++ Lib/test/test_ssl.py | 26 ++++++++++++++++++++++++++ Misc/NEWS | 3 +++ Modules/_ssl.c | 2 +- 4 files changed, 61 insertions(+), 1 deletions(-) diff --git a/Lib/test/nokia.pem b/Lib/test/nokia.pem new file mode 100644 --- /dev/null +++ b/Lib/test/nokia.pem @@ -0,0 +1,31 @@ +# Certificate for projects.developer.nokia.com:443 (see issue 13034) +-----BEGIN CERTIFICATE----- +MIIFLDCCBBSgAwIBAgIQLubqdkCgdc7lAF9NfHlUmjANBgkqhkiG9w0BAQUFADCB +vDELMAkGA1UEBhMCVVMxFzAVBgNVBAoTDlZlcmlTaWduLCBJbmMuMR8wHQYDVQQL +ExZWZXJpU2lnbiBUcnVzdCBOZXR3b3JrMTswOQYDVQQLEzJUZXJtcyBvZiB1c2Ug +YXQgaHR0cHM6Ly93d3cudmVyaXNpZ24uY29tL3JwYSAoYykxMDE2MDQGA1UEAxMt +VmVyaVNpZ24gQ2xhc3MgMyBJbnRlcm5hdGlvbmFsIFNlcnZlciBDQSAtIEczMB4X +DTExMDkyMTAwMDAwMFoXDTEyMDkyMDIzNTk1OVowcTELMAkGA1UEBhMCRkkxDjAM +BgNVBAgTBUVzcG9vMQ4wDAYDVQQHFAVFc3BvbzEOMAwGA1UEChQFTm9raWExCzAJ +BgNVBAsUAkJJMSUwIwYDVQQDFBxwcm9qZWN0cy5kZXZlbG9wZXIubm9raWEuY29t +MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCr92w1bpHYSYxUEx8N/8Iddda2 +lYi+aXNtQfV/l2Fw9Ykv3Ipw4nLeGTj18FFlAZgMdPRlgrzF/NNXGw/9l3/qKdow +CypkQf8lLaxb9Ze1E/KKmkRJa48QTOqvo6GqKuTI6HCeGlG1RxDb8YSKcQWLiytn +yj3Wp4MgRQO266xmMQIDAQABo4IB9jCCAfIwQQYDVR0RBDowOIIccHJvamVjdHMu +ZGV2ZWxvcGVyLm5va2lhLmNvbYIYcHJvamVjdHMuZm9ydW0ubm9raWEuY29tMAkG +A1UdEwQCMAAwCwYDVR0PBAQDAgWgMEEGA1UdHwQ6MDgwNqA0oDKGMGh0dHA6Ly9T +VlJJbnRsLUczLWNybC52ZXJpc2lnbi5jb20vU1ZSSW50bEczLmNybDBEBgNVHSAE +PTA7MDkGC2CGSAGG+EUBBxcDMCowKAYIKwYBBQUHAgEWHGh0dHBzOi8vd3d3LnZl +cmlzaWduLmNvbS9ycGEwKAYDVR0lBCEwHwYJYIZIAYb4QgQBBggrBgEFBQcDAQYI +KwYBBQUHAwIwcgYIKwYBBQUHAQEEZjBkMCQGCCsGAQUFBzABhhhodHRwOi8vb2Nz +cC52ZXJpc2lnbi5jb20wPAYIKwYBBQUHMAKGMGh0dHA6Ly9TVlJJbnRsLUczLWFp +YS52ZXJpc2lnbi5jb20vU1ZSSW50bEczLmNlcjBuBggrBgEFBQcBDARiMGChXqBc +MFowWDBWFglpbWFnZS9naWYwITAfMAcGBSsOAwIaBBRLa7kolgYMu9BSOJsprEsH +iyEFGDAmFiRodHRwOi8vbG9nby52ZXJpc2lnbi5jb20vdnNsb2dvMS5naWYwDQYJ +KoZIhvcNAQEFBQADggEBACQuPyIJqXwUyFRWw9x5yDXgMW4zYFopQYOw/ItRY522 
+O5BsySTh56BWS6mQB07XVfxmYUGAvRQDA5QHpmY8jIlNwSmN3s8RKo+fAtiNRlcL +x/mWSfuMs3D/S6ev3D6+dpEMZtjrhOdctsarMKp8n/hPbwhAbg5hVjpkW5n8vz2y +0KxvvkA1AxpLwpVv7OlK17ttzIHw8bp9HTlHBU5s8bKz4a565V/a5HI0CSEv/+0y +ko4/ghTnZc1CkmUngKKeFMSah/mT/xAh8XnE2l1AazFa8UKuYki1e+ArHaGZc4ix +UYOtiRphwfuYQhRZ7qX9q2MMkCMI65XNK/SaFrAbbG0= +-----END CERTIFICATE----- diff --git a/Lib/test/test_ssl.py b/Lib/test/test_ssl.py --- a/Lib/test/test_ssl.py +++ b/Lib/test/test_ssl.py @@ -54,6 +54,7 @@ BADCERT = data_file("badcert.pem") WRONGCERT = data_file("XXXnonexisting.pem") BADKEY = data_file("badkey.pem") +NOKIACERT = data_file("nokia.pem") def handle_error(prefix): @@ -130,6 +131,31 @@ p = ssl._ssl._test_decode_cert(CERTFILE) if support.verbose: sys.stdout.write("\n" + pprint.pformat(p) + "\n") + self.assertEqual(p['issuer'], + ((('countryName', 'XY'),), + (('localityName', 'Castle Anthrax'),), + (('organizationName', 'Python Software Foundation'),), + (('commonName', 'localhost'),)) + ) + self.assertEqual(p['notAfter'], 'Oct 5 23:01:56 2020 GMT') + self.assertEqual(p['notBefore'], 'Oct 8 23:01:56 2010 GMT') + self.assertEqual(p['serialNumber'], 'D7C7381919AFC24E') + self.assertEqual(p['subject'], + ((('countryName', 'XY'),), + (('localityName', 'Castle Anthrax'),), + (('organizationName', 'Python Software Foundation'),), + (('commonName', 'localhost'),)) + ) + self.assertEqual(p['subjectAltName'], (('DNS', 'localhost'),)) + # Issue #13034: the subjectAltName in some certificates + # (notably projects.developer.nokia.com:443) wasn't parsed + p = ssl._ssl._test_decode_cert(NOKIACERT) + if support.verbose: + sys.stdout.write("\n" + pprint.pformat(p) + "\n") + self.assertEqual(p['subjectAltName'], + (('DNS', 'projects.developer.nokia.com'), + ('DNS', 'projects.forum.nokia.com')) + ) def test_DER_to_PEM(self): with open(SVN_PYTHON_ORG_ROOT_CERT, 'r') as f: diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -294,6 +294,9 @@ Library ------- +- Issue #13034: When decoding some SSL 
certificates, the subjectAltName + extension could be unreported. + - Issue #9871: Prevent IDLE 3 crash when given byte stings with invalid hex escape sequences, like b'\x0'. (Original patch by Claudiu Popa.) diff --git a/Modules/_ssl.c b/Modules/_ssl.c --- a/Modules/_ssl.c +++ b/Modules/_ssl.c @@ -595,7 +595,7 @@ /* get a memory buffer */ biobuf = BIO_new(BIO_s_mem()); - i = 0; + i = -1; while ((i = X509_get_ext_by_NID( certificate, NID_subject_alt_name, i)) >= 0) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 19:34:38 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sat, 01 Oct 2011 19:34:38 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMi43KTogSXNzdWUgIzEzMDM0?= =?utf8?q?=3A_When_decoding_some_SSL_certificates=2C_the_subjectAltName_ex?= =?utf8?q?tension?= Message-ID: http://hg.python.org/cpython/rev/8e6694387c98 changeset: 72569:8e6694387c98 branch: 2.7 parent: 72530:dec00ae64ca8 user: Antoine Pitrou date: Sat Oct 01 19:30:58 2011 +0200 summary: Issue #13034: When decoding some SSL certificates, the subjectAltName extension could be unreported. 
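The test file touched here also exercises `test_DER_to_PEM`. The public `ssl` helpers behind that test simply wrap and unwrap base64 between the PEM header and footer, so they round-trip any byte string; the bytes below are an arbitrary stand-in rather than a real certificate:

```python
import ssl

# Arbitrary DER-ish bytes; DER_cert_to_PEM_cert does not validate ASN.1,
# it only base64-encodes and adds the BEGIN/END CERTIFICATE armor.
der = bytes(range(64))
pem = ssl.DER_cert_to_PEM_cert(der)
assert pem.startswith("-----BEGIN CERTIFICATE-----")
# PEM_cert_to_DER_cert strips the armor and base64-decodes the body.
assert ssl.PEM_cert_to_DER_cert(pem) == der
```

This is why the test suite can check the conversion without a network connection: the round trip is pure string manipulation.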
files: Lib/test/nokia.pem | 31 +++++++++++++++++++++++++++++++ Lib/test/test_ssl.py | 24 ++++++++++++++++++++++-- Modules/_ssl.c | 2 +- 3 files changed, 54 insertions(+), 3 deletions(-) diff --git a/Lib/test/nokia.pem b/Lib/test/nokia.pem new file mode 100644 --- /dev/null +++ b/Lib/test/nokia.pem @@ -0,0 +1,31 @@ +# Certificate for projects.developer.nokia.com:443 (see issue 13034) +-----BEGIN CERTIFICATE----- +MIIFLDCCBBSgAwIBAgIQLubqdkCgdc7lAF9NfHlUmjANBgkqhkiG9w0BAQUFADCB +vDELMAkGA1UEBhMCVVMxFzAVBgNVBAoTDlZlcmlTaWduLCBJbmMuMR8wHQYDVQQL +ExZWZXJpU2lnbiBUcnVzdCBOZXR3b3JrMTswOQYDVQQLEzJUZXJtcyBvZiB1c2Ug +YXQgaHR0cHM6Ly93d3cudmVyaXNpZ24uY29tL3JwYSAoYykxMDE2MDQGA1UEAxMt +VmVyaVNpZ24gQ2xhc3MgMyBJbnRlcm5hdGlvbmFsIFNlcnZlciBDQSAtIEczMB4X +DTExMDkyMTAwMDAwMFoXDTEyMDkyMDIzNTk1OVowcTELMAkGA1UEBhMCRkkxDjAM +BgNVBAgTBUVzcG9vMQ4wDAYDVQQHFAVFc3BvbzEOMAwGA1UEChQFTm9raWExCzAJ +BgNVBAsUAkJJMSUwIwYDVQQDFBxwcm9qZWN0cy5kZXZlbG9wZXIubm9raWEuY29t +MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCr92w1bpHYSYxUEx8N/8Iddda2 +lYi+aXNtQfV/l2Fw9Ykv3Ipw4nLeGTj18FFlAZgMdPRlgrzF/NNXGw/9l3/qKdow +CypkQf8lLaxb9Ze1E/KKmkRJa48QTOqvo6GqKuTI6HCeGlG1RxDb8YSKcQWLiytn +yj3Wp4MgRQO266xmMQIDAQABo4IB9jCCAfIwQQYDVR0RBDowOIIccHJvamVjdHMu +ZGV2ZWxvcGVyLm5va2lhLmNvbYIYcHJvamVjdHMuZm9ydW0ubm9raWEuY29tMAkG +A1UdEwQCMAAwCwYDVR0PBAQDAgWgMEEGA1UdHwQ6MDgwNqA0oDKGMGh0dHA6Ly9T +VlJJbnRsLUczLWNybC52ZXJpc2lnbi5jb20vU1ZSSW50bEczLmNybDBEBgNVHSAE +PTA7MDkGC2CGSAGG+EUBBxcDMCowKAYIKwYBBQUHAgEWHGh0dHBzOi8vd3d3LnZl +cmlzaWduLmNvbS9ycGEwKAYDVR0lBCEwHwYJYIZIAYb4QgQBBggrBgEFBQcDAQYI +KwYBBQUHAwIwcgYIKwYBBQUHAQEEZjBkMCQGCCsGAQUFBzABhhhodHRwOi8vb2Nz +cC52ZXJpc2lnbi5jb20wPAYIKwYBBQUHMAKGMGh0dHA6Ly9TVlJJbnRsLUczLWFp +YS52ZXJpc2lnbi5jb20vU1ZSSW50bEczLmNlcjBuBggrBgEFBQcBDARiMGChXqBc +MFowWDBWFglpbWFnZS9naWYwITAfMAcGBSsOAwIaBBRLa7kolgYMu9BSOJsprEsH +iyEFGDAmFiRodHRwOi8vbG9nby52ZXJpc2lnbi5jb20vdnNsb2dvMS5naWYwDQYJ +KoZIhvcNAQEFBQADggEBACQuPyIJqXwUyFRWw9x5yDXgMW4zYFopQYOw/ItRY522 +O5BsySTh56BWS6mQB07XVfxmYUGAvRQDA5QHpmY8jIlNwSmN3s8RKo+fAtiNRlcL 
+x/mWSfuMs3D/S6ev3D6+dpEMZtjrhOdctsarMKp8n/hPbwhAbg5hVjpkW5n8vz2y +0KxvvkA1AxpLwpVv7OlK17ttzIHw8bp9HTlHBU5s8bKz4a565V/a5HI0CSEv/+0y +ko4/ghTnZc1CkmUngKKeFMSah/mT/xAh8XnE2l1AazFa8UKuYki1e+ArHaGZc4ix +UYOtiRphwfuYQhRZ7qX9q2MMkCMI65XNK/SaFrAbbG0= +-----END CERTIFICATE----- diff --git a/Lib/test/test_ssl.py b/Lib/test/test_ssl.py --- a/Lib/test/test_ssl.py +++ b/Lib/test/test_ssl.py @@ -110,6 +110,23 @@ p = ssl._ssl._test_decode_cert(CERTFILE, False) if test_support.verbose: sys.stdout.write("\n" + pprint.pformat(p) + "\n") + self.assertEqual(p['subject'], + ((('countryName', u'US'),), + (('stateOrProvinceName', u'Delaware'),), + (('localityName', u'Wilmington'),), + (('organizationName', u'Python Software Foundation'),), + (('organizationalUnitName', u'SSL'),), + (('commonName', u'somemachine.python.org'),)), + ) + # Issue #13034: the subjectAltName in some certificates + # (notably projects.developer.nokia.com:443) wasn't parsed + p = ssl._ssl._test_decode_cert(NOKIACERT) + if test_support.verbose: + sys.stdout.write("\n" + pprint.pformat(p) + "\n") + self.assertEqual(p['subjectAltName'], + (('DNS', 'projects.developer.nokia.com'), + ('DNS', 'projects.forum.nokia.com')) + ) def test_DER_to_PEM(self): with open(SVN_PYTHON_ORG_ROOT_CERT, 'r') as f: @@ -1329,15 +1346,18 @@ def test_main(verbose=False): - global CERTFILE, SVN_PYTHON_ORG_ROOT_CERT + global CERTFILE, SVN_PYTHON_ORG_ROOT_CERT, NOKIACERT CERTFILE = os.path.join(os.path.dirname(__file__) or os.curdir, "keycert.pem") SVN_PYTHON_ORG_ROOT_CERT = os.path.join( os.path.dirname(__file__) or os.curdir, "https_svn_python_org_root.pem") + NOKIACERT = os.path.join(os.path.dirname(__file__) or os.curdir, + "nokia.pem") if (not os.path.exists(CERTFILE) or - not os.path.exists(SVN_PYTHON_ORG_ROOT_CERT)): + not os.path.exists(SVN_PYTHON_ORG_ROOT_CERT) or + not os.path.exists(NOKIACERT)): raise test_support.TestFailed("Can't read certificate files!") tests = [BasicTests, BasicSocketTests] diff --git a/Modules/_ssl.c 
b/Modules/_ssl.c --- a/Modules/_ssl.c +++ b/Modules/_ssl.c @@ -702,7 +702,7 @@ /* get a memory buffer */ biobuf = BIO_new(BIO_s_mem()); - i = 0; + i = -1; while ((i = X509_get_ext_by_NID( certificate, NID_subject_alt_name, i)) >= 0) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 22:49:14 2011 From: python-checkins at python.org (r.david.murray) Date: Sat, 01 Oct 2011 22:49:14 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogIzQxNDc6IG1pbmlk?= =?utf8?q?om=27s_toprettyxml_no_longer_adds_whitespace_to_text_nodes=2E?= Message-ID: http://hg.python.org/cpython/rev/086ca132e161 changeset: 72570:086ca132e161 branch: 3.2 parent: 72567:65e7f40fefd4 user: R David Murray date: Sat Oct 01 16:19:51 2011 -0400 summary: #4147: minidom's toprettyxml no longer adds whitespace to text nodes. Patch by Dan Kenigsberg. files: Lib/test/test_minidom.py | 7 +++++++ Lib/xml/dom/minidom.py | 6 ++++-- Misc/ACKS | 1 + Misc/NEWS | 2 ++ 4 files changed, 14 insertions(+), 2 deletions(-) diff --git a/Lib/test/test_minidom.py b/Lib/test/test_minidom.py --- a/Lib/test/test_minidom.py +++ b/Lib/test/test_minidom.py @@ -446,6 +446,13 @@ dom.unlink() self.confirm(domstr == str.replace("\n", "\r\n")) + def test_toPrettyXML_perserves_content_of_text_node(self): + str = 'B' + dom = parseString(str) + dom2 = parseString(dom.toprettyxml()) + self.assertEqual(dom.childNodes[0].childNodes[0].toxml(), + dom2.childNodes[0].childNodes[0].toxml()) + def testProcessingInstruction(self): dom = parseString('') pi = dom.documentElement.firstChild diff --git a/Lib/xml/dom/minidom.py b/Lib/xml/dom/minidom.py --- a/Lib/xml/dom/minidom.py +++ b/Lib/xml/dom/minidom.py @@ -836,7 +836,9 @@ _write_data(writer, attrs[a_name].value) writer.write("\"") if self.childNodes: - writer.write(">%s"%(newl)) + writer.write(">") + if self.childNodes[0].nodeType != Node.TEXT_NODE: + writer.write(newl) for node in self.childNodes: 
node.writexml(writer,indent+addindent,addindent,newl) writer.write("%s%s" % (indent,self.tagName,newl)) @@ -1061,7 +1063,7 @@ return newText def writexml(self, writer, indent="", addindent="", newl=""): - _write_data(writer, "%s%s%s"%(indent, self.data, newl)) + _write_data(writer, self.data) # DOM Level 3 (WD 9 April 2002) diff --git a/Misc/ACKS b/Misc/ACKS --- a/Misc/ACKS +++ b/Misc/ACKS @@ -468,6 +468,7 @@ Hiroaki Kawai Sebastien Keim Ryan Kelly +Dan Kenigsberg Robert Kern Randall Kern Magnus Kessler diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -36,6 +36,8 @@ Library ------- +- Issue #4147: minidom's toprettyxml no longer adds whitespace to text nodes. + - Issue #13034: When decoding some SSL certificates, the subjectAltName extension could be unreported. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 22:49:15 2011 From: python-checkins at python.org (r.david.murray) Date: Sat, 01 Oct 2011 22:49:15 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_merge_=234147=3A_minidom=27s_toprettyxml_no_longer_adds_whit?= =?utf8?q?espace_to_text_nodes=2E?= Message-ID: http://hg.python.org/cpython/rev/fa0b1e50270f changeset: 72571:fa0b1e50270f parent: 72568:90a06fbb1f85 parent: 72570:086ca132e161 user: R David Murray date: Sat Oct 01 16:22:35 2011 -0400 summary: merge #4147: minidom's toprettyxml no longer adds whitespace to text nodes. 
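The behavior change above (writing a text node's data bare, with no added indent or newline) can be observed from Python code. On an interpreter whose minidom carries this fix, pretty-printing no longer alters text-node content, so a round trip through `toprettyxml()` preserves it; this sketch follows the shape of the new test:

```python
from xml.dom.minidom import parseString

dom = parseString('<A><B>text</B></A>')
pretty = dom.toprettyxml(indent="  ")
dom2 = parseString(pretty)

# With the fix, the text node's data survives pretty-printing unchanged;
# before it, toprettyxml() injected indentation and a newline around it.
assert (dom.getElementsByTagName('B')[0].firstChild.data ==
        dom2.getElementsByTagName('B')[0].firstChild.data == 'text')
```

Comparing the re-parsed document rather than the raw string keeps the check independent of the exact indentation style `toprettyxml()` uses elsewhere.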
files: Lib/test/test_minidom.py | 7 +++++++ Lib/xml/dom/minidom.py | 6 ++++-- Misc/ACKS | 1 + Misc/NEWS | 2 ++ 4 files changed, 14 insertions(+), 2 deletions(-) diff --git a/Lib/test/test_minidom.py b/Lib/test/test_minidom.py --- a/Lib/test/test_minidom.py +++ b/Lib/test/test_minidom.py @@ -467,6 +467,13 @@ dom.unlink() self.confirm(domstr == str.replace("\n", "\r\n")) + def test_toPrettyXML_perserves_content_of_text_node(self): + str = 'B' + dom = parseString(str) + dom2 = parseString(dom.toprettyxml()) + self.assertEqual(dom.childNodes[0].childNodes[0].toxml(), + dom2.childNodes[0].childNodes[0].toxml()) + def testProcessingInstruction(self): dom = parseString('') pi = dom.documentElement.firstChild diff --git a/Lib/xml/dom/minidom.py b/Lib/xml/dom/minidom.py --- a/Lib/xml/dom/minidom.py +++ b/Lib/xml/dom/minidom.py @@ -836,7 +836,9 @@ _write_data(writer, attrs[a_name].value) writer.write("\"") if self.childNodes: - writer.write(">%s"%(newl)) + writer.write(">") + if self.childNodes[0].nodeType != Node.TEXT_NODE: + writer.write(newl) for node in self.childNodes: node.writexml(writer,indent+addindent,addindent,newl) writer.write("%s%s" % (indent,self.tagName,newl)) @@ -1061,7 +1063,7 @@ return newText def writexml(self, writer, indent="", addindent="", newl=""): - _write_data(writer, "%s%s%s"%(indent, self.data, newl)) + _write_data(writer, self.data) # DOM Level 3 (WD 9 April 2002) diff --git a/Misc/ACKS b/Misc/ACKS --- a/Misc/ACKS +++ b/Misc/ACKS @@ -496,6 +496,7 @@ Hiroaki Kawai Sebastien Keim Ryan Kelly +Dan Kenigsberg Robert Kern Randall Kern Magnus Kessler diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -294,6 +294,8 @@ Library ------- +- Issue #4147: minidom's toprettyxml no longer adds whitespace to text nodes. + - Issue #13034: When decoding some SSL certificates, the subjectAltName extension could be unreported. 
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sat Oct 1 22:49:37 2011 From: python-checkins at python.org (r.david.murray) Date: Sat, 01 Oct 2011 22:49:37 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMi43KTogIzQxNDc6IG1pbmlk?= =?utf8?q?om=27s_toprettyxml_no_longer_adds_whitespace_to_text_nodes=2E?= Message-ID: http://hg.python.org/cpython/rev/406c5b69cb1b changeset: 72572:406c5b69cb1b branch: 2.7 parent: 72569:8e6694387c98 user: R David Murray date: Sat Oct 01 16:49:25 2011 -0400 summary: #4147: minidom's toprettyxml no longer adds whitespace to text nodes. Patch by Dan Kenigsberg. files: Lib/test/test_minidom.py | 7 +++++++ Lib/xml/dom/minidom.py | 6 ++++-- 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/Lib/test/test_minidom.py b/Lib/test/test_minidom.py --- a/Lib/test/test_minidom.py +++ b/Lib/test/test_minidom.py @@ -439,6 +439,13 @@ dom.unlink() self.confirm(domstr == str.replace("\n", "\r\n")) + def test_toPrettyXML_perserves_content_of_text_node(self): + str = 'B' + dom = parseString(str) + dom2 = parseString(dom.toprettyxml()) + self.assertEqual(dom.childNodes[0].childNodes[0].toxml(), + dom2.childNodes[0].childNodes[0].toxml()) + def testProcessingInstruction(self): dom = parseString('') pi = dom.documentElement.firstChild diff --git a/Lib/xml/dom/minidom.py b/Lib/xml/dom/minidom.py --- a/Lib/xml/dom/minidom.py +++ b/Lib/xml/dom/minidom.py @@ -806,7 +806,9 @@ _write_data(writer, attrs[a_name].value) writer.write("\"") if self.childNodes: - writer.write(">%s"%(newl)) + writer.write(">") + if self.childNodes[0].nodeType != Node.TEXT_NODE: + writer.write(newl) for node in self.childNodes: node.writexml(writer,indent+addindent,addindent,newl) writer.write("%s%s" % (indent,self.tagName,newl)) @@ -1031,7 +1033,7 @@ return newText def writexml(self, writer, indent="", addindent="", newl=""): - _write_data(writer, "%s%s%s"%(indent, self.data, newl)) + _write_data(writer, self.data) # DOM Level 3 (WD 9 
April 2002) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Oct 2 01:14:15 2011 From: python-checkins at python.org (victor.stinner) Date: Sun, 02 Oct 2011 01:14:15 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FFromKindAndData?= =?utf8?q?=28=29_raises_a_ValueError_if_the_kind_is_unknown?= Message-ID: http://hg.python.org/cpython/rev/9124a00df142 changeset: 72573:9124a00df142 parent: 72571:fa0b1e50270f user: Victor Stinner date: Sat Oct 01 23:48:37 2011 +0200 summary: PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown files: Objects/unicodeobject.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1211,7 +1211,7 @@ case PyUnicode_4BYTE_KIND: return _PyUnicode_FromUCS4(buffer, size); } - assert(0); + PyErr_SetString(PyExc_ValueError, "invalid kind"); return NULL; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Oct 2 01:14:16 2011 From: python-checkins at python.org (victor.stinner) Date: Sun, 02 Oct 2011 01:14:16 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FReadChar=28=29_?= =?utf8?q?raises_a_IndexError_if_the_index_in_invalid?= Message-ID: http://hg.python.org/cpython/rev/ae2b07f9ede6 changeset: 72574:ae2b07f9ede6 user: Victor Stinner date: Sun Oct 02 00:25:40 2011 +0200 summary: PyUnicode_ReadChar() raises a IndexError if the index in invalid unicode_getitem() reuses PyUnicode_ReadChar() files: Objects/unicodeobject.c | 29 +++++++++++++---------------- 1 files changed, 13 insertions(+), 16 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -2840,8 +2840,12 @@ Py_UCS4 PyUnicode_ReadChar(PyObject *unicode, Py_ssize_t index) { - if (!PyUnicode_Check(unicode) || PyUnicode_READY(unicode) != -1) { - return 
PyErr_BadArgument(); + if (!PyUnicode_Check(unicode) || PyUnicode_READY(unicode) == -1) { + PyErr_BadArgument(); + return (Py_UCS4)-1; + } + if (index < 0 || index >= _PyUnicode_LENGTH(unicode)) { + PyErr_SetString(PyExc_IndexError, "string index out of range"); return (Py_UCS4)-1; } return PyUnicode_READ_CHAR(unicode, index); @@ -9808,18 +9812,11 @@ } static PyObject * -unicode_getitem(PyUnicodeObject *self, Py_ssize_t index) -{ - Py_UCS4 ch; - - if (PyUnicode_READY(self) == -1) - return NULL; - if (index < 0 || index >= _PyUnicode_LENGTH(self)) { - PyErr_SetString(PyExc_IndexError, "string index out of range"); - return NULL; - } - - ch = PyUnicode_READ(PyUnicode_KIND(self), PyUnicode_DATA(self), index); +unicode_getitem(PyObject *self, Py_ssize_t index) +{ + Py_UCS4 ch = PyUnicode_ReadChar(self, index); + if (ch == (Py_UCS4)-1) + return NULL; return PyUnicode_FromOrdinal(ch); } @@ -10475,7 +10472,7 @@ length = end - start; if (length == 1) - return unicode_getitem((PyUnicodeObject*)self, start); + return unicode_getitem(self, start); if (start < 0 || end < 0) { PyErr_SetString(PyExc_IndexError, "string index out of range"); @@ -11758,7 +11755,7 @@ return NULL; if (i < 0) i += PyUnicode_GET_LENGTH(self); - return unicode_getitem(self, i); + return unicode_getitem((PyObject*)self, i); } else if (PySlice_Check(item)) { Py_ssize_t start, stop, step, slicelength, cur, i; const Py_UNICODE* source_buf; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Oct 2 01:14:17 2011 From: python-checkins at python.org (victor.stinner) Date: Sun, 02 Oct 2011 01:14:17 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FWriteChar=28=29?= =?utf8?q?_raises_IndexError_on_invalid_index?= Message-ID: http://hg.python.org/cpython/rev/99aa46107a22 changeset: 72575:99aa46107a22 user: Victor Stinner date: Sun Oct 02 00:34:53 2011 +0200 summary: PyUnicode_WriteChar() raises IndexError on invalid index PyUnicode_WriteChar() raises also a 
ValueError if the string has more than 1 reference. files: Include/unicodeobject.h | 4 +++- Objects/unicodeobject.c | 28 +++++++++++++++++++++------- 2 files changed, 24 insertions(+), 8 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -647,7 +647,9 @@ ); /* Write a character to the string. The string must have been created through - PyUnicode_New, must not be shared, and must not have been hashed yet. */ + PyUnicode_New, must not be shared, and must not have been hashed yet. + + Return 0 on success, -1 on error. */ PyAPI_FUNC(int) PyUnicode_WriteChar( PyObject *unicode, diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -622,6 +622,19 @@ } #endif +static int +_PyUnicode_Dirty(PyObject *unicode) +{ + assert(PyUnicode_Check(unicode)); + if (Py_REFCNT(unicode) != 1) { + PyErr_SetString(PyExc_ValueError, + "Cannot modify a string having more than 1 reference"); + return -1; + } + _PyUnicode_DIRTY(unicode); + return 0; +} + Py_ssize_t PyUnicode_CopyCharacters(PyObject *to, Py_ssize_t to_start, PyObject *from, Py_ssize_t from_start, @@ -651,12 +664,8 @@ if (how_many == 0) return 0; - if (Py_REFCNT(to) != 1) { - PyErr_SetString(PyExc_ValueError, - "Cannot modify a string having more than 1 reference"); + if (_PyUnicode_Dirty(to)) return -1; - } - _PyUnicode_DIRTY(to); from_kind = PyUnicode_KIND(from); from_data = PyUnicode_DATA(from); @@ -2855,10 +2864,15 @@ PyUnicode_WriteChar(PyObject *unicode, Py_ssize_t index, Py_UCS4 ch) { if (!PyUnicode_Check(unicode) || !PyUnicode_IS_COMPACT(unicode)) { - return PyErr_BadArgument(); + PyErr_BadArgument(); return -1; } - + if (index < 0 || index >= _PyUnicode_LENGTH(unicode)) { + PyErr_SetString(PyExc_IndexError, "string index out of range"); + return -1; + } + if (_PyUnicode_Dirty(unicode)) + return -1; PyUnicode_WRITE(PyUnicode_KIND(unicode), PyUnicode_DATA(unicode), 
index, ch); return 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Oct 2 01:14:18 2011 From: python-checkins at python.org (victor.stinner) Date: Sun, 02 Oct 2011 01:14:18 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Use_Py=5FUCS1_instead_of_un?= =?utf8?q?signed_char_in_unicodeobject=2Eh?= Message-ID: http://hg.python.org/cpython/rev/5231de1080b0 changeset: 72577:5231de1080b0 user: Victor Stinner date: Sun Oct 02 00:55:25 2011 +0200 summary: Use Py_UCS1 instead of unsigned char in unicodeobject.h files: Include/unicodeobject.h | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -417,7 +417,7 @@ #define PyUnicode_READ(kind, data, index) \ ((Py_UCS4) \ ((kind) == PyUnicode_1BYTE_KIND ? \ - ((const unsigned char *)(data))[(index)] : \ + ((const Py_UCS1 *)(data))[(index)] : \ ((kind) == PyUnicode_2BYTE_KIND ? \ ((const Py_UCS2 *)(data))[(index)] : \ ((const Py_UCS4 *)(data))[(index)] \ @@ -431,7 +431,7 @@ #define PyUnicode_READ_CHAR(unicode, index) \ ((Py_UCS4) \ (PyUnicode_KIND((unicode)) == PyUnicode_1BYTE_KIND ? \ - ((const unsigned char *)(PyUnicode_DATA((unicode))))[(index)] : \ + ((const Py_UCS1 *)(PyUnicode_DATA((unicode))))[(index)] : \ (PyUnicode_KIND((unicode)) == PyUnicode_2BYTE_KIND ? 
\ ((const Py_UCS2 *)(PyUnicode_DATA((unicode))))[(index)] : \ ((const Py_UCS4 *)(PyUnicode_DATA((unicode))))[(index)] \ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Oct 2 01:14:18 2011 From: python-checkins at python.org (victor.stinner) Date: Sun, 02 Oct 2011 01:14:18 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_usage_of_PyUnicode=5FRE?= =?utf8?q?ADY=28=29_in_PyUnicode=5FGetLength=28=29?= Message-ID: http://hg.python.org/cpython/rev/745fe40c9bbe changeset: 72576:745fe40c9bbe user: Victor Stinner date: Sun Oct 02 00:36:53 2011 +0200 summary: Fix usage of PyUnicode_READY() in PyUnicode_GetLength() files: Objects/unicodeobject.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -2838,7 +2838,7 @@ Py_ssize_t PyUnicode_GetLength(PyObject *unicode) { - if (!PyUnicode_Check(unicode) || PyUnicode_READY(unicode) != -1) { + if (!PyUnicode_Check(unicode) || PyUnicode_READY(unicode) == -1) { PyErr_BadArgument(); return -1; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Oct 2 01:14:19 2011 From: python-checkins at python.org (victor.stinner) Date: Sun, 02 Oct 2011 01:14:19 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Optimize_=5FPyUnicode=5FAsK?= =?utf8?q?ind=28=29_for_UCS1-=3EUCS4_and_UCS2-=3EUCS4?= Message-ID: http://hg.python.org/cpython/rev/329a981b9143 changeset: 72578:329a981b9143 user: Victor Stinner date: Sun Oct 02 01:00:40 2011 +0200 summary: Optimize _PyUnicode_AsKind() for UCS1->UCS4 and UCS2->UCS4 * Ensure that the input string is ready * Raise a ValueError instead of of a fatal error files: Objects/unicodeobject.c | 72 ++++++++++++++++++---------- 1 files changed, 45 insertions(+), 27 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1264,43 
+1264,61 @@ } -/* Widen Unicode objects to larger buffers. - Return NULL if the string is too wide already. */ +/* Widen Unicode objects to larger buffers. Don't write terminating null + character. Return NULL on error. */ void* _PyUnicode_AsKind(PyObject *s, unsigned int kind) { - Py_ssize_t i; - Py_ssize_t len = PyUnicode_GET_LENGTH(s); - void *d = PyUnicode_DATA(s); - unsigned int skind = PyUnicode_KIND(s); - if (PyUnicode_KIND(s) >= kind) { + Py_ssize_t len; + void *result; + unsigned int skind; + + if (PyUnicode_READY(s)) + return NULL; + + len = PyUnicode_GET_LENGTH(s); + skind = PyUnicode_KIND(s); + if (skind >= kind) { PyErr_SetString(PyExc_RuntimeError, "invalid widening attempt"); return NULL; } switch(kind) { - case PyUnicode_2BYTE_KIND: { - Py_UCS2 *result = PyMem_Malloc(PyUnicode_GET_LENGTH(s) * sizeof(Py_UCS2)); - if (!result) { - PyErr_NoMemory(); - return 0; - } - for (i = 0; i < len; i++) - result[i] = ((Py_UCS1*)d)[i]; + case PyUnicode_2BYTE_KIND: + result = PyMem_Malloc(len * sizeof(Py_UCS2)); + if (!result) + return PyErr_NoMemory(); + assert(skind == PyUnicode_1BYTE_KIND); + _PyUnicode_CONVERT_BYTES( + Py_UCS1, Py_UCS2, + PyUnicode_1BYTE_DATA(s), + PyUnicode_1BYTE_DATA(s) + len, + result); return result; - } - case PyUnicode_4BYTE_KIND: { - Py_UCS4 *result = PyMem_Malloc(PyUnicode_GET_LENGTH(s) * sizeof(Py_UCS4)); - if (!result) { - PyErr_NoMemory(); - return 0; - } - for (i = 0; i < len; i++) - result[i] = PyUnicode_READ(skind, d, i); + case PyUnicode_4BYTE_KIND: + result = PyMem_Malloc(len * sizeof(Py_UCS4)); + if (!result) + return PyErr_NoMemory(); + if (skind == PyUnicode_2BYTE_KIND) { + _PyUnicode_CONVERT_BYTES( + Py_UCS2, Py_UCS4, + PyUnicode_2BYTE_DATA(s), + PyUnicode_2BYTE_DATA(s) + len, + result); + } + else { + assert(skind == PyUnicode_1BYTE_KIND); + _PyUnicode_CONVERT_BYTES( + Py_UCS1, Py_UCS4, + PyUnicode_1BYTE_DATA(s), + PyUnicode_1BYTE_DATA(s) + len, + result); + } return result; - } - } - Py_FatalError("invalid kind"); + 
default: + break; + } + PyErr_SetString(PyExc_ValueError, "invalid kind"); return NULL; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Oct 2 01:14:20 2011 From: python-checkins at python.org (victor.stinner) Date: Sun, 02 Oct 2011 01:14:20 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FFindChar=28=29_?= =?utf8?q?raises_a_IndexError_on_invalid_index?= Message-ID: http://hg.python.org/cpython/rev/6bd6cc7f2c8d changeset: 72579:6bd6cc7f2c8d user: Victor Stinner date: Sun Oct 02 01:08:37 2011 +0200 summary: PyUnicode_FindChar() raises a IndexError on invalid index files: Objects/unicodeobject.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -8089,6 +8089,10 @@ int kind; if (PyUnicode_READY(str) == -1) return -2; + if (start < 0 || end < 0) { + PyErr_SetString(PyExc_IndexError, "string index out of range"); + return -2; + } if (end > PyUnicode_GET_LENGTH(str)) end = PyUnicode_GET_LENGTH(str); kind = PyUnicode_KIND(str); -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Sun Oct 2 05:23:14 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Sun, 02 Oct 2011 05:23:14 +0200 Subject: [Python-checkins] Daily reference leaks (6bd6cc7f2c8d): sum=0 Message-ID: results for 6bd6cc7f2c8d on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogbB32ed', '-x'] From python-checkins at python.org Sun Oct 2 11:47:23 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 02 Oct 2011 11:47:23 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMi43KTogIzEzMDc2OiBmaXgg?= =?utf8?q?links_to_datetime=2Etime=2E?= Message-ID: http://hg.python.org/cpython/rev/854e31d80151 changeset: 72580:854e31d80151 branch: 2.7 parent: 
72572:406c5b69cb1b user: Ezio Melotti date: Sun Oct 02 12:22:13 2011 +0300 summary: #13076: fix links to datetime.time. files: Doc/library/datetime.rst | 18 +++++++++--------- 1 files changed, 9 insertions(+), 9 deletions(-) diff --git a/Doc/library/datetime.rst b/Doc/library/datetime.rst --- a/Doc/library/datetime.rst +++ b/Doc/library/datetime.rst @@ -1164,19 +1164,19 @@ .. attribute:: time.min - The earliest representable :class:`time`, ``time(0, 0, 0, 0)``. + The earliest representable :class:`.time`, ``time(0, 0, 0, 0)``. .. attribute:: time.max - The latest representable :class:`time`, ``time(23, 59, 59, 999999)``. + The latest representable :class:`.time`, ``time(23, 59, 59, 999999)``. .. attribute:: time.resolution - The smallest possible difference between non-equal :class:`time` objects, - ``timedelta(microseconds=1)``, although note that arithmetic on :class:`time` - objects is not supported. + The smallest possible difference between non-equal :class:`.time` objects, + ``timedelta(microseconds=1)``, although note that arithmetic on + :class:`.time` objects is not supported. Instance attributes (read-only): @@ -1203,7 +1203,7 @@ .. attribute:: time.tzinfo - The object passed as the tzinfo argument to the :class:`time` constructor, or + The object passed as the tzinfo argument to the :class:`.time` constructor, or ``None`` if none was passed. @@ -1234,10 +1234,10 @@ .. method:: time.replace([hour[, minute[, second[, microsecond[, tzinfo]]]]]) - Return a :class:`time` with the same value, except for those attributes given + Return a :class:`.time` with the same value, except for those attributes given new values by whichever keyword arguments are specified. Note that - ``tzinfo=None`` can be specified to create a naive :class:`time` from an - aware :class:`time`, without conversion of the time data. + ``tzinfo=None`` can be specified to create a naive :class:`.time` from an + aware :class:`.time`, without conversion of the time data. .. 
method:: time.isoformat() -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Oct 2 11:47:24 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 02 Oct 2011 11:47:24 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogIzEzMDc2OiBmaXgg?= =?utf8?q?links_to_datetime=2Etime_and_datetime=2Edatetime=2E?= Message-ID: http://hg.python.org/cpython/rev/95689ed69097 changeset: 72581:95689ed69097 branch: 3.2 parent: 72570:086ca132e161 user: Ezio Melotti date: Sun Oct 02 12:44:50 2011 +0300 summary: #13076: fix links to datetime.time and datetime.datetime. files: Doc/library/datetime.rst | 152 +++++++++++++------------- 1 files changed, 76 insertions(+), 76 deletions(-) diff --git a/Doc/library/datetime.rst b/Doc/library/datetime.rst --- a/Doc/library/datetime.rst +++ b/Doc/library/datetime.rst @@ -18,13 +18,13 @@ There are two kinds of date and time objects: "naive" and "aware". This distinction refers to whether the object has any notion of time zone, daylight saving time, or other kind of algorithmic or political time adjustment. Whether -a naive :class:`datetime` object represents Coordinated Universal Time (UTC), +a naive :class:`.datetime` object represents Coordinated Universal Time (UTC), local time, or time in some other timezone is purely up to the program, just like it's up to the program whether a particular number represents metres, -miles, or mass. Naive :class:`datetime` objects are easy to understand and to +miles, or mass. Naive :class:`.datetime` objects are easy to understand and to work with, at the cost of ignoring some aspects of reality. -For applications requiring more, :class:`datetime` and :class:`time` objects +For applications requiring more, :class:`.datetime` and :class:`.time` objects have an optional time zone information attribute, :attr:`tzinfo`, that can be set to an instance of a subclass of the abstract :class:`tzinfo` class. 
These :class:`tzinfo` objects capture information about the offset from UTC time, the @@ -41,13 +41,13 @@ .. data:: MINYEAR - The smallest year number allowed in a :class:`date` or :class:`datetime` object. + The smallest year number allowed in a :class:`date` or :class:`.datetime` object. :const:`MINYEAR` is ``1``. .. data:: MAXYEAR - The largest year number allowed in a :class:`date` or :class:`datetime` object. + The largest year number allowed in a :class:`date` or :class:`.datetime` object. :const:`MAXYEAR` is ``9999``. @@ -91,14 +91,14 @@ .. class:: timedelta :noindex: - A duration expressing the difference between two :class:`date`, :class:`time`, - or :class:`datetime` instances to microsecond resolution. + A duration expressing the difference between two :class:`date`, :class:`.time`, + or :class:`.datetime` instances to microsecond resolution. .. class:: tzinfo An abstract base class for time zone information objects. These are used by the - :class:`datetime` and :class:`time` classes to provide a customizable notion of + :class:`.datetime` and :class:`.time` classes to provide a customizable notion of time adjustment (for example, to account for time zone and/or daylight saving time). @@ -114,7 +114,7 @@ Objects of the :class:`date` type are always naive. -An object *d* of type :class:`time` or :class:`datetime` may be naive or aware. +An object *d* of type :class:`.time` or :class:`.datetime` may be naive or aware. *d* is aware if ``d.tzinfo`` is not ``None`` and ``d.tzinfo.utcoffset(d)`` does not return ``None``. If ``d.tzinfo`` is ``None``, or if ``d.tzinfo`` is not ``None`` but ``d.tzinfo.utcoffset(d)`` returns ``None``, *d* is naive. @@ -299,7 +299,7 @@ -1 day, 19:00:00 In addition to the operations listed above :class:`timedelta` objects support -certain additions and subtractions with :class:`date` and :class:`datetime` +certain additions and subtractions with :class:`date` and :class:`.datetime` objects (see below). .. 
versionchanged:: 3.2 @@ -638,10 +638,10 @@ :class:`datetime` Objects ------------------------- -A :class:`datetime` object is a single object containing all the information -from a :class:`date` object and a :class:`time` object. Like a :class:`date` -object, :class:`datetime` assumes the current Gregorian calendar extended in -both directions; like a time object, :class:`datetime` assumes there are exactly +A :class:`.datetime` object is a single object containing all the information +from a :class:`date` object and a :class:`.time` object. Like a :class:`date` +object, :class:`.datetime` assumes the current Gregorian calendar extended in +both directions; like a time object, :class:`.datetime` assumes there are exactly 3600\*24 seconds in every day. Constructor: @@ -689,7 +689,7 @@ Return the current UTC date and time, with :attr:`tzinfo` ``None``. This is like :meth:`now`, but returns the current UTC date and time, as a naive - :class:`datetime` object. An aware current UTC datetime can be obtained by + :class:`.datetime` object. An aware current UTC datetime can be obtained by calling ``datetime.now(timezone.utc)``. See also :meth:`now`. .. classmethod:: datetime.fromtimestamp(timestamp, tz=None) @@ -697,7 +697,7 @@ Return the local date and time corresponding to the POSIX timestamp, such as is returned by :func:`time.time`. If optional argument *tz* is ``None`` or not specified, the timestamp is converted to the platform's local date and time, and - the returned :class:`datetime` object is naive. + the returned :class:`.datetime` object is naive. Else *tz* must be an instance of a class :class:`tzinfo` subclass, and the timestamp is converted to *tz*'s time zone. In this case the result is @@ -710,12 +710,12 @@ 1970 through 2038. 
Note that on non-POSIX systems that include leap seconds in their notion of a timestamp, leap seconds are ignored by :meth:`fromtimestamp`, and then it's possible to have two timestamps differing by a second that yield - identical :class:`datetime` objects. See also :meth:`utcfromtimestamp`. + identical :class:`.datetime` objects. See also :meth:`utcfromtimestamp`. .. classmethod:: datetime.utcfromtimestamp(timestamp) - Return the UTC :class:`datetime` corresponding to the POSIX timestamp, with + Return the UTC :class:`.datetime` corresponding to the POSIX timestamp, with :attr:`tzinfo` ``None``. This may raise :exc:`ValueError`, if the timestamp is out of the range of values supported by the platform C :c:func:`gmtime` function. It's common for this to be restricted to years in 1970 through 2038. See also @@ -724,7 +724,7 @@ .. classmethod:: datetime.fromordinal(ordinal) - Return the :class:`datetime` corresponding to the proleptic Gregorian ordinal, + Return the :class:`.datetime` corresponding to the proleptic Gregorian ordinal, where January 1 of year 1 has ordinal 1. :exc:`ValueError` is raised unless ``1 <= ordinal <= datetime.max.toordinal()``. The hour, minute, second and microsecond of the result are all 0, and :attr:`tzinfo` is ``None``. @@ -732,18 +732,18 @@ .. classmethod:: datetime.combine(date, time) - Return a new :class:`datetime` object whose date components are equal to the + Return a new :class:`.datetime` object whose date components are equal to the given :class:`date` object's, and whose time components and :attr:`tzinfo` - attributes are equal to the given :class:`time` object's. For any - :class:`datetime` object *d*, + attributes are equal to the given :class:`.time` object's. For any + :class:`.datetime` object *d*, ``d == datetime.combine(d.date(), d.timetz())``. 
If date is a - :class:`datetime` object, its time components and :attr:`tzinfo` attributes + :class:`.datetime` object, its time components and :attr:`tzinfo` attributes are ignored. .. classmethod:: datetime.strptime(date_string, format) - Return a :class:`datetime` corresponding to *date_string*, parsed according to + Return a :class:`.datetime` corresponding to *date_string*, parsed according to *format*. This is equivalent to ``datetime(*(time.strptime(date_string, format)[0:6]))``. :exc:`ValueError` is raised if the date_string and format can't be parsed by :func:`time.strptime` or if it returns a value which isn't a @@ -755,19 +755,19 @@ .. attribute:: datetime.min - The earliest representable :class:`datetime`, ``datetime(MINYEAR, 1, 1, + The earliest representable :class:`.datetime`, ``datetime(MINYEAR, 1, 1, tzinfo=None)``. .. attribute:: datetime.max - The latest representable :class:`datetime`, ``datetime(MAXYEAR, 12, 31, 23, 59, + The latest representable :class:`.datetime`, ``datetime(MAXYEAR, 12, 31, 23, 59, 59, 999999, tzinfo=None)``. .. attribute:: datetime.resolution - The smallest possible difference between non-equal :class:`datetime` objects, + The smallest possible difference between non-equal :class:`.datetime` objects, ``timedelta(microseconds=1)``. @@ -810,24 +810,24 @@ .. attribute:: datetime.tzinfo - The object passed as the *tzinfo* argument to the :class:`datetime` constructor, + The object passed as the *tzinfo* argument to the :class:`.datetime` constructor, or ``None`` if none was passed. 
Supported operations: -+---------------------------------------+-------------------------------+ -| Operation | Result | -+=======================================+===============================+ -| ``datetime2 = datetime1 + timedelta`` | \(1) | -+---------------------------------------+-------------------------------+ -| ``datetime2 = datetime1 - timedelta`` | \(2) | -+---------------------------------------+-------------------------------+ -| ``timedelta = datetime1 - datetime2`` | \(3) | -+---------------------------------------+-------------------------------+ -| ``datetime1 < datetime2`` | Compares :class:`datetime` to | -| | :class:`datetime`. (4) | -+---------------------------------------+-------------------------------+ ++---------------------------------------+--------------------------------+ +| Operation | Result | ++=======================================+================================+ +| ``datetime2 = datetime1 + timedelta`` | \(1) | ++---------------------------------------+--------------------------------+ +| ``datetime2 = datetime1 - timedelta`` | \(2) | ++---------------------------------------+--------------------------------+ +| ``timedelta = datetime1 - datetime2`` | \(3) | ++---------------------------------------+--------------------------------+ +| ``datetime1 < datetime2`` | Compares :class:`.datetime` to | +| | :class:`.datetime`. (4) | ++---------------------------------------+--------------------------------+ (1) datetime2 is a duration of timedelta removed from datetime1, moving forward in @@ -846,7 +846,7 @@ in isolation can overflow in cases where datetime1 - timedelta does not. (3) - Subtraction of a :class:`datetime` from a :class:`datetime` is defined only if + Subtraction of a :class:`.datetime` from a :class:`.datetime` is defined only if both operands are naive, or if both are aware. If one is aware and the other is naive, :exc:`TypeError` is raised. 
@@ -875,16 +875,16 @@ In order to stop comparison from falling back to the default scheme of comparing object addresses, datetime comparison normally raises :exc:`TypeError` if the - other comparand isn't also a :class:`datetime` object. However, + other comparand isn't also a :class:`.datetime` object. However, ``NotImplemented`` is returned instead if the other comparand has a :meth:`timetuple` attribute. This hook gives other kinds of date objects a - chance at implementing mixed-type comparison. If not, when a :class:`datetime` + chance at implementing mixed-type comparison. If not, when a :class:`.datetime` object is compared to an object of a different type, :exc:`TypeError` is raised unless the comparison is ``==`` or ``!=``. The latter cases return :const:`False` or :const:`True`, respectively. -:class:`datetime` objects can be used as dictionary keys. In Boolean contexts, -all :class:`datetime` objects are considered to be true. +:class:`.datetime` objects can be used as dictionary keys. In Boolean contexts, +all :class:`.datetime` objects are considered to be true. Instance methods: @@ -895,13 +895,13 @@ .. method:: datetime.time() - Return :class:`time` object with same hour, minute, second and microsecond. + Return :class:`.time` object with same hour, minute, second and microsecond. :attr:`tzinfo` is ``None``. See also method :meth:`timetz`. .. method:: datetime.timetz() - Return :class:`time` object with same hour, minute, second, microsecond, and + Return :class:`.time` object with same hour, minute, second, microsecond, and tzinfo attributes. See also method :meth:`time`. @@ -915,7 +915,7 @@ .. method:: datetime.astimezone(tz) - Return a :class:`datetime` object with new :attr:`tzinfo` attribute *tz*, + Return a :class:`.datetime` object with new :attr:`tzinfo` attribute *tz*, adjusting the date and time data so the result is the same UTC time as *self*, but in *tz*'s local time. @@ -989,7 +989,7 @@ .. 
method:: datetime.utctimetuple() - If :class:`datetime` instance *d* is naive, this is the same as + If :class:`.datetime` instance *d* is naive, this is the same as ``d.timetuple()`` except that :attr:`tm_isdst` is forced to 0 regardless of what ``d.dst()`` returns. DST is never in effect for a UTC time. @@ -1050,7 +1050,7 @@ .. method:: datetime.__str__() - For a :class:`datetime` instance *d*, ``str(d)`` is equivalent to + For a :class:`.datetime` instance *d*, ``str(d)`` is equivalent to ``d.isoformat(' ')``. @@ -1199,19 +1199,19 @@ .. attribute:: time.min - The earliest representable :class:`time`, ``time(0, 0, 0, 0)``. + The earliest representable :class:`.time`, ``time(0, 0, 0, 0)``. .. attribute:: time.max - The latest representable :class:`time`, ``time(23, 59, 59, 999999)``. + The latest representable :class:`.time`, ``time(23, 59, 59, 999999)``. .. attribute:: time.resolution - The smallest possible difference between non-equal :class:`time` objects, - ``timedelta(microseconds=1)``, although note that arithmetic on :class:`time` - objects is not supported. + The smallest possible difference between non-equal :class:`.time` objects, + ``timedelta(microseconds=1)``, although note that arithmetic on + :class:`.time` objects is not supported. Instance attributes (read-only): @@ -1238,13 +1238,13 @@ .. attribute:: time.tzinfo - The object passed as the tzinfo argument to the :class:`time` constructor, or + The object passed as the tzinfo argument to the :class:`.time` constructor, or ``None`` if none was passed. Supported operations: -* comparison of :class:`time` to :class:`time`, where *a* is considered less +* comparison of :class:`.time` to :class:`.time`, where *a* is considered less than *b* when *a* precedes *b* in time. If one comparand is naive and the other is aware, :exc:`TypeError` is raised. 
If both comparands are aware, and have the same :attr:`tzinfo` attribute, the common :attr:`tzinfo` attribute is @@ -1252,7 +1252,7 @@ have different :attr:`tzinfo` attributes, the comparands are first adjusted by subtracting their UTC offsets (obtained from ``self.utcoffset()``). In order to stop mixed-type comparisons from falling back to the default comparison by - object address, when a :class:`time` object is compared to an object of a + object address, when a :class:`.time` object is compared to an object of a different type, :exc:`TypeError` is raised unless the comparison is ``==`` or ``!=``. The latter cases return :const:`False` or :const:`True`, respectively. @@ -1260,7 +1260,7 @@ * efficient pickling -* in Boolean contexts, a :class:`time` object is considered to be true if and +* in Boolean contexts, a :class:`.time` object is considered to be true if and only if, after converting it to minutes and subtracting :meth:`utcoffset` (or ``0`` if that's ``None``), the result is non-zero. @@ -1269,10 +1269,10 @@ .. method:: time.replace([hour[, minute[, second[, microsecond[, tzinfo]]]]]) - Return a :class:`time` with the same value, except for those attributes given + Return a :class:`.time` with the same value, except for those attributes given new values by whichever keyword arguments are specified. Note that - ``tzinfo=None`` can be specified to create a naive :class:`time` from an - aware :class:`time`, without conversion of the time data. + ``tzinfo=None`` can be specified to create a naive :class:`.time` from an + aware :class:`.time`, without conversion of the time data. .. method:: time.isoformat() @@ -1350,13 +1350,13 @@ :class:`tzinfo` is an abstract base class, meaning that this class should not be instantiated directly. You need to derive a concrete subclass, and (at least) supply implementations of the standard :class:`tzinfo` methods needed by the -:class:`datetime` methods you use. 
The :mod:`datetime` module supplies +:class:`.datetime` methods you use. The :mod:`datetime` module supplies a simple concrete subclass of :class:`tzinfo` :class:`timezone` which can represent timezones with fixed offset from UTC such as UTC itself or North American EST and EDT. An instance of (a concrete subclass of) :class:`tzinfo` can be passed to the -constructors for :class:`datetime` and :class:`time` objects. The latter objects +constructors for :class:`.datetime` and :class:`.time` objects. The latter objects view their attributes as being in local time, and the :class:`tzinfo` object supports methods revealing offset of local time from UTC, the name of the time zone, and DST offset, all relative to a date or time object passed to them. @@ -1411,7 +1411,7 @@ ``tz.utcoffset(dt) - tz.dst(dt)`` - must return the same result for every :class:`datetime` *dt* with ``dt.tzinfo == + must return the same result for every :class:`.datetime` *dt* with ``dt.tzinfo == tz`` For sane :class:`tzinfo` subclasses, this expression yields the time zone's "standard offset", which should not depend on the date or the time, but only on geographic location. The implementation of :meth:`datetime.astimezone` @@ -1443,7 +1443,7 @@ .. method:: tzinfo.tzname(dt) - Return the time zone name corresponding to the :class:`datetime` object *dt*, as + Return the time zone name corresponding to the :class:`.datetime` object *dt*, as a string. Nothing about string names is defined by the :mod:`datetime` module, and there's no requirement that it mean anything in particular. For example, "GMT", "UTC", "-500", "-5:00", "EDT", "US/Eastern", "America/New York" are all @@ -1456,11 +1456,11 @@ The default implementation of :meth:`tzname` raises :exc:`NotImplementedError`. -These methods are called by a :class:`datetime` or :class:`time` object, in
A :class:`datetime` object passes -itself as the argument, and a :class:`time` object passes ``None`` as the +These methods are called by a :class:`.datetime` or :class:`.time` object, in +response to their methods of the same names. A :class:`.datetime` object passes +itself as the argument, and a :class:`.time` object passes ``None`` as the argument. A :class:`tzinfo` subclass's methods should therefore be prepared to -accept a *dt* argument of ``None``, or of class :class:`datetime`. +accept a *dt* argument of ``None``, or of class :class:`.datetime`. When ``None`` is passed, it's up to the class designer to decide the best response. For example, returning ``None`` is appropriate if the class wishes to @@ -1468,7 +1468,7 @@ may be more useful for ``utcoffset(None)`` to return the standard UTC offset, as there is no other convention for discovering the standard offset. -When a :class:`datetime` object is passed in response to a :class:`datetime` +When a :class:`.datetime` object is passed in response to a :class:`.datetime` method, ``dt.tzinfo`` is the same object as *self*. :class:`tzinfo` methods can rely on this, unless user code calls :class:`tzinfo` methods directly. The intent is that the :class:`tzinfo` methods interpret *dt* as being in local @@ -1606,7 +1606,7 @@ .. method:: timezone.fromutc(dt) Return ``dt + offset``. The *dt* argument must be an aware - :class:`datetime` instance, with ``tzinfo`` set to ``self``. + :class:`.datetime` instance, with ``tzinfo`` set to ``self``. Class attributes: @@ -1620,18 +1620,18 @@ :meth:`strftime` and :meth:`strptime` Behavior ---------------------------------------------- -:class:`date`, :class:`datetime`, and :class:`time` objects all support a +:class:`date`, :class:`.datetime`, and :class:`.time` objects all support a ``strftime(format)`` method, to create a string representing the time under the control of an explicit format string. 
Broadly speaking, ``d.strftime(fmt)`` acts like the :mod:`time` module's ``time.strftime(fmt, d.timetuple())`` although not all objects support a :meth:`timetuple` method. Conversely, the :meth:`datetime.strptime` class method creates a -:class:`datetime` object from a string representing a date and time and a +:class:`.datetime` object from a string representing a date and time and a corresponding format string. ``datetime.strptime(date_string, format)`` is equivalent to ``datetime(*(time.strptime(date_string, format)[0:6]))``. -For :class:`time` objects, the format codes for year, month, and day should not +For :class:`.time` objects, the format codes for year, month, and day should not be used, as time objects have no such values. If they're used anyway, ``1900`` is substituted for the year, and ``1`` for the month and day. @@ -1789,5 +1789,5 @@ .. versionchanged:: 3.2 When the ``%z`` directive is provided to the :meth:`strptime` method, an - aware :class:`datetime` object will be produced. The ``tzinfo`` of the + aware :class:`.datetime` object will be produced. The ``tzinfo`` of the result will be set to a :class:`timezone` instance. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Oct 2 11:47:25 2011 From: python-checkins at python.org (ezio.melotti) Date: Sun, 02 Oct 2011 11:47:25 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_=2313076=3A_merge_with_3=2E2=2E?= Message-ID: http://hg.python.org/cpython/rev/175cd2a51ea9 changeset: 72582:175cd2a51ea9 parent: 72579:6bd6cc7f2c8d parent: 72581:95689ed69097 user: Ezio Melotti date: Sun Oct 02 12:47:10 2011 +0300 summary: #13076: merge with 3.2. 
files: Doc/library/datetime.rst | 152 +++++++++++++------------- 1 files changed, 76 insertions(+), 76 deletions(-) diff --git a/Doc/library/datetime.rst b/Doc/library/datetime.rst --- a/Doc/library/datetime.rst +++ b/Doc/library/datetime.rst @@ -18,13 +18,13 @@ There are two kinds of date and time objects: "naive" and "aware". This distinction refers to whether the object has any notion of time zone, daylight saving time, or other kind of algorithmic or political time adjustment. Whether -a naive :class:`datetime` object represents Coordinated Universal Time (UTC), +a naive :class:`.datetime` object represents Coordinated Universal Time (UTC), local time, or time in some other timezone is purely up to the program, just like it's up to the program whether a particular number represents metres, -miles, or mass. Naive :class:`datetime` objects are easy to understand and to +miles, or mass. Naive :class:`.datetime` objects are easy to understand and to work with, at the cost of ignoring some aspects of reality. -For applications requiring more, :class:`datetime` and :class:`time` objects +For applications requiring more, :class:`.datetime` and :class:`.time` objects have an optional time zone information attribute, :attr:`tzinfo`, that can be set to an instance of a subclass of the abstract :class:`tzinfo` class. These :class:`tzinfo` objects capture information about the offset from UTC time, the @@ -41,13 +41,13 @@ .. data:: MINYEAR - The smallest year number allowed in a :class:`date` or :class:`datetime` object. + The smallest year number allowed in a :class:`date` or :class:`.datetime` object. :const:`MINYEAR` is ``1``. .. data:: MAXYEAR - The largest year number allowed in a :class:`date` or :class:`datetime` object. + The largest year number allowed in a :class:`date` or :class:`.datetime` object. :const:`MAXYEAR` is ``9999``. @@ -91,14 +91,14 @@ .. 
class:: timedelta :noindex: - A duration expressing the difference between two :class:`date`, :class:`time`, - or :class:`datetime` instances to microsecond resolution. + A duration expressing the difference between two :class:`date`, :class:`.time`, + or :class:`.datetime` instances to microsecond resolution. .. class:: tzinfo An abstract base class for time zone information objects. These are used by the - :class:`datetime` and :class:`time` classes to provide a customizable notion of + :class:`.datetime` and :class:`.time` classes to provide a customizable notion of time adjustment (for example, to account for time zone and/or daylight saving time). @@ -114,7 +114,7 @@ Objects of the :class:`date` type are always naive. -An object *d* of type :class:`time` or :class:`datetime` may be naive or aware. +An object *d* of type :class:`.time` or :class:`.datetime` may be naive or aware. *d* is aware if ``d.tzinfo`` is not ``None`` and ``d.tzinfo.utcoffset(d)`` does not return ``None``. If ``d.tzinfo`` is ``None``, or if ``d.tzinfo`` is not ``None`` but ``d.tzinfo.utcoffset(d)`` returns ``None``, *d* is naive. @@ -299,7 +299,7 @@ -1 day, 19:00:00 In addition to the operations listed above :class:`timedelta` objects support -certain additions and subtractions with :class:`date` and :class:`datetime` +certain additions and subtractions with :class:`date` and :class:`.datetime` objects (see below). .. versionchanged:: 3.2 @@ -638,10 +638,10 @@ :class:`datetime` Objects ------------------------- -A :class:`datetime` object is a single object containing all the information -from a :class:`date` object and a :class:`time` object. Like a :class:`date` -object, :class:`datetime` assumes the current Gregorian calendar extended in -both directions; like a time object, :class:`datetime` assumes there are exactly +A :class:`.datetime` object is a single object containing all the information +from a :class:`date` object and a :class:`.time` object. 
Like a :class:`date` +object, :class:`.datetime` assumes the current Gregorian calendar extended in +both directions; like a time object, :class:`.datetime` assumes there are exactly 3600\*24 seconds in every day. Constructor: @@ -689,7 +689,7 @@ Return the current UTC date and time, with :attr:`tzinfo` ``None``. This is like :meth:`now`, but returns the current UTC date and time, as a naive - :class:`datetime` object. An aware current UTC datetime can be obtained by + :class:`.datetime` object. An aware current UTC datetime can be obtained by calling ``datetime.now(timezone.utc)``. See also :meth:`now`. .. classmethod:: datetime.fromtimestamp(timestamp, tz=None) @@ -697,7 +697,7 @@ Return the local date and time corresponding to the POSIX timestamp, such as is returned by :func:`time.time`. If optional argument *tz* is ``None`` or not specified, the timestamp is converted to the platform's local date and time, and - the returned :class:`datetime` object is naive. + the returned :class:`.datetime` object is naive. Else *tz* must be an instance of a class :class:`tzinfo` subclass, and the timestamp is converted to *tz*'s time zone. In this case the result is @@ -710,12 +710,12 @@ 1970 through 2038. Note that on non-POSIX systems that include leap seconds in their notion of a timestamp, leap seconds are ignored by :meth:`fromtimestamp`, and then it's possible to have two timestamps differing by a second that yield - identical :class:`datetime` objects. See also :meth:`utcfromtimestamp`. + identical :class:`.datetime` objects. See also :meth:`utcfromtimestamp`. .. classmethod:: datetime.utcfromtimestamp(timestamp) - Return the UTC :class:`datetime` corresponding to the POSIX timestamp, with + Return the UTC :class:`.datetime` corresponding to the POSIX timestamp, with :attr:`tzinfo` ``None``. This may raise :exc:`ValueError`, if the timestamp is out of the range of values supported by the platform C :c:func:`gmtime` function. 
It's common for this to be restricted to years in 1970 through 2038. See also @@ -740,7 +740,7 @@ .. classmethod:: datetime.fromordinal(ordinal) - Return the :class:`datetime` corresponding to the proleptic Gregorian ordinal, + Return the :class:`.datetime` corresponding to the proleptic Gregorian ordinal, where January 1 of year 1 has ordinal 1. :exc:`ValueError` is raised unless ``1 <= ordinal <= datetime.max.toordinal()``. The hour, minute, second and microsecond of the result are all 0, and :attr:`tzinfo` is ``None``. @@ -748,18 +748,18 @@ .. classmethod:: datetime.combine(date, time) - Return a new :class:`datetime` object whose date components are equal to the + Return a new :class:`.datetime` object whose date components are equal to the given :class:`date` object's, and whose time components and :attr:`tzinfo` - attributes are equal to the given :class:`time` object's. For any - :class:`datetime` object *d*, + attributes are equal to the given :class:`.time` object's. For any + :class:`.datetime` object *d*, ``d == datetime.combine(d.date(), d.timetz())``. If date is a - :class:`datetime` object, its time components and :attr:`tzinfo` attributes + :class:`.datetime` object, its time components and :attr:`tzinfo` attributes are ignored. .. classmethod:: datetime.strptime(date_string, format) - Return a :class:`datetime` corresponding to *date_string*, parsed according to + Return a :class:`.datetime` corresponding to *date_string*, parsed according to *format*. This is equivalent to ``datetime(*(time.strptime(date_string, format)[0:6]))``. :exc:`ValueError` is raised if the date_string and format can't be parsed by :func:`time.strptime` or if it returns a value which isn't a @@ -771,19 +771,19 @@ .. attribute:: datetime.min - The earliest representable :class:`datetime`, ``datetime(MINYEAR, 1, 1, + The earliest representable :class:`.datetime`, ``datetime(MINYEAR, 1, 1, tzinfo=None)``. .. 
attribute:: datetime.max - The latest representable :class:`datetime`, ``datetime(MAXYEAR, 12, 31, 23, 59, + The latest representable :class:`.datetime`, ``datetime(MAXYEAR, 12, 31, 23, 59, 59, 999999, tzinfo=None)``. .. attribute:: datetime.resolution - The smallest possible difference between non-equal :class:`datetime` objects, + The smallest possible difference between non-equal :class:`.datetime` objects, ``timedelta(microseconds=1)``. @@ -826,24 +826,24 @@ .. attribute:: datetime.tzinfo - The object passed as the *tzinfo* argument to the :class:`datetime` constructor, + The object passed as the *tzinfo* argument to the :class:`.datetime` constructor, or ``None`` if none was passed. Supported operations: -+---------------------------------------+-------------------------------+ -| Operation | Result | -+=======================================+===============================+ -| ``datetime2 = datetime1 + timedelta`` | \(1) | -+---------------------------------------+-------------------------------+ -| ``datetime2 = datetime1 - timedelta`` | \(2) | -+---------------------------------------+-------------------------------+ -| ``timedelta = datetime1 - datetime2`` | \(3) | -+---------------------------------------+-------------------------------+ -| ``datetime1 < datetime2`` | Compares :class:`datetime` to | -| | :class:`datetime`. 
(4) | -+---------------------------------------+-------------------------------+ ++---------------------------------------+--------------------------------+ +| Operation | Result | ++=======================================+================================+ +| ``datetime2 = datetime1 + timedelta`` | \(1) | ++---------------------------------------+--------------------------------+ +| ``datetime2 = datetime1 - timedelta`` | \(2) | ++---------------------------------------+--------------------------------+ +| ``timedelta = datetime1 - datetime2`` | \(3) | ++---------------------------------------+--------------------------------+ +| ``datetime1 < datetime2`` | Compares :class:`.datetime` to | +| | :class:`.datetime`. (4) | ++---------------------------------------+--------------------------------+ (1) datetime2 is a duration of timedelta removed from datetime1, moving forward in @@ -862,7 +862,7 @@ in isolation can overflow in cases where datetime1 - timedelta does not. (3) - Subtraction of a :class:`datetime` from a :class:`datetime` is defined only if + Subtraction of a :class:`.datetime` from a :class:`.datetime` is defined only if both operands are naive, or if both are aware. If one is aware and the other is naive, :exc:`TypeError` is raised. @@ -891,16 +891,16 @@ In order to stop comparison from falling back to the default scheme of comparing object addresses, datetime comparison normally raises :exc:`TypeError` if the - other comparand isn't also a :class:`datetime` object. However, + other comparand isn't also a :class:`.datetime` object. However, ``NotImplemented`` is returned instead if the other comparand has a :meth:`timetuple` attribute. This hook gives other kinds of date objects a - chance at implementing mixed-type comparison. If not, when a :class:`datetime` + chance at implementing mixed-type comparison. 
If not, when a :class:`.datetime` object is compared to an object of a different type, :exc:`TypeError` is raised unless the comparison is ``==`` or ``!=``. The latter cases return :const:`False` or :const:`True`, respectively. -:class:`datetime` objects can be used as dictionary keys. In Boolean contexts, -all :class:`datetime` objects are considered to be true. +:class:`.datetime` objects can be used as dictionary keys. In Boolean contexts, +all :class:`.datetime` objects are considered to be true. Instance methods: @@ -911,13 +911,13 @@ .. method:: datetime.time() - Return :class:`time` object with same hour, minute, second and microsecond. + Return :class:`.time` object with same hour, minute, second and microsecond. :attr:`tzinfo` is ``None``. See also method :meth:`timetz`. .. method:: datetime.timetz() - Return :class:`time` object with same hour, minute, second, microsecond, and + Return :class:`.time` object with same hour, minute, second, microsecond, and tzinfo attributes. See also method :meth:`time`. @@ -931,7 +931,7 @@ .. method:: datetime.astimezone(tz) - Return a :class:`datetime` object with new :attr:`tzinfo` attribute *tz*, + Return a :class:`.datetime` object with new :attr:`tzinfo` attribute *tz*, adjusting the date and time data so the result is the same UTC time as *self*, but in *tz*'s local time. @@ -1005,7 +1005,7 @@ .. method:: datetime.utctimetuple() - If :class:`datetime` instance *d* is naive, this is the same as + If :class:`.datetime` instance *d* is naive, this is the same as ``d.timetuple()`` except that :attr:`tm_isdst` is forced to 0 regardless of what ``d.dst()`` returns. DST is never in effect for a UTC time. @@ -1066,7 +1066,7 @@ .. method:: datetime.__str__() - For a :class:`datetime` instance *d*, ``str(d)`` is equivalent to + For a :class:`.datetime` instance *d*, ``str(d)`` is equivalent to ``d.isoformat(' ')``. @@ -1215,19 +1215,19 @@ .. 
attribute:: time.min - The earliest representable :class:`time`, ``time(0, 0, 0, 0)``. + The earliest representable :class:`.time`, ``time(0, 0, 0, 0)``. .. attribute:: time.max - The latest representable :class:`time`, ``time(23, 59, 59, 999999)``. + The latest representable :class:`.time`, ``time(23, 59, 59, 999999)``. .. attribute:: time.resolution - The smallest possible difference between non-equal :class:`time` objects, - ``timedelta(microseconds=1)``, although note that arithmetic on :class:`time` - objects is not supported. + The smallest possible difference between non-equal :class:`.time` objects, + ``timedelta(microseconds=1)``, although note that arithmetic on + :class:`.time` objects is not supported. Instance attributes (read-only): @@ -1254,13 +1254,13 @@ .. attribute:: time.tzinfo - The object passed as the tzinfo argument to the :class:`time` constructor, or + The object passed as the tzinfo argument to the :class:`.time` constructor, or ``None`` if none was passed. Supported operations: -* comparison of :class:`time` to :class:`time`, where *a* is considered less +* comparison of :class:`.time` to :class:`.time`, where *a* is considered less than *b* when *a* precedes *b* in time. If one comparand is naive and the other is aware, :exc:`TypeError` is raised. If both comparands are aware, and have the same :attr:`tzinfo` attribute, the common :attr:`tzinfo` attribute is @@ -1268,7 +1268,7 @@ have different :attr:`tzinfo` attributes, the comparands are first adjusted by subtracting their UTC offsets (obtained from ``self.utcoffset()``). In order to stop mixed-type comparisons from falling back to the default comparison by - object address, when a :class:`time` object is compared to an object of a + object address, when a :class:`.time` object is compared to an object of a different type, :exc:`TypeError` is raised unless the comparison is ``==`` or ``!=``. The latter cases return :const:`False` or :const:`True`, respectively. 
@@ -1276,7 +1276,7 @@ * efficient pickling -* in Boolean contexts, a :class:`time` object is considered to be true if and +* in Boolean contexts, a :class:`.time` object is considered to be true if and only if, after converting it to minutes and subtracting :meth:`utcoffset` (or ``0`` if that's ``None``), the result is non-zero. @@ -1285,10 +1285,10 @@ .. method:: time.replace([hour[, minute[, second[, microsecond[, tzinfo]]]]]) - Return a :class:`time` with the same value, except for those attributes given + Return a :class:`.time` with the same value, except for those attributes given new values by whichever keyword arguments are specified. Note that - ``tzinfo=None`` can be specified to create a naive :class:`time` from an - aware :class:`time`, without conversion of the time data. + ``tzinfo=None`` can be specified to create a naive :class:`.time` from an + aware :class:`.time`, without conversion of the time data. .. method:: time.isoformat() @@ -1366,13 +1366,13 @@ :class:`tzinfo` is an abstract base class, meaning that this class should not be instantiated directly. You need to derive a concrete subclass, and (at least) supply implementations of the standard :class:`tzinfo` methods needed by the -:class:`datetime` methods you use. The :mod:`datetime` module supplies +:class:`.datetime` methods you use. The :mod:`datetime` module supplies a simple concrete subclass of :class:`tzinfo` :class:`timezone` which can reprsent timezones with fixed offset from UTC such as UTC itself or North American EST and EDT. An instance of (a concrete subclass of) :class:`tzinfo` can be passed to the -constructors for :class:`datetime` and :class:`time` objects. The latter objects +constructors for :class:`.datetime` and :class:`.time` objects. 
The latter objects view their attributes as being in local time, and the :class:`tzinfo` object supports methods revealing offset of local time from UTC, the name of the time zone, and DST offset, all relative to a date or time object passed to them. @@ -1427,7 +1427,7 @@ ``tz.utcoffset(dt) - tz.dst(dt)`` - must return the same result for every :class:`datetime` *dt* with ``dt.tzinfo == + must return the same result for every :class:`.datetime` *dt* with ``dt.tzinfo == tz`` For sane :class:`tzinfo` subclasses, this expression yields the time zone's "standard offset", which should not depend on the date or the time, but only on geographic location. The implementation of :meth:`datetime.astimezone` @@ -1459,7 +1459,7 @@ .. method:: tzinfo.tzname(dt) - Return the time zone name corresponding to the :class:`datetime` object *dt*, as + Return the time zone name corresponding to the :class:`.datetime` object *dt*, as a string. Nothing about string names is defined by the :mod:`datetime` module, and there's no requirement that it mean anything in particular. For example, "GMT", "UTC", "-500", "-5:00", "EDT", "US/Eastern", "America/New York" are all @@ -1472,11 +1472,11 @@ The default implementation of :meth:`tzname` raises :exc:`NotImplementedError`. -These methods are called by a :class:`datetime` or :class:`time` object, in -response to their methods of the same names. A :class:`datetime` object passes -itself as the argument, and a :class:`time` object passes ``None`` as the +These methods are called by a :class:`.datetime` or :class:`.time` object, in +response to their methods of the same names. A :class:`.datetime` object passes +itself as the argument, and a :class:`.time` object passes ``None`` as the argument. A :class:`tzinfo` subclass's methods should therefore be prepared to -accept a *dt* argument of ``None``, or of class :class:`datetime`. +accept a *dt* argument of ``None``, or of class :class:`.datetime`. 
When ``None`` is passed, it's up to the class designer to decide the best response. For example, returning ``None`` is appropriate if the class wishes to @@ -1484,7 +1484,7 @@ may be more useful for ``utcoffset(None)`` to return the standard UTC offset, as there is no other convention for discovering the standard offset. -When a :class:`datetime` object is passed in response to a :class:`datetime` +When a :class:`.datetime` object is passed in response to a :class:`.datetime` method, ``dt.tzinfo`` is the same object as *self*. :class:`tzinfo` methods can rely on this, unless user code calls :class:`tzinfo` methods directly. The intent is that the :class:`tzinfo` methods interpret *dt* as being in local @@ -1623,7 +1623,7 @@ .. method:: timezone.fromutc(dt) Return ``dt + offset``. The *dt* argument must be an aware - :class:`datetime` instance, with ``tzinfo`` set to ``self``. + :class:`.datetime` instance, with ``tzinfo`` set to ``self``. Class attributes: @@ -1637,18 +1637,18 @@ :meth:`strftime` and :meth:`strptime` Behavior ---------------------------------------------- -:class:`date`, :class:`datetime`, and :class:`time` objects all support a +:class:`date`, :class:`.datetime`, and :class:`.time` objects all support a ``strftime(format)`` method, to create a string representing the time under the control of an explicit format string. Broadly speaking, ``d.strftime(fmt)`` acts like the :mod:`time` module's ``time.strftime(fmt, d.timetuple())`` although not all objects support a :meth:`timetuple` method. Conversely, the :meth:`datetime.strptime` class method creates a -:class:`datetime` object from a string representing a date and time and a +:class:`.datetime` object from a string representing a date and time and a corresponding format string. ``datetime.strptime(date_string, format)`` is equivalent to ``datetime(*(time.strptime(date_string, format)[0:6]))``. 
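[Editor's note: the ``strptime`` equivalence documented in the hunk above can be checked directly. A minimal sketch; the date string and format are arbitrary example values, not taken from the docs:]

```python
import time
from datetime import datetime

date_string = "2011-10-01 01:05:40"   # arbitrary example values
fmt = "%Y-%m-%d %H:%M:%S"

# Per the docs, datetime.strptime(s, f) is equivalent to feeding the
# first six fields of time.strptime(s, f) to the datetime constructor.
parsed = datetime.strptime(date_string, fmt)
rebuilt = datetime(*(time.strptime(date_string, fmt)[0:6]))
assert parsed == rebuilt
```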
-For :class:`time` objects, the format codes for year, month, and day should not +For :class:`.time` objects, the format codes for year, month, and day should not be used, as time objects have no such values. If they're used anyway, ``1900`` is substituted for the year, and ``1`` for the month and day. @@ -1806,5 +1806,5 @@ .. versionchanged:: 3.2 When the ``%z`` directive is provided to the :meth:`strptime` method, an - aware :class:`datetime` object will be produced. The ``tzinfo`` of the + aware :class:`.datetime` object will be produced. The ``tzinfo`` of the result will be set to a :class:`timezone` instance. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Oct 2 18:33:35 2011 From: python-checkins at python.org (charles-francois.natali) Date: Sun, 02 Oct 2011 18:33:35 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Issue_=2313084=3A_Fix_a_tes?= =?utf8?q?t=5Fsignal_failure=3A_the_delivery_order_is_only_defined_for?= Message-ID: http://hg.python.org/cpython/rev/e4f4272479d0 changeset: 72583:e4f4272479d0 user: Charles-Fran?ois Natali date: Sun Oct 02 18:36:05 2011 +0200 summary: Issue #13084: Fix a test_signal failure: the delivery order is only defined for real-time signals. 
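[Editor's note: the change below stops asserting a fixed delivery order for SIGUSR1/SIGUSR2, since POSIX only defines delivery order for real-time signals. The comparison logic it introduces can be sketched in isolation; no signals are actually delivered here, the byte values merely stand in for data read from the wakeup fd:]

```python
import struct

def check_signum(data, expected, ordered=True):
    # One unsigned byte per delivered signal, as written to the wakeup fd.
    raised = struct.unpack('%uB' % len(data), data)
    if not ordered:
        # Delivery order is undefined for non-real-time signals,
        # so compare as unordered sets instead of tuples.
        raised = set(raised)
        expected = set(expected)
    if raised != expected:
        raise Exception("%r != %r" % (raised, expected))

# Signals 10 and 12 (SIGUSR1/SIGUSR2 on Linux) may arrive in either order:
check_signum(bytes([12, 10]), (10, 12), ordered=False)   # accepted
```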
files: Lib/test/test_signal.py | 14 ++++++-------- 1 files changed, 6 insertions(+), 8 deletions(-) diff --git a/Lib/test/test_signal.py b/Lib/test/test_signal.py --- a/Lib/test/test_signal.py +++ b/Lib/test/test_signal.py @@ -224,7 +224,7 @@ @unittest.skipIf(sys.platform == "win32", "Not valid on Windows") class WakeupSignalTests(unittest.TestCase): - def check_wakeup(self, test_body, *signals): + def check_wakeup(self, test_body, *signals, ordered=True): # use a subprocess to have only one thread code = """if 1: import fcntl @@ -240,6 +240,9 @@ def check_signum(signals): data = os.read(read, len(signals)+1) raised = struct.unpack('%uB' % len(data), data) + if not {!r}: + raised = set(raised) + signals = set(signals) if raised != signals: raise Exception("%r != %r" % (raised, signals)) @@ -258,7 +261,7 @@ os.close(read) os.close(write) - """.format(signals, test_body) + """.format(signals, ordered, test_body) assert_python_ok('-c', code) @@ -319,11 +322,6 @@ @unittest.skipUnless(hasattr(signal, 'pthread_sigmask'), 'need signal.pthread_sigmask()') def test_pending(self): - signals = (signal.SIGUSR1, signal.SIGUSR2) - # when signals are unblocked, pending signal ared delivered in the - # reverse order of their number - signals = tuple(sorted(signals, reverse=True)) - self.check_wakeup("""def test(): signum1 = signal.SIGUSR1 signum2 = signal.SIGUSR2 @@ -336,7 +334,7 @@ os.kill(os.getpid(), signum2) # Unblocking the 2 signals calls the C signal handler twice signal.pthread_sigmask(signal.SIG_UNBLOCK, (signum1, signum2)) - """, *signals) + """, signal.SIGUSR1, signal.SIGUSR2, ordered=False) @unittest.skipIf(sys.platform == "win32", "Not valid on Windows") -- Repository URL: http://hg.python.org/cpython From riscutiavlad at gmail.com Sun Oct 2 18:47:47 2011 From: riscutiavlad at gmail.com (Vlad Riscutia) Date: Sun, 2 Oct 2011 09:47:47 -0700 Subject: [Python-checkins] [Python-Dev] Hg tips (was Re: cpython (merge default -> default): Merge heads.) 
In-Reply-To: References: Message-ID: Great tips. Can we add them to the developer guide somewhere? Thank you, Vlad On Thu, Sep 29, 2011 at 12:54 AM, Ezio Melotti wrote: > Tip 1 -- merging heads: > > A while ago ?ric suggested a nice tip to make merges easier and since I > haven't seen many people using it and now I got a chance to use it again, I > think it might be worth showing it once more: > > # so assume you just committed some changes: > $ hg ci Doc/whatsnew/3.3.rst -m 'Update and reorganize the whatsnew entry > for PEP 393.' > # you push them, but someone else pushed something in the meanwhile, so the > push fails > $ hg push > pushing to ssh://hg at hg.python.org/cpython > searching for changes > abort: push creates new remote heads on branch 'default'! > (you should pull and merge or use push -f to force) > # so you pull the other changes > $ hg pull -u > pulling from ssh://hg at hg.python.org/cpython > searching for changes > adding changesets > adding manifests > adding file changes > added 4 changesets with 5 changes to 5 files (+1 heads) > not updating, since new heads added > (run 'hg heads' to see heads, 'hg merge' to merge) > # and use "hg heads ." to see the two heads (yours and the one you pulled) > in the current branch > $ hg heads . > changeset: 72521:e6a2b54c1d16 > tag: tip > user: Victor Stinner > date: Thu Sep 29 04:02:13 2011 +0200 > summary: Fix hex_digit_to_int() prototype: expect Py_UCS4, not > Py_UNICODE > > changeset: 72517:ba6ee5cc9ed6 > user: Ezio Melotti > date: Thu Sep 29 08:34:36 2011 +0300 > summary: Update and reorganize the whatsnew entry for PEP 393. > # here comes the tip: before merging you switch to the other head (i.e. the > one pushed by Victor), > # if you don't switch, you'll be merging Victor changeset and in case of > conflicts you will have to review > # and modify his code (e.g. 
put a Misc/NEWS entry in the right section or > something more complicated) > $ hg up e6a2b54c1d16 > 6 files updated, 0 files merged, 0 files removed, 0 files unresolved > # after the switch you will merge the changeset you just committed, so in > case of conflicts > # reviewing and merging is much easier because you know the changes already > $ hg merge > 1 files updated, 0 files merged, 0 files removed, 0 files unresolved > (branch merge, don't forget to commit) > # here everything went fine and there were no conflicts, and in the diff I > can see my last changeset > $ hg di > diff --git a/Doc/whatsnew/3.3.rst b/Doc/whatsnew/3.3.rst > [...] > # everything looks fine, so I can commit the merge and push > $ hg ci -m 'Merge heads.' > $ hg push > pushing to ssh://hg at hg.python.org/cpython > searching for changes > remote: adding > changesets > > remote: adding manifests > remote: adding file changes > remote: added 2 changesets with 1 changes to 1 files > remote: buildbot: 2 changes sent successfully > remote: notified python-checkins at python.org of incoming changeset > ba6ee5cc9ed6 > remote: notified python-checkins at python.org of incoming changeset > e7672fe3cd35 > > This tip is not only useful while merging, but it's also useful for > python-checkins reviews, because the "merge" mail has the same diff of the > previous mail rather than having 15 unrelated changesets from the last week > because the committer didn't pull in a while. > > > Tip 2 -- extended diffs: > > If you haven't already, enable git diffs, adding to your ~/.hgrc the > following two lines: > >> [diff] >> git = True >> > (this is already in the devguide, even if 'git = on' is used there. The > mercurial website uses git = True too.) > More info: > http://hgtip.com/tips/beginner/2009-10-22-always-use-git-diffs/ > > > Tip 3 -- extensions: > > I personally like the 'color' extension, it makes the output of commands > like 'hg diff' and 'hg stat' more readable (e.g. 
it shows removed lines in > red and added ones in green). > If you want to give it a try, add to your ~/.hgrc the following two lines: > >> [extensions] >> color = >> > > If you find operations like pulling, updating or cloning too slow, you > might also want to look at the 'progress' extension, which displays a > progress bar during these operations: > >> [extensions] >> progress = >> > > > Tip 4 -- porting from 2.7 to 3.2: > > The devguide suggests: >> >> hg export a7df1a869e4a | hg import --no-commit - >> > but it's not always necessary to copy the changeset number manually. > If you are porting your last commit you can just use 'hg export 2.7' (or > any other branch name): > * using the one-dir-per-branch setup: > wolf at hp:~/dev/py/2.7$ hg ci -m 'Fix some bug.' > wolf at hp:~/dev/py/2.7$ cd ../3.2 > wolf at hp:~/dev/py/3.2$ hg pull -u ../2.7 > wolf at hp:~/dev/py/3.2$ hg export 2.7 | hg import --no-commit - > * using the single-dir setup: > wolf at hp:~/dev/python$ hg branch > 2.7 > wolf at hp:~/dev/python$ hg ci -m 'Fix some bug.' > wolf at hp:~/dev/python$ hg up 3.2 # here you might enjoy the progress > extension > wolf at hp:~/dev/python$ hg export 2.7 | hg import --no-commit - > And then you can check that everything is fine, and commit on 3.2 too. > Of course it works the other way around (from 3.2 to 2.7) too. > > > I hope you'll find these tips useful. > > Best Regards, > Ezio Melotti > > > On Thu, Sep 29, 2011 at 8:36 AM, ezio.melotti wrote: > >> http://hg.python.org/cpython/rev/e7672fe3cd35 >> changeset: 72522:e7672fe3cd35 >> parent: 72520:e6a2b54c1d16 >> parent: 72521:ba6ee5cc9ed6 >> user: Ezio Melotti >> date: Thu Sep 29 08:36:23 2011 +0300 >> summary: >> Merge heads. 
>> >> files: >> Doc/whatsnew/3.3.rst | 63 +++++++++++++++++++++---------- >> 1 files changed, 42 insertions(+), 21 deletions(-) >> >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/riscutiavlad%40gmail.com > > From python-checkins at python.org Sun Oct 2 19:19:34 2011 From: python-checkins at python.org (benjamin.peterson) Date: Sun, 02 Oct 2011 19:19:34 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_remove_unused_label?= Message-ID: http://hg.python.org/cpython/rev/e5019967a60a changeset: 72584:e5019967a60a parent: 72582:175cd2a51ea9 user: Benjamin Peterson date: Sun Oct 02 13:19:16 2011 -0400 summary: remove unused label files: Python/import.c | 1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/Python/import.c b/Python/import.c --- a/Python/import.c +++ b/Python/import.c @@ -3174,7 +3174,6 @@ } } -out: PyMem_Free(nameuni); return result; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Oct 2 19:19:35 2011 From: python-checkins at python.org (benjamin.peterson) Date: Sun, 02 Oct 2011 19:19:35 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_default_-=3E_default?= =?utf8?q?=29=3A_merge_heads?= Message-ID: http://hg.python.org/cpython/rev/08223e6cf325 changeset: 72585:08223e6cf325 parent: 72584:e5019967a60a parent: 72583:e4f4272479d0 user: Benjamin Peterson date: Sun Oct 02 13:19:30 2011 -0400 summary: merge heads files: Lib/test/test_signal.py | 14 ++++++-------- 1 files changed, 6 insertions(+), 8 deletions(-) diff --git a/Lib/test/test_signal.py b/Lib/test/test_signal.py --- a/Lib/test/test_signal.py +++ b/Lib/test/test_signal.py @@ -224,7 +224,7 @@ @unittest.skipIf(sys.platform == "win32", "Not valid on Windows") class
WakeupSignalTests(unittest.TestCase): - def check_wakeup(self, test_body, *signals): + def check_wakeup(self, test_body, *signals, ordered=True): # use a subprocess to have only one thread code = """if 1: import fcntl @@ -240,6 +240,9 @@ def check_signum(signals): data = os.read(read, len(signals)+1) raised = struct.unpack('%uB' % len(data), data) + if not {!r}: + raised = set(raised) + signals = set(signals) if raised != signals: raise Exception("%r != %r" % (raised, signals)) @@ -258,7 +261,7 @@ os.close(read) os.close(write) - """.format(signals, test_body) + """.format(signals, ordered, test_body) assert_python_ok('-c', code) @@ -319,11 +322,6 @@ @unittest.skipUnless(hasattr(signal, 'pthread_sigmask'), 'need signal.pthread_sigmask()') def test_pending(self): - signals = (signal.SIGUSR1, signal.SIGUSR2) - # when signals are unblocked, pending signal ared delivered in the - # reverse order of their number - signals = tuple(sorted(signals, reverse=True)) - self.check_wakeup("""def test(): signum1 = signal.SIGUSR1 signum2 = signal.SIGUSR2 @@ -336,7 +334,7 @@ os.kill(os.getpid(), signum2) # Unblocking the 2 signals calls the C signal handler twice signal.pthread_sigmask(signal.SIG_UNBLOCK, (signum1, signum2)) - """, *signals) + """, signal.SIGUSR1, signal.SIGUSR2, ordered=False) @unittest.skipIf(sys.platform == "win32", "Not valid on Windows") -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Oct 2 23:41:33 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sun, 02 Oct 2011 23:41:33 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=283=2E2=29=3A_Fix_ResourceWar?= =?utf8?q?nings_in_the_TIPC_socket_tests=2E?= Message-ID: http://hg.python.org/cpython/rev/77397669c62f changeset: 72586:77397669c62f branch: 3.2 parent: 72581:95689ed69097 user: Antoine Pitrou date: Sun Oct 02 23:33:19 2011 +0200 summary: Fix ResourceWarnings in the TIPC socket tests. 
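[Editor's note: the fix below registers every socket with ``addCleanup`` so it is closed even when a test body fails. A self-contained sketch of the same pattern, using an ordinary ``socketpair()`` rather than TIPC sockets, since TIPC needs the kernel module loaded:]

```python
import socket
import unittest

class CleanupExample(unittest.TestCase):
    def test_roundtrip(self):
        # addCleanup runs close() after the test, even if the test body
        # raises -- this is what silences the ResourceWarnings.
        a, b = socket.socketpair()
        self.addCleanup(a.close)
        self.addCleanup(b.close)
        a.sendall(b"ping")
        self.assertEqual(b.recv(4), b"ping")
```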
files: Lib/test/test_socket.py | 9 +++++++-- 1 files changed, 7 insertions(+), 2 deletions(-) diff --git a/Lib/test/test_socket.py b/Lib/test/test_socket.py --- a/Lib/test/test_socket.py +++ b/Lib/test/test_socket.py @@ -1869,10 +1869,12 @@ print("TIPC module is not loaded, please 'sudo modprobe tipc'") return False -class TIPCTest (unittest.TestCase): +class TIPCTest(unittest.TestCase): def testRDM(self): srv = socket.socket(socket.AF_TIPC, socket.SOCK_RDM) cli = socket.socket(socket.AF_TIPC, socket.SOCK_RDM) + self.addCleanup(srv.close) + self.addCleanup(cli.close) srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) srvaddr = (socket.TIPC_ADDR_NAMESEQ, TIPC_STYPE, @@ -1889,13 +1891,14 @@ self.assertEqual(msg, MSG) -class TIPCThreadableTest (unittest.TestCase, ThreadableTest): +class TIPCThreadableTest(unittest.TestCase, ThreadableTest): def __init__(self, methodName = 'runTest'): unittest.TestCase.__init__(self, methodName = methodName) ThreadableTest.__init__(self) def setUp(self): self.srv = socket.socket(socket.AF_TIPC, socket.SOCK_STREAM) + self.addCleanup(self.srv.close) self.srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) srvaddr = (socket.TIPC_ADDR_NAMESEQ, TIPC_STYPE, TIPC_LOWER, TIPC_UPPER) @@ -1903,6 +1906,7 @@ self.srv.listen(5) self.serverExplicitReady() self.conn, self.connaddr = self.srv.accept() + self.addCleanup(self.conn.close) def clientSetUp(self): # The is a hittable race between serverExplicitReady() and the @@ -1910,6 +1914,7 @@ # we could get an exception time.sleep(0.1) self.cli = socket.socket(socket.AF_TIPC, socket.SOCK_STREAM) + self.addCleanup(self.cli.close) addr = (socket.TIPC_ADDR_NAME, TIPC_STYPE, TIPC_LOWER + int((TIPC_UPPER - TIPC_LOWER) / 2), 0) self.cli.connect(addr) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Sun Oct 2 23:41:34 2011 From: python-checkins at python.org (antoine.pitrou) Date: Sun, 02 Oct 2011 23:41:34 +0200 Subject: [Python-checkins] 
=?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Fix_ResourceWarnings_in_the_TIPC_socket_tests=2E?= Message-ID: http://hg.python.org/cpython/rev/d3194ef040df changeset: 72587:d3194ef040df parent: 72585:08223e6cf325 parent: 72586:77397669c62f user: Antoine Pitrou date: Sun Oct 02 23:37:41 2011 +0200 summary: Fix ResourceWarnings in the TIPC socket tests. files: Lib/test/test_socket.py | 9 +++++++-- 1 files changed, 7 insertions(+), 2 deletions(-) diff --git a/Lib/test/test_socket.py b/Lib/test/test_socket.py --- a/Lib/test/test_socket.py +++ b/Lib/test/test_socket.py @@ -3999,10 +3999,12 @@ print("TIPC module is not loaded, please 'sudo modprobe tipc'") return False -class TIPCTest (unittest.TestCase): +class TIPCTest(unittest.TestCase): def testRDM(self): srv = socket.socket(socket.AF_TIPC, socket.SOCK_RDM) cli = socket.socket(socket.AF_TIPC, socket.SOCK_RDM) + self.addCleanup(srv.close) + self.addCleanup(cli.close) srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) srvaddr = (socket.TIPC_ADDR_NAMESEQ, TIPC_STYPE, @@ -4019,13 +4021,14 @@ self.assertEqual(msg, MSG) -class TIPCThreadableTest (unittest.TestCase, ThreadableTest): +class TIPCThreadableTest(unittest.TestCase, ThreadableTest): def __init__(self, methodName = 'runTest'): unittest.TestCase.__init__(self, methodName = methodName) ThreadableTest.__init__(self) def setUp(self): self.srv = socket.socket(socket.AF_TIPC, socket.SOCK_STREAM) + self.addCleanup(self.srv.close) self.srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) srvaddr = (socket.TIPC_ADDR_NAMESEQ, TIPC_STYPE, TIPC_LOWER, TIPC_UPPER) @@ -4033,6 +4036,7 @@ self.srv.listen(5) self.serverExplicitReady() self.conn, self.connaddr = self.srv.accept() + self.addCleanup(self.conn.close) def clientSetUp(self): # The is a hittable race between serverExplicitReady() and the @@ -4040,6 +4044,7 @@ # we could get an exception time.sleep(0.1) self.cli = socket.socket(socket.AF_TIPC, socket.SOCK_STREAM) + 
self.addCleanup(self.cli.close) addr = (socket.TIPC_ADDR_NAME, TIPC_STYPE, TIPC_LOWER + int((TIPC_UPPER - TIPC_LOWER) / 2), 0) self.cli.connect(addr) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 01:38:22 2011 From: python-checkins at python.org (senthil.kumaran) Date: Mon, 03 Oct 2011 01:38:22 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=283=2E2=29=3A_Document_messag?= =?utf8?q?e=5Fbody_arg_in_HTTPConnection=2Eendheaders?= Message-ID: http://hg.python.org/cpython/rev/a3f2dba93743 changeset: 72588:a3f2dba93743 branch: 3.2 parent: 72586:77397669c62f user: Senthil Kumaran date: Mon Oct 03 07:27:06 2011 +0800 summary: Document message_body arg in HTTPConnection.endheaders files: Doc/library/http.client.rst | 8 ++++++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/Doc/library/http.client.rst b/Doc/library/http.client.rst --- a/Doc/library/http.client.rst +++ b/Doc/library/http.client.rst @@ -472,9 +472,13 @@ an argument. -.. method:: HTTPConnection.endheaders() +.. method:: HTTPConnection.endheaders(message_body=None) - Send a blank line to the server, signalling the end of the headers. + Send a blank line to the server, signalling the end of the headers. The + optional message_body argument can be used to pass message body + associated with the request. The message body will be sent in + the same packet as the message headers if possible. The + message_body should be a string. .. 
method:: HTTPConnection.send(data) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 01:38:23 2011 From: python-checkins at python.org (senthil.kumaran) Date: Mon, 03 Oct 2011 01:38:23 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_merge_from_3=2E2_-__Document_message=5Fbody_arg_in_HTTPConne?= =?utf8?q?ction=2Eendheaders?= Message-ID: http://hg.python.org/cpython/rev/1ed413b52af3 changeset: 72589:1ed413b52af3 parent: 72587:d3194ef040df parent: 72588:a3f2dba93743 user: Senthil Kumaran date: Mon Oct 03 07:28:00 2011 +0800 summary: merge from 3.2 - Document message_body arg in HTTPConnection.endheaders files: Doc/library/http.client.rst | 8 ++++++-- 1 files changed, 6 insertions(+), 2 deletions(-) diff --git a/Doc/library/http.client.rst b/Doc/library/http.client.rst --- a/Doc/library/http.client.rst +++ b/Doc/library/http.client.rst @@ -472,9 +472,13 @@ an argument. -.. method:: HTTPConnection.endheaders() +.. method:: HTTPConnection.endheaders(message_body=None) - Send a blank line to the server, signalling the end of the headers. + Send a blank line to the server, signalling the end of the headers. The + optional message_body argument can be used to pass message body + associated with the request. The message body will be sent in + the same packet as the message headers if possible. The + message_body should be a string. .. 
method:: HTTPConnection.send(data) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 01:38:23 2011 From: python-checkins at python.org (senthil.kumaran) Date: Mon, 03 Oct 2011 01:38:23 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=282=2E7=29=3A_update_2=2E7_-_?= =?utf8?q?Document_message=5Fbody_arg_in_HTTPConnection=2Eendheaders?= Message-ID: http://hg.python.org/cpython/rev/277688052c5a changeset: 72590:277688052c5a branch: 2.7 parent: 72580:854e31d80151 user: Senthil Kumaran date: Mon Oct 03 07:37:58 2011 +0800 summary: update 2.7 - Document message_body arg in HTTPConnection.endheaders files: Doc/library/httplib.rst | 6 +++++- 1 files changed, 5 insertions(+), 1 deletions(-) diff --git a/Doc/library/httplib.rst b/Doc/library/httplib.rst --- a/Doc/library/httplib.rst +++ b/Doc/library/httplib.rst @@ -492,9 +492,13 @@ an argument. -.. method:: HTTPConnection.endheaders() +.. method:: HTTPConnection.endheaders(message_body=None) Send a blank line to the server, signalling the end of the headers. + The optional message_body argument can be used to pass message body + associated with the request. The message body will be sent in + the same packet as the message headers if possible. The + message_body should be a string. .. 
method:: HTTPConnection.send(data) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 03:45:15 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 03:45:15 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyCodec=5FReplaceErrors=28?= =?utf8?q?=29_uses_=22C=22_format_instead_of_=22u=23=22_to_build_result?= Message-ID: http://hg.python.org/cpython/rev/611c57aa694a changeset: 72591:611c57aa694a parent: 72589:1ed413b52af3 user: Victor Stinner date: Sun Oct 02 19:00:15 2011 +0200 summary: PyCodec_ReplaceErrors() uses "C" format instead of "u#" to build result files: Python/codecs.c | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/Python/codecs.c b/Python/codecs.c --- a/Python/codecs.c +++ b/Python/codecs.c @@ -534,10 +534,11 @@ return Py_BuildValue("(Nn)", res, end); } else if (PyObject_IsInstance(exc, PyExc_UnicodeDecodeError)) { - Py_UNICODE res = Py_UNICODE_REPLACEMENT_CHARACTER; if (PyUnicodeDecodeError_GetEnd(exc, &end)) return NULL; - return Py_BuildValue("(u#n)", &res, 1, end); + return Py_BuildValue("(Cn)", + (int)Py_UNICODE_REPLACEMENT_CHARACTER, + end); } else if (PyObject_IsInstance(exc, PyExc_UnicodeTranslateError)) { PyObject *res; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 03:45:16 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 03:45:16 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Check_error_when_calling_Py?= =?utf8?q?Unicode=5FAppendAndDel=28=29?= Message-ID: http://hg.python.org/cpython/rev/8d29cdf28216 changeset: 72592:8d29cdf28216 user: Victor Stinner date: Sun Oct 02 20:35:10 2011 +0200 summary: Check error when calling PyUnicode_AppendAndDel() files: Modules/_ctypes/callproc.c | 4 ++-- Python/dynload_win.c | 8 +++++--- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/Modules/_ctypes/callproc.c b/Modules/_ctypes/callproc.c --- 
a/Modules/_ctypes/callproc.c +++ b/Modules/_ctypes/callproc.c @@ -944,9 +944,9 @@ else { PyErr_Clear(); PyUnicode_AppendAndDel(&s, PyUnicode_FromString("???")); - if (s == NULL) - goto error; } + if (s == NULL) + goto error; PyErr_SetObject(exc_class, s); error: Py_XDECREF(tp); diff --git a/Python/dynload_win.c b/Python/dynload_win.c --- a/Python/dynload_win.c +++ b/Python/dynload_win.c @@ -187,7 +187,7 @@ HINSTANCE hDLL = NULL; unsigned int old_mode; ULONG_PTR cookie = 0; - + /* Don't display a message box when Python can't load a DLL */ old_mode = SetErrorMode(SEM_FAILCRITICALERRORS); @@ -248,8 +248,10 @@ theInfo, theLength)); } - PyErr_SetObject(PyExc_ImportError, message); - Py_XDECREF(message); + if (message != NULL) { + PyErr_SetObject(PyExc_ImportError, message); + Py_DECREF(message); + } return NULL; } else { char buffer[256]; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 03:45:17 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 03:45:17 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_unicode=5Fempty_and_unicode?= =?utf8?q?=5Flatin1_are_PyObject*_objects=2C_not_PyUnicodeObject*?= Message-ID: http://hg.python.org/cpython/rev/5656f5517feb changeset: 72593:5656f5517feb user: Victor Stinner date: Sun Oct 02 20:39:30 2011 +0200 summary: unicode_empty and unicode_latin1 are PyObject* objects, not PyUnicodeObject* files: Objects/unicodeobject.c | 30 ++++++++++++++-------------- 1 files changed, 15 insertions(+), 15 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -147,11 +147,11 @@ static PyObject *interned; /* The empty Unicode object is shared to improve performance. */ -static PyUnicodeObject *unicode_empty; +static PyObject *unicode_empty; /* Single character Unicode strings in the Latin-1 range are being shared as well. 
*/ -static PyUnicodeObject *unicode_latin1[256]; +static PyObject *unicode_latin1[256]; /* Fast detection of the most frequent whitespace characters */ const unsigned char _Py_ascii_whitespace[] = { @@ -398,7 +398,7 @@ /* Optimization for empty strings */ if (length == 0 && unicode_empty != NULL) { Py_INCREF(unicode_empty); - return unicode_empty; + return (PyUnicodeObject*)unicode_empty; } /* Ensure we won't overflow the size. */ @@ -491,7 +491,7 @@ /* Optimization for empty strings */ if (size == 0 && unicode_empty != NULL) { Py_INCREF(unicode_empty); - return (PyObject *)unicode_empty; + return unicode_empty; } #ifdef Py_DEBUG @@ -1017,16 +1017,16 @@ static PyObject* get_latin1_char(unsigned char ch) { - PyUnicodeObject *unicode = unicode_latin1[ch]; + PyObject *unicode = unicode_latin1[ch]; if (!unicode) { - unicode = (PyUnicodeObject *)PyUnicode_New(1, ch); + unicode = PyUnicode_New(1, ch); if (!unicode) return NULL; PyUnicode_1BYTE_DATA(unicode)[0] = ch; unicode_latin1[ch] = unicode; } Py_INCREF(unicode); - return (PyObject *)unicode; + return unicode; } PyObject * @@ -1045,7 +1045,7 @@ /* Optimization for empty strings */ if (size == 0 && unicode_empty != NULL) { Py_INCREF(unicode_empty); - return (PyObject *)unicode_empty; + return unicode_empty; } /* Single character Unicode objects in the Latin-1 range are @@ -1117,7 +1117,7 @@ /* Optimization for empty strings */ if (size == 0 && unicode_empty != NULL) { Py_INCREF(unicode_empty); - return (PyObject *)unicode_empty; + return unicode_empty; } /* Single characters are shared when using this constructor. 
@@ -2137,7 +2137,7 @@ if (PyBytes_Check(obj)) { if (PyBytes_GET_SIZE(obj) == 0) { Py_INCREF(unicode_empty); - v = (PyObject *) unicode_empty; + v = unicode_empty; } else { v = PyUnicode_Decode( @@ -2164,7 +2164,7 @@ if (buffer.len == 0) { Py_INCREF(unicode_empty); - v = (PyObject *) unicode_empty; + v = unicode_empty; } else v = PyUnicode_Decode((char*) buffer.buf, buffer.len, encoding, errors); @@ -9555,11 +9555,11 @@ goto onError; /* Shortcuts */ - if (v == (PyObject*)unicode_empty) { + if (v == unicode_empty) { Py_DECREF(v); return u; } - if (u == (PyObject*)unicode_empty) { + if (u == unicode_empty) { Py_DECREF(u); return v; } @@ -10635,7 +10635,7 @@ if (len < 1) { Py_INCREF(unicode_empty); - return (PyObject *)unicode_empty; + return unicode_empty; } if (len == 1 && PyUnicode_CheckExact(str)) { @@ -12602,7 +12602,7 @@ }; /* Init the implementation */ - unicode_empty = (PyUnicodeObject *) PyUnicode_New(0, 0); + unicode_empty = PyUnicode_New(0, 0); if (!unicode_empty) Py_FatalError("Can't create empty string"); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 03:45:18 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 03:45:18 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Add_=5FPyUnicode=5FDATA=5FA?= =?utf8?q?NY=28op=29_private_macro?= Message-ID: http://hg.python.org/cpython/rev/2203ccbc4895 changeset: 72594:2203ccbc4895 user: Victor Stinner date: Sun Oct 02 20:39:55 2011 +0200 summary: Add _PyUnicode_DATA_ANY(op) private macro files: Objects/unicodeobject.c | 34 ++++++++++++++-------------- 1 files changed, 17 insertions(+), 17 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -131,11 +131,11 @@ #define _PyUnicode_GET_LENGTH(op) \ (assert(PyUnicode_Check(op)), \ ((PyASCIIObject *)(op))->length) +#define _PyUnicode_DATA_ANY(op) (((PyUnicodeObject*)(op))->data.any) /* The Unicode string has 
been modified: reset the hash */ #define _PyUnicode_DIRTY(op) do { _PyUnicode_HASH(op) = -1; } while (0) - /* This dictionary holds all interned unicode strings. Note that references to strings in this dictionary are *not* counted in the string's ob_refcnt. When the interned string reaches a refcnt of 0 the string deallocation @@ -441,7 +441,7 @@ _PyUnicode_STATE(unicode).compact = 0; _PyUnicode_STATE(unicode).ready = 0; _PyUnicode_STATE(unicode).ascii = 0; - unicode->data.any = NULL; + _PyUnicode_DATA_ANY(unicode) = NULL; _PyUnicode_LENGTH(unicode) = 0; _PyUnicode_UTF8(unicode) = NULL; _PyUnicode_UTF8_LENGTH(unicode) = 0; @@ -815,7 +815,7 @@ assert(!PyUnicode_IS_COMPACT(obj)); assert(_PyUnicode_KIND(obj) == PyUnicode_WCHAR_KIND); assert(_PyUnicode_WSTR(unicode) != NULL); - assert(unicode->data.any == NULL); + assert(_PyUnicode_DATA_ANY(unicode) == NULL); assert(_PyUnicode_UTF8(unicode) == NULL); /* Actually, it should neither be interned nor be anything else: */ assert(_PyUnicode_STATE(unicode).interned == SSTATE_NOT_INTERNED); @@ -830,8 +830,8 @@ return -1; if (maxchar < 256) { - unicode->data.any = PyObject_MALLOC(_PyUnicode_WSTR_LENGTH(unicode) + 1); - if (!unicode->data.any) { + _PyUnicode_DATA_ANY(unicode) = PyObject_MALLOC(_PyUnicode_WSTR_LENGTH(unicode) + 1); + if (!_PyUnicode_DATA_ANY(unicode)) { PyErr_NoMemory(); return -1; } @@ -842,7 +842,7 @@ _PyUnicode_LENGTH(unicode) = _PyUnicode_WSTR_LENGTH(unicode); _PyUnicode_STATE(unicode).kind = PyUnicode_1BYTE_KIND; if (maxchar < 128) { - _PyUnicode_UTF8(unicode) = unicode->data.any; + _PyUnicode_UTF8(unicode) = _PyUnicode_DATA_ANY(unicode); _PyUnicode_UTF8_LENGTH(unicode) = _PyUnicode_WSTR_LENGTH(unicode); } else { @@ -861,7 +861,7 @@ #if SIZEOF_WCHAR_T == 2 /* We can share representations and are done. 
*/ - unicode->data.any = _PyUnicode_WSTR(unicode); + _PyUnicode_DATA_ANY(unicode) = _PyUnicode_WSTR(unicode); PyUnicode_2BYTE_DATA(unicode)[_PyUnicode_WSTR_LENGTH(unicode)] = '\0'; _PyUnicode_LENGTH(unicode) = _PyUnicode_WSTR_LENGTH(unicode); _PyUnicode_STATE(unicode).kind = PyUnicode_2BYTE_KIND; @@ -869,9 +869,9 @@ _PyUnicode_UTF8_LENGTH(unicode) = 0; #else /* sizeof(wchar_t) == 4 */ - unicode->data.any = PyObject_MALLOC( + _PyUnicode_DATA_ANY(unicode) = PyObject_MALLOC( 2 * (_PyUnicode_WSTR_LENGTH(unicode) + 1)); - if (!unicode->data.any) { + if (!_PyUnicode_DATA_ANY(unicode)) { PyErr_NoMemory(); return -1; } @@ -894,8 +894,8 @@ /* in case the native representation is 2-bytes, we need to allocate a new normalized 4-byte version. */ length_wo_surrogates = _PyUnicode_WSTR_LENGTH(unicode) - num_surrogates; - unicode->data.any = PyObject_MALLOC(4 * (length_wo_surrogates + 1)); - if (!unicode->data.any) { + _PyUnicode_DATA_ANY(unicode) = PyObject_MALLOC(4 * (length_wo_surrogates + 1)); + if (!_PyUnicode_DATA_ANY(unicode)) { PyErr_NoMemory(); return -1; } @@ -914,7 +914,7 @@ #else assert(num_surrogates == 0); - unicode->data.any = _PyUnicode_WSTR(unicode); + _PyUnicode_DATA_ANY(unicode) = _PyUnicode_WSTR(unicode); _PyUnicode_LENGTH(unicode) = _PyUnicode_WSTR_LENGTH(unicode); _PyUnicode_UTF8(unicode) = NULL; _PyUnicode_UTF8_LENGTH(unicode) = 0; @@ -961,8 +961,8 @@ Py_TYPE(unicode)->tp_free((PyObject *)unicode); } else { - if (unicode->data.any) - PyObject_DEL(unicode->data.any); + if (_PyUnicode_DATA_ANY(unicode)) + PyObject_DEL(_PyUnicode_DATA_ANY(unicode)); Py_TYPE(unicode)->tp_free((PyObject *)unicode); } } @@ -11657,7 +11657,7 @@ /* If it is a two-block object, account for base object, and for character block if present. 
*/ size = sizeof(PyUnicodeObject); - if (v->data.any) + if (_PyUnicode_DATA_ANY(v)) size += (PyUnicode_GET_LENGTH(v) + 1) * PyUnicode_CHARACTER_SIZE(v); } @@ -12477,7 +12477,7 @@ _PyUnicode_UTF8_LENGTH(self) = 0; _PyUnicode_UTF8(self) = NULL; _PyUnicode_WSTR_LENGTH(self) = 0; - self->data.any = NULL; + _PyUnicode_DATA_ANY(self) = NULL; share_utf8 = 0; share_wstr = 0; @@ -12509,7 +12509,7 @@ goto onError; } - self->data.any = data; + _PyUnicode_DATA_ANY(self) = data; if (share_utf8) { _PyUnicode_UTF8_LENGTH(self) = length; _PyUnicode_UTF8(self) = data; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 03:45:19 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 03:45:19 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_unicode=5Fconvert=5Fwchar?= =?utf8?q?=5Fto=5Fucs4=28=29_cannot_fail?= Message-ID: http://hg.python.org/cpython/rev/43cd1d9552de changeset: 72595:43cd1d9552de user: Victor Stinner date: Sun Oct 02 21:33:54 2011 +0200 summary: unicode_convert_wchar_to_ucs4() cannot fail files: Objects/unicodeobject.c | 14 ++++---------- 1 files changed, 4 insertions(+), 10 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -590,7 +590,7 @@ This function assumes that unicode can hold one more code point than wstr characters for a terminating null character. 
*/ -static int +static void unicode_convert_wchar_to_ucs4(const wchar_t *begin, const wchar_t *end, PyUnicodeObject *unicode) { @@ -757,6 +757,7 @@ { const wchar_t *iter; + assert(num_surrogates != NULL && maxchar != NULL); if (num_surrogates == NULL || maxchar == NULL) { PyErr_SetString(PyExc_SystemError, "unexpected NULL arguments to " @@ -903,11 +904,7 @@ _PyUnicode_STATE(unicode).kind = PyUnicode_4BYTE_KIND; _PyUnicode_UTF8(unicode) = NULL; _PyUnicode_UTF8_LENGTH(unicode) = 0; - if (unicode_convert_wchar_to_ucs4(_PyUnicode_WSTR(unicode), end, - unicode) < 0) { - assert(0 && "ConvertWideCharToUCS4 failed"); - return -1; - } + unicode_convert_wchar_to_ucs4(_PyUnicode_WSTR(unicode), end, unicode); PyObject_FREE(_PyUnicode_WSTR(unicode)); _PyUnicode_WSTR(unicode) = NULL; _PyUnicode_WSTR_LENGTH(unicode) = 0; @@ -1081,10 +1078,7 @@ #if SIZEOF_WCHAR_T == 2 /* This is the only case which has to process surrogates, thus a simple copy loop is not enough and we need a function. */ - if (unicode_convert_wchar_to_ucs4(u, u + size, unicode) < 0) { - Py_DECREF(unicode); - return NULL; - } + unicode_convert_wchar_to_ucs4(u, u + size, unicode); #else assert(num_surrogates == 0); Py_MEMCPY(PyUnicode_4BYTE_DATA(unicode), u, size * 4); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 03:45:19 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 03:45:19 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FCopyCharacters?= =?utf8?q?=28=29_fails_when_copying_latin1_into_ascii?= Message-ID: http://hg.python.org/cpython/rev/d64b5c1e3d94 changeset: 72596:d64b5c1e3d94 user: Victor Stinner date: Sun Oct 02 23:33:16 2011 +0200 summary: PyUnicode_CopyCharacters() fails when copying latin1 into ascii files: Objects/unicodeobject.c | 63 +++++++++++++++++++++++++--- 1 files changed, 56 insertions(+), 7 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ 
b/Objects/unicodeobject.c @@ -455,6 +455,46 @@ return NULL; } +static const char* +unicode_kind_name(PyObject *unicode) +{ + assert(PyUnicode_Check(unicode)); + if (!PyUnicode_IS_COMPACT(unicode)) + { + if (!PyUnicode_IS_READY(unicode)) + return "wstr"; + switch(PyUnicode_KIND(unicode)) + { + case PyUnicode_1BYTE_KIND: + if (PyUnicode_IS_COMPACT_ASCII(unicode)) + return "legacy ascii"; + else + return "legacy latin1"; + case PyUnicode_2BYTE_KIND: + return "legacy UCS2"; + case PyUnicode_4BYTE_KIND: + return "legacy UCS4"; + default: + return ""; + } + } + assert(PyUnicode_IS_READY(unicode)); + switch(PyUnicode_KIND(unicode)) + { + case PyUnicode_1BYTE_KIND: + if (PyUnicode_IS_COMPACT_ASCII(unicode)) + return "ascii"; + else + return "compact latin1"; + case PyUnicode_2BYTE_KIND: + return "compact UCS2"; + case PyUnicode_4BYTE_KIND: + return "compact UCS4"; + default: + return ""; + } +} + #ifdef Py_DEBUG int unicode_new_new_calls = 0; @@ -672,8 +712,10 @@ to_kind = PyUnicode_KIND(to); to_data = PyUnicode_DATA(to); - if (from_kind == to_kind) { - /* fast path */ + if (from_kind == to_kind + /* deny latin1 => ascii */ + && PyUnicode_MAX_CHAR_VALUE(to) >= PyUnicode_MAX_CHAR_VALUE(from)) + { Py_MEMCPY((char*)to_data + PyUnicode_KIND_SIZE(to_kind, to_start), (char*)from_data @@ -712,7 +754,14 @@ } else { int invalid_kinds; - if (from_kind > to_kind) { + + /* check if max_char(from substring) <= max_char(to) */ + if (from_kind > to_kind + /* latin1 => ascii */ + || (PyUnicode_IS_COMPACT_ASCII(to) + && to_kind == PyUnicode_1BYTE_KIND + && !PyUnicode_IS_COMPACT_ASCII(from))) + { /* slow path to check for character overflow */ const Py_UCS4 to_maxchar = PyUnicode_MAX_CHAR_VALUE(to); Py_UCS4 ch, maxchar; @@ -736,10 +785,10 @@ invalid_kinds = 1; if (invalid_kinds) { PyErr_Format(PyExc_ValueError, - "Cannot copy UCS%u characters " - "into a string of UCS%u characters", - 1 << (from_kind - 1), - 1 << (to_kind -1)); + "Cannot copy %s characters " + "into a string of %s 
characters", + unicode_kind_name(from), + unicode_kind_name(to)); return -1; } } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 03:45:20 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 03:45:20 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Write_=5FPyUnicode=5FDump?= =?utf8?q?=28=29_to_help_debugging?= Message-ID: http://hg.python.org/cpython/rev/0f227b4bd20c changeset: 72597:0f227b4bd20c user: Victor Stinner date: Mon Oct 03 02:59:31 2011 +0200 summary: Write _PyUnicode_Dump() to help debugging files: Objects/unicodeobject.c | 23 +++++++++++++++++++++++ 1 files changed, 23 insertions(+), 0 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -515,6 +515,29 @@ printf("compact data %p\n", _PyUnicode_COMPACT_DATA(unicode)); return PyUnicode_DATA(unicode); } + +void +_PyUnicode_Dump(PyObject *op) +{ + PyASCIIObject *ascii = (PyASCIIObject *)op; + printf("%s: len=%zu, wstr=%p", + unicode_kind_name(op), + ascii->length, + ascii->wstr); + if (!ascii->state.ascii) { + PyCompactUnicodeObject *compact = (PyCompactUnicodeObject *)op; + printf(" (%zu), utf8=%p (%zu)", + compact->wstr_length, + compact->utf8, + compact->utf8_length); + } + if (!ascii->state.compact) { + PyUnicodeObject *unicode = (PyUnicodeObject *)op; + printf(", data=%p", + unicode->data.any); + } + printf("\n"); +} #endif PyObject * -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 03:45:21 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 03:45:21 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FREAD=5FCHAR=28?= =?utf8?q?=29_ensures_that_the_string_is_ready?= Message-ID: http://hg.python.org/cpython/rev/3a0af974f1b5 changeset: 72598:3a0af974f1b5 user: Victor Stinner date: Sun Oct 02 20:33:18 2011 +0200 summary: PyUnicode_READ_CHAR() ensures that the 
string is ready files: Include/unicodeobject.h | 18 ++++++++++-------- 1 files changed, 10 insertions(+), 8 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -429,14 +429,16 @@ PyUnicode_READ_CHAR, for multiple consecutive reads callers should cache kind and use PyUnicode_READ instead. */ #define PyUnicode_READ_CHAR(unicode, index) \ - ((Py_UCS4) \ - (PyUnicode_KIND((unicode)) == PyUnicode_1BYTE_KIND ? \ - ((const Py_UCS1 *)(PyUnicode_DATA((unicode))))[(index)] : \ - (PyUnicode_KIND((unicode)) == PyUnicode_2BYTE_KIND ? \ - ((const Py_UCS2 *)(PyUnicode_DATA((unicode))))[(index)] : \ - ((const Py_UCS4 *)(PyUnicode_DATA((unicode))))[(index)] \ - ) \ - )) + (assert(PyUnicode_Check(unicode)), \ + assert(PyUnicode_IS_READY(unicode)), \ + (Py_UCS4) \ + (PyUnicode_KIND((unicode)) == PyUnicode_1BYTE_KIND ? \ + ((const Py_UCS1 *)(PyUnicode_DATA((unicode))))[(index)] : \ + (PyUnicode_KIND((unicode)) == PyUnicode_2BYTE_KIND ? \ + ((const Py_UCS2 *)(PyUnicode_DATA((unicode))))[(index)] : \ + ((const Py_UCS4 *)(PyUnicode_DATA((unicode))))[(index)] \ + ) \ + )) /* Returns the length of the unicode string. 
The caller has to make sure that the string has it's canonical representation set before calling -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 04:02:35 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 04:02:35 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Add_=5FPyUnicode=5FHAS=5FUT?= =?utf8?q?F8=5FMEMORY=28=29_macro?= Message-ID: http://hg.python.org/cpython/rev/d7c96f8f79db changeset: 72599:d7c96f8f79db user: Victor Stinner date: Mon Oct 03 01:08:02 2011 +0200 summary: Add _PyUnicode_HAS_UTF8_MEMORY() macro files: Objects/unicodeobject.c | 17 +++++++++++------ 1 files changed, 11 insertions(+), 6 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -133,6 +133,15 @@ ((PyASCIIObject *)(op))->length) #define _PyUnicode_DATA_ANY(op) (((PyUnicodeObject*)(op))->data.any) +/* true if the Unicode object has an allocated UTF-8 memory block + (not shared with other data) */ +#define _PyUnicode_HAS_UTF8_MEMORY(op) \ + (assert(PyUnicode_Check(op)), \ + (!PyUnicode_IS_COMPACT_ASCII(op) \ + && _PyUnicode_UTF8(op) \ + && _PyUnicode_UTF8(op) != PyUnicode_DATA(op))) + + /* The Unicode string has been modified: reset the hash */ #define _PyUnicode_DIRTY(op) do { _PyUnicode_HASH(op) = -1; } while (0) @@ -1021,9 +1030,7 @@ (!PyUnicode_IS_READY(unicode) || _PyUnicode_WSTR(unicode) != PyUnicode_DATA(unicode))) PyObject_DEL(_PyUnicode_WSTR(unicode)); - if (!PyUnicode_IS_COMPACT_ASCII(unicode) - && _PyUnicode_UTF8(unicode) - && _PyUnicode_UTF8(unicode) != PyUnicode_DATA(unicode)) + if (_PyUnicode_HAS_UTF8_MEMORY(unicode)) PyObject_DEL(_PyUnicode_UTF8(unicode)); if (PyUnicode_IS_COMPACT(unicode)) { @@ -11735,9 +11742,7 @@ (!PyUnicode_IS_READY(v) || (PyUnicode_DATA(v) != _PyUnicode_WSTR(v)))) size += (PyUnicode_WSTR_LENGTH(v) + 1) * sizeof(wchar_t); - if (!PyUnicode_IS_COMPACT_ASCII(v) - && _PyUnicode_UTF8(v) - && 
_PyUnicode_UTF8(v) != PyUnicode_DATA(v)) + if (_PyUnicode_HAS_UTF8_MEMORY(v)) size += PyUnicode_UTF8_LENGTH(v) + 1; return PyLong_FromSsize_t(size); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 04:02:36 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 04:02:36 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Rewrite_PyUnicode=5FResize?= =?utf8?b?KCk=?= Message-ID: http://hg.python.org/cpython/rev/970c4aa11825 changeset: 72600:970c4aa11825 user: Victor Stinner date: Mon Oct 03 03:52:20 2011 +0200 summary: Rewrite PyUnicode_Resize() * Rename _PyUnicode_Resize() to unicode_resize() * unicode_resize() creates a copy if the string cannot be resized instead of failing * Optimize resize_copy() for wstr strings * Disable temporary resize_inplace() files: Objects/unicodeobject.c | 330 ++++++++++++++++++--------- 1 files changed, 221 insertions(+), 109 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -193,6 +193,8 @@ 0, 0, 0, 0, 0, 0, 0, 0 }; +static PyUnicodeObject *_PyUnicode_New(Py_ssize_t length); + static PyObject * unicode_encode_call_errorhandler(const char *errors, PyObject **errorHandler,const char *encoding, const char *reason, @@ -320,71 +322,144 @@ return NULL; } +static PyObject* +resize_compact(PyObject *unicode, Py_ssize_t length) +{ + Py_ssize_t char_size; + Py_ssize_t struct_size; + Py_ssize_t new_size; + int share_wstr; + + assert(PyUnicode_IS_READY(unicode)); + char_size = PyUnicode_CHARACTER_SIZE(unicode); + if (PyUnicode_IS_COMPACT_ASCII(unicode)) + struct_size = sizeof(PyASCIIObject); + else + struct_size = sizeof(PyCompactUnicodeObject); + share_wstr = (_PyUnicode_WSTR(unicode) == PyUnicode_DATA(unicode)); + + _Py_DEC_REFTOTAL; + _Py_ForgetReference(unicode); + + if (length > ((PY_SSIZE_T_MAX - struct_size) / char_size - 1)) { + PyErr_NoMemory(); + return NULL; + } + new_size = 
(struct_size + (length + 1) * char_size); + + unicode = (PyObject *)PyObject_REALLOC((char *)unicode, new_size); + if (unicode == NULL) { + PyObject_Del(unicode); + PyErr_NoMemory(); + return NULL; + } + _Py_NewReference(unicode); + _PyUnicode_LENGTH(unicode) = length; + if (share_wstr) + _PyUnicode_WSTR(unicode) = PyUnicode_DATA(unicode); + PyUnicode_WRITE(PyUnicode_KIND(unicode), PyUnicode_DATA(unicode), + length, 0); + return unicode; +} + static int -unicode_resize(register PyUnicodeObject *unicode, - Py_ssize_t length) +resize_inplace(register PyUnicodeObject *unicode, Py_ssize_t length) { void *oldstr; - /* Resizing is only supported for old unicode objects. */ assert(!PyUnicode_IS_COMPACT(unicode)); - assert(_PyUnicode_WSTR(unicode) != NULL); - - /* ... and only if they have not been readied yet, because - callees usually rely on the wstr representation when resizing. */ - assert(unicode->data.any == NULL); - - /* Shortcut if there's nothing much to do. */ - if (_PyUnicode_WSTR_LENGTH(unicode) == length) - goto reset; - - /* Resizing shared object (unicode_empty or single character - objects) in-place is not allowed. Use PyUnicode_Resize() - instead ! */ - - if (unicode == unicode_empty || - (_PyUnicode_WSTR_LENGTH(unicode) == 1 && - _PyUnicode_WSTR(unicode)[0] < 256U && - unicode_latin1[_PyUnicode_WSTR(unicode)[0]] == unicode)) { - PyErr_SetString(PyExc_SystemError, - "can't resize shared str objects"); - return -1; - } - - /* We allocate one more byte to make sure the string is Ux0000 terminated. - The overallocation is also used by fastsearch, which assumes that it's - safe to look at str[length] (without making any assumptions about what - it contains). 
*/ - - oldstr = _PyUnicode_WSTR(unicode); - _PyUnicode_WSTR(unicode) = PyObject_REALLOC(_PyUnicode_WSTR(unicode), - sizeof(Py_UNICODE) * (length + 1)); - if (!_PyUnicode_WSTR(unicode)) { - _PyUnicode_WSTR(unicode) = (Py_UNICODE *)oldstr; - PyErr_NoMemory(); - return -1; - } - _PyUnicode_WSTR(unicode)[length] = 0; - _PyUnicode_WSTR_LENGTH(unicode) = length; - - reset: - if (unicode->data.any != NULL) { - PyObject_FREE(unicode->data.any); - if (_PyUnicode_UTF8(unicode) && _PyUnicode_UTF8(unicode) != unicode->data.any) { - PyObject_FREE(_PyUnicode_UTF8(unicode)); - } + + assert(Py_REFCNT(unicode) == 1); + _PyUnicode_DIRTY(unicode); + + if (_PyUnicode_HAS_UTF8_MEMORY(unicode)) + { + PyObject_DEL(_PyUnicode_UTF8(unicode)); _PyUnicode_UTF8(unicode) = NULL; - _PyUnicode_UTF8_LENGTH(unicode) = 0; - unicode->data.any = NULL; - _PyUnicode_LENGTH(unicode) = 0; - _PyUnicode_STATE(unicode).interned = _PyUnicode_STATE(unicode).interned; - _PyUnicode_STATE(unicode).kind = PyUnicode_WCHAR_KIND; - } - _PyUnicode_DIRTY(unicode); - + } + + if (PyUnicode_IS_READY(unicode)) { + Py_ssize_t char_size; + Py_ssize_t new_size; + int share_wstr; + void *data; + + data = _PyUnicode_DATA_ANY(unicode); + assert(data != NULL); + char_size = PyUnicode_CHARACTER_SIZE(unicode); + share_wstr = (_PyUnicode_WSTR(unicode) == data); + + if (length > (PY_SSIZE_T_MAX / char_size - 1)) { + PyErr_NoMemory(); + return -1; + } + new_size = (length + 1) * char_size; + + data = (PyObject *)PyObject_REALLOC(data, new_size); + if (data == NULL) { + PyErr_NoMemory(); + return -1; + } + _PyUnicode_DATA_ANY(unicode) = data; + if (share_wstr) + _PyUnicode_WSTR(unicode) = data; + _PyUnicode_LENGTH(unicode) = length; + PyUnicode_WRITE(PyUnicode_KIND(unicode), data, length, 0); + if (share_wstr) + return 0; + } + if (_PyUnicode_WSTR(unicode) != NULL) { + assert(_PyUnicode_WSTR(unicode) != NULL); + + oldstr = _PyUnicode_WSTR(unicode); + _PyUnicode_WSTR(unicode) = PyObject_REALLOC(_PyUnicode_WSTR(unicode), + 
sizeof(Py_UNICODE) * (length + 1)); + if (!_PyUnicode_WSTR(unicode)) { + _PyUnicode_WSTR(unicode) = (Py_UNICODE *)oldstr; + PyErr_NoMemory(); + return -1; + } + _PyUnicode_WSTR(unicode)[length] = 0; + _PyUnicode_WSTR_LENGTH(unicode) = length; + } return 0; } +static PyObject* +resize_copy(PyObject *unicode, Py_ssize_t length) +{ + Py_ssize_t copy_length; + if (PyUnicode_IS_COMPACT(unicode)) { + PyObject *copy; + assert(PyUnicode_IS_READY(unicode)); + + copy = PyUnicode_New(length, PyUnicode_MAX_CHAR_VALUE(unicode)); + if (copy == NULL) + return NULL; + + copy_length = Py_MIN(length, PyUnicode_GET_LENGTH(unicode)); + if (PyUnicode_CopyCharacters(copy, 0, + unicode, 0, + copy_length) < 0) + { + Py_DECREF(copy); + return NULL; + } + return copy; + } else { + assert(_PyUnicode_WSTR(unicode) != NULL); + assert(_PyUnicode_DATA_ANY(unicode) == NULL); + PyUnicodeObject *w = _PyUnicode_New(length); + if (w == NULL) + return NULL; + copy_length = _PyUnicode_WSTR_LENGTH(unicode); + copy_length = Py_MIN(copy_length, length); + Py_UNICODE_COPY(_PyUnicode_WSTR(w), _PyUnicode_WSTR(unicode), + copy_length); + return (PyObject*)w; + } +} + /* We allocate one more byte to make sure the string is Ux0000 terminated; some code (e.g. new_identifier) relies on that. 
@@ -690,7 +765,6 @@ assert(ucs4_out == (PyUnicode_4BYTE_DATA(unicode) + _PyUnicode_GET_LENGTH(unicode))); - return 0; } #endif @@ -1044,50 +1118,84 @@ } static int -_PyUnicode_Resize(PyUnicodeObject **unicode, Py_ssize_t length) -{ - register PyUnicodeObject *v; - - /* Argument checks */ - if (unicode == NULL) { +unicode_resizable(PyObject *unicode) +{ + if (Py_REFCNT(unicode) != 1) + return 0; + if (PyUnicode_CHECK_INTERNED(unicode)) + return 0; + if (unicode == unicode_empty) + return 0; + if (PyUnicode_WSTR_LENGTH(unicode) == 1) { + Py_UCS4 ch; + if (PyUnicode_IS_COMPACT(unicode)) + ch = PyUnicode_READ_CHAR(unicode, 0); + else + ch = _PyUnicode_WSTR(unicode)[0]; + if (ch < 256 && unicode_latin1[ch] == unicode) + return 0; + } + /* FIXME: reenable resize_inplace */ + if (!PyUnicode_IS_COMPACT(unicode)) + return 0; + return 1; +} + +static int +unicode_resize(PyObject **p_unicode, Py_ssize_t length) +{ + PyObject *unicode; + Py_ssize_t old_length; + + assert(p_unicode != NULL); + unicode = *p_unicode; + + assert(unicode != NULL); + assert(PyUnicode_Check(unicode)); + assert(0 <= length); + + if (!PyUnicode_IS_COMPACT(unicode) && !PyUnicode_IS_READY(unicode)) + old_length = PyUnicode_WSTR_LENGTH(unicode); + else + old_length = PyUnicode_GET_LENGTH(unicode); + if (old_length == length) + return 0; + + /* FIXME: really create a new object? 
*/ + if (!unicode_resizable(unicode)) { + PyObject *copy = resize_copy(unicode, length); + if (copy == NULL) + return -1; + Py_DECREF(*p_unicode); + *p_unicode = copy; + return 0; + } + + if (PyUnicode_IS_COMPACT(unicode)) { + *p_unicode = resize_compact(unicode, length); + if (*p_unicode == NULL) + return -1; + return 0; + } else + return resize_inplace((PyUnicodeObject*)unicode, length); +} + +int +PyUnicode_Resize(PyObject **p_unicode, Py_ssize_t length) +{ + PyObject *unicode; + if (p_unicode == NULL) { PyErr_BadInternalCall(); return -1; } - v = *unicode; - if (v == NULL || !PyUnicode_Check(v) || Py_REFCNT(v) != 1 || length < 0 || - PyUnicode_IS_COMPACT(v) || _PyUnicode_WSTR(v) == NULL) { + unicode = *p_unicode; + if (unicode == NULL || !PyUnicode_Check(unicode) || length < 0 + || _PyUnicode_KIND(unicode) != PyUnicode_WCHAR_KIND) + { PyErr_BadInternalCall(); return -1; } - - /* Resizing unicode_empty and single character objects is not - possible since these are being shared. - The same goes for new-representation unicode objects or objects which - have already been readied. - For these, we simply return a fresh copy with the same Unicode content. - */ - if ((_PyUnicode_WSTR_LENGTH(v) != length && - (v == unicode_empty || _PyUnicode_WSTR_LENGTH(v) == 1)) || - PyUnicode_IS_COMPACT(v) || v->data.any) { - PyUnicodeObject *w = _PyUnicode_New(length); - if (w == NULL) - return -1; - Py_UNICODE_COPY(_PyUnicode_WSTR(w), _PyUnicode_WSTR(v), - length < _PyUnicode_WSTR_LENGTH(v) ? length : _PyUnicode_WSTR_LENGTH(v)); - Py_DECREF(*unicode); - *unicode = w; - return 0; - } - - /* Note that we don't have to modify *unicode for unshared Unicode - objects, since we can modify them in-place. 
*/ - return unicode_resize(v, length); -} - -int -PyUnicode_Resize(PyObject **unicode, Py_ssize_t length) -{ - return _PyUnicode_Resize((PyUnicodeObject **)unicode, length); + return unicode_resize(p_unicode, length); } static PyObject* @@ -3085,7 +3193,7 @@ if (requiredsize > outsize) { if (requiredsize<2*outsize) requiredsize = 2*outsize; - if (_PyUnicode_Resize(output, requiredsize) < 0) + if (PyUnicode_Resize((PyObject**)output, requiredsize) < 0) goto onError; *outptr = PyUnicode_AS_UNICODE(*output) + *outpos; } @@ -3375,7 +3483,7 @@ } } - if (_PyUnicode_Resize(&unicode, p - PyUnicode_AS_UNICODE(unicode)) < 0) + if (PyUnicode_Resize((PyObject**)&unicode, p - PyUnicode_AS_UNICODE(unicode)) < 0) goto onError; Py_XDECREF(errorHandler); @@ -3944,7 +4052,7 @@ /* Adjust length and ready string when it contained errors and is of the old resizable kind. */ if (kind == PyUnicode_WCHAR_KIND) { - if (_PyUnicode_Resize(&unicode, i) < 0 || + if (PyUnicode_Resize((PyObject**)&unicode, i) < 0 || PyUnicode_READY(unicode) == -1) goto onError; } @@ -4449,7 +4557,7 @@ *consumed = (const char *)q-starts; /* Adjust length */ - if (_PyUnicode_Resize(&unicode, p - PyUnicode_AS_UNICODE(unicode)) < 0) + if (PyUnicode_Resize((PyObject**)&unicode, p - PyUnicode_AS_UNICODE(unicode)) < 0) goto onError; Py_XDECREF(errorHandler); @@ -4847,7 +4955,7 @@ *consumed = (const char *)q-starts; /* Adjust length */ - if (_PyUnicode_Resize(&unicode, p - PyUnicode_AS_UNICODE(unicode)) < 0) + if (PyUnicode_Resize((PyObject**)&unicode, p - PyUnicode_AS_UNICODE(unicode)) < 0) goto onError; Py_XDECREF(errorHandler); @@ -5304,9 +5412,13 @@ /* Ensure the length prediction worked in case of ASCII strings */ assert(kind == PyUnicode_WCHAR_KIND || i == ascii_length); - if (kind == PyUnicode_WCHAR_KIND && (_PyUnicode_Resize(&v, i) < 0 || - PyUnicode_READY(v) == -1)) - goto onError; + if (kind == PyUnicode_WCHAR_KIND) + { + if (PyUnicode_Resize((PyObject**)&v, i) < 0) + goto onError; + if (PyUnicode_READY(v) == 
-1) + goto onError; + } Py_XDECREF(errorHandler); Py_XDECREF(exc); return (PyObject *)v; @@ -5602,7 +5714,7 @@ nextByte: ; } - if (_PyUnicode_Resize(&v, p - PyUnicode_AS_UNICODE(v)) < 0) + if (PyUnicode_Resize((PyObject**)&v, p - PyUnicode_AS_UNICODE(v)) < 0) goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); @@ -5790,7 +5902,7 @@ } } - if (_PyUnicode_Resize(&v, p - PyUnicode_AS_UNICODE(v)) < 0) + if (PyUnicode_Resize((PyObject**)&v, p - PyUnicode_AS_UNICODE(v)) < 0) goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); @@ -6216,7 +6328,7 @@ } } if (p - PyUnicode_AS_UNICODE(v) < PyUnicode_GET_SIZE(v)) - if (_PyUnicode_Resize(&v, p - PyUnicode_AS_UNICODE(v)) < 0) + if (PyUnicode_Resize((PyObject**)&v, p - PyUnicode_AS_UNICODE(v)) < 0) goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); @@ -6343,7 +6455,7 @@ else { /* Extend unicode object */ n = PyUnicode_GET_SIZE(*v); - if (_PyUnicode_Resize(v, n + usize) < 0) + if (PyUnicode_Resize(v, n + usize) < 0) return -1; } @@ -6682,7 +6794,7 @@ (targetsize << 2); extrachars += needed; /* XXX overflow detection missing */ - if (_PyUnicode_Resize(&v, + if (PyUnicode_Resize((PyObject**)&v, PyUnicode_GET_SIZE(v) + needed) < 0) { Py_DECREF(x); goto onError; @@ -6709,7 +6821,7 @@ } } if (p - PyUnicode_AS_UNICODE(v) < PyUnicode_GET_SIZE(v)) - if (_PyUnicode_Resize(&v, p - PyUnicode_AS_UNICODE(v)) < 0) + if (PyUnicode_Resize((PyObject**)&v, p - PyUnicode_AS_UNICODE(v)) < 0) goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 04:02:36 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 04:02:36 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FAppend=28=29_no?= =?utf8?q?w_works_in-place_when_it=27s_possible?= Message-ID: http://hg.python.org/cpython/rev/c346f879afbc changeset: 72601:c346f879afbc user: Victor Stinner date: Mon Oct 03 03:54:37 2011 +0200 summary: PyUnicode_Append() 
now works in-place when it's possible files: Objects/unicodeobject.c | 85 ++++++++++++++++++++++++---- 1 files changed, 73 insertions(+), 12 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -9775,19 +9775,80 @@ } void -PyUnicode_Append(PyObject **pleft, PyObject *right) -{ - PyObject *new; - if (*pleft == NULL) +PyUnicode_Append(PyObject **p_left, PyObject *right) +{ + PyObject *left, *res; + + if (p_left == NULL) { + if (!PyErr_Occurred()) + PyErr_BadInternalCall(); return; - if (right == NULL || !PyUnicode_Check(*pleft)) { - Py_DECREF(*pleft); - *pleft = NULL; - return; - } - new = PyUnicode_Concat(*pleft, right); - Py_DECREF(*pleft); - *pleft = new; + } + left = *p_left; + if (right == NULL || !PyUnicode_Check(left)) { + if (!PyErr_Occurred()) + PyErr_BadInternalCall(); + goto error; + } + + if (PyUnicode_CheckExact(left) && left != unicode_empty + && PyUnicode_CheckExact(right) && right != unicode_empty + && unicode_resizable(left) + && (_PyUnicode_KIND(right) <= _PyUnicode_KIND(left) + || _PyUnicode_WSTR(left) != NULL)) + { + Py_ssize_t u_len, v_len, new_len, copied; + + /* FIXME: don't make wstr string ready */ + if (PyUnicode_READY(left)) + goto error; + if (PyUnicode_READY(right)) + goto error; + + /* FIXME: support ascii+latin1, PyASCIIObject => PyCompactUnicodeObject */ + if (PyUnicode_MAX_CHAR_VALUE(right) <= PyUnicode_MAX_CHAR_VALUE(left)) + { + u_len = PyUnicode_GET_LENGTH(left); + v_len = PyUnicode_GET_LENGTH(right); + if (u_len > PY_SSIZE_T_MAX - v_len) { + PyErr_SetString(PyExc_OverflowError, + "strings are too large to concat"); + goto error; + } + new_len = u_len + v_len; + + /* Now we own the last reference to 'left', so we can resize it + * in-place. + */ + if (unicode_resize(&left, new_len) != 0) { + /* XXX if _PyUnicode_Resize() fails, 'left' has been + * deallocated so it cannot be put back into + * 'variable'. 
The MemoryError is raised when there + * is no value in 'variable', which might (very + * remotely) be a cause of incompatibilities. + */ + goto error; + } + /* copy 'right' into the newly allocated area of 'left' */ + copied = PyUnicode_CopyCharacters(left, u_len, + right, 0, + v_len); + assert(0 <= copied); + *p_left = left; + return; + } + } + + res = PyUnicode_Concat(left, right); + if (res == NULL) + goto error; + Py_DECREF(left); + *p_left = res; + return; + +error: + Py_DECREF(*p_left); + *p_left = NULL; } void -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 04:02:37 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 04:02:37 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_In_release_mode=2C_PyUnicod?= =?utf8?q?e=5FInternInPlace=28=29_does_nothing_if_the_input_is_NULL_or?= Message-ID: http://hg.python.org/cpython/rev/6ca9bb281c37 changeset: 72602:6ca9bb281c37 user: Victor Stinner date: Mon Oct 03 02:01:52 2011 +0200 summary: In release mode, PyUnicode_InternInPlace() does nothing if the input is NULL or not a unicode, instead of failing with a fatal error. Use assertions in debug mode (provide better error messages). files: Objects/unicodeobject.c | 10 +++++++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -12893,9 +12893,13 @@ { register PyUnicodeObject *s = (PyUnicodeObject *)(*p); PyObject *t; +#ifdef Py_DEBUG + assert(s != NULL); + assert(_PyUnicode_CHECK(s)); +#else if (s == NULL || !PyUnicode_Check(s)) - Py_FatalError( - "PyUnicode_InternInPlace: unicode strings only please!"); + return; +#endif /* If it's a subclass, we don't really know what putting it in the interned dict might do. 
*/ if (!PyUnicode_CheckExact(s)) @@ -12903,7 +12907,7 @@ if (PyUnicode_CHECK_INTERNED(s)) return; if (PyUnicode_READY(s) == -1) { - assert(0 && "ready fail in intern..."); + assert(0 && "PyUnicode_READY fail in PyUnicode_InternInPlace"); return; } if (interned == NULL) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 04:02:38 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 04:02:38 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Add_=5FPyUnicode=5FCheckCon?= =?utf8?q?sistency=28=29_macro_to_help_debugging?= Message-ID: http://hg.python.org/cpython/rev/dbb9313c3ed8 changeset: 72603:dbb9313c3ed8 user: Victor Stinner date: Mon Oct 03 03:20:16 2011 +0200 summary: Add _PyUnicode_CheckConsistency() macro to help debugging * Document Unicode string states * Use _PyUnicode_CheckConsistency() to ensure that objects are always consistent. files: Include/unicodeobject.h | 46 +++++++++ Objects/unicodeobject.c | 135 ++++++++++++++++++++------- 2 files changed, 144 insertions(+), 37 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -206,6 +206,52 @@ immediately follow the structure. utf8_length and wstr_length can be found in the length field; the utf8 pointer is equal to the data pointer. 
*/ typedef struct { + /* Unicode strings can be in 4 states: + + - compact ascii: + + * structure = PyASCIIObject + * kind = PyUnicode_1BYTE_KIND + * compact = 1 + * ascii = 1 + * ready = 1 + * utf8 = data + + - compact: + + * structure = PyCompactUnicodeObject + * kind = PyUnicode_1BYTE_KIND, PyUnicode_2BYTE_KIND or + PyUnicode_4BYTE_KIND + * compact = 1 + * ready = 1 + * (ascii = 0) + + - string created by the legacy API (not ready): + + * structure = PyUnicodeObject + * kind = PyUnicode_WCHAR_KIND + * compact = 0 + * ready = 0 + * wstr is not NULL + * data.any is NULL + * utf8 is NULL + * interned = SSTATE_NOT_INTERNED + * (ascii = 0) + + - string created by the legacy API, ready: + + * structure = PyUnicodeObject structure + * kind = PyUnicode_1BYTE_KIND, PyUnicode_2BYTE_KIND or + PyUnicode_4BYTE_KIND + * compact = 0 + * ready = 1 + * data.any is not NULL + * (ascii = 0) + + String created by the legacy API becomes ready when calling + PyUnicode_READY(). + + See also _PyUnicode_CheckConsistency(). */ PyObject_HEAD Py_ssize_t length; /* Number of code points in the string */ Py_hash_t hash; /* Hash value; -1 if not set */ diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -89,25 +89,16 @@ extern "C" { #endif -/* Generic helper macro to convert characters of different types. - from_type and to_type have to be valid type names, begin and end - are pointers to the source characters which should be of type - "from_type *". to is a pointer of type "to_type *" and points to the - buffer where the result characters are written to. 
*/ -#define _PyUnicode_CONVERT_BYTES(from_type, to_type, begin, end, to) \ - do { \ - const from_type *iter_; to_type *to_; \ - for (iter_ = (begin), to_ = (to_type *)(to); \ - iter_ < (end); \ - ++iter_, ++to_) { \ - *to_ = (to_type)*iter_; \ - } \ - } while (0) +#ifdef Py_DEBUG +# define _PyUnicode_CHECK(op) _PyUnicode_CheckConsistency(op) +#else +# define _PyUnicode_CHECK(op) PyUnicode_Check(op) +#endif #define _PyUnicode_UTF8(op) \ (((PyCompactUnicodeObject*)(op))->utf8) #define PyUnicode_UTF8(op) \ - (assert(PyUnicode_Check(op)), \ + (assert(_PyUnicode_CHECK(op)), \ assert(PyUnicode_IS_READY(op)), \ PyUnicode_IS_COMPACT_ASCII(op) ? \ ((char*)((PyASCIIObject*)(op) + 1)) : \ @@ -115,7 +106,7 @@ #define _PyUnicode_UTF8_LENGTH(op) \ (((PyCompactUnicodeObject*)(op))->utf8_length) #define PyUnicode_UTF8_LENGTH(op) \ - (assert(PyUnicode_Check(op)), \ + (assert(_PyUnicode_CHECK(op)), \ assert(PyUnicode_IS_READY(op)), \ PyUnicode_IS_COMPACT_ASCII(op) ? \ ((PyASCIIObject*)(op))->length : \ @@ -125,22 +116,42 @@ #define _PyUnicode_LENGTH(op) (((PyASCIIObject *)(op))->length) #define _PyUnicode_STATE(op) (((PyASCIIObject *)(op))->state) #define _PyUnicode_HASH(op) (((PyASCIIObject *)(op))->hash) -#define _PyUnicode_KIND(op) \ - (assert(PyUnicode_Check(op)), \ +#define _PyUnicode_KIND(op) \ + (assert(_PyUnicode_CHECK(op)), \ ((PyASCIIObject *)(op))->state.kind) -#define _PyUnicode_GET_LENGTH(op) \ - (assert(PyUnicode_Check(op)), \ +#define _PyUnicode_GET_LENGTH(op) \ + (assert(_PyUnicode_CHECK(op)), \ ((PyASCIIObject *)(op))->length) #define _PyUnicode_DATA_ANY(op) (((PyUnicodeObject*)(op))->data.any) +#undef PyUnicode_READY +#define PyUnicode_READY(op) \ + (assert(_PyUnicode_CHECK(op)), \ + (PyUnicode_IS_READY(op) ? 
\ + 0 : _PyUnicode_Ready((PyObject *)(op)))) + /* true if the Unicode object has an allocated UTF-8 memory block (not shared with other data) */ -#define _PyUnicode_HAS_UTF8_MEMORY(op) \ - (assert(PyUnicode_Check(op)), \ - (!PyUnicode_IS_COMPACT_ASCII(op) \ - && _PyUnicode_UTF8(op) \ +#define _PyUnicode_HAS_UTF8_MEMORY(op) \ + (assert(_PyUnicode_CHECK(op)), \ + (!PyUnicode_IS_COMPACT_ASCII(op) \ + && _PyUnicode_UTF8(op) \ && _PyUnicode_UTF8(op) != PyUnicode_DATA(op))) +/* Generic helper macro to convert characters of different types. + from_type and to_type have to be valid type names, begin and end + are pointers to the source characters which should be of type + "from_type *". to is a pointer of type "to_type *" and points to the + buffer where the result characters are written to. */ +#define _PyUnicode_CONVERT_BYTES(from_type, to_type, begin, end, to) \ + do { \ + const from_type *iter_; to_type *to_; \ + for (iter_ = (begin), to_ = (to_type *)(to); \ + iter_ < (end); \ + ++iter_, ++to_) { \ + *to_ = (to_type)*iter_; \ + } \ + } while (0) /* The Unicode string has been modified: reset the hash */ #define _PyUnicode_DIRTY(op) do { _PyUnicode_HASH(op) = -1; } while (0) @@ -250,6 +261,57 @@ #endif } +#ifdef Py_DEBUG +static int +_PyUnicode_CheckConsistency(void *op) +{ + PyASCIIObject *ascii; + unsigned int kind; + + assert(PyUnicode_Check(op)); + + ascii = (PyASCIIObject *)op; + kind = ascii->state.kind; + + if (ascii->state.ascii == 1) { + assert(kind == PyUnicode_1BYTE_KIND); + assert(ascii->state.compact == 1); + assert(ascii->state.ready == 1); + } + else if (ascii->state.compact == 1) { + assert(kind == PyUnicode_1BYTE_KIND + || kind == PyUnicode_2BYTE_KIND + || kind == PyUnicode_4BYTE_KIND); + assert(ascii->state.compact == 1); + assert(ascii->state.ascii == 0); + assert(ascii->state.ready == 1); + } else { + PyCompactUnicodeObject *compact = (PyCompactUnicodeObject *)op; + PyUnicodeObject *unicode = (PyUnicodeObject *)op; + + if (kind == 
PyUnicode_WCHAR_KIND) { + assert(!ascii->state.compact == 1); + assert(ascii->state.ascii == 0); + assert(!ascii->state.ready == 1); + assert(ascii->wstr != NULL); + assert(unicode->data.any == NULL); + assert(compact->utf8 == NULL); + assert(ascii->state.interned == SSTATE_NOT_INTERNED); + } + else { + assert(kind == PyUnicode_1BYTE_KIND + || kind == PyUnicode_2BYTE_KIND + || kind == PyUnicode_4BYTE_KIND); + assert(!ascii->state.compact == 1); + assert(ascii->state.ready == 1); + assert(unicode->data.any != NULL); + assert(ascii->state.ascii == 0); + } + } + return 1; +} +#endif + /* --- Bloom Filters ----------------------------------------------------- */ /* stuff to implement simple "bloom filters" for Unicode characters. @@ -542,7 +604,7 @@ static const char* unicode_kind_name(PyObject *unicode) { - assert(PyUnicode_Check(unicode)); + assert(_PyUnicode_CHECK(unicode)); if (!PyUnicode_IS_COMPACT(unicode)) { if (!PyUnicode_IS_READY(unicode)) @@ -744,7 +806,8 @@ const wchar_t *iter; Py_UCS4 *ucs4_out; - assert(unicode && PyUnicode_Check(unicode)); + assert(unicode != NULL); + assert(_PyUnicode_CHECK(unicode)); assert(_PyUnicode_KIND(unicode) == PyUnicode_4BYTE_KIND); ucs4_out = PyUnicode_4BYTE_DATA(unicode); @@ -771,7 +834,7 @@ static int _PyUnicode_Dirty(PyObject *unicode) { - assert(PyUnicode_Check(unicode)); + assert(_PyUnicode_CHECK(unicode)); if (Py_REFCNT(unicode) != 1) { PyErr_SetString(PyExc_ValueError, "Cannot modify a string having more than 1 reference"); @@ -966,10 +1029,8 @@ strings were created using _PyObject_New() and where no canonical representation (the str field) has been set yet aka strings which are not yet ready. 
*/ - assert(PyUnicode_Check(obj)); - assert(!PyUnicode_IS_READY(obj)); - assert(!PyUnicode_IS_COMPACT(obj)); - assert(_PyUnicode_KIND(obj) == PyUnicode_WCHAR_KIND); + assert(_PyUnicode_CHECK(unicode)); + assert(_PyUnicode_KIND(unicode) == PyUnicode_WCHAR_KIND); assert(_PyUnicode_WSTR(unicode) != NULL); assert(_PyUnicode_DATA_ANY(unicode) == NULL); assert(_PyUnicode_UTF8(unicode) == NULL); @@ -1154,7 +1215,7 @@ assert(PyUnicode_Check(unicode)); assert(0 <= length); - if (!PyUnicode_IS_COMPACT(unicode) && !PyUnicode_IS_READY(unicode)) + if (_PyUnicode_KIND(unicode) == PyUnicode_WCHAR_KIND) old_length = PyUnicode_WSTR_LENGTH(unicode); else old_length = PyUnicode_GET_LENGTH(unicode); @@ -1907,7 +1968,7 @@ case 'U': { PyObject *obj = va_arg(count, PyObject *); - assert(obj && PyUnicode_Check(obj)); + assert(obj && _PyUnicode_CHECK(obj)); if (PyUnicode_READY(obj) == -1) goto fail; argmaxchar = PyUnicode_MAX_CHAR_VALUE(obj); @@ -1921,7 +1982,7 @@ const char *str = va_arg(count, const char *); PyObject *str_obj; assert(obj || str); - assert(!obj || PyUnicode_Check(obj)); + assert(!obj || _PyUnicode_CHECK(obj)); if (obj) { if (PyUnicode_READY(obj) == -1) goto fail; @@ -9570,7 +9631,7 @@ void *data; Py_UCS4 chr; - assert(PyUnicode_Check(uni)); + assert(_PyUnicode_CHECK(uni)); if (PyUnicode_READY(uni) == -1) return -1; kind = PyUnicode_KIND(uni); @@ -12698,7 +12759,7 @@ unicode = (PyUnicodeObject *)unicode_new(&PyUnicode_Type, args, kwds); if (unicode == NULL) return NULL; - assert(PyUnicode_Check(unicode)); + assert(_PyUnicode_CHECK(unicode)); if (PyUnicode_READY(unicode)) return NULL; @@ -13054,7 +13115,7 @@ seq = it->it_seq; if (seq == NULL) return NULL; - assert(PyUnicode_Check(seq)); + assert(_PyUnicode_CHECK(seq)); if (it->it_index < PyUnicode_GET_LENGTH(seq)) { int kind = PyUnicode_KIND(seq); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 04:02:39 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 
2011 04:02:39 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Use_PyUnicode=5FWCHAR=5FKIN?= =?utf8?q?D_to_check_if_a_string_is_a_wstr_string?= Message-ID: http://hg.python.org/cpython/rev/2f9ac1eb1a99 changeset: 72604:2f9ac1eb1a99 user: Victor Stinner date: Mon Oct 03 02:16:37 2011 +0200 summary: Use PyUnicode_WCHAR_KIND to check if a string is a wstr string. Simplify the test on the wstr pointer in unicode_sizeof(). files: Objects/unicodeobject.c | 20 +++++++++++--------- 1 files changed, 11 insertions(+), 9 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1181,18 +1181,23 @@ static int unicode_resizable(PyObject *unicode) { + Py_ssize_t len; if (Py_REFCNT(unicode) != 1) return 0; if (PyUnicode_CHECK_INTERNED(unicode)) return 0; if (unicode == unicode_empty) return 0; - if (PyUnicode_WSTR_LENGTH(unicode) == 1) { + if (_PyUnicode_KIND(unicode) == PyUnicode_WCHAR_KIND) + len = PyUnicode_WSTR_LENGTH(unicode); + else + len = PyUnicode_GET_LENGTH(unicode); + if (len == 1) { Py_UCS4 ch; - if (PyUnicode_IS_COMPACT(unicode)) + if (_PyUnicode_KIND(unicode) == PyUnicode_WCHAR_KIND) + ch = _PyUnicode_WSTR(unicode)[0]; + else ch = PyUnicode_READ_CHAR(unicode, 0); - else - ch = _PyUnicode_WSTR(unicode)[0]; if (ch < 256 && unicode_latin1[ch] == unicode) return 0; } @@ -11969,12 +11974,9 @@ PyUnicode_CHARACTER_SIZE(v); } /* If the wstr pointer is present, account for it unless it is shared - with the data pointer. Since PyUnicode_DATA will crash if the object - is not ready, check whether it's either not ready (in which case the - data is entirely in wstr) or if the data is not shared. */ + with the data pointer. Check if the data is not shared.
*/ if (_PyUnicode_WSTR(v) && - (!PyUnicode_IS_READY(v) || - (PyUnicode_DATA(v) != _PyUnicode_WSTR(v)))) + (PyUnicode_DATA(v) != _PyUnicode_WSTR(v))) size += (PyUnicode_WSTR_LENGTH(v) + 1) * sizeof(wchar_t); if (_PyUnicode_HAS_UTF8_MEMORY(v)) size += PyUnicode_UTF8_LENGTH(v) + 1; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 04:02:39 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 04:02:39 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_ceval=2Ec=3A_restore_str+?= =?utf8?q?=3Dstr_optimization?= Message-ID: http://hg.python.org/cpython/rev/07d27d865e2c changeset: 72605:07d27d865e2c user: Victor Stinner date: Sun Oct 02 20:34:20 2011 +0200 summary: ceval.c: restore str+=str optimization files: Python/ceval.c | 76 ++++++++++++++++++++++++++++++++++++- 1 files changed, 73 insertions(+), 3 deletions(-) diff --git a/Python/ceval.c b/Python/ceval.c --- a/Python/ceval.c +++ b/Python/ceval.c @@ -136,6 +136,8 @@ static int import_all_from(PyObject *, PyObject *); static void format_exc_check_arg(PyObject *, const char *, PyObject *); static void format_exc_unbound(PyCodeObject *co, int oparg); +static PyObject * unicode_concatenate(PyObject *, PyObject *, + PyFrameObject *, unsigned char *); static PyObject * special_lookup(PyObject *, char *, PyObject **); #define NAME_ERROR_MSG \ @@ -1507,8 +1509,17 @@ TARGET(BINARY_ADD) w = POP(); v = TOP(); - x = PyNumber_Add(v, w); + if (PyUnicode_CheckExact(v) && + PyUnicode_CheckExact(w)) { + x = unicode_concatenate(v, w, f, next_instr); + /* unicode_concatenate consumed the ref to v */ + goto skip_decref_vx; + } + else { + x = PyNumber_Add(v, w); + } Py_DECREF(v); + skip_decref_vx: Py_DECREF(w); SET_TOP(x); if (x != NULL) DISPATCH(); @@ -1659,8 +1670,17 @@ TARGET(INPLACE_ADD) w = POP(); v = TOP(); - x = PyNumber_InPlaceAdd(v, w); + if (PyUnicode_CheckExact(v) && + PyUnicode_CheckExact(w)) { + x = unicode_concatenate(v, w, f, next_instr); + /* 
unicode_concatenate consumed the ref to v */ + goto skip_decref_v; + } + else { + x = PyNumber_InPlaceAdd(v, w); + } Py_DECREF(v); + skip_decref_v: Py_DECREF(w); SET_TOP(x); if (x != NULL) DISPATCH(); @@ -3399,7 +3419,7 @@ f->f_exc_traceback = tstate->exc_traceback; Py_XDECREF(type); Py_XDECREF(value); - Py_XDECREF(traceback); + Py_XDECREF(traceback); } static void @@ -4495,6 +4515,56 @@ } } +static PyObject * +unicode_concatenate(PyObject *v, PyObject *w, + PyFrameObject *f, unsigned char *next_instr) +{ + PyObject *res; + if (Py_REFCNT(v) == 2) { + /* In the common case, there are 2 references to the value + * stored in 'variable' when the += is performed: one on the + * value stack (in 'v') and one still stored in the + * 'variable'. We try to delete the variable now to reduce + * the refcnt to 1. + */ + switch (*next_instr) { + case STORE_FAST: + { + int oparg = PEEKARG(); + PyObject **fastlocals = f->f_localsplus; + if (GETLOCAL(oparg) == v) + SETLOCAL(oparg, NULL); + break; + } + case STORE_DEREF: + { + PyObject **freevars = (f->f_localsplus + + f->f_code->co_nlocals); + PyObject *c = freevars[PEEKARG()]; + if (PyCell_GET(c) == v) + PyCell_Set(c, NULL); + break; + } + case STORE_NAME: + { + PyObject *names = f->f_code->co_names; + PyObject *name = GETITEM(names, PEEKARG()); + PyObject *locals = f->f_locals; + if (PyDict_CheckExact(locals) && + PyDict_GetItem(locals, name) == v) { + if (PyDict_DelItem(locals, name) != 0) { + PyErr_Clear(); + } + } + break; + } + } + } + res = v; + PyUnicode_Append(&res, w); + return res; +} + #ifdef DYNAMIC_EXECUTION_PROFILE static PyObject * -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 04:05:43 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 04:05:43 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_compilation_error_on_Wi?= =?utf8?q?ndows?= Message-ID: http://hg.python.org/cpython/rev/5c79977313e4 changeset: 72606:5c79977313e4 user: 
Victor Stinner date: Mon Oct 03 04:06:05 2011 +0200 summary: Fix compilation error on Windows. Also fix a compiler warning. files: Objects/unicodeobject.c | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -509,9 +509,10 @@ } return copy; } else { + PyUnicodeObject *w; assert(_PyUnicode_WSTR(unicode) != NULL); assert(_PyUnicode_DATA_ANY(unicode) == NULL); - PyUnicodeObject *w = _PyUnicode_New(length); + w = _PyUnicode_New(length); if (w == NULL) return NULL; copy_length = _PyUnicode_WSTR_LENGTH(unicode); @@ -6521,7 +6522,7 @@ else { /* Extend unicode object */ n = PyUnicode_GET_SIZE(*v); - if (PyUnicode_Resize(v, n + usize) < 0) + if (PyUnicode_Resize((PyObject**)v, n + usize) < 0) return -1; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 04:17:49 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 04:17:49 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_=5FPyUnicode=5FReady=28=29_?= =?utf8?q?for_16-bit_wchar=5Ft?= Message-ID: http://hg.python.org/cpython/rev/8720e6fa6fc5 changeset: 72607:8720e6fa6fc5 user: Victor Stinner date: Mon Oct 03 04:17:10 2011 +0200 summary: _PyUnicode_Ready() for 16-bit wchar_t files: Objects/unicodeobject.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1121,6 +1121,8 @@ _PyUnicode_STATE(unicode).kind = PyUnicode_4BYTE_KIND; _PyUnicode_UTF8(unicode) = NULL; _PyUnicode_UTF8_LENGTH(unicode) = 0; + /* unicode_convert_wchar_to_ucs4() requires a ready string */ + _PyUnicode_STATE(unicode).ready = 1; unicode_convert_wchar_to_ucs4(_PyUnicode_WSTR(unicode), end, unicode); PyObject_FREE(_PyUnicode_WSTR(unicode)); _PyUnicode_WSTR(unicode) = NULL; -- Repository URL: http://hg.python.org/cpython From
python-checkins at python.org Mon Oct 3 04:17:50 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 04:17:50 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Disable_unicode=5Fresize=28?= =?utf8?q?=29_optimization_on_Windows_=2816-bit_wchar=5Ft=29?= Message-ID: http://hg.python.org/cpython/rev/26d63e7eac17 changeset: 72608:26d63e7eac17 user: Victor Stinner date: Mon Oct 03 04:18:04 2011 +0200 summary: Disable unicode_resize() optimization on Windows (16-bit wchar_t) files: Objects/unicodeobject.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1185,6 +1185,10 @@ unicode_resizable(PyObject *unicode) { Py_ssize_t len; +#if SIZEOF_WCHAR_T == 2 + /* FIXME: unicode_resize() is buggy on Windows */ + return 0; +#endif if (Py_REFCNT(unicode) != 1) return 0; if (PyUnicode_CHECK_INTERNED(unicode)) -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Mon Oct 3 05:23:35 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Mon, 03 Oct 2011 05:23:35 +0200 Subject: [Python-checkins] Daily reference leaks (1ed413b52af3): sum=0 Message-ID: results for 1ed413b52af3 on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogpbOnRq', '-x'] From python-checkins at python.org Mon Oct 3 12:21:41 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 12:21:41 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_resize=5Finplace=28=29?= =?utf8?q?=3A_update_shared_utf8_pointer?= Message-ID: http://hg.python.org/cpython/rev/f96d8f8a6e37 changeset: 72609:f96d8f8a6e37 user: Victor Stinner date: Mon Oct 03 12:11:00 2011 +0200 summary: Fix resize_inplace(): update shared utf8 pointer files: Objects/unicodeobject.c | 5 ++++- 1 files changed, 4 
insertions(+), 1 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -443,13 +443,14 @@ if (PyUnicode_IS_READY(unicode)) { Py_ssize_t char_size; Py_ssize_t new_size; - int share_wstr; + int share_wstr, share_utf8; void *data; data = _PyUnicode_DATA_ANY(unicode); assert(data != NULL); char_size = PyUnicode_CHARACTER_SIZE(unicode); share_wstr = (_PyUnicode_WSTR(unicode) == data); + share_utf8 = (_PyUnicode_UTF8(unicode) == data); if (length > (PY_SSIZE_T_MAX / char_size - 1)) { PyErr_NoMemory(); @@ -465,6 +466,8 @@ _PyUnicode_DATA_ANY(unicode) = data; if (share_wstr) _PyUnicode_WSTR(unicode) = data; + if (share_utf8) + _PyUnicode_UTF8(unicode) = data; _PyUnicode_LENGTH(unicode) = length; PyUnicode_WRITE(PyUnicode_KIND(unicode), data, length, 0); if (share_wstr) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 12:21:42 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 12:21:42 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_=5FPyUnicode=5FDump=28=29_i?= =?utf8?q?ndicates_if_wstr_and/or_utf8_are_shared?= Message-ID: http://hg.python.org/cpython/rev/fca7280aad8d changeset: 72610:fca7280aad8d user: Victor Stinner date: Mon Oct 03 12:12:11 2011 +0200 summary: _PyUnicode_Dump() indicates if wstr and/or utf8 are shared files: Objects/unicodeobject.c | 33 ++++++++++++++-------------- 1 files changed, 17 insertions(+), 16 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -670,23 +670,24 @@ _PyUnicode_Dump(PyObject *op) { PyASCIIObject *ascii = (PyASCIIObject *)op; - printf("%s: len=%zu, wstr=%p", - unicode_kind_name(op), - ascii->length, - ascii->wstr); + PyCompactUnicodeObject *compact = (PyCompactUnicodeObject *)op; + PyUnicodeObject *unicode = (PyUnicodeObject *)op; + void *data; + printf("%s: len=%zu, 
",unicode_kind_name(op), ascii->length); + if (ascii->state.compact) + data = (compact + 1); + else + data = unicode->data.any; + if (ascii->wstr == data) + printf("shared "); + printf("wstr=%p", ascii->wstr); if (!ascii->state.ascii) { - PyCompactUnicodeObject *compact = (PyCompactUnicodeObject *)op; - printf(" (%zu), utf8=%p (%zu)", - compact->wstr_length, - compact->utf8, - compact->utf8_length); - } - if (!ascii->state.compact) { - PyUnicodeObject *unicode = (PyUnicodeObject *)op; - printf(", data=%p", - unicode->data.any); - } - printf("\n"); + printf(" (%zu), ", compact->wstr_length); + if (!ascii->state.compact && compact->utf8 == unicode->data.any) + printf("shared "); + printf("utf8=%p (%zu)", compact->utf8, compact->utf8_length); + } + printf(", data=%p\n", data); } #endif -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 12:21:42 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 12:21:42 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_resize=5Finplace=28=29_has_?= =?utf8?q?been_fixed=3A_reenable_this_optimization?= Message-ID: http://hg.python.org/cpython/rev/6bb850f6a438 changeset: 72611:6bb850f6a438 user: Victor Stinner date: Mon Oct 03 12:21:33 2011 +0200 summary: resize_inplace() has been fixed: reenable this optimization files: Objects/unicodeobject.c | 3 --- 1 files changed, 0 insertions(+), 3 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1212,9 +1212,6 @@ if (ch < 256 && unicode_latin1[ch] == unicode) return 0; } - /* FIXME: reenable resize_inplace */ - if (!PyUnicode_IS_COMPACT(unicode)) - return 0; return 1; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 12:52:05 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 12:52:05 +0200 Subject: [Python-checkins] 
=?utf8?q?cpython=3A_Fix_resize=5Fcompact=28=29_?= =?utf8?q?and_resize=5Finplace=28=29=3B_reenable_full_resize_optimizations?= Message-ID: http://hg.python.org/cpython/rev/78183a564462 changeset: 72612:78183a564462 user: Victor Stinner date: Mon Oct 03 12:52:27 2011 +0200 summary: Fix resize_compact() and resize_inplace(); reenable full resize optimizations * resize_compact() updates also wstr_len for non-ascii strings sharing wstr * resize_inplace() updates also utf8_len/wstr_len for strings sharing utf8/wstr files: Objects/unicodeobject.c | 31 +++++++++++++++++++--------- 1 files changed, 21 insertions(+), 10 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -130,6 +130,14 @@ (PyUnicode_IS_READY(op) ? \ 0 : _PyUnicode_Ready((PyObject *)(op)))) +#define _PyUnicode_SHARE_UTF8(op) \ + (assert(_PyUnicode_CHECK(op)), \ + assert(!PyUnicode_IS_COMPACT_ASCII(op)), \ + (_PyUnicode_UTF8(op) == PyUnicode_DATA(op))) +#define _PyUnicode_SHARE_WSTR(op) \ + (assert(_PyUnicode_CHECK(op)), \ + (_PyUnicode_WSTR(unicode) == PyUnicode_DATA(op))) + /* true if the Unicode object has an allocated UTF-8 memory block (not shared with other data) */ #define _PyUnicode_HAS_UTF8_MEMORY(op) \ @@ -398,7 +406,7 @@ struct_size = sizeof(PyASCIIObject); else struct_size = sizeof(PyCompactUnicodeObject); - share_wstr = (_PyUnicode_WSTR(unicode) == PyUnicode_DATA(unicode)); + share_wstr = _PyUnicode_SHARE_WSTR(unicode); _Py_DEC_REFTOTAL; _Py_ForgetReference(unicode); @@ -417,8 +425,11 @@ } _Py_NewReference(unicode); _PyUnicode_LENGTH(unicode) = length; - if (share_wstr) + if (share_wstr) { _PyUnicode_WSTR(unicode) = PyUnicode_DATA(unicode); + if (!PyUnicode_IS_COMPACT_ASCII(unicode)) + _PyUnicode_WSTR_LENGTH(unicode) = length; + } PyUnicode_WRITE(PyUnicode_KIND(unicode), PyUnicode_DATA(unicode), length, 0); return unicode; @@ -449,8 +460,8 @@ data = _PyUnicode_DATA_ANY(unicode); assert(data != NULL); char_size = 
PyUnicode_CHARACTER_SIZE(unicode); - share_wstr = (_PyUnicode_WSTR(unicode) == data); - share_utf8 = (_PyUnicode_UTF8(unicode) == data); + share_wstr = _PyUnicode_SHARE_WSTR(unicode); + share_utf8 = _PyUnicode_SHARE_UTF8(unicode); if (length > (PY_SSIZE_T_MAX / char_size - 1)) { PyErr_NoMemory(); @@ -464,10 +475,14 @@ return -1; } _PyUnicode_DATA_ANY(unicode) = data; - if (share_wstr) + if (share_wstr) { _PyUnicode_WSTR(unicode) = data; - if (share_utf8) + _PyUnicode_WSTR_LENGTH(unicode) = length; + } + if (share_utf8) { _PyUnicode_UTF8(unicode) = data; + _PyUnicode_UTF8_LENGTH(unicode) = length; + } _PyUnicode_LENGTH(unicode) = length; PyUnicode_WRITE(PyUnicode_KIND(unicode), data, length, 0); if (share_wstr) @@ -1189,10 +1204,6 @@ unicode_resizable(PyObject *unicode) { Py_ssize_t len; -#if SIZEOF_WCHAR_T == 2 - /* FIXME: unicode_resize() is buggy on Windows */ - return 0; -#endif if (Py_REFCNT(unicode) != 1) return 0; if (PyUnicode_CHECK_INTERNED(unicode)) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 14:01:35 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 14:01:35 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Create_=5FPyUnicode=5FREADY?= =?utf8?q?=5FREPLACE=28=29_to_reuse_singleton?= Message-ID: http://hg.python.org/cpython/rev/0312400eaa48 changeset: 72613:0312400eaa48 user: Victor Stinner date: Mon Oct 03 13:28:14 2011 +0200 summary: Create _PyUnicode_READY_REPLACE() to reuse singleton Only use _PyUnicode_READY_REPLACE() on just created strings. files: Objects/unicodeobject.c | 96 ++++++++++++++++++++++------ 1 files changed, 73 insertions(+), 23 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -130,6 +130,11 @@ (PyUnicode_IS_READY(op) ? 
\ 0 : _PyUnicode_Ready((PyObject *)(op)))) +#define _PyUnicode_READY_REPLACE(p_obj) \ + (assert(_PyUnicode_CHECK(*p_obj)), \ + (PyUnicode_IS_READY(*p_obj) ? \ + 0 : _PyUnicode_ReadyReplace((PyObject **)(p_obj)))) + #define _PyUnicode_SHARE_UTF8(op) \ (assert(_PyUnicode_CHECK(op)), \ assert(!PyUnicode_IS_COMPACT_ASCII(op)), \ @@ -212,7 +217,9 @@ 0, 0, 0, 0, 0, 0, 0, 0 }; +/* forward */ static PyUnicodeObject *_PyUnicode_New(Py_ssize_t length); +static PyObject* get_latin1_char(unsigned char ch); static PyObject * unicode_encode_call_errorhandler(const char *errors, @@ -1034,10 +1041,10 @@ int unicode_ready_calls = 0; #endif -int -_PyUnicode_Ready(PyObject *obj) -{ - PyUnicodeObject *unicode = (PyUnicodeObject *)obj; +static int +unicode_ready(PyObject **p_obj, int replace) +{ + PyUnicodeObject *unicode; wchar_t *end; Py_UCS4 maxchar = 0; Py_ssize_t num_surrogates; @@ -1045,6 +1052,9 @@ Py_ssize_t length_wo_surrogates; #endif + assert(p_obj != NULL); + unicode = (PyUnicodeObject *)*p_obj; + /* _PyUnicode_Ready() is only intented for old-style API usage where strings were created using _PyObject_New() and where no canonical representation (the str field) has been set yet aka strings @@ -1061,6 +1071,32 @@ ++unicode_ready_calls; #endif +#ifdef Py_DEBUG + assert(!replace || Py_REFCNT(unicode) == 1); +#else + if (replace && Py_REFCNT(unicode) != 1) + replace = 0; +#endif + if (replace) { + Py_ssize_t len = _PyUnicode_WSTR_LENGTH(unicode); + wchar_t *wstr = _PyUnicode_WSTR(unicode); + /* Optimization for empty strings */ + if (len == 0) { + Py_INCREF(unicode_empty); + Py_DECREF(*p_obj); + *p_obj = unicode_empty; + return 0; + } + if (len == 1 && wstr[0] < 256) { + PyObject *latin1_char = get_latin1_char((unsigned char)wstr[0]); + if (latin1_char == NULL) + return -1; + Py_DECREF(*p_obj); + *p_obj = latin1_char; + return 0; + } + } + end = _PyUnicode_WSTR(unicode) + _PyUnicode_WSTR_LENGTH(unicode); if (find_maxchar_surrogates(_PyUnicode_WSTR(unicode), end, &maxchar, 
&num_surrogates) == -1) @@ -1161,6 +1197,18 @@ return 0; } +int +_PyUnicode_ReadyReplace(PyObject **op) +{ + return unicode_ready(op, 1); +} + +int +_PyUnicode_Ready(PyObject *op) +{ + return unicode_ready(&op, 0); +} + static void unicode_dealloc(register PyUnicodeObject *unicode) { @@ -2524,7 +2572,7 @@ goto onError; } Py_DECREF(buffer); - if (PyUnicode_READY(unicode)) { + if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } @@ -3573,7 +3621,7 @@ Py_XDECREF(errorHandler); Py_XDECREF(exc); - if (PyUnicode_READY(unicode) == -1) { + if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } @@ -4137,14 +4185,13 @@ /* Adjust length and ready string when it contained errors and is of the old resizable kind. */ if (kind == PyUnicode_WCHAR_KIND) { - if (PyUnicode_Resize((PyObject**)&unicode, i) < 0 || - PyUnicode_READY(unicode) == -1) + if (PyUnicode_Resize((PyObject**)&unicode, i) < 0) goto onError; } Py_XDECREF(errorHandler); Py_XDECREF(exc); - if (PyUnicode_READY(unicode) == -1) { + if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } @@ -4647,7 +4694,7 @@ Py_XDECREF(errorHandler); Py_XDECREF(exc); - if (PyUnicode_READY(unicode) == -1) { + if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } @@ -5045,7 +5092,7 @@ Py_XDECREF(errorHandler); Py_XDECREF(exc); - if (PyUnicode_READY(unicode) == -1) { + if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } @@ -5501,11 +5548,13 @@ { if (PyUnicode_Resize((PyObject**)&v, i) < 0) goto onError; - if (PyUnicode_READY(v) == -1) - goto onError; } Py_XDECREF(errorHandler); Py_XDECREF(exc); + if (_PyUnicode_READY_REPLACE(&v)) { + Py_DECREF(v); + return NULL; + } return (PyObject *)v; ucnhashError: @@ -5803,7 +5852,7 @@ goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); - if (PyUnicode_READY(v) == -1) { + if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } @@ -5991,7 +6040,7 @@ goto onError; 
Py_XDECREF(errorHandler); Py_XDECREF(exc); - if (PyUnicode_READY(v) == -1) { + if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } @@ -6417,7 +6466,7 @@ goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); - if (PyUnicode_READY(v) == -1) { + if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } @@ -6611,7 +6660,7 @@ goto retry; } #endif - if (PyUnicode_READY(v) == -1) { + if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } @@ -6910,7 +6959,7 @@ goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); - if (PyUnicode_READY(v) == -1) { + if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } @@ -7816,7 +7865,7 @@ repunicode = unicode_translate_call_errorhandler(errors, &errorHandler, reason, input, &exc, collstart, collend, &newpos); - if (repunicode == NULL || PyUnicode_READY(repunicode) == -1) + if (repunicode == NULL || _PyUnicode_READY_REPLACE(&repunicode)) goto onError; /* generate replacement */ repsize = PyUnicode_GET_LENGTH(repunicode); @@ -8793,7 +8842,7 @@ Py_TYPE(separator)->tp_name); goto onError; } - if (PyUnicode_READY(separator) == -1) + if (PyUnicode_READY(separator)) goto onError; sep = separator; seplen = PyUnicode_GET_LENGTH(separator); @@ -10126,7 +10175,7 @@ j = 0; } - if (PyUnicode_READY(u) == -1) { + if (_PyUnicode_READY_REPLACE(&u)) { Py_DECREF(u); return NULL; } @@ -12781,7 +12830,7 @@ if (unicode == NULL) return NULL; assert(_PyUnicode_CHECK(unicode)); - if (PyUnicode_READY(unicode)) + if (_PyUnicode_READY_REPLACE(&unicode)) return NULL; self = (PyUnicodeObject *) type->tp_alloc(type, 0); @@ -12988,10 +13037,11 @@ return; if (PyUnicode_CHECK_INTERNED(s)) return; - if (PyUnicode_READY(s) == -1) { + if (_PyUnicode_READY_REPLACE(p)) { assert(0 && "PyUnicode_READY fail in PyUnicode_InternInPlace"); return; } + s = (PyUnicodeObject *)(*p); if (interned == NULL) { interned = PyDict_New(); if (interned == NULL) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon 
Oct 3 14:01:36 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 14:01:36 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FReady=28=29_now?= =?utf8?q?_sets_ascii=3D1_if_maxchar_=3C_128?= Message-ID: http://hg.python.org/cpython/rev/93f033c7cfea changeset: 72614:93f033c7cfea user: Victor Stinner date: Mon Oct 03 13:53:37 2011 +0200 summary: PyUnicode_Ready() now sets ascii=1 if maxchar < 128 ascii=1 is no more reserved to PyASCIIObject. Use PyUnicode_IS_COMPACT_ASCII(obj) to check if obj is a PyASCIIObject (as before). files: Include/unicodeobject.h | 41 +++++++++++++++++----------- Objects/unicodeobject.c | 29 +++++++++---------- Tools/gdb/libpython.py | 5 ++- 3 files changed, 42 insertions(+), 33 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -224,7 +224,7 @@ PyUnicode_4BYTE_KIND * compact = 1 * ready = 1 - * (ascii = 0) + * ascii = 0 - string created by the legacy API (not ready): @@ -236,7 +236,7 @@ * data.any is NULL * utf8 is NULL * interned = SSTATE_NOT_INTERNED - * (ascii = 0) + * ascii = 0 - string created by the legacy API, ready: @@ -246,7 +246,6 @@ * compact = 0 * ready = 1 * data.any is not NULL - * (ascii = 0) String created by the legacy API becomes ready when calling PyUnicode_READY(). @@ -278,8 +277,9 @@ one block for the PyUnicodeObject struct and another for its data buffer. */ unsigned int compact:1; - /* Compact objects which are ASCII-only also have the state.compact - flag set, and use the PyASCIIObject struct. */ + /* kind is PyUnicode_1BYTE_KIND but data contains only ASCII + characters. If ascii is 1 and compact is 1, use the PyASCIIObject + structure. */ unsigned int ascii:1; /* The ready flag indicates whether the object layout is initialized completely. 
This means that this is either a compact object, or @@ -304,7 +304,7 @@ /* Strings allocated through PyUnicode_FromUnicode(NULL, len) use the PyUnicodeObject structure. The actual string data is initially in the wstr - block, and copied into the data block using PyUnicode_Ready. */ + block, and copied into the data block using _PyUnicode_Ready. */ typedef struct { PyCompactUnicodeObject _base; union { @@ -327,7 +327,7 @@ #ifndef Py_LIMITED_API #define PyUnicode_WSTR_LENGTH(op) \ - (((PyASCIIObject*)op)->state.ascii ? \ + (PyUnicode_IS_COMPACT_ASCII(op) ? \ ((PyASCIIObject*)op)->length : \ ((PyCompactUnicodeObject*)op)->wstr_length) @@ -369,10 +369,24 @@ #define SSTATE_INTERNED_MORTAL 1 #define SSTATE_INTERNED_IMMORTAL 2 -#define PyUnicode_IS_COMPACT_ASCII(op) (((PyASCIIObject*)op)->state.ascii) +/* Return true if the string contains only ASCII characters, or 0 if not. The + string may be compact (PyUnicode_IS_COMPACT_ASCII) or not. No type checks + or Ready calls are performed. */ +#define PyUnicode_IS_ASCII(op) \ + (((PyASCIIObject*)op)->state.ascii) + +/* Return true if the string is compact or 0 if not. + No type checks or Ready calls are performed. */ +#define PyUnicode_IS_COMPACT(op) \ + (((PyASCIIObject*)(op))->state.compact) + +/* Return true if the string is a compact ASCII string (use PyASCIIObject + structure), or 0 if not. No type checks or Ready calls are performed. */ +#define PyUnicode_IS_COMPACT_ASCII(op) \ + (PyUnicode_IS_ASCII(op) && PyUnicode_IS_COMPACT(op)) /* String contains only wstr byte characters. This is only possible - when the string was created with a legacy API and PyUnicode_Ready() + when the string was created with a legacy API and _PyUnicode_Ready() has not been called yet. */ #define PyUnicode_WCHAR_KIND 0 @@ -399,11 +413,6 @@ #define PyUnicode_2BYTE_DATA(op) ((Py_UCS2*)PyUnicode_DATA(op)) #define PyUnicode_4BYTE_DATA(op) ((Py_UCS4*)PyUnicode_DATA(op)) -/* Return true if the string is compact or 0 if not. 
- No type checks or Ready calls are performed. */ -#define PyUnicode_IS_COMPACT(op) \ - (((PyASCIIObject*)(op))->state.compact) - /* Return one of the PyUnicode_*_KIND values defined above. */ #define PyUnicode_KIND(op) \ (assert(PyUnicode_Check(op)), \ @@ -500,9 +509,9 @@ #define PyUnicode_IS_READY(op) (((PyASCIIObject*)op)->state.ready) -/* PyUnicode_READY() does less work than PyUnicode_Ready() in the best +/* PyUnicode_READY() does less work than _PyUnicode_Ready() in the best case. If the canonical representation is not yet set, it will still call - PyUnicode_Ready(). + _PyUnicode_Ready(). Returns 0 on success and -1 on errors. */ #define PyUnicode_READY(op) \ (assert(PyUnicode_Check(op)), \ diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -288,16 +288,14 @@ ascii = (PyASCIIObject *)op; kind = ascii->state.kind; - if (ascii->state.ascii == 1) { + if (ascii->state.ascii == 1 && ascii->state.compact == 1) { assert(kind == PyUnicode_1BYTE_KIND); - assert(ascii->state.compact == 1); assert(ascii->state.ready == 1); } else if (ascii->state.compact == 1) { assert(kind == PyUnicode_1BYTE_KIND || kind == PyUnicode_2BYTE_KIND || kind == PyUnicode_4BYTE_KIND); - assert(ascii->state.compact == 1); assert(ascii->state.ascii == 0); assert(ascii->state.ready == 1); } else { @@ -305,9 +303,9 @@ PyUnicodeObject *unicode = (PyUnicodeObject *)op; if (kind == PyUnicode_WCHAR_KIND) { - assert(!ascii->state.compact == 1); + assert(ascii->state.compact == 0); assert(ascii->state.ascii == 0); - assert(!ascii->state.ready == 1); + assert(ascii->state.ready == 0); assert(ascii->wstr != NULL); assert(unicode->data.any == NULL); assert(compact->utf8 == NULL); @@ -317,10 +315,9 @@ assert(kind == PyUnicode_1BYTE_KIND || kind == PyUnicode_2BYTE_KIND || kind == PyUnicode_4BYTE_KIND); - assert(!ascii->state.compact == 1); + assert(ascii->state.compact == 0); assert(ascii->state.ready == 1); assert(unicode->data.any 
!= NULL); - assert(ascii->state.ascii == 0); } } return 1; @@ -638,7 +635,7 @@ switch(PyUnicode_KIND(unicode)) { case PyUnicode_1BYTE_KIND: - if (PyUnicode_IS_COMPACT_ASCII(unicode)) + if (PyUnicode_IS_ASCII(unicode)) return "legacy ascii"; else return "legacy latin1"; @@ -654,14 +651,14 @@ switch(PyUnicode_KIND(unicode)) { case PyUnicode_1BYTE_KIND: - if (PyUnicode_IS_COMPACT_ASCII(unicode)) + if (PyUnicode_IS_ASCII(unicode)) return "ascii"; else - return "compact latin1"; + return "latin1"; case PyUnicode_2BYTE_KIND: - return "compact UCS2"; + return "UCS2"; case PyUnicode_4BYTE_KIND: - return "compact UCS4"; + return "UCS4"; default: return ""; } @@ -703,7 +700,7 @@ if (ascii->wstr == data) printf("shared "); printf("wstr=%p", ascii->wstr); - if (!ascii->state.ascii) { + if (!(ascii->state.ascii == 1 && ascii->state.compact == 1)) { printf(" (%zu), ", compact->wstr_length); if (!ascii->state.compact && compact->utf8 == unicode->data.any) printf("shared "); @@ -954,9 +951,9 @@ /* check if max_char(from substring) <= max_char(to) */ if (from_kind > to_kind /* latin1 => ascii */ - || (PyUnicode_IS_COMPACT_ASCII(to) + || (PyUnicode_IS_ASCII(to) && to_kind == PyUnicode_1BYTE_KIND - && !PyUnicode_IS_COMPACT_ASCII(from))) + && !PyUnicode_IS_ASCII(from))) { /* slow path to check for character overflow */ const Py_UCS4 to_maxchar = PyUnicode_MAX_CHAR_VALUE(to); @@ -1115,10 +1112,12 @@ _PyUnicode_LENGTH(unicode) = _PyUnicode_WSTR_LENGTH(unicode); _PyUnicode_STATE(unicode).kind = PyUnicode_1BYTE_KIND; if (maxchar < 128) { + _PyUnicode_STATE(unicode).ascii = 1; _PyUnicode_UTF8(unicode) = _PyUnicode_DATA_ANY(unicode); _PyUnicode_UTF8_LENGTH(unicode) = _PyUnicode_WSTR_LENGTH(unicode); } else { + _PyUnicode_STATE(unicode).ascii = 0; _PyUnicode_UTF8(unicode) = NULL; _PyUnicode_UTF8_LENGTH(unicode) = 0; } diff --git a/Tools/gdb/libpython.py b/Tools/gdb/libpython.py --- a/Tools/gdb/libpython.py +++ b/Tools/gdb/libpython.py @@ -1132,15 +1132,16 @@ compact = self.field('_base') 
ascii = compact['_base'] state = ascii['state'] + is_compact_ascii = (int(state['ascii']) and int(state['compact'])) field_length = long(ascii['length']) if not int(state['ready']): # string is not ready may_have_surrogates = True field_str = ascii['wstr'] - if not int(state['ascii']): + if not is_compact_ascii: field_length = compact('wstr_length') else: - if int(state['ascii']): + if is_compact_ascii: field_str = ascii.address + 1 elif int(state['compact']): field_str = compact.address + 1 -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 14:51:25 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 14:51:25 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_unicode=5Fkind=5Fname=28=29?= =?utf8?q?_doesn=27t_check_consistency_anymore?= Message-ID: http://hg.python.org/cpython/rev/9fe93afc57b5 changeset: 72615:9fe93afc57b5 user: Victor Stinner date: Mon Oct 03 14:41:45 2011 +0200 summary: unicode_kind_name() doesn't check consistency anymore It is called from _PyUnicode_Dump() and so must not fail.
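The kind names that unicode_kind_name() reports for _PyUnicode_Dump() correspond to the per-character storage widths of PEP 393 strings (latin1 = 1 byte, UCS2 = 2, UCS4 = 4), which are observable from pure Python via sys.getsizeof(). A minimal sketch, assuming a CPython build with PEP 393 compact strings (3.3 or later):

```python
import sys

# Under PEP 393, a string's storage width is chosen from its widest
# code point: 1 byte/char (latin1), 2 (UCS2), or 4 (UCS4).
latin1 = "a" * 100           # max code point < 256
ucs2 = "\u0100" * 100        # max code point < 65536
ucs4 = "\U00010000" * 100    # max code point >= 65536

sizes = [sys.getsizeof(s) for s in (latin1, ucs2, ucs4)]
print(sizes)
```

On such a build the three sizes grow with the character width; the exact values depend on the platform and the object header sizes.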
files: Objects/unicodeobject.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -627,7 +627,8 @@ static const char* unicode_kind_name(PyObject *unicode) { - assert(_PyUnicode_CHECK(unicode)); + /* don't check consistency: unicode_kind_name() is called from + _PyUnicode_Dump() */ if (!PyUnicode_IS_COMPACT(unicode)) { if (!PyUnicode_IS_READY(unicode)) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 14:51:25 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 14:51:25 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_unicode=5Fsubtype=5Fnew=28?= =?utf8?q?=29_copies_also_the_ascii_flag?= Message-ID: http://hg.python.org/cpython/rev/54d77a25736d changeset: 72616:54d77a25736d user: Victor Stinner date: Mon Oct 03 14:42:15 2011 +0200 summary: unicode_subtype_new() copies also the ascii flag files: Objects/unicodeobject.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -12846,7 +12846,7 @@ _PyUnicode_STATE(self).interned = 0; _PyUnicode_STATE(self).kind = kind; _PyUnicode_STATE(self).compact = 0; - _PyUnicode_STATE(self).ascii = 0; + _PyUnicode_STATE(self).ascii = _PyUnicode_STATE(unicode).ascii; _PyUnicode_STATE(self).ready = 1; _PyUnicode_WSTR(self) = NULL; _PyUnicode_UTF8_LENGTH(self) = 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 14:51:26 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 14:51:26 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_=5FPyUnicode=5FCheckConsist?= =?utf8?q?ency=28=29_checks_utf8_field_consistency?= Message-ID: http://hg.python.org/cpython/rev/ec481f3f79cd changeset: 72617:ec481f3f79cd user: Victor Stinner date: Mon Oct 
03 14:42:39 2011 +0200 summary: _PyUnicode_CheckConsistency() checks utf8 field consistency files: Include/unicodeobject.h | 2 ++ Objects/unicodeobject.c | 6 ++++++ 2 files changed, 8 insertions(+), 0 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -225,6 +225,7 @@ * compact = 1 * ready = 1 * ascii = 0 + * utf8 != data - string created by the legacy API (not ready): @@ -246,6 +247,7 @@ * compact = 0 * ready = 1 * data.any is not NULL + * utf8 = data if ascii is 1 String created by the legacy API becomes ready when calling PyUnicode_READY(). diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -293,11 +293,13 @@ assert(ascii->state.ready == 1); } else if (ascii->state.compact == 1) { + PyCompactUnicodeObject *compact = (PyCompactUnicodeObject *)op; assert(kind == PyUnicode_1BYTE_KIND || kind == PyUnicode_2BYTE_KIND || kind == PyUnicode_4BYTE_KIND); assert(ascii->state.ascii == 0); assert(ascii->state.ready == 1); + assert (compact->utf8 != (void*)(compact + 1)); } else { PyCompactUnicodeObject *compact = (PyCompactUnicodeObject *)op; PyUnicodeObject *unicode = (PyUnicodeObject *)op; @@ -318,6 +320,10 @@ assert(ascii->state.compact == 0); assert(ascii->state.ready == 1); assert(unicode->data.any != NULL); + if (ascii->state.ascii) + assert (compact->utf8 == unicode->data.any); + else + assert (compact->utf8 != unicode->data.any); } } return 1; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 19:41:05 2011 From: python-checkins at python.org (charles-francois.natali) Date: Mon, 03 Oct 2011 19:41:05 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Introduce_support=2Erequire?= =?utf8?q?s=5Ffreebsd=5Fversion_decorator=2E?= Message-ID: http://hg.python.org/cpython/rev/3b1859f80e6d changeset: 72618:3b1859f80e6d user: Charles-François Natali date: Mon Oct 03
19:40:37 2011 +0200 summary: Introduce support.requires_freebsd_version decorator. files: Lib/test/support.py | 40 +++++++++++++++++++++++--------- 1 files changed, 28 insertions(+), 12 deletions(-) diff --git a/Lib/test/support.py b/Lib/test/support.py --- a/Lib/test/support.py +++ b/Lib/test/support.py @@ -44,8 +44,8 @@ "Error", "TestFailed", "ResourceDenied", "import_module", "verbose", "use_resources", "max_memuse", "record_original_stdout", "get_original_stdout", "unload", "unlink", "rmtree", "forget", - "is_resource_enabled", "requires", "requires_linux_version", - "requires_mac_ver", "find_unused_port", "bind_port", + "is_resource_enabled", "requires", "requires_freebsd_version", + "requires_linux_version", "requires_mac_ver", "find_unused_port", "bind_port", "IPV6_ENABLED", "is_jython", "TESTFN", "HOST", "SAVEDCWD", "temp_cwd", "findfile", "create_empty_file", "sortdict", "check_syntax_error", "open_urlresource", "check_warnings", "CleanImport", "EnvironmentVarGuard", "TransientResource", @@ -312,17 +312,17 @@ msg = "Use of the %r resource not enabled" % resource raise ResourceDenied(msg) -def requires_linux_version(*min_version): - """Decorator raising SkipTest if the OS is Linux and the kernel version is - less than min_version. +def _requires_unix_version(sysname, min_version): + """Decorator raising SkipTest if the OS is `sysname` and the version is less + than `min_version`. - For example, @requires_linux_version(2, 6, 35) raises SkipTest if the Linux - kernel version is less than 2.6.35. + For example, @_requires_unix_version('FreeBSD', (7, 2)) raises SkipTest if + the FreeBSD version is less than 7.2. 
""" def decorator(func): @functools.wraps(func) def wrapper(*args, **kw): - if sys.platform == 'linux': + if platform.system() == sysname: version_txt = platform.release().split('-', 1)[0] try: version = tuple(map(int, version_txt.split('.'))) @@ -332,13 +332,29 @@ if version < min_version: min_version_txt = '.'.join(map(str, min_version)) raise unittest.SkipTest( - "Linux kernel %s or higher required, not %s" - % (min_version_txt, version_txt)) - return func(*args, **kw) - wrapper.min_version = min_version + "%s version %s or higher required, not %s" + % (sysname, min_version_txt, version_txt)) return wrapper return decorator +def requires_freebsd_version(*min_version): + """Decorator raising SkipTest if the OS is FreeBSD and the FreeBSD version is + less than `min_version`. + + For example, @requires_freebsd_version(7, 2) raises SkipTest if the FreeBSD + version is less than 7.2. + """ + return _requires_unix_version('FreeBSD', min_version) + +def requires_linux_version(*min_version): + """Decorator raising SkipTest if the OS is Linux and the Linux version is + less than `min_version`. + + For example, @requires_linux_version(2, 6, 32) raises SkipTest if the Linux + version is less than 2.6.32. + """ + return _requires_unix_version('Linux', min_version) + def requires_mac_ver(*min_version): """Decorator raising SkipTest if the OS is Mac OS X and the OS X version if less than min_version. 
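The support.py refactoring above funnels requires_freebsd_version() and requires_linux_version() through one shared helper. The same pattern can be sketched as a standalone decorator; requires_min_version below is a hypothetical name modeled on support._requires_unix_version, not the stdlib API:

```python
import functools
import platform
import unittest

def requires_min_version(sysname, min_version):
    """Skip the decorated test when running on `sysname` with an OS
    release older than `min_version` (a tuple of ints)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kw):
            if platform.system() == sysname:
                release = platform.release().split('-', 1)[0]
                try:
                    version = tuple(map(int, release.split('.')))
                except ValueError:
                    version = ()  # unparsable release: treat as too old
                if version < min_version:
                    raise unittest.SkipTest(
                        "%s %s or higher required"
                        % (sysname, '.'.join(map(str, min_version))))
            return func(*args, **kw)
        return wrapper
    return decorator

# On any OS other than "NoSuchOS" the decorator is a no-op:
@requires_min_version('NoSuchOS', (99, 0))
def probe():
    return 42
```

Because the check runs inside the wrapper, the skip is raised per-call at test time rather than at import time, which is what lets unittest report it as a skip instead of a collection error.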
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 19:41:06 2011 From: python-checkins at python.org (charles-francois.natali) Date: Mon, 03 Oct 2011 19:41:06 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Issue_=2313001=3A_Fix_test?= =?utf8?q?=5Fsocket=2EtestRecvmsgTrunc_failure_on_FreeBSD_=3C_8=2C_which?= Message-ID: http://hg.python.org/cpython/rev/4378bae6b8dc changeset: 72619:4378bae6b8dc user: Charles-François Natali date: Mon Oct 03 19:43:15 2011 +0200 summary: Issue #13001: Fix test_socket.testRecvmsgTrunc failure on FreeBSD < 8, which doesn't always set the MSG_TRUNC flag when a truncated datagram is received. files: Lib/test/test_socket.py | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_socket.py b/Lib/test/test_socket.py --- a/Lib/test/test_socket.py +++ b/Lib/test/test_socket.py @@ -1659,6 +1659,9 @@ def _testRecvmsgShorter(self): self.sendToServer(MSG) + # FreeBSD < 8 doesn't always set the MSG_TRUNC flag when a truncated + # datagram is received (issue #13001). + @support.requires_freebsd_version(8) def testRecvmsgTrunc(self): # Receive part of message, check for truncation indicators.
msg, ancdata, flags, addr = self.doRecvmsg(self.serv_sock, @@ -1668,6 +1671,7 @@ self.assertEqual(ancdata, []) self.checkFlags(flags, eor=False) + @support.requires_freebsd_version(8) def _testRecvmsgTrunc(self): self.sendToServer(MSG) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 20:06:28 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 20:06:28 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Simplify_unicode=5Fresizabl?= =?utf8?q?e=28=29=3A_singletons_reference_count_is_at_least_2?= Message-ID: http://hg.python.org/cpython/rev/6fbc5e9141fc changeset: 72620:6fbc5e9141fc user: Victor Stinner date: Mon Oct 03 20:06:05 2011 +0200 summary: Simplify unicode_resizable(): singletons reference count is at least 2 files: Objects/unicodeobject.c | 20 +++++++------------- 1 files changed, 7 insertions(+), 13 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1257,26 +1257,20 @@ static int unicode_resizable(PyObject *unicode) { - Py_ssize_t len; if (Py_REFCNT(unicode) != 1) return 0; if (PyUnicode_CHECK_INTERNED(unicode)) return 0; - if (unicode == unicode_empty) - return 0; - if (_PyUnicode_KIND(unicode) == PyUnicode_WCHAR_KIND) - len = PyUnicode_WSTR_LENGTH(unicode); - else - len = PyUnicode_GET_LENGTH(unicode); - if (len == 1) { - Py_UCS4 ch; - if (_PyUnicode_KIND(unicode) == PyUnicode_WCHAR_KIND) - ch = _PyUnicode_WSTR(unicode)[0]; - else - ch = PyUnicode_READ_CHAR(unicode, 0); + assert (unicode != unicode_empty); +#ifdef Py_DEBUG + if (_PyUnicode_KIND(unicode) != PyUnicode_WCHAR_KIND + && PyUnicode_GET_LENGTH(unicode) == 1) + { + Py_UCS4 ch = PyUnicode_READ_CHAR(unicode, 0); if (ch < 256 && unicode_latin1[ch] == unicode) return 0; } +#endif return 1; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 23:35:47 2011 From: python-checkins at python.org 
(victor.stinner) Date: Mon, 03 Oct 2011 23:35:47 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Improve_string_forms_and_Py?= =?utf8?q?Unicode=5FResize=28=29_documentation?= Message-ID: http://hg.python.org/cpython/rev/fe10f0bcc860 changeset: 72621:fe10f0bcc860 user: Victor Stinner date: Mon Oct 03 23:19:21 2011 +0200 summary: Improve string forms and PyUnicode_Resize() documentation Remove also the FIXME for resize_copy(): as discussed with Martin, copy the string on resize if the string is not resizable is just fine. files: Include/unicodeobject.h | 35 ++++++++++++++++++---------- Objects/unicodeobject.c | 4 +- 2 files changed, 24 insertions(+), 15 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -206,7 +206,7 @@ immediately follow the structure. utf8_length and wstr_length can be found in the length field; the utf8 pointer is equal to the data pointer. */ typedef struct { - /* Unicode strings can be in 4 states: + /* There a 4 forms of Unicode strings: - compact ascii: @@ -227,7 +227,7 @@ * ascii = 0 * utf8 != data - - string created by the legacy API (not ready): + - legacy string, not ready: * structure = PyUnicodeObject * kind = PyUnicode_WCHAR_KIND @@ -239,7 +239,7 @@ * interned = SSTATE_NOT_INTERNED * ascii = 0 - - string created by the legacy API, ready: + - legacy string, ready: * structure = PyUnicodeObject structure * kind = PyUnicode_1BYTE_KIND, PyUnicode_2BYTE_KIND or @@ -249,10 +249,16 @@ * data.any is not NULL * utf8 = data if ascii is 1 - String created by the legacy API becomes ready when calling - PyUnicode_READY(). + Compact strings use only one memory block (structure + characters), + whereas legacy strings use one block for the structure and one block + for characters. - See also _PyUnicode_CheckConsistency(). */ + Legacy strings are created by PyUnicode_FromUnicode() and + PyUnicode_FromStringAndSize(NULL, size) functions. 
They become ready + when PyUnicode_READY() is called. + + See also _PyUnicode_CheckConsistency(). + */ PyObject_HEAD Py_ssize_t length; /* Number of code points in the string */ Py_hash_t hash; /* Hash value; -1 if not set */ @@ -721,19 +727,22 @@ PyAPI_FUNC(Py_UNICODE) PyUnicode_GetMax(void); #endif -/* Resize an already allocated Unicode object to the new size length. +/* Resize an Unicode object allocated by the legacy API (e.g. + PyUnicode_FromUnicode). Unicode objects allocated by the new API (e.g. + PyUnicode_New) cannot be resized by this function. + + The length is a number of Py_UNICODE characters (and not the number of code + points). *unicode is modified to point to the new (resized) object and 0 returned on success. - This API may only be called by the function which also called the - Unicode constructor. The refcount on the object must be 1. Otherwise, - an error is returned. + If the refcount on the object is 1, the function resizes the string in + place, which is usually faster than allocating a new string (and copy + characters). Error handling is implemented as follows: an exception is set, -1 - is returned and *unicode left untouched. - -*/ + is returned and *unicode left untouched. */ PyAPI_FUNC(int) PyUnicode_Resize( PyObject **unicode, /* Pointer to the Unicode object */ diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -536,7 +536,8 @@ return NULL; } return copy; - } else { + } + else { PyUnicodeObject *w; assert(_PyUnicode_WSTR(unicode) != NULL); assert(_PyUnicode_DATA_ANY(unicode) == NULL); @@ -1294,7 +1295,6 @@ if (old_length == length) return 0; - /* FIXME: really create a new object? 
*/ if (!unicode_resizable(unicode)) { PyObject *copy = resize_copy(unicode, length); if (copy == NULL) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 23:35:48 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 23:35:48 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_a_compiler_warning_in_P?= =?utf8?q?yUnicode=5FAppend=28=29?= Message-ID: http://hg.python.org/cpython/rev/bf6dbd1b10b4 changeset: 72622:bf6dbd1b10b4 user: Victor Stinner date: Mon Oct 03 23:27:56 2011 +0200 summary: Fix a compiler warning in PyUnicode_Append() Don't check PyUnicode_CopyCharacters() in release mode. Rename also some variables. files: Objects/unicodeobject.c | 24 +++++++++++++++--------- 1 files changed, 15 insertions(+), 9 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -9931,9 +9931,11 @@ && (_PyUnicode_KIND(right) <= _PyUnicode_KIND(left) || _PyUnicode_WSTR(left) != NULL)) { - Py_ssize_t u_len, v_len, new_len, copied; - - /* FIXME: don't make wstr string ready */ + Py_ssize_t left_len, right_len, new_len; +#ifdef Py_DEBUG + Py_ssize_t copied; +#endif + if (PyUnicode_READY(left)) goto error; if (PyUnicode_READY(right)) @@ -9942,14 +9944,14 @@ /* FIXME: support ascii+latin1, PyASCIIObject => PyCompactUnicodeObject */ if (PyUnicode_MAX_CHAR_VALUE(right) <= PyUnicode_MAX_CHAR_VALUE(left)) { - u_len = PyUnicode_GET_LENGTH(left); - v_len = PyUnicode_GET_LENGTH(right); - if (u_len > PY_SSIZE_T_MAX - v_len) { + left_len = PyUnicode_GET_LENGTH(left); + right_len = PyUnicode_GET_LENGTH(right); + if (left_len > PY_SSIZE_T_MAX - right_len) { PyErr_SetString(PyExc_OverflowError, "strings are too large to concat"); goto error; } - new_len = u_len + v_len; + new_len = left_len + right_len; /* Now we own the last reference to 'left', so we can resize it * in-place. 
@@ -9964,10 +9966,14 @@ goto error; } /* copy 'right' into the newly allocated area of 'left' */ - copied = PyUnicode_CopyCharacters(left, u_len, +#ifdef Py_DEBUG + copied = PyUnicode_CopyCharacters(left, left_len, right, 0, - v_len); + right_len); assert(0 <= copied); +#else + PyUnicode_CopyCharacters(left, left_len, right, 0, right_len); +#endif *p_left = left; return; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Mon Oct 3 23:35:49 2011 From: python-checkins at python.org (victor.stinner) Date: Mon, 03 Oct 2011 23:35:49 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FJoin=28=29_chec?= =?utf8?q?ks_output_length_in_debug_mode?= Message-ID: http://hg.python.org/cpython/rev/bfd8b5d35f9c changeset: 72623:bfd8b5d35f9c user: Victor Stinner date: Mon Oct 03 23:36:02 2011 +0200 summary: PyUnicode_Join() checks output length in debug mode PyUnicode_CopyCharacters() may copies less character than requested size, if the input string is smaller than the argument. (This is very unlikely, but who knows!?) Avoid also calling PyUnicode_CopyCharacters() if the string is empty. files: Objects/unicodeobject.c | 34 +++++++++++++++++++--------- 1 files changed, 23 insertions(+), 11 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -8890,20 +8890,32 @@ /* Catenate everything. */ for (i = 0, res_offset = 0; i < seqlen; ++i) { - Py_ssize_t itemlen; + Py_ssize_t itemlen, copied; item = items[i]; + /* Copy item, and maybe the separator. */ + if (i && seplen != 0) { + copied = PyUnicode_CopyCharacters(res, res_offset, + sep, 0, seplen); + if (copied < 0) + goto onError; +#ifdef Py_DEBUG + res_offset += copied; +#else + res_offset += seplen; +#endif + } itemlen = PyUnicode_GET_LENGTH(item); - /* Copy item, and maybe the separator. 
*/ - if (i) { - if (PyUnicode_CopyCharacters(res, res_offset, - sep, 0, seplen) < 0) + if (itemlen != 0) { + copied = PyUnicode_CopyCharacters(res, res_offset, + item, 0, itemlen); + if (copied < 0) goto onError; - res_offset += seplen; - } - if (PyUnicode_CopyCharacters(res, res_offset, - item, 0, itemlen) < 0) - goto onError; - res_offset += itemlen; +#ifdef Py_DEBUG + res_offset += copied; +#else + res_offset += itemlen; +#endif + } } assert(res_offset == PyUnicode_GET_LENGTH(res)); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 00:09:10 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 04 Oct 2011 00:09:10 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Add_=5FPyUnicode=5FHAS=5FWS?= =?utf8?q?TR=5FMEMORY=28=29_macro?= Message-ID: http://hg.python.org/cpython/rev/65ff63a8347b changeset: 72624:65ff63a8347b user: Victor Stinner date: Mon Oct 03 23:45:12 2011 +0200 summary: Add _PyUnicode_HAS_WSTR_MEMORY() macro files: Objects/unicodeobject.c | 15 ++++++++++----- 1 files changed, 10 insertions(+), 5 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -151,6 +151,14 @@ && _PyUnicode_UTF8(op) \ && _PyUnicode_UTF8(op) != PyUnicode_DATA(op))) +/* true if the Unicode object has an allocated wstr memory block + (not shared with other data) */ +#define _PyUnicode_HAS_WSTR_MEMORY(op) \ + (assert(_PyUnicode_CHECK(op)), \ + (_PyUnicode_WSTR(op) && \ + (!PyUnicode_IS_READY(op) || \ + _PyUnicode_WSTR(op) != PyUnicode_DATA(op)))) + /* Generic helper macro to convert characters of different types. 
from_type and to_type have to be valid type names, begin and end are pointers to the source characters which should be of type @@ -1238,9 +1246,7 @@ Py_FatalError("Inconsistent interned string state."); } - if (_PyUnicode_WSTR(unicode) && - (!PyUnicode_IS_READY(unicode) || - _PyUnicode_WSTR(unicode) != PyUnicode_DATA(unicode))) + if (_PyUnicode_HAS_WSTR_MEMORY(unicode)) PyObject_DEL(_PyUnicode_WSTR(unicode)); if (_PyUnicode_HAS_UTF8_MEMORY(unicode)) PyObject_DEL(_PyUnicode_UTF8(unicode)); @@ -12061,8 +12067,7 @@ } /* If the wstr pointer is present, account for it unless it is shared with the data pointer. Check if the data is not shared. */ - if (_PyUnicode_WSTR(v) && - (PyUnicode_DATA(v) != _PyUnicode_WSTR(v))) + if (_PyUnicode_HAS_WSTR_MEMORY(v)) size += (PyUnicode_WSTR_LENGTH(v) + 1) * sizeof(wchar_t); if (_PyUnicode_HAS_UTF8_MEMORY(v)) size += PyUnicode_UTF8_LENGTH(v) + 1; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 00:09:11 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 04 Oct 2011 00:09:11 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Unicode=3A_document_when_th?= =?utf8?q?e_wstr_pointer_is_shared_with_data?= Message-ID: http://hg.python.org/cpython/rev/3889fa2194f2 changeset: 72625:3889fa2194f2 user: Victor Stinner date: Tue Oct 04 00:00:20 2011 +0200 summary: Unicode: document when the wstr pointer is shared with data Add also related assertions to _PyUnicode_CheckConsistency(). 
files: Include/unicodeobject.h | 8 +++++++- Objects/unicodeobject.c | 24 +++++++++++++++++++++++- 2 files changed, 30 insertions(+), 2 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -226,6 +226,9 @@ * ready = 1 * ascii = 0 * utf8 != data + * wstr is shared with data if kind=PyUnicode_2BYTE_KIND + and sizeof(wchar_t)=2 or if kind=PyUnicode_4BYTE_KIND and + sizeof(wchar_4)=4 - legacy string, not ready: @@ -247,7 +250,10 @@ * compact = 0 * ready = 1 * data.any is not NULL - * utf8 = data if ascii is 1 + * utf8 is shared with data.any if ascii = 1 + * wstr is shared with data.any if kind=PyUnicode_2BYTE_KIND + and sizeof(wchar_t)=2 or if kind=PyUnicode_4BYTE_KIND and + sizeof(wchar_4)=4 Compact strings use only one memory block (structure + characters), whereas legacy strings use one block for the structure and one block diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -302,12 +302,24 @@ } else if (ascii->state.compact == 1) { PyCompactUnicodeObject *compact = (PyCompactUnicodeObject *)op; + void *data; assert(kind == PyUnicode_1BYTE_KIND || kind == PyUnicode_2BYTE_KIND || kind == PyUnicode_4BYTE_KIND); assert(ascii->state.ascii == 0); assert(ascii->state.ready == 1); - assert (compact->utf8 != (void*)(compact + 1)); + data = compact + 1; + assert (compact->utf8 != data); + if ( +#if SIZEOF_WCHAR_T == 2 + kind == PyUnicode_2BYTE_KIND +#else + kind == PyUnicode_4BYTE_KIND +#endif + ) + assert(ascii->wstr == data); + else + assert(ascii->wstr != data); } else { PyCompactUnicodeObject *compact = (PyCompactUnicodeObject *)op; PyUnicodeObject *unicode = (PyUnicodeObject *)op; @@ -332,6 +344,16 @@ assert (compact->utf8 == unicode->data.any); else assert (compact->utf8 != unicode->data.any); + if ( +#if SIZEOF_WCHAR_T == 2 + kind == PyUnicode_2BYTE_KIND +#else + kind == PyUnicode_4BYTE_KIND +#endif + ) + 
assert(ascii->wstr == unicode->data.any); + else + assert(ascii->wstr != unicode->data.any); } } return 1; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 00:09:12 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 04 Oct 2011 00:09:12 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Unicode=3A_raise_SystemErro?= =?utf8?q?r_instead_of_ValueError_or_RuntimeError_on_invalid?= Message-ID: http://hg.python.org/cpython/rev/721bb2e59815 changeset: 72626:721bb2e59815 user: Victor Stinner date: Tue Oct 04 00:04:26 2011 +0200 summary: Unicode: raise SystemError instead of ValueError or RuntimeError on invalid state files: Objects/unicodeobject.c | 12 ++++++------ 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -898,7 +898,7 @@ { assert(_PyUnicode_CHECK(unicode)); if (Py_REFCNT(unicode) != 1) { - PyErr_SetString(PyExc_ValueError, + PyErr_SetString(PyExc_SystemError, "Cannot modify a string having more than 1 reference"); return -1; } @@ -926,7 +926,7 @@ how_many = Py_MIN(PyUnicode_GET_LENGTH(from), how_many); if (to_start + how_many > PyUnicode_GET_LENGTH(to)) { - PyErr_Format(PyExc_ValueError, + PyErr_Format(PyExc_SystemError, "Cannot write %zi characters at %zi " "in a string of %zi characters", how_many, to_start, PyUnicode_GET_LENGTH(to)); @@ -1015,7 +1015,7 @@ else invalid_kinds = 1; if (invalid_kinds) { - PyErr_Format(PyExc_ValueError, + PyErr_Format(PyExc_SystemError, "Cannot copy %s characters " "into a string of %s characters", unicode_kind_name(from), @@ -1562,7 +1562,7 @@ case PyUnicode_4BYTE_KIND: return _PyUnicode_FromUCS4(buffer, size); } - PyErr_SetString(PyExc_ValueError, "invalid kind"); + PyErr_SetString(PyExc_SystemError, "invalid kind"); return NULL; } @@ -1622,7 +1622,7 @@ len = PyUnicode_GET_LENGTH(s); skind = PyUnicode_KIND(s); if (skind >= kind) { - 
PyErr_SetString(PyExc_RuntimeError, "invalid widening attempt"); + PyErr_SetString(PyExc_SystemError, "invalid widening attempt"); return NULL; } switch(kind) { @@ -1660,7 +1660,7 @@ default: break; } - PyErr_SetString(PyExc_ValueError, "invalid kind"); + PyErr_SetString(PyExc_SystemError, "invalid kind"); return NULL; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 01:16:14 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 04 Oct 2011 01:16:14 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_PyUnicode=5FNew=28=29_sets_?= =?utf8?q?utf8=5Flength_to_zero_for_latin1?= Message-ID: http://hg.python.org/cpython/rev/e59f4265033b changeset: 72627:e59f4265033b user: Victor Stinner date: Tue Oct 04 01:02:02 2011 +0200 summary: PyUnicode_New() sets utf8_length to zero for latin1 files: Objects/unicodeobject.c | 7 +++++-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -755,7 +755,7 @@ PyCompactUnicodeObject *unicode; void *data; int kind_state; - int is_sharing = 0, is_ascii = 0; + int is_sharing, is_ascii; Py_ssize_t char_size; Py_ssize_t struct_size; @@ -769,6 +769,8 @@ ++unicode_new_new_calls; #endif + is_ascii = 0; + is_sharing = 0; struct_size = sizeof(PyCompactUnicodeObject); if (maxchar < 128) { kind_state = PyUnicode_1BYTE_KIND; @@ -833,11 +835,12 @@ ((char*)data)[size] = 0; _PyUnicode_WSTR(unicode) = NULL; _PyUnicode_WSTR_LENGTH(unicode) = 0; + unicode->utf8 = NULL; unicode->utf8_length = 0; - unicode->utf8 = NULL; } else { unicode->utf8 = NULL; + unicode->utf8_length = 0; if (kind_state == PyUnicode_2BYTE_KIND) ((Py_UCS2*)data)[size] = 0; else /* kind_state == PyUnicode_4BYTE_KIND */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 01:16:15 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 04 Oct 2011 01:16:15 
+0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_resize=5Finplace=28=29_sets?= =?utf8?q?_utf8=5Flength_to_zero_if_the_utf8_is_not_shared8?= Message-ID: http://hg.python.org/cpython/rev/96a9f62c6b6d changeset: 72628:96a9f62c6b6d user: Victor Stinner date: Tue Oct 04 01:03:50 2011 +0200 summary: resize_inplace() sets utf8_length to zero if the utf8 is not shared8 Cleanup also the code. files: Objects/unicodeobject.c | 59 +++++++++++++++------------- 1 files changed, 32 insertions(+), 27 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -476,21 +476,14 @@ } static int -resize_inplace(register PyUnicodeObject *unicode, Py_ssize_t length) -{ - void *oldstr; - +resize_inplace(PyUnicodeObject *unicode, Py_ssize_t length) +{ + wchar_t *wstr; assert(!PyUnicode_IS_COMPACT(unicode)); - assert(Py_REFCNT(unicode) == 1); + _PyUnicode_DIRTY(unicode); - if (_PyUnicode_HAS_UTF8_MEMORY(unicode)) - { - PyObject_DEL(_PyUnicode_UTF8(unicode)); - _PyUnicode_UTF8(unicode) = NULL; - } - if (PyUnicode_IS_READY(unicode)) { Py_ssize_t char_size; Py_ssize_t new_size; @@ -502,6 +495,12 @@ char_size = PyUnicode_CHARACTER_SIZE(unicode); share_wstr = _PyUnicode_SHARE_WSTR(unicode); share_utf8 = _PyUnicode_SHARE_UTF8(unicode); + if (!share_utf8 && _PyUnicode_HAS_UTF8_MEMORY(unicode)) + { + PyObject_DEL(_PyUnicode_UTF8(unicode)); + _PyUnicode_UTF8(unicode) = NULL; + _PyUnicode_UTF8_LENGTH(unicode) = 0; + } if (length > (PY_SSIZE_T_MAX / char_size - 1)) { PyErr_NoMemory(); @@ -525,23 +524,28 @@ } _PyUnicode_LENGTH(unicode) = length; PyUnicode_WRITE(PyUnicode_KIND(unicode), data, length, 0); - if (share_wstr) + if (share_wstr || _PyUnicode_WSTR(unicode) == NULL) { + _PyUnicode_CHECK(unicode); return 0; - } - if (_PyUnicode_WSTR(unicode) != NULL) { - assert(_PyUnicode_WSTR(unicode) != NULL); - - oldstr = _PyUnicode_WSTR(unicode); - _PyUnicode_WSTR(unicode) = PyObject_REALLOC(_PyUnicode_WSTR(unicode), - 
sizeof(Py_UNICODE) * (length + 1)); - if (!_PyUnicode_WSTR(unicode)) { - _PyUnicode_WSTR(unicode) = (Py_UNICODE *)oldstr; - PyErr_NoMemory(); - return -1; - } - _PyUnicode_WSTR(unicode)[length] = 0; - _PyUnicode_WSTR_LENGTH(unicode) = length; - } + } + } + assert(_PyUnicode_WSTR(unicode) != NULL); + + /* check for integer overflow */ + if (length > PY_SSIZE_T_MAX / sizeof(wchar_t) - 1) { + PyErr_NoMemory(); + return -1; + } + wstr = _PyUnicode_WSTR(unicode); + wstr = PyObject_REALLOC(wstr, sizeof(wchar_t) * (length + 1)); + if (!wstr) { + PyErr_NoMemory(); + return -1; + } + _PyUnicode_WSTR(unicode) = wstr; + _PyUnicode_WSTR(unicode)[length] = 0; + _PyUnicode_WSTR_LENGTH(unicode) = length; + _PyUnicode_CHECK(unicode); return 0; } @@ -1339,6 +1343,7 @@ *p_unicode = resize_compact(unicode, length); if (*p_unicode == NULL) return -1; + _PyUnicode_CHECK(*p_unicode); return 0; } else return resize_inplace((PyUnicodeObject*)unicode, length); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 01:16:16 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 04 Oct 2011 01:16:16 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Document_utf8=5Flength_and_?= =?utf8?q?wstr=5Flength_states?= Message-ID: http://hg.python.org/cpython/rev/5346409167a7 changeset: 72629:5346409167a7 user: Victor Stinner date: Tue Oct 04 01:05:08 2011 +0200 summary: Document utf8_length and wstr_length states Ensure these states with assertions in _PyUnicode_CheckConsistency(). 
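The utf8_length/wstr_length states being documented here are part of PEP 393's flexible string representation, and the core idea is observable from pure Python: CPython stores each string with the narrowest per-character width (1, 2, or 4 bytes) that fits its widest code point. A rough sketch of that behavior, assuming a CPython build (the exact byte counts printed are build-dependent and illustrative only):

```python
import sys

# Three 3-character strings whose widest code points force different
# PEP 393 "kinds": 1 byte/char (latin-1 range), 2 bytes/char (BMP),
# and 4 bytes/char (astral plane).
narrow = 'abc'
medium = '\u0101bc'
wide = '\U00010101bc'

for s in (narrow, medium, wide):
    print(len(s), max(map(ord, s)), sys.getsizeof(s))

# Same character count, but wider kinds cost more memory per string.
assert sys.getsizeof(narrow) < sys.getsizeof(medium) < sys.getsizeof(wide)
```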
files: Include/unicodeobject.h | 19 +++-- Objects/unicodeobject.c | 88 +++++++++++++++------------- 2 files changed, 58 insertions(+), 49 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -226,9 +226,11 @@ * ready = 1 * ascii = 0 * utf8 != data - * wstr is shared with data if kind=PyUnicode_2BYTE_KIND - and sizeof(wchar_t)=2 or if kind=PyUnicode_4BYTE_KIND and - sizeof(wchar_4)=4 + * utf8_length = 0 if utf8 is NULL + * wstr is shared with data and wstr_length=length + if kind=PyUnicode_2BYTE_KIND and sizeof(wchar_t)=2 + or if kind=PyUnicode_4BYTE_KIND and sizeof(wchar_4)=4 + * wstr_length = 0 if wstr is NULL - legacy string, not ready: @@ -239,6 +241,7 @@ * wstr is not NULL * data.any is NULL * utf8 is NULL + * utf8_length = 0 * interned = SSTATE_NOT_INTERNED * ascii = 0 @@ -250,10 +253,12 @@ * compact = 0 * ready = 1 * data.any is not NULL - * utf8 is shared with data.any if ascii = 1 - * wstr is shared with data.any if kind=PyUnicode_2BYTE_KIND - and sizeof(wchar_t)=2 or if kind=PyUnicode_4BYTE_KIND and - sizeof(wchar_4)=4 + * utf8 is shared and utf8_length = length with data.any if ascii = 1 + * utf8_length = 0 if utf8 is NULL + * wstr is shared and wstr_length = length with data.any + if kind=PyUnicode_2BYTE_KIND and sizeof(wchar_t)=2 + or if kind=PyUnicode_4BYTE_KIND and sizeof(wchar_4)=4 + * wstr_length = 0 if wstr is NULL Compact strings use only one memory block (structure + characters), whereas legacy strings use one block for the structure and one block diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -300,50 +300,47 @@ assert(kind == PyUnicode_1BYTE_KIND); assert(ascii->state.ready == 1); } - else if (ascii->state.compact == 1) { + else { PyCompactUnicodeObject *compact = (PyCompactUnicodeObject *)op; void *data; - assert(kind == PyUnicode_1BYTE_KIND - || kind == PyUnicode_2BYTE_KIND - || 
kind == PyUnicode_4BYTE_KIND); - assert(ascii->state.ascii == 0); - assert(ascii->state.ready == 1); - data = compact + 1; - assert (compact->utf8 != data); - if ( -#if SIZEOF_WCHAR_T == 2 - kind == PyUnicode_2BYTE_KIND -#else - kind == PyUnicode_4BYTE_KIND -#endif - ) - assert(ascii->wstr == data); - else - assert(ascii->wstr != data); - } else { - PyCompactUnicodeObject *compact = (PyCompactUnicodeObject *)op; - PyUnicodeObject *unicode = (PyUnicodeObject *)op; - - if (kind == PyUnicode_WCHAR_KIND) { - assert(ascii->state.compact == 0); - assert(ascii->state.ascii == 0); - assert(ascii->state.ready == 0); - assert(ascii->wstr != NULL); - assert(unicode->data.any == NULL); - assert(compact->utf8 == NULL); - assert(ascii->state.interned == SSTATE_NOT_INTERNED); - } - else { + + if (ascii->state.compact == 1) { + data = compact + 1; assert(kind == PyUnicode_1BYTE_KIND || kind == PyUnicode_2BYTE_KIND || kind == PyUnicode_4BYTE_KIND); - assert(ascii->state.compact == 0); + assert(ascii->state.ascii == 0); assert(ascii->state.ready == 1); - assert(unicode->data.any != NULL); - if (ascii->state.ascii) - assert (compact->utf8 == unicode->data.any); - else - assert (compact->utf8 != unicode->data.any); + assert (compact->utf8 != data); + } else { + PyUnicodeObject *unicode = (PyUnicodeObject *)op; + + data = unicode->data.any; + if (kind == PyUnicode_WCHAR_KIND) { + assert(ascii->state.compact == 0); + assert(ascii->state.ascii == 0); + assert(ascii->state.ready == 0); + assert(ascii->wstr != NULL); + assert(data == NULL); + assert(compact->utf8 == NULL); + assert(ascii->state.interned == SSTATE_NOT_INTERNED); + } + else { + assert(kind == PyUnicode_1BYTE_KIND + || kind == PyUnicode_2BYTE_KIND + || kind == PyUnicode_4BYTE_KIND); + assert(ascii->state.compact == 0); + assert(ascii->state.ready == 1); + assert(data != NULL); + if (ascii->state.ascii) { + assert (compact->utf8 == data); + assert (compact->utf8_length == ascii->length); + } + else + assert (compact->utf8 != 
data); + } + } + if (kind != PyUnicode_WCHAR_KIND) { if ( #if SIZEOF_WCHAR_T == 2 kind == PyUnicode_2BYTE_KIND @@ -351,10 +348,17 @@ kind == PyUnicode_4BYTE_KIND #endif ) - assert(ascii->wstr == unicode->data.any); - else - assert(ascii->wstr != unicode->data.any); - } + { + assert(ascii->wstr == data); + assert(compact->wstr_length == ascii->length); + } else + assert(ascii->wstr != data); + } + + if (compact->utf8 == NULL) + assert(compact->utf8_length == 0); + if (ascii->wstr == NULL) + assert(compact->wstr_length == 0); } return 1; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 01:16:16 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 04 Oct 2011 01:16:16 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Reindent_internal_Unicode_m?= =?utf8?q?acros?= Message-ID: http://hg.python.org/cpython/rev/3dd3e8ff7296 changeset: 72630:3dd3e8ff7296 user: Victor Stinner date: Tue Oct 04 01:07:11 2011 +0200 summary: Reindent internal Unicode macros files: Objects/unicodeobject.c | 21 ++++++++++++++------- 1 files changed, 14 insertions(+), 7 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -111,24 +111,31 @@ PyUnicode_IS_COMPACT_ASCII(op) ? 
\ ((PyASCIIObject*)(op))->length : \ _PyUnicode_UTF8_LENGTH(op)) -#define _PyUnicode_WSTR(op) (((PyASCIIObject*)(op))->wstr) -#define _PyUnicode_WSTR_LENGTH(op) (((PyCompactUnicodeObject*)(op))->wstr_length) -#define _PyUnicode_LENGTH(op) (((PyASCIIObject *)(op))->length) -#define _PyUnicode_STATE(op) (((PyASCIIObject *)(op))->state) -#define _PyUnicode_HASH(op) (((PyASCIIObject *)(op))->hash) +#define _PyUnicode_WSTR(op) \ + (((PyASCIIObject*)(op))->wstr) +#define _PyUnicode_WSTR_LENGTH(op) \ + (((PyCompactUnicodeObject*)(op))->wstr_length) +#define _PyUnicode_LENGTH(op) \ + (((PyASCIIObject *)(op))->length) +#define _PyUnicode_STATE(op) \ + (((PyASCIIObject *)(op))->state) +#define _PyUnicode_HASH(op) \ + (((PyASCIIObject *)(op))->hash) #define _PyUnicode_KIND(op) \ (assert(_PyUnicode_CHECK(op)), \ ((PyASCIIObject *)(op))->state.kind) #define _PyUnicode_GET_LENGTH(op) \ (assert(_PyUnicode_CHECK(op)), \ ((PyASCIIObject *)(op))->length) -#define _PyUnicode_DATA_ANY(op) (((PyUnicodeObject*)(op))->data.any) +#define _PyUnicode_DATA_ANY(op) \ + (((PyUnicodeObject*)(op))->data.any) #undef PyUnicode_READY #define PyUnicode_READY(op) \ (assert(_PyUnicode_CHECK(op)), \ (PyUnicode_IS_READY(op) ? 
\ - 0 : _PyUnicode_Ready((PyObject *)(op)))) + 0 : \ + _PyUnicode_Ready((PyObject *)(op)))) #define _PyUnicode_READY_REPLACE(p_obj) \ (assert(_PyUnicode_CHECK(*p_obj)), \ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 01:17:22 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 04 Oct 2011 01:17:22 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Move_in-place_Unicode_appen?= =?utf8?q?d_to_its_own_subfunction?= Message-ID: http://hg.python.org/cpython/rev/714a039c6fa6 changeset: 72631:714a039c6fa6 user: Victor Stinner date: Tue Oct 04 01:17:31 2011 +0200 summary: Move in-place Unicode append to its own subfunction files: Objects/unicodeobject.c | 92 +++++++++++++++++----------- 1 files changed, 54 insertions(+), 38 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -9967,6 +9967,54 @@ return NULL; } +static void +unicode_append_inplace(PyObject **p_left, PyObject *right) +{ + Py_ssize_t left_len, right_len, new_len; +#ifdef Py_DEBUG + Py_ssize_t copied; +#endif + + assert(PyUnicode_IS_READY(*p_left)); + assert(PyUnicode_IS_READY(right)); + + left_len = PyUnicode_GET_LENGTH(*p_left); + right_len = PyUnicode_GET_LENGTH(right); + if (left_len > PY_SSIZE_T_MAX - right_len) { + PyErr_SetString(PyExc_OverflowError, + "strings are too large to concat"); + goto error; + } + new_len = left_len + right_len; + + /* Now we own the last reference to 'left', so we can resize it + * in-place. + */ + if (unicode_resize(p_left, new_len) != 0) { + /* XXX if _PyUnicode_Resize() fails, 'left' has been + * deallocated so it cannot be put back into + * 'variable'. The MemoryError is raised when there + * is no value in 'variable', which might (very + * remotely) be a cause of incompatibilities. 
+ */ + goto error; + } + /* copy 'right' into the newly allocated area of 'left' */ +#ifdef Py_DEBUG + copied = PyUnicode_CopyCharacters(*p_left, left_len, + right, 0, + right_len); + assert(0 <= copied); +#else + PyUnicode_CopyCharacters(*p_left, left_len, right, 0, right_len); +#endif + return; + +error: + Py_DECREF(*p_left); + *p_left = NULL; +} + void PyUnicode_Append(PyObject **p_left, PyObject *right) { @@ -9990,50 +10038,18 @@ && (_PyUnicode_KIND(right) <= _PyUnicode_KIND(left) || _PyUnicode_WSTR(left) != NULL)) { - Py_ssize_t left_len, right_len, new_len; -#ifdef Py_DEBUG - Py_ssize_t copied; -#endif - if (PyUnicode_READY(left)) goto error; if (PyUnicode_READY(right)) goto error; - /* FIXME: support ascii+latin1, PyASCIIObject => PyCompactUnicodeObject */ - if (PyUnicode_MAX_CHAR_VALUE(right) <= PyUnicode_MAX_CHAR_VALUE(left)) + /* Don't resize for ascii += latin1. Convert ascii to latin1 requires + to change the structure size, but characters are stored just after + the structure, and so it requires to move all charactres which is + not so different than duplicating the string. */ + if (!(PyUnicode_IS_ASCII(left) && !PyUnicode_IS_ASCII(right))) { - left_len = PyUnicode_GET_LENGTH(left); - right_len = PyUnicode_GET_LENGTH(right); - if (left_len > PY_SSIZE_T_MAX - right_len) { - PyErr_SetString(PyExc_OverflowError, - "strings are too large to concat"); - goto error; - } - new_len = left_len + right_len; - - /* Now we own the last reference to 'left', so we can resize it - * in-place. - */ - if (unicode_resize(&left, new_len) != 0) { - /* XXX if _PyUnicode_Resize() fails, 'left' has been - * deallocated so it cannot be put back into - * 'variable'. The MemoryError is raised when there - * is no value in 'variable', which might (very - * remotely) be a cause of incompatibilities. 
- */ - goto error; - } - /* copy 'right' into the newly allocated area of 'left' */ -#ifdef Py_DEBUG - copied = PyUnicode_CopyCharacters(left, left_len, - right, 0, - right_len); - assert(0 <= copied); -#else - PyUnicode_CopyCharacters(left, left_len, right, 0, right_len); -#endif - *p_left = left; + unicode_append_inplace(p_left, right); return; } } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 01:32:25 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 04 Oct 2011 01:32:25 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Complete_documentation_of_c?= =?utf8?q?ompact_ASCII_strings?= Message-ID: http://hg.python.org/cpython/rev/9c9ebb07d053 changeset: 72632:9c9ebb07d053 user: Victor Stinner date: Tue Oct 04 01:32:45 2011 +0200 summary: Complete documentation of compact ASCII strings files: Include/unicodeobject.h | 9 ++++++--- 1 files changed, 6 insertions(+), 3 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -215,7 +215,9 @@ * compact = 1 * ascii = 1 * ready = 1 - * utf8 = data + * (length is the length of the utf8 and wstr strings) + * (data starts just after the structure) + * (since ASCII is decoded from UTF-8, the utf8 string are the data) - compact: @@ -225,25 +227,26 @@ * compact = 1 * ready = 1 * ascii = 0 - * utf8 != data + * utf8 is not shared with data * utf8_length = 0 if utf8 is NULL * wstr is shared with data and wstr_length=length if kind=PyUnicode_2BYTE_KIND and sizeof(wchar_t)=2 or if kind=PyUnicode_4BYTE_KIND and sizeof(wchar_4)=4 * wstr_length = 0 if wstr is NULL + * (data starts just after the structure) - legacy string, not ready: * structure = PyUnicodeObject * kind = PyUnicode_WCHAR_KIND * compact = 0 + * ascii = 0 * ready = 0 * wstr is not NULL * data.any is NULL * utf8 is NULL * utf8_length = 0 * interned = SSTATE_NOT_INTERNED - * ascii = 0 - legacy string, ready: -- Repository URL: 
http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 01:34:17 2011 From: python-checkins at python.org (benjamin.peterson) Date: Tue, 04 Oct 2011 01:34:17 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_fix_compiler_warnings?= Message-ID: http://hg.python.org/cpython/rev/afb60b190f1c changeset: 72633:afb60b190f1c user: Benjamin Peterson date: Mon Oct 03 19:34:12 2011 -0400 summary: fix compiler warnings files: Objects/unicodeobject.c | 12 +++++++++--- 1 files changed, 9 insertions(+), 3 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -369,6 +369,12 @@ } return 1; } +#else +static int +_PyUnicode_CheckConsistency(void *op) +{ + return 1; +} #endif /* --- Bloom Filters ----------------------------------------------------- */ @@ -536,7 +542,7 @@ _PyUnicode_LENGTH(unicode) = length; PyUnicode_WRITE(PyUnicode_KIND(unicode), data, length, 0); if (share_wstr || _PyUnicode_WSTR(unicode) == NULL) { - _PyUnicode_CHECK(unicode); + _PyUnicode_CheckConsistency(unicode); return 0; } } @@ -556,7 +562,7 @@ _PyUnicode_WSTR(unicode) = wstr; _PyUnicode_WSTR(unicode)[length] = 0; _PyUnicode_WSTR_LENGTH(unicode) = length; - _PyUnicode_CHECK(unicode); + _PyUnicode_CheckConsistency(unicode); return 0; } @@ -1354,7 +1360,7 @@ *p_unicode = resize_compact(unicode, length); if (*p_unicode == NULL) return -1; - _PyUnicode_CHECK(*p_unicode); + _PyUnicode_CheckConsistency(*p_unicode); return 0; } else return resize_inplace((PyUnicodeObject*)unicode, length); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 01:35:12 2011 From: python-checkins at python.org (benjamin.peterson) Date: Tue, 04 Oct 2011 01:35:12 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_fix_formatting?= Message-ID: http://hg.python.org/cpython/rev/64495ad8aa54 changeset: 72634:64495ad8aa54 user: Benjamin Peterson date: Mon Oct 03 19:35:07 2011 -0400 summary: fix 
formatting files: Objects/unicodeobject.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1362,8 +1362,8 @@ return -1; _PyUnicode_CheckConsistency(*p_unicode); return 0; - } else - return resize_inplace((PyUnicodeObject*)unicode, length); + } + return resize_inplace((PyUnicodeObject*)unicode, length); } int -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 01:37:36 2011 From: python-checkins at python.org (benjamin.peterson) Date: Tue, 04 Oct 2011 01:37:36 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_fix_parens?= Message-ID: http://hg.python.org/cpython/rev/61de28fa5537 changeset: 72635:61de28fa5537 user: Benjamin Peterson date: Mon Oct 03 19:37:29 2011 -0400 summary: fix parens files: Objects/unicodeobject.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1314,7 +1314,7 @@ return 0; if (PyUnicode_CHECK_INTERNED(unicode)) return 0; - assert (unicode != unicode_empty); + assert(unicode != unicode_empty); #ifdef Py_DEBUG if (_PyUnicode_KIND(unicode) != PyUnicode_WCHAR_KIND && PyUnicode_GET_LENGTH(unicode) == 1) -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Tue Oct 4 05:26:43 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Tue, 04 Oct 2011 05:26:43 +0200 Subject: [Python-checkins] Daily reference leaks (61de28fa5537): sum=0 Message-ID: results for 61de28fa5537 on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogsu5co9', '-x'] From python-checkins at python.org Tue Oct 4 05:40:20 2011 From: python-checkins at python.org (meador.inge) Date: Tue, 04 Oct 2011 05:40:20 +0200 Subject: 
[Python-checkins] =?utf8?b?Y3B5dGhvbiAoMi43KTogSXNzdWUgIzEyODgx?= =?utf8?q?=3A_ctypes=3A_Fix_segfault_with_large_structure_field_names=2E?= Message-ID: http://hg.python.org/cpython/rev/aa3ebc2dfc15 changeset: 72636:aa3ebc2dfc15 branch: 2.7 parent: 72590:277688052c5a user: Meador Inge date: Mon Oct 03 21:34:04 2011 -0500 summary: Issue #12881: ctypes: Fix segfault with large structure field names. files: Lib/ctypes/test/test_structures.py | 12 ++++++++++++ Misc/NEWS | 2 ++ Modules/_ctypes/stgdict.c | 8 +++++++- 3 files changed, 21 insertions(+), 1 deletions(-) diff --git a/Lib/ctypes/test/test_structures.py b/Lib/ctypes/test/test_structures.py --- a/Lib/ctypes/test/test_structures.py +++ b/Lib/ctypes/test/test_structures.py @@ -332,6 +332,18 @@ else: self.assertEqual(msg, "(Phone) exceptions.TypeError: too many initializers") + def test_huge_field_name(self): + # issue12881: segfault with large structure field names + def create_class(length): + class S(Structure): + _fields_ = [('x' * length, c_int)] + + for length in [10 ** i for i in range(0, 8)]: + try: + create_class(length) + except MemoryError: + # MemoryErrors are OK, we just don't want to segfault + pass def get_except(self, func, *args): try: diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -210,6 +210,8 @@ Extension Modules ----------------- +- Issue #12881: ctypes: Fix segfault with large structure field names. + - Issue #13013: ctypes: Fix a reference leak in PyCArrayType_from_ctype. Thanks to Suman Saha for finding the bug and providing a patch. 
diff --git a/Modules/_ctypes/stgdict.c b/Modules/_ctypes/stgdict.c --- a/Modules/_ctypes/stgdict.c +++ b/Modules/_ctypes/stgdict.c @@ -508,13 +508,19 @@ } len = strlen(fieldname) + strlen(fieldfmt); - buf = alloca(len + 2 + 1); + buf = PyMem_Malloc(len + 2 + 1); + if (buf == NULL) { + Py_DECREF(pair); + PyErr_NoMemory(); + return -1; + } sprintf(buf, "%s:%s:", fieldfmt, fieldname); ptr = stgdict->format; stgdict->format = _ctypes_alloc_format_string(stgdict->format, buf); PyMem_Free(ptr); + PyMem_Free(buf); if (stgdict->format == NULL) { Py_DECREF(pair); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 05:40:21 2011 From: python-checkins at python.org (meador.inge) Date: Tue, 04 Oct 2011 05:40:21 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogSXNzdWUgIzEyODgx?= =?utf8?q?=3A_ctypes=3A_Fix_segfault_with_large_structure_field_names=2E?= Message-ID: http://hg.python.org/cpython/rev/d05350c14e77 changeset: 72637:d05350c14e77 branch: 3.2 parent: 72588:a3f2dba93743 user: Meador Inge date: Mon Oct 03 21:44:22 2011 -0500 summary: Issue #12881: ctypes: Fix segfault with large structure field names. 
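The segfault came from building the struct format string for a field name of arbitrary length in a stack-allocated `alloca()` buffer; the fix switches to `PyMem_Malloc()` with proper error handling. From Python the failure mode is reachable just by declaring a `Structure` with a very long field name, which is what the new regression test does. A condensed sketch of that test pattern (lengths trimmed relative to the committed test):

```python
from ctypes import Structure, c_int

def create_class(length):
    # Each field name contributes its full length to the format string
    # built in stgdict.c; huge names used to overflow the alloca() buffer.
    class S(Structure):
        _fields_ = [('x' * length, c_int)]
    return S

for length in (10 ** i for i in range(6)):
    try:
        create_class(length)
    except MemoryError:
        pass  # running out of memory is acceptable; a segfault is not
```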
files: Lib/ctypes/test/test_structures.py | 12 ++++++++++++ Misc/NEWS | 2 ++ Modules/_ctypes/stgdict.c | 8 +++++++- 3 files changed, 21 insertions(+), 1 deletions(-) diff --git a/Lib/ctypes/test/test_structures.py b/Lib/ctypes/test/test_structures.py --- a/Lib/ctypes/test/test_structures.py +++ b/Lib/ctypes/test/test_structures.py @@ -326,6 +326,18 @@ else: self.assertEqual(msg, "(Phone) TypeError: too many initializers") + def test_huge_field_name(self): + # issue12881: segfault with large structure field names + def create_class(length): + class S(Structure): + _fields_ = [('x' * length, c_int)] + + for length in [10 ** i for i in range(0, 8)]: + try: + create_class(length) + except MemoryError: + # MemoryErrors are OK, we just don't want to segfault + pass def get_except(self, func, *args): try: diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -87,6 +87,8 @@ Extension Modules ----------------- +- Issue #12881: ctypes: Fix segfault with large structure field names. + - Issue #13058: ossaudiodev: fix a file descriptor leak on error. Patch by Thomas Jarosch. 
diff --git a/Modules/_ctypes/stgdict.c b/Modules/_ctypes/stgdict.c --- a/Modules/_ctypes/stgdict.c +++ b/Modules/_ctypes/stgdict.c @@ -496,13 +496,19 @@ } len = strlen(fieldname) + strlen(fieldfmt); - buf = alloca(len + 2 + 1); + buf = PyMem_Malloc(len + 2 + 1); + if (buf == NULL) { + Py_DECREF(pair); + PyErr_NoMemory(); + return -1; + } sprintf(buf, "%s:%s:", fieldfmt, fieldname); ptr = stgdict->format; stgdict->format = _ctypes_alloc_format_string(stgdict->format, buf); PyMem_Free(ptr); + PyMem_Free(buf); if (stgdict->format == NULL) { Py_DECREF(pair); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 05:40:21 2011 From: python-checkins at python.org (meador.inge) Date: Tue, 04 Oct 2011 05:40:21 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Issue_=2312881=3A_ctypes=3A_Fix_segfault_with_large_structur?= =?utf8?q?e_field_names=2E?= Message-ID: http://hg.python.org/cpython/rev/2eab632864f6 changeset: 72638:2eab632864f6 parent: 72635:61de28fa5537 parent: 72637:d05350c14e77 user: Meador Inge date: Mon Oct 03 21:48:30 2011 -0500 summary: Issue #12881: ctypes: Fix segfault with large structure field names. 
files: Lib/ctypes/test/test_structures.py | 12 ++++++++++++ Misc/NEWS | 2 ++ Modules/_ctypes/stgdict.c | 8 +++++++- 3 files changed, 21 insertions(+), 1 deletions(-) diff --git a/Lib/ctypes/test/test_structures.py b/Lib/ctypes/test/test_structures.py --- a/Lib/ctypes/test/test_structures.py +++ b/Lib/ctypes/test/test_structures.py @@ -326,6 +326,18 @@ else: self.assertEqual(msg, "(Phone) TypeError: too many initializers") + def test_huge_field_name(self): + # issue12881: segfault with large structure field names + def create_class(length): + class S(Structure): + _fields_ = [('x' * length, c_int)] + + for length in [10 ** i for i in range(0, 8)]: + try: + create_class(length) + except MemoryError: + # MemoryErrors are OK, we just don't want to segfault + pass def get_except(self, func, *args): try: diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -1303,6 +1303,8 @@ Extension Modules ----------------- +- Issue #12881: ctypes: Fix segfault with large structure field names. + - Issue #13058: ossaudiodev: fix a file descriptor leak on error. Patch by Thomas Jarosch. 
diff --git a/Modules/_ctypes/stgdict.c b/Modules/_ctypes/stgdict.c --- a/Modules/_ctypes/stgdict.c +++ b/Modules/_ctypes/stgdict.c @@ -493,13 +493,19 @@ } len = strlen(fieldname) + strlen(fieldfmt); - buf = alloca(len + 2 + 1); + buf = PyMem_Malloc(len + 2 + 1); + if (buf == NULL) { + Py_DECREF(pair); + PyErr_NoMemory(); + return -1; + } sprintf(buf, "%s:%s:", fieldfmt, fieldname); ptr = stgdict->format; stgdict->format = _ctypes_alloc_format_string(stgdict->format, buf); PyMem_Free(ptr); + PyMem_Free(buf); if (stgdict->format == NULL) { Py_DECREF(pair); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 09:34:54 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 09:34:54 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogSXNzdWUgIzc2ODk6?= =?utf8?q?_Allow_pickling_of_dynamically_created_classes_when_their?= Message-ID: http://hg.python.org/cpython/rev/760ac320fa3d changeset: 72639:760ac320fa3d branch: 3.2 parent: 72637:d05350c14e77 user: Antoine Pitrou date: Tue Oct 04 09:23:04 2011 +0200 summary: Issue #7689: Allow pickling of dynamically created classes when their metaclass is registered with copyreg. Patch by Nicolas M. Thi?ry and Craig Citro. 
files: Lib/pickle.py | 18 +++++++++--------- Lib/test/pickletester.py | 21 +++++++++++++++++++++ Misc/ACKS | 2 ++ Misc/NEWS | 4 ++++ Modules/_pickle.c | 8 ++++---- 5 files changed, 40 insertions(+), 13 deletions(-) diff --git a/Lib/pickle.py b/Lib/pickle.py --- a/Lib/pickle.py +++ b/Lib/pickle.py @@ -299,20 +299,20 @@ f(self, obj) # Call unbound method with explicit self return - # Check for a class with a custom metaclass; treat as regular class - try: - issc = issubclass(t, type) - except TypeError: # t is not a class (old Boost; see SF #502085) - issc = 0 - if issc: - self.save_global(obj) - return - # Check copyreg.dispatch_table reduce = dispatch_table.get(t) if reduce: rv = reduce(obj) else: + # Check for a class with a custom metaclass; treat as regular class + try: + issc = issubclass(t, type) + except TypeError: # t is not a class (old Boost; see SF #502085) + issc = False + if issc: + self.save_global(obj) + return + # Check for a __reduce_ex__ method, fall back to __reduce__ reduce = getattr(obj, "__reduce_ex__", None) if reduce: diff --git a/Lib/test/pickletester.py b/Lib/test/pickletester.py --- a/Lib/test/pickletester.py +++ b/Lib/test/pickletester.py @@ -121,6 +121,19 @@ class use_metaclass(object, metaclass=metaclass): pass +class pickling_metaclass(type): + def __eq__(self, other): + return (type(self) == type(other) and + self.reduce_args == other.reduce_args) + + def __reduce__(self): + return (create_dynamic_class, self.reduce_args) + +def create_dynamic_class(name, bases): + result = pickling_metaclass(name, bases, dict()) + result.reduce_args = (name, bases) + return result + # DATA0 .. DATA2 are the pickles we expect under the various protocols, for # the object returned by create_data(). 
@@ -695,6 +708,14 @@ b = self.loads(s) self.assertEqual(a.__class__, b.__class__) + def test_dynamic_class(self): + a = create_dynamic_class("my_dynamic_class", (object,)) + copyreg.pickle(pickling_metaclass, pickling_metaclass.__reduce__) + for proto in protocols: + s = self.dumps(a, proto) + b = self.loads(s) + self.assertEqual(a, b) + def test_structseq(self): import time import os diff --git a/Misc/ACKS b/Misc/ACKS --- a/Misc/ACKS +++ b/Misc/ACKS @@ -164,6 +164,7 @@ Tom Christiansen Vadim Chugunov David Cinege +Craig Citro Mike Clarkson Andrew Clegg Brad Clements @@ -881,6 +882,7 @@ Mikhail Terekhov Richard M. Tew Tobias Thelen +Nicolas M. Thi?ry James Thomas Robin Thomas Stephen Thorne diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -36,6 +36,10 @@ Library ------- +- Issue #7689: Allow pickling of dynamically created classes when their + metaclass is registered with copyreg. Patch by Nicolas M. Thi?ry and Craig + Citro. + - Issue #4147: minidom's toprettyxml no longer adds whitespace to text nodes. - Issue #13034: When decoding some SSL certificates, the subjectAltName diff --git a/Modules/_pickle.c b/Modules/_pickle.c --- a/Modules/_pickle.c +++ b/Modules/_pickle.c @@ -3141,10 +3141,6 @@ status = save_global(self, obj, NULL); goto done; } - else if (PyType_IsSubtype(type, &PyType_Type)) { - status = save_global(self, obj, NULL); - goto done; - } /* XXX: This part needs some unit tests. 
*/ @@ -3163,6 +3159,10 @@ Py_INCREF(obj); reduce_value = _Pickler_FastCall(self, reduce_func, obj); } + else if (PyType_IsSubtype(type, &PyType_Type)) { + status = save_global(self, obj, NULL); + goto done; + } else { static PyObject *reduce_str = NULL; static PyObject *reduce_ex_str = NULL; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 09:34:55 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 09:34:55 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Issue_=237689=3A_Allow_pickling_of_dynamically_created_class?= =?utf8?q?es_when_their?= Message-ID: http://hg.python.org/cpython/rev/46c026a5ccb9 changeset: 72640:46c026a5ccb9 parent: 72638:2eab632864f6 parent: 72639:760ac320fa3d user: Antoine Pitrou date: Tue Oct 04 09:25:28 2011 +0200 summary: Issue #7689: Allow pickling of dynamically created classes when their metaclass is registered with copyreg. Patch by Nicolas M. Thi?ry and Craig Citro. 
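The new behaviour can be exercised from Python; a minimal sketch mirroring the pickletester additions (the class names here follow the test, not any public API — the added `__hash__` is our own addition to keep the demo classes hashable):

```python
import copyreg
import pickle

class PicklingMeta(type):
    # Classes created by this metaclass compare equal when their
    # reconstruction arguments match.
    def __eq__(self, other):
        return (type(self) is type(other)
                and self.reduce_args == other.reduce_args)

    def __hash__(self):
        # Defining __eq__ alone would make class objects unhashable.
        return hash(getattr(self, "reduce_args", ()))

    def __reduce__(self):
        return (create_dynamic_class, self.reduce_args)

def create_dynamic_class(name, bases):
    result = PicklingMeta(name, bases, {})
    result.reduce_args = (name, bases)
    return result

# Register the metaclass with copyreg; after this commit, pickle
# consults copyreg.dispatch_table *before* its special-casing of
# type subclasses, so the registered reducer actually runs.
copyreg.pickle(PicklingMeta, PicklingMeta.__reduce__)

cls = create_dynamic_class("MyDynamic", (object,))
restored = pickle.loads(pickle.dumps(cls))
assert restored == cls and restored.__name__ == "MyDynamic"
```

Before the fix, the `PyType_IsSubtype(type, &PyType_Type)` check fired first and pickle tried `save_global` on the dynamically created class, which fails because the class is not importable by name.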
files: Lib/pickle.py | 18 +++++++++--------- Lib/test/pickletester.py | 21 +++++++++++++++++++++ Misc/ACKS | 2 ++ Misc/NEWS | 4 ++++ Modules/_pickle.c | 8 ++++---- 5 files changed, 40 insertions(+), 13 deletions(-) diff --git a/Lib/pickle.py b/Lib/pickle.py --- a/Lib/pickle.py +++ b/Lib/pickle.py @@ -297,20 +297,20 @@ f(self, obj) # Call unbound method with explicit self return - # Check for a class with a custom metaclass; treat as regular class - try: - issc = issubclass(t, type) - except TypeError: # t is not a class (old Boost; see SF #502085) - issc = 0 - if issc: - self.save_global(obj) - return - # Check copyreg.dispatch_table reduce = dispatch_table.get(t) if reduce: rv = reduce(obj) else: + # Check for a class with a custom metaclass; treat as regular class + try: + issc = issubclass(t, type) + except TypeError: # t is not a class (old Boost; see SF #502085) + issc = False + if issc: + self.save_global(obj) + return + # Check for a __reduce_ex__ method, fall back to __reduce__ reduce = getattr(obj, "__reduce_ex__", None) if reduce: diff --git a/Lib/test/pickletester.py b/Lib/test/pickletester.py --- a/Lib/test/pickletester.py +++ b/Lib/test/pickletester.py @@ -122,6 +122,19 @@ class use_metaclass(object, metaclass=metaclass): pass +class pickling_metaclass(type): + def __eq__(self, other): + return (type(self) == type(other) and + self.reduce_args == other.reduce_args) + + def __reduce__(self): + return (create_dynamic_class, self.reduce_args) + +def create_dynamic_class(name, bases): + result = pickling_metaclass(name, bases, dict()) + result.reduce_args = (name, bases) + return result + # DATA0 .. DATA2 are the pickles we expect under the various protocols, for # the object returned by create_data(). 
@@ -696,6 +709,14 @@ b = self.loads(s) self.assertEqual(a.__class__, b.__class__) + def test_dynamic_class(self): + a = create_dynamic_class("my_dynamic_class", (object,)) + copyreg.pickle(pickling_metaclass, pickling_metaclass.__reduce__) + for proto in protocols: + s = self.dumps(a, proto) + b = self.loads(s) + self.assertEqual(a, b) + def test_structseq(self): import time import os diff --git a/Misc/ACKS b/Misc/ACKS --- a/Misc/ACKS +++ b/Misc/ACKS @@ -174,6 +174,7 @@ Tom Christiansen Vadim Chugunov David Cinege +Craig Citro Mike Clarkson Andrew Clegg Brad Clements @@ -941,6 +942,7 @@ Mikhail Terekhov Richard M. Tew Tobias Thelen +Nicolas M. Thi?ry James Thomas Robin Thomas Stephen Thorne diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -294,6 +294,10 @@ Library ------- +- Issue #7689: Allow pickling of dynamically created classes when their + metaclass is registered with copyreg. Patch by Nicolas M. Thi?ry and Craig + Citro. + - Issue #4147: minidom's toprettyxml no longer adds whitespace to text nodes. - Issue #13034: When decoding some SSL certificates, the subjectAltName diff --git a/Modules/_pickle.c b/Modules/_pickle.c --- a/Modules/_pickle.c +++ b/Modules/_pickle.c @@ -3134,10 +3134,6 @@ status = save_global(self, obj, NULL); goto done; } - else if (PyType_IsSubtype(type, &PyType_Type)) { - status = save_global(self, obj, NULL); - goto done; - } /* XXX: This part needs some unit tests. 
*/ @@ -3156,6 +3152,10 @@ Py_INCREF(obj); reduce_value = _Pickler_FastCall(self, reduce_func, obj); } + else if (PyType_IsSubtype(type, &PyType_Type)) { + status = save_global(self, obj, NULL); + goto done; + } else { static PyObject *reduce_str = NULL; static PyObject *reduce_ex_str = NULL; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 09:39:13 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 09:39:13 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMi43KTogSXNzdWUgIzc2ODk6?= =?utf8?q?_Allow_pickling_of_dynamically_created_classes_when_their?= Message-ID: http://hg.python.org/cpython/rev/64053bd79590 changeset: 72641:64053bd79590 branch: 2.7 parent: 72636:aa3ebc2dfc15 user: Antoine Pitrou date: Tue Oct 04 09:34:48 2011 +0200 summary: Issue #7689: Allow pickling of dynamically created classes when their metaclass is registered with copyreg. Patch by Nicolas M. Thi?ry and Craig Citro. files: Lib/pickle.py | 18 +++++++++--------- Lib/test/pickletester.py | 21 +++++++++++++++++++++ Misc/ACKS | 2 ++ Misc/NEWS | 4 ++++ Modules/cPickle.c | 10 +++++----- 5 files changed, 41 insertions(+), 14 deletions(-) diff --git a/Lib/pickle.py b/Lib/pickle.py --- a/Lib/pickle.py +++ b/Lib/pickle.py @@ -286,20 +286,20 @@ f(self, obj) # Call unbound method with explicit self return - # Check for a class with a custom metaclass; treat as regular class - try: - issc = issubclass(t, TypeType) - except TypeError: # t is not a class (old Boost; see SF #502085) - issc = 0 - if issc: - self.save_global(obj) - return - # Check copy_reg.dispatch_table reduce = dispatch_table.get(t) if reduce: rv = reduce(obj) else: + # Check for a class with a custom metaclass; treat as regular class + try: + issc = issubclass(t, TypeType) + except TypeError: # t is not a class (old Boost; see SF #502085) + issc = 0 + if issc: + self.save_global(obj) + return + # Check for a __reduce_ex__ method, fall back to __reduce__ reduce = 
getattr(obj, "__reduce_ex__", None) if reduce: diff --git a/Lib/test/pickletester.py b/Lib/test/pickletester.py --- a/Lib/test/pickletester.py +++ b/Lib/test/pickletester.py @@ -124,6 +124,19 @@ class use_metaclass(object): __metaclass__ = metaclass +class pickling_metaclass(type): + def __eq__(self, other): + return (type(self) == type(other) and + self.reduce_args == other.reduce_args) + + def __reduce__(self): + return (create_dynamic_class, self.reduce_args) + +def create_dynamic_class(name, bases): + result = pickling_metaclass(name, bases, dict()) + result.reduce_args = (name, bases) + return result + # DATA0 .. DATA2 are the pickles we expect under the various protocols, for # the object returned by create_data(). @@ -609,6 +622,14 @@ b = self.loads(s) self.assertEqual(a.__class__, b.__class__) + def test_dynamic_class(self): + a = create_dynamic_class("my_dynamic_class", (object,)) + copy_reg.pickle(pickling_metaclass, pickling_metaclass.__reduce__) + for proto in protocols: + s = self.dumps(a, proto) + b = self.loads(s) + self.assertEqual(a, b) + def test_structseq(self): import time import os diff --git a/Misc/ACKS b/Misc/ACKS --- a/Misc/ACKS +++ b/Misc/ACKS @@ -147,6 +147,7 @@ Tom Christiansen Vadim Chugunov David Cinege +Craig Citro Mike Clarkson Andrew Clegg Brad Clements @@ -817,6 +818,7 @@ Mikhail Terekhov Richard M. Tew Tobias Thelen +Nicolas M. Thi?ry James Thomas Robin Thomas Stephen Thorne diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -50,6 +50,10 @@ Library ------- +- Issue #7689: Allow pickling of dynamically created classes when their + metaclass is registered with copy_reg. Patch by Nicolas M. Thi?ry and + Craig Citro. + - Issue #13058: ossaudiodev: fix a file descriptor leak on error. Patch by Thomas Jarosch. 
diff --git a/Modules/cPickle.c b/Modules/cPickle.c --- a/Modules/cPickle.c +++ b/Modules/cPickle.c @@ -2697,11 +2697,6 @@ } } - if (PyType_IsSubtype(type, &PyType_Type)) { - res = save_global(self, args, NULL); - goto finally; - } - /* Get a reduction callable, and call it. This may come from * copy_reg.dispatch_table, the object's __reduce_ex__ method, * or the object's __reduce__ method. @@ -2717,6 +2712,11 @@ } } else { + if (PyType_IsSubtype(type, &PyType_Type)) { + res = save_global(self, args, NULL); + goto finally; + } + /* Check for a __reduce_ex__ method. */ __reduce__ = PyObject_GetAttr(args, __reduce_ex___str); if (__reduce__ != NULL) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 10:32:22 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 10:32:22 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=283=2E2=29=3A_Start_fixing_te?= =?utf8?q?st=5Fbigmem=3A?= Message-ID: http://hg.python.org/cpython/rev/bf39434dd506 changeset: 72642:bf39434dd506 branch: 3.2 parent: 72639:760ac320fa3d user: Antoine Pitrou date: Tue Oct 04 10:22:36 2011 +0200 summary: Start fixing test_bigmem: - bigmemtest is replaced by precisionbigmemtest - add a poor man's watchdog thread to print memory consumption files: Lib/test/pickletester.py | 12 +- Lib/test/support.py | 92 +++++-- Lib/test/test_bigmem.py | 239 +++++++++++----------- Lib/test/test_hashlib.py | 6 +- Lib/test/test_xml_etree_c.py | 4 +- Lib/test/test_zlib.py | 14 +- 6 files changed, 203 insertions(+), 164 deletions(-) diff --git a/Lib/test/pickletester.py b/Lib/test/pickletester.py --- a/Lib/test/pickletester.py +++ b/Lib/test/pickletester.py @@ -8,7 +8,7 @@ from test.support import ( TestFailed, TESTFN, run_with_locale, - _2G, _4G, precisionbigmemtest, + _2G, _4G, bigmemtest, ) from pickle import bytes_types @@ -1159,7 +1159,7 @@ # Binary protocols can serialize longs of up to 2GB-1 - @precisionbigmemtest(size=_2G, memuse=1 + 1, dry_run=False) 
+ @bigmemtest(size=_2G, memuse=1 + 1, dry_run=False) def test_huge_long_32b(self, size): data = 1 << (8 * size) try: @@ -1175,7 +1175,7 @@ # (older protocols don't have a dedicated opcode for bytes and are # too inefficient) - @precisionbigmemtest(size=_2G, memuse=1 + 1, dry_run=False) + @bigmemtest(size=_2G, memuse=1 + 1, dry_run=False) def test_huge_bytes_32b(self, size): data = b"abcd" * (size // 4) try: @@ -1191,7 +1191,7 @@ finally: data = None - @precisionbigmemtest(size=_4G, memuse=1 + 1, dry_run=False) + @bigmemtest(size=_4G, memuse=1 + 1, dry_run=False) def test_huge_bytes_64b(self, size): data = b"a" * size try: @@ -1206,7 +1206,7 @@ # All protocols use 1-byte per printable ASCII character; we add another # byte because the encoded form has to be copied into the internal buffer. - @precisionbigmemtest(size=_2G, memuse=2 + character_size, dry_run=False) + @bigmemtest(size=_2G, memuse=2 + character_size, dry_run=False) def test_huge_str_32b(self, size): data = "abcd" * (size // 4) try: @@ -1223,7 +1223,7 @@ # BINUNICODE (protocols 1, 2 and 3) cannot carry more than # 2**32 - 1 bytes of utf-8 encoded unicode. - @precisionbigmemtest(size=_4G, memuse=1 + character_size, dry_run=False) + @bigmemtest(size=_4G, memuse=1 + character_size, dry_run=False) def test_huge_str_64b(self, size): data = "a" * size try: diff --git a/Lib/test/support.py b/Lib/test/support.py --- a/Lib/test/support.py +++ b/Lib/test/support.py @@ -1053,45 +1053,54 @@ raise ValueError('Memory limit %r too low to be useful' % (limit,)) max_memuse = memlimit -def bigmemtest(minsize, memuse): +def _memory_watchdog(start_evt, finish_evt, period=10.0): + """A function which periodically watches the process' memory consumption + and prints it out. + """ + # XXX: because of the GIL, and because the very long operations tested + # in most bigmem tests are uninterruptible, the loop below gets woken up + # much less often than expected. 
+ # The polling code should be rewritten in raw C, without holding the GIL, + # and push results onto an anonymous pipe. + try: + page_size = os.sysconf('SC_PAGESIZE') + except (ValueError, AttributeError): + try: + page_size = os.sysconf('SC_PAGE_SIZE') + except (ValueError, AttributeError): + page_size = 4096 + procfile = '/proc/{pid}/statm'.format(pid=os.getpid()) + try: + f = open(procfile, 'rb') + except IOError as e: + warnings.warn('/proc not available for stats: {}'.format(e), + RuntimeWarning) + sys.stderr.flush() + return + with f: + start_evt.set() + old_data = -1 + while not finish_evt.wait(period): + f.seek(0) + statm = f.read().decode('ascii') + data = int(statm.split()[5]) + if data != old_data: + old_data = data + print(" ... process data size: {data:.1f}G" + .format(data=data * page_size / (1024 ** 3))) + +def bigmemtest(size, memuse, dry_run=True): """Decorator for bigmem tests. 'minsize' is the minimum useful size for the test (in arbitrary, test-interpreted units.) 'memuse' is the number of 'bytes per size' for the test, or a good estimate of it. - The decorator tries to guess a good value for 'size' and passes it to - the decorated test function. If minsize * memuse is more than the - allowed memory use (as defined by max_memuse), the test is skipped. - Otherwise, minsize is adjusted upward to use up to max_memuse. + if 'dry_run' is False, it means the test doesn't support dummy runs + when -M is not specified. """ def decorator(f): def wrapper(self): - # Retrieve values in case someone decided to adjust them - minsize = wrapper.minsize - memuse = wrapper.memuse - if not max_memuse: - # If max_memuse is 0 (the default), - # we still want to run the tests with size set to a few kb, - # to make sure they work. We still want to avoid using - # too much memory, though, but we do that noisily. 
- maxsize = 5147 - self.assertFalse(maxsize * memuse > 20 * _1M) - else: - maxsize = int(max_memuse / memuse) - if maxsize < minsize: - raise unittest.SkipTest( - "not enough memory: %.1fG minimum needed" - % (minsize * memuse / (1024 ** 3))) - return f(self, maxsize) - wrapper.minsize = minsize - wrapper.memuse = memuse - return wrapper - return decorator - -def precisionbigmemtest(size, memuse, dry_run=True): - def decorator(f): - def wrapper(self): size = wrapper.size memuse = wrapper.memuse if not real_max_memuse: @@ -1105,7 +1114,28 @@ "not enough memory: %.1fG minimum needed" % (size * memuse / (1024 ** 3))) - return f(self, maxsize) + if real_max_memuse and verbose and threading: + print() + print(" ... expected peak memory use: {peak:.1f}G" + .format(peak=size * memuse / (1024 ** 3))) + sys.stdout.flush() + start_evt = threading.Event() + finish_evt = threading.Event() + t = threading.Thread(target=_memory_watchdog, + args=(start_evt, finish_evt, 0.5)) + t.daemon = True + t.start() + start_evt.set() + else: + t = None + + try: + return f(self, maxsize) + finally: + if t: + finish_evt.set() + t.join() + wrapper.size = size wrapper.memuse = memuse return wrapper diff --git a/Lib/test/test_bigmem.py b/Lib/test/test_bigmem.py --- a/Lib/test/test_bigmem.py +++ b/Lib/test/test_bigmem.py @@ -1,5 +1,5 @@ from test import support -from test.support import bigmemtest, _1G, _2G, _4G, precisionbigmemtest +from test.support import bigmemtest, _1G, _2G, _4G import unittest import operator @@ -25,10 +25,10 @@ # a large object, make the subobject of a length that is not a power of # 2. That way, int-wrapping problems are more easily detected. # -# - While the bigmemtest decorator speaks of 'minsize', all tests will -# actually be called with a much smaller number too, in the normal -# test run (5Kb currently.) This is so the tests themselves get frequent -# testing. 
Consequently, always make all large allocations based on the +# - Despite the bigmemtest decorator, all tests will actually be called +# with a much smaller number too, in the normal test run (5Kb currently.) +# This is so the tests themselves get frequent testing. +# Consequently, always make all large allocations based on the # passed-in 'size', and don't rely on the size being very large. Also, # memuse-per-size should remain sane (less than a few thousand); if your # test uses more, adjust 'size' upward, instead. @@ -42,7 +42,7 @@ class BaseStrTest: - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_capitalize(self, size): _ = self.from_latin1 SUBSTR = self.from_latin1(' abc def ghi') @@ -52,7 +52,7 @@ SUBSTR.capitalize()) self.assertEqual(caps.lstrip(_('-')), SUBSTR) - @bigmemtest(minsize=_2G + 10, memuse=1) + @bigmemtest(size=_2G + 10, memuse=1) def test_center(self, size): SUBSTR = self.from_latin1(' abc def ghi') s = SUBSTR.center(size) @@ -63,7 +63,7 @@ self.assertEqual(s[lpadsize:-rpadsize], SUBSTR) self.assertEqual(s.strip(), SUBSTR.strip()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_count(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -75,7 +75,7 @@ self.assertEqual(s.count(_('i')), 1) self.assertEqual(s.count(_('j')), 0) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_endswith(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -87,7 +87,7 @@ self.assertFalse(s.endswith(_('a') + SUBSTR)) self.assertFalse(SUBSTR.endswith(s)) - @bigmemtest(minsize=_2G + 10, memuse=2) + @bigmemtest(size=_2G + 10, memuse=2) def test_expandtabs(self, size): _ = self.from_latin1 s = _('-') * size @@ -100,7 +100,7 @@ self.assertEqual(len(s), size - remainder) self.assertEqual(len(s.strip(_(' '))), 0) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_find(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ 
-117,7 +117,7 @@ sublen + size + SUBSTR.find(_('i'))) self.assertEqual(s.find(_('j')), -1) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_index(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -134,7 +134,7 @@ sublen + size + SUBSTR.index(_('i'))) self.assertRaises(ValueError, s.index, _('j')) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_isalnum(self, size): _ = self.from_latin1 SUBSTR = _('123456') @@ -143,7 +143,7 @@ s += _('.') self.assertFalse(s.isalnum()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_isalpha(self, size): _ = self.from_latin1 SUBSTR = _('zzzzzzz') @@ -152,7 +152,7 @@ s += _('.') self.assertFalse(s.isalpha()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_isdigit(self, size): _ = self.from_latin1 SUBSTR = _('123456') @@ -161,7 +161,7 @@ s += _('z') self.assertFalse(s.isdigit()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_islower(self, size): _ = self.from_latin1 chars = _(''.join( @@ -172,7 +172,7 @@ s += _('A') self.assertFalse(s.islower()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_isspace(self, size): _ = self.from_latin1 whitespace = _(' \f\n\r\t\v') @@ -182,7 +182,7 @@ s += _('j') self.assertFalse(s.isspace()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_istitle(self, size): _ = self.from_latin1 SUBSTR = _('123456') @@ -193,7 +193,7 @@ s += _('aA') self.assertFalse(s.istitle()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_isupper(self, size): _ = self.from_latin1 chars = _(''.join( @@ -204,7 +204,7 @@ s += _('a') self.assertFalse(s.isupper()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_join(self, size): _ = self.from_latin1 s = _('A') * size @@ -214,7 +214,7 @@ self.assertTrue(x.startswith(_('aaaaaA'))) 
self.assertTrue(x.endswith(_('Abbbbb'))) - @bigmemtest(minsize=_2G + 10, memuse=1) + @bigmemtest(size=_2G + 10, memuse=1) def test_ljust(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -223,7 +223,7 @@ self.assertEqual(len(s), size) self.assertEqual(s.strip(), SUBSTR.strip()) - @bigmemtest(minsize=_2G + 10, memuse=2) + @bigmemtest(size=_2G + 10, memuse=2) def test_lower(self, size): _ = self.from_latin1 s = _('A') * size @@ -231,7 +231,7 @@ self.assertEqual(len(s), size) self.assertEqual(s.count(_('a')), size) - @bigmemtest(minsize=_2G + 10, memuse=1) + @bigmemtest(size=_2G + 10, memuse=1) def test_lstrip(self, size): _ = self.from_latin1 SUBSTR = _('abc def ghi') @@ -246,7 +246,7 @@ stripped = s.lstrip() self.assertTrue(stripped is s) - @bigmemtest(minsize=_2G + 10, memuse=2) + @bigmemtest(size=_2G + 10, memuse=2) def test_replace(self, size): _ = self.from_latin1 replacement = _('a') @@ -259,7 +259,7 @@ self.assertEqual(s.count(replacement), 4) self.assertEqual(s[-10:], _(' aaaa')) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_rfind(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -275,7 +275,7 @@ SUBSTR.rfind(_('i'))) self.assertEqual(s.rfind(_('j')), -1) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_rindex(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -294,7 +294,7 @@ SUBSTR.rindex(_('i'))) self.assertRaises(ValueError, s.rindex, _('j')) - @bigmemtest(minsize=_2G + 10, memuse=1) + @bigmemtest(size=_2G + 10, memuse=1) def test_rjust(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -303,7 +303,7 @@ self.assertEqual(len(s), size) self.assertEqual(s.strip(), SUBSTR.strip()) - @bigmemtest(minsize=_2G + 10, memuse=1) + @bigmemtest(size=_2G + 10, memuse=1) def test_rstrip(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -321,7 +321,7 @@ # The test takes about size bytes to build a string, and then about # sqrt(size) 
substrings of sqrt(size) in size and a list to # hold sqrt(size) items. It's close but just over 2x size. - @bigmemtest(minsize=_2G, memuse=2.1) + @bigmemtest(size=_2G, memuse=2.1) def test_split_small(self, size): _ = self.from_latin1 # Crudely calculate an estimate so that the result of s.split won't @@ -347,7 +347,7 @@ # suffer for the list size. (Otherwise, it'd cost another 48 times # size in bytes!) Nevertheless, a list of size takes # 8*size bytes. - @bigmemtest(minsize=_2G + 5, memuse=10) + @bigmemtest(size=_2G + 5, memuse=10) def test_split_large(self, size): _ = self.from_latin1 s = _(' a') * size + _(' ') @@ -359,7 +359,7 @@ self.assertEqual(len(l), size + 1) self.assertEqual(set(l), set([_(' ')])) - @bigmemtest(minsize=_2G, memuse=2.1) + @bigmemtest(size=_2G, memuse=2.1) def test_splitlines(self, size): _ = self.from_latin1 # Crudely calculate an estimate so that the result of s.split won't @@ -373,7 +373,7 @@ for item in l: self.assertEqual(item, expected) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_startswith(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -382,7 +382,7 @@ self.assertTrue(s.startswith(_('-') * size)) self.assertFalse(s.startswith(SUBSTR)) - @bigmemtest(minsize=_2G, memuse=1) + @bigmemtest(size=_2G, memuse=1) def test_strip(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi ') @@ -394,7 +394,7 @@ self.assertEqual(len(s), size) self.assertEqual(s.strip(), SUBSTR.strip()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_swapcase(self, size): _ = self.from_latin1 SUBSTR = _("aBcDeFG12.'\xa9\x00") @@ -406,7 +406,7 @@ self.assertEqual(s[:sublen * 3], SUBSTR.swapcase() * 3) self.assertEqual(s[-sublen * 3:], SUBSTR.swapcase() * 3) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_title(self, size): _ = self.from_latin1 SUBSTR = _('SpaaHAaaAaham') @@ -415,7 +415,7 @@ self.assertTrue(s.startswith((SUBSTR * 3).title())) 
self.assertTrue(s.endswith(SUBSTR.lower() * 3)) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_translate(self, size): _ = self.from_latin1 SUBSTR = _('aZz.z.Aaz.') @@ -438,7 +438,7 @@ self.assertEqual(s.count(_('!')), repeats * 2) self.assertEqual(s.count(_('z')), repeats * 3) - @bigmemtest(minsize=_2G + 5, memuse=2) + @bigmemtest(size=_2G + 5, memuse=2) def test_upper(self, size): _ = self.from_latin1 s = _('a') * size @@ -446,7 +446,7 @@ self.assertEqual(len(s), size) self.assertEqual(s.count(_('A')), size) - @bigmemtest(minsize=_2G + 20, memuse=1) + @bigmemtest(size=_2G + 20, memuse=1) def test_zfill(self, size): _ = self.from_latin1 SUBSTR = _('-568324723598234') @@ -458,7 +458,7 @@ # This test is meaningful even with size < 2G, as long as the # doubled string is > 2G (but it tests more if both are > 2G :) - @bigmemtest(minsize=_1G + 2, memuse=3) + @bigmemtest(size=_1G + 2, memuse=3) def test_concat(self, size): _ = self.from_latin1 s = _('.') * size @@ -469,7 +469,7 @@ # This test is meaningful even with size < 2G, as long as the # repeated string is > 2G (but it tests more if both are > 2G :) - @bigmemtest(minsize=_1G + 2, memuse=3) + @bigmemtest(size=_1G + 2, memuse=3) def test_repeat(self, size): _ = self.from_latin1 s = _('.') * size @@ -478,7 +478,7 @@ self.assertEqual(len(s), size * 2) self.assertEqual(s.count(_('.')), size * 2) - @bigmemtest(minsize=_2G + 20, memuse=2) + @bigmemtest(size=_2G + 20, memuse=2) def test_slice_and_getitem(self, size): _ = self.from_latin1 SUBSTR = _('0123456789') @@ -512,7 +512,7 @@ self.assertRaises(IndexError, operator.getitem, s, len(s) + 1) self.assertRaises(IndexError, operator.getitem, s, len(s) + 1<<31) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_contains(self, size): _ = self.from_latin1 SUBSTR = _('0123456789') @@ -526,7 +526,7 @@ s += _('a') self.assertIn(_('a'), s) - @bigmemtest(minsize=_2G + 10, memuse=2) + @bigmemtest(size=_2G + 10, memuse=2) 
def test_compare(self, size): _ = self.from_latin1 s1 = _('-') * size @@ -539,7 +539,7 @@ s2 = _('.') * size self.assertFalse(s1 == s2) - @bigmemtest(minsize=_2G + 10, memuse=1) + @bigmemtest(size=_2G + 10, memuse=1) def test_hash(self, size): # Not sure if we can do any meaningful tests here... Even if we # start relying on the exact algorithm used, the result will be @@ -590,46 +590,36 @@ getattr(type(self), name).memuse = memuse # the utf8 encoder preallocates big time (4x the number of characters) - @bigmemtest(minsize=_2G + 2, memuse=character_size + 4) + @bigmemtest(size=_2G + 2, memuse=character_size + 4) def test_encode(self, size): return self.basic_encode_test(size, 'utf-8') - @precisionbigmemtest(size=_4G // 6 + 2, memuse=character_size + 1) + @bigmemtest(size=_4G // 6 + 2, memuse=character_size + 1) def test_encode_raw_unicode_escape(self, size): try: return self.basic_encode_test(size, 'raw_unicode_escape') except MemoryError: pass # acceptable on 32-bit - @precisionbigmemtest(size=_4G // 5 + 70, memuse=character_size + 1) + @bigmemtest(size=_4G // 5 + 70, memuse=character_size + 1) def test_encode_utf7(self, size): try: return self.basic_encode_test(size, 'utf7') except MemoryError: pass # acceptable on 32-bit - @precisionbigmemtest(size=_4G // 4 + 5, memuse=character_size + 4) + @bigmemtest(size=_4G // 4 + 5, memuse=character_size + 4) def test_encode_utf32(self, size): try: return self.basic_encode_test(size, 'utf32', expectedsize=4*size+4) except MemoryError: pass # acceptable on 32-bit - @precisionbigmemtest(size=_2G - 1, memuse=character_size + 1) + @bigmemtest(size=_2G - 1, memuse=character_size + 1) def test_encode_ascii(self, size): return self.basic_encode_test(size, 'ascii', c='A') - @precisionbigmemtest(size=_4G // 5, memuse=character_size * (6 + 1)) - def test_unicode_repr_overflow(self, size): - try: - s = "\uDCBA"*size - r = repr(s) - except MemoryError: - pass # acceptable on 32-bit - else: - self.assertTrue(s == eval(r)) - - 
@bigmemtest(minsize=_2G + 10, memuse=character_size * 2) + @bigmemtest(size=_2G + 10, memuse=character_size * 2) def test_format(self, size): s = '-' * size sf = '%s' % (s,) @@ -650,7 +640,7 @@ self.assertEqual(s.count('.'), 3) self.assertEqual(s.count('-'), size * 2) - @bigmemtest(minsize=_2G + 10, memuse=character_size * 2) + @bigmemtest(size=_2G + 10, memuse=character_size * 2) def test_repr_small(self, size): s = '-' * size s = repr(s) @@ -671,7 +661,7 @@ self.assertEqual(s.count('\\'), size) self.assertEqual(s.count('0'), size * 2) - @bigmemtest(minsize=_2G + 10, memuse=character_size * 5) + @bigmemtest(size=_2G + 10, memuse=character_size * 5) def test_repr_large(self, size): s = '\x00' * size s = repr(s) @@ -681,27 +671,46 @@ self.assertEqual(s.count('\\'), size) self.assertEqual(s.count('0'), size * 2) - @bigmemtest(minsize=2**32 / 5, memuse=character_size * 7) + @bigmemtest(size=_2G // 5 + 1, memuse=character_size * 7) def test_unicode_repr(self, size): # Use an assigned, but not printable code point. # It is in the range of the low surrogates \uDC00-\uDFFF. - s = "\uDCBA" * size - for f in (repr, ascii): - r = f(s) - self.assertTrue(len(r) > size) - self.assertTrue(r.endswith(r"\udcba'"), r[-10:]) - del r + char = "\uDCBA" + s = char * size + try: + for f in (repr, ascii): + r = f(s) + self.assertEqual(len(r), 2 + (len(f(char)) - 2) * size) + self.assertTrue(r.endswith(r"\udcba'"), r[-10:]) + r = None + finally: + r = s = None # The character takes 4 bytes even in UCS-2 builds because it will # be decomposed into surrogates. 
- @bigmemtest(minsize=2**32 / 5, memuse=4 + character_size * 9) + @bigmemtest(size=_2G // 5 + 1, memuse=4 + character_size * 9) def test_unicode_repr_wide(self, size): - s = "\U0001DCBA" * size - for f in (repr, ascii): - r = f(s) - self.assertTrue(len(r) > size) - self.assertTrue(r.endswith(r"\U0001dcba'"), r[-12:]) - del r + char = "\U0001DCBA" + s = char * size + try: + for f in (repr, ascii): + r = f(s) + self.assertEqual(len(r), 2 + (len(f(char)) - 2) * size) + self.assertTrue(r.endswith(r"\U0001dcba'"), r[-12:]) + r = None + finally: + r = s = None + + @bigmemtest(size=_4G // 5, memuse=character_size * (6 + 1)) + def _test_unicode_repr_overflow(self, size): + # XXX not sure what this test is about + char = "\uDCBA" + s = char * size + try: + r = repr(s) + self.assertTrue(s == eval(r)) + finally: + r = s = None class BytesTest(unittest.TestCase, BaseStrTest): @@ -709,7 +718,7 @@ def from_latin1(self, s): return s.encode("latin1") - @bigmemtest(minsize=_2G + 2, memuse=1 + character_size) + @bigmemtest(size=_2G + 2, memuse=1 + character_size) def test_decode(self, size): s = self.from_latin1('.') * size self.assertEqual(len(s.decode('utf-8')), size) @@ -720,7 +729,7 @@ def from_latin1(self, s): return bytearray(s.encode("latin1")) - @bigmemtest(minsize=_2G + 2, memuse=1 + character_size) + @bigmemtest(size=_2G + 2, memuse=1 + character_size) def test_decode(self, size): s = self.from_latin1('.') * size self.assertEqual(len(s.decode('utf-8')), size) @@ -739,7 +748,7 @@ # having more than 2<<31 references to any given object. Hence the # use of different types of objects as contents in different tests. 
- @bigmemtest(minsize=_2G + 2, memuse=16) + @bigmemtest(size=_2G + 2, memuse=16) def test_compare(self, size): t1 = ('',) * size t2 = ('',) * size @@ -762,15 +771,15 @@ t = t + t self.assertEqual(len(t), size * 2) - @bigmemtest(minsize=_2G // 2 + 2, memuse=24) + @bigmemtest(size=_2G // 2 + 2, memuse=24) def test_concat_small(self, size): return self.basic_concat_test(size) - @bigmemtest(minsize=_2G + 2, memuse=24) + @bigmemtest(size=_2G + 2, memuse=24) def test_concat_large(self, size): return self.basic_concat_test(size) - @bigmemtest(minsize=_2G // 5 + 10, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 10, memuse=8 * 5) def test_contains(self, size): t = (1, 2, 3, 4, 5) * size self.assertEqual(len(t), size * 5) @@ -778,7 +787,7 @@ self.assertNotIn((1, 2, 3, 4, 5), t) self.assertNotIn(0, t) - @bigmemtest(minsize=_2G + 10, memuse=8) + @bigmemtest(size=_2G + 10, memuse=8) def test_hash(self, size): t1 = (0,) * size h1 = hash(t1) @@ -786,7 +795,7 @@ t2 = (0,) * (size + 1) self.assertFalse(h1 == hash(t2)) - @bigmemtest(minsize=_2G + 10, memuse=8) + @bigmemtest(size=_2G + 10, memuse=8) def test_index_and_slice(self, size): t = (None,) * size self.assertEqual(len(t), size) @@ -811,19 +820,19 @@ t = t * 2 self.assertEqual(len(t), size * 2) - @bigmemtest(minsize=_2G // 2 + 2, memuse=24) + @bigmemtest(size=_2G // 2 + 2, memuse=24) def test_repeat_small(self, size): return self.basic_test_repeat(size) - @bigmemtest(minsize=_2G + 2, memuse=24) + @bigmemtest(size=_2G + 2, memuse=24) def test_repeat_large(self, size): return self.basic_test_repeat(size) - @bigmemtest(minsize=_1G - 1, memuse=12) + @bigmemtest(size=_1G - 1, memuse=12) def test_repeat_large_2(self, size): return self.basic_test_repeat(size) - @precisionbigmemtest(size=_1G - 1, memuse=9) + @bigmemtest(size=_1G - 1, memuse=9) def test_from_2G_generator(self, size): self.skipTest("test needs much more memory than advertised, see issue5438") try: @@ -837,7 +846,7 @@ count += 1 self.assertEqual(count, size) - 
@precisionbigmemtest(size=_1G - 25, memuse=9) + @bigmemtest(size=_1G - 25, memuse=9) def test_from_almost_2G_generator(self, size): self.skipTest("test needs much more memory than advertised, see issue5438") try: @@ -860,11 +869,11 @@ self.assertEqual(s[-5:], '0, 0)') self.assertEqual(s.count('0'), size) - @bigmemtest(minsize=_2G // 3 + 2, memuse=8 + 3 * character_size) + @bigmemtest(size=_2G // 3 + 2, memuse=8 + 3 * character_size) def test_repr_small(self, size): return self.basic_test_repr(size) - @bigmemtest(minsize=_2G + 2, memuse=8 + 3 * character_size) + @bigmemtest(size=_2G + 2, memuse=8 + 3 * character_size) def test_repr_large(self, size): return self.basic_test_repr(size) @@ -875,7 +884,7 @@ # lists hold references to various objects to test their refcount # limits. - @bigmemtest(minsize=_2G + 2, memuse=16) + @bigmemtest(size=_2G + 2, memuse=16) def test_compare(self, size): l1 = [''] * size l2 = [''] * size @@ -898,11 +907,11 @@ l = l + l self.assertEqual(len(l), size * 2) - @bigmemtest(minsize=_2G // 2 + 2, memuse=24) + @bigmemtest(size=_2G // 2 + 2, memuse=24) def test_concat_small(self, size): return self.basic_test_concat(size) - @bigmemtest(minsize=_2G + 2, memuse=24) + @bigmemtest(size=_2G + 2, memuse=24) def test_concat_large(self, size): return self.basic_test_concat(size) @@ -913,15 +922,15 @@ self.assertTrue(l[0] is l[-1]) self.assertTrue(l[size - 1] is l[size + 1]) - @bigmemtest(minsize=_2G // 2 + 2, memuse=24) + @bigmemtest(size=_2G // 2 + 2, memuse=24) def test_inplace_concat_small(self, size): return self.basic_test_inplace_concat(size) - @bigmemtest(minsize=_2G + 2, memuse=24) + @bigmemtest(size=_2G + 2, memuse=24) def test_inplace_concat_large(self, size): return self.basic_test_inplace_concat(size) - @bigmemtest(minsize=_2G // 5 + 10, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 10, memuse=8 * 5) def test_contains(self, size): l = [1, 2, 3, 4, 5] * size self.assertEqual(len(l), size * 5) @@ -929,12 +938,12 @@ self.assertNotIn([1, 2, 3, 
4, 5], l) self.assertNotIn(0, l) - @bigmemtest(minsize=_2G + 10, memuse=8) + @bigmemtest(size=_2G + 10, memuse=8) def test_hash(self, size): l = [0] * size self.assertRaises(TypeError, hash, l) - @bigmemtest(minsize=_2G + 10, memuse=8) + @bigmemtest(size=_2G + 10, memuse=8) def test_index_and_slice(self, size): l = [None] * size self.assertEqual(len(l), size) @@ -998,11 +1007,11 @@ l = l * 2 self.assertEqual(len(l), size * 2) - @bigmemtest(minsize=_2G // 2 + 2, memuse=24) + @bigmemtest(size=_2G // 2 + 2, memuse=24) def test_repeat_small(self, size): return self.basic_test_repeat(size) - @bigmemtest(minsize=_2G + 2, memuse=24) + @bigmemtest(size=_2G + 2, memuse=24) def test_repeat_large(self, size): return self.basic_test_repeat(size) @@ -1018,11 +1027,11 @@ self.assertEqual(len(l), size * 2) self.assertTrue(l[size - 1] is l[-1]) - @bigmemtest(minsize=_2G // 2 + 2, memuse=16) + @bigmemtest(size=_2G // 2 + 2, memuse=16) def test_inplace_repeat_small(self, size): return self.basic_test_inplace_repeat(size) - @bigmemtest(minsize=_2G + 2, memuse=16) + @bigmemtest(size=_2G + 2, memuse=16) def test_inplace_repeat_large(self, size): return self.basic_test_inplace_repeat(size) @@ -1035,17 +1044,17 @@ self.assertEqual(s[-5:], '0, 0]') self.assertEqual(s.count('0'), size) - @bigmemtest(minsize=_2G // 3 + 2, memuse=8 + 3 * character_size) + @bigmemtest(size=_2G // 3 + 2, memuse=8 + 3 * character_size) def test_repr_small(self, size): return self.basic_test_repr(size) - @bigmemtest(minsize=_2G + 2, memuse=8 + 3 * character_size) + @bigmemtest(size=_2G + 2, memuse=8 + 3 * character_size) def test_repr_large(self, size): return self.basic_test_repr(size) # list overallocates ~1/8th of the total size (on first expansion) so # the single list.append call puts memuse at 9 bytes per size. 
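The overallocation described in the comment above is easy to observe directly in CPython: a list built by repetition is allocated exactly to size, so the first `append` forces a resize that reserves extra slots (roughly an eighth of the new size). A small CPython-specific sketch:

```python
import sys

# A list built by repetition is allocated exactly to its length...
l = [None] * 1000
before = sys.getsizeof(l)

# ...so the first append triggers a resize, which overallocates
# (about size/8 extra slots in CPython) to amortize future appends.
l.append(None)
after = sys.getsizeof(l)

# The growth is far more than the one appended pointer slot, which is
# why the test charges 9 bytes per item instead of 8.
assert after - before > sys.getsizeof(None) // sys.getsizeof(None)
```

The exact byte counts are an implementation detail of CPython's list resizing and may vary between versions and pointer widths.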
- @bigmemtest(minsize=_2G, memuse=9) + @bigmemtest(size=_2G, memuse=9) def test_append(self, size): l = [object()] * size l.append(object()) @@ -1053,7 +1062,7 @@ self.assertTrue(l[-3] is l[-2]) self.assertFalse(l[-2] is l[-1]) - @bigmemtest(minsize=_2G // 5 + 2, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 2, memuse=8 * 5) def test_count(self, size): l = [1, 2, 3, 4, 5] * size self.assertEqual(l.count(1), size) @@ -1066,15 +1075,15 @@ self.assertTrue(l[0] is l[-1]) self.assertTrue(l[size - 1] is l[size + 1]) - @bigmemtest(minsize=_2G // 2 + 2, memuse=16) + @bigmemtest(size=_2G // 2 + 2, memuse=16) def test_extend_small(self, size): return self.basic_test_extend(size) - @bigmemtest(minsize=_2G + 2, memuse=16) + @bigmemtest(size=_2G + 2, memuse=16) def test_extend_large(self, size): return self.basic_test_extend(size) - @bigmemtest(minsize=_2G // 5 + 2, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 2, memuse=8 * 5) def test_index(self, size): l = [1, 2, 3, 4, 5] * size size *= 5 @@ -1085,7 +1094,7 @@ self.assertRaises(ValueError, l.index, 6) # This tests suffers from overallocation, just like test_append. 
- @bigmemtest(minsize=_2G + 10, memuse=9) + @bigmemtest(size=_2G + 10, memuse=9) def test_insert(self, size): l = [1.0] * size l.insert(size - 1, "A") @@ -1104,7 +1113,7 @@ self.assertEqual(l[:3], [1.0, "C", 1.0]) self.assertEqual(l[size - 3:], ["A", 1.0, "B"]) - @bigmemtest(minsize=_2G // 5 + 4, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 4, memuse=8 * 5) def test_pop(self, size): l = ["a", "b", "c", "d", "e"] * size size *= 5 @@ -1128,7 +1137,7 @@ self.assertEqual(item, "c") self.assertEqual(l[-2:], ["b", "d"]) - @bigmemtest(minsize=_2G + 10, memuse=8) + @bigmemtest(size=_2G + 10, memuse=8) def test_remove(self, size): l = [10] * size self.assertEqual(len(l), size) @@ -1148,7 +1157,7 @@ self.assertEqual(len(l), size) self.assertEqual(l[-2:], [10, 10]) - @bigmemtest(minsize=_2G // 5 + 2, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 2, memuse=8 * 5) def test_reverse(self, size): l = [1, 2, 3, 4, 5] * size l.reverse() @@ -1156,7 +1165,7 @@ self.assertEqual(l[-5:], [5, 4, 3, 2, 1]) self.assertEqual(l[:5], [5, 4, 3, 2, 1]) - @bigmemtest(minsize=_2G // 5 + 2, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 2, memuse=8 * 5) def test_sort(self, size): l = [1, 2, 3, 4, 5] * size l.sort() diff --git a/Lib/test/test_hashlib.py b/Lib/test/test_hashlib.py --- a/Lib/test/test_hashlib.py +++ b/Lib/test/test_hashlib.py @@ -17,7 +17,7 @@ import unittest import warnings from test import support -from test.support import _4G, precisionbigmemtest +from test.support import _4G, bigmemtest # Were we compiled --with-pydebug or with #define Py_DEBUG? 
COMPILED_WITH_PYDEBUG = hasattr(sys, 'gettotalrefcount') @@ -196,7 +196,7 @@ b'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789', 'd174ab98d277d9f5a5611c2c9f419d9f') - @precisionbigmemtest(size=_4G + 5, memuse=1) + @bigmemtest(size=_4G + 5, memuse=1) def test_case_md5_huge(self, size): if size == _4G + 5: try: @@ -204,7 +204,7 @@ except OverflowError: pass # 32-bit arch - @precisionbigmemtest(size=_4G - 1, memuse=1) + @bigmemtest(size=_4G - 1, memuse=1) def test_case_md5_uintmax(self, size): if size == _4G - 1: try: diff --git a/Lib/test/test_xml_etree_c.py b/Lib/test/test_xml_etree_c.py --- a/Lib/test/test_xml_etree_c.py +++ b/Lib/test/test_xml_etree_c.py @@ -1,7 +1,7 @@ # xml.etree test for cElementTree from test import support -from test.support import precisionbigmemtest, _2G +from test.support import bigmemtest, _2G import unittest cET = support.import_module('xml.etree.cElementTree') @@ -35,7 +35,7 @@ class MiscTests(unittest.TestCase): # Issue #8651. - @support.precisionbigmemtest(size=support._2G + 100, memuse=1) + @support.bigmemtest(size=support._2G + 100, memuse=1) def test_length_overflow(self, size): if size < support._2G + 100: self.skipTest("not enough free memory, need at least 2 GB") diff --git a/Lib/test/test_zlib.py b/Lib/test/test_zlib.py --- a/Lib/test/test_zlib.py +++ b/Lib/test/test_zlib.py @@ -3,7 +3,7 @@ import binascii import random import sys -from test.support import precisionbigmemtest, _1G, _4G +from test.support import bigmemtest, _1G, _4G zlib = support.import_module('zlib') @@ -177,16 +177,16 @@ # Memory use of the following functions takes into account overallocation - @precisionbigmemtest(size=_1G + 1024 * 1024, memuse=3) + @bigmemtest(size=_1G + 1024 * 1024, memuse=3) def test_big_compress_buffer(self, size): compress = lambda s: zlib.compress(s, 1) self.check_big_compress_buffer(size, compress) - @precisionbigmemtest(size=_1G + 1024 * 1024, memuse=2) + @bigmemtest(size=_1G + 1024 * 1024, memuse=2) def 
test_big_decompress_buffer(self, size): self.check_big_decompress_buffer(size, zlib.decompress) - @precisionbigmemtest(size=_4G + 100, memuse=1) + @bigmemtest(size=_4G + 100, memuse=1) def test_length_overflow(self, size): if size < _4G + 100: self.skipTest("not enough free memory, need at least 4 GB") @@ -511,19 +511,19 @@ # Memory use of the following functions takes into account overallocation - @precisionbigmemtest(size=_1G + 1024 * 1024, memuse=3) + @bigmemtest(size=_1G + 1024 * 1024, memuse=3) def test_big_compress_buffer(self, size): c = zlib.compressobj(1) compress = lambda s: c.compress(s) + c.flush() self.check_big_compress_buffer(size, compress) - @precisionbigmemtest(size=_1G + 1024 * 1024, memuse=2) + @bigmemtest(size=_1G + 1024 * 1024, memuse=2) def test_big_decompress_buffer(self, size): d = zlib.decompressobj() decompress = lambda s: d.decompress(s) + d.flush() self.check_big_decompress_buffer(size, decompress) - @precisionbigmemtest(size=_4G + 100, memuse=1) + @bigmemtest(size=_4G + 100, memuse=1) def test_length_overflow(self, size): if size < _4G + 100: self.skipTest("not enough free memory, need at least 4 GB") -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 10:32:22 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 10:32:22 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Start_fixing_test=5Fbigmem=3A?= Message-ID: http://hg.python.org/cpython/rev/dac5dd1911b4 changeset: 72643:dac5dd1911b4 parent: 72640:46c026a5ccb9 parent: 72642:bf39434dd506 user: Antoine Pitrou date: Tue Oct 04 10:28:37 2011 +0200 summary: Start fixing test_bigmem: - bigmemtest is replaced by precisionbigmemtest - add a poor man's watchdog thread to print memory consumption files: Lib/test/pickletester.py | 12 +- Lib/test/support.py | 97 +++++--- Lib/test/test_bigmem.py | 241 +++++++++++----------- Lib/test/test_bz2.py | 6 +- Lib/test/test_hashlib.py | 6 
+- Lib/test/test_xml_etree_c.py | 4 +- Lib/test/test_zlib.py | 14 +- 7 files changed, 207 insertions(+), 173 deletions(-) diff --git a/Lib/test/pickletester.py b/Lib/test/pickletester.py --- a/Lib/test/pickletester.py +++ b/Lib/test/pickletester.py @@ -9,7 +9,7 @@ from test.support import ( TestFailed, TESTFN, run_with_locale, no_tracing, - _2G, _4G, precisionbigmemtest, + _2G, _4G, bigmemtest, ) from pickle import bytes_types @@ -1188,7 +1188,7 @@ # Binary protocols can serialize longs of up to 2GB-1 - @precisionbigmemtest(size=_2G, memuse=1 + 1, dry_run=False) + @bigmemtest(size=_2G, memuse=1 + 1, dry_run=False) def test_huge_long_32b(self, size): data = 1 << (8 * size) try: @@ -1204,7 +1204,7 @@ # (older protocols don't have a dedicated opcode for bytes and are # too inefficient) - @precisionbigmemtest(size=_2G, memuse=1 + 1, dry_run=False) + @bigmemtest(size=_2G, memuse=1 + 1, dry_run=False) def test_huge_bytes_32b(self, size): data = b"abcd" * (size // 4) try: @@ -1220,7 +1220,7 @@ finally: data = None - @precisionbigmemtest(size=_4G, memuse=1 + 1, dry_run=False) + @bigmemtest(size=_4G, memuse=1 + 1, dry_run=False) def test_huge_bytes_64b(self, size): data = b"a" * size try: @@ -1235,7 +1235,7 @@ # All protocols use 1-byte per printable ASCII character; we add another # byte because the encoded form has to be copied into the internal buffer. - @precisionbigmemtest(size=_2G, memuse=2 + character_size, dry_run=False) + @bigmemtest(size=_2G, memuse=2 + character_size, dry_run=False) def test_huge_str_32b(self, size): data = "abcd" * (size // 4) try: @@ -1252,7 +1252,7 @@ # BINUNICODE (protocols 1, 2 and 3) cannot carry more than # 2**32 - 1 bytes of utf-8 encoded unicode. 
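The 2**32 - 1 limit mentioned in the comment above comes from the BINUNICODE wire format: protocols 1 through 3 serialize a `str` as the `'X'` opcode followed by a 4-byte little-endian count of the UTF-8 bytes, so the length field simply cannot express more. This can be inspected with the standard `pickletools` module:

```python
import pickle
import pickletools

data = pickle.dumps("abcd", protocol=3)

# BINUNICODE is the 'X' opcode: one opcode byte, then a 4-byte
# little-endian length, then that many bytes of UTF-8 text.
# A 4-byte length field caps the payload at 2**32 - 1 bytes.
assert b"X\x04\x00\x00\x00abcd" in data

# pickletools can confirm the opcode by name.
opcodes = [op.name for op, arg, pos in pickletools.genops(data)]
assert "BINUNICODE" in opcodes
```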
- @precisionbigmemtest(size=_4G, memuse=1 + character_size, dry_run=False) + @bigmemtest(size=_4G, memuse=1 + character_size, dry_run=False) def test_huge_str_64b(self, size): data = "a" * size try: diff --git a/Lib/test/support.py b/Lib/test/support.py --- a/Lib/test/support.py +++ b/Lib/test/support.py @@ -1133,47 +1133,51 @@ raise ValueError('Memory limit %r too low to be useful' % (limit,)) max_memuse = memlimit -def bigmemtest(minsize, memuse): +def _memory_watchdog(start_evt, finish_evt, period=10.0): + """A function which periodically watches the process' memory consumption + and prints it out. + """ + # XXX: because of the GIL, and because the very long operations tested + # in most bigmem tests are uninterruptible, the loop below gets woken up + # much less often than expected. + # The polling code should be rewritten in raw C, without holding the GIL, + # and push results onto an anonymous pipe. + try: + page_size = os.sysconf('SC_PAGESIZE') + except (ValueError, AttributeError): + try: + page_size = os.sysconf('SC_PAGE_SIZE') + except (ValueError, AttributeError): + page_size = 4096 + procfile = '/proc/{pid}/statm'.format(pid=os.getpid()) + try: + f = open(procfile, 'rb') + except IOError as e: + warnings.warn('/proc not available for stats: {}'.format(e), + RuntimeWarning) + sys.stderr.flush() + return + with f: + start_evt.set() + old_data = -1 + while not finish_evt.wait(period): + f.seek(0) + statm = f.read().decode('ascii') + data = int(statm.split()[5]) + if data != old_data: + old_data = data + print(" ... process data size: {data:.1f}G" + .format(data=data * page_size / (1024 ** 3))) + +def bigmemtest(size, memuse, dry_run=True): """Decorator for bigmem tests. 'minsize' is the minimum useful size for the test (in arbitrary, test-interpreted units.) 'memuse' is the number of 'bytes per size' for the test, or a good estimate of it. - The decorator tries to guess a good value for 'size' and passes it to - the decorated test function. 
If minsize * memuse is more than the - allowed memory use (as defined by max_memuse), the test is skipped. - Otherwise, minsize is adjusted upward to use up to max_memuse. - """ - def decorator(f): - def wrapper(self): - # Retrieve values in case someone decided to adjust them - minsize = wrapper.minsize - memuse = wrapper.memuse - if not max_memuse: - # If max_memuse is 0 (the default), - # we still want to run the tests with size set to a few kb, - # to make sure they work. We still want to avoid using - # too much memory, though, but we do that noisily. - maxsize = 5147 - self.assertFalse(maxsize * memuse > 20 * _1M) - else: - maxsize = int(max_memuse / memuse) - if maxsize < minsize: - raise unittest.SkipTest( - "not enough memory: %.1fG minimum needed" - % (minsize * memuse / (1024 ** 3))) - return f(self, maxsize) - wrapper.minsize = minsize - wrapper.memuse = memuse - return wrapper - return decorator - -def precisionbigmemtest(size, memuse, dry_run=True): - """Decorator for bigmem tests that need exact sizes. - - Like bigmemtest, but without the size scaling upward to fill available - memory. + if 'dry_run' is False, it means the test doesn't support dummy runs + when -M is not specified. """ def decorator(f): def wrapper(self): @@ -1190,7 +1194,28 @@ "not enough memory: %.1fG minimum needed" % (size * memuse / (1024 ** 3))) - return f(self, maxsize) + if real_max_memuse and verbose and threading: + print() + print(" ... 
expected peak memory use: {peak:.1f}G" + .format(peak=size * memuse / (1024 ** 3))) + sys.stdout.flush() + start_evt = threading.Event() + finish_evt = threading.Event() + t = threading.Thread(target=_memory_watchdog, + args=(start_evt, finish_evt, 0.5)) + t.daemon = True + t.start() + start_evt.set() + else: + t = None + + try: + return f(self, maxsize) + finally: + if t: + finish_evt.set() + t.join() + wrapper.size = size wrapper.memuse = memuse return wrapper diff --git a/Lib/test/test_bigmem.py b/Lib/test/test_bigmem.py --- a/Lib/test/test_bigmem.py +++ b/Lib/test/test_bigmem.py @@ -9,7 +9,7 @@ """ from test import support -from test.support import bigmemtest, _1G, _2G, _4G, precisionbigmemtest +from test.support import bigmemtest, _1G, _2G, _4G import unittest import operator @@ -50,11 +50,11 @@ # a large object, make the subobject of a length that is not a power of # 2. That way, int-wrapping problems are more easily detected. # -# - While the bigmem decorators speak of 'minsize', all tests will actually -# be called with a much smaller number too, in the normal test run (5Kb -# currently.) This is so the tests themselves get frequent testing. -# Consequently, always make all large allocations based on the passed-in -# 'size', and don't rely on the size being very large. Also, +# - Despite the bigmemtest decorator, all tests will actually be called +# with a much smaller number too, in the normal test run (5Kb currently.) +# This is so the tests themselves get frequent testing. +# Consequently, always make all large allocations based on the +# passed-in 'size', and don't rely on the size being very large. Also, # memuse-per-size should remain sane (less than a few thousand); if your # test uses more, adjust 'size' upward, instead. 
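The semantics spelled out in the docstring and comment block above can be sketched roughly as follows. This is a simplified stand-in, not the actual `test.support` implementation (it omits the watchdog thread and the real `-M` option parsing); the names, the exact-`size` behaviour, and the 5147-byte dry-run size follow the diff, the rest is illustrative:

```python
import unittest

max_memuse = 0  # would be set from the -M command-line option; 0 means dry run

def bigmemtest(size, memuse, dry_run=True):
    """Sketch of the unified decorator: one exact 'size' per test,
    run with a token size (a few kb) when no -M limit is configured."""
    def decorator(f):
        def wrapper(self):
            if not max_memuse:
                # Dry run: keep the test exercised, but tiny.
                if not dry_run:
                    raise unittest.SkipTest("test requires a real -M limit")
                maxsize = 5147
            elif size * memuse > max_memuse:
                raise unittest.SkipTest(
                    "not enough memory: %.1fG minimum needed"
                    % (size * memuse / (1024 ** 3)))
            else:
                maxsize = size
            return f(self, maxsize)
        # Expose the requested numbers for tooling, as the diff does.
        wrapper.size = size
        wrapper.memuse = memuse
        return wrapper
    return decorator

class Demo:
    @bigmemtest(size=2**31, memuse=2)
    def test_example(self, size):
        return size

assert Demo().test_example() == 5147     # dry run scales the size down
assert Demo.test_example.size == 2**31   # requested size is preserved
```

Unlike the old `bigmemtest(minsize=...)`, nothing is scaled *upward* to fill available memory: the test either runs at exactly `size` or is skipped, which is what the old `precisionbigmemtest` did.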
@@ -67,7 +67,7 @@ class BaseStrTest: - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_capitalize(self, size): _ = self.from_latin1 SUBSTR = self.from_latin1(' abc def ghi') @@ -77,7 +77,7 @@ SUBSTR.capitalize()) self.assertEqual(caps.lstrip(_('-')), SUBSTR) - @bigmemtest(minsize=_2G + 10, memuse=1) + @bigmemtest(size=_2G + 10, memuse=1) def test_center(self, size): SUBSTR = self.from_latin1(' abc def ghi') s = SUBSTR.center(size) @@ -88,7 +88,7 @@ self.assertEqual(s[lpadsize:-rpadsize], SUBSTR) self.assertEqual(s.strip(), SUBSTR.strip()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_count(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -100,7 +100,7 @@ self.assertEqual(s.count(_('i')), 1) self.assertEqual(s.count(_('j')), 0) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_endswith(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -112,7 +112,7 @@ self.assertFalse(s.endswith(_('a') + SUBSTR)) self.assertFalse(SUBSTR.endswith(s)) - @bigmemtest(minsize=_2G + 10, memuse=2) + @bigmemtest(size=_2G + 10, memuse=2) def test_expandtabs(self, size): _ = self.from_latin1 s = _('-') * size @@ -125,7 +125,7 @@ self.assertEqual(len(s), size - remainder) self.assertEqual(len(s.strip(_(' '))), 0) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_find(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -142,7 +142,7 @@ sublen + size + SUBSTR.find(_('i'))) self.assertEqual(s.find(_('j')), -1) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_index(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -159,7 +159,7 @@ sublen + size + SUBSTR.index(_('i'))) self.assertRaises(ValueError, s.index, _('j')) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_isalnum(self, size): _ = self.from_latin1 SUBSTR = _('123456') @@ -168,7 +168,7 @@ s += _('.') 
self.assertFalse(s.isalnum()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_isalpha(self, size): _ = self.from_latin1 SUBSTR = _('zzzzzzz') @@ -177,7 +177,7 @@ s += _('.') self.assertFalse(s.isalpha()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_isdigit(self, size): _ = self.from_latin1 SUBSTR = _('123456') @@ -186,7 +186,7 @@ s += _('z') self.assertFalse(s.isdigit()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_islower(self, size): _ = self.from_latin1 chars = _(''.join( @@ -197,7 +197,7 @@ s += _('A') self.assertFalse(s.islower()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_isspace(self, size): _ = self.from_latin1 whitespace = _(' \f\n\r\t\v') @@ -207,7 +207,7 @@ s += _('j') self.assertFalse(s.isspace()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_istitle(self, size): _ = self.from_latin1 SUBSTR = _('123456') @@ -218,7 +218,7 @@ s += _('aA') self.assertFalse(s.istitle()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_isupper(self, size): _ = self.from_latin1 chars = _(''.join( @@ -229,7 +229,7 @@ s += _('a') self.assertFalse(s.isupper()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_join(self, size): _ = self.from_latin1 s = _('A') * size @@ -239,7 +239,7 @@ self.assertTrue(x.startswith(_('aaaaaA'))) self.assertTrue(x.endswith(_('Abbbbb'))) - @bigmemtest(minsize=_2G + 10, memuse=1) + @bigmemtest(size=_2G + 10, memuse=1) def test_ljust(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -248,7 +248,7 @@ self.assertEqual(len(s), size) self.assertEqual(s.strip(), SUBSTR.strip()) - @bigmemtest(minsize=_2G + 10, memuse=2) + @bigmemtest(size=_2G + 10, memuse=2) def test_lower(self, size): _ = self.from_latin1 s = _('A') * size @@ -256,7 +256,7 @@ self.assertEqual(len(s), size) self.assertEqual(s.count(_('a')), size) 
- @bigmemtest(minsize=_2G + 10, memuse=1) + @bigmemtest(size=_2G + 10, memuse=1) def test_lstrip(self, size): _ = self.from_latin1 SUBSTR = _('abc def ghi') @@ -271,7 +271,7 @@ stripped = s.lstrip() self.assertTrue(stripped is s) - @bigmemtest(minsize=_2G + 10, memuse=2) + @bigmemtest(size=_2G + 10, memuse=2) def test_replace(self, size): _ = self.from_latin1 replacement = _('a') @@ -284,7 +284,7 @@ self.assertEqual(s.count(replacement), 4) self.assertEqual(s[-10:], _(' aaaa')) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_rfind(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -300,7 +300,7 @@ SUBSTR.rfind(_('i'))) self.assertEqual(s.rfind(_('j')), -1) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_rindex(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -319,7 +319,7 @@ SUBSTR.rindex(_('i'))) self.assertRaises(ValueError, s.rindex, _('j')) - @bigmemtest(minsize=_2G + 10, memuse=1) + @bigmemtest(size=_2G + 10, memuse=1) def test_rjust(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -328,7 +328,7 @@ self.assertEqual(len(s), size) self.assertEqual(s.strip(), SUBSTR.strip()) - @bigmemtest(minsize=_2G + 10, memuse=1) + @bigmemtest(size=_2G + 10, memuse=1) def test_rstrip(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -346,7 +346,7 @@ # The test takes about size bytes to build a string, and then about # sqrt(size) substrings of sqrt(size) in size and a list to # hold sqrt(size) items. It's close but just over 2x size. - @bigmemtest(minsize=_2G, memuse=2.1) + @bigmemtest(size=_2G, memuse=2.1) def test_split_small(self, size): _ = self.from_latin1 # Crudely calculate an estimate so that the result of s.split won't @@ -372,7 +372,7 @@ # suffer for the list size. (Otherwise, it'd cost another 48 times # size in bytes!) Nevertheless, a list of size takes # 8*size bytes. 
- @bigmemtest(minsize=_2G + 5, memuse=10) + @bigmemtest(size=_2G + 5, memuse=10) def test_split_large(self, size): _ = self.from_latin1 s = _(' a') * size + _(' ') @@ -384,7 +384,7 @@ self.assertEqual(len(l), size + 1) self.assertEqual(set(l), set([_(' ')])) - @bigmemtest(minsize=_2G, memuse=2.1) + @bigmemtest(size=_2G, memuse=2.1) def test_splitlines(self, size): _ = self.from_latin1 # Crudely calculate an estimate so that the result of s.split won't @@ -398,7 +398,7 @@ for item in l: self.assertEqual(item, expected) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_startswith(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi') @@ -407,7 +407,7 @@ self.assertTrue(s.startswith(_('-') * size)) self.assertFalse(s.startswith(SUBSTR)) - @bigmemtest(minsize=_2G, memuse=1) + @bigmemtest(size=_2G, memuse=1) def test_strip(self, size): _ = self.from_latin1 SUBSTR = _(' abc def ghi ') @@ -419,7 +419,7 @@ self.assertEqual(len(s), size) self.assertEqual(s.strip(), SUBSTR.strip()) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_swapcase(self, size): _ = self.from_latin1 SUBSTR = _("aBcDeFG12.'\xa9\x00") @@ -431,7 +431,7 @@ self.assertEqual(s[:sublen * 3], SUBSTR.swapcase() * 3) self.assertEqual(s[-sublen * 3:], SUBSTR.swapcase() * 3) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_title(self, size): _ = self.from_latin1 SUBSTR = _('SpaaHAaaAaham') @@ -440,7 +440,7 @@ self.assertTrue(s.startswith((SUBSTR * 3).title())) self.assertTrue(s.endswith(SUBSTR.lower() * 3)) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_translate(self, size): _ = self.from_latin1 SUBSTR = _('aZz.z.Aaz.') @@ -463,7 +463,7 @@ self.assertEqual(s.count(_('!')), repeats * 2) self.assertEqual(s.count(_('z')), repeats * 3) - @bigmemtest(minsize=_2G + 5, memuse=2) + @bigmemtest(size=_2G + 5, memuse=2) def test_upper(self, size): _ = self.from_latin1 s = _('a') * size @@ -471,7 
+471,7 @@ self.assertEqual(len(s), size) self.assertEqual(s.count(_('A')), size) - @bigmemtest(minsize=_2G + 20, memuse=1) + @bigmemtest(size=_2G + 20, memuse=1) def test_zfill(self, size): _ = self.from_latin1 SUBSTR = _('-568324723598234') @@ -483,7 +483,7 @@ # This test is meaningful even with size < 2G, as long as the # doubled string is > 2G (but it tests more if both are > 2G :) - @bigmemtest(minsize=_1G + 2, memuse=3) + @bigmemtest(size=_1G + 2, memuse=3) def test_concat(self, size): _ = self.from_latin1 s = _('.') * size @@ -494,7 +494,7 @@ # This test is meaningful even with size < 2G, as long as the # repeated string is > 2G (but it tests more if both are > 2G :) - @bigmemtest(minsize=_1G + 2, memuse=3) + @bigmemtest(size=_1G + 2, memuse=3) def test_repeat(self, size): _ = self.from_latin1 s = _('.') * size @@ -503,7 +503,7 @@ self.assertEqual(len(s), size * 2) self.assertEqual(s.count(_('.')), size * 2) - @bigmemtest(minsize=_2G + 20, memuse=2) + @bigmemtest(size=_2G + 20, memuse=2) def test_slice_and_getitem(self, size): _ = self.from_latin1 SUBSTR = _('0123456789') @@ -537,7 +537,7 @@ self.assertRaises(IndexError, operator.getitem, s, len(s) + 1) self.assertRaises(IndexError, operator.getitem, s, len(s) + 1<<31) - @bigmemtest(minsize=_2G, memuse=2) + @bigmemtest(size=_2G, memuse=2) def test_contains(self, size): _ = self.from_latin1 SUBSTR = _('0123456789') @@ -551,7 +551,7 @@ s += _('a') self.assertTrue(_('a') in s) - @bigmemtest(minsize=_2G + 10, memuse=2) + @bigmemtest(size=_2G + 10, memuse=2) def test_compare(self, size): _ = self.from_latin1 s1 = _('-') * size @@ -564,7 +564,7 @@ s2 = _('.') * size self.assertFalse(s1 == s2) - @bigmemtest(minsize=_2G + 10, memuse=1) + @bigmemtest(size=_2G + 10, memuse=1) def test_hash(self, size): # Not sure if we can do any meaningful tests here... 
Even if we # start relying on the exact algorithm used, the result will be @@ -615,46 +615,36 @@ getattr(type(self), name).memuse = memuse # the utf8 encoder preallocates big time (4x the number of characters) - @bigmemtest(minsize=_2G + 2, memuse=character_size + 4) + @bigmemtest(size=_2G + 2, memuse=character_size + 4) def test_encode(self, size): return self.basic_encode_test(size, 'utf-8') - @precisionbigmemtest(size=_4G // 6 + 2, memuse=character_size + 1) + @bigmemtest(size=_4G // 6 + 2, memuse=character_size + 1) def test_encode_raw_unicode_escape(self, size): try: return self.basic_encode_test(size, 'raw_unicode_escape') except MemoryError: pass # acceptable on 32-bit - @precisionbigmemtest(size=_4G // 5 + 70, memuse=character_size + 1) + @bigmemtest(size=_4G // 5 + 70, memuse=character_size + 1) def test_encode_utf7(self, size): try: return self.basic_encode_test(size, 'utf7') except MemoryError: pass # acceptable on 32-bit - @precisionbigmemtest(size=_4G // 4 + 5, memuse=character_size + 4) + @bigmemtest(size=_4G // 4 + 5, memuse=character_size + 4) def test_encode_utf32(self, size): try: return self.basic_encode_test(size, 'utf32', expectedsize=4*size+4) except MemoryError: pass # acceptable on 32-bit - @precisionbigmemtest(size=_2G - 1, memuse=character_size + 1) + @bigmemtest(size=_2G - 1, memuse=character_size + 1) def test_encode_ascii(self, size): return self.basic_encode_test(size, 'ascii', c='A') - @precisionbigmemtest(size=_4G // 5, memuse=character_size * (6 + 1)) - def test_unicode_repr_overflow(self, size): - try: - s = "\uDCBA"*size - r = repr(s) - except MemoryError: - pass # acceptable on 32-bit - else: - self.assertTrue(s == eval(r)) - - @bigmemtest(minsize=_2G + 10, memuse=character_size * 2) + @bigmemtest(size=_2G + 10, memuse=character_size * 2) def test_format(self, size): s = '-' * size sf = '%s' % (s,) @@ -675,7 +665,7 @@ self.assertEqual(s.count('.'), 3) self.assertEqual(s.count('-'), size * 2) - @bigmemtest(minsize=_2G + 10, 
memuse=character_size * 2) + @bigmemtest(size=_2G + 10, memuse=character_size * 2) def test_repr_small(self, size): s = '-' * size s = repr(s) @@ -696,7 +686,7 @@ self.assertEqual(s.count('\\'), size) self.assertEqual(s.count('0'), size * 2) - @bigmemtest(minsize=_2G + 10, memuse=character_size * 5) + @bigmemtest(size=_2G + 10, memuse=character_size * 5) def test_repr_large(self, size): s = '\x00' * size s = repr(s) @@ -706,27 +696,46 @@ self.assertEqual(s.count('\\'), size) self.assertEqual(s.count('0'), size * 2) - @bigmemtest(minsize=2**32 / 5, memuse=character_size * 7) + @bigmemtest(size=_2G // 5 + 1, memuse=character_size * 7) def test_unicode_repr(self, size): # Use an assigned, but not printable code point. # It is in the range of the low surrogates \uDC00-\uDFFF. - s = "\uDCBA" * size - for f in (repr, ascii): - r = f(s) - self.assertTrue(len(r) > size) - self.assertTrue(r.endswith(r"\udcba'"), r[-10:]) - del r + char = "\uDCBA" + s = char * size + try: + for f in (repr, ascii): + r = f(s) + self.assertEqual(len(r), 2 + (len(f(char)) - 2) * size) + self.assertTrue(r.endswith(r"\udcba'"), r[-10:]) + r = None + finally: + r = s = None # The character takes 4 bytes even in UCS-2 builds because it will # be decomposed into surrogates. 
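The rewritten test_unicode_repr above replaces the loose `len(r) > size` check with an exact length: two quote characters plus one escape sequence per repeated code point, i.e. `2 + (len(f(char)) - 2) * size`. The arithmetic can be checked at a tiny size (a stand-alone sketch, not part of the test suite):

```python
# repr()/ascii() of a lone low surrogate escape every occurrence, so the
# result is two quotes plus one fixed-width escape per character.
char = "\udcba"
size = 1000                      # tiny stand-in for the multi-gigabyte size
s = char * size
for f in (repr, ascii):
    r = f(s)
    # len(f(char)) - 2 strips the surrounding quotes from a single escape
    assert len(r) == 2 + (len(f(char)) - 2) * size
    assert r.endswith(r"\udcba'")
```

Here `len(repr(char))` is 8 (`'\udcba'`), so each repeated character contributes 6 characters to the repr.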
- @bigmemtest(minsize=2**32 / 5, memuse=4 + character_size * 9) + @bigmemtest(size=_2G // 5 + 1, memuse=4 + character_size * 9) def test_unicode_repr_wide(self, size): - s = "\U0001DCBA" * size - for f in (repr, ascii): - r = f(s) - self.assertTrue(len(r) > size) - self.assertTrue(r.endswith(r"\U0001dcba'"), r[-12:]) - del r + char = "\U0001DCBA" + s = char * size + try: + for f in (repr, ascii): + r = f(s) + self.assertEqual(len(r), 2 + (len(f(char)) - 2) * size) + self.assertTrue(r.endswith(r"\U0001dcba'"), r[-12:]) + r = None + finally: + r = s = None + + @bigmemtest(size=_4G // 5, memuse=character_size * (6 + 1)) + def _test_unicode_repr_overflow(self, size): + # XXX not sure what this test is about + char = "\uDCBA" + s = char * size + try: + r = repr(s) + self.assertTrue(s == eval(r)) + finally: + r = s = None class BytesTest(unittest.TestCase, BaseStrTest): @@ -734,7 +743,7 @@ def from_latin1(self, s): return s.encode("latin-1") - @bigmemtest(minsize=_2G + 2, memuse=1 + character_size) + @bigmemtest(size=_2G + 2, memuse=1 + character_size) def test_decode(self, size): s = self.from_latin1('.') * size self.assertEqual(len(s.decode('utf-8')), size) @@ -745,7 +754,7 @@ def from_latin1(self, s): return bytearray(s.encode("latin-1")) - @bigmemtest(minsize=_2G + 2, memuse=1 + character_size) + @bigmemtest(size=_2G + 2, memuse=1 + character_size) def test_decode(self, size): s = self.from_latin1('.') * size self.assertEqual(len(s.decode('utf-8')), size) @@ -764,7 +773,7 @@ # having more than 2<<31 references to any given object. Hence the # use of different types of objects as contents in different tests. 
- @bigmemtest(minsize=_2G + 2, memuse=16) + @bigmemtest(size=_2G + 2, memuse=16) def test_compare(self, size): t1 = ('',) * size t2 = ('',) * size @@ -787,15 +796,15 @@ t = t + t self.assertEqual(len(t), size * 2) - @bigmemtest(minsize=_2G // 2 + 2, memuse=24) + @bigmemtest(size=_2G // 2 + 2, memuse=24) def test_concat_small(self, size): return self.basic_concat_test(size) - @bigmemtest(minsize=_2G + 2, memuse=24) + @bigmemtest(size=_2G + 2, memuse=24) def test_concat_large(self, size): return self.basic_concat_test(size) - @bigmemtest(minsize=_2G // 5 + 10, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 10, memuse=8 * 5) def test_contains(self, size): t = (1, 2, 3, 4, 5) * size self.assertEqual(len(t), size * 5) @@ -803,7 +812,7 @@ self.assertFalse((1, 2, 3, 4, 5) in t) self.assertFalse(0 in t) - @bigmemtest(minsize=_2G + 10, memuse=8) + @bigmemtest(size=_2G + 10, memuse=8) def test_hash(self, size): t1 = (0,) * size h1 = hash(t1) @@ -811,7 +820,7 @@ t2 = (0,) * (size + 1) self.assertFalse(h1 == hash(t2)) - @bigmemtest(minsize=_2G + 10, memuse=8) + @bigmemtest(size=_2G + 10, memuse=8) def test_index_and_slice(self, size): t = (None,) * size self.assertEqual(len(t), size) @@ -836,19 +845,19 @@ t = t * 2 self.assertEqual(len(t), size * 2) - @bigmemtest(minsize=_2G // 2 + 2, memuse=24) + @bigmemtest(size=_2G // 2 + 2, memuse=24) def test_repeat_small(self, size): return self.basic_test_repeat(size) - @bigmemtest(minsize=_2G + 2, memuse=24) + @bigmemtest(size=_2G + 2, memuse=24) def test_repeat_large(self, size): return self.basic_test_repeat(size) - @bigmemtest(minsize=_1G - 1, memuse=12) + @bigmemtest(size=_1G - 1, memuse=12) def test_repeat_large_2(self, size): return self.basic_test_repeat(size) - @precisionbigmemtest(size=_1G - 1, memuse=9) + @bigmemtest(size=_1G - 1, memuse=9) def test_from_2G_generator(self, size): self.skipTest("test needs much more memory than advertised, see issue5438") try: @@ -862,7 +871,7 @@ count += 1 self.assertEqual(count, size) - 
@precisionbigmemtest(size=_1G - 25, memuse=9) + @bigmemtest(size=_1G - 25, memuse=9) def test_from_almost_2G_generator(self, size): self.skipTest("test needs much more memory than advertised, see issue5438") try: @@ -885,11 +894,11 @@ self.assertEqual(s[-5:], '0, 0)') self.assertEqual(s.count('0'), size) - @bigmemtest(minsize=_2G // 3 + 2, memuse=8 + 3 * character_size) + @bigmemtest(size=_2G // 3 + 2, memuse=8 + 3 * character_size) def test_repr_small(self, size): return self.basic_test_repr(size) - @bigmemtest(minsize=_2G + 2, memuse=8 + 3 * character_size) + @bigmemtest(size=_2G + 2, memuse=8 + 3 * character_size) def test_repr_large(self, size): return self.basic_test_repr(size) @@ -900,7 +909,7 @@ # lists hold references to various objects to test their refcount # limits. - @bigmemtest(minsize=_2G + 2, memuse=16) + @bigmemtest(size=_2G + 2, memuse=16) def test_compare(self, size): l1 = [''] * size l2 = [''] * size @@ -923,11 +932,11 @@ l = l + l self.assertEqual(len(l), size * 2) - @bigmemtest(minsize=_2G // 2 + 2, memuse=24) + @bigmemtest(size=_2G // 2 + 2, memuse=24) def test_concat_small(self, size): return self.basic_test_concat(size) - @bigmemtest(minsize=_2G + 2, memuse=24) + @bigmemtest(size=_2G + 2, memuse=24) def test_concat_large(self, size): return self.basic_test_concat(size) @@ -938,15 +947,15 @@ self.assertTrue(l[0] is l[-1]) self.assertTrue(l[size - 1] is l[size + 1]) - @bigmemtest(minsize=_2G // 2 + 2, memuse=24) + @bigmemtest(size=_2G // 2 + 2, memuse=24) def test_inplace_concat_small(self, size): return self.basic_test_inplace_concat(size) - @bigmemtest(minsize=_2G + 2, memuse=24) + @bigmemtest(size=_2G + 2, memuse=24) def test_inplace_concat_large(self, size): return self.basic_test_inplace_concat(size) - @bigmemtest(minsize=_2G // 5 + 10, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 10, memuse=8 * 5) def test_contains(self, size): l = [1, 2, 3, 4, 5] * size self.assertEqual(len(l), size * 5) @@ -954,12 +963,12 @@ self.assertFalse([1, 2, 3, 
4, 5] in l) self.assertFalse(0 in l) - @bigmemtest(minsize=_2G + 10, memuse=8) + @bigmemtest(size=_2G + 10, memuse=8) def test_hash(self, size): l = [0] * size self.assertRaises(TypeError, hash, l) - @bigmemtest(minsize=_2G + 10, memuse=8) + @bigmemtest(size=_2G + 10, memuse=8) def test_index_and_slice(self, size): l = [None] * size self.assertEqual(len(l), size) @@ -1023,11 +1032,11 @@ l = l * 2 self.assertEqual(len(l), size * 2) - @bigmemtest(minsize=_2G // 2 + 2, memuse=24) + @bigmemtest(size=_2G // 2 + 2, memuse=24) def test_repeat_small(self, size): return self.basic_test_repeat(size) - @bigmemtest(minsize=_2G + 2, memuse=24) + @bigmemtest(size=_2G + 2, memuse=24) def test_repeat_large(self, size): return self.basic_test_repeat(size) @@ -1043,11 +1052,11 @@ self.assertEqual(len(l), size * 2) self.assertTrue(l[size - 1] is l[-1]) - @bigmemtest(minsize=_2G // 2 + 2, memuse=16) + @bigmemtest(size=_2G // 2 + 2, memuse=16) def test_inplace_repeat_small(self, size): return self.basic_test_inplace_repeat(size) - @bigmemtest(minsize=_2G + 2, memuse=16) + @bigmemtest(size=_2G + 2, memuse=16) def test_inplace_repeat_large(self, size): return self.basic_test_inplace_repeat(size) @@ -1060,17 +1069,17 @@ self.assertEqual(s[-5:], '0, 0]') self.assertEqual(s.count('0'), size) - @bigmemtest(minsize=_2G // 3 + 2, memuse=8 + 3 * character_size) + @bigmemtest(size=_2G // 3 + 2, memuse=8 + 3 * character_size) def test_repr_small(self, size): return self.basic_test_repr(size) - @bigmemtest(minsize=_2G + 2, memuse=8 + 3 * character_size) + @bigmemtest(size=_2G + 2, memuse=8 + 3 * character_size) def test_repr_large(self, size): return self.basic_test_repr(size) # list overallocates ~1/8th of the total size (on first expansion) so # the single list.append call puts memuse at 9 bytes per size. 
- @bigmemtest(minsize=_2G, memuse=9) + @bigmemtest(size=_2G, memuse=9) def test_append(self, size): l = [object()] * size l.append(object()) @@ -1078,7 +1087,7 @@ self.assertTrue(l[-3] is l[-2]) self.assertFalse(l[-2] is l[-1]) - @bigmemtest(minsize=_2G // 5 + 2, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 2, memuse=8 * 5) def test_count(self, size): l = [1, 2, 3, 4, 5] * size self.assertEqual(l.count(1), size) @@ -1091,15 +1100,15 @@ self.assertTrue(l[0] is l[-1]) self.assertTrue(l[size - 1] is l[size + 1]) - @bigmemtest(minsize=_2G // 2 + 2, memuse=16) + @bigmemtest(size=_2G // 2 + 2, memuse=16) def test_extend_small(self, size): return self.basic_test_extend(size) - @bigmemtest(minsize=_2G + 2, memuse=16) + @bigmemtest(size=_2G + 2, memuse=16) def test_extend_large(self, size): return self.basic_test_extend(size) - @bigmemtest(minsize=_2G // 5 + 2, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 2, memuse=8 * 5) def test_index(self, size): l = [1, 2, 3, 4, 5] * size size *= 5 @@ -1110,7 +1119,7 @@ self.assertRaises(ValueError, l.index, 6) # This tests suffers from overallocation, just like test_append. 
- @bigmemtest(minsize=_2G + 10, memuse=9) + @bigmemtest(size=_2G + 10, memuse=9) def test_insert(self, size): l = [1.0] * size l.insert(size - 1, "A") @@ -1129,7 +1138,7 @@ self.assertEqual(l[:3], [1.0, "C", 1.0]) self.assertEqual(l[size - 3:], ["A", 1.0, "B"]) - @bigmemtest(minsize=_2G // 5 + 4, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 4, memuse=8 * 5) def test_pop(self, size): l = ["a", "b", "c", "d", "e"] * size size *= 5 @@ -1153,7 +1162,7 @@ self.assertEqual(item, "c") self.assertEqual(l[-2:], ["b", "d"]) - @bigmemtest(minsize=_2G + 10, memuse=8) + @bigmemtest(size=_2G + 10, memuse=8) def test_remove(self, size): l = [10] * size self.assertEqual(len(l), size) @@ -1173,7 +1182,7 @@ self.assertEqual(len(l), size) self.assertEqual(l[-2:], [10, 10]) - @bigmemtest(minsize=_2G // 5 + 2, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 2, memuse=8 * 5) def test_reverse(self, size): l = [1, 2, 3, 4, 5] * size l.reverse() @@ -1181,7 +1190,7 @@ self.assertEqual(l[-5:], [5, 4, 3, 2, 1]) self.assertEqual(l[:5], [5, 4, 3, 2, 1]) - @bigmemtest(minsize=_2G // 5 + 2, memuse=8 * 5) + @bigmemtest(size=_2G // 5 + 2, memuse=8 * 5) def test_sort(self, size): l = [1, 2, 3, 4, 5] * size l.sort() diff --git a/Lib/test/test_bz2.py b/Lib/test/test_bz2.py --- a/Lib/test/test_bz2.py +++ b/Lib/test/test_bz2.py @@ -1,6 +1,6 @@ #!/usr/bin/env python3 from test import support -from test.support import TESTFN, precisionbigmemtest, _4G +from test.support import TESTFN, bigmemtest, _4G import unittest from io import BytesIO @@ -497,7 +497,7 @@ data += bz2c.flush() self.assertEqual(self.decompress(data), self.TEXT) - @precisionbigmemtest(size=_4G + 100, memuse=2) + @bigmemtest(size=_4G + 100, memuse=2) def testCompress4G(self, size): # "Test BZ2Compressor.compress()/flush() with >4GiB input" bz2c = BZ2Compressor() @@ -548,7 +548,7 @@ text = bz2d.decompress(self.DATA) self.assertRaises(EOFError, bz2d.decompress, b"anything") - @precisionbigmemtest(size=_4G + 100, memuse=3) + @bigmemtest(size=_4G + 
100, memuse=3) def testDecompress4G(self, size): # "Test BZ2Decompressor.decompress() with >4GiB input" blocksize = 10 * 1024 * 1024 diff --git a/Lib/test/test_hashlib.py b/Lib/test/test_hashlib.py --- a/Lib/test/test_hashlib.py +++ b/Lib/test/test_hashlib.py @@ -17,7 +17,7 @@ import unittest import warnings from test import support -from test.support import _4G, precisionbigmemtest +from test.support import _4G, bigmemtest # Were we compiled --with-pydebug or with #define Py_DEBUG? COMPILED_WITH_PYDEBUG = hasattr(sys, 'gettotalrefcount') @@ -196,7 +196,7 @@ b'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789', 'd174ab98d277d9f5a5611c2c9f419d9f') - @precisionbigmemtest(size=_4G + 5, memuse=1) + @bigmemtest(size=_4G + 5, memuse=1) def test_case_md5_huge(self, size): if size == _4G + 5: try: @@ -204,7 +204,7 @@ except OverflowError: pass # 32-bit arch - @precisionbigmemtest(size=_4G - 1, memuse=1) + @bigmemtest(size=_4G - 1, memuse=1) def test_case_md5_uintmax(self, size): if size == _4G - 1: try: diff --git a/Lib/test/test_xml_etree_c.py b/Lib/test/test_xml_etree_c.py --- a/Lib/test/test_xml_etree_c.py +++ b/Lib/test/test_xml_etree_c.py @@ -1,7 +1,7 @@ # xml.etree test for cElementTree from test import support -from test.support import precisionbigmemtest, _2G +from test.support import bigmemtest, _2G import unittest cET = support.import_module('xml.etree.cElementTree') @@ -35,7 +35,7 @@ class MiscTests(unittest.TestCase): # Issue #8651. 
- @support.precisionbigmemtest(size=support._2G + 100, memuse=1) + @support.bigmemtest(size=support._2G + 100, memuse=1) def test_length_overflow(self, size): if size < support._2G + 100: self.skipTest("not enough free memory, need at least 2 GB") diff --git a/Lib/test/test_zlib.py b/Lib/test/test_zlib.py --- a/Lib/test/test_zlib.py +++ b/Lib/test/test_zlib.py @@ -3,7 +3,7 @@ import binascii import random import sys -from test.support import precisionbigmemtest, _1G, _4G +from test.support import bigmemtest, _1G, _4G zlib = support.import_module('zlib') @@ -188,16 +188,16 @@ # Memory use of the following functions takes into account overallocation - @precisionbigmemtest(size=_1G + 1024 * 1024, memuse=3) + @bigmemtest(size=_1G + 1024 * 1024, memuse=3) def test_big_compress_buffer(self, size): compress = lambda s: zlib.compress(s, 1) self.check_big_compress_buffer(size, compress) - @precisionbigmemtest(size=_1G + 1024 * 1024, memuse=2) + @bigmemtest(size=_1G + 1024 * 1024, memuse=2) def test_big_decompress_buffer(self, size): self.check_big_decompress_buffer(size, zlib.decompress) - @precisionbigmemtest(size=_4G + 100, memuse=1) + @bigmemtest(size=_4G + 100, memuse=1) def test_length_overflow(self, size): if size < _4G + 100: self.skipTest("not enough free memory, need at least 4 GB") @@ -542,19 +542,19 @@ # Memory use of the following functions takes into account overallocation - @precisionbigmemtest(size=_1G + 1024 * 1024, memuse=3) + @bigmemtest(size=_1G + 1024 * 1024, memuse=3) def test_big_compress_buffer(self, size): c = zlib.compressobj(1) compress = lambda s: c.compress(s) + c.flush() self.check_big_compress_buffer(size, compress) - @precisionbigmemtest(size=_1G + 1024 * 1024, memuse=2) + @bigmemtest(size=_1G + 1024 * 1024, memuse=2) def test_big_decompress_buffer(self, size): d = zlib.decompressobj() decompress = lambda s: d.decompress(s) + d.flush() self.check_big_decompress_buffer(size, decompress) - @precisionbigmemtest(size=_4G + 100, memuse=1) + 
@bigmemtest(size=_4G + 100, memuse=1) def test_length_overflow(self, size): if size < _4G + 100: self.skipTest("not enough free memory, need at least 4 GB") -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 10:43:26 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 10:43:26 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_test_failure?= Message-ID: http://hg.python.org/cpython/rev/c4b6d9312da1 changeset: 72644:c4b6d9312da1 user: Antoine Pitrou date: Tue Oct 04 10:39:54 2011 +0200 summary: Fix test failure files: Lib/test/json_tests/test_dump.py | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Lib/test/json_tests/test_dump.py b/Lib/test/json_tests/test_dump.py --- a/Lib/test/json_tests/test_dump.py +++ b/Lib/test/json_tests/test_dump.py @@ -1,7 +1,7 @@ from io import StringIO from test.json_tests import PyTest, CTest -from test.support import precisionbigmemtest, _1G +from test.support import bigmemtest, _1G class TestDump: def test_dump(self): @@ -30,7 +30,7 @@ # system memory management, since this may allocate a lot of # small objects). - @precisionbigmemtest(size=_1G, memuse=1) + @bigmemtest(size=_1G, memuse=1) def test_large_list(self, size): N = int(30 * 1024 * 1024 * (size / _1G)) l = [1] * N -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 11:55:04 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 11:55:04 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Use_the_faulthandler_module?= =?utf8?q?=27s_infrastructure_to_write_a_GIL-less?= Message-ID: http://hg.python.org/cpython/rev/8eaa4c3f8633 changeset: 72645:8eaa4c3f8633 user: Antoine Pitrou date: Tue Oct 04 11:51:23 2011 +0200 summary: Use the faulthandler module's infrastructure to write a GIL-less memory watchdog for timely stats collection. 
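The watchdog introduced below frames each `/proc/<pid>/statm` snapshot with a native-`long` length prefix (`struct` format `"l"`) before pushing it through a pipe from the GIL-less C thread to the Python consumer. The framing alone can be exercised in pure Python; the sample payload here is made up:

```python
import os
import struct

HEADER = "l"                        # native long: the length prefix
header_size = struct.calcsize(HEADER)

rfd, wfd = os.pipe()
payload = b"4221 1506 348 12 0 1130 0"   # made-up statm-like contents

# producer side (done in C by faulthandler._file_watchdog)
os.write(wfd, struct.pack(HEADER, len(payload)))
os.write(wfd, payload)

# consumer side (what _MemoryWatchdog.consumer does in a loop)
header = os.read(rfd, header_size)
(data_len,) = struct.unpack(HEADER, header)
data = os.read(rfd, data_len)
assert data == payload

os.close(rfd)
os.close(wfd)
```

A short read on the header signals that the producer closed its end, which is how the consumer thread knows to exit.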
files: Lib/test/support.py | 109 ++++++++++------ Modules/faulthandler.c | 183 +++++++++++++++++++++++++++++ 2 files changed, 249 insertions(+), 43 deletions(-) diff --git a/Lib/test/support.py b/Lib/test/support.py --- a/Lib/test/support.py +++ b/Lib/test/support.py @@ -23,6 +23,7 @@ import sysconfig import fnmatch import logging.handlers +import struct try: import _thread, threading @@ -34,6 +35,10 @@ except ImportError: multiprocessing = None +try: + import faulthandler +except ImportError: + faulthandler = None try: import zlib @@ -1133,41 +1138,66 @@ raise ValueError('Memory limit %r too low to be useful' % (limit,)) max_memuse = memlimit -def _memory_watchdog(start_evt, finish_evt, period=10.0): - """A function which periodically watches the process' memory consumption +class _MemoryWatchdog: + """An object which periodically watches the process' memory consumption and prints it out. """ - # XXX: because of the GIL, and because the very long operations tested - # in most bigmem tests are uninterruptible, the loop below gets woken up - # much less often than expected. - # The polling code should be rewritten in raw C, without holding the GIL, - # and push results onto an anonymous pipe. 
- try: - page_size = os.sysconf('SC_PAGESIZE') - except (ValueError, AttributeError): + + def __init__(self): + self.procfile = '/proc/{pid}/statm'.format(pid=os.getpid()) + self.started = False + self.thread = None try: - page_size = os.sysconf('SC_PAGE_SIZE') + self.page_size = os.sysconf('SC_PAGESIZE') except (ValueError, AttributeError): - page_size = 4096 - procfile = '/proc/{pid}/statm'.format(pid=os.getpid()) - try: - f = open(procfile, 'rb') - except IOError as e: - warnings.warn('/proc not available for stats: {}'.format(e), - RuntimeWarning) - sys.stderr.flush() - return - with f: - start_evt.set() - old_data = -1 - while not finish_evt.wait(period): - f.seek(0) - statm = f.read().decode('ascii') - data = int(statm.split()[5]) - if data != old_data: - old_data = data + try: + self.page_size = os.sysconf('SC_PAGE_SIZE') + except (ValueError, AttributeError): + self.page_size = 4096 + + def consumer(self, fd): + HEADER = "l" + header_size = struct.calcsize(HEADER) + try: + while True: + header = os.read(fd, header_size) + if len(header) < header_size: + # Pipe closed on other end + break + data_len, = struct.unpack(HEADER, header) + data = os.read(fd, data_len) + statm = data.decode('ascii') + data = int(statm.split()[5]) print(" ... 
process data size: {data:.1f}G" - .format(data=data * page_size / (1024 ** 3))) + .format(data=data * self.page_size / (1024 ** 3))) + finally: + os.close(fd) + + def start(self): + if not faulthandler or not hasattr(faulthandler, '_file_watchdog'): + return + try: + rfd = os.open(self.procfile, os.O_RDONLY) + except OSError as e: + warnings.warn('/proc not available for stats: {}'.format(e), + RuntimeWarning) + sys.stderr.flush() + return + pipe_fd, wfd = os.pipe() + # _file_watchdog() doesn't take the GIL in its child thread, and + # therefore collects statistics timely + faulthandler._file_watchdog(rfd, wfd, 3.0) + self.started = True + self.thread = threading.Thread(target=self.consumer, args=(pipe_fd,)) + self.thread.daemon = True + self.thread.start() + + def stop(self): + if not self.started: + return + faulthandler._cancel_file_watchdog() + self.thread.join() + def bigmemtest(size, memuse, dry_run=True): """Decorator for bigmem tests. @@ -1194,27 +1224,20 @@ "not enough memory: %.1fG minimum needed" % (size * memuse / (1024 ** 3))) - if real_max_memuse and verbose and threading: + if real_max_memuse and verbose and faulthandler and threading: print() print(" ... 
expected peak memory use: {peak:.1f}G" .format(peak=size * memuse / (1024 ** 3))) - sys.stdout.flush() - start_evt = threading.Event() - finish_evt = threading.Event() - t = threading.Thread(target=_memory_watchdog, - args=(start_evt, finish_evt, 0.5)) - t.daemon = True - t.start() - start_evt.set() + watchdog = _MemoryWatchdog() + watchdog.start() else: - t = None + watchdog = None try: return f(self, maxsize) finally: - if t: - finish_evt.set() - t.join() + if watchdog: + watchdog.stop() wrapper.size = size wrapper.memuse = memuse diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -13,6 +13,7 @@ #ifdef WITH_THREAD # define FAULTHANDLER_LATER +# define FAULTHANDLER_WATCHDOG #endif #ifndef MS_WINDOWS @@ -65,6 +66,20 @@ } thread; #endif +#ifdef FAULTHANDLER_WATCHDOG +static struct { + int rfd; + int wfd; + PY_TIMEOUT_T period_us; /* period in microseconds */ + /* The main thread always holds this lock. It is only released when + faulthandler_watchdog() is interrupted before this thread exits, or at + Python exit. 
*/ + PyThread_type_lock cancel_event; + /* released by child thread when joined */ + PyThread_type_lock running; +} watchdog; +#endif + #ifdef FAULTHANDLER_USER typedef struct { int enabled; @@ -587,6 +602,138 @@ } #endif /* FAULTHANDLER_LATER */ +#ifdef FAULTHANDLER_WATCHDOG + +static void +file_watchdog(void *unused) +{ + PyLockStatus st; + PY_TIMEOUT_T timeout; + + const int MAXDATA = 1024; + char buf1[MAXDATA], buf2[MAXDATA]; + char *data = buf1, *old_data = buf2; + Py_ssize_t data_len, old_data_len = -1; + +#if defined(HAVE_PTHREAD_SIGMASK) && !defined(HAVE_BROKEN_PTHREAD_SIGMASK) + sigset_t set; + + /* we don't want to receive any signal */ + sigfillset(&set); + pthread_sigmask(SIG_SETMASK, &set, NULL); +#endif + + /* On first pass, feed file contents immediately */ + timeout = 0; + do { + st = PyThread_acquire_lock_timed(watchdog.cancel_event, + timeout, 0); + timeout = watchdog.period_us; + if (st == PY_LOCK_ACQUIRED) { + PyThread_release_lock(watchdog.cancel_event); + break; + } + /* Timeout => read and write data */ + assert(st == PY_LOCK_FAILURE); + + if (lseek(watchdog.rfd, 0, SEEK_SET) < 0) { + break; + } + data_len = read(watchdog.rfd, data, MAXDATA); + if (data_len < 0) { + break; + } + if (data_len != old_data_len || memcmp(data, old_data, data_len)) { + char *tdata; + Py_ssize_t tlen; + /* Contents changed, feed them to wfd */ + long x = (long) data_len; + /* We can't do anything if the consumer is too slow, just bail out */ + if (write(watchdog.wfd, (void *) &x, sizeof(x)) < sizeof(x)) + break; + if (write(watchdog.wfd, data, data_len) < data_len) + break; + tdata = data; + data = old_data; + old_data = tdata; + tlen = data_len; + data_len = old_data_len; + old_data_len = tlen; + } + } while (1); + + close(watchdog.rfd); + close(watchdog.wfd); + + /* The only way out */ + PyThread_release_lock(watchdog.running); +} + +static void +cancel_file_watchdog(void) +{ + /* Notify cancellation */ + PyThread_release_lock(watchdog.cancel_event); + + /* Wait 
for thread to join */ + PyThread_acquire_lock(watchdog.running, 1); + PyThread_release_lock(watchdog.running); + + /* The main thread should always hold the cancel_event lock */ + PyThread_acquire_lock(watchdog.cancel_event, 1); +} + +static PyObject* +faulthandler_file_watchdog(PyObject *self, + PyObject *args, PyObject *kwargs) +{ + static char *kwlist[] = {"rfd", "wfd", "period", NULL}; + double period; + PY_TIMEOUT_T period_us; + int rfd, wfd; + + if (!PyArg_ParseTupleAndKeywords(args, kwargs, + "iid:_file_watchdog", kwlist, + &rfd, &wfd, &period)) + return NULL; + if ((period * 1e6) >= (double) PY_TIMEOUT_MAX) { + PyErr_SetString(PyExc_OverflowError, "period value is too large"); + return NULL; + } + period_us = (PY_TIMEOUT_T)(period * 1e6); + if (period_us <= 0) { + PyErr_SetString(PyExc_ValueError, "period must be greater than 0"); + return NULL; + } + + /* Cancel previous thread, if running */ + cancel_file_watchdog(); + + watchdog.rfd = rfd; + watchdog.wfd = wfd; + watchdog.period_us = period_us; + + /* Arm these locks to serve as events when released */ + PyThread_acquire_lock(watchdog.running, 1); + + if (PyThread_start_new_thread(file_watchdog, NULL) == -1) { + PyThread_release_lock(watchdog.running); + PyErr_SetString(PyExc_RuntimeError, + "unable to start file watchdog thread"); + return NULL; + } + + Py_RETURN_NONE; +} + +static PyObject* +faulthandler_cancel_file_watchdog(PyObject *self) +{ + cancel_file_watchdog(); + Py_RETURN_NONE; +} +#endif /* FAULTHANDLER_WATCHDOG */ + #ifdef FAULTHANDLER_USER static int faulthandler_register(int signum, int chain, _Py_sighandler_t *p_previous) @@ -973,6 +1120,18 @@ "to dump_tracebacks_later().")}, #endif +#ifdef FAULTHANDLER_WATCHDOG + {"_file_watchdog", + (PyCFunction)faulthandler_file_watchdog, METH_VARARGS|METH_KEYWORDS, + PyDoc_STR("_file_watchdog(rfd, wfd, period):\n" + "feed the contents of 'rfd' to 'wfd', if changed,\n" + "every 'period seconds'.")}, + {"_cancel_file_watchdog", + 
(PyCFunction)faulthandler_cancel_file_watchdog, METH_NOARGS, + PyDoc_STR("_cancel_file_watchdog():\ncancel the previous call " + "to _file_watchdog().")}, +#endif + #ifdef FAULTHANDLER_USER {"register", (PyCFunction)faulthandler_register_py, METH_VARARGS|METH_KEYWORDS, @@ -1097,6 +1256,16 @@ } PyThread_acquire_lock(thread.cancel_event, 1); #endif +#ifdef FAULTHANDLER_WATCHDOG + watchdog.cancel_event = PyThread_allocate_lock(); + watchdog.running = PyThread_allocate_lock(); + if (!watchdog.cancel_event || !watchdog.running) { + PyErr_SetString(PyExc_RuntimeError, + "could not allocate locks for faulthandler"); + return -1; + } + PyThread_acquire_lock(watchdog.cancel_event, 1); +#endif return faulthandler_env_options(); } @@ -1121,6 +1290,20 @@ } #endif +#ifdef FAULTHANDLER_WATCHDOG + /* file watchdog */ + cancel_file_watchdog(); + if (watchdog.cancel_event) { + PyThread_release_lock(watchdog.cancel_event); + PyThread_free_lock(watchdog.cancel_event); + watchdog.cancel_event = NULL; + } + if (watchdog.running) { + PyThread_free_lock(watchdog.running); + watchdog.running = NULL; + } +#endif + #ifdef FAULTHANDLER_USER /* user */ if (user_signals != NULL) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 12:03:47 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 12:03:47 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=282=2E7=29=3A_Avoid_testing_s?= =?utf8?q?tuff_that=27s_been_fixed_in_2=2E7_on_older_Pythons?= Message-ID: http://hg.python.org/cpython/rev/05c58e0873f0 changeset: 72646:05c58e0873f0 branch: 2.7 parent: 72641:64053bd79590 user: Antoine Pitrou date: Tue Oct 04 12:00:13 2011 +0200 summary: Avoid testing stuff that's been fixed in 2.7 on older Pythons files: Lib/test/test_xpickle.py | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_xpickle.py b/Lib/test/test_xpickle.py --- a/Lib/test/test_xpickle.py +++ b/Lib/test/test_xpickle.py @@ -154,6 +154,10 
@@ def test_unicode_high_plane(self): pass + # This tests a fix that's in 2.7 only + def test_dynamic_class(self): + pass + if test_support.have_unicode: # This is a cut-down version of pickletester's test_unicode. Backwards # compatibility was explicitly broken in r67934 to fix a bug. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 12:09:39 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 12:09:39 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Collect_stats_a_bit_more_of?= =?utf8?q?ten?= Message-ID: http://hg.python.org/cpython/rev/f2ed0310adec changeset: 72647:f2ed0310adec parent: 72645:8eaa4c3f8633 user: Antoine Pitrou date: Tue Oct 04 12:06:06 2011 +0200 summary: Collect stats a bit more often files: Lib/test/support.py | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Lib/test/support.py b/Lib/test/support.py --- a/Lib/test/support.py +++ b/Lib/test/support.py @@ -1186,7 +1186,7 @@ pipe_fd, wfd = os.pipe() # _file_watchdog() doesn't take the GIL in its child thread, and # therefore collects statistics timely - faulthandler._file_watchdog(rfd, wfd, 3.0) + faulthandler._file_watchdog(rfd, wfd, 1.0) self.started = True self.thread = threading.Thread(target=self.consumer, args=(pipe_fd,)) self.thread.daemon = True -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 12:33:00 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 12:33:00 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogSXNzdWUgIzEzMDg3?= =?utf8?q?=3A_BufferedReader=2Eseek=28=29_now_always_raises_UnsupportedOpe?= =?utf8?q?ration?= Message-ID: http://hg.python.org/cpython/rev/d287f0654349 changeset: 72648:d287f0654349 branch: 3.2 parent: 72642:bf39434dd506 user: Antoine Pitrou date: Tue Oct 04 12:26:20 2011 +0200 summary: Issue #13087: BufferedReader.seek() now always raises UnsupportedOperation if the underlying raw 
stream is unseekable, even if the seek could be satisfied using the internal buffer. Patch by John O'Connor. files: Lib/test/test_io.py | 8 ++++++++ Misc/NEWS | 4 ++++ Modules/_io/bufferedio.c | 3 +++ 3 files changed, 15 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_io.py b/Lib/test/test_io.py --- a/Lib/test/test_io.py +++ b/Lib/test/test_io.py @@ -922,6 +922,14 @@ finally: support.unlink(support.TESTFN) + def test_unseekable(self): + bufio = self.tp(self.MockUnseekableIO(b"A" * 10)) + self.assertRaises(self.UnsupportedOperation, bufio.tell) + self.assertRaises(self.UnsupportedOperation, bufio.seek, 0) + bufio.read(1) + self.assertRaises(self.UnsupportedOperation, bufio.seek, 0) + self.assertRaises(self.UnsupportedOperation, bufio.tell) + def test_misbehaved_io(self): rawio = self.MisbehavedRawIO((b"abc", b"d", b"efg")) bufio = self.tp(rawio) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -36,6 +36,10 @@ Library ------- +- Issue #13087: BufferedReader.seek() now always raises UnsupportedOperation + if the underlying raw stream is unseekable, even if the seek could be + satisfied using the internal buffer. Patch by John O'Connor. + - Issue #7689: Allow pickling of dynamically created classes when their metaclass is registered with copyreg. Patch by Nicolas M. Thiéry and Craig Citro. 
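The faulthandler changes above coordinate thread shutdown with two locks: the main thread normally holds `cancel_event` and releases it to request cancellation, while the child holds `running` until it exits, so a plain acquire doubles as a join. The same idiom translated to pure Python (an illustrative sketch only, with a short timeout standing in for the watchdog period):

```python
import threading

cancel_event = threading.Lock()
running = threading.Lock()
cancel_event.acquire()            # armed: held by the "main thread"
ticks = []

def watchdog():
    # A timed acquire that fails is the "keep running" signal.
    while not cancel_event.acquire(timeout=0.01):
        ticks.append(1)           # periodic work would go here
    cancel_event.release()
    running.release()             # the only way out: unblocks the joiner

running.acquire()                 # armed before the thread starts
threading.Thread(target=watchdog).start()

cancel_event.release()            # notify cancellation
running.acquire()                 # wait for the thread to exit
running.release()
cancel_event.acquire()            # re-arm, mirroring cancel_file_watchdog()
```

Releasing `cancel_event` from the main thread works because `threading.Lock`, like the `PyThread_type_lock` used in the C code, is not owner-checked.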
diff --git a/Modules/_io/bufferedio.c b/Modules/_io/bufferedio.c --- a/Modules/_io/bufferedio.c +++ b/Modules/_io/bufferedio.c @@ -1086,6 +1086,9 @@ CHECK_CLOSED(self, "seek of closed file") + if (_PyIOBase_check_seekable(self->raw, Py_True) == NULL) + return NULL; + target = PyNumber_AsOff_t(targetobj, PyExc_ValueError); if (target == -1 && PyErr_Occurred()) return NULL; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 12:33:01 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 12:33:01 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=283=2E2=29=3A_Add_John_to_ACK?= =?utf8?q?S?= Message-ID: http://hg.python.org/cpython/rev/94c32dff61c7 changeset: 72649:94c32dff61c7 branch: 3.2 user: Antoine Pitrou date: Tue Oct 04 12:26:34 2011 +0200 summary: Add John to ACKS files: Misc/ACKS | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/Misc/ACKS b/Misc/ACKS --- a/Misc/ACKS +++ b/Misc/ACKS @@ -652,6 +652,7 @@ Michal Nowikowski Steffen Daode Nurpmeso Nigel O'Brian +John O'Connor Kevin O'Connor Tim O'Malley Pascal Oberndoerfer -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 12:33:02 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 12:33:02 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Issue_=2313087=3A_BufferedReader=2Eseek=28=29_now_always_rai?= =?utf8?q?ses_UnsupportedOperation?= Message-ID: http://hg.python.org/cpython/rev/0cf38407a3a2 changeset: 72650:0cf38407a3a2 parent: 72647:f2ed0310adec parent: 72649:94c32dff61c7 user: Antoine Pitrou date: Tue Oct 04 12:28:52 2011 +0200 summary: Issue #13087: BufferedReader.seek() now always raises UnsupportedOperation if the underlying raw stream is unseekable, even if the seek could be satisfied using the internal buffer. Patch by John O'Connor.
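The behavior change Issue #13087 describes is easy to demonstrate from pure Python. The sketch below is illustrative, not taken from the patch: `Unseekable` is a hypothetical stand-in for the `MockUnseekableIO` helper that the new test_io test relies on.

```python
import io

class Unseekable(io.RawIOBase):
    # Hypothetical stand-in for test_io's MockUnseekableIO: a raw stream
    # that reports itself as unseekable while still being readable.
    def __init__(self, data):
        self._inner = io.BytesIO(data)
    def readable(self):
        return True
    def seekable(self):
        return False
    def readinto(self, b):
        return self._inner.readinto(b)

bufio = io.BufferedReader(Unseekable(b"A" * 10))
bufio.read(1)              # reading works; bytes flow through the buffer
try:
    bufio.seek(0)          # refused, even though byte 0 is still buffered
    seek_refused = False
except io.UnsupportedOperation:
    seek_refused = True
```

Before this fix, a seek that happened to land inside the internal buffer could silently succeed, so whether `seek()` worked depended on buffer state rather than on the raw stream's seekability.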
files: Lib/test/test_io.py | 8 ++++++++ Misc/NEWS | 4 ++++ Modules/_io/bufferedio.c | 3 +++ 3 files changed, 15 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_io.py b/Lib/test/test_io.py --- a/Lib/test/test_io.py +++ b/Lib/test/test_io.py @@ -928,6 +928,14 @@ finally: support.unlink(support.TESTFN) + def test_unseekable(self): + bufio = self.tp(self.MockUnseekableIO(b"A" * 10)) + self.assertRaises(self.UnsupportedOperation, bufio.tell) + self.assertRaises(self.UnsupportedOperation, bufio.seek, 0) + bufio.read(1) + self.assertRaises(self.UnsupportedOperation, bufio.seek, 0) + self.assertRaises(self.UnsupportedOperation, bufio.tell) + def test_misbehaved_io(self): rawio = self.MisbehavedRawIO((b"abc", b"d", b"efg")) bufio = self.tp(rawio) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -294,6 +294,10 @@ Library ------- +- Issue #13087: BufferedReader.seek() now always raises UnsupportedOperation + if the underlying raw stream is unseekable, even if the seek could be + satisfied using the internal buffer. Patch by John O'Connor. + - Issue #7689: Allow pickling of dynamically created classes when their metaclass is registered with copyreg. Patch by Nicolas M. Thiéry and Craig Citro.
diff --git a/Modules/_io/bufferedio.c b/Modules/_io/bufferedio.c --- a/Modules/_io/bufferedio.c +++ b/Modules/_io/bufferedio.c @@ -1155,6 +1155,9 @@ CHECK_CLOSED(self, "seek of closed file") + if (_PyIOBase_check_seekable(self->raw, Py_True) == NULL) + return NULL; + target = PyNumber_AsOff_t(targetobj, PyExc_ValueError); if (target == -1 && PyErr_Occurred()) return NULL; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 13:03:36 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 13:03:36 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_compilation_error_under?= =?utf8?q?_Windows?= Message-ID: http://hg.python.org/cpython/rev/dc21b26d80f9 changeset: 72651:dc21b26d80f9 user: Antoine Pitrou date: Tue Oct 04 13:00:02 2011 +0200 summary: Fix compilation error under Windows files: Modules/faulthandler.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/Modules/faulthandler.c b/Modules/faulthandler.c --- a/Modules/faulthandler.c +++ b/Modules/faulthandler.c @@ -610,7 +610,7 @@ PyLockStatus st; PY_TIMEOUT_T timeout; - const int MAXDATA = 1024; +#define MAXDATA 1024 char buf1[MAXDATA], buf2[MAXDATA]; char *data = buf1, *old_data = buf2; Py_ssize_t data_len, old_data_len = -1; @@ -667,6 +667,7 @@ /* The only way out */ PyThread_release_lock(watchdog.running); +#undef MAXDATA } static void -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 13:41:45 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 13:41:45 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogSXNzdWUgIzEzMDk5?= =?utf8?q?=3A_Fix_sqlite3=2ECursor=2Elastrowid_under_a_Turkish_locale=2E?= Message-ID: http://hg.python.org/cpython/rev/469555867244 changeset: 72652:469555867244 branch: 3.2 parent: 72649:94c32dff61c7 user: Antoine Pitrou date: Tue Oct 04 13:35:28 2011 +0200 summary: Issue #13099: Fix sqlite3.Cursor.lastrowid under a 
Turkish locale. Reported and diagnosed by Thomas Kluyver. files: Misc/ACKS | 1 + Misc/NEWS | 3 +++ Modules/_sqlite/cursor.c | 4 ++-- 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/Misc/ACKS b/Misc/ACKS --- a/Misc/ACKS +++ b/Misc/ACKS @@ -488,6 +488,7 @@ Bob Kline Matthias Klose Jeremy Kloth +Thomas Kluyver Kim Knapp Lenny Kneler Pat Knight diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -36,6 +36,9 @@ Library ------- +- Issue #13099: Fix sqlite3.Cursor.lastrowid under a Turkish locale. + Reported and diagnosed by Thomas Kluyver. + - Issue #13087: BufferedReader.seek() now always raises UnsupportedOperation if the underlying raw stream is unseekable, even if the seek could be satisfied using the internal buffer. Patch by John O'Connor. diff --git a/Modules/_sqlite/cursor.c b/Modules/_sqlite/cursor.c --- a/Modules/_sqlite/cursor.c +++ b/Modules/_sqlite/cursor.c @@ -55,8 +55,8 @@ dst = buf; *dst = 0; - while (isalpha(*src) && dst - buf < sizeof(buf) - 2) { - *dst++ = tolower(*src++); + while (Py_ISALPHA(*src) && dst - buf < sizeof(buf) - 2) { + *dst++ = Py_TOLOWER(*src++); } *dst = 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 13:41:46 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 13:41:46 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Issue_=2313099=3A_Fix_sqlite3=2ECursor=2Elastrowid_under_a_T?= =?utf8?q?urkish_locale=2E?= Message-ID: http://hg.python.org/cpython/rev/652e2dacbf4b changeset: 72653:652e2dacbf4b parent: 72651:dc21b26d80f9 parent: 72652:469555867244 user: Antoine Pitrou date: Tue Oct 04 13:37:06 2011 +0200 summary: Issue #13099: Fix sqlite3.Cursor.lastrowid under a Turkish locale. Reported and diagnosed by Thomas Kluyver. 
files: Misc/ACKS | 1 + Misc/NEWS | 3 +++ Modules/_sqlite/cursor.c | 4 ++-- 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/Misc/ACKS b/Misc/ACKS --- a/Misc/ACKS +++ b/Misc/ACKS @@ -516,6 +516,7 @@ Bob Kline Matthias Klose Jeremy Kloth +Thomas Kluyver Kim Knapp Lenny Kneler Pat Knight diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -294,6 +294,9 @@ Library ------- +- Issue #13099: Fix sqlite3.Cursor.lastrowid under a Turkish locale. + Reported and diagnosed by Thomas Kluyver. + - Issue #13087: BufferedReader.seek() now always raises UnsupportedOperation if the underlying raw stream is unseekable, even if the seek could be satisfied using the internal buffer. Patch by John O'Connor. diff --git a/Modules/_sqlite/cursor.c b/Modules/_sqlite/cursor.c --- a/Modules/_sqlite/cursor.c +++ b/Modules/_sqlite/cursor.c @@ -55,8 +55,8 @@ dst = buf; *dst = 0; - while (isalpha(*src) && dst - buf < sizeof(buf) - 2) { - *dst++ = tolower(*src++); + while (Py_ISALPHA(*src) && dst - buf < sizeof(buf) - 2) { + *dst++ = Py_TOLOWER(*src++); } *dst = 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 13:41:46 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 13:41:46 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMi43KTogSXNzdWUgIzEzMDk5?= =?utf8?q?=3A_Fix_sqlite3=2ECursor=2Elastrowid_under_a_Turkish_locale=2E?= Message-ID: http://hg.python.org/cpython/rev/89713606b654 changeset: 72654:89713606b654 branch: 2.7 parent: 72646:05c58e0873f0 user: Antoine Pitrou date: Tue Oct 04 13:38:04 2011 +0200 summary: Issue #13099: Fix sqlite3.Cursor.lastrowid under a Turkish locale. Reported and diagnosed by Thomas Kluyver. 
files: Misc/ACKS | 1 + Misc/NEWS | 3 +++ Modules/_sqlite/cursor.c | 4 ++-- 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/Misc/ACKS b/Misc/ACKS --- a/Misc/ACKS +++ b/Misc/ACKS @@ -450,6 +450,7 @@ Bob Kline Matthias Klose Jeremy Kloth +Thomas Kluyver Kim Knapp Lenny Kneler Pat Knight diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -50,6 +50,9 @@ Library ------- +- Issue #13099: Fix sqlite3.Cursor.lastrowid under a Turkish locale. + Reported and diagnosed by Thomas Kluyver. + - Issue #7689: Allow pickling of dynamically created classes when their metaclass is registered with copy_reg. Patch by Nicolas M. Thiéry and Craig Citro. diff --git a/Modules/_sqlite/cursor.c b/Modules/_sqlite/cursor.c --- a/Modules/_sqlite/cursor.c +++ b/Modules/_sqlite/cursor.c @@ -55,8 +55,8 @@ dst = buf; *dst = 0; - while (isalpha(*src) && dst - buf < sizeof(buf) - 2) { - *dst++ = tolower(*src++); + while (Py_ISALPHA(*src) && dst - buf < sizeof(buf) - 2) { + *dst++ = Py_TOLOWER(*src++); } *dst = 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 13:56:36 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 13:56:36 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=283=2E2=29=3A_Remove_all_othe?= =?utf8?q?r_uses_of_the_C_tolower=28=29/toupper=28=29_which_could_break_wi?= =?utf8?q?th_a?= Message-ID: http://hg.python.org/cpython/rev/fe48e2b3dbee changeset: 72655:fe48e2b3dbee branch: 3.2 parent: 72652:469555867244 user: Antoine Pitrou date: Tue Oct 04 13:50:21 2011 +0200 summary: Remove all other uses of the C tolower()/toupper() which could break with a Turkish locale.
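Some background on why these commits matter, for readers unfamiliar with the Turkish-locale pitfall: under LC_CTYPE=tr_TR the C tolower('I') maps to dotless 'ı' rather than 'i', so any ASCII keyword comparison built on tolower() — such as sqlite3's statement-kind detection above — silently breaks. Py_TOLOWER is a fixed 256-entry ASCII table and is immune. A pure-Python sketch of the same locale-independent mapping (`ascii_lower` is an illustrative name, not an API from the patch):

```python
def ascii_lower(s):
    # Fixed A-Z -> a-z mapping, independent of the process locale --
    # the same guarantee the Py_TOLOWER table gives the C code.
    return ''.join(chr(ord(c) + 32) if 'A' <= c <= 'Z' else c
                   for c in s)

# Statement-kind detection stays correct whatever LC_CTYPE says:
assert ascii_lower("INSERT INTO t VALUES (1)").split()[0] == "insert"
```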
files: Modules/_tkinter.c | 4 ++-- Modules/binascii.c | 4 ++-- Modules/unicodedata.c | 4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/Modules/_tkinter.c b/Modules/_tkinter.c --- a/Modules/_tkinter.c +++ b/Modules/_tkinter.c @@ -661,8 +661,8 @@ } strcpy(argv0, className); - if (isupper(Py_CHARMASK(argv0[0]))) - argv0[0] = tolower(Py_CHARMASK(argv0[0])); + if (Py_ISUPPER(Py_CHARMASK(argv0[0]))) + argv0[0] = Py_TOLOWER(Py_CHARMASK(argv0[0])); Tcl_SetVar(v->interp, "argv0", argv0, TCL_GLOBAL_ONLY); ckfree(argv0); diff --git a/Modules/binascii.c b/Modules/binascii.c --- a/Modules/binascii.c +++ b/Modules/binascii.c @@ -1102,8 +1102,8 @@ if (isdigit(c)) return c - '0'; else { - if (isupper(c)) - c = tolower(c); + if (Py_ISUPPER(c)) + c = Py_TOLOWER(c); if (c >= 'a' && c <= 'f') return c - 'a' + 10; } diff --git a/Modules/unicodedata.c b/Modules/unicodedata.c --- a/Modules/unicodedata.c +++ b/Modules/unicodedata.c @@ -830,7 +830,7 @@ unsigned long h = 0; unsigned long ix; for (i = 0; i < len; i++) { - h = (h * scale) + (unsigned char) toupper(Py_CHARMASK(s[i])); + h = (h * scale) + (unsigned char) Py_TOUPPER(Py_CHARMASK(s[i])); ix = h & 0xff000000; if (ix) h = (h ^ ((ix>>24) & 0xff)) & 0x00ffffff; @@ -980,7 +980,7 @@ if (!_getucname(self, code, buffer, sizeof(buffer))) return 0; for (i = 0; i < namelen; i++) { - if (toupper(Py_CHARMASK(name[i])) != buffer[i]) + if (Py_TOUPPER(Py_CHARMASK(name[i])) != buffer[i]) return 0; } return buffer[namelen] == '\0'; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 13:56:37 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 13:56:37 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Remove_all_other_uses_of_the_C_tolower=28=29/toupper=28=29_w?= =?utf8?q?hich_could_break_with_a?= Message-ID: http://hg.python.org/cpython/rev/969bbc6700a0 changeset: 72656:969bbc6700a0 parent: 72653:652e2dacbf4b 
parent: 72655:fe48e2b3dbee user: Antoine Pitrou date: Tue Oct 04 13:53:01 2011 +0200 summary: Remove all other uses of the C tolower()/toupper() which could break with a Turkish locale. files: Modules/_tkinter.c | 4 ++-- Modules/binascii.c | 4 ++-- Modules/unicodedata.c | 4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/Modules/_tkinter.c b/Modules/_tkinter.c --- a/Modules/_tkinter.c +++ b/Modules/_tkinter.c @@ -649,8 +649,8 @@ } strcpy(argv0, className); - if (isupper(Py_CHARMASK(argv0[0]))) - argv0[0] = tolower(Py_CHARMASK(argv0[0])); + if (Py_ISUPPER(Py_CHARMASK(argv0[0]))) + argv0[0] = Py_TOLOWER(Py_CHARMASK(argv0[0])); Tcl_SetVar(v->interp, "argv0", argv0, TCL_GLOBAL_ONLY); ckfree(argv0); diff --git a/Modules/binascii.c b/Modules/binascii.c --- a/Modules/binascii.c +++ b/Modules/binascii.c @@ -1102,8 +1102,8 @@ if (isdigit(c)) return c - '0'; else { - if (isupper(c)) - c = tolower(c); + if (Py_ISUPPER(c)) + c = Py_TOLOWER(c); if (c >= 'a' && c <= 'f') return c - 'a' + 10; } diff --git a/Modules/unicodedata.c b/Modules/unicodedata.c --- a/Modules/unicodedata.c +++ b/Modules/unicodedata.c @@ -875,7 +875,7 @@ unsigned long h = 0; unsigned long ix; for (i = 0; i < len; i++) { - h = (h * scale) + (unsigned char) toupper(Py_CHARMASK(s[i])); + h = (h * scale) + (unsigned char) Py_TOUPPER(Py_CHARMASK(s[i])); ix = h & 0xff000000; if (ix) h = (h ^ ((ix>>24) & 0xff)) & 0x00ffffff; @@ -1025,7 +1025,7 @@ if (!_getucname(self, code, buffer, sizeof(buffer))) return 0; for (i = 0; i < namelen; i++) { - if (toupper(Py_CHARMASK(name[i])) != buffer[i]) + if (Py_TOUPPER(Py_CHARMASK(name[i])) != buffer[i]) return 0; } return buffer[namelen] == '\0'; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 13:59:11 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 13:59:11 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=282=2E7=29=3A_Remove_all_othe?= 
=?utf8?q?r_uses_of_the_C_tolower=28=29/toupper=28=29_which_could_break_wi?= =?utf8?q?th_a?= Message-ID: http://hg.python.org/cpython/rev/60d80880bf88 changeset: 72657:60d80880bf88 branch: 2.7 parent: 72654:89713606b654 user: Antoine Pitrou date: Tue Oct 04 13:55:37 2011 +0200 summary: Remove all other uses of the C tolower()/toupper() which could break with a Turkish locale. (except in the strop module, which is deprecated anyway) files: Modules/_tkinter.c | 4 ++-- Modules/binascii.c | 4 ++-- Modules/unicodedata.c | 4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/Modules/_tkinter.c b/Modules/_tkinter.c --- a/Modules/_tkinter.c +++ b/Modules/_tkinter.c @@ -663,8 +663,8 @@ } strcpy(argv0, className); - if (isupper(Py_CHARMASK(argv0[0]))) - argv0[0] = tolower(Py_CHARMASK(argv0[0])); + if (Py_ISUPPER(Py_CHARMASK(argv0[0]))) + argv0[0] = Py_TOLOWER(Py_CHARMASK(argv0[0])); Tcl_SetVar(v->interp, "argv0", argv0, TCL_GLOBAL_ONLY); ckfree(argv0); diff --git a/Modules/binascii.c b/Modules/binascii.c --- a/Modules/binascii.c +++ b/Modules/binascii.c @@ -1105,8 +1105,8 @@ if (isdigit(c)) return c - '0'; else { - if (isupper(c)) - c = tolower(c); + if (Py_ISUPPER(c)) + c = Py_TOLOWER(c); if (c >= 'a' && c <= 'f') return c - 'a' + 10; } diff --git a/Modules/unicodedata.c b/Modules/unicodedata.c --- a/Modules/unicodedata.c +++ b/Modules/unicodedata.c @@ -830,7 +830,7 @@ unsigned long h = 0; unsigned long ix; for (i = 0; i < len; i++) { - h = (h * scale) + (unsigned char) toupper(Py_CHARMASK(s[i])); + h = (h * scale) + (unsigned char) Py_TOUPPER(Py_CHARMASK(s[i])); ix = h & 0xff000000; if (ix) h = (h ^ ((ix>>24) & 0xff)) & 0x00ffffff; @@ -978,7 +978,7 @@ if (!_getucname(self, code, buffer, sizeof(buffer))) return 0; for (i = 0; i < namelen; i++) { - if (toupper(Py_CHARMASK(name[i])) != buffer[i]) + if (Py_TOUPPER(Py_CHARMASK(name[i])) != buffer[i]) return 0; } return buffer[namelen] == '\0'; -- Repository URL: http://hg.python.org/cpython From python-checkins 
at python.org Tue Oct 4 14:48:08 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 14:48:08 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=283=2E2=29=3A_Try_to_fix_link?= =?utf8?q?ing_failures_under_Windows?= Message-ID: http://hg.python.org/cpython/rev/2484b2b8876e changeset: 72658:2484b2b8876e branch: 3.2 parent: 72655:fe48e2b3dbee user: Antoine Pitrou date: Tue Oct 04 14:43:47 2011 +0200 summary: Try to fix linking failures under Windows files: Include/pyctype.h | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Include/pyctype.h b/Include/pyctype.h --- a/Include/pyctype.h +++ b/Include/pyctype.h @@ -10,7 +10,7 @@ #define PY_CTF_SPACE 0x08 #define PY_CTF_XDIGIT 0x10 -extern const unsigned int _Py_ctype_table[256]; +PyAPI_DATA(const unsigned int) _Py_ctype_table[256]; /* Unlike their C counterparts, the following macros are not meant to * handle an int with any of the values [EOF, 0-UCHAR_MAX]. The argument @@ -23,8 +23,8 @@ #define Py_ISALNUM(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_ALNUM) #define Py_ISSPACE(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_SPACE) -extern const unsigned char _Py_ctype_tolower[256]; -extern const unsigned char _Py_ctype_toupper[256]; +PyAPI_DATA(const unsigned char) _Py_ctype_tolower[256]; +PyAPI_DATA(const unsigned char) _Py_ctype_toupper[256]; #define Py_TOLOWER(c) (_Py_ctype_tolower[Py_CHARMASK(c)]) #define Py_TOUPPER(c) (_Py_ctype_toupper[Py_CHARMASK(c)]) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 14:48:09 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 14:48:09 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Try_to_fix_linking_failures_under_Windows?= Message-ID: http://hg.python.org/cpython/rev/f0dcc71e00ab changeset: 72659:f0dcc71e00ab parent: 72656:969bbc6700a0 parent: 72658:2484b2b8876e user: Antoine Pitrou date: Tue Oct 04 14:44:35 2011 
+0200 summary: Try to fix linking failures under Windows files: Include/pyctype.h | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Include/pyctype.h b/Include/pyctype.h --- a/Include/pyctype.h +++ b/Include/pyctype.h @@ -10,7 +10,7 @@ #define PY_CTF_SPACE 0x08 #define PY_CTF_XDIGIT 0x10 -extern const unsigned int _Py_ctype_table[256]; +PyAPI_DATA(const unsigned int) _Py_ctype_table[256]; /* Unlike their C counterparts, the following macros are not meant to * handle an int with any of the values [EOF, 0-UCHAR_MAX]. The argument @@ -23,8 +23,8 @@ #define Py_ISALNUM(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_ALNUM) #define Py_ISSPACE(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_SPACE) -extern const unsigned char _Py_ctype_tolower[256]; -extern const unsigned char _Py_ctype_toupper[256]; +PyAPI_DATA(const unsigned char) _Py_ctype_tolower[256]; +PyAPI_DATA(const unsigned char) _Py_ctype_toupper[256]; #define Py_TOLOWER(c) (_Py_ctype_tolower[Py_CHARMASK(c)]) #define Py_TOUPPER(c) (_Py_ctype_toupper[Py_CHARMASK(c)]) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 14:49:18 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 14:49:18 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=282=2E7=29=3A_Try_to_fix_link?= =?utf8?q?ing_failures_under_Windows?= Message-ID: http://hg.python.org/cpython/rev/504981afa007 changeset: 72660:504981afa007 branch: 2.7 parent: 72657:60d80880bf88 user: Antoine Pitrou date: Tue Oct 04 14:45:32 2011 +0200 summary: Try to fix linking failures under Windows files: Include/pyctype.h | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Include/pyctype.h b/Include/pyctype.h --- a/Include/pyctype.h +++ b/Include/pyctype.h @@ -9,7 +9,7 @@ #define PY_CTF_SPACE 0x08 #define PY_CTF_XDIGIT 0x10 -extern const unsigned int _Py_ctype_table[256]; +PyAPI_DATA(const unsigned int) _Py_ctype_table[256]; /* Unlike their C counterparts, the 
following macros are not meant to * handle an int with any of the values [EOF, 0-UCHAR_MAX]. The argument @@ -22,8 +22,8 @@ #define Py_ISALNUM(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_ALNUM) #define Py_ISSPACE(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_SPACE) -extern const unsigned char _Py_ctype_tolower[256]; -extern const unsigned char _Py_ctype_toupper[256]; +PyAPI_DATA(const unsigned char) _Py_ctype_tolower[256]; +PyAPI_DATA(const unsigned char) _Py_ctype_toupper[256]; #define Py_TOLOWER(c) (_Py_ctype_tolower[Py_CHARMASK(c)]) #define Py_TOUPPER(c) (_Py_ctype_toupper[Py_CHARMASK(c)]) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 15:59:18 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 15:59:18 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Migrate_str=2Eexpandtabs_to?= =?utf8?q?_the_new_API?= Message-ID: http://hg.python.org/cpython/rev/ab5086539ab9 changeset: 72661:ab5086539ab9 parent: 72659:f0dcc71e00ab user: Antoine Pitrou date: Tue Oct 04 15:55:09 2011 +0200 summary: Migrate str.expandtabs to the new API files: Objects/unicodeobject.c | 95 +++++++++++++--------------- 1 files changed, 43 insertions(+), 52 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -10190,87 +10190,78 @@ static PyObject* unicode_expandtabs(PyUnicodeObject *self, PyObject *args) { - Py_UNICODE *e; - Py_UNICODE *p; - Py_UNICODE *q; - Py_UNICODE *qe; - Py_ssize_t i, j, incr, wstr_length; - PyUnicodeObject *u; + Py_ssize_t i, j, line_pos, src_len, incr; + Py_UCS4 ch; + PyObject *u; + void *src_data, *dest_data; int tabsize = 8; + int kind; if (!PyArg_ParseTuple(args, "|i:expandtabs", &tabsize)) return NULL; - if (PyUnicode_AsUnicodeAndSize((PyObject *)self, &wstr_length) == NULL) - return NULL; - /* First pass: determine size of output string */ - i = 0; /* chars up to and including most recent \n or \r */ - j = 
0; /* chars since most recent \n or \r (use in tab calculations) */ - e = _PyUnicode_WSTR(self) + wstr_length; /* end of input */ - for (p = _PyUnicode_WSTR(self); p < e; p++) - if (*p == '\t') { + src_len = PyUnicode_GET_LENGTH(self); + i = j = line_pos = 0; + kind = PyUnicode_KIND(self); + src_data = PyUnicode_DATA(self); + for (; i < src_len; i++) { + ch = PyUnicode_READ(kind, src_data, i); + if (ch == '\t') { if (tabsize > 0) { - incr = tabsize - (j % tabsize); /* cannot overflow */ + incr = tabsize - (line_pos % tabsize); /* cannot overflow */ if (j > PY_SSIZE_T_MAX - incr) - goto overflow1; + goto overflow; + line_pos += incr; j += incr; } } else { if (j > PY_SSIZE_T_MAX - 1) - goto overflow1; + goto overflow; + line_pos++; j++; - if (*p == '\n' || *p == '\r') { - if (i > PY_SSIZE_T_MAX - j) - goto overflow1; - i += j; - j = 0; - } - } - - if (i > PY_SSIZE_T_MAX - j) - goto overflow1; + if (ch == '\n' || ch == '\r') + line_pos = 0; + } + } /* Second pass: create output string and fill it */ - u = _PyUnicode_New(i + j); + u = PyUnicode_New(j, PyUnicode_MAX_CHAR_VALUE(self)); if (!u) return NULL; - - j = 0; /* same as in first pass */ - q = _PyUnicode_WSTR(u); /* next output char */ - qe = _PyUnicode_WSTR(u) + PyUnicode_GET_SIZE(u); /* end of output */ - - for (p = _PyUnicode_WSTR(self); p < e; p++) - if (*p == '\t') { + dest_data = PyUnicode_DATA(u); + + i = j = line_pos = 0; + + for (; i < src_len; i++) { + ch = PyUnicode_READ(kind, src_data, i); + if (ch == '\t') { if (tabsize > 0) { - i = tabsize - (j % tabsize); - j += i; - while (i--) { - if (q >= qe) - goto overflow2; - *q++ = ' '; + incr = tabsize - (line_pos % tabsize); + line_pos += incr; + while (incr--) { + PyUnicode_WRITE(kind, dest_data, j, ' '); + j++; } } } else { - if (q >= qe) - goto overflow2; - *q++ = *p; + line_pos++; + PyUnicode_WRITE(kind, dest_data, j, ch); j++; - if (*p == '\n' || *p == '\r') - j = 0; - } - - if (_PyUnicode_READY_REPLACE(&u)) { + if (ch == '\n' || ch == '\r') + line_pos 
= 0; + } + } + assert (j == PyUnicode_GET_LENGTH(u)); + if (PyUnicode_READY(u)) { Py_DECREF(u); return NULL; } return (PyObject*) u; - overflow2: - Py_DECREF(u); - overflow1: + overflow: PyErr_SetString(PyExc_OverflowError, "new string is too long"); return NULL; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 15:59:19 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 15:59:19 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Migrate_test=5Fbigmem_to_PE?= =?utf8?q?P_393-compliant_size_calculations_=28hopefully=29?= Message-ID: http://hg.python.org/cpython/rev/9457bd55820d changeset: 72662:9457bd55820d user: Antoine Pitrou date: Tue Oct 04 15:55:44 2011 +0200 summary: Migrate test_bigmem to PEP 393-compliant size calculations (hopefully) files: Lib/test/test_bigmem.py | 54 +++++++++++----------------- 1 files changed, 21 insertions(+), 33 deletions(-) diff --git a/Lib/test/test_bigmem.py b/Lib/test/test_bigmem.py --- a/Lib/test/test_bigmem.py +++ b/Lib/test/test_bigmem.py @@ -62,7 +62,9 @@ # fail as well. I do not know whether it is due to memory fragmentation # issues, or other specifics of the platform malloc() routine. 
-character_size = 4 if sys.maxunicode > 0xFFFF else 2 +ascii_char_size = 1 +ucs2_char_size = 2 +ucs4_char_size = 2 class BaseStrTest: @@ -588,7 +590,6 @@ def basic_encode_test(self, size, enc, c='.', expectedsize=None): if expectedsize is None: expectedsize = size - try: s = c * size self.assertEqual(len(s.encode(enc)), expectedsize) @@ -607,7 +608,7 @@ memuse = meth.memuse except AttributeError: continue - meth.memuse = character_size * memuse + meth.memuse = ascii_char_size * memuse self._adjusted[name] = memuse def tearDown(self): @@ -615,36 +616,36 @@ getattr(type(self), name).memuse = memuse # the utf8 encoder preallocates big time (4x the number of characters) - @bigmemtest(size=_2G + 2, memuse=character_size + 4) + @bigmemtest(size=_2G + 2, memuse=ascii_char_size + 4) def test_encode(self, size): return self.basic_encode_test(size, 'utf-8') - @bigmemtest(size=_4G // 6 + 2, memuse=character_size + 1) + @bigmemtest(size=_4G // 6 + 2, memuse=ascii_char_size + 1) def test_encode_raw_unicode_escape(self, size): try: return self.basic_encode_test(size, 'raw_unicode_escape') except MemoryError: pass # acceptable on 32-bit - @bigmemtest(size=_4G // 5 + 70, memuse=character_size + 1) + @bigmemtest(size=_4G // 5 + 70, memuse=ascii_char_size + 1) def test_encode_utf7(self, size): try: return self.basic_encode_test(size, 'utf7') except MemoryError: pass # acceptable on 32-bit - @bigmemtest(size=_4G // 4 + 5, memuse=character_size + 4) + @bigmemtest(size=_4G // 4 + 5, memuse=ascii_char_size + 4) def test_encode_utf32(self, size): try: - return self.basic_encode_test(size, 'utf32', expectedsize=4*size+4) + return self.basic_encode_test(size, 'utf32', expectedsize=4 * size + 4) except MemoryError: pass # acceptable on 32-bit - @bigmemtest(size=_2G - 1, memuse=character_size + 1) + @bigmemtest(size=_2G - 1, memuse=ascii_char_size + 1) def test_encode_ascii(self, size): return self.basic_encode_test(size, 'ascii', c='A') - @bigmemtest(size=_2G + 10, memuse=character_size * 
2) + @bigmemtest(size=_2G + 10, memuse=ascii_char_size * 2) def test_format(self, size): s = '-' * size sf = '%s' % (s,) @@ -665,7 +666,7 @@ self.assertEqual(s.count('.'), 3) self.assertEqual(s.count('-'), size * 2) - @bigmemtest(size=_2G + 10, memuse=character_size * 2) + @bigmemtest(size=_2G + 10, memuse=ascii_char_size * 2) def test_repr_small(self, size): s = '-' * size s = repr(s) @@ -686,7 +687,7 @@ self.assertEqual(s.count('\\'), size) self.assertEqual(s.count('0'), size * 2) - @bigmemtest(size=_2G + 10, memuse=character_size * 5) + @bigmemtest(size=_2G + 10, memuse=ascii_char_size * 5) def test_repr_large(self, size): s = '\x00' * size s = repr(s) @@ -696,7 +697,7 @@ self.assertEqual(s.count('\\'), size) self.assertEqual(s.count('0'), size * 2) - @bigmemtest(size=_2G // 5 + 1, memuse=character_size * 7) + @bigmemtest(size=_2G // 5 + 1, memuse=ucs2_char_size + ascii_char_size * 6) def test_unicode_repr(self, size): # Use an assigned, but not printable code point. # It is in the range of the low surrogates \uDC00-\uDFFF. @@ -711,9 +712,7 @@ finally: r = s = None - # The character takes 4 bytes even in UCS-2 builds because it will - # be decomposed into surrogates. 
- @bigmemtest(size=_2G // 5 + 1, memuse=4 + character_size * 9) + @bigmemtest(size=_2G // 5 + 1, memuse=ucs4_char_size + ascii_char_size * 10) def test_unicode_repr_wide(self, size): char = "\U0001DCBA" s = char * size @@ -726,24 +725,13 @@ finally: r = s = None - @bigmemtest(size=_4G // 5, memuse=character_size * (6 + 1)) - def _test_unicode_repr_overflow(self, size): - # XXX not sure what this test is about - char = "\uDCBA" - s = char * size - try: - r = repr(s) - self.assertTrue(s == eval(r)) - finally: - r = s = None - class BytesTest(unittest.TestCase, BaseStrTest): def from_latin1(self, s): return s.encode("latin-1") - @bigmemtest(size=_2G + 2, memuse=1 + character_size) + @bigmemtest(size=_2G + 2, memuse=1 + ascii_char_size) def test_decode(self, size): s = self.from_latin1('.') * size self.assertEqual(len(s.decode('utf-8')), size) @@ -754,7 +742,7 @@ def from_latin1(self, s): return bytearray(s.encode("latin-1")) - @bigmemtest(size=_2G + 2, memuse=1 + character_size) + @bigmemtest(size=_2G + 2, memuse=1 + ascii_char_size) def test_decode(self, size): s = self.from_latin1('.') * size self.assertEqual(len(s.decode('utf-8')), size) @@ -894,11 +882,11 @@ self.assertEqual(s[-5:], '0, 0)') self.assertEqual(s.count('0'), size) - @bigmemtest(size=_2G // 3 + 2, memuse=8 + 3 * character_size) + @bigmemtest(size=_2G // 3 + 2, memuse=8 + 3 * ascii_char_size) def test_repr_small(self, size): return self.basic_test_repr(size) - @bigmemtest(size=_2G + 2, memuse=8 + 3 * character_size) + @bigmemtest(size=_2G + 2, memuse=8 + 3 * ascii_char_size) def test_repr_large(self, size): return self.basic_test_repr(size) @@ -1069,11 +1057,11 @@ self.assertEqual(s[-5:], '0, 0]') self.assertEqual(s.count('0'), size) - @bigmemtest(size=_2G // 3 + 2, memuse=8 + 3 * character_size) + @bigmemtest(size=_2G // 3 + 2, memuse=8 + 3 * ascii_char_size) def test_repr_small(self, size): return self.basic_test_repr(size) - @bigmemtest(size=_2G + 2, memuse=8 + 3 * character_size) + 
@bigmemtest(size=_2G + 2, memuse=8 + 3 * ascii_char_size) def test_repr_large(self, size): return self.basic_test_repr(size) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 16:07:38 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 16:07:38 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_When_expandtabs=28=29_would?= =?utf8?q?_be_a_no-op=2C_don=27t_create_a_duplicate_string?= Message-ID: http://hg.python.org/cpython/rev/447f521ac6d9 changeset: 72663:447f521ac6d9 user: Antoine Pitrou date: Tue Oct 04 16:04:01 2011 +0200 summary: When expandtabs() would be a no-op, don't create a duplicate string files: Lib/test/test_unicode.py | 4 ++++ Objects/unicodeobject.c | 7 +++++++ 2 files changed, 11 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_unicode.py b/Lib/test/test_unicode.py --- a/Lib/test/test_unicode.py +++ b/Lib/test/test_unicode.py @@ -1585,6 +1585,10 @@ return self.assertRaises(OverflowError, 't\tt\t'.expandtabs, sys.maxsize) + def test_expandtabs_optimization(self): + s = 'abc' + self.assertIs(s.expandtabs(), s) + def test_raiseMemError(self): if struct.calcsize('P') == 8: # 64 bits pointers diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -10196,6 +10196,7 @@ void *src_data, *dest_data; int tabsize = 8; int kind; + int found; if (!PyArg_ParseTuple(args, "|i:expandtabs", &tabsize)) return NULL; @@ -10205,9 +10206,11 @@ i = j = line_pos = 0; kind = PyUnicode_KIND(self); src_data = PyUnicode_DATA(self); + found = 0; for (; i < src_len; i++) { ch = PyUnicode_READ(kind, src_data, i); if (ch == '\t') { + found = 1; if (tabsize > 0) { incr = tabsize - (line_pos % tabsize); /* cannot overflow */ if (j > PY_SSIZE_T_MAX - incr) @@ -10225,6 +10228,10 @@ line_pos = 0; } } + if (!found && PyUnicode_CheckExact(self)) { + Py_INCREF((PyObject *) self); + return (PyObject *) self; + } /* Second pass: create 
output string and fill it */ u = PyUnicode_New(j, PyUnicode_MAX_CHAR_VALUE(self)); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 16:11:11 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 16:11:11 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_An_embarassing_litle_typo?= Message-ID: http://hg.python.org/cpython/rev/3daf24c9df50 changeset: 72664:3daf24c9df50 user: Antoine Pitrou date: Tue Oct 04 16:07:27 2011 +0200 summary: An embarassing litle typo files: Lib/test/test_bigmem.py | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_bigmem.py b/Lib/test/test_bigmem.py --- a/Lib/test/test_bigmem.py +++ b/Lib/test/test_bigmem.py @@ -64,7 +64,7 @@ ascii_char_size = 1 ucs2_char_size = 2 -ucs4_char_size = 2 +ucs4_char_size = 4 class BaseStrTest: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 16:21:48 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 16:21:48 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Also_fix_pickletester?= Message-ID: http://hg.python.org/cpython/rev/56eb9a509460 changeset: 72665:56eb9a509460 user: Antoine Pitrou date: Tue Oct 04 16:18:15 2011 +0200 summary: Also fix pickletester files: Lib/test/pickletester.py | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/Lib/test/pickletester.py b/Lib/test/pickletester.py --- a/Lib/test/pickletester.py +++ b/Lib/test/pickletester.py @@ -19,7 +19,7 @@ # kind of outer loop. protocols = range(pickle.HIGHEST_PROTOCOL + 1) -character_size = 4 if sys.maxunicode > 0xFFFF else 2 +ascii_char_size = 1 # Return True if opcode code appears in the pickle, else False. @@ -1235,7 +1235,7 @@ # All protocols use 1-byte per printable ASCII character; we add another # byte because the encoded form has to be copied into the internal buffer. 
- @bigmemtest(size=_2G, memuse=2 + character_size, dry_run=False) + @bigmemtest(size=_2G, memuse=2 + ascii_char_size, dry_run=False) def test_huge_str_32b(self, size): data = "abcd" * (size // 4) try: @@ -1252,7 +1252,7 @@ # BINUNICODE (protocols 1, 2 and 3) cannot carry more than # 2**32 - 1 bytes of utf-8 encoded unicode. - @bigmemtest(size=_4G, memuse=1 + character_size, dry_run=False) + @bigmemtest(size=_4G, memuse=1 + ascii_char_size, dry_run=False) def test_huge_str_64b(self, size): data = "a" * size try: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 18:06:23 2011 From: python-checkins at python.org (ezio.melotti) Date: Tue, 04 Oct 2011 18:06:23 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_=2313054=3A_fix_usage_of_sy?= =?utf8?q?s=2Emaxunicode_after_PEP-393=2E?= Message-ID: http://hg.python.org/cpython/rev/f39b26ca7f3d changeset: 72666:f39b26ca7f3d user: Ezio Melotti date: Tue Oct 04 19:06:00 2011 +0300 summary: #13054: fix usage of sys.maxunicode after PEP-393. 
files: Lib/sre_compile.py | 4 +++- Lib/test/test_builtin.py | 3 +-- Lib/test/test_codeccallbacks.py | 16 ++++------------ Lib/test/test_multibytecodec.py | 7 +------ Lib/test/test_unicode.py | 20 ++++---------------- Tools/pybench/pybench.py | 1 + Tools/unicode/comparecodecs.py | 2 +- 7 files changed, 15 insertions(+), 38 deletions(-) diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py --- a/Lib/sre_compile.py +++ b/Lib/sre_compile.py @@ -318,11 +318,13 @@ # XXX: could expand category return charset # cannot compress except IndexError: - # non-BMP characters + # non-BMP characters; XXX now they should work return charset if negate: if sys.maxunicode != 65535: # XXX: negation does not work with big charsets + # XXX2: now they should work, but removing this will make the + # charmap 17 times bigger return charset for i in range(65536): charmap[i] = not charmap[i] diff --git a/Lib/test/test_builtin.py b/Lib/test/test_builtin.py --- a/Lib/test/test_builtin.py +++ b/Lib/test/test_builtin.py @@ -249,8 +249,7 @@ self.assertEqual(chr(0xff), '\xff') self.assertRaises(ValueError, chr, 1<<24) self.assertEqual(chr(sys.maxunicode), - str(('\\U%08x' % (sys.maxunicode)).encode("ascii"), - 'unicode-escape')) + str('\\U0010ffff'.encode("ascii"), 'unicode-escape')) self.assertRaises(TypeError, chr) self.assertEqual(chr(0x0000FFFF), "\U0000FFFF") self.assertEqual(chr(0x00010000), "\U00010000") diff --git a/Lib/test/test_codeccallbacks.py b/Lib/test/test_codeccallbacks.py --- a/Lib/test/test_codeccallbacks.py +++ b/Lib/test/test_codeccallbacks.py @@ -138,22 +138,14 @@ def test_backslashescape(self): # Does the same as the "unicode-escape" encoding, but with different # base encodings. 
- sin = "a\xac\u1234\u20ac\u8000" - if sys.maxunicode > 0xffff: - sin += chr(sys.maxunicode) - sout = b"a\\xac\\u1234\\u20ac\\u8000" - if sys.maxunicode > 0xffff: - sout += bytes("\\U%08x" % sys.maxunicode, "ascii") + sin = "a\xac\u1234\u20ac\u8000\U0010ffff" + sout = b"a\\xac\\u1234\\u20ac\\u8000\\U0010ffff" self.assertEqual(sin.encode("ascii", "backslashreplace"), sout) - sout = b"a\xac\\u1234\\u20ac\\u8000" - if sys.maxunicode > 0xffff: - sout += bytes("\\U%08x" % sys.maxunicode, "ascii") + sout = b"a\xac\\u1234\\u20ac\\u8000\\U0010ffff" self.assertEqual(sin.encode("latin-1", "backslashreplace"), sout) - sout = b"a\xac\\u1234\xa4\\u8000" - if sys.maxunicode > 0xffff: - sout += bytes("\\U%08x" % sys.maxunicode, "ascii") + sout = b"a\xac\\u1234\xa4\\u8000\\U0010ffff" self.assertEqual(sin.encode("iso-8859-15", "backslashreplace"), sout) def test_decoding_callbacks(self): diff --git a/Lib/test/test_multibytecodec.py b/Lib/test/test_multibytecodec.py --- a/Lib/test/test_multibytecodec.py +++ b/Lib/test/test_multibytecodec.py @@ -247,14 +247,9 @@ self.assertFalse(any(x > 0x80 for x in e)) def test_bug1572832(self): - if sys.maxunicode >= 0x10000: - myunichr = chr - else: - myunichr = lambda x: chr(0xD7C0+(x>>10)) + chr(0xDC00+(x&0x3FF)) - for x in range(0x10000, 0x110000): # Any ISO 2022 codec will cause the segfault - myunichr(x).encode('iso_2022_jp', 'ignore') + chr(x).encode('iso_2022_jp', 'ignore') class TestStateful(unittest.TestCase): text = '\u4E16\u4E16' diff --git a/Lib/test/test_unicode.py b/Lib/test/test_unicode.py --- a/Lib/test/test_unicode.py +++ b/Lib/test/test_unicode.py @@ -13,10 +13,6 @@ from test import support, string_tests import _string -# decorator to skip tests on narrow builds -requires_wide_build = unittest.skipIf(sys.maxunicode == 65535, - 'requires wide build') - # Error handling (bad decoder return) def search_function(encoding): def decode1(input, errors="strict"): @@ -519,7 +515,6 @@ self.assertFalse(meth(s), '%a.%s() is False' % (s, 
meth_name)) - @requires_wide_build def test_lower(self): string_tests.CommonTest.test_lower(self) self.assertEqual('\U00010427'.lower(), '\U0001044F') @@ -530,7 +525,6 @@ self.assertEqual('X\U00010427x\U0001044F'.lower(), 'x\U0001044Fx\U0001044F') - @requires_wide_build def test_upper(self): string_tests.CommonTest.test_upper(self) self.assertEqual('\U0001044F'.upper(), '\U00010427') @@ -541,7 +535,6 @@ self.assertEqual('X\U00010427x\U0001044F'.upper(), 'X\U00010427X\U00010427') - @requires_wide_build def test_capitalize(self): string_tests.CommonTest.test_capitalize(self) self.assertEqual('\U0001044F'.capitalize(), '\U00010427') @@ -554,7 +547,6 @@ self.assertEqual('X\U00010427x\U0001044F'.capitalize(), 'X\U0001044Fx\U0001044F') - @requires_wide_build def test_title(self): string_tests.MixinStrUnicodeUserStringTest.test_title(self) self.assertEqual('\U0001044F'.title(), '\U00010427') @@ -569,7 +561,6 @@ self.assertEqual('X\U00010427x\U0001044F X\U00010427x\U0001044F'.title(), 'X\U0001044Fx\U0001044F X\U0001044Fx\U0001044F') - @requires_wide_build def test_swapcase(self): string_tests.CommonTest.test_swapcase(self) self.assertEqual('\U0001044F'.swapcase(), '\U00010427') @@ -1114,15 +1105,12 @@ def test_codecs_utf8(self): self.assertEqual(''.encode('utf-8'), b'') self.assertEqual('\u20ac'.encode('utf-8'), b'\xe2\x82\xac') - if sys.maxunicode == 65535: - self.assertEqual('\ud800\udc02'.encode('utf-8'), b'\xf0\x90\x80\x82') - self.assertEqual('\ud84d\udc56'.encode('utf-8'), b'\xf0\xa3\x91\x96') + self.assertEqual('\U00010002'.encode('utf-8'), b'\xf0\x90\x80\x82') + self.assertEqual('\U00023456'.encode('utf-8'), b'\xf0\xa3\x91\x96') self.assertEqual('\ud800'.encode('utf-8', 'surrogatepass'), b'\xed\xa0\x80') self.assertEqual('\udc00'.encode('utf-8', 'surrogatepass'), b'\xed\xb0\x80') - if sys.maxunicode == 65535: - self.assertEqual( - ('\ud800\udc02'*1000).encode('utf-8'), - b'\xf0\x90\x80\x82'*1000) + self.assertEqual(('\U00010002'*10).encode('utf-8'), + 
b'\xf0\x90\x80\x82'*10) self.assertEqual( '\u6b63\u78ba\u306b\u8a00\u3046\u3068\u7ffb\u8a33\u306f' '\u3055\u308c\u3066\u3044\u307e\u305b\u3093\u3002\u4e00' diff --git a/Tools/pybench/pybench.py b/Tools/pybench/pybench.py --- a/Tools/pybench/pybench.py +++ b/Tools/pybench/pybench.py @@ -107,6 +107,7 @@ print('Getting machine details...') buildno, builddate = platform.python_build() python = platform.python_version() + # XXX this is now always UCS4, maybe replace it with 'PEP393' in 3.3+? if sys.maxunicode == 65535: # UCS2 build (standard) unitype = 'UCS2' diff --git a/Tools/unicode/comparecodecs.py b/Tools/unicode/comparecodecs.py --- a/Tools/unicode/comparecodecs.py +++ b/Tools/unicode/comparecodecs.py @@ -14,7 +14,7 @@ print('Comparing encoding/decoding of %r and %r' % (encoding1, encoding2)) mismatch = 0 # Check encoding - for i in range(sys.maxunicode): + for i in range(sys.maxunicode+1): u = chr(i) try: c1 = u.encode(encoding1) -- Repository URL: http://hg.python.org/cpython From martin at v.loewis.de Tue Oct 4 18:45:55 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 04 Oct 2011 18:45:55 +0200 Subject: [Python-checkins] cpython: Migrate str.expandtabs to the new API In-Reply-To: References: Message-ID: <4E8B3843.4030505@v.loewis.de> > Migrate str.expandtabs to the new API This needs if (PyUnicode_READY(self) == -1) return NULL; right after the ParseTuple call. In most cases, the check will be a noop. But if it's not, omitting it will make expandtabs have no effect, since the string length will be 0 (in a debug build, you also get an assertion failure). 
Regards, Martin From python-checkins at python.org Tue Oct 4 19:15:57 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 19:15:57 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Optimize_string_slicing_to_?= =?utf8?q?use_the_new_API?= Message-ID: http://hg.python.org/cpython/rev/1b4f886dc9e2 changeset: 72667:1b4f886dc9e2 parent: 72665:56eb9a509460 user: Antoine Pitrou date: Tue Oct 04 19:08:01 2011 +0200 summary: Optimize string slicing to use the new API files: Objects/unicodeobject.c | 36 +++++++++++++--------------- 1 files changed, 17 insertions(+), 19 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -12253,9 +12253,9 @@ return unicode_getitem((PyObject*)self, i); } else if (PySlice_Check(item)) { Py_ssize_t start, stop, step, slicelength, cur, i; - const Py_UNICODE* source_buf; - Py_UNICODE* result_buf; - PyObject* result; + PyObject *result; + void *src_data, *dest_data; + int kind; if (PySlice_GetIndicesEx(item, PyUnicode_GET_LENGTH(self), &start, &stop, &step, &slicelength) < 0) { @@ -12272,22 +12272,20 @@ } else if (step == 1) { return PyUnicode_Substring((PyObject*)self, start, start + slicelength); - } else { - source_buf = PyUnicode_AS_UNICODE((PyObject*)self); - result_buf = (Py_UNICODE *)PyObject_MALLOC(slicelength* - sizeof(Py_UNICODE)); - - if (result_buf == NULL) - return PyErr_NoMemory(); - - for (cur = start, i = 0; i < slicelength; cur += step, i++) { - result_buf[i] = source_buf[cur]; - } - - result = PyUnicode_FromUnicode(result_buf, slicelength); - PyObject_FREE(result_buf); - return result; - } + } + /* General (less optimized) case */ + result = PyUnicode_New(slicelength, PyUnicode_MAX_CHAR_VALUE(self)); + if (result == NULL) + return NULL; + kind = PyUnicode_KIND(self); + src_data = PyUnicode_DATA(self); + dest_data = PyUnicode_DATA(result); + + for (cur = start, i = 0; i < slicelength; cur += step, i++) { + Py_UCS4 
ch = PyUnicode_READ(kind, src_data, cur); + PyUnicode_WRITE(kind, dest_data, i, ch); + } + return result; } else { PyErr_SetString(PyExc_TypeError, "string indices must be integers"); return NULL; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 19:15:58 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 19:15:58 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Add_a_necessary_call_to_PyU?= =?utf8?q?nicode=5FREADY=28=29_=28followup_to_ab5086539ab9=29?= Message-ID: http://hg.python.org/cpython/rev/e6cc71820bf3 changeset: 72668:e6cc71820bf3 user: Antoine Pitrou date: Tue Oct 04 19:10:51 2011 +0200 summary: Add a necessary call to PyUnicode_READY() (followup to ab5086539ab9) files: Objects/unicodeobject.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -10201,6 +10201,9 @@ if (!PyArg_ParseTuple(args, "|i:expandtabs", &tabsize)) return NULL; + if (PyUnicode_READY(self) == -1) + return NULL; + /* First pass: determine size of output string */ src_len = PyUnicode_GET_LENGTH(self); i = j = line_pos = 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 19:15:58 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 19:15:58 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_default_-=3E_default?= =?utf8?q?=29=3A_Merge?= Message-ID: http://hg.python.org/cpython/rev/ec6ee2a82583 changeset: 72669:ec6ee2a82583 parent: 72668:e6cc71820bf3 parent: 72666:f39b26ca7f3d user: Antoine Pitrou date: Tue Oct 04 19:11:34 2011 +0200 summary: Merge files: Lib/sre_compile.py | 4 +++- Lib/test/test_builtin.py | 3 +-- Lib/test/test_codeccallbacks.py | 16 ++++------------ Lib/test/test_multibytecodec.py | 7 +------ Lib/test/test_unicode.py | 20 ++++---------------- Tools/pybench/pybench.py | 1 + 
Tools/unicode/comparecodecs.py | 2 +- 7 files changed, 15 insertions(+), 38 deletions(-) diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py --- a/Lib/sre_compile.py +++ b/Lib/sre_compile.py @@ -318,11 +318,13 @@ # XXX: could expand category return charset # cannot compress except IndexError: - # non-BMP characters + # non-BMP characters; XXX now they should work return charset if negate: if sys.maxunicode != 65535: # XXX: negation does not work with big charsets + # XXX2: now they should work, but removing this will make the + # charmap 17 times bigger return charset for i in range(65536): charmap[i] = not charmap[i] diff --git a/Lib/test/test_builtin.py b/Lib/test/test_builtin.py --- a/Lib/test/test_builtin.py +++ b/Lib/test/test_builtin.py @@ -249,8 +249,7 @@ self.assertEqual(chr(0xff), '\xff') self.assertRaises(ValueError, chr, 1<<24) self.assertEqual(chr(sys.maxunicode), - str(('\\U%08x' % (sys.maxunicode)).encode("ascii"), - 'unicode-escape')) + str('\\U0010ffff'.encode("ascii"), 'unicode-escape')) self.assertRaises(TypeError, chr) self.assertEqual(chr(0x0000FFFF), "\U0000FFFF") self.assertEqual(chr(0x00010000), "\U00010000") diff --git a/Lib/test/test_codeccallbacks.py b/Lib/test/test_codeccallbacks.py --- a/Lib/test/test_codeccallbacks.py +++ b/Lib/test/test_codeccallbacks.py @@ -138,22 +138,14 @@ def test_backslashescape(self): # Does the same as the "unicode-escape" encoding, but with different # base encodings. 
- sin = "a\xac\u1234\u20ac\u8000" - if sys.maxunicode > 0xffff: - sin += chr(sys.maxunicode) - sout = b"a\\xac\\u1234\\u20ac\\u8000" - if sys.maxunicode > 0xffff: - sout += bytes("\\U%08x" % sys.maxunicode, "ascii") + sin = "a\xac\u1234\u20ac\u8000\U0010ffff" + sout = b"a\\xac\\u1234\\u20ac\\u8000\\U0010ffff" self.assertEqual(sin.encode("ascii", "backslashreplace"), sout) - sout = b"a\xac\\u1234\\u20ac\\u8000" - if sys.maxunicode > 0xffff: - sout += bytes("\\U%08x" % sys.maxunicode, "ascii") + sout = b"a\xac\\u1234\\u20ac\\u8000\\U0010ffff" self.assertEqual(sin.encode("latin-1", "backslashreplace"), sout) - sout = b"a\xac\\u1234\xa4\\u8000" - if sys.maxunicode > 0xffff: - sout += bytes("\\U%08x" % sys.maxunicode, "ascii") + sout = b"a\xac\\u1234\xa4\\u8000\\U0010ffff" self.assertEqual(sin.encode("iso-8859-15", "backslashreplace"), sout) def test_decoding_callbacks(self): diff --git a/Lib/test/test_multibytecodec.py b/Lib/test/test_multibytecodec.py --- a/Lib/test/test_multibytecodec.py +++ b/Lib/test/test_multibytecodec.py @@ -247,14 +247,9 @@ self.assertFalse(any(x > 0x80 for x in e)) def test_bug1572832(self): - if sys.maxunicode >= 0x10000: - myunichr = chr - else: - myunichr = lambda x: chr(0xD7C0+(x>>10)) + chr(0xDC00+(x&0x3FF)) - for x in range(0x10000, 0x110000): # Any ISO 2022 codec will cause the segfault - myunichr(x).encode('iso_2022_jp', 'ignore') + chr(x).encode('iso_2022_jp', 'ignore') class TestStateful(unittest.TestCase): text = '\u4E16\u4E16' diff --git a/Lib/test/test_unicode.py b/Lib/test/test_unicode.py --- a/Lib/test/test_unicode.py +++ b/Lib/test/test_unicode.py @@ -13,10 +13,6 @@ from test import support, string_tests import _string -# decorator to skip tests on narrow builds -requires_wide_build = unittest.skipIf(sys.maxunicode == 65535, - 'requires wide build') - # Error handling (bad decoder return) def search_function(encoding): def decode1(input, errors="strict"): @@ -519,7 +515,6 @@ self.assertFalse(meth(s), '%a.%s() is False' % (s, 
meth_name)) - @requires_wide_build def test_lower(self): string_tests.CommonTest.test_lower(self) self.assertEqual('\U00010427'.lower(), '\U0001044F') @@ -530,7 +525,6 @@ self.assertEqual('X\U00010427x\U0001044F'.lower(), 'x\U0001044Fx\U0001044F') - @requires_wide_build def test_upper(self): string_tests.CommonTest.test_upper(self) self.assertEqual('\U0001044F'.upper(), '\U00010427') @@ -541,7 +535,6 @@ self.assertEqual('X\U00010427x\U0001044F'.upper(), 'X\U00010427X\U00010427') - @requires_wide_build def test_capitalize(self): string_tests.CommonTest.test_capitalize(self) self.assertEqual('\U0001044F'.capitalize(), '\U00010427') @@ -554,7 +547,6 @@ self.assertEqual('X\U00010427x\U0001044F'.capitalize(), 'X\U0001044Fx\U0001044F') - @requires_wide_build def test_title(self): string_tests.MixinStrUnicodeUserStringTest.test_title(self) self.assertEqual('\U0001044F'.title(), '\U00010427') @@ -569,7 +561,6 @@ self.assertEqual('X\U00010427x\U0001044F X\U00010427x\U0001044F'.title(), 'X\U0001044Fx\U0001044F X\U0001044Fx\U0001044F') - @requires_wide_build def test_swapcase(self): string_tests.CommonTest.test_swapcase(self) self.assertEqual('\U0001044F'.swapcase(), '\U00010427') @@ -1114,15 +1105,12 @@ def test_codecs_utf8(self): self.assertEqual(''.encode('utf-8'), b'') self.assertEqual('\u20ac'.encode('utf-8'), b'\xe2\x82\xac') - if sys.maxunicode == 65535: - self.assertEqual('\ud800\udc02'.encode('utf-8'), b'\xf0\x90\x80\x82') - self.assertEqual('\ud84d\udc56'.encode('utf-8'), b'\xf0\xa3\x91\x96') + self.assertEqual('\U00010002'.encode('utf-8'), b'\xf0\x90\x80\x82') + self.assertEqual('\U00023456'.encode('utf-8'), b'\xf0\xa3\x91\x96') self.assertEqual('\ud800'.encode('utf-8', 'surrogatepass'), b'\xed\xa0\x80') self.assertEqual('\udc00'.encode('utf-8', 'surrogatepass'), b'\xed\xb0\x80') - if sys.maxunicode == 65535: - self.assertEqual( - ('\ud800\udc02'*1000).encode('utf-8'), - b'\xf0\x90\x80\x82'*1000) + self.assertEqual(('\U00010002'*10).encode('utf-8'), + 
b'\xf0\x90\x80\x82'*10) self.assertEqual( '\u6b63\u78ba\u306b\u8a00\u3046\u3068\u7ffb\u8a33\u306f' '\u3055\u308c\u3066\u3044\u307e\u305b\u3093\u3002\u4e00' diff --git a/Tools/pybench/pybench.py b/Tools/pybench/pybench.py --- a/Tools/pybench/pybench.py +++ b/Tools/pybench/pybench.py @@ -107,6 +107,7 @@ print('Getting machine details...') buildno, builddate = platform.python_build() python = platform.python_version() + # XXX this is now always UCS4, maybe replace it with 'PEP393' in 3.3+? if sys.maxunicode == 65535: # UCS2 build (standard) unitype = 'UCS2' diff --git a/Tools/unicode/comparecodecs.py b/Tools/unicode/comparecodecs.py --- a/Tools/unicode/comparecodecs.py +++ b/Tools/unicode/comparecodecs.py @@ -14,7 +14,7 @@ print('Comparing encoding/decoding of %r and %r' % (encoding1, encoding2)) mismatch = 0 # Check encoding - for i in range(sys.maxunicode): + for i in range(sys.maxunicode+1): u = chr(i) try: c1 = u.encode(encoding1) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 19:17:50 2011 From: python-checkins at python.org (charles-francois.natali) Date: Tue, 04 Oct 2011 19:17:50 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogSXNzdWUgIzExOTU2?= =?utf8?q?=3A_Skip_test=5Fimport=2Etest=5Funwritable=5Fdirectory_on_FreeBS?= =?utf8?q?D_when_run_as?= Message-ID: http://hg.python.org/cpython/rev/7697223df6df changeset: 72670:7697223df6df branch: 3.2 parent: 72658:2484b2b8876e user: Charles-François Natali date: Tue Oct 04 19:17:26 2011 +0200 summary: Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as root (directory permissions are ignored). 
files: Lib/test/test_import.py | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_import.py b/Lib/test/test_import.py --- a/Lib/test/test_import.py +++ b/Lib/test/test_import.py @@ -4,6 +4,7 @@ from importlib.test.import_ import util as importlib_util import marshal import os +import platform import py_compile import random import stat @@ -546,6 +547,8 @@ @unittest.skipUnless(os.name == 'posix', "test meaningful only on posix systems") + @unittest.skipIf(platform.system() == 'FreeBSD' and os.geteuid() == 0, + "due to non-standard filesystem permission semantics (issue #11956)") def test_unwritable_directory(self): # When the umask causes the new __pycache__ directory to be # unwritable, the import still succeeds but no .pyc file is written. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 19:17:51 2011 From: python-checkins at python.org (charles-francois.natali) Date: Tue, 04 Oct 2011 19:17:51 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Issue_=2311956=3A_Skip_test=5Fimport=2Etest=5Funwritable=5Fd?= =?utf8?q?irectory_on_FreeBSD_when_run_as?= Message-ID: http://hg.python.org/cpython/rev/58870fe9a604 changeset: 72671:58870fe9a604 parent: 72666:f39b26ca7f3d parent: 72670:7697223df6df user: Charles-François Natali date: Tue Oct 04 19:19:21 2011 +0200 summary: Issue #11956: Skip test_import.test_unwritable_directory on FreeBSD when run as root (directory permissions are ignored). 
files: Lib/test/test_import.py | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_import.py b/Lib/test/test_import.py --- a/Lib/test/test_import.py +++ b/Lib/test/test_import.py @@ -4,6 +4,7 @@ from importlib.test.import_ import util as importlib_util import marshal import os +import platform import py_compile import random import stat @@ -544,6 +545,8 @@ @unittest.skipUnless(os.name == 'posix', "test meaningful only on posix systems") + @unittest.skipIf(platform.system() == 'FreeBSD' and os.geteuid() == 0, + "due to non-standard filesystem permission semantics (issue #11956)") def test_unwritable_directory(self): # When the umask causes the new __pycache__ directory to be # unwritable, the import still succeeds but no .pyc file is written. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 19:17:52 2011 From: python-checkins at python.org (charles-francois.natali) Date: Tue, 04 Oct 2011 19:17:52 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_default_-=3E_default?= =?utf8?b?KTogTWVyZ2Uu?= Message-ID: http://hg.python.org/cpython/rev/eb6a33791fdf changeset: 72672:eb6a33791fdf parent: 72671:58870fe9a604 parent: 72669:ec6ee2a82583 user: Charles-François Natali date: Tue Oct 04 19:20:52 2011 +0200 summary: Merge. 
files: Objects/unicodeobject.c | 39 ++++++++++++++-------------- 1 files changed, 20 insertions(+), 19 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -10201,6 +10201,9 @@ if (!PyArg_ParseTuple(args, "|i:expandtabs", &tabsize)) return NULL; + if (PyUnicode_READY(self) == -1) + return NULL; + /* First pass: determine size of output string */ src_len = PyUnicode_GET_LENGTH(self); i = j = line_pos = 0; @@ -12253,9 +12256,9 @@ return unicode_getitem((PyObject*)self, i); } else if (PySlice_Check(item)) { Py_ssize_t start, stop, step, slicelength, cur, i; - const Py_UNICODE* source_buf; - Py_UNICODE* result_buf; - PyObject* result; + PyObject *result; + void *src_data, *dest_data; + int kind; if (PySlice_GetIndicesEx(item, PyUnicode_GET_LENGTH(self), &start, &stop, &step, &slicelength) < 0) { @@ -12272,22 +12275,20 @@ } else if (step == 1) { return PyUnicode_Substring((PyObject*)self, start, start + slicelength); - } else { - source_buf = PyUnicode_AS_UNICODE((PyObject*)self); - result_buf = (Py_UNICODE *)PyObject_MALLOC(slicelength* - sizeof(Py_UNICODE)); - - if (result_buf == NULL) - return PyErr_NoMemory(); - - for (cur = start, i = 0; i < slicelength; cur += step, i++) { - result_buf[i] = source_buf[cur]; - } - - result = PyUnicode_FromUnicode(result_buf, slicelength); - PyObject_FREE(result_buf); - return result; - } + } + /* General (less optimized) case */ + result = PyUnicode_New(slicelength, PyUnicode_MAX_CHAR_VALUE(self)); + if (result == NULL) + return NULL; + kind = PyUnicode_KIND(self); + src_data = PyUnicode_DATA(self); + dest_data = PyUnicode_DATA(result); + + for (cur = start, i = 0; i < slicelength; cur += step, i++) { + Py_UCS4 ch = PyUnicode_READ(kind, src_data, cur); + PyUnicode_WRITE(kind, dest_data, i, ch); + } + return result; } else { PyErr_SetString(PyExc_TypeError, "string indices must be integers"); return NULL; -- Repository URL: 
http://hg.python.org/cpython From martin at v.loewis.de Tue Oct 4 19:49:09 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 04 Oct 2011 19:49:09 +0200 Subject: [Python-checkins] cpython: Optimize string slicing to use the new API In-Reply-To: References: Message-ID: <4E8B4715.6020907@v.loewis.de> > + result = PyUnicode_New(slicelength, PyUnicode_MAX_CHAR_VALUE(self)); This is incorrect: the maxchar of the slice might be smaller than the maxchar of the input string. So you'll need to iterate over the input string first, compute the maxchar, and then allocate the result string. Or you allocate a temporary buffer of (1<<(kind-1)) * slicelength bytes, copy the slice, allocate the target object with PyUnicode_FromKindAndData, and release the temporary buffer. Regards, Martin From solipsis at pitrou.net Tue Oct 4 19:50:30 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 4 Oct 2011 19:50:30 +0200 Subject: [Python-checkins] cpython: Optimize string slicing to use the new API References: <4E8B4715.6020907@v.loewis.de> Message-ID: <20111004195030.6ecaf999@pitrou.net> On Tue, 04 Oct 2011 19:49:09 +0200 "Martin v. Löwis" wrote: > > + result = PyUnicode_New(slicelength, PyUnicode_MAX_CHAR_VALUE(self)); > > This is incorrect: the maxchar of the slice might be smaller than the > maxchar of the input string. I thought that heuristic would be good enough. I'll try to fix it. Regards Antoine. 
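The canonical-form requirement Martin describes can be seen from pure Python on any interpreter with the PEP 393 representation (CPython 3.3 and later). A minimal sketch, relying only on `sys.getsizeof`; the absolute byte counts are implementation details, but the comparisons hold because a slice is always re-canonicalized to the narrowest storage kind:

```python
import sys

# Under PEP 393 a str is stored with 1, 2, or 4 bytes per character,
# chosen from the highest code point the string actually contains.
wide = "abc\U0001D11E"   # the non-BMP char forces 4 bytes per character

# Slicing away the wide character must yield a canonical
# 1-byte-per-character string, not one that inherits the 4-byte kind
# of its source.
narrow = wide[:3]

assert narrow == "abc"
# Canonical form: equal strings have identical storage, so the slice is
# exactly as big as a freshly built ASCII string of the same length.
assert sys.getsizeof(narrow) == sys.getsizeof("abc")

# The original still pays 4 bytes per character, so it outweighs an
# ASCII-only string of the same length.
assert sys.getsizeof(wide) > sys.getsizeof("abcd")
```

If slices kept their source's kind instead, `narrow` and a literal `"abc"` could hold the same characters in different storage kinds and, in the implementation discussed above, compare unequal.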
From python-checkins at python.org Tue Oct 4 20:08:56 2011 From: python-checkins at python.org (antoine.pitrou) Date: Tue, 04 Oct 2011 20:08:56 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_na=C3=AFve_heuristic_in?= =?utf8?q?_unicode_slicing_=28followup_to_1b4f886dc9e2=29?= Message-ID: http://hg.python.org/cpython/rev/981deff56707 changeset: 72673:981deff56707 user: Antoine Pitrou date: Tue Oct 04 20:00:49 2011 +0200 summary: Fix naïve heuristic in unicode slicing (followup to 1b4f886dc9e2) files: Objects/unicodeobject.c | 22 +++++++++++++++------- 1 files changed, 15 insertions(+), 7 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -12258,7 +12258,8 @@ Py_ssize_t start, stop, step, slicelength, cur, i; PyObject *result; void *src_data, *dest_data; - int kind; + int src_kind, dest_kind; + Py_UCS4 ch, max_char; if (PySlice_GetIndicesEx(item, PyUnicode_GET_LENGTH(self), &start, &stop, &step, &slicelength) < 0) { @@ -12276,17 +12277,24 @@ return PyUnicode_Substring((PyObject*)self, start, start + slicelength); } - /* General (less optimized) case */ - result = PyUnicode_New(slicelength, PyUnicode_MAX_CHAR_VALUE(self)); + /* General case */ + max_char = 127; + src_kind = PyUnicode_KIND(self); + src_data = PyUnicode_DATA(self); + for (cur = start, i = 0; i < slicelength; cur += step, i++) { + ch = PyUnicode_READ(src_kind, src_data, cur); + if (ch > max_char) + max_char = ch; + } + result = PyUnicode_New(slicelength, max_char); if (result == NULL) return NULL; - kind = PyUnicode_KIND(self); - src_data = PyUnicode_DATA(self); + dest_kind = PyUnicode_KIND(result); + dest_data = PyUnicode_DATA(result); for (cur = start, i = 0; i < slicelength; cur += step, i++) { - Py_UCS4 ch = PyUnicode_READ(kind, src_data, cur); - PyUnicode_WRITE(kind, dest_data, i, ch); + Py_UCS4 ch = PyUnicode_READ(src_kind, src_data, cur); + PyUnicode_WRITE(dest_kind, dest_data, i, ch); } return result; } 
else { -- Repository URL: http://hg.python.org/cpython From martin at v.loewis.de Tue Oct 4 20:09:06 2011 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 04 Oct 2011 20:09:06 +0200 Subject: [Python-checkins] cpython: Optimize string slicing to use the new API In-Reply-To: <20111004195030.6ecaf999@pitrou.net> References: <4E8B4715.6020907@v.loewis.de> <20111004195030.6ecaf999@pitrou.net> Message-ID: <4E8B4BC2.8030804@v.loewis.de> On 04.10.11 19:50, Antoine Pitrou wrote: > On Tue, 04 Oct 2011 19:49:09 +0200 > "Martin v. Löwis" wrote: > >>> + result = PyUnicode_New(slicelength, PyUnicode_MAX_CHAR_VALUE(self)); >> >> This is incorrect: the maxchar of the slice might be smaller than the >> maxchar of the input string. > > I thought that heuristic would be good enough. I'll try to fix it. No - strings must always be in the canonical form. For example, PyUnicode_RichCompare considers strings unequal if they have different kinds. As a consequence, your slice result may not compare equal to a canonical variant of itself. From python-checkins at python.org Tue Oct 4 20:40:50 2011 From: python-checkins at python.org (charles-francois.natali) Date: Tue, 04 Oct 2011 20:40:50 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogSXNzdWUgIzExOTU2?= =?utf8?q?=3A_Always_skip_test=5Fimport=2Etest=5Funwritable=5Fdirectory_wh?= =?utf8?q?en_run_as?= Message-ID: http://hg.python.org/cpython/rev/cbda512c6d7f changeset: 72674:cbda512c6d7f branch: 3.2 parent: 72670:7697223df6df user: Charles-François Natali date: Tue Oct 04 20:40:58 2011 +0200 summary: Issue #11956: Always skip test_import.test_unwritable_directory when run as root, since the semantics varies across Unix variants. 
files: Lib/test/test_import.py | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Lib/test/test_import.py b/Lib/test/test_import.py --- a/Lib/test/test_import.py +++ b/Lib/test/test_import.py @@ -547,8 +547,8 @@ @unittest.skipUnless(os.name == 'posix', "test meaningful only on posix systems") - @unittest.skipIf(platform.system() == 'FreeBSD' and os.geteuid() == 0, - "due to non-standard filesystem permission semantics (issue #11956)") + @unittest.skipIf(os.geteuid() == 0, + "due to varying filesystem permission semantics (issue #11956)") def test_unwritable_directory(self): # When the umask causes the new __pycache__ directory to be # unwritable, the import still succeeds but no .pyc file is written. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 20:40:51 2011 From: python-checkins at python.org (charles-francois.natali) Date: Tue, 04 Oct 2011 20:40:51 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Issue_=2311956=3A_Always_skip_test=5Fimport=2Etest=5Funwrita?= =?utf8?q?ble=5Fdirectory_when_run_as?= Message-ID: http://hg.python.org/cpython/rev/971093a75613 changeset: 72675:971093a75613 parent: 72673:981deff56707 parent: 72674:cbda512c6d7f user: Charles-François Natali date: Tue Oct 04 20:41:52 2011 +0200 summary: Issue #11956: Always skip test_import.test_unwritable_directory when run as root, since the semantics varies across Unix variants. 
files: Lib/test/test_import.py | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Lib/test/test_import.py b/Lib/test/test_import.py --- a/Lib/test/test_import.py +++ b/Lib/test/test_import.py @@ -545,8 +545,8 @@ @unittest.skipUnless(os.name == 'posix', "test meaningful only on posix systems") - @unittest.skipIf(platform.system() == 'FreeBSD' and os.geteuid() == 0, - "due to non-standard filesystem permission semantics (issue #11956)") + @unittest.skipIf(os.geteuid() == 0, + "due to varying filesystem permission semantics (issue #11956)") def test_unwritable_directory(self): # When the umask causes the new __pycache__ directory to be # unwritable, the import still succeeds but no .pyc file is written. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 20:52:53 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 04 Oct 2011 20:52:53 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Add_assertion_to_=5FPy=5FRe?= =?utf8?q?leaseInternedUnicodeStrings=28=29_if_READY_fails?= Message-ID: http://hg.python.org/cpython/rev/3e721c405093 changeset: 72676:3e721c405093 user: Victor Stinner date: Tue Oct 04 20:04:52 2011 +0200 summary: Add assertion to _Py_ReleaseInternedUnicodeStrings() if READY fails files: Objects/unicodeobject.c | 6 ++++-- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -13131,7 +13131,7 @@ if (PyUnicode_CHECK_INTERNED(s)) return; if (_PyUnicode_READY_REPLACE(p)) { - assert(0 && "PyUnicode_READY fail in PyUnicode_InternInPlace"); + assert(0 && "_PyUnicode_READY_REPLACE fail in PyUnicode_InternInPlace"); return; } s = (PyUnicodeObject *)(*p); @@ -13217,8 +13217,10 @@ n); for (i = 0; i < n; i++) { s = (PyUnicodeObject *) PyList_GET_ITEM(keys, i); - if (PyUnicode_READY(s) == -1) + if (PyUnicode_READY(s) == -1) { + assert(0 && "could not ready string"); 
fprintf(stderr, "could not ready string\n"); + } switch (PyUnicode_CHECK_INTERNED(s)) { case SSTATE_NOT_INTERNED: /* XXX Shouldn't happen */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 20:52:53 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 04 Oct 2011 20:52:53 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Add_DONT=5FMAKE=5FRESULT=5F?= =?utf8?q?READY_to_unicodeobject=2Ec_to_help_detecting_bugs?= Message-ID: http://hg.python.org/cpython/rev/411c0734cf48 changeset: 72677:411c0734cf48 user: Victor Stinner date: Tue Oct 04 20:05:46 2011 +0200 summary: Add DONT_MAKE_RESULT_READY to unicodeobject.c to help detecting bugs Use also _PyUnicode_READY_REPLACE() when it's applicable. files: Objects/unicodeobject.c | 30 +++++++++++++++++++++++++++- 1 files changed, 28 insertions(+), 2 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -2625,10 +2625,12 @@ goto onError; } Py_DECREF(buffer); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } +#endif return unicode; onError: @@ -3674,10 +3676,12 @@ Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } +#endif return (PyObject *)unicode; onError: @@ -4244,10 +4248,12 @@ Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } +#endif return (PyObject *)unicode; onError: @@ -4747,10 +4753,12 @@ Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } +#endif return (PyObject *)unicode; onError: @@ -5145,10 +5153,12 @@ Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&unicode)) 
{ Py_DECREF(unicode); return NULL; } +#endif return (PyObject *)unicode; onError: @@ -5604,10 +5614,12 @@ } Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } +#endif return (PyObject *)v; ucnhashError: @@ -5905,10 +5917,12 @@ goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } +#endif return (PyObject *)v; onError: @@ -6093,10 +6107,12 @@ goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } +#endif return (PyObject *)v; onError: @@ -6519,10 +6535,12 @@ goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } +#endif return (PyObject *)v; onError: @@ -6713,10 +6731,12 @@ goto retry; } #endif +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } +#endif return (PyObject *)v; } @@ -7012,10 +7032,12 @@ goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } +#endif return (PyObject *)v; onError: @@ -8057,10 +8079,12 @@ p[i] = '0' + decimal; } } - if (PyUnicode_READY((PyUnicodeObject*)result) == -1) { +#ifndef DONT_MAKE_RESULT_READY + if (_PyUnicode_READY_REPLACE(&result)) { Py_DECREF(result); return NULL; } +#endif return result; } /* --- Decimal Encoder ---------------------------------------------------- */ @@ -10265,10 +10289,12 @@ } } assert (j == PyUnicode_GET_LENGTH(u)); - if (PyUnicode_READY(u)) { +#ifndef DONT_MAKE_RESULT_READY + if (_PyUnicode_READY_REPLACE(&u)) { Py_DECREF(u); return NULL; } +#endif return (PyObject*) u; overflow: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 20:52:54 2011 
From: python-checkins at python.org (victor.stinner) Date: Tue, 04 Oct 2011 20:52:54 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_=5FPyUnicode=5FREADY=5FREPL?= =?utf8?q?ACE=28=29_cannot_be_used_in_unicode=5Fsubtype=5Fnew=28=29?= Message-ID: http://hg.python.org/cpython/rev/a81f5ced46a8 changeset: 72678:a81f5ced46a8 user: Victor Stinner date: Tue Oct 04 20:52:31 2011 +0200 summary: _PyUnicode_READY_REPLACE() cannot be used in unicode_subtype_new() files: Objects/unicodeobject.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -12949,7 +12949,7 @@ if (unicode == NULL) return NULL; assert(_PyUnicode_CHECK(unicode)); - if (_PyUnicode_READY_REPLACE(&unicode)) + if (PyUnicode_READY(unicode)) return NULL; self = (PyUnicodeObject *) type->tp_alloc(type, 0); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 20:52:55 2011 From: python-checkins at python.org (victor.stinner) Date: Tue, 04 Oct 2011 20:52:55 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_usage_of_PyUnicode=5FRE?= =?utf8?b?QURZKCk=?= Message-ID: http://hg.python.org/cpython/rev/b66033a0f140 changeset: 72679:b66033a0f140 user: Victor Stinner date: Tue Oct 04 20:53:03 2011 +0200 summary: Fix usage of PyUnicode_READY() files: Modules/_io/stringio.c | 4 ++++ Objects/unicodeobject.c | 14 +++++++++----- Python/getargs.c | 21 ++++++++++++++------- 3 files changed, 27 insertions(+), 12 deletions(-) diff --git a/Modules/_io/stringio.c b/Modules/_io/stringio.c --- a/Modules/_io/stringio.c +++ b/Modules/_io/stringio.c @@ -131,6 +131,10 @@ return -1; assert(PyUnicode_Check(decoded)); + if (PyUnicode_READY(decoded)) { + Py_DECREF(decoded); + return -1; + } len = PyUnicode_GET_LENGTH(decoded); assert(len >= 0); diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@
-2120,6 +2120,10 @@ str_obj = PyUnicode_DecodeUTF8(str, strlen(str), "replace"); if (!str_obj) goto fail; + if (PyUnicode_READY(str_obj)) { + Py_DECREF(str_obj); + goto fail; + } argmaxchar = PyUnicode_MAX_CHAR_VALUE(str_obj); maxchar = Py_MAX(maxchar, argmaxchar); n += PyUnicode_GET_LENGTH(str_obj); @@ -10062,17 +10066,17 @@ goto error; } + if (PyUnicode_READY(left)) + goto error; + if (PyUnicode_READY(right)) + goto error; + if (PyUnicode_CheckExact(left) && left != unicode_empty && PyUnicode_CheckExact(right) && right != unicode_empty && unicode_resizable(left) && (_PyUnicode_KIND(right) <= _PyUnicode_KIND(left) || _PyUnicode_WSTR(left) != NULL)) { - if (PyUnicode_READY(left)) - goto error; - if (PyUnicode_READY(right)) - goto error; - /* Don't resize for ascii += latin1. Convert ascii to latin1 requires to change the structure size, but characters are stored just after the structure, and so it requires to move all charactres which is diff --git a/Python/getargs.c b/Python/getargs.c --- a/Python/getargs.c +++ b/Python/getargs.c @@ -834,14 +834,21 @@ case 'C': {/* unicode char */ int *p = va_arg(*p_va, int *); - if (PyUnicode_Check(arg) && - PyUnicode_GET_LENGTH(arg) == 1) { - int kind = PyUnicode_KIND(arg); - void *data = PyUnicode_DATA(arg); - *p = PyUnicode_READ(kind, data, 0); - } - else + int kind; + void *data; + + if (!PyUnicode_Check(arg)) return converterr("a unicode character", arg, msgbuf, bufsize); + + if (PyUnicode_READY(arg)) + RETURN_ERR_OCCURRED; + + if (PyUnicode_GET_LENGTH(arg) != 1) + return converterr("a unicode character", arg, msgbuf, bufsize); + + kind = PyUnicode_KIND(arg); + data = PyUnicode_DATA(arg); + *p = PyUnicode_READ(kind, data, 0); break; } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 23:34:46 2011 From: python-checkins at python.org (charles-francois.natali) Date: Tue, 04 Oct 2011 23:34:46 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogb3MuZ2V0ZXVpZCgp?= 
=?utf8?q?_may_not_be_available=2E=2E=2E?= Message-ID: http://hg.python.org/cpython/rev/7a2127ca6c8a changeset: 72680:7a2127ca6c8a branch: 3.2 parent: 72674:cbda512c6d7f user: Charles-François Natali date: Tue Oct 04 23:35:47 2011 +0200 summary: os.geteuid() may not be available... files: Lib/test/test_import.py | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_import.py b/Lib/test/test_import.py --- a/Lib/test/test_import.py +++ b/Lib/test/test_import.py @@ -547,7 +547,7 @@ @unittest.skipUnless(os.name == 'posix', "test meaningful only on posix systems") - @unittest.skipIf(os.geteuid() == 0, + @unittest.skipIf(hasattr(os, 'geteuid') and os.geteuid() == 0, "due to varying filesystem permission semantics (issue #11956)") def test_unwritable_directory(self): # When the umask causes the new __pycache__ directory to be -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 23:34:47 2011 From: python-checkins at python.org (charles-francois.natali) Date: Tue, 04 Oct 2011 23:34:47 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_os=2Egeteuid=28=29_may_not_be_available=2E=2E=2E?= Message-ID: http://hg.python.org/cpython/rev/08dd0f9b79fa changeset: 72681:08dd0f9b79fa parent: 72675:971093a75613 parent: 72680:7a2127ca6c8a user: Charles-François Natali date: Tue Oct 04 23:36:49 2011 +0200 summary: os.geteuid() may not be available...
files: Lib/test/test_import.py | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_import.py b/Lib/test/test_import.py --- a/Lib/test/test_import.py +++ b/Lib/test/test_import.py @@ -545,7 +545,7 @@ @unittest.skipUnless(os.name == 'posix', "test meaningful only on posix systems") - @unittest.skipIf(os.geteuid() == 0, + @unittest.skipIf(hasattr(os, 'geteuid') and os.geteuid() == 0, "due to varying filesystem permission semantics (issue #11956)") def test_unwritable_directory(self): # When the umask causes the new __pycache__ directory to be -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Tue Oct 4 23:34:48 2011 From: python-checkins at python.org (charles-francois.natali) Date: Tue, 04 Oct 2011 23:34:48 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_default_-=3E_default?= =?utf8?b?KTogTWVyZ2Uu?= Message-ID: http://hg.python.org/cpython/rev/f62c13f5e689 changeset: 72682:f62c13f5e689 parent: 72681:08dd0f9b79fa parent: 72679:b66033a0f140 user: Charles-François Natali date: Tue Oct 04 23:37:43 2011 +0200 summary: Merge.
files: Modules/_io/stringio.c | 4 ++ Objects/unicodeobject.c | 52 +++++++++++++++++++++++----- Python/getargs.c | 21 +++++++--- 3 files changed, 60 insertions(+), 17 deletions(-) diff --git a/Modules/_io/stringio.c b/Modules/_io/stringio.c --- a/Modules/_io/stringio.c +++ b/Modules/_io/stringio.c @@ -131,6 +131,10 @@ return -1; assert(PyUnicode_Check(decoded)); + if (PyUnicode_READY(decoded)) { + Py_DECREF(decoded); + return -1; + } len = PyUnicode_GET_LENGTH(decoded); assert(len >= 0); diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -2120,6 +2120,10 @@ str_obj = PyUnicode_DecodeUTF8(str, strlen(str), "replace"); if (!str_obj) goto fail; + if (PyUnicode_READY(str_obj)) { + Py_DECREF(str_obj); + goto fail; + } argmaxchar = PyUnicode_MAX_CHAR_VALUE(str_obj); maxchar = Py_MAX(maxchar, argmaxchar); n += PyUnicode_GET_LENGTH(str_obj); @@ -2625,10 +2629,12 @@ goto onError; } Py_DECREF(buffer); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } +#endif return unicode; onError: @@ -3674,10 +3680,12 @@ Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } +#endif return (PyObject *)unicode; onError: @@ -4244,10 +4252,12 @@ Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } +#endif return (PyObject *)unicode; onError: @@ -4747,10 +4757,12 @@ Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } +#endif return (PyObject *)unicode; onError: @@ -5145,10 +5157,12 @@ Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&unicode)) { Py_DECREF(unicode); return NULL; } +#endif return (PyObject *)unicode; 
onError: @@ -5604,10 +5618,12 @@ } Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } +#endif return (PyObject *)v; ucnhashError: @@ -5905,10 +5921,12 @@ goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } +#endif return (PyObject *)v; onError: @@ -6093,10 +6111,12 @@ goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } +#endif return (PyObject *)v; onError: @@ -6519,10 +6539,12 @@ goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } +#endif return (PyObject *)v; onError: @@ -6713,10 +6735,12 @@ goto retry; } #endif +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } +#endif return (PyObject *)v; } @@ -7012,10 +7036,12 @@ goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); +#ifndef DONT_MAKE_RESULT_READY if (_PyUnicode_READY_REPLACE(&v)) { Py_DECREF(v); return NULL; } +#endif return (PyObject *)v; onError: @@ -8057,10 +8083,12 @@ p[i] = '0' + decimal; } } - if (PyUnicode_READY((PyUnicodeObject*)result) == -1) { +#ifndef DONT_MAKE_RESULT_READY + if (_PyUnicode_READY_REPLACE(&result)) { Py_DECREF(result); return NULL; } +#endif return result; } /* --- Decimal Encoder ---------------------------------------------------- */ @@ -10038,17 +10066,17 @@ goto error; } + if (PyUnicode_READY(left)) + goto error; + if (PyUnicode_READY(right)) + goto error; + if (PyUnicode_CheckExact(left) && left != unicode_empty && PyUnicode_CheckExact(right) && right != unicode_empty && unicode_resizable(left) && (_PyUnicode_KIND(right) <= _PyUnicode_KIND(left) || _PyUnicode_WSTR(left) != NULL)) { - if (PyUnicode_READY(left)) - goto error; - if 
(PyUnicode_READY(right)) - goto error; - /* Don't resize for ascii += latin1. Convert ascii to latin1 requires to change the structure size, but characters are stored just after the structure, and so it requires to move all charactres which is @@ -10265,10 +10293,12 @@ } } assert (j == PyUnicode_GET_LENGTH(u)); - if (PyUnicode_READY(u)) { +#ifndef DONT_MAKE_RESULT_READY + if (_PyUnicode_READY_REPLACE(&u)) { Py_DECREF(u); return NULL; } +#endif return (PyObject*) u; overflow: @@ -12923,7 +12953,7 @@ if (unicode == NULL) return NULL; assert(_PyUnicode_CHECK(unicode)); - if (_PyUnicode_READY_REPLACE(&unicode)) + if (PyUnicode_READY(unicode)) return NULL; self = (PyUnicodeObject *) type->tp_alloc(type, 0); @@ -13131,7 +13161,7 @@ if (PyUnicode_CHECK_INTERNED(s)) return; if (_PyUnicode_READY_REPLACE(p)) { - assert(0 && "PyUnicode_READY fail in PyUnicode_InternInPlace"); + assert(0 && "_PyUnicode_READY_REPLACE fail in PyUnicode_InternInPlace"); return; } s = (PyUnicodeObject *)(*p); @@ -13217,8 +13247,10 @@ n); for (i = 0; i < n; i++) { s = (PyUnicodeObject *) PyList_GET_ITEM(keys, i); - if (PyUnicode_READY(s) == -1) + if (PyUnicode_READY(s) == -1) { + assert(0 && "could not ready string"); fprintf(stderr, "could not ready string\n"); + } switch (PyUnicode_CHECK_INTERNED(s)) { case SSTATE_NOT_INTERNED: /* XXX Shouldn't happen */ diff --git a/Python/getargs.c b/Python/getargs.c --- a/Python/getargs.c +++ b/Python/getargs.c @@ -834,14 +834,21 @@ case 'C': {/* unicode char */ int *p = va_arg(*p_va, int *); - if (PyUnicode_Check(arg) && - PyUnicode_GET_LENGTH(arg) == 1) { - int kind = PyUnicode_KIND(arg); - void *data = PyUnicode_DATA(arg); - *p = PyUnicode_READ(kind, data, 0); - } - else + int kind; + void *data; + + if (!PyUnicode_Check(arg)) return converterr("a unicode character", arg, msgbuf, bufsize); + + if (PyUnicode_READY(arg)) + RETURN_ERR_OCCURRED; + + if (PyUnicode_GET_LENGTH(arg) != 1) + return converterr("a unicode character", arg, msgbuf, bufsize); + + kind = 
PyUnicode_KIND(arg); + data = PyUnicode_DATA(arg); + *p = PyUnicode_READ(kind, data, 0); break; } -- Repository URL: http://hg.python.org/cpython From solipsis at pitrou.net Wed Oct 5 05:26:02 2011 From: solipsis at pitrou.net (solipsis at pitrou.net) Date: Wed, 05 Oct 2011 05:26:02 +0200 Subject: [Python-checkins] Daily reference leaks (f62c13f5e689): sum=0 Message-ID: results for f62c13f5e689 on branch "default" -------------------------------------------- Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflognfw3pH', '-x'] From python-checkins at python.org Wed Oct 5 13:05:18 2011 From: python-checkins at python.org (antoine.pitrou) Date: Wed, 05 Oct 2011 13:05:18 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_text_failures_when_ctyp?= =?utf8?q?es_is_not_available?= Message-ID: http://hg.python.org/cpython/rev/da3bfa2e9daa changeset: 72683:da3bfa2e9daa user: Antoine Pitrou date: Wed Oct 05 13:01:41 2011 +0200 summary: Fix text failures when ctypes is not available (followup to Victor's 85d11cf67aa8 and 7a50e549bd11) files: Lib/test/test_codeccallbacks.py | 58 +++++++++++--------- Lib/test/test_codecs.py | 9 ++- 2 files changed, 39 insertions(+), 28 deletions(-) diff --git a/Lib/test/test_codeccallbacks.py b/Lib/test/test_codeccallbacks.py --- a/Lib/test/test_codeccallbacks.py +++ b/Lib/test/test_codeccallbacks.py @@ -1,8 +1,13 @@ import test.support, unittest import sys, codecs, html.entities, unicodedata -import ctypes -SIZEOF_WCHAR_T = ctypes.sizeof(ctypes.c_wchar) +try: + import ctypes +except ImportError: + ctypes = None + SIZEOF_WCHAR_T = -1 +else: + SIZEOF_WCHAR_T = ctypes.sizeof(ctypes.c_wchar) class PosReturn: # this can be used for configurable callbacks @@ -572,33 +577,34 @@ UnicodeEncodeError("ascii", "\uffff", 0, 1, "ouch")), ("\\uffff", 1) ) - if ctypes.sizeof(ctypes.c_wchar) == 2: + if SIZEOF_WCHAR_T == 2: len_wide = 2 else: len_wide = 1 - self.assertEqual( - 
codecs.backslashreplace_errors( - UnicodeEncodeError("ascii", "\U00010000", - 0, len_wide, "ouch")), - ("\\U00010000", len_wide) - ) - self.assertEqual( - codecs.backslashreplace_errors( - UnicodeEncodeError("ascii", "\U0010ffff", - 0, len_wide, "ouch")), - ("\\U0010ffff", len_wide) - ) - # Lone surrogates (regardless of unicode width) - self.assertEqual( - codecs.backslashreplace_errors( - UnicodeEncodeError("ascii", "\ud800", 0, 1, "ouch")), - ("\\ud800", 1) - ) - self.assertEqual( - codecs.backslashreplace_errors( - UnicodeEncodeError("ascii", "\udfff", 0, 1, "ouch")), - ("\\udfff", 1) - ) + if SIZEOF_WCHAR_T > 0: + self.assertEqual( + codecs.backslashreplace_errors( + UnicodeEncodeError("ascii", "\U00010000", + 0, len_wide, "ouch")), + ("\\U00010000", len_wide) + ) + self.assertEqual( + codecs.backslashreplace_errors( + UnicodeEncodeError("ascii", "\U0010ffff", + 0, len_wide, "ouch")), + ("\\U0010ffff", len_wide) + ) + # Lone surrogates (regardless of unicode width) + self.assertEqual( + codecs.backslashreplace_errors( + UnicodeEncodeError("ascii", "\ud800", 0, 1, "ouch")), + ("\\ud800", 1) + ) + self.assertEqual( + codecs.backslashreplace_errors( + UnicodeEncodeError("ascii", "\udfff", 0, 1, "ouch")), + ("\\udfff", 1) + ) def test_badhandlerresults(self): results = ( 42, "foo", (1,2,3), ("foo", 1, 3), ("foo", None), ("foo",), ("foo", 1, 3), ("foo", None), ("foo",) ) diff --git a/Lib/test/test_codecs.py b/Lib/test/test_codecs.py --- a/Lib/test/test_codecs.py +++ b/Lib/test/test_codecs.py @@ -3,9 +3,14 @@ import codecs import locale import sys, _testcapi, io -import ctypes -SIZEOF_WCHAR_T = ctypes.sizeof(ctypes.c_wchar) +try: + import ctypes +except ImportError: + ctypes = None + SIZEOF_WCHAR_T = -1 +else: + SIZEOF_WCHAR_T = ctypes.sizeof(ctypes.c_wchar) class Queue(object): """ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 14:13:31 2011 From: python-checkins at python.org (victor.stinner) Date: Wed, 05 Oct 2011 
14:13:31 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Speedup_the_ASCII_decoder?= Message-ID: http://hg.python.org/cpython/rev/392cfd37e312 changeset: 72684:392cfd37e312 user: Victor Stinner date: Wed Oct 05 13:50:52 2011 +0200 summary: Speedup the ASCII decoder It is faster for long string and a little bit faster for short strings, benchmark on Linux 32 bits, Intel Core i5 @ 3.33GHz: ./python -m timeit 'x=b"a"' 'x.decode("ascii")' ./python -m timeit 'x=b"x"*80' 'x.decode("ascii")' ./python -m timeit 'x=b"abc"*4096' 'x.decode("ascii")' length | before | after -------+------------+----------- 1 | 0.234 usec | 0.229 usec 80 | 0.381 usec | 0.357 usec 12,288 | 11.2 usec | 3.01 usec files: Objects/unicodeobject.c | 82 +++++++++++++++++++--------- 1 files changed, 54 insertions(+), 28 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1515,6 +1515,16 @@ } static PyObject* +unicode_fromascii(const unsigned char* u, Py_ssize_t size) +{ + PyObject *res = PyUnicode_New(size, 127); + if (!res) + return NULL; + memcpy(PyUnicode_1BYTE_DATA(res), u, size); + return res; +} + +static PyObject* _PyUnicode_FromUCS1(const unsigned char* u, Py_ssize_t size) { PyObject *res; @@ -6477,65 +6487,81 @@ { const char *starts = s; PyUnicodeObject *v; - Py_UNICODE *p; + Py_UNICODE *u; Py_ssize_t startinpos; Py_ssize_t endinpos; Py_ssize_t outpos; const char *e; - unsigned char* d; + int has_error; + const unsigned char *p = (const unsigned char *)s; + const unsigned char *end = p + size; + const unsigned char *aligned_end = (const unsigned char *) ((size_t) end & ~LONG_PTR_MASK); PyObject *errorHandler = NULL; PyObject *exc = NULL; - Py_ssize_t i; /* ASCII is equivalent to the first 128 ordinals in Unicode. */ - if (size == 1 && *(unsigned char*)s < 128) - return PyUnicode_FromOrdinal(*(unsigned char*)s); - - /* Fast path. 
Assume the input actually *is* ASCII, and allocate - a single-block Unicode object with that assumption. If there is - an error, drop the object and start over. */ - v = (PyUnicodeObject*)PyUnicode_New(size, 127); - if (v == NULL) - goto onError; - d = PyUnicode_1BYTE_DATA(v); - for (i = 0; i < size; i++) { - unsigned char ch = ((unsigned char*)s)[i]; - if (ch < 128) - d[i] = ch; - else + if (size == 1 && (unsigned char)s[0] < 128) + return get_latin1_char((unsigned char)s[0]); + + has_error = 0; + while (p < end && !has_error) { + /* Fast path, see below in PyUnicode_DecodeUTF8Stateful for + an explanation. */ + if (!((size_t) p & LONG_PTR_MASK)) { + /* Help register allocation */ + register const unsigned char *_p = p; + while (_p < aligned_end) { + unsigned long value = *(unsigned long *) _p; + if (value & ASCII_CHAR_MASK) { + has_error = 1; + break; + } + _p += SIZEOF_LONG; + } + if (_p == end) + break; + if (has_error) + break; + p = _p; + } + if (*p & 0x80) { + has_error = 1; break; - } - if (i == size) - return (PyObject*)v; - Py_DECREF(v); /* start over */ + } + else { + ++p; + } + } + if (!has_error) + return unicode_fromascii((const unsigned char *)s, size); v = _PyUnicode_New(size); if (v == NULL) goto onError; if (size == 0) return (PyObject *)v; - p = PyUnicode_AS_UNICODE(v); + u = PyUnicode_AS_UNICODE(v); e = s + size; while (s < e) { register unsigned char c = (unsigned char)*s; if (c < 128) { - *p++ = c; + *u++ = c; ++s; } else { startinpos = s-starts; endinpos = startinpos + 1; - outpos = p - (Py_UNICODE *)PyUnicode_AS_UNICODE(v); + outpos = u - (Py_UNICODE *)PyUnicode_AS_UNICODE(v); if (unicode_decode_call_errorhandler( errors, &errorHandler, "ascii", "ordinal not in range(128)", &starts, &e, &startinpos, &endinpos, &exc, &s, - &v, &outpos, &p)) + &v, &outpos, &u)) goto onError; } } - if (p - PyUnicode_AS_UNICODE(v) < PyUnicode_GET_SIZE(v)) - if (PyUnicode_Resize((PyObject**)&v, p - PyUnicode_AS_UNICODE(v)) < 0) + if (u - PyUnicode_AS_UNICODE(v) < 
PyUnicode_GET_SIZE(v)) + if (PyUnicode_Resize((PyObject**)&v, u - PyUnicode_AS_UNICODE(v)) < 0) goto onError; Py_XDECREF(errorHandler); Py_XDECREF(exc); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 14:13:32 2011 From: python-checkins at python.org (victor.stinner) Date: Wed, 05 Oct 2011 14:13:32 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Speedup_str=5Ba=3Ab=5D_and_?= =?utf8?q?PyUnicode=5FFromKindAndData?= Message-ID: http://hg.python.org/cpython/rev/61ca5262b6c4 changeset: 72685:61ca5262b6c4 user: Victor Stinner date: Wed Oct 05 14:01:42 2011 +0200 summary: Speedup str[a:b] and PyUnicode_FromKindAndData * str[a:b] doesn't scan the string for the maximum character if the string is ascii only * PyUnicode_FromKindAndData() stops if we are sure that we cannot use a shorter character type. For example, _PyUnicode_FromUCS1() stops if we have at least one character in range U+0080-U+00FF files: Include/unicodeobject.h | 2 + Objects/unicodeobject.c | 78 ++++++++++++++++++---------- 2 files changed, 52 insertions(+), 28 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -654,6 +654,8 @@ const char *u /* UTF-8 encoded string */ ); +/* Create a new string from a buffer of Py_UCS1, Py_UCS2 or Py_UCS4 characters. + Scan the string to find the maximum character. 
*/ #ifndef Py_LIMITED_API PyAPI_FUNC(PyObject*) PyUnicode_FromKindAndData( int kind, diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -969,7 +969,7 @@ if (from_kind == to_kind /* deny latin1 => ascii */ - && PyUnicode_MAX_CHAR_VALUE(to) >= PyUnicode_MAX_CHAR_VALUE(from)) + && !(!PyUnicode_IS_ASCII(from) && PyUnicode_IS_ASCII(to))) { Py_MEMCPY((char*)to_data + PyUnicode_KIND_SIZE(to_kind, to_start), @@ -1013,9 +1013,7 @@ /* check if max_char(from substring) <= max_char(to) */ if (from_kind > to_kind /* latin1 => ascii */ - || (PyUnicode_IS_ASCII(to) - && to_kind == PyUnicode_1BYTE_KIND - && !PyUnicode_IS_ASCII(from))) + || (!PyUnicode_IS_ASCII(from) && PyUnicode_IS_ASCII(to))) { /* slow path to check for character overflow */ const Py_UCS4 to_maxchar = PyUnicode_MAX_CHAR_VALUE(to); @@ -1528,15 +1526,17 @@ _PyUnicode_FromUCS1(const unsigned char* u, Py_ssize_t size) { PyObject *res; - unsigned char max = 127; + unsigned char max_char = 127; Py_ssize_t i; + + assert(size >= 0); for (i = 0; i < size; i++) { if (u[i] & 0x80) { - max = 255; + max_char = 255; break; } } - res = PyUnicode_New(size, max); + res = PyUnicode_New(size, max_char); if (!res) return NULL; memcpy(PyUnicode_1BYTE_DATA(res), u, size); @@ -1547,15 +1547,21 @@ _PyUnicode_FromUCS2(const Py_UCS2 *u, Py_ssize_t size) { PyObject *res; - Py_UCS2 max = 0; + Py_UCS2 max_char = 0; Py_ssize_t i; - for (i = 0; i < size; i++) - if (u[i] > max) - max = u[i]; - res = PyUnicode_New(size, max); + + assert(size >= 0); + for (i = 0; i < size; i++) { + if (u[i] > max_char) { + max_char = u[i]; + if (max_char >= 256) + break; + } + } + res = PyUnicode_New(size, max_char); if (!res) return NULL; - if (max >= 256) + if (max_char >= 256) memcpy(PyUnicode_2BYTE_DATA(res), u, sizeof(Py_UCS2)*size); else for (i = 0; i < size; i++) @@ -1567,15 +1573,21 @@ _PyUnicode_FromUCS4(const Py_UCS4 *u, Py_ssize_t size) { PyObject *res; - Py_UCS4 max = 0; + 
Py_UCS4 max_char = 0; Py_ssize_t i; - for (i = 0; i < size; i++) - if (u[i] > max) - max = u[i]; - res = PyUnicode_New(size, max); + + assert(size >= 0); + for (i = 0; i < size; i++) { + if (u[i] > max_char) { + max_char = u[i]; + if (max_char >= 0x10000) + break; + } + } + res = PyUnicode_New(size, max_char); if (!res) return NULL; - if (max >= 0x10000) + if (max_char >= 0x10000) memcpy(PyUnicode_4BYTE_DATA(res), u, sizeof(Py_UCS4)*size); else { int kind = PyUnicode_KIND(res); @@ -1596,9 +1608,11 @@ return _PyUnicode_FromUCS2(buffer, size); case PyUnicode_4BYTE_KIND: return _PyUnicode_FromUCS4(buffer, size); - } - PyErr_SetString(PyExc_SystemError, "invalid kind"); - return NULL; + default: + assert(0 && "invalid kind"); + PyErr_SetString(PyExc_SystemError, "invalid kind"); + return NULL; + } } PyObject* @@ -9383,11 +9397,12 @@ maxchar = PyUnicode_MAX_CHAR_VALUE(self); /* Replacing u1 with u2 may cause a maxchar reduction in the result string. */ - mayshrink = maxchar > 127; if (u2 > maxchar) { maxchar = u2; mayshrink = 0; } + else + mayshrink = maxchar > 127; u = PyUnicode_New(slen, maxchar); if (!u) goto error; @@ -11039,11 +11054,18 @@ return NULL; } - kind = PyUnicode_KIND(self); - data = PyUnicode_1BYTE_DATA(self); - return PyUnicode_FromKindAndData(kind, - data + PyUnicode_KIND_SIZE(kind, start), - length); + if (PyUnicode_IS_ASCII(self)) { + kind = PyUnicode_KIND(self); + data = PyUnicode_1BYTE_DATA(self); + return unicode_fromascii(data + start, length); + } + else { + kind = PyUnicode_KIND(self); + data = PyUnicode_1BYTE_DATA(self); + return PyUnicode_FromKindAndData(kind, + data + PyUnicode_KIND_SIZE(kind, start), + length); + } } static PyObject * -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 14:13:33 2011 From: python-checkins at python.org (victor.stinner) Date: Wed, 05 Oct 2011 14:13:33 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Speedup_find=5Fmaxchar=5Fsu?= 
=?utf8?q?rrogates=28=29_for_32-bit_wchar=5Ft?= Message-ID: http://hg.python.org/cpython/rev/b8d4c4a89065 changeset: 72686:b8d4c4a89065 user: Victor Stinner date: Wed Oct 05 14:02:44 2011 +0200 summary: Speedup find_maxchar_surrogates() for 32-bit wchar_t If we have at least one character in U+10000-U+10FFFF, we know that we must use PyUnicode_4BYTE_KIND kind. files: Objects/unicodeobject.c | 14 ++++++-------- 1 files changed, 6 insertions(+), 8 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1060,19 +1060,17 @@ const wchar_t *iter; assert(num_surrogates != NULL && maxchar != NULL); - if (num_surrogates == NULL || maxchar == NULL) { - PyErr_SetString(PyExc_SystemError, - "unexpected NULL arguments to " - "PyUnicode_FindMaxCharAndNumSurrogatePairs"); - return -1; - } - *num_surrogates = 0; *maxchar = 0; for (iter = begin; iter < end; ) { - if (*iter > *maxchar) + if (*iter > *maxchar) { *maxchar = *iter; +#if SIZEOF_WCHAR_T != 2 + if (*maxchar >= 0x10000) + return 0; +#endif + } #if SIZEOF_WCHAR_T == 2 if (*iter >= 0xD800 && *iter <= 0xDBFF && (iter+1) < end && iter[1] >= 0xDC00 && iter[1] <= 0xDFFF) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 14:13:34 2011 From: python-checkins at python.org (victor.stinner) Date: Wed, 05 Oct 2011 14:13:34 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Speedup_str=5Ba=3Ab=3Astep?= =?utf8?q?=5D_for_step_!=3D_1?= Message-ID: http://hg.python.org/cpython/rev/ceffb5751d52 changeset: 72687:ceffb5751d52 user: Victor Stinner date: Wed Oct 05 14:13:28 2011 +0200 summary: Speedup str[a:b:step] for step != 1 Try to stop the scanner of the maximum character before the end using a limit depending on the kind (e.g. 256 for PyUnicode_2BYTE_KIND). 
files: Objects/unicodeobject.c | 26 +++++++++++++++++++++++--- 1 files changed, 23 insertions(+), 3 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1520,6 +1520,22 @@ return res; } +static Py_UCS4 +kind_maxchar_limit(unsigned int kind) +{ + switch(kind) { + case PyUnicode_1BYTE_KIND: + return 0x80; + case PyUnicode_2BYTE_KIND: + return 0x100; + case PyUnicode_4BYTE_KIND: + return 0x10000; + default: + assert(0 && "invalid kind"); + return 0x10ffff; + } +} + static PyObject* _PyUnicode_FromUCS1(const unsigned char* u, Py_ssize_t size) { @@ -12335,7 +12351,7 @@ PyObject *result; void *src_data, *dest_data; int src_kind, dest_kind; - Py_UCS4 ch, max_char; + Py_UCS4 ch, max_char, kind_limit; if (PySlice_GetIndicesEx(item, PyUnicode_GET_LENGTH(self), &start, &stop, &step, &slicelength) < 0) { @@ -12354,13 +12370,17 @@ start, start + slicelength); } /* General case */ - max_char = 127; + max_char = 0; src_kind = PyUnicode_KIND(self); + kind_limit = kind_maxchar_limit(src_kind); src_data = PyUnicode_DATA(self); for (cur = start, i = 0; i < slicelength; cur += step, i++) { ch = PyUnicode_READ(src_kind, src_data, cur); - if (ch > max_char) + if (ch > max_char) { max_char = ch; + if (max_char >= kind_limit) + break; + } } result = PyUnicode_New(slicelength, max_char); if (result == NULL) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 16:12:19 2011 From: python-checkins at python.org (georg.brandl) Date: Wed, 05 Oct 2011 16:12:19 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_grammar=2E?= Message-ID: http://hg.python.org/cpython/rev/6d4701a0e76b changeset: 72688:6d4701a0e76b user: Georg Brandl date: Wed Oct 05 16:12:21 2011 +0200 summary: Fix grammar. 
files: Include/unicodeobject.h | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -426,7 +426,7 @@ #define PyUnicode_CHARACTER_SIZE(op) \ (1 << (PyUnicode_KIND(op) - 1)) -/* Return pointers to the canonical representation casted as unsigned char, +/* Return pointers to the canonical representation cast to unsigned char, Py_UCS2, or Py_UCS4 for direct character access. No checks are performed, use PyUnicode_CHARACTER_SIZE or PyUnicode_KIND() before to ensure these will work correctly. */ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 16:24:01 2011 From: python-checkins at python.org (georg.brandl) Date: Wed, 05 Oct 2011 16:24:01 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_a_few_typos_in_the_unic?= =?utf8?q?ode_header=2E?= Message-ID: http://hg.python.org/cpython/rev/aa79b4e62c38 changeset: 72689:aa79b4e62c38 user: Georg Brandl date: Wed Oct 05 16:23:09 2011 +0200 summary: Fix a few typos in the unicode header. files: Include/unicodeobject.h | 22 +++++++++++----------- 1 files changed, 11 insertions(+), 11 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -85,7 +85,7 @@ /* Py_UNICODE was the native Unicode storage format (code unit) used by Python and represents a single Unicode element in the Unicode type. - With PEP 393, Py_UNICODE is deprected and replaced with a + With PEP 393, Py_UNICODE is deprecated and replaced with a typedef to wchar_t. */ #ifndef Py_LIMITED_API @@ -115,7 +115,7 @@ # include #endif -/* Py_UCS4 and Py_UCS2 are typdefs for the respecitve +/* Py_UCS4 and Py_UCS2 are typedefs for the respective unicode representations. 
*/ #if SIZEOF_INT >= 4 typedef unsigned int Py_UCS4; @@ -313,7 +313,7 @@ } PyASCIIObject; /* Non-ASCII strings allocated through PyUnicode_New use the - PyCompactUnicodeOject structure. state.compact is set, and the data + PyCompactUnicodeObject structure. state.compact is set, and the data immediately follow the structure. */ typedef struct { PyASCIIObject _base; @@ -382,7 +382,7 @@ ((const char *)(PyUnicode_AS_UNICODE(op))) -/* --- Flexible String Representaion Helper Macros (PEP 393) -------------- */ +/* --- Flexible String Representation Helper Macros (PEP 393) -------------- */ /* Values for PyUnicodeObject.state: */ @@ -468,9 +468,9 @@ /* Write into the canonical representation, this macro does not do any sanity checks and is intended for usage in loops. The caller should cache the - kind and data pointers optained form other macro calls. + kind and data pointers obtained form other macro calls. index is the index in the string (starts at 0) and value is the new - code point value which shoule be written to that location. */ + code point value which should be written to that location. */ #define PyUnicode_WRITE(kind, data, index, value) \ do { \ switch ((kind)) { \ @@ -542,7 +542,7 @@ /* Return a maximum character value which is suitable for creating another string based on op. This is always an approximation but more efficient - than interating over the string. */ + than iterating over the string. */ #define PyUnicode_MAX_CHAR_VALUE(op) \ (assert(PyUnicode_IS_READY(op)), \ (PyUnicode_IS_COMPACT_ASCII(op) ? 0x7f: \ @@ -936,8 +936,8 @@ In case of an error, no *size is set. - This funcation caches the UTF-8 encoded string in the unicodeobject - and subsequent calls will return the same string. The memory is relased + This function caches the UTF-8 encoded string in the unicodeobject + and subsequent calls will return the same string. The memory is released when the unicodeobject is deallocated. 
_PyUnicode_AsStringAndSize is a #define for PyUnicode_AsUTF8AndSize to @@ -1587,7 +1587,7 @@ These are capable of handling Unicode objects and strings on input (we refer to them as strings in the descriptions) and return - Unicode objects or integers as apporpriate. */ + Unicode objects or integers as appropriate. */ /* Concat two strings giving a new Unicode string. */ @@ -1767,7 +1767,7 @@ /* Rich compare two strings and return one of the following: - NULL in case an exception was raised - - Py_True or Py_False for successfuly comparisons + - Py_True or Py_False for successfully comparisons - Py_NotImplemented in case the type combination is unknown Note that Py_EQ and Py_NE comparisons can cause a UnicodeWarning in -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 16:36:44 2011 From: python-checkins at python.org (georg.brandl) Date: Wed, 05 Oct 2011 16:36:44 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_More_typoes=2E?= Message-ID: http://hg.python.org/cpython/rev/63ac488538cb changeset: 72690:63ac488538cb user: Georg Brandl date: Wed Oct 05 16:36:47 2011 +0200 summary: More typoes. files: Objects/unicodeobject.c | 14 +++++++------- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -881,7 +881,7 @@ #if SIZEOF_WCHAR_T == 2 /* Helper function to convert a 16-bits wchar_t representation to UCS4, this will decode surrogate pairs, the other conversions are implemented as macros - for efficency. + for efficiency. This function assumes that unicode can hold one more code point than wstr characters for a terminating null character. 
*/ @@ -1110,7 +1110,7 @@ assert(p_obj != NULL); unicode = (PyUnicodeObject *)*p_obj; - /* _PyUnicode_Ready() is only intented for old-style API usage where + /* _PyUnicode_Ready() is only intended for old-style API usage where strings were created using _PyObject_New() and where no canonical representation (the str field) has been set yet aka strings which are not yet ready. */ @@ -1950,8 +1950,8 @@ * (we call PyObject_Str()/PyObject_Repr()/PyObject_ASCII()/ * PyUnicode_DecodeUTF8() for these objects once during step 3 and put the * result in an array) - * also esimate a upper bound for all the number formats in the string, - * numbers will be formated in step 3 and be keept in a '\0'-separated + * also estimate a upper bound for all the number formats in the string, + * numbers will be formatted in step 3 and be kept in a '\0'-separated * buffer before putting everything together. */ for (f = format; *f; f++) { if (*f == '%') { @@ -3967,7 +3967,7 @@ err = 1; } /* Instead of number of overall bytes for this code point, - n containts the number of following bytes: */ + n contains the number of following bytes: */ --n; /* Check if the follow up chars are all valid continuation bytes */ if (n >= 1) { @@ -8982,7 +8982,7 @@ sep = separator; seplen = PyUnicode_GET_LENGTH(separator); maxchar = PyUnicode_MAX_CHAR_VALUE(separator); - /* inc refcount to keep this code path symetric with the + /* inc refcount to keep this code path symmetric with the above case of a blank separator */ Py_INCREF(sep); } @@ -10134,7 +10134,7 @@ { /* Don't resize for ascii += latin1. Convert ascii to latin1 requires to change the structure size, but characters are stored just after - the structure, and so it requires to move all charactres which is + the structure, and so it requires to move all characters which is not so different than duplicating the string. 
*/ if (!(PyUnicode_IS_ASCII(left) && !PyUnicode_IS_ASCII(right))) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 16:47:34 2011 From: python-checkins at python.org (georg.brandl) Date: Wed, 05 Oct 2011 16:47:34 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_More_fixes=2E?= Message-ID: http://hg.python.org/cpython/rev/c2127d79f7c1 changeset: 72691:c2127d79f7c1 user: Georg Brandl date: Wed Oct 05 16:47:38 2011 +0200 summary: More fixes. files: Include/unicodeobject.h | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -468,7 +468,7 @@ /* Write into the canonical representation, this macro does not do any sanity checks and is intended for usage in loops. The caller should cache the - kind and data pointers obtained form other macro calls. + kind and data pointers obtained from other macro calls. index is the index in the string (starts at 0) and value is the new code point value which should be written to that location. */ #define PyUnicode_WRITE(kind, data, index, value) \ @@ -489,7 +489,7 @@ } \ } while (0) -/* Read a code point form the string's canonical representation. No checks +/* Read a code point from the string's canonical representation. No checks or ready calls are performed. 
*/ #define PyUnicode_READ(kind, data, index) \ ((Py_UCS4) \ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 17:27:51 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 05 Oct 2011 17:27:51 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=283=2E2=29=3A_Issue__=2313073?= =?utf8?q?_-_Address_the_review_comments_made_by_Ezio=2E?= Message-ID: http://hg.python.org/cpython/rev/befa7b926aad changeset: 72692:befa7b926aad branch: 3.2 parent: 72680:7a2127ca6c8a user: Senthil Kumaran date: Wed Oct 05 23:26:49 2011 +0800 summary: Issue #13073 - Address the review comments made by Ezio. files: Doc/library/http.client.rst | 9 ++++----- Lib/http/client.py | 10 +++++----- 2 files changed, 9 insertions(+), 10 deletions(-) diff --git a/Doc/library/http.client.rst b/Doc/library/http.client.rst --- a/Doc/library/http.client.rst +++ b/Doc/library/http.client.rst @@ -475,11 +475,10 @@ .. method:: HTTPConnection.endheaders(message_body=None) Send a blank line to the server, signalling the end of the headers. The - optional message_body argument can be used to pass message body - associated with the request. The message body will be sent in - the same packet as the message headers if possible. The - message_body should be a string. - + optional *message_body* argument can be used to pass a message body + associated with the request. The message body will be sent in the same + packet as the message headers if it is string, otherwise it is sent in a + separate packet. .. method:: HTTPConnection.send(data) diff --git a/Lib/http/client.py b/Lib/http/client.py --- a/Lib/http/client.py +++ b/Lib/http/client.py @@ -947,11 +947,11 @@ def endheaders(self, message_body=None): """Indicate that the last header line has been sent to the server. - This method sends the request to the server. The optional - message_body argument can be used to pass message body - associated with the request. 
The message body will be sent in - the same packet as the message headers if possible. The - message_body should be a string. + This method sends the request to the server. The optional message_body + argument can be used to pass a message body associated with the + request. The message body will be sent in the same packet as the + message headers if it is a string, otherwise it is sent as a separate + packet. """ if self.__state == _CS_REQ_STARTED: self.__state = _CS_REQ_SENT -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 17:27:52 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 05 Oct 2011 17:27:52 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_merge_from_3=2E2=2E_Issue__=2313073_-_Address_the_review_com?= =?utf8?q?ments_made_by_Ezio=2E?= Message-ID: http://hg.python.org/cpython/rev/a7b7ba225de7 changeset: 72693:a7b7ba225de7 parent: 72691:c2127d79f7c1 parent: 72692:befa7b926aad user: Senthil Kumaran date: Wed Oct 05 23:27:37 2011 +0800 summary: merge from 3.2. Issue #13073 - Address the review comments made by Ezio. files: Doc/library/http.client.rst | 9 ++++----- Lib/http/client.py | 10 +++++----- 2 files changed, 9 insertions(+), 10 deletions(-) diff --git a/Doc/library/http.client.rst b/Doc/library/http.client.rst --- a/Doc/library/http.client.rst +++ b/Doc/library/http.client.rst @@ -475,11 +475,10 @@ .. method:: HTTPConnection.endheaders(message_body=None) Send a blank line to the server, signalling the end of the headers. The - optional message_body argument can be used to pass message body - associated with the request. The message body will be sent in - the same packet as the message headers if possible. The - message_body should be a string. - + optional *message_body* argument can be used to pass a message body + associated with the request. 
The message body will be sent in the same + packet as the message headers if it is string, otherwise it is sent in a + separate packet. .. method:: HTTPConnection.send(data) diff --git a/Lib/http/client.py b/Lib/http/client.py --- a/Lib/http/client.py +++ b/Lib/http/client.py @@ -947,11 +947,11 @@ def endheaders(self, message_body=None): """Indicate that the last header line has been sent to the server. - This method sends the request to the server. The optional - message_body argument can be used to pass message body - associated with the request. The message body will be sent in - the same packet as the message headers if possible. The - message_body should be a string. + This method sends the request to the server. The optional message_body + argument can be used to pass a message body associated with the + request. The message body will be sent in the same packet as the + message headers if it is a string, otherwise it is sent as a separate + packet. """ if self.__state == _CS_REQ_STARTED: self.__state = _CS_REQ_SENT -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 17:53:11 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 05 Oct 2011 17:53:11 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=282=2E7=29=3A_Issue13073_-_Ad?= =?utf8?q?dress_review_comments_and_add_versionchanged_information_in_the?= Message-ID: http://hg.python.org/cpython/rev/64fae6f7b64c changeset: 72694:64fae6f7b64c branch: 2.7 parent: 72660:504981afa007 user: Senthil Kumaran date: Wed Oct 05 23:52:49 2011 +0800 summary: Issue13073 - Address review comments and add versionchanged information in the docs. files: Doc/library/httplib.rst | 13 ++++++++----- Lib/httplib.py | 6 +++--- 2 files changed, 11 insertions(+), 8 deletions(-) diff --git a/Doc/library/httplib.rst b/Doc/library/httplib.rst --- a/Doc/library/httplib.rst +++ b/Doc/library/httplib.rst @@ -494,11 +494,14 @@ .. 
method:: HTTPConnection.endheaders(message_body=None) - Send a blank line to the server, signalling the end of the headers. - The optional message_body argument can be used to pass message body - associated with the request. The message body will be sent in - the same packet as the message headers if possible. The - message_body should be a string. + Send a blank line to the server, signalling the end of the headers. The + optional *message_body* argument can be used to pass a message body + associated with the request. The message body will be sent in the same + packet as the message headers if it is string, otherwise it is sent in a + separate packet. + + .. versionchanged:: 2.7 + *message_body* was added. .. method:: HTTPConnection.send(data) diff --git a/Lib/httplib.py b/Lib/httplib.py --- a/Lib/httplib.py +++ b/Lib/httplib.py @@ -939,10 +939,10 @@ """Indicate that the last header line has been sent to the server. This method sends the request to the server. The optional - message_body argument can be used to pass message body + message_body argument can be used to pass a message body associated with the request. The message body will be sent in - the same packet as the message headers if possible. The - message_body should be a string. + the same packet as the message headers if it is string, otherwise it is + sent as a separate packet. 
""" if self.__state == _CS_REQ_STARTED: self.__state = _CS_REQ_SENT -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 18:33:07 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 05 Oct 2011 18:33:07 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=283=2E2=29=3A_Issue13104_-_Fi?= =?utf8?q?x_urllib=2Erequest=2Ethishost=28=29_utility_function=2E?= Message-ID: http://hg.python.org/cpython/rev/805a0a1e3c2b changeset: 72695:805a0a1e3c2b branch: 3.2 parent: 72692:befa7b926aad user: Senthil Kumaran date: Thu Oct 06 00:32:02 2011 +0800 summary: Issue13104 - Fix urllib.request.thishost() utility function. files: Lib/test/test_urllib.py | 4 ++++ Lib/urllib/request.py | 2 +- 2 files changed, 5 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_urllib.py b/Lib/test/test_urllib.py --- a/Lib/test/test_urllib.py +++ b/Lib/test/test_urllib.py @@ -1058,6 +1058,10 @@ self.assertEqual(('user', 'a\vb'),urllib.parse.splitpasswd('user:a\vb')) self.assertEqual(('user', 'a:b'),urllib.parse.splitpasswd('user:a:b')) + def test_thishost(self): + """Test the urllib.request.thishost utility function returns a tuple""" + self.assertIsInstance(urllib.request.thishost(), tuple) + class URLopener_Tests(unittest.TestCase): """Testcase to test the open method of URLopener class.""" diff --git a/Lib/urllib/request.py b/Lib/urllib/request.py --- a/Lib/urllib/request.py +++ b/Lib/urllib/request.py @@ -2116,7 +2116,7 @@ """Return the IP addresses of the current host.""" global _thishost if _thishost is None: - _thishost = tuple(socket.gethostbyname_ex(socket.gethostname()[2])) + _thishost = tuple(socket.gethostbyname_ex(socket.gethostname())[2]) return _thishost _ftperrors = None -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 18:33:08 2011 From: python-checkins at python.org (senthil.kumaran) Date: Wed, 05 Oct 2011 18:33:08 +0200 Subject: [Python-checkins] 
=?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_merge_from_3=2E2=2E__Issue13104_-_Fix_urllib=2Erequest=2Ethi?= =?utf8?q?shost=28=29_utility_function=2E?= Message-ID: http://hg.python.org/cpython/rev/a228e59ad693 changeset: 72696:a228e59ad693 parent: 72693:a7b7ba225de7 parent: 72695:805a0a1e3c2b user: Senthil Kumaran date: Thu Oct 06 00:32:52 2011 +0800 summary: merge from 3.2. Issue13104 - Fix urllib.request.thishost() utility function. files: Lib/test/test_urllib.py | 4 ++++ Lib/urllib/request.py | 2 +- 2 files changed, 5 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_urllib.py b/Lib/test/test_urllib.py --- a/Lib/test/test_urllib.py +++ b/Lib/test/test_urllib.py @@ -1058,6 +1058,10 @@ self.assertEqual(('user', 'a\vb'),urllib.parse.splitpasswd('user:a\vb')) self.assertEqual(('user', 'a:b'),urllib.parse.splitpasswd('user:a:b')) + def test_thishost(self): + """Test the urllib.request.thishost utility function returns a tuple""" + self.assertIsInstance(urllib.request.thishost(), tuple) + class URLopener_Tests(unittest.TestCase): """Testcase to test the open method of URLopener class.""" diff --git a/Lib/urllib/request.py b/Lib/urllib/request.py --- a/Lib/urllib/request.py +++ b/Lib/urllib/request.py @@ -2125,7 +2125,7 @@ """Return the IP addresses of the current host.""" global _thishost if _thishost is None: - _thishost = tuple(socket.gethostbyname_ex(socket.gethostname()[2])) + _thishost = tuple(socket.gethostbyname_ex(socket.gethostname())[2]) return _thishost _ftperrors = None -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 19:43:09 2011 From: python-checkins at python.org (victor.stinner) Date: Wed, 05 Oct 2011 19:43:09 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_unicodeobject=2Ec_doesn=27t?= =?utf8?q?_make_output_strings_ready_in_debug_mode?= Message-ID: http://hg.python.org/cpython/rev/78a2374e1d83 changeset: 72697:78a2374e1d83 user: Victor Stinner date: Wed Oct 05 00:42:43 2011 
+0200 summary: unicodeobject.c doesn't make output strings ready in debug mode Try to only create non ready strings in debug mode to ensure that all functions (not only in unicodeobject.c, everywhere) make input strings ready. files: Objects/unicodeobject.c | 4 ++++ 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -46,6 +46,10 @@ #include #endif +#ifdef Py_DEBUG +# define DONT_MAKE_RESULT_READY +#endif + /* Limit for the Unicode object free list */ #define PyUnicode_MAXFREELIST 1024 -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 19:43:10 2011 From: python-checkins at python.org (victor.stinner) Date: Wed, 05 Oct 2011 19:43:10 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Replace_PyUnicodeObject*_wi?= =?utf8?q?th_PyObject*_where_it_was_inappropriate?= Message-ID: http://hg.python.org/cpython/rev/5564bcb0035e changeset: 72698:5564bcb0035e user: Victor Stinner date: Wed Oct 05 00:59:23 2011 +0200 summary: Replace PyUnicodeObject* with PyObject* where it was inappropriate files: Objects/unicodeobject.c | 84 ++++++++++++++-------------- 1 files changed, 42 insertions(+), 42 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -432,7 +432,7 @@ /* --- Unicode Object ----------------------------------------------------- */ static PyObject * -fixup(PyUnicodeObject *self, Py_UCS4 (*fixfct)(PyUnicodeObject *s)); +fixup(PyObject *self, Py_UCS4 (*fixfct)(PyObject *s)); Py_LOCAL_INLINE(char *) findchar(void *s, int kind, Py_ssize_t size, Py_UCS4 ch, @@ -8066,7 +8066,7 @@ } static Py_UCS4 -fix_decimal_and_space_to_ascii(PyUnicodeObject *self) +fix_decimal_and_space_to_ascii(PyObject *self) { /* No need to call PyUnicode_READY(self) because this function is only called as a callback from fixup() which does it already. 
*/ @@ -8116,7 +8116,7 @@ Py_INCREF(unicode); return unicode; } - return fixup((PyUnicodeObject *)unicode, fix_decimal_and_space_to_ascii); + return fixup(unicode, fix_decimal_and_space_to_ascii); } PyObject * @@ -8661,8 +8661,8 @@ reference to the modified object */ static PyObject * -fixup(PyUnicodeObject *self, - Py_UCS4 (*fixfct)(PyUnicodeObject *s)) +fixup(PyObject *self, + Py_UCS4 (*fixfct)(PyObject *s)) { PyObject *u; Py_UCS4 maxchar_old, maxchar_new = 0; @@ -8682,7 +8682,7 @@ if the kind of the resulting unicode object does not change, everything is fine. Otherwise we need to change the string kind and re-run the fix function. */ - maxchar_new = fixfct((PyUnicodeObject*)u); + maxchar_new = fixfct(u); if (maxchar_new == 0) /* do nothing, keep maxchar_new at 0 which means no changes. */; else if (maxchar_new <= 127) @@ -8724,7 +8724,7 @@ Py_DECREF(u); return NULL; } - maxchar_old = fixfct((PyUnicodeObject*)v); + maxchar_old = fixfct(v); assert(maxchar_old > 0 && maxchar_old <= maxchar_new); } else { @@ -8743,7 +8743,7 @@ } static Py_UCS4 -fixupper(PyUnicodeObject *self) +fixupper(PyObject *self) { /* No need to call PyUnicode_READY(self) because this function is only called as a callback from fixup() which does it already. */ @@ -8774,7 +8774,7 @@ } static Py_UCS4 -fixlower(PyUnicodeObject *self) +fixlower(PyObject *self) { /* No need to call PyUnicode_READY(self) because fixup() which does it. */ const Py_ssize_t len = PyUnicode_GET_LENGTH(self); @@ -8804,7 +8804,7 @@ } static Py_UCS4 -fixswapcase(PyUnicodeObject *self) +fixswapcase(PyObject *self) { /* No need to call PyUnicode_READY(self) because fixup() which does it. */ const Py_ssize_t len = PyUnicode_GET_LENGTH(self); @@ -8840,7 +8840,7 @@ } static Py_UCS4 -fixcapitalize(PyUnicodeObject *self) +fixcapitalize(PyObject *self) { /* No need to call PyUnicode_READY(self) because fixup() which does it. 
*/ const Py_ssize_t len = PyUnicode_GET_LENGTH(self); @@ -8881,7 +8881,7 @@ } static Py_UCS4 -fixtitle(PyUnicodeObject *self) +fixtitle(PyObject *self) { /* No need to call PyUnicode_READY(self) because fixup() which does it. */ const Py_ssize_t len = PyUnicode_GET_LENGTH(self); @@ -9093,8 +9093,8 @@ } \ } while (0) -static PyUnicodeObject * -pad(PyUnicodeObject *self, +static PyObject * +pad(PyObject *self, Py_ssize_t left, Py_ssize_t right, Py_UCS4 fill) @@ -9178,8 +9178,8 @@ } static PyObject * -split(PyUnicodeObject *self, - PyUnicodeObject *substring, +split(PyObject *self, + PyObject *substring, Py_ssize_t maxcount) { int kind1, kind2, kind; @@ -9260,8 +9260,8 @@ } static PyObject * -rsplit(PyUnicodeObject *self, - PyUnicodeObject *substring, +rsplit(PyObject *self, + PyObject *substring, Py_ssize_t maxcount) { int kind1, kind2, kind; @@ -9639,7 +9639,7 @@ characters, all remaining cased characters have lower case."); static PyObject* -unicode_title(PyUnicodeObject *self) +unicode_title(PyObject *self) { return fixup(self, fixtitle); } @@ -9651,7 +9651,7 @@ have upper case and the rest lower case."); static PyObject* -unicode_capitalize(PyUnicodeObject *self) +unicode_capitalize(PyObject *self) { return fixup(self, fixcapitalize); } @@ -9726,7 +9726,7 @@ done using the specified fill character (default is a space)"); static PyObject * -unicode_center(PyUnicodeObject *self, PyObject *args) +unicode_center(PyObject *self, PyObject *args) { Py_ssize_t marg, left; Py_ssize_t width; @@ -9746,7 +9746,7 @@ marg = width - _PyUnicode_LENGTH(self); left = marg / 2 + (marg & width & 1); - return (PyObject*) pad(self, left, marg - left, fillchar); + return pad(self, left, marg - left, fillchar); } #if 0 @@ -10963,7 +10963,7 @@ done using the specified fill character (default is a space)."); static PyObject * -unicode_ljust(PyUnicodeObject *self, PyObject *args) +unicode_ljust(PyObject *self, PyObject *args) { Py_ssize_t width; Py_UCS4 fillchar = ' '; @@ -10988,7 +10988,7 
@@ Return a copy of the string S converted to lowercase."); static PyObject* -unicode_lower(PyUnicodeObject *self) +unicode_lower(PyObject *self) { return fixup(self, fixlower); } @@ -11554,7 +11554,7 @@ done using the specified fill character (default is a space)."); static PyObject * -unicode_rjust(PyUnicodeObject *self, PyObject *args) +unicode_rjust(PyObject *self, PyObject *args) { Py_ssize_t width; Py_UCS4 fillchar = ' '; @@ -11589,7 +11589,7 @@ } } - result = split((PyUnicodeObject *)s, (PyUnicodeObject *)sep, maxsplit); + result = split(s, sep, maxsplit); Py_DECREF(s); Py_XDECREF(sep); @@ -11606,7 +11606,7 @@ removed from the result."); static PyObject* -unicode_split(PyUnicodeObject *self, PyObject *args) +unicode_split(PyObject *self, PyObject *args) { PyObject *substring = Py_None; Py_ssize_t maxcount = -1; @@ -11617,7 +11617,7 @@ if (substring == Py_None) return split(self, NULL, maxcount); else if (PyUnicode_Check(substring)) - return split(self, (PyUnicodeObject *)substring, maxcount); + return split(self, substring, maxcount); else return PyUnicode_Split((PyObject *)self, substring, maxcount); } @@ -11767,9 +11767,9 @@ found, return S and two empty strings."); static PyObject* -unicode_partition(PyUnicodeObject *self, PyObject *separator) -{ - return PyUnicode_Partition((PyObject *)self, separator); +unicode_partition(PyObject *self, PyObject *separator) +{ + return PyUnicode_Partition(self, separator); } PyDoc_STRVAR(rpartition__doc__, @@ -11780,9 +11780,9 @@ separator is not found, return two empty strings and S."); static PyObject* -unicode_rpartition(PyUnicodeObject *self, PyObject *separator) -{ - return PyUnicode_RPartition((PyObject *)self, separator); +unicode_rpartition(PyObject *self, PyObject *separator) +{ + return PyUnicode_RPartition(self, separator); } PyObject * @@ -11801,7 +11801,7 @@ } } - result = rsplit((PyUnicodeObject *)s, (PyUnicodeObject *)sep, maxsplit); + result = rsplit(s, sep, maxsplit); Py_DECREF(s); Py_XDECREF(sep); @@ 
-11818,7 +11818,7 @@ is a separator."); static PyObject* -unicode_rsplit(PyUnicodeObject *self, PyObject *args) +unicode_rsplit(PyObject *self, PyObject *args) { PyObject *substring = Py_None; Py_ssize_t maxcount = -1; @@ -11829,9 +11829,9 @@ if (substring == Py_None) return rsplit(self, NULL, maxcount); else if (PyUnicode_Check(substring)) - return rsplit(self, (PyUnicodeObject *)substring, maxcount); + return rsplit(self, substring, maxcount); else - return PyUnicode_RSplit((PyObject *)self, substring, maxcount); + return PyUnicode_RSplit(self, substring, maxcount); } PyDoc_STRVAR(splitlines__doc__, @@ -11872,7 +11872,7 @@ and vice versa."); static PyObject* -unicode_swapcase(PyUnicodeObject *self) +unicode_swapcase(PyObject *self) { return fixup(self, fixswapcase); } @@ -12014,7 +12014,7 @@ Return a copy of S converted to uppercase."); static PyObject* -unicode_upper(PyUnicodeObject *self) +unicode_upper(PyObject *self) { return fixup(self, fixupper); } @@ -12026,10 +12026,10 @@ of the specified width. 
The string S is never truncated."); static PyObject * -unicode_zfill(PyUnicodeObject *self, PyObject *args) +unicode_zfill(PyObject *self, PyObject *args) { Py_ssize_t fill; - PyUnicodeObject *u; + PyObject *u; Py_ssize_t width; int kind; void *data; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 19:43:11 2011 From: python-checkins at python.org (victor.stinner) Date: Wed, 05 Oct 2011 19:43:11 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Document_requierements_of_U?= =?utf8?q?nicode_kinds?= Message-ID: http://hg.python.org/cpython/rev/055174308822 changeset: 72699:055174308822 user: Victor Stinner date: Wed Oct 05 01:31:05 2011 +0200 summary: Document requierements of Unicode kinds files: Include/unicodeobject.h | 24 ++++++++++++++++++++---- 1 files changed, 20 insertions(+), 4 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -288,10 +288,26 @@ unsigned int interned:2; /* Character size: - PyUnicode_WCHAR_KIND (0): wchar_t* - PyUnicode_1BYTE_KIND (1): Py_UCS1* - PyUnicode_2BYTE_KIND (2): Py_UCS2* - PyUnicode_4BYTE_KIND (3): Py_UCS4* + - PyUnicode_WCHAR_KIND (0): + + * character type = wchar_t (16 or 32 bits, depending on the + platform) + + - PyUnicode_1BYTE_KIND (1): + + * character type = Py_UCS1 (8 bits, unsigned) + * if ascii is 1, at least one character must be in range + U+80-U+FF, otherwise all characters must be in range U+00-U+7F + + - PyUnicode_2BYTE_KIND (2): + + * character type = Py_UCS2 (16 bits, unsigned) + * at least one character must be in range U+0100-U+1FFFF + + - PyUnicode_4BYTE_KIND (3): + + * character type = Py_UCS4 (32 bits, unsigned) + * at least one character must be in range U+10000-U+10FFFF */ unsigned int kind:2; /* Compact is with respect to the allocation scheme. 
Compact unicode -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 19:43:12 2011 From: python-checkins at python.org (victor.stinner) Date: Wed, 05 Oct 2011 19:43:12 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Ensure_that_newly_created_s?= =?utf8?q?trings_use_the_most_efficient_store_in_debug_mode?= Message-ID: http://hg.python.org/cpython/rev/5f11621a6f51 changeset: 72700:5f11621a6f51 user: Victor Stinner date: Wed Oct 05 01:34:17 2011 +0200 summary: Ensure that newly created strings use the most efficient store in debug mode files: Objects/unicodeobject.c | 89 ++++++++++++++++++++++++---- 1 files changed, 75 insertions(+), 14 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -94,7 +94,7 @@ #endif #ifdef Py_DEBUG -# define _PyUnicode_CHECK(op) _PyUnicode_CheckConsistency(op) +# define _PyUnicode_CHECK(op) _PyUnicode_CheckConsistency(op, 0) #else # define _PyUnicode_CHECK(op) PyUnicode_Check(op) #endif @@ -297,7 +297,8 @@ #ifdef Py_DEBUG static int -_PyUnicode_CheckConsistency(void *op) +/* FIXME: use PyObject* type for op */ +_PyUnicode_CheckConsistency(void *op, int check_content) { PyASCIIObject *ascii; unsigned int kind; @@ -371,12 +372,29 @@ if (ascii->wstr == NULL) assert(compact->wstr_length == 0); } - return 1; -} -#else -static int -_PyUnicode_CheckConsistency(void *op) -{ + /* check that the best kind is used */ + if (check_content && kind != PyUnicode_WCHAR_KIND) + { + Py_ssize_t i; + Py_UCS4 maxchar = 0; + void *data = PyUnicode_DATA(ascii); + for (i=0; i < ascii->length; i++) + { + Py_UCS4 ch = PyUnicode_READ(kind, data, i); + if (ch > maxchar) + maxchar = ch; + } + if (kind == PyUnicode_1BYTE_KIND) { + if (ascii->state.ascii == 0) + assert(maxchar >= 128); + else + assert(maxchar < 128); + } + else if (kind == PyUnicode_2BYTE_KIND) + assert(maxchar >= 0x100); + else + assert(maxchar >= 0x10000); + } return 1; } 
#endif @@ -546,7 +564,7 @@ _PyUnicode_LENGTH(unicode) = length; PyUnicode_WRITE(PyUnicode_KIND(unicode), data, length, 0); if (share_wstr || _PyUnicode_WSTR(unicode) == NULL) { - _PyUnicode_CheckConsistency(unicode); + assert(_PyUnicode_CheckConsistency(unicode, 0)); return 0; } } @@ -566,7 +584,7 @@ _PyUnicode_WSTR(unicode) = wstr; _PyUnicode_WSTR(unicode)[length] = 0; _PyUnicode_WSTR_LENGTH(unicode) = length; - _PyUnicode_CheckConsistency(unicode); + assert(_PyUnicode_CheckConsistency(unicode, 0)); return 0; } @@ -879,6 +897,7 @@ _PyUnicode_WSTR(unicode) = NULL; } } + assert(_PyUnicode_CheckConsistency(unicode, 0)); return obj; } @@ -1255,6 +1274,7 @@ PyUnicode_4BYTE_DATA(unicode)[_PyUnicode_LENGTH(unicode)] = '\0'; } _PyUnicode_STATE(unicode).ready = 1; + assert(_PyUnicode_CheckConsistency(unicode, 1)); return 0; } @@ -1360,7 +1380,7 @@ *p_unicode = resize_compact(unicode, length); if (*p_unicode == NULL) return -1; - _PyUnicode_CheckConsistency(*p_unicode); + assert(_PyUnicode_CheckConsistency(*p_unicode, 0)); return 0; } return resize_inplace((PyUnicodeObject*)unicode, length); @@ -1393,6 +1413,7 @@ if (!unicode) return NULL; PyUnicode_1BYTE_DATA(unicode)[0] = ch; + assert(_PyUnicode_CheckConsistency(unicode, 1)); unicode_latin1[ch] = unicode; } Py_INCREF(unicode); @@ -1461,6 +1482,7 @@ assert(0 && "Impossible state"); } + assert(_PyUnicode_CheckConsistency(unicode, 1)); return (PyObject *)unicode; } @@ -1558,6 +1580,7 @@ if (!res) return NULL; memcpy(PyUnicode_1BYTE_DATA(res), u, size); + assert(_PyUnicode_CheckConsistency(res, 1)); return res; } @@ -1584,6 +1607,7 @@ else for (i = 0; i < size; i++) PyUnicode_1BYTE_DATA(res)[i] = (Py_UCS1)u[i]; + assert(_PyUnicode_CheckConsistency(res, 1)); return res; } @@ -1613,6 +1637,7 @@ for (i = 0; i < size; i++) PyUnicode_WRITE(kind, data, i, u[i]); } + assert(_PyUnicode_CheckConsistency(res, 1)); return res; } @@ -1669,6 +1694,7 @@ assert(0); break; } + assert(_PyUnicode_CheckConsistency(copy, 1)); return copy; } @@ 
-2378,6 +2404,7 @@ PyObject_Free(callresults); if (numberresults) PyObject_Free(numberresults); + assert(_PyUnicode_CheckConsistency(string, 1)); return (PyObject *)string; fail: if (callresults) { @@ -2508,6 +2535,7 @@ if (v == NULL) return NULL; PyUnicode_WRITE(PyUnicode_KIND(v), PyUnicode_DATA(v), 0, ordinal); + assert(_PyUnicode_CheckConsistency(v, 1)); return v; } @@ -2677,6 +2705,7 @@ return NULL; } #endif + assert(_PyUnicode_CheckConsistency(unicode, 1)); return unicode; onError: @@ -2703,6 +2732,7 @@ v = PyCodec_Decode(unicode, encoding, errors); if (v == NULL) goto onError; + assert(_PyUnicode_CheckConsistency(v, 1)); return v; onError: @@ -2735,6 +2765,7 @@ Py_DECREF(v); goto onError; } + assert(_PyUnicode_CheckConsistency(v, 1)); return v; onError: @@ -3728,6 +3759,7 @@ return NULL; } #endif + assert(_PyUnicode_CheckConsistency(unicode, 1)); return (PyObject *)unicode; onError: @@ -4300,6 +4332,7 @@ return NULL; } #endif + assert(_PyUnicode_CheckConsistency(unicode, 1)); return (PyObject *)unicode; onError: @@ -4805,6 +4838,7 @@ return NULL; } #endif + assert(_PyUnicode_CheckConsistency(unicode, 1)); return (PyObject *)unicode; onError: @@ -5205,6 +5239,7 @@ return NULL; } #endif + assert(_PyUnicode_CheckConsistency(unicode, 1)); return (PyObject *)unicode; onError: @@ -5666,6 +5701,7 @@ return NULL; } #endif + assert(_PyUnicode_CheckConsistency(v, 1)); return (PyObject *)v; ucnhashError: @@ -5969,6 +6005,7 @@ return NULL; } #endif + assert(_PyUnicode_CheckConsistency(v, 1)); return (PyObject *)v; onError: @@ -6159,6 +6196,7 @@ return NULL; } #endif + assert(_PyUnicode_CheckConsistency(v, 1)); return (PyObject *)v; onError: @@ -6603,6 +6641,7 @@ return NULL; } #endif + assert(_PyUnicode_CheckConsistency(v, 1)); return (PyObject *)v; onError: @@ -6799,6 +6838,7 @@ return NULL; } #endif + assert(_PyUnicode_CheckConsistency(v, 1)); return (PyObject *)v; } @@ -7100,6 +7140,7 @@ return NULL; } #endif + assert(_PyUnicode_CheckConsistency(v, 1)); return 
(PyObject *)v; onError: @@ -8147,6 +8188,7 @@ return NULL; } #endif + assert(_PyUnicode_CheckConsistency(result, 1)); return result; } /* --- Decimal Encoder ---------------------------------------------------- */ @@ -8738,6 +8780,7 @@ } Py_DECREF(u); + assert(_PyUnicode_CheckConsistency(v, 1)); return v; } } @@ -9061,6 +9104,7 @@ Done: Py_DECREF(fseq); Py_XDECREF(sep); + assert(_PyUnicode_CheckConsistency(res, 1)); return res; onError: @@ -9140,7 +9184,8 @@ return NULL; } - return (PyUnicodeObject*)u; + assert(_PyUnicode_CheckConsistency(u, 1)); + return u; } #undef FILL @@ -9605,6 +9650,7 @@ PyMem_FREE(buf1); if (release2) PyMem_FREE(buf2); + assert(_PyUnicode_CheckConsistency(u, 1)); return u; nothing: @@ -10052,6 +10098,7 @@ goto onError; Py_DECREF(u); Py_DECREF(v); + assert(_PyUnicode_CheckConsistency(w, 1)); return w; onError: @@ -10143,6 +10190,8 @@ if (!(PyUnicode_IS_ASCII(left) && !PyUnicode_IS_ASCII(right))) { unicode_append_inplace(p_left, right); + if (p_left != NULL) + assert(_PyUnicode_CheckConsistency(*p_left, 1)); return; } } @@ -10151,6 +10200,7 @@ if (res == NULL) goto error; Py_DECREF(left); + assert(_PyUnicode_CheckConsistency(res, 1)); *p_left = res; return; @@ -10358,6 +10408,7 @@ return NULL; } #endif + assert(_PyUnicode_CheckConsistency(u, 1)); return (PyObject*) u; overflow: @@ -11248,6 +11299,7 @@ } } + assert(_PyUnicode_CheckConsistency(u, 1)); return (PyObject*) u; } @@ -11465,6 +11517,7 @@ } } /* Closing quote already added at the beginning */ + assert(_PyUnicode_CheckConsistency(unicode, 1)); return repr; } @@ -12067,6 +12120,7 @@ PyUnicode_WRITE(kind, data, fill, '0'); } + assert(_PyUnicode_CheckConsistency(u, 1)); return (PyObject*) u; } @@ -12191,13 +12245,16 @@ static PyObject * unicode__format__(PyObject* self, PyObject* args) { - PyObject *format_spec; + PyObject *format_spec, *out; if (!PyArg_ParseTuple(args, "U:__format__", &format_spec)) return NULL; - return _PyUnicode_FormatAdvanced(self, format_spec, 0, + out = 
_PyUnicode_FormatAdvanced(self, format_spec, 0, PyUnicode_GET_LENGTH(format_spec)); + if (out != NULL) + assert(_PyUnicode_CheckConsistency(out, 1)); + return out; } PyDoc_STRVAR(p_format__doc__, @@ -12396,6 +12453,7 @@ Py_UCS4 ch = PyUnicode_READ(src_kind, src_data, cur); PyUnicode_WRITE(dest_kind, dest_data, i, ch); } + assert(_PyUnicode_CheckConsistency(result, 1)); return result; } else { PyErr_SetString(PyExc_TypeError, "string indices must be integers"); @@ -12973,6 +13031,7 @@ Py_DECREF(args); } Py_DECREF(uformat); + assert(_PyUnicode_CheckConsistency(result, 1)); return (PyObject *)result; onError: @@ -13090,6 +13149,7 @@ Py_MEMCPY(data, PyUnicode_DATA(unicode), PyUnicode_KIND_SIZE(kind, length + 1)); Py_DECREF(unicode); + assert(_PyUnicode_CheckConsistency(self, 1)); return (PyObject *)self; onError: @@ -13171,6 +13231,7 @@ /* Init the implementation */ unicode_empty = PyUnicode_New(0, 0); + assert(_PyUnicode_CheckConsistency(unicode_empty, 1)); if (!unicode_empty) Py_FatalError("Can't create empty string"); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 19:53:30 2011 From: python-checkins at python.org (charles-francois.natali) Date: Wed, 05 Oct 2011 19:53:30 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogSXNzdWUgIzEzMDcw?= =?utf8?q?=3A_Fix_a_crash_when_a_TextIOWrapper_caught_in_a_reference_cycle?= Message-ID: http://hg.python.org/cpython/rev/d60c00015f01 changeset: 72701:d60c00015f01 branch: 3.2 parent: 72695:805a0a1e3c2b user: Charles-François Natali date: Wed Oct 05 19:53:43 2011 +0200 summary: Issue #13070: Fix a crash when a TextIOWrapper caught in a reference cycle would be finalized after the reference to its underlying BufferedRWPair's writer got cleared by the GC.
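[Editor's note] The failure mode fixed by this changeset can be sketched in pure Python. The snippet below is a minimal, hypothetical reproduction of the scenario (it uses in-memory BytesIO streams rather than the test suite's MockRawIO); on an interpreter carrying the fix it completes without crashing:

```python
import gc
import io

def make_cycle():
    # A TextIOWrapper over a BufferedRWPair, tied into a reference cycle
    # so that only the garbage collector can finalize the objects, and the
    # pair's writer may be cleared before the wrapper is finalized.
    b1 = io.BufferedRWPair(io.BytesIO(), io.BytesIO())
    t1 = io.TextIOWrapper(b1, encoding="ascii")
    b2 = io.BufferedRWPair(io.BytesIO(), io.BytesIO())
    t2 = io.TextIOWrapper(b2, encoding="ascii")
    t1.buddy = t2  # circular reference between the two wrappers
    t2.buddy = t1

for _ in range(1000):
    make_cycle()
gc.collect()  # finalization order here is up to the GC
```

With the fix applied, a BufferedRWPair whose writer has already been cleared raises RuntimeError from its closed getter instead of dereferencing a NULL pointer.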
files: Lib/test/test_io.py | 15 +++++++++++++++ Misc/NEWS | 4 ++++ Modules/_io/bufferedio.c | 5 +++++ 3 files changed, 24 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_io.py b/Lib/test/test_io.py --- a/Lib/test/test_io.py +++ b/Lib/test/test_io.py @@ -2414,6 +2414,21 @@ with self.open(support.TESTFN, "rb") as f: self.assertEqual(f.read(), b"456def") + def test_rwpair_cleared_before_textio(self): + # Issue 13070: TextIOWrapper's finalization would crash when called + # after the reference to the underlying BufferedRWPair's writer got + # cleared by the GC. + for i in range(1000): + b1 = self.BufferedRWPair(self.MockRawIO(), self.MockRawIO()) + t1 = self.TextIOWrapper(b1, encoding="ascii") + b2 = self.BufferedRWPair(self.MockRawIO(), self.MockRawIO()) + t2 = self.TextIOWrapper(b2, encoding="ascii") + # circular references + t1.buddy = t2 + t2.buddy = t1 + support.gc_collect() + + class PyTextIOWrapperTest(TextIOWrapperTest): pass diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -98,6 +98,10 @@ Extension Modules ----------------- +- Issue #13070: Fix a crash when a TextIOWrapper caught in a reference cycle + would be finalized after the reference to its underlying BufferedRWPair's + writer got cleared by the GC. + - Issue #12881: ctypes: Fix segfault with large structure field names. - Issue #13058: ossaudiodev: fix a file descriptor leak on error. 
Patch by diff --git a/Modules/_io/bufferedio.c b/Modules/_io/bufferedio.c --- a/Modules/_io/bufferedio.c +++ b/Modules/_io/bufferedio.c @@ -2212,6 +2212,11 @@ static PyObject * bufferedrwpair_closed_get(rwpair *self, void *context) { + if (self->writer == NULL) { + PyErr_SetString(PyExc_RuntimeError, + "the BufferedRWPair object is being garbage-collected"); + return NULL; + } return PyObject_GetAttr((PyObject *) self->writer, _PyIO_str_closed); } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 19:53:35 2011 From: python-checkins at python.org (charles-francois.natali) Date: Wed, 05 Oct 2011 19:53:35 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Issue_=2313070=3A_Fix_a_crash_when_a_TextIOWrapper_caught_in?= =?utf8?q?_a_reference_cycle?= Message-ID: http://hg.python.org/cpython/rev/7defc1e5d13a changeset: 72702:7defc1e5d13a parent: 72700:5f11621a6f51 parent: 72701:d60c00015f01 user: Charles-François Natali date: Wed Oct 05 19:55:56 2011 +0200 summary: Issue #13070: Fix a crash when a TextIOWrapper caught in a reference cycle would be finalized after the reference to its underlying BufferedRWPair's writer got cleared by the GC. files: Lib/test/test_io.py | 15 +++++++++++++++ Misc/NEWS | 4 ++++ Modules/_io/bufferedio.c | 5 +++++ 3 files changed, 24 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_io.py b/Lib/test/test_io.py --- a/Lib/test/test_io.py +++ b/Lib/test/test_io.py @@ -2421,6 +2421,21 @@ with self.open(support.TESTFN, "rb") as f: self.assertEqual(f.read(), b"456def") + def test_rwpair_cleared_before_textio(self): + # Issue 13070: TextIOWrapper's finalization would crash when called + # after the reference to the underlying BufferedRWPair's writer got + # cleared by the GC.
+ for i in range(1000): + b1 = self.BufferedRWPair(self.MockRawIO(), self.MockRawIO()) + t1 = self.TextIOWrapper(b1, encoding="ascii") + b2 = self.BufferedRWPair(self.MockRawIO(), self.MockRawIO()) + t2 = self.TextIOWrapper(b2, encoding="ascii") + # circular references + t1.buddy = t2 + t2.buddy = t1 + support.gc_collect() + + class PyTextIOWrapperTest(TextIOWrapperTest): pass diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -1314,6 +1314,10 @@ Extension Modules ----------------- +- Issue #13070: Fix a crash when a TextIOWrapper caught in a reference cycle + would be finalized after the reference to its underlying BufferedRWPair's + writer got cleared by the GC. + - Issue #12881: ctypes: Fix segfault with large structure field names. - Issue #13058: ossaudiodev: fix a file descriptor leak on error. Patch by diff --git a/Modules/_io/bufferedio.c b/Modules/_io/bufferedio.c --- a/Modules/_io/bufferedio.c +++ b/Modules/_io/bufferedio.c @@ -2307,6 +2307,11 @@ static PyObject * bufferedrwpair_closed_get(rwpair *self, void *context) { + if (self->writer == NULL) { + PyErr_SetString(PyExc_RuntimeError, + "the BufferedRWPair object is being garbage-collected"); + return NULL; + } return PyObject_GetAttr((PyObject *) self->writer, _PyIO_str_closed); } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 20:14:39 2011 From: python-checkins at python.org (victor.stinner) Date: Wed, 05 Oct 2011 20:14:39 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_my=5Fbasename=28=29=3A_?= =?utf8?q?make_the_string_ready?= Message-ID: http://hg.python.org/cpython/rev/eb2821cc3edf changeset: 72703:eb2821cc3edf user: Victor Stinner date: Wed Oct 05 20:14:23 2011 +0200 summary: Fix my_basename(): make the string ready files: Objects/exceptions.c | 9 +++++++-- 1 files changed, 7 insertions(+), 2 deletions(-) diff --git a/Objects/exceptions.c b/Objects/exceptions.c --- a/Objects/exceptions.c +++ b/Objects/exceptions.c @@ -963,8 
+963,13 @@ my_basename(PyObject *name) { Py_ssize_t i, size, offset; - int kind = PyUnicode_KIND(name); - void *data = PyUnicode_DATA(name); + int kind; + void *data; + + if (PyUnicode_READY(name)) + return NULL; + kind = PyUnicode_KIND(name); + data = PyUnicode_DATA(name); size = PyUnicode_GET_LENGTH(name); offset = 0; for(i=0; i < size; i++) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 21:30:45 2011 From: python-checkins at python.org (victor.stinner) Date: Wed, 05 Oct 2011 21:30:45 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_PyUnicode=5FPartition?= =?utf8?b?KCk6IHN0cl9pbi0+c3RyX29iag==?= Message-ID: http://hg.python.org/cpython/rev/891b0c54297d changeset: 72704:891b0c54297d user: Victor Stinner date: Wed Oct 05 20:58:25 2011 +0200 summary: Fix PyUnicode_Partition(): str_in->str_obj files: Objects/unicodeobject.c | 10 +++++----- 1 files changed, 5 insertions(+), 5 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -11694,12 +11694,12 @@ return NULL; } - kind1 = PyUnicode_KIND(str_in); + kind1 = PyUnicode_KIND(str_obj); kind2 = PyUnicode_KIND(sep_obj); - kind = kind1 > kind2 ? 
kind1 : kind2; - buf1 = PyUnicode_DATA(str_in); + kind = Py_MAX(kind1, kind2); + buf1 = PyUnicode_DATA(str_obj); if (kind1 != kind) - buf1 = _PyUnicode_AsKind(str_in, kind); + buf1 = _PyUnicode_AsKind(str_obj, kind); if (!buf1) goto onError; buf2 = PyUnicode_DATA(sep_obj); @@ -11710,7 +11710,7 @@ len1 = PyUnicode_GET_LENGTH(str_obj); len2 = PyUnicode_GET_LENGTH(sep_obj); - switch(PyUnicode_KIND(str_in)) { + switch(PyUnicode_KIND(str_obj)) { case PyUnicode_1BYTE_KIND: out = ucs1lib_partition(str_obj, buf1, len1, sep_obj, buf2, len2); break; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Wed Oct 5 21:30:46 2011 From: python-checkins at python.org (victor.stinner) Date: Wed, 05 Oct 2011 21:30:46 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Add_asciilib=3A_similar_to_?= =?utf8?q?ucs1=2C_ucs2_and_ucs4_library=2C_but_specialized_to_ASCII?= Message-ID: http://hg.python.org/cpython/rev/05ed6e5f2cf4 changeset: 72705:05ed6e5f2cf4 user: Victor Stinner date: Wed Oct 05 21:24:08 2011 +0200 summary: Add asciilib: similar to ucs1, ucs2 and ucs4 library, but specialized to ASCII The ucs1, ucs2 and ucs4 libraries have to scan the created substring to find the maximum character, whereas this is not needed for ASCII strings. Because ASCII strings are common, it is useful to optimize the ASCII case.
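[Editor's note] The property this commit message relies on can be observed from Python itself: every substring of an ASCII string is ASCII by construction, while a substring of a latin-1 (UCS1) string may or may not be, so the generic ucs1 code has to scan for the maximum character. A small illustration — it uses str.isascii(), which only exists in Python 3.7+ and is not part of the change above, purely to observe the property:

```python
s = "caf\xe9"            # UCS1 (latin-1) string: the last character is U+00E9
assert not s.isascii()    # the whole string is not ASCII

# A substring of a UCS1 string can still be pure ASCII, which is why the
# generic ucs1lib routines must scan each result to pick the right kind:
assert s[:3].isascii()    # "caf" happens to be ASCII

# Every substring of an ASCII string is ASCII, so asciilib can skip the
# maximum-character scan entirely:
t = "hello world"
assert all(t[i:j].isascii() for i in range(len(t)) for j in range(i, len(t) + 1))
```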
files: Include/unicodeobject.h | 1 + Objects/stringlib/asciilib.h | 34 ++++ Objects/unicodeobject.c | 163 ++++++++++++++++------ Python/formatter_unicode.c | 4 +- 4 files changed, 153 insertions(+), 49 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -1851,6 +1851,7 @@ see Objects/stringlib/localeutil.h */ #ifndef Py_LIMITED_API PyAPI_FUNC(Py_ssize_t) _PyUnicode_InsertThousandsGrouping( + PyObject *unicode, int kind, void *buffer, Py_ssize_t n_buffer, diff --git a/Objects/stringlib/asciilib.h b/Objects/stringlib/asciilib.h new file mode 100644 --- /dev/null +++ b/Objects/stringlib/asciilib.h @@ -0,0 +1,34 @@ +/* this is sort of a hack. there's at least one place (formatting + floats) where some stringlib code takes a different path if it's + compiled as unicode. */ +#define STRINGLIB_IS_UNICODE 1 + +#define FASTSEARCH asciilib_fastsearch +#define STRINGLIB(F) asciilib_##F +#define STRINGLIB_OBJECT PyUnicodeObject +#define STRINGLIB_CHAR Py_UCS1 +#define STRINGLIB_TYPE_NAME "unicode" +#define STRINGLIB_PARSE_CODE "U" +#define STRINGLIB_EMPTY unicode_empty +#define STRINGLIB_ISSPACE Py_UNICODE_ISSPACE +#define STRINGLIB_ISLINEBREAK BLOOM_LINEBREAK +#define STRINGLIB_ISDECIMAL Py_UNICODE_ISDECIMAL +#define STRINGLIB_TODECIMAL Py_UNICODE_TODECIMAL +#define STRINGLIB_TOUPPER Py_UNICODE_TOUPPER +#define STRINGLIB_TOLOWER Py_UNICODE_TOLOWER +#define STRINGLIB_FILL Py_UNICODE_FILL +#define STRINGLIB_STR PyUnicode_1BYTE_DATA +#define STRINGLIB_LEN PyUnicode_GET_LENGTH +#define STRINGLIB_NEW unicode_fromascii +#define STRINGLIB_RESIZE not_supported +#define STRINGLIB_CHECK PyUnicode_Check +#define STRINGLIB_CHECK_EXACT PyUnicode_CheckExact +#define STRINGLIB_GROUPING _PyUnicode_InsertThousandsGrouping +#define STRINGLIB_GROUPING_LOCALE _PyUnicode_InsertThousandsGroupingLocale + +#define STRINGLIB_TOSTR PyObject_Str +#define STRINGLIB_TOASCII PyObject_ASCII + +#define 
_Py_InsertThousandsGrouping _PyUnicode_ascii_InsertThousandsGrouping +#define _Py_InsertThousandsGroupingLocale _PyUnicode_ascii_InsertThousandsGroupingLocale + diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -8331,6 +8331,15 @@ /* --- Helpers ------------------------------------------------------------ */ +#include "stringlib/asciilib.h" +#include "stringlib/fastsearch.h" +#include "stringlib/partition.h" +#include "stringlib/split.h" +#include "stringlib/count.h" +#include "stringlib/find.h" +#include "stringlib/localeutil.h" +#include "stringlib/undef.h" + #include "stringlib/ucs1lib.h" #include "stringlib/fastsearch.h" #include "stringlib/partition.h" @@ -8359,7 +8368,10 @@ #include "stringlib/undef.h" static Py_ssize_t -any_find_slice(Py_ssize_t Py_LOCAL_CALLBACK(ucs1)(const Py_UCS1*, Py_ssize_t, +any_find_slice(Py_ssize_t Py_LOCAL_CALLBACK(ascii)(const Py_UCS1*, Py_ssize_t, + const Py_UCS1*, Py_ssize_t, + Py_ssize_t, Py_ssize_t), + Py_ssize_t Py_LOCAL_CALLBACK(ucs1)(const Py_UCS1*, Py_ssize_t, const Py_UCS1*, Py_ssize_t, Py_ssize_t, Py_ssize_t), Py_ssize_t Py_LOCAL_CALLBACK(ucs2)(const Py_UCS2*, Py_ssize_t, @@ -8396,7 +8408,10 @@ switch(kind) { case PyUnicode_1BYTE_KIND: - result = ucs1(buf1, len1, buf2, len2, start, end); + if (PyUnicode_IS_ASCII(s1) && PyUnicode_IS_ASCII(s2)) + result = ascii(buf1, len1, buf2, len2, start, end); + else + result = ucs1(buf1, len1, buf2, len2, start, end); break; case PyUnicode_2BYTE_KIND: result = ucs2(buf1, len1, buf2, len2, start, end); @@ -8417,7 +8432,7 @@ } Py_ssize_t -_PyUnicode_InsertThousandsGrouping(int kind, void *data, +_PyUnicode_InsertThousandsGrouping(PyObject *unicode, int kind, void *data, Py_ssize_t n_buffer, void *digits, Py_ssize_t n_digits, Py_ssize_t min_width, @@ -8426,9 +8441,14 @@ { switch(kind) { case PyUnicode_1BYTE_KIND: - return _PyUnicode_ucs1_InsertThousandsGrouping( - (Py_UCS1*)data, n_buffer, (Py_UCS1*)digits, 
n_digits, - min_width, grouping, thousands_sep); + if (unicode != NULL && PyUnicode_IS_ASCII(unicode)) + return _PyUnicode_ascii_InsertThousandsGrouping( + (Py_UCS1*)data, n_buffer, (Py_UCS1*)digits, n_digits, + min_width, grouping, thousands_sep); + else + return _PyUnicode_ucs1_InsertThousandsGrouping( + (Py_UCS1*)data, n_buffer, (Py_UCS1*)digits, n_digits, + min_width, grouping, thousands_sep); case PyUnicode_2BYTE_KIND: return _PyUnicode_ucs2_InsertThousandsGrouping( (Py_UCS2*)data, n_buffer, (Py_UCS2*)digits, n_digits, @@ -8505,10 +8525,16 @@ ADJUST_INDICES(start, end, len1); switch(kind) { case PyUnicode_1BYTE_KIND: - result = ucs1lib_count( - ((Py_UCS1*)buf1) + start, end - start, - buf2, len2, PY_SSIZE_T_MAX - ); + if (PyUnicode_IS_ASCII(str_obj) && PyUnicode_IS_ASCII(sub_obj)) + result = asciilib_count( + ((Py_UCS1*)buf1) + start, end - start, + buf2, len2, PY_SSIZE_T_MAX + ); + else + result = ucs1lib_count( + ((Py_UCS1*)buf1) + start, end - start, + buf2, len2, PY_SSIZE_T_MAX + ); break; case PyUnicode_2BYTE_KIND: result = ucs2lib_count( @@ -8565,12 +8591,14 @@ if (direction > 0) result = any_find_slice( - ucs1lib_find_slice, ucs2lib_find_slice, ucs4lib_find_slice, + asciilib_find_slice, ucs1lib_find_slice, + ucs2lib_find_slice, ucs4lib_find_slice, str, sub, start, end ); else result = any_find_slice( - ucs1lib_rfind_slice, ucs2lib_rfind_slice, ucs4lib_rfind_slice, + asciilib_rfind_slice, ucs1lib_rfind_slice, + ucs2lib_rfind_slice, ucs4lib_rfind_slice, str, sub, start, end ); @@ -9200,9 +9228,14 @@ switch(PyUnicode_KIND(string)) { case PyUnicode_1BYTE_KIND: - list = ucs1lib_splitlines( - (PyObject*) string, PyUnicode_1BYTE_DATA(string), - PyUnicode_GET_LENGTH(string), keepends); + if (PyUnicode_IS_ASCII(string)) + list = asciilib_splitlines( + (PyObject*) string, PyUnicode_1BYTE_DATA(string), + PyUnicode_GET_LENGTH(string), keepends); + else + list = ucs1lib_splitlines( + (PyObject*) string, PyUnicode_1BYTE_DATA(string), + PyUnicode_GET_LENGTH(string),
keepends); break; case PyUnicode_2BYTE_KIND: list = ucs2lib_splitlines( @@ -9241,10 +9274,16 @@ if (substring == NULL) switch(PyUnicode_KIND(self)) { case PyUnicode_1BYTE_KIND: - return ucs1lib_split_whitespace( - (PyObject*) self, PyUnicode_1BYTE_DATA(self), - PyUnicode_GET_LENGTH(self), maxcount - ); + if (PyUnicode_IS_ASCII(self)) + return asciilib_split_whitespace( + (PyObject*) self, PyUnicode_1BYTE_DATA(self), + PyUnicode_GET_LENGTH(self), maxcount + ); + else + return ucs1lib_split_whitespace( + (PyObject*) self, PyUnicode_1BYTE_DATA(self), + PyUnicode_GET_LENGTH(self), maxcount + ); case PyUnicode_2BYTE_KIND: return ucs2lib_split_whitespace( (PyObject*) self, PyUnicode_2BYTE_DATA(self), @@ -9283,8 +9322,12 @@ switch(kind) { case PyUnicode_1BYTE_KIND: - out = ucs1lib_split( - (PyObject*) self, buf1, len1, buf2, len2, maxcount); + if (PyUnicode_IS_ASCII(self) && PyUnicode_IS_ASCII(substring)) + out = asciilib_split( + (PyObject*) self, buf1, len1, buf2, len2, maxcount); + else + out = ucs1lib_split( + (PyObject*) self, buf1, len1, buf2, len2, maxcount); break; case PyUnicode_2BYTE_KIND: out = ucs2lib_split( @@ -9323,10 +9366,16 @@ if (substring == NULL) switch(PyUnicode_KIND(self)) { case PyUnicode_1BYTE_KIND: - return ucs1lib_rsplit_whitespace( - (PyObject*) self, PyUnicode_1BYTE_DATA(self), - PyUnicode_GET_LENGTH(self), maxcount - ); + if (PyUnicode_IS_ASCII(self)) + return asciilib_rsplit_whitespace( + (PyObject*) self, PyUnicode_1BYTE_DATA(self), + PyUnicode_GET_LENGTH(self), maxcount + ); + else + return ucs1lib_rsplit_whitespace( + (PyObject*) self, PyUnicode_1BYTE_DATA(self), + PyUnicode_GET_LENGTH(self), maxcount + ); case PyUnicode_2BYTE_KIND: return ucs2lib_rsplit_whitespace( (PyObject*) self, PyUnicode_2BYTE_DATA(self), @@ -9365,8 +9414,12 @@ switch(kind) { case PyUnicode_1BYTE_KIND: - out = ucs1lib_rsplit( - (PyObject*) self, buf1, len1, buf2, len2, maxcount); + if (PyUnicode_IS_ASCII(self) && PyUnicode_IS_ASCII(substring)) + out = 
asciilib_rsplit( + (PyObject*) self, buf1, len1, buf2, len2, maxcount); + else + out = ucs1lib_rsplit( + (PyObject*) self, buf1, len1, buf2, len2, maxcount); break; case PyUnicode_2BYTE_KIND: out = ucs2lib_rsplit( @@ -9387,12 +9440,15 @@ } static Py_ssize_t -anylib_find(int kind, void *buf1, Py_ssize_t len1, - void *buf2, Py_ssize_t len2, Py_ssize_t offset) +anylib_find(int kind, PyObject *str1, void *buf1, Py_ssize_t len1, + PyObject *str2, void *buf2, Py_ssize_t len2, Py_ssize_t offset) { switch(kind) { case PyUnicode_1BYTE_KIND: - return ucs1lib_find(buf1, len1, buf2, len2, offset); + if (PyUnicode_IS_ASCII(str1) && PyUnicode_IS_ASCII(str2)) + return asciilib_find(buf1, len1, buf2, len2, offset); + else + return ucs1lib_find(buf1, len1, buf2, len2, offset); case PyUnicode_2BYTE_KIND: return ucs2lib_find(buf1, len1, buf2, len2, offset); case PyUnicode_4BYTE_KIND: @@ -9403,12 +9459,15 @@ } static Py_ssize_t -anylib_count(int kind, void* sbuf, Py_ssize_t slen, - void *buf1, Py_ssize_t len1, Py_ssize_t maxcount) +anylib_count(int kind, PyObject *sstr, void* sbuf, Py_ssize_t slen, + PyObject *str1, void *buf1, Py_ssize_t len1, Py_ssize_t maxcount) { switch(kind) { case PyUnicode_1BYTE_KIND: - return ucs1lib_count(sbuf, slen, buf1, len1, maxcount); + if (PyUnicode_IS_ASCII(sstr) && PyUnicode_IS_ASCII(str1)) + return asciilib_count(sbuf, slen, buf1, len1, maxcount); + else + return ucs1lib_count(sbuf, slen, buf1, len1, maxcount); case PyUnicode_2BYTE_KIND: return ucs2lib_count(sbuf, slen, buf1, len1, maxcount); case PyUnicode_4BYTE_KIND: @@ -9497,7 +9556,7 @@ if (!buf1) goto error; release1 = 1; } - i = anylib_find(rkind, sbuf, slen, buf1, len1, 0); + i = anylib_find(rkind, self, sbuf, slen, str1, buf1, len1, 0); if (i < 0) goto nothing; if (rkind > kind2) { @@ -9530,9 +9589,9 @@ i += len1; while ( --maxcount > 0) { - i = anylib_find(rkind, sbuf+PyUnicode_KIND_SIZE(rkind, i), - slen-i, - buf1, len1, i); + i = anylib_find(rkind, self, + sbuf+PyUnicode_KIND_SIZE(rkind, 
i), slen-i, + str1, buf1, len1, i); if (i == -1) break; memcpy(res + PyUnicode_KIND_SIZE(rkind, i), @@ -9557,7 +9616,7 @@ if (!buf1) goto error; release1 = 1; } - n = anylib_count(rkind, sbuf, slen, buf1, len1, maxcount); + n = anylib_count(rkind, self, sbuf, slen, str1, buf1, len1, maxcount); if (n == 0) goto nothing; if (kind2 < rkind) { @@ -9596,9 +9655,9 @@ if (len1 > 0) { while (n-- > 0) { /* look for next match */ - j = anylib_find(rkind, - sbuf + PyUnicode_KIND_SIZE(rkind, i), - slen-i, buf1, len1, i); + j = anylib_find(rkind, self, + sbuf + PyUnicode_KIND_SIZE(rkind, i), slen-i, + str1, buf1, len1, i); if (j == -1) break; else if (j > i) { @@ -10443,7 +10502,8 @@ return NULL; result = any_find_slice( - ucs1lib_find_slice, ucs2lib_find_slice, ucs4lib_find_slice, + asciilib_find_slice, ucs1lib_find_slice, + ucs2lib_find_slice, ucs4lib_find_slice, self, (PyObject*)substring, start, end ); @@ -10536,7 +10596,8 @@ return NULL; result = any_find_slice( - ucs1lib_find_slice, ucs2lib_find_slice, ucs4lib_find_slice, + asciilib_find_slice, ucs1lib_find_slice, + ucs2lib_find_slice, ucs4lib_find_slice, self, (PyObject*)substring, start, end ); @@ -11548,7 +11609,8 @@ return NULL; result = any_find_slice( - ucs1lib_rfind_slice, ucs2lib_rfind_slice, ucs4lib_rfind_slice, + asciilib_rfind_slice, ucs1lib_rfind_slice, + ucs2lib_rfind_slice, ucs4lib_rfind_slice, self, (PyObject*)substring, start, end ); @@ -11583,7 +11645,8 @@ return NULL; result = any_find_slice( - ucs1lib_rfind_slice, ucs2lib_rfind_slice, ucs4lib_rfind_slice, + asciilib_rfind_slice, ucs1lib_rfind_slice, + ucs2lib_rfind_slice, ucs4lib_rfind_slice, self, (PyObject*)substring, start, end ); @@ -11712,7 +11775,10 @@ switch(PyUnicode_KIND(str_obj)) { case PyUnicode_1BYTE_KIND: - out = ucs1lib_partition(str_obj, buf1, len1, sep_obj, buf2, len2); + if (PyUnicode_IS_ASCII(str_obj) && PyUnicode_IS_ASCII(sep_obj)) + out = asciilib_partition(str_obj, buf1, len1, sep_obj, buf2, len2); + else + out = 
ucs1lib_partition(str_obj, buf1, len1, sep_obj, buf2, len2);
         break;
     case PyUnicode_2BYTE_KIND:
         out = ucs2lib_partition(str_obj, buf1, len1, sep_obj, buf2, len2);
@@ -11781,7 +11847,10 @@
     switch(PyUnicode_KIND(str_in)) {
     case PyUnicode_1BYTE_KIND:
-        out = ucs1lib_rpartition(str_obj, buf1, len1, sep_obj, buf2, len2);
+        if (PyUnicode_IS_ASCII(str_obj) && PyUnicode_IS_ASCII(sep_obj))
+            out = asciilib_rpartition(str_obj, buf1, len1, sep_obj, buf2, len2);
+        else
+            out = ucs1lib_rpartition(str_obj, buf1, len1, sep_obj, buf2, len2);
         break;
     case PyUnicode_2BYTE_KIND:
         out = ucs2lib_rpartition(str_obj, buf1, len1, sep_obj, buf2, len2);

diff --git a/Python/formatter_unicode.c b/Python/formatter_unicode.c
--- a/Python/formatter_unicode.c
+++ b/Python/formatter_unicode.c
@@ -501,7 +501,7 @@
         spec->n_grouped_digits = 0;
     else
         spec->n_grouped_digits = _PyUnicode_InsertThousandsGrouping(
-            PyUnicode_1BYTE_KIND, NULL, 0, NULL,
+            NULL, PyUnicode_1BYTE_KIND, NULL, 0, NULL,
             spec->n_digits, spec->n_min_width,
             locale->grouping, locale->thousands_sep);
@@ -603,7 +603,7 @@
     r =
 #endif
         _PyUnicode_InsertThousandsGrouping(
-            kind,
+            out, kind,
             (char*)data + PyUnicode_KIND_SIZE(kind, pos),
             spec->n_grouped_digits,
             pdigits + PyUnicode_KIND_SIZE(kind, d_pos),

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Wed Oct 5 22:29:05 2011
From: python-checkins at python.org (amaury.forgeotdarc)
Date: Wed, 05 Oct 2011 22:29:05 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_a_few_ResourceWarnings_?= =?utf8?q?in_idle?=
Message-ID:

http://hg.python.org/cpython/rev/3a5a0943b201
changeset:   72706:3a5a0943b201
user:        Amaury Forgeot d'Arc
date:        Mon Oct 03 20:33:24 2011 +0200
summary:
  Fix a few ResourceWarnings in idle

files:
  Lib/idlelib/configHandler.py |  3 ++-
  Lib/idlelib/rpc.py           |  4 ++++
  2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/Lib/idlelib/configHandler.py b/Lib/idlelib/configHandler.py
--- a/Lib/idlelib/configHandler.py
+++ b/Lib/idlelib/configHandler.py
@@ -145,7 +145,8 @@
             except IOError:
                 os.unlink(fname)
             cfgFile = open(fname, 'w')
-            self.write(cfgFile)
+            with cfgFile:
+                self.write(cfgFile)
         else:
             self.RemoveFile()

diff --git a/Lib/idlelib/rpc.py b/Lib/idlelib/rpc.py
--- a/Lib/idlelib/rpc.py
+++ b/Lib/idlelib/rpc.py
@@ -534,6 +534,10 @@
     def get_remote_proxy(self, oid):
         return RPCProxy(self, oid)

+    def close(self):
+        self.listening_sock.close()
+        SocketIO.close(self)
+

 class RPCProxy(object):

     __methods = None

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Wed Oct 5 22:37:58 2011
From: python-checkins at python.org (amaury.forgeotdarc)
Date: Wed, 05 Oct 2011 22:37:58 +0200
Subject: [Python-checkins] =?utf8?q?cpython_=282=2E7=29=3A_Enable_the_only?= =?utf8?q?_tests_for_sys=2Egettrace?=
Message-ID:

http://hg.python.org/cpython/rev/16c4137a413c
changeset:   72707:16c4137a413c
branch:      2.7
parent:      72694:64fae6f7b64c
user:        Amaury Forgeot d'Arc
date:        Wed Oct 05 22:34:51 2011 +0200
summary:
  Enable the only tests for sys.gettrace

files:
  Lib/test/test_sys_settrace.py |  4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Lib/test/test_sys_settrace.py b/Lib/test/test_sys_settrace.py
--- a/Lib/test/test_sys_settrace.py
+++ b/Lib/test/test_sys_settrace.py
@@ -282,11 +282,11 @@
         self.compare_events(func.func_code.co_firstlineno,
                             tracer.events, func.events)

-    def set_and_retrieve_none(self):
+    def test_set_and_retrieve_none(self):
         sys.settrace(None)
         assert sys.gettrace() is None

-    def set_and_retrieve_func(self):
+    def test_set_and_retrieve_func(self):
         def fn(*args):
             pass

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Wed Oct 5 22:37:59 2011
From: python-checkins at python.org (amaury.forgeotdarc)
Date: Wed, 05 Oct 2011 22:37:59 +0200
Subject: [Python-checkins] =?utf8?q?cpython_=283=2E2=29=3A_Enable_the_only?= =?utf8?q?_tests_for_sys=2Egettrace?=
Message-ID:

http://hg.python.org/cpython/rev/a0393cbe4872
changeset:   72708:a0393cbe4872
branch:      3.2
parent:      72701:d60c00015f01
user:        Amaury Forgeot d'Arc
date:        Wed Oct 05 22:36:05 2011 +0200
summary:
  Enable the only tests for sys.gettrace

files:
  Lib/test/test_sys_settrace.py |  4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Lib/test/test_sys_settrace.py b/Lib/test/test_sys_settrace.py
--- a/Lib/test/test_sys_settrace.py
+++ b/Lib/test/test_sys_settrace.py
@@ -282,11 +282,11 @@
         self.compare_events(func.__code__.co_firstlineno,
                             tracer.events, func.events)

-    def set_and_retrieve_none(self):
+    def test_set_and_retrieve_none(self):
         sys.settrace(None)
         assert sys.gettrace() is None

-    def set_and_retrieve_func(self):
+    def test_set_and_retrieve_func(self):
         def fn(*args):
             pass

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Wed Oct 5 22:38:00 2011
From: python-checkins at python.org (amaury.forgeotdarc)
Date: Wed, 05 Oct 2011 22:38:00 +0200
Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Merge_from_3=2E2?=
Message-ID:

http://hg.python.org/cpython/rev/e9cf6e6d6b1f
changeset:   72709:e9cf6e6d6b1f
parent:      72706:3a5a0943b201
parent:      72708:a0393cbe4872
user:        Amaury Forgeot d'Arc
date:        Wed Oct 05 22:37:06 2011 +0200
summary:
  Merge from 3.2

files:
  Lib/test/test_sys_settrace.py |  4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Lib/test/test_sys_settrace.py b/Lib/test/test_sys_settrace.py
--- a/Lib/test/test_sys_settrace.py
+++ b/Lib/test/test_sys_settrace.py
@@ -283,11 +283,11 @@
         self.compare_events(func.__code__.co_firstlineno,
                             tracer.events, func.events)

-    def set_and_retrieve_none(self):
+    def test_set_and_retrieve_none(self):
         sys.settrace(None)
         assert sys.gettrace() is None

-    def set_and_retrieve_func(self):
+    def test_set_and_retrieve_func(self):
         def fn(*args):
             pass

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Wed Oct 5 22:44:12 2011
From: python-checkins at python.org (victor.stinner)
Date: Wed, 05 Oct 2011 22:44:12 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_traceback=3A_fix_dump=5Fasc?= =?utf8?q?ii=28=29_for_string_with_kind=3DPyUnicode=5FWCHAR=5FKIND?=
Message-ID:

http://hg.python.org/cpython/rev/2a8ccff8f337
changeset:   72710:2a8ccff8f337
user:        Victor Stinner
date:        Wed Oct 05 22:44:12 2011 +0200
summary:
  traceback: fix dump_ascii() for string with kind=PyUnicode_WCHAR_KIND

files:
  Python/traceback.c |  16 +++++++++++++---
  1 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/Python/traceback.c b/Python/traceback.c
--- a/Python/traceback.c
+++ b/Python/traceback.c
@@ -483,7 +483,8 @@
     Py_ssize_t i, size;
     int truncated;
     int kind;
-    void *data;
+    void *data = NULL;
+    wchar_t *wstr = NULL;
     Py_UCS4 ch;

     size = ascii->length;
@@ -494,11 +495,17 @@
         else
             data = ((PyCompactUnicodeObject*)text) + 1;
     }
-    else {
+    else if (kind != PyUnicode_WCHAR_KIND) {
         data = ((PyUnicodeObject *)text)->data.any;
         if (data == NULL)
             return;
     }
+    else {
+        wstr = ((PyASCIIObject *)text)->wstr;
+        if (wstr == NULL)
+            return;
+        size = ((PyCompactUnicodeObject *)text)->wstr_length;
+    }

     if (MAX_STRING_LENGTH < size) {
         size = MAX_STRING_LENGTH;
@@ -508,7 +515,10 @@
     truncated = 0;

     for (i=0; i < size; i++) {
-        ch = PyUnicode_READ(kind, data, i);
+        if (kind != PyUnicode_WCHAR_KIND)
+            ch = PyUnicode_READ(kind, data, i);
+        else
+            ch = wstr[i];
         if (ch < 128) {
             char c = (char)ch;
             write(fd, &c, 1);

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 01:50:59 2011
From: python-checkins at python.org (victor.stinner)
Date: Thu, 06 Oct 2011 01:50:59 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_unicode=5Ffromascii=28=29_c?= =?utf8?q?hecks_that_the_input_is_ASCII_in_debug_mode?=
Message-ID:

http://hg.python.org/cpython/rev/2bf53a7e253f
changeset:   72711:2bf53a7e253f
user:        Victor Stinner
date:        Wed Oct 05 23:26:01 2011 +0200
summary:
  unicode_fromascii() checks that the input is ASCII in debug mode

files:
  Objects/unicodeobject.c |  16 ++++++++++++----
  1 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
--- a/Objects/unicodeobject.c
+++ b/Objects/unicodeobject.c
@@ -1537,12 +1537,20 @@
 }

 static PyObject*
-unicode_fromascii(const unsigned char* u, Py_ssize_t size)
-{
-    PyObject *res = PyUnicode_New(size, 127);
+unicode_fromascii(const unsigned char* s, Py_ssize_t size)
+{
+    PyObject *res;
+#ifdef Py_DEBUG
+    const unsigned char *p;
+    const unsigned char *end = s + size;
+    for (p=s; p < end; p++) {
+        assert(*p < 128);
+    }
+#endif
+    res = PyUnicode_New(size, 127);
     if (!res)
         return NULL;
-    memcpy(PyUnicode_1BYTE_DATA(res), u, size);
+    memcpy(PyUnicode_1BYTE_DATA(res), s, size);
     return res;
 }

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 01:51:00 2011
From: python-checkins at python.org (victor.stinner)
Date: Thu, 06 Oct 2011 01:51:00 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_replace=28=29_uses_unicode?= =?utf8?q?=5Ffromascii=28=29_if_the_input_and_replace_string_is_ASCII?=
Message-ID:

http://hg.python.org/cpython/rev/b2330e70b41e
changeset:   72712:b2330e70b41e
user:        Victor Stinner
date:        Wed Oct 05 23:27:08 2011 +0200
summary:
  replace() uses unicode_fromascii() if the input and replace string is ASCII

files:
  Objects/unicodeobject.c |  5 ++++-
  1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
--- a/Objects/unicodeobject.c
+++ b/Objects/unicodeobject.c
@@ -9708,7 +9708,10 @@
                    sbuf + PyUnicode_KIND_SIZE(rkind, i),
                    PyUnicode_KIND_SIZE(rkind, slen-i));
         }
-        u = PyUnicode_FromKindAndData(rkind, res, new_size);
+        if (PyUnicode_IS_ASCII(self) && PyUnicode_IS_ASCII(str2))
+            u = unicode_fromascii((unsigned char*)res, new_size);
+        else
+            u = PyUnicode_FromKindAndData(rkind, res, new_size);
         PyMem_Free(res);
     }
     if (srelease)

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 01:51:01 2011
From: python-checkins at python.org (victor.stinner)
Date: Thu, 06 Oct 2011 01:51:01 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_post-condition_in_unico?= =?utf8?q?de=5Frepr=28=29=3A_check_the_result=2C_not_the_input?=
Message-ID:

http://hg.python.org/cpython/rev/18e5c247c625
changeset:   72713:18e5c247c625
user:        Victor Stinner
date:        Thu Oct 06 01:13:58 2011 +0200
summary:
  Fix post-condition in unicode_repr(): check the result, not the input

files:
  Objects/unicodeobject.c |  2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
--- a/Objects/unicodeobject.c
+++ b/Objects/unicodeobject.c
@@ -11589,7 +11589,7 @@
         }
     }
     /* Closing quote already added at the beginning */
-    assert(_PyUnicode_CheckConsistency(unicode, 1));
+    assert(_PyUnicode_CheckConsistency(repr, 1));
     return repr;
 }

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 01:51:02 2011
From: python-checkins at python.org (victor.stinner)
Date: Thu, 06 Oct 2011 01:51:02 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_Don=27t_check_for_the_maxim?= =?utf8?q?um_character_when_copying_from_unicodeobject=2Ec?=
Message-ID:

http://hg.python.org/cpython/rev/6f03716079a9
changeset:   72714:6f03716079a9
user:        Victor Stinner
date:        Thu Oct 06 01:45:57 2011 +0200
summary:
  Don't check for the maximum character when copying from unicodeobject.c

  * Create copy_characters() function which doesn't check for the maximum
    character in release mode
  * _PyUnicode_CheckConsistency() is no more static to be able to use it
    in _PyUnicode_FormatAdvanced() (in formatter_unicode.c)
  * _PyUnicode_CheckConsistency() checks the string hash

files:
  Include/unicodeobject.h    |    7 +
  Objects/unicodeobject.c    |  378 ++++++++++++------------
  Python/formatter_unicode.c |   16 +-
  3 files changed, 203 insertions(+), 198 deletions(-)

diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h
--- a/Include/unicodeobject.h
+++ b/Include/unicodeobject.h
@@ -2030,6 +2030,13 @@
     );
 #endif /* Py_LIMITED_API */

+#if defined(Py_DEBUG) && !defined(Py_LIMITED_API)
+/* FIXME: use PyObject* type for op */
+PyAPI_FUNC(int) _PyUnicode_CheckConsistency(
+    void *op,
+    int check_content);
+#endif
+
 #ifdef __cplusplus
 }
 #endif

diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
--- a/Objects/unicodeobject.c
+++ b/Objects/unicodeobject.c
@@ -239,6 +239,11 @@
 /* forward */
 static PyUnicodeObject *_PyUnicode_New(Py_ssize_t length);
 static PyObject* get_latin1_char(unsigned char ch);
+static void copy_characters(
+    PyObject *to, Py_ssize_t to_start,
+    PyObject *from, Py_ssize_t from_start,
+    Py_ssize_t how_many);
+static int unicode_is_singleton(PyObject *unicode);

 static PyObject *
 unicode_encode_call_errorhandler(const char *errors,
@@ -296,7 +301,7 @@
 }

 #ifdef Py_DEBUG
-static int
+int
 /* FIXME: use PyObject* type for op */
 _PyUnicode_CheckConsistency(void *op, int check_content)
 {
@@ -395,6 +400,8 @@
         else
             assert(maxchar >= 0x10000);
     }
+    if (check_content && !unicode_is_singleton((PyObject*)ascii))
+        assert(ascii->hash == -1);
     return 1;
 }
 #endif
@@ -601,13 +608,7 @@
             return NULL;
         copy_length = Py_MIN(length, PyUnicode_GET_LENGTH(unicode));
-        if (PyUnicode_CopyCharacters(copy, 0,
-                                     unicode, 0,
-                                     copy_length) < 0)
-        {
-            Py_DECREF(copy);
-            return NULL;
-        }
+        copy_characters(copy, 0, unicode, 0, copy_length);
         return copy;
     }
     else {
@@ -953,47 +954,55 @@
     return 0;
 }

-Py_ssize_t
-PyUnicode_CopyCharacters(PyObject *to, Py_ssize_t to_start,
-                         PyObject *from, Py_ssize_t from_start,
-                         Py_ssize_t how_many)
+static int
+_copy_characters(PyObject *to, Py_ssize_t to_start,
+                 PyObject *from, Py_ssize_t from_start,
+                 Py_ssize_t how_many, int check_maxchar)
 {
     unsigned int from_kind, to_kind;
     void *from_data, *to_data;
-
-    if (!PyUnicode_Check(from) || !PyUnicode_Check(to)) {
-        PyErr_BadInternalCall();
-        return -1;
-    }
-
-    if (PyUnicode_READY(from))
-        return -1;
-    if (PyUnicode_READY(to))
-        return -1;
-
-    how_many = Py_MIN(PyUnicode_GET_LENGTH(from), how_many);
-    if (to_start + how_many > PyUnicode_GET_LENGTH(to)) {
-        PyErr_Format(PyExc_SystemError,
-                     "Cannot write %zi characters at %zi "
-                     "in a string of %zi characters",
-                     how_many, to_start, PyUnicode_GET_LENGTH(to));
-        return -1;
-    }
+    int fast;
+
+    assert(PyUnicode_Check(from));
+    assert(PyUnicode_Check(to));
+    assert(PyUnicode_IS_READY(from));
+    assert(PyUnicode_IS_READY(to));
+
+    assert(PyUnicode_GET_LENGTH(from) >= how_many);
+    assert(to_start + how_many <= PyUnicode_GET_LENGTH(to));
+    assert(0 <= how_many);
+
     if (how_many == 0)
         return 0;

-    if (_PyUnicode_Dirty(to))
-        return -1;
-
     from_kind = PyUnicode_KIND(from);
     from_data = PyUnicode_DATA(from);
     to_kind = PyUnicode_KIND(to);
     to_data = PyUnicode_DATA(to);

-    if (from_kind == to_kind
+#ifdef Py_DEBUG
+    if (!check_maxchar
+        && (from_kind > to_kind
+            || (!PyUnicode_IS_ASCII(from) && PyUnicode_IS_ASCII(to))))
+    {
+        const Py_UCS4 to_maxchar = PyUnicode_MAX_CHAR_VALUE(to);
+        Py_UCS4 ch;
+        Py_ssize_t i;
+        for (i=0; i < how_many; i++) {
+            ch = PyUnicode_READ(from_kind, from_data, from_start + i);
+            assert(ch <= to_maxchar);
+        }
+    }
+#endif
+    fast = (from_kind == to_kind);
+    if (check_maxchar
+        && (!PyUnicode_IS_ASCII(from) && PyUnicode_IS_ASCII(to)))
+    {
         /* deny latin1 => ascii */
-        && !(!PyUnicode_IS_ASCII(from) && PyUnicode_IS_ASCII(to)))
-    {
+        fast = 0;
+    }
+
+    if (fast) {
         Py_MEMCPY((char*)to_data
                       + PyUnicode_KIND_SIZE(to_kind, to_start),
                   (char*)from_data
@@ -1031,8 +1040,6 @@
         );
     }
     else {
-        int invalid_kinds;
-
         /* check if max_char(from substring) <= max_char(to) */
         if (from_kind > to_kind
             /* latin1 => ascii */
@@ -1040,34 +1047,77 @@
         {
             /* slow path to check for character overflow */
             const Py_UCS4 to_maxchar = PyUnicode_MAX_CHAR_VALUE(to);
-            Py_UCS4 ch, maxchar;
+            Py_UCS4 ch;
             Py_ssize_t i;

-            maxchar = 0;
-            invalid_kinds = 0;
             for (i=0; i < how_many; i++) {
                 ch = PyUnicode_READ(from_kind, from_data, from_start + i);
-                if (ch > maxchar) {
-                    maxchar = ch;
-                    if (maxchar > to_maxchar) {
-                        invalid_kinds = 1;
-                        break;
-                    }
+                if (check_maxchar) {
+                    if (ch > to_maxchar)
+                        return 1;
+                }
+                else {
+                    assert(ch <= to_maxchar);
                 }
                 PyUnicode_WRITE(to_kind, to_data, to_start + i, ch);
             }
         }
-        else
-            invalid_kinds = 1;
-        if (invalid_kinds) {
-            PyErr_Format(PyExc_SystemError,
-                         "Cannot copy %s characters "
-                         "into a string of %s characters",
-                         unicode_kind_name(from),
-                         unicode_kind_name(to));
+        else {
             return -1;
         }
     }
+    return 0;
+}
+
+static void
+copy_characters(PyObject *to, Py_ssize_t to_start,
+                PyObject *from, Py_ssize_t from_start,
+                Py_ssize_t how_many)
+{
+    (void)_copy_characters(to, to_start, from, from_start, how_many, 0);
+}
+
+Py_ssize_t
+PyUnicode_CopyCharacters(PyObject *to, Py_ssize_t to_start,
+                         PyObject *from, Py_ssize_t from_start,
+                         Py_ssize_t how_many)
+{
+    int err;
+
+    if (!PyUnicode_Check(from) || !PyUnicode_Check(to)) {
+        PyErr_BadInternalCall();
+        return -1;
+    }
+
+    if (PyUnicode_READY(from))
+        return -1;
+    if (PyUnicode_READY(to))
+        return -1;
+
+    how_many = Py_MIN(PyUnicode_GET_LENGTH(from), how_many);
+    if (to_start + how_many > PyUnicode_GET_LENGTH(to)) {
+        PyErr_Format(PyExc_SystemError,
+                     "Cannot write %zi characters at %zi "
+                     "in a string of %zi characters",
+                     how_many, to_start, PyUnicode_GET_LENGTH(to));
+        return -1;
+    }
+
+    if (how_many == 0)
+        return 0;
+
+    if (_PyUnicode_Dirty(to))
+        return -1;
+
+    err = _copy_characters(to, to_start, from, from_start, how_many, 1);
+    if (err) {
+        PyErr_Format(PyExc_SystemError,
+                     "Cannot copy %s characters "
+                     "into a string of %s characters",
+                     unicode_kind_name(from),
+                     unicode_kind_name(to));
+        return -1;
+    }
     return how_many;
 }

@@ -1327,6 +1377,23 @@
     }
 }

+#ifdef Py_DEBUG
+static int
+unicode_is_singleton(PyObject *unicode)
+{
+    PyASCIIObject *ascii = (PyASCIIObject *)unicode;
+    if (unicode == unicode_empty)
+        return 1;
+    if (ascii->state.kind != PyUnicode_WCHAR_KIND && ascii->length == 1)
+    {
+        Py_UCS4 ch = PyUnicode_READ_CHAR(unicode, 0);
+        if (ch < 256 && unicode_latin1[ch] == unicode)
+            return 1;
+    }
+    return 0;
+}
+#endif
+
 static int
 unicode_resizable(PyObject *unicode)
 {
@@ -1334,15 +1401,9 @@
         return 0;
     if (PyUnicode_CHECK_INTERNED(unicode))
         return 0;
-    assert(unicode != unicode_empty);
 #ifdef Py_DEBUG
-    if (_PyUnicode_KIND(unicode) != PyUnicode_WCHAR_KIND
-        && PyUnicode_GET_LENGTH(unicode) == 1)
-    {
-        Py_UCS4 ch = PyUnicode_READ_CHAR(unicode, 0);
-        if (ch < 256 && unicode_latin1[ch] == unicode)
-            return 0;
-    }
+    /* singleton refcount is greater than 1 */
+    assert(!unicode_is_singleton(unicode));
 #endif
     return 1;
 }
@@ -1971,7 +2032,7 @@
     int precision = 0;
     int zeropad;
     const char* f;
-    PyUnicodeObject *string;
+    PyObject *string;
     /* used by sprintf */
     char fmt[61]; /* should be enough for %0width.precisionlld */
     Py_UCS4 maxchar = 127; /* result is ASCII by default */
@@ -2270,7 +2331,7 @@
     /* Since we've analyzed how much space we need,
        we don't have to resize the string.
        There can be no errors beyond this point. */
-    string = (PyUnicodeObject *)PyUnicode_New(n, maxchar);
+    string = PyUnicode_New(n, maxchar);
     if (!string)
         goto fail;
     kind = PyUnicode_KIND(string);
@@ -2321,10 +2382,7 @@
                 (void) va_arg(vargs, char *);
                 size = PyUnicode_GET_LENGTH(*callresult);
                 assert(PyUnicode_KIND(*callresult) <= PyUnicode_KIND(string));
-                if (PyUnicode_CopyCharacters((PyObject*)string, i,
-                                             *callresult, 0,
-                                             size) < 0)
-                    goto fail;
+                copy_characters(string, i, *callresult, 0, size);
                 i += size;
                 /* We're done with the unicode()/repr() => forget it */
                 Py_DECREF(*callresult);
@@ -2338,10 +2396,7 @@
                 Py_ssize_t size;
                 assert(PyUnicode_KIND(obj) <= PyUnicode_KIND(string));
                 size = PyUnicode_GET_LENGTH(obj);
-                if (PyUnicode_CopyCharacters((PyObject*)string, i,
-                                             obj, 0,
-                                             size) < 0)
-                    goto fail;
+                copy_characters(string, i, obj, 0, size);
                 i += size;
                 break;
             }
@@ -2353,19 +2408,13 @@
                 if (obj) {
                     size = PyUnicode_GET_LENGTH(obj);
                     assert(PyUnicode_KIND(obj) <= PyUnicode_KIND(string));
-                    if (PyUnicode_CopyCharacters((PyObject*)string, i,
-                                                 obj, 0,
-                                                 size) < 0)
-                        goto fail;
+                    copy_characters(string, i, obj, 0, size);
                     i += size;
                 }
                 else {
                     size = PyUnicode_GET_LENGTH(*callresult);
                     assert(PyUnicode_KIND(*callresult) <= PyUnicode_KIND(string));
-                    if (PyUnicode_CopyCharacters((PyObject*)string, i,
-                                                 *callresult,
-                                                 0, size) < 0)
-                        goto fail;
+                    copy_characters(string, i, *callresult, 0, size);
                     i += size;
                     Py_DECREF(*callresult);
                 }
@@ -2376,14 +2425,12 @@
             case 'R':
             case 'A':
             {
+                Py_ssize_t size = PyUnicode_GET_LENGTH(*callresult);
                 /* unused, since we already have the result */
                 (void) va_arg(vargs, PyObject *);
                 assert(PyUnicode_KIND(*callresult) <= PyUnicode_KIND(string));
-                if (PyUnicode_CopyCharacters((PyObject*)string, i,
-                                             *callresult, 0,
-                                             PyUnicode_GET_LENGTH(*callresult)) < 0)
-                    goto fail;
-                i += PyUnicode_GET_LENGTH(*callresult);
+                copy_characters(string, i, *callresult, 0, size);
+                i += size;
                 /* We're done with the unicode()/repr() => forget it */
                 Py_DECREF(*callresult);
                 /* switch to next unicode()/repr() result */
@@ -8795,24 +8842,12 @@
         /* If the maxchar increased so that the kind changed, not all
            characters are representable anymore and we need to fix the
           string again. This only happens in very few cases. */
-        if (PyUnicode_CopyCharacters(v, 0,
-                                     (PyObject*)self, 0,
-                                     PyUnicode_GET_LENGTH(self)) < 0)
-        {
-            Py_DECREF(u);
-            return NULL;
-        }
+        copy_characters(v, 0, self, 0, PyUnicode_GET_LENGTH(self));
         maxchar_old = fixfct(v);
         assert(maxchar_old > 0 && maxchar_old <= maxchar_new);
     }
     else {
-        if (PyUnicode_CopyCharacters(v, 0,
-                                     u, 0,
-                                     PyUnicode_GET_LENGTH(self)) < 0)
-        {
-            Py_DECREF(u);
-            return NULL;
-        }
+        copy_characters(v, 0, u, 0, PyUnicode_GET_LENGTH(self));
     }
     Py_DECREF(u);
@@ -9016,7 +9051,7 @@
     PyObject **items;
     PyObject *item;
     Py_ssize_t sz, i, res_offset;
-    Py_UCS4 maxchar = 0;
+    Py_UCS4 maxchar;
     Py_UCS4 item_maxchar;

     fseq = PySequence_Fast(seq, "");
@@ -9031,44 +9066,45 @@
     seqlen = PySequence_Fast_GET_SIZE(fseq);
     /* If empty sequence, return u"". */
     if (seqlen == 0) {
-        res = PyUnicode_New(0, 0);
-        goto Done;
-    }
+        Py_DECREF(fseq);
+        Py_INCREF(unicode_empty);
+        res = unicode_empty;
+        return res;
+    }
+
+    /* If singleton sequence with an exact Unicode, return that. */
     items = PySequence_Fast_ITEMS(fseq);
-    /* If singleton sequence with an exact Unicode, return that. */
-    if (seqlen == 1) {
-        item = items[0];
-        if (PyUnicode_CheckExact(item)) {
-            Py_INCREF(item);
-            res = item;
-            goto Done;
-        }
+    if (seqlen == 1 && PyUnicode_CheckExact(items[0])) {
+        res = items[0];
+        Py_INCREF(res);
+        Py_DECREF(fseq);
+        return res;
+    }
+
+    /* Set up sep and seplen */
+    if (separator == NULL) {
+        /* fall back to a blank space separator */
+        sep = PyUnicode_FromOrdinal(' ');
+        if (!sep)
+            goto onError;
+        maxchar = 32;
     }
     else {
-        /* Set up sep and seplen */
-        if (separator == NULL) {
-            /* fall back to a blank space separator */
-            sep = PyUnicode_FromOrdinal(' ');
-            if (!sep)
-                goto onError;
-        }
-        else {
-            if (!PyUnicode_Check(separator)) {
-                PyErr_Format(PyExc_TypeError,
-                             "separator: expected str instance,"
-                             " %.80s found",
-                             Py_TYPE(separator)->tp_name);
-                goto onError;
-            }
-            if (PyUnicode_READY(separator))
-                goto onError;
-            sep = separator;
-            seplen = PyUnicode_GET_LENGTH(separator);
-            maxchar = PyUnicode_MAX_CHAR_VALUE(separator);
-            /* inc refcount to keep this code path symmetric with the
-               above case of a blank separator */
-            Py_INCREF(sep);
-        }
+        if (!PyUnicode_Check(separator)) {
+            PyErr_Format(PyExc_TypeError,
+                         "separator: expected str instance,"
+                         " %.80s found",
+                         Py_TYPE(separator)->tp_name);
+            goto onError;
+        }
+        if (PyUnicode_READY(separator))
+            goto onError;
+        sep = separator;
+        seplen = PyUnicode_GET_LENGTH(separator);
+        maxchar = PyUnicode_MAX_CHAR_VALUE(separator);
+        /* inc refcount to keep this code path symmetric with the
+           above case of a blank separator */
+        Py_INCREF(sep);
     }

     /* There are at least two things to join, or else we have a subclass
@@ -9108,36 +9144,21 @@
     /* Catenate everything. */
     for (i = 0, res_offset = 0; i < seqlen; ++i) {
-        Py_ssize_t itemlen, copied;
+        Py_ssize_t itemlen;
         item = items[i];
         /* Copy item, and maybe the separator. */
         if (i && seplen != 0) {
-            copied = PyUnicode_CopyCharacters(res, res_offset,
-                                              sep, 0, seplen);
-            if (copied < 0)
-                goto onError;
-#ifdef Py_DEBUG
-            res_offset += copied;
-#else
+            copy_characters(res, res_offset, sep, 0, seplen);
             res_offset += seplen;
-#endif
         }
         itemlen = PyUnicode_GET_LENGTH(item);
         if (itemlen != 0) {
-            copied = PyUnicode_CopyCharacters(res, res_offset,
-                                              item, 0, itemlen);
-            if (copied < 0)
-                goto onError;
-#ifdef Py_DEBUG
-            res_offset += copied;
-#else
+            copy_characters(res, res_offset, item, 0, itemlen);
            res_offset += itemlen;
-#endif
         }
     }
     assert(res_offset == PyUnicode_GET_LENGTH(res));

-  Done:
     Py_DECREF(fseq);
     Py_XDECREF(sep);
     assert(_PyUnicode_CheckConsistency(res, 1));
@@ -9212,14 +9233,7 @@
         FILL(kind, data, fill, 0, left);
     if (right)
         FILL(kind, data, fill, left + _PyUnicode_LENGTH(self), right);
-    if (PyUnicode_CopyCharacters(u, left,
-                                 (PyObject*)self, 0,
-                                 _PyUnicode_LENGTH(self)) < 0)
-    {
-        Py_DECREF(u);
-        return NULL;
-    }
-
+    copy_characters(u, left, self, 0, _PyUnicode_LENGTH(self));
     assert(_PyUnicode_CheckConsistency(u, 1));
     return u;
 }
@@ -9536,12 +9550,7 @@
     u = PyUnicode_New(slen, maxchar);
     if (!u)
         goto error;
-    if (PyUnicode_CopyCharacters(u, 0,
-                                 (PyObject*)self, 0, slen) < 0)
-    {
-        Py_DECREF(u);
-        return NULL;
-    }
+    copy_characters(u, 0, self, 0, slen);
     rkind = PyUnicode_KIND(u);
     for (i = 0; i < PyUnicode_GET_LENGTH(u); i++)
         if (PyUnicode_READ(rkind, PyUnicode_DATA(u), i) == u1) {
@@ -10160,12 +10169,8 @@
                       maxchar);
     if (w == NULL)
         goto onError;
-    if (PyUnicode_CopyCharacters(w, 0, u, 0, PyUnicode_GET_LENGTH(u)) < 0)
-        goto onError;
-    if (PyUnicode_CopyCharacters(w, PyUnicode_GET_LENGTH(u),
-                                 v, 0,
-                                 PyUnicode_GET_LENGTH(v)) < 0)
-        goto onError;
+    copy_characters(w, 0, u, 0, PyUnicode_GET_LENGTH(u));
+    copy_characters(w, PyUnicode_GET_LENGTH(u), v, 0, PyUnicode_GET_LENGTH(v));
     Py_DECREF(u);
     Py_DECREF(v);
     assert(_PyUnicode_CheckConsistency(w, 1));
@@ -10181,9 +10186,6 @@
 unicode_append_inplace(PyObject **p_left, PyObject *right)
 {
     Py_ssize_t left_len, right_len, new_len;
-#ifdef Py_DEBUG
-    Py_ssize_t copied;
-#endif

     assert(PyUnicode_IS_READY(*p_left));
     assert(PyUnicode_IS_READY(right));
@@ -10210,14 +10212,8 @@
         goto error;
     }
     /* copy 'right' into the newly allocated area of 'left' */
-#ifdef Py_DEBUG
-    copied = PyUnicode_CopyCharacters(*p_left, left_len,
-                                      right, 0,
-                                      right_len);
-    assert(0 <= copied);
-#else
-    PyUnicode_CopyCharacters(*p_left, left_len, right, 0, right_len);
-#endif
+    copy_characters(*p_left, left_len, right, 0, right_len);
+    _PyUnicode_DIRTY(*p_left);
     return;

 error:
@@ -10270,7 +10266,6 @@
     if (res == NULL)
         goto error;
     Py_DECREF(left);
-    assert(_PyUnicode_CheckConsistency(res, 1));
     *p_left = res;
     return;
@@ -12332,8 +12327,6 @@
     out = _PyUnicode_FormatAdvanced(self, format_spec, 0,
                                     PyUnicode_GET_LENGTH(format_spec));
-    if (out != NULL)
-        assert(_PyUnicode_CheckConsistency(out, 1));
     return out;
 }
@@ -13174,7 +13167,11 @@
     length = PyUnicode_GET_LENGTH(unicode);

     _PyUnicode_LENGTH(self) = length;
+#ifdef Py_DEBUG
+    _PyUnicode_HASH(self) = -1;
+#else
     _PyUnicode_HASH(self) = _PyUnicode_HASH(unicode);
+#endif
     _PyUnicode_STATE(self).interned = 0;
     _PyUnicode_STATE(self).kind = kind;
     _PyUnicode_STATE(self).compact = 0;
@@ -13230,6 +13227,9 @@
                 PyUnicode_KIND_SIZE(kind, length + 1));
     Py_DECREF(unicode);
     assert(_PyUnicode_CheckConsistency(self, 1));
+#ifdef Py_DEBUG
+    _PyUnicode_HASH(self) = _PyUnicode_HASH(unicode);
+#endif
     return (PyObject *)self;

 onError:

diff --git a/Python/formatter_unicode.c b/Python/formatter_unicode.c
--- a/Python/formatter_unicode.c
+++ b/Python/formatter_unicode.c
@@ -1284,33 +1284,31 @@
                                 Py_ssize_t start, Py_ssize_t end)
 {
     InternalFormatSpec format;
-    PyObject *result = NULL;
+    PyObject *result;

     /* check for the special case of zero length format spec, make
        it equivalent to str(obj) */
-    if (start == end) {
-        result = PyObject_Str(obj);
-        goto done;
-    }
+    if (start == end)
+        return PyObject_Str(obj);

     /* parse the format_spec */
     if (!parse_internal_render_format_spec(format_spec, start, end,
                                            &format, 's', '<'))
-        goto done;
+        return NULL;

     /* type conversion? */
     switch (format.type) {
     case 's':
         /* no type conversion needed, already a string.  do the formatting */
         result = format_string_internal(obj, &format);
+        if (result != NULL)
+            assert(_PyUnicode_CheckConsistency(result, 1));
         break;
     default:
         /* unknown */
         unknown_presentation_type(format.type, obj->ob_type->tp_name);
-        goto done;
+        result = NULL;
     }
-
-done:
     return result;
 }

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 01:51:03 2011
From: python-checkins at python.org (victor.stinner)
Date: Thu, 06 Oct 2011 01:51:03 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_rephrase_PyUnicode=5F1BYTE?= =?utf8?q?=5FKIND_documentation?=
Message-ID:

http://hg.python.org/cpython/rev/341c3002ffb2
changeset:   72715:341c3002ffb2
user:        Victor Stinner
date:        Thu Oct 06 01:51:19 2011 +0200
summary:
  rephrase PyUnicode_1BYTE_KIND documentation

files:
  Include/unicodeobject.h |  13 +++++++------
  1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h
--- a/Include/unicodeobject.h
+++ b/Include/unicodeobject.h
@@ -296,13 +296,14 @@
    - PyUnicode_1BYTE_KIND (1):

      * character type = Py_UCS1 (8 bits, unsigned)
-     * if ascii is 1, at least one character must be in range
-       U+80-U+FF, otherwise all characters must be in range U+00-U+7F
+     * if ascii is set, all characters must be in range
+       U+0000-U+007F, otherwise at least one character must be in range
+       U+0080-U+00FF

    - PyUnicode_2BYTE_KIND (2):

      * character type = Py_UCS2 (16 bits, unsigned)
-     * at least one character must be in range U+0100-U+1FFFF
+     * at least one character must be in range U+0100-U+FFFF

    - PyUnicode_4BYTE_KIND (3):
@@ -315,9 +316,9 @@
        one block for the PyUnicodeObject struct and another for its data
        buffer. */
     unsigned int compact:1;
-    /* kind is PyUnicode_1BYTE_KIND but data contains only ASCII
-       characters. If ascii is 1 and compact is 1, use the PyASCIIObject
-       structure. */
+    /* The string only contains characters in range U+0000-U+007F (ASCII)
+       and the kind is PyUnicode_1BYTE_KIND. If ascii is set and compact is
+       set, use the PyASCIIObject structure. */
     unsigned int ascii:1;
     /* The ready flag indicates whether the object layout is initialized
       completely. This means that this is either a compact object, or

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 02:37:42 2011
From: python-checkins at python.org (victor.stinner)
Date: Thu, 06 Oct 2011 02:37:42 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_=5Fwarnings=2Ec=3A_make?= =?utf8?q?_the_filename_string_ready?=
Message-ID:

http://hg.python.org/cpython/rev/b1e5ade81097
changeset:   72716:b1e5ade81097
user:        Victor Stinner
date:        Thu Oct 06 02:34:51 2011 +0200
summary:
  Fix _warnings.c: make the filename string ready

files:
  Python/_warnings.c |  13 ++++++++++---
  1 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/Python/_warnings.c b/Python/_warnings.c
--- a/Python/_warnings.c
+++ b/Python/_warnings.c
@@ -497,9 +497,16 @@
     /* Setup filename. */
     *filename = PyDict_GetItemString(globals, "__file__");
     if (*filename != NULL && PyUnicode_Check(*filename)) {
-        Py_ssize_t len = PyUnicode_GetSize(*filename);
-        int kind = PyUnicode_KIND(*filename);
-        void *data = PyUnicode_DATA(*filename);
+        Py_ssize_t len;
+        int kind;
+        void *data;
+
+        if (PyUnicode_READY(*filename))
+            goto handle_error;
+
+        len = PyUnicode_GetSize(*filename);
+        kind = PyUnicode_KIND(*filename);
+        data = PyUnicode_DATA(*filename);

         /* if filename.lower().endswith((".pyc", ".pyo")): */
         if (len >= 4 &&

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 02:37:43 2011
From: python-checkins at python.org (victor.stinner)
Date: Thu, 06 Oct 2011 02:37:43 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_a_compiler_warning=3A_d?= =?utf8?q?on=27t_define_unicode=5Fis=5Fsingleton=28=29_in_release_mode?=
Message-ID:

http://hg.python.org/cpython/rev/0535bf5e1ea6
changeset:   72717:0535bf5e1ea6
user:        Victor Stinner
date:        Thu Oct 06 02:36:59 2011 +0200
summary:
  Fix a compiler warning: don't define unicode_is_singleton() in release mode

files:
  Objects/unicodeobject.c |  2 ++
  1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
--- a/Objects/unicodeobject.c
+++ b/Objects/unicodeobject.c
@@ -243,7 +243,9 @@
     PyObject *to, Py_ssize_t to_start,
     PyObject *from, Py_ssize_t from_start,
     Py_ssize_t how_many);
+#ifdef Py_DEBUG
 static int unicode_is_singleton(PyObject *unicode);
+#endif

 static PyObject *
 unicode_encode_call_errorhandler(const char *errors,

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 02:39:14 2011
From: python-checkins at python.org (victor.stinner)
Date: Thu, 06 Oct 2011 02:39:14 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_find=5Fmodule=5Fpath=28?= =?utf8?q?=29=3A_make_the_string_ready?=
Message-ID:

http://hg.python.org/cpython/rev/c97ba8f80935
changeset:   72718:c97ba8f80935
user:        Victor Stinner
date:        Thu Oct 06 02:39:42 2011 +0200
summary:
  Fix find_module_path(): make the string ready

files:
  Python/import.c |  3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/Python/import.c b/Python/import.c
--- a/Python/import.c
+++ b/Python/import.c
@@ -1785,6 +1785,9 @@
     else
         return 0;

+    if (PyUnicode_READY(path_unicode))
+        return -1;
+
     len = PyUnicode_GET_LENGTH(path_unicode);
     if (!PyUnicode_AsUCS4(path_unicode, buf, Py_ARRAY_LENGTH(buf), 1)) {
         Py_DECREF(path_unicode);

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 02:47:07 2011
From: python-checkins at python.org (victor.stinner)
Date: Thu, 06 Oct 2011 02:47:07 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_=5Fcopy=5Fcharacters=28=29_?= =?utf8?q?fails_more_quickly_in_debug_mode_on_inconsistent_state?=
Message-ID:

http://hg.python.org/cpython/rev/357750802e86
changeset:   72719:357750802e86
user:        Victor Stinner
date:        Thu Oct 06 02:47:11 2011 +0200
summary:
  _copy_characters() fails more quickly in debug mode on inconsistent state

files:
  Objects/unicodeobject.c |  28 ++++++++++++++++++++--------
  1 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
--- a/Objects/unicodeobject.c
+++ b/Objects/unicodeobject.c
@@ -1052,20 +1052,32 @@
             Py_UCS4 ch;
             Py_ssize_t i;

+#ifdef Py_DEBUG
             for (i=0; i < how_many; i++) {
                 ch = PyUnicode_READ(from_kind, from_data, from_start + i);
-                if (check_maxchar) {
+                assert(ch <= to_maxchar);
+                PyUnicode_WRITE(to_kind, to_data, to_start + i, ch);
+            }
+#else
+            if (!check_maxchar) {
+                for (i=0; i < how_many; i++) {
+                    ch = PyUnicode_READ(from_kind, from_data, from_start + i);
+                    PyUnicode_WRITE(to_kind, to_data, to_start + i, ch);
+                }
+            }
+            else {
+                for (i=0; i < how_many; i++) {
+                    ch = PyUnicode_READ(from_kind, from_data, from_start + i);
                     if (ch > to_maxchar)
                         return 1;
-                }
-                else {
-                    assert(ch <= to_maxchar);
-                }
-                PyUnicode_WRITE(to_kind, to_data, to_start + i, ch);
-            }
+                    PyUnicode_WRITE(to_kind, to_data, to_start + i, ch);
+                }
+            }
+#endif
         }
         else {
-            return -1;
+            assert(0 && "inconsistent state");
+            return 1;
         }
     }
     return 0;

--
Repository URL: http://hg.python.org/cpython

From solipsis at pitrou.net Thu Oct 6 05:26:04 2011
From: solipsis at pitrou.net (solipsis at pitrou.net)
Date: Thu, 06 Oct 2011 05:26:04 +0200
Subject: [Python-checkins] Daily reference leaks (357750802e86): sum=0
Message-ID:

results for 357750802e86 on branch "default"
--------------------------------------------

Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogsv0ruJ', '-x']

From python-checkins at python.org Thu Oct 6 12:37:17 2011
From: python-checkins at python.org (victor.stinner)
Date: Thu, 06 Oct 2011 12:37:17 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_str=2Ereplace=28=29_avoids_?= =?utf8?q?memory_when_it=27s_possible?=
Message-ID:

http://hg.python.org/cpython/rev/79c68caacb73
changeset:   72720:79c68caacb73
user:        Victor Stinner
date:        Thu Oct 06 12:31:55 2011 +0200
summary:
  str.replace() avoids memory when it's possible

files:
  Objects/unicodeobject.c |  102 +++++++++++++++++++++++----
  1 files changed, 84 insertions(+), 18 deletions(-)

diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
--- a/Objects/unicodeobject.c
+++ b/Objects/unicodeobject.c
@@ -1741,6 +1741,63 @@
     }
 }

+/* Ensure that a string uses the most efficient storage, if it is not the
+   case: create a new string with of the right kind. Write NULL into *p_unicode
+   on error. */
+void
+unicode_adjust_maxchar(PyObject **p_unicode)
+{
+    PyObject *unicode, *copy;
+    Py_UCS4 max_char;
+    Py_ssize_t i, len;
+    unsigned int kind;
+
+    assert(p_unicode != NULL);
+    unicode = *p_unicode;
+    assert(PyUnicode_IS_READY(unicode));
+    if (PyUnicode_IS_ASCII(unicode))
+        return;
+
+    len = PyUnicode_GET_LENGTH(unicode);
+    kind = PyUnicode_KIND(unicode);
+    if (kind == PyUnicode_1BYTE_KIND) {
+        const Py_UCS1 *u = PyUnicode_1BYTE_DATA(unicode);
+        for (i = 0; i < len; i++) {
+            if (u[i] & 0x80)
+                return;
+        }
+        max_char = 127;
+    }
+    else if (kind == PyUnicode_2BYTE_KIND) {
+        const Py_UCS2 *u = PyUnicode_2BYTE_DATA(unicode);
+        max_char = 0;
+        for (i = 0; i < len; i++) {
+            if (u[i] > max_char) {
+                max_char = u[i];
+                if (max_char >= 256)
+                    return;
+            }
+        }
+    }
+    else {
+        assert(kind == PyUnicode_4BYTE_KIND);
+        const Py_UCS4 *u = PyUnicode_4BYTE_DATA(unicode);
+        max_char = 0;
+        for (i = 0; i < len; i++) {
+            if (u[i] > max_char) {
+                max_char = u[i];
+                if (max_char >= 0x10000)
+                    return;
+            }
+        }
+    }
+    assert(max_char > PyUnicode_MAX_CHAR_VALUE(unicode));
+    copy = PyUnicode_New(len, max_char);
+    copy_characters(copy, 0, unicode, 0, len);
+    Py_DECREF(unicode);
+    *p_unicode = copy;
+}
+
 PyObject*
 PyUnicode_Copy(PyObject *unicode)
 {
@@ -9573,14 +9630,16 @@
             PyUnicode_WRITE(rkind, PyUnicode_DATA(u), i, u2);
         }
         if (mayshrink) {
-            PyObject *tmp = u;
-            u = PyUnicode_FromKindAndData(rkind, PyUnicode_DATA(tmp),
-                                          PyUnicode_GET_LENGTH(tmp));
-            Py_DECREF(tmp);
+            unicode_adjust_maxchar(&u);
+            if (u == NULL)
+                goto error;
         }
     }
     else {
         int rkind = skind;
         char *res;
+        PyObject *rstr;
+        Py_UCS4 maxchar;
+
         if (kind1 < rkind) {
             /* widen substring */
             buf1 = _PyUnicode_AsKind(str1, rkind);
@@ -9607,11 +9666,13 @@
             if (!buf1) goto error;
             release1 = 1;
         }
-        res = PyMem_Malloc(PyUnicode_KIND_SIZE(rkind, slen));
-        if (!res) {
-            PyErr_NoMemory();
+        maxchar = PyUnicode_MAX_CHAR_VALUE(self);
+        maxchar = Py_MAX(maxchar, PyUnicode_MAX_CHAR_VALUE(str2));
+        rstr = PyUnicode_New(slen, maxchar);
+        if (!rstr)
             goto error;
-        }
+        res = PyUnicode_DATA(rstr);
+
         memcpy(res, sbuf, PyUnicode_KIND_SIZE(rkind, slen));
         /* change everything in-place, starting with this one */
         memcpy(res + PyUnicode_KIND_SIZE(rkind, i),
@@ -9631,16 +9692,19 @@
             i += len1;
         }

-        u = PyUnicode_FromKindAndData(rkind, res, slen);
-        PyMem_Free(res);
-        if (!u) goto error;
+        u = rstr;
+        unicode_adjust_maxchar(&u);
+        if (!u)
+            goto error;
     }
     else {
         Py_ssize_t n, i, j, ires;
         Py_ssize_t product, new_size;
         int rkind = skind;
+        PyObject *rstr;
         char *res;
+        Py_UCS4 maxchar;

         if (kind1 < rkind) {
             buf1 = _PyUnicode_AsKind(str1, rkind);
@@ -9679,9 +9743,12 @@
                             "replace string is too long");
             goto error;
         }
-        res = PyMem_Malloc(PyUnicode_KIND_SIZE(rkind, new_size));
-        if (!res)
+        maxchar = PyUnicode_MAX_CHAR_VALUE(self);
+        maxchar = Py_MAX(maxchar, PyUnicode_MAX_CHAR_VALUE(str2));
+        rstr = PyUnicode_New(new_size, maxchar);
+        if (!rstr)
             goto error;
+        res = PyUnicode_DATA(rstr);
         ires = i = 0;
         if (len1 > 0) {
             while (n-- > 0) {
@@ -9731,11 +9798,10 @@
                    sbuf + PyUnicode_KIND_SIZE(rkind, i),
                    PyUnicode_KIND_SIZE(rkind, slen-i));
         }
-        if (PyUnicode_IS_ASCII(self) && PyUnicode_IS_ASCII(str2))
-            u = unicode_fromascii((unsigned char*)res, new_size);
-        else
-            u = PyUnicode_FromKindAndData(rkind, res, new_size);
-        PyMem_Free(res);
+        u = rstr;
+        unicode_adjust_maxchar(&u);
+        if (u == NULL)
+            goto error;
     }
     if (srelease)
         PyMem_FREE(sbuf);

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 12:37:17 2011
From: python-checkins at python.org (victor.stinner)
Date: Thu, 06 Oct 2011 12:37:17 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_my_last_change_on_PyUni?= =?utf8?q?code=5FJoin=28=29=3A_don=27t_process_separator_if_len=3D=3D1?=
Message-ID:

http://hg.python.org/cpython/rev/3636a39fa557
changeset:   72721:3636a39fa557
user:        Victor Stinner
date:        Thu Oct 06 12:32:37 2011 +0200
summary:
  Fix my last change on PyUnicode_Join(): don't process separator if len==1

files:
  Objects/unicodeobject.c |  62
+++++++++++++++------------- 1 files changed, 33 insertions(+), 29 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -9145,37 +9145,41 @@ /* If singleton sequence with an exact Unicode, return that. */ items = PySequence_Fast_ITEMS(fseq); - if (seqlen == 1 && PyUnicode_CheckExact(items[0])) { - res = items[0]; - Py_INCREF(res); - Py_DECREF(fseq); - return res; - } - - /* Set up sep and seplen */ - if (separator == NULL) { - /* fall back to a blank space separator */ - sep = PyUnicode_FromOrdinal(' '); - if (!sep) - goto onError; - maxchar = 32; + if (seqlen == 1) { + if (PyUnicode_CheckExact(items[0])) { + res = items[0]; + Py_INCREF(res); + Py_DECREF(fseq); + return res; + } + sep = NULL; } else { - if (!PyUnicode_Check(separator)) { - PyErr_Format(PyExc_TypeError, - "separator: expected str instance," - " %.80s found", - Py_TYPE(separator)->tp_name); - goto onError; - } - if (PyUnicode_READY(separator)) - goto onError; - sep = separator; - seplen = PyUnicode_GET_LENGTH(separator); - maxchar = PyUnicode_MAX_CHAR_VALUE(separator); - /* inc refcount to keep this code path symmetric with the - above case of a blank separator */ - Py_INCREF(sep); + /* Set up sep and seplen */ + if (separator == NULL) { + /* fall back to a blank space separator */ + sep = PyUnicode_FromOrdinal(' '); + if (!sep) + goto onError; + maxchar = 32; + } + else { + if (!PyUnicode_Check(separator)) { + PyErr_Format(PyExc_TypeError, + "separator: expected str instance," + " %.80s found", + Py_TYPE(separator)->tp_name); + goto onError; + } + if (PyUnicode_READY(separator)) + goto onError; + sep = separator; + seplen = PyUnicode_GET_LENGTH(separator); + maxchar = PyUnicode_MAX_CHAR_VALUE(separator); + /* inc refcount to keep this code path symmetric with the + above case of a blank separator */ + Py_INCREF(sep); + } } /* There are at least two things to join, or else we have a subclass -- Repository URL: 
http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:19:43 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:19:43 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=282=2E7=29=3A_Move_doc_of_sys?= =?utf8?q?=2Edont=5Fwrite=5Fbytecode_to_make_all_attributes_sorted_again?= Message-ID: http://hg.python.org/cpython/rev/8ef2436b14ec changeset: 72722:8ef2436b14ec branch: 2.7 parent: 72660:504981afa007 user: ?ric Araujo date: Wed Oct 05 02:25:58 2011 +0200 summary: Move doc of sys.dont_write_bytecode to make all attributes sorted again files: Doc/library/sys.rst | 22 +++++++++++----------- 1 files changed, 11 insertions(+), 11 deletions(-) diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst --- a/Doc/library/sys.rst +++ b/Doc/library/sys.rst @@ -95,6 +95,17 @@ customized by assigning another one-argument function to ``sys.displayhook``. +.. data:: dont_write_bytecode + + If this is true, Python won't try to write ``.pyc`` or ``.pyo`` files on the + import of source modules. This value is initially set to ``True`` or + ``False`` depending on the :option:`-B` command line option and the + :envvar:`PYTHONDONTWRITEBYTECODE` environment variable, but you can set it + yourself to control bytecode file generation. + + .. versionadded:: 2.6 + + .. function:: excepthook(type, value, traceback) This function prints out a given traceback and exception to ``sys.stderr``. @@ -786,17 +797,6 @@ .. versionadded:: 2.6 -.. data:: dont_write_bytecode - - If this is true, Python won't try to write ``.pyc`` or ``.pyo`` files on the - import of source modules. This value is initially set to ``True`` or ``False`` - depending on the ``-B`` command line option and the ``PYTHONDONTWRITEBYTECODE`` - environment variable, but you can set it yourself to control bytecode file - generation. - - .. versionadded:: 2.6 - - .. function:: setcheckinterval(interval) Set the interpreter's "check interval". 
This integer value determines how often -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:19:44 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:19:44 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=282=2E7=29=3A_Fix_markup_used?= =?utf8?q?_in_the_documentation_of_sys=2Eprefix_and_sys=2Eexec=5Fprefix=2E?= Message-ID: http://hg.python.org/cpython/rev/9f6704da4abb changeset: 72723:9f6704da4abb branch: 2.7 user: ?ric Araujo date: Wed Oct 05 02:34:28 2011 +0200 summary: Fix markup used in the documentation of sys.prefix and sys.exec_prefix. - Using the file role with {placeholders} is IMO clearer than fake Python code. - The fact that sys.version[:3] gives '2.7' is a CPython detail and should not be advertised (see #9442), even if some stdlib modules currently rely on that detail. files: Doc/library/sys.rst | 14 +++++++------- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst --- a/Doc/library/sys.rst +++ b/Doc/library/sys.rst @@ -207,10 +207,10 @@ Python files are installed; by default, this is also ``'/usr/local'``. This can be set at build time with the ``--exec-prefix`` argument to the :program:`configure` script. Specifically, all configuration files (e.g. the - :file:`pyconfig.h` header file) are installed in the directory ``exec_prefix + - '/lib/pythonversion/config'``, and shared library modules are installed in - ``exec_prefix + '/lib/pythonversion/lib-dynload'``, where *version* is equal to - ``version[:3]``. + :file:`pyconfig.h` header file) are installed in the directory + :file:`{exec_prefix}/lib/python{X.Y}/config', and shared library modules are + installed in :file:`{exec_prefix}/lib/python{X.Y}/lib-dynload`, where *X.Y* + is the version number of Python, for example ``2.7``. .. data:: executable @@ -766,10 +766,10 @@ independent Python files are installed; by default, this is the string ``'/usr/local'``. 
This can be set at build time with the ``--prefix`` argument to the :program:`configure` script. The main collection of Python - library modules is installed in the directory ``prefix + '/lib/pythonversion'`` + library modules is installed in the directory :file:`{prefix}/lib/python{X.Y}`` while the platform independent header files (all except :file:`pyconfig.h`) are - stored in ``prefix + '/include/pythonversion'``, where *version* is equal to - ``version[:3]``. + stored in :file:`{prefix}/include/python{X.Y}``, where *X.Y* is the version + number of Python, for example ``2.7``. .. data:: ps1 -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:19:45 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:19:45 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=282=2E7=29=3A_Fix_typo_and_ca?= =?utf8?q?se_in_a_recently_added_test?= Message-ID: http://hg.python.org/cpython/rev/3b2aea6b1628 changeset: 72724:3b2aea6b1628 branch: 2.7 user: ?ric Araujo date: Wed Oct 05 02:35:09 2011 +0200 summary: Fix typo and case in a recently added test files: Lib/test/test_minidom.py | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_minidom.py b/Lib/test/test_minidom.py --- a/Lib/test/test_minidom.py +++ b/Lib/test/test_minidom.py @@ -439,7 +439,7 @@ dom.unlink() self.confirm(domstr == str.replace("\n", "\r\n")) - def test_toPrettyXML_perserves_content_of_text_node(self): + def test_toprettyxml_preserves_content_of_text_node(self): str = 'B' dom = parseString(str) dom2 = parseString(dom.toprettyxml()) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:19:46 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:19:46 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAobWVyZ2UgMi43IC0+IDIuNyk6?= =?utf8?q?_Branch_merge?= Message-ID: http://hg.python.org/cpython/rev/83c486c1112c changeset: 72725:83c486c1112c 
branch: 2.7 parent: 72707:16c4137a413c parent: 72724:3b2aea6b1628 user: ?ric Araujo date: Thu Oct 06 13:19:34 2011 +0200 summary: Branch merge files: Doc/library/sys.rst | 36 ++++++++++++++-------------- Lib/test/test_minidom.py | 2 +- 2 files changed, 19 insertions(+), 19 deletions(-) diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst --- a/Doc/library/sys.rst +++ b/Doc/library/sys.rst @@ -95,6 +95,17 @@ customized by assigning another one-argument function to ``sys.displayhook``. +.. data:: dont_write_bytecode + + If this is true, Python won't try to write ``.pyc`` or ``.pyo`` files on the + import of source modules. This value is initially set to ``True`` or + ``False`` depending on the :option:`-B` command line option and the + :envvar:`PYTHONDONTWRITEBYTECODE` environment variable, but you can set it + yourself to control bytecode file generation. + + .. versionadded:: 2.6 + + .. function:: excepthook(type, value, traceback) This function prints out a given traceback and exception to ``sys.stderr``. @@ -196,10 +207,10 @@ Python files are installed; by default, this is also ``'/usr/local'``. This can be set at build time with the ``--exec-prefix`` argument to the :program:`configure` script. Specifically, all configuration files (e.g. the - :file:`pyconfig.h` header file) are installed in the directory ``exec_prefix + - '/lib/pythonversion/config'``, and shared library modules are installed in - ``exec_prefix + '/lib/pythonversion/lib-dynload'``, where *version* is equal to - ``version[:3]``. + :file:`pyconfig.h` header file) are installed in the directory + :file:`{exec_prefix}/lib/python{X.Y}/config', and shared library modules are + installed in :file:`{exec_prefix}/lib/python{X.Y}/lib-dynload`, where *X.Y* + is the version number of Python, for example ``2.7``. .. data:: executable @@ -755,10 +766,10 @@ independent Python files are installed; by default, this is the string ``'/usr/local'``. 
This can be set at build time with the ``--prefix`` argument to the :program:`configure` script. The main collection of Python - library modules is installed in the directory ``prefix + '/lib/pythonversion'`` + library modules is installed in the directory :file:`{prefix}/lib/python{X.Y}`` while the platform independent header files (all except :file:`pyconfig.h`) are - stored in ``prefix + '/include/pythonversion'``, where *version* is equal to - ``version[:3]``. + stored in :file:`{prefix}/include/python{X.Y}``, where *X.Y* is the version + number of Python, for example ``2.7``. .. data:: ps1 @@ -786,17 +797,6 @@ .. versionadded:: 2.6 -.. data:: dont_write_bytecode - - If this is true, Python won't try to write ``.pyc`` or ``.pyo`` files on the - import of source modules. This value is initially set to ``True`` or ``False`` - depending on the ``-B`` command line option and the ``PYTHONDONTWRITEBYTECODE`` - environment variable, but you can set it yourself to control bytecode file - generation. - - .. versionadded:: 2.6 - - .. function:: setcheckinterval(interval) Set the interpreter's "check interval". 
This integer value determines how often diff --git a/Lib/test/test_minidom.py b/Lib/test/test_minidom.py --- a/Lib/test/test_minidom.py +++ b/Lib/test/test_minidom.py @@ -439,7 +439,7 @@ dom.unlink() self.confirm(domstr == str.replace("\n", "\r\n")) - def test_toPrettyXML_perserves_content_of_text_node(self): + def test_toprettyxml_preserves_content_of_text_node(self): str = 'B' dom = parseString(str) dom2 = parseString(dom.toprettyxml()) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:20:50 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:20:50 +0200 Subject: [Python-checkins] =?utf8?q?distutils2=3A_Add_tests_for_comparing_?= =?utf8?q?candidate_and_final_versions_=28=2311841=29=2E?= Message-ID: http://hg.python.org/distutils2/rev/1f1de3b5b520 changeset: 1197:1f1de3b5b520 parent: 1195:9e4d083c4ad6 user: ?ric Araujo date: Wed Oct 05 00:58:40 2011 +0200 summary: Add tests for comparing candidate and final versions (#11841). This used to be buggy; Filip Gruszczy?ski contributed these tests and a code patch but the latter is not needed. files: distutils2/tests/test_version.py | 12 ++++++++++++ 1 files changed, 12 insertions(+), 0 deletions(-) diff --git a/distutils2/tests/test_version.py b/distutils2/tests/test_version.py --- a/distutils2/tests/test_version.py +++ b/distutils2/tests/test_version.py @@ -101,8 +101,18 @@ True >>> V('1.2.0') >= V('1.2.3') False + >>> V('1.2.0rc1') >= V('1.2.0') + False >>> (V('1.0') > V('1.0b2')) True + >>> V('1.0') > V('1.0c2') + True + >>> V('1.0') > V('1.0rc2') + True + >>> V('1.0rc2') > V('1.0rc1') + True + >>> V('1.0c4') > V('1.0c1') + True >>> (V('1.0') > V('1.0c2') > V('1.0c1') > V('1.0b2') > V('1.0b1') ... > V('1.0a2') > V('1.0a1')) True @@ -129,6 +139,8 @@ ... < V('1.0.dev18') ... < V('1.0.dev456') ... < V('1.0.dev1234') + ... < V('1.0rc1') + ... < V('1.0rc2') ... < V('1.0') ... < V('1.0.post456.dev623') # development version of a post release ... 
< V('1.0.post456')) -- Repository URL: http://hg.python.org/distutils2 From python-checkins at python.org Thu Oct 6 13:20:50 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:20:50 +0200 Subject: [Python-checkins] =?utf8?q?distutils2=3A_Cosmetic_fixes_for_white?= =?utf8?q?space_and_a_regex=2E?= Message-ID: http://hg.python.org/distutils2/rev/b42d7955f76a changeset: 1198:b42d7955f76a user: ?ric Araujo date: Wed Oct 05 01:08:51 2011 +0200 summary: Cosmetic fixes for whitespace and a regex. The goal of the regex is to catch a (alpha), b (beta), c or rc (release candidate), so the existing pattern puzzled me. Tests were OK before and after the change. files: distutils2/tests/test_version.py | 8 ++++---- distutils2/version.py | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/distutils2/tests/test_version.py b/distutils2/tests/test_version.py --- a/distutils2/tests/test_version.py +++ b/distutils2/tests/test_version.py @@ -103,7 +103,7 @@ False >>> V('1.2.0rc1') >= V('1.2.0') False - >>> (V('1.0') > V('1.0b2')) + >>> V('1.0') > V('1.0b2') True >>> V('1.0') > V('1.0c2') True @@ -248,9 +248,9 @@ def test_parse_numdots(self): # For code coverage completeness, as pad_zeros_length can't be set or # influenced from the public interface - self.assertEqual(V('1.0')._parse_numdots('1.0', '1.0', - pad_zeros_length=3), - [1, 0, 0]) + self.assertEqual( + V('1.0')._parse_numdots('1.0', '1.0', pad_zeros_length=3), + [1, 0, 0]) def test_suite(): diff --git a/distutils2/version.py b/distutils2/version.py --- a/distutils2/version.py +++ b/distutils2/version.py @@ -253,7 +253,7 @@ # if we have something like "b-2" or "a.2" at the end of the # version, that is pobably beta, alpha, etc # let's remove the dash or dot - rs = re.sub(r"([abc|rc])[\-\.](\d+)$", r"\1\2", rs) + rs = re.sub(r"([abc]|rc)[\-\.](\d+)$", r"\1\2", rs) # 1.0-dev-r371 -> 1.0.dev371 # 0.1-dev-r79 -> 0.1.dev79 -- Repository URL: http://hg.python.org/distutils2 From 
python-checkins at python.org Thu Oct 6 13:20:50 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:20:50 +0200 Subject: [Python-checkins] =?utf8?q?distutils2_=28merge_default_-=3E_pytho?= =?utf8?q?n3=29=3A_Merge_=2311841_and_other_changes_from_default?= Message-ID: http://hg.python.org/distutils2/rev/46c31c3c8839 changeset: 1199:46c31c3c8839 branch: python3 parent: 1196:51bb31c537cb parent: 1198:b42d7955f76a user: ?ric Araujo date: Wed Oct 05 01:10:36 2011 +0200 summary: Merge #11841 and other changes from default files: distutils2/tests/test_version.py | 20 ++++++++++++++++---- distutils2/version.py | 2 +- 2 files changed, 17 insertions(+), 5 deletions(-) diff --git a/distutils2/tests/test_version.py b/distutils2/tests/test_version.py --- a/distutils2/tests/test_version.py +++ b/distutils2/tests/test_version.py @@ -101,7 +101,17 @@ True >>> V('1.2.0') >= V('1.2.3') False - >>> (V('1.0') > V('1.0b2')) + >>> V('1.2.0rc1') >= V('1.2.0') + False + >>> V('1.0') > V('1.0b2') + True + >>> V('1.0') > V('1.0c2') + True + >>> V('1.0') > V('1.0rc2') + True + >>> V('1.0rc2') > V('1.0rc1') + True + >>> V('1.0c4') > V('1.0c1') True >>> (V('1.0') > V('1.0c2') > V('1.0c1') > V('1.0b2') > V('1.0b1') ... > V('1.0a2') > V('1.0a1')) @@ -129,6 +139,8 @@ ... < V('1.0.dev18') ... < V('1.0.dev456') ... < V('1.0.dev1234') + ... < V('1.0rc1') + ... < V('1.0rc2') ... < V('1.0') ... < V('1.0.post456.dev623') # development version of a post release ... 
< V('1.0.post456')) @@ -236,9 +248,9 @@ def test_parse_numdots(self): # For code coverage completeness, as pad_zeros_length can't be set or # influenced from the public interface - self.assertEqual(V('1.0')._parse_numdots('1.0', '1.0', - pad_zeros_length=3), - [1, 0, 0]) + self.assertEqual( + V('1.0')._parse_numdots('1.0', '1.0', pad_zeros_length=3), + [1, 0, 0]) def test_suite(): diff --git a/distutils2/version.py b/distutils2/version.py --- a/distutils2/version.py +++ b/distutils2/version.py @@ -253,7 +253,7 @@ # if we have something like "b-2" or "a.2" at the end of the # version, that is pobably beta, alpha, etc # let's remove the dash or dot - rs = re.sub(r"([abc|rc])[\-\.](\d+)$", r"\1\2", rs) + rs = re.sub(r"([abc]|rc)[\-\.](\d+)$", r"\1\2", rs) # 1.0-dev-r371 -> 1.0.dev371 # 0.1-dev-r79 -> 0.1.dev79 -- Repository URL: http://hg.python.org/distutils2 From python-checkins at python.org Thu Oct 6 13:20:50 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:20:50 +0200 Subject: [Python-checkins] =?utf8?q?distutils2=3A_Change_one_name_in_test?= =?utf8?q?=5Funinstall_to_avoid_confusion=2E?= Message-ID: http://hg.python.org/distutils2/rev/8be41269957e changeset: 1200:8be41269957e parent: 1198:b42d7955f76a user: ?ric Araujo date: Thu Oct 06 05:34:42 2011 +0200 summary: Change one name in test_uninstall to avoid confusion. install_lib may be the name of a module, a command or an option, so I find it clearer to use site_packages to refer to a string object containing the path of the site-packages directory created in a temporary directory during tests. 
files: distutils2/tests/test_uninstall.py | 24 +++++++++--------- 1 files changed, 12 insertions(+), 12 deletions(-) diff --git a/distutils2/tests/test_uninstall.py b/distutils2/tests/test_uninstall.py --- a/distutils2/tests/test_uninstall.py +++ b/distutils2/tests/test_uninstall.py @@ -86,26 +86,26 @@ old_out = sys.stderr sys.stderr = StringIO() dist = self.run_setup('install_dist', '--prefix=' + self.root_dir) - install_lib = self.get_path(dist, 'purelib') - return dist, install_lib + site_packages = self.get_path(dist, 'purelib') + return dist, site_packages def test_uninstall_unknow_distribution(self): self.assertRaises(PackagingError, remove, 'Foo', paths=[self.root_dir]) def test_uninstall(self): - dist, install_lib = self.install_dist() - self.assertIsFile(install_lib, 'foo', '__init__.py') - self.assertIsFile(install_lib, 'foo', 'sub', '__init__.py') - self.assertIsFile(install_lib, 'Foo-0.1.dist-info', 'RECORD') - self.assertTrue(remove('Foo', paths=[install_lib])) - self.assertIsNotFile(install_lib, 'foo', 'sub', '__init__.py') - self.assertIsNotFile(install_lib, 'Foo-0.1.dist-info', 'RECORD') + dist, site_packages = self.install_dist() + self.assertIsFile(site_packages, 'foo', '__init__.py') + self.assertIsFile(site_packages, 'foo', 'sub', '__init__.py') + self.assertIsFile(site_packages, 'Foo-0.1.dist-info', 'RECORD') + self.assertTrue(remove('Foo', paths=[site_packages])) + self.assertIsNotFile(site_packages, 'foo', 'sub', '__init__.py') + self.assertIsNotFile(site_packages, 'Foo-0.1.dist-info', 'RECORD') def test_remove_issue(self): # makes sure if there are OSErrors (like permission denied) # remove() stops and display a clean error - dist, install_lib = self.install_dist('Meh') + dist, site_packages = self.install_dist('Meh') # breaking os.rename old = os.rename @@ -115,11 +115,11 @@ os.rename = _rename try: - self.assertFalse(remove('Meh', paths=[install_lib])) + self.assertFalse(remove('Meh', paths=[site_packages])) finally: os.rename = old - 
self.assertTrue(remove('Meh', paths=[install_lib])) + self.assertTrue(remove('Meh', paths=[site_packages])) def test_suite(): -- Repository URL: http://hg.python.org/distutils2 From python-checkins at python.org Thu Oct 6 13:20:51 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:20:51 +0200 Subject: [Python-checkins] =?utf8?q?distutils2=3A_Fix_incorrect_test=2E?= Message-ID: http://hg.python.org/distutils2/rev/61b0f35818a9 changeset: 1201:61b0f35818a9 user: ?ric Araujo date: Thu Oct 06 05:35:54 2011 +0200 summary: Fix incorrect test. The distutils2.install.remove function (a.k.a. the uninstall feature) takes a path argument to allow client code to use custom directories instead of sys.path. The test used to give self.root_dir as path, which corresponds to a prefix option, but prefix is not on sys.path, it?s only the base directory used to compute the stdlib and site-packages directory paths. The test now gives a valid site-packages path to the function. files: distutils2/tests/test_uninstall.py | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/distutils2/tests/test_uninstall.py b/distutils2/tests/test_uninstall.py --- a/distutils2/tests/test_uninstall.py +++ b/distutils2/tests/test_uninstall.py @@ -89,9 +89,10 @@ site_packages = self.get_path(dist, 'purelib') return dist, site_packages - def test_uninstall_unknow_distribution(self): + def test_uninstall_unknown_distribution(self): + dist, site_packages = self.install_dist('Foospam') self.assertRaises(PackagingError, remove, 'Foo', - paths=[self.root_dir]) + paths=[site_packages]) def test_uninstall(self): dist, site_packages = self.install_dist() -- Repository URL: http://hg.python.org/distutils2 From python-checkins at python.org Thu Oct 6 13:20:51 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:20:51 +0200 Subject: [Python-checkins] =?utf8?q?distutils2=3A_Add_test_that_was_promis?= 
=?utf8?q?ed_in_a_comment_but_not_actually_written?= Message-ID: http://hg.python.org/distutils2/rev/0125e6e2eda2 changeset: 1202:0125e6e2eda2 user: ?ric Araujo date: Thu Oct 06 05:37:40 2011 +0200 summary: Add test that was promised in a comment but not actually written files: distutils2/tests/test_uninstall.py | 9 +++++++-- 1 files changed, 7 insertions(+), 2 deletions(-) diff --git a/distutils2/tests/test_uninstall.py b/distutils2/tests/test_uninstall.py --- a/distutils2/tests/test_uninstall.py +++ b/distutils2/tests/test_uninstall.py @@ -1,6 +1,7 @@ """Tests for the uninstall command.""" import os import sys +import logging from StringIO import StringIO import stat import distutils2.util @@ -105,14 +106,14 @@ def test_remove_issue(self): # makes sure if there are OSErrors (like permission denied) - # remove() stops and display a clean error + # remove() stops and displays a clean error dist, site_packages = self.install_dist('Meh') # breaking os.rename old = os.rename def _rename(source, target): - raise OSError + raise OSError(42, 'impossible operation') os.rename = _rename try: @@ -120,6 +121,10 @@ finally: os.rename = old + logs = [log for log in self.get_logs(logging.INFO) + if log.startswith('Error:')] + self.assertEqual(logs, ['Error: [Errno 42] impossible operation']) + self.assertTrue(remove('Meh', paths=[site_packages])) -- Repository URL: http://hg.python.org/distutils2 From python-checkins at python.org Thu Oct 6 13:20:51 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:20:51 +0200 Subject: [Python-checkins] =?utf8?q?distutils2=3A_Minor=3A_improve_one_tes?= =?utf8?q?t_name=2C_address_pyflakes_warnings?= Message-ID: http://hg.python.org/distutils2/rev/b5b96ab18810 changeset: 1203:b5b96ab18810 user: ?ric Araujo date: Thu Oct 06 05:40:04 2011 +0200 summary: Minor: improve one test name, address pyflakes warnings files: distutils2/tests/test_uninstall.py | 10 ++++------ 1 files changed, 4 insertions(+), 6 deletions(-) diff 
--git a/distutils2/tests/test_uninstall.py b/distutils2/tests/test_uninstall.py --- a/distutils2/tests/test_uninstall.py +++ b/distutils2/tests/test_uninstall.py @@ -1,15 +1,14 @@ -"""Tests for the uninstall command.""" +"""Tests for the distutils2.uninstall module.""" import os import sys import logging -from StringIO import StringIO -import stat import distutils2.util -from distutils2.database import disable_cache, enable_cache +from StringIO import StringIO from distutils2.run import main from distutils2.errors import PackagingError from distutils2.install import remove +from distutils2.database import disable_cache, enable_cache from distutils2.command.install_dist import install_dist from distutils2.tests import unittest, support @@ -84,7 +83,6 @@ if not dirname: dirname = self.make_dist(name, **kw) os.chdir(dirname) - old_out = sys.stderr sys.stderr = StringIO() dist = self.run_setup('install_dist', '--prefix=' + self.root_dir) site_packages = self.get_path(dist, 'purelib') @@ -104,7 +102,7 @@ self.assertIsNotFile(site_packages, 'foo', 'sub', '__init__.py') self.assertIsNotFile(site_packages, 'Foo-0.1.dist-info', 'RECORD') - def test_remove_issue(self): + def test_uninstall_error_handling(self): # makes sure if there are OSErrors (like permission denied) # remove() stops and displays a clean error dist, site_packages = self.install_dist('Meh') -- Repository URL: http://hg.python.org/distutils2 From python-checkins at python.org Thu Oct 6 13:20:51 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:20:51 +0200 Subject: [Python-checkins] =?utf8?q?distutils2=3A_Fix_return_code_of_?= =?utf8?b?4oCccHlzZXR1cCBydW4gQ09NTUFOROKAnSAoY2xvc2VzICMxMjIyMik=?= Message-ID: http://hg.python.org/distutils2/rev/3749fcae0dce changeset: 1204:3749fcae0dce user: ?ric Araujo date: Thu Oct 06 05:43:56 2011 +0200 summary: Fix return code of ?pysetup run COMMAND? 
(closes #12222) files: distutils2/run.py | 5 +- distutils2/tests/test_uninstall.py | 30 +++++++++-------- 2 files changed, 19 insertions(+), 16 deletions(-) diff --git a/distutils2/run.py b/distutils2/run.py --- a/distutils2/run.py +++ b/distutils2/run.py @@ -283,10 +283,11 @@ dist.parse_config_files() for cmd in dispatcher.commands: + # FIXME need to catch MetadataMissingError here (from the check command + # e.g.)--or catch any exception, print an error message and exit with 1 dist.run_command(cmd, dispatcher.command_options[cmd]) - # XXX this is crappy - return dist + return 0 @action_help("""\ diff --git a/distutils2/tests/test_uninstall.py b/distutils2/tests/test_uninstall.py --- a/distutils2/tests/test_uninstall.py +++ b/distutils2/tests/test_uninstall.py @@ -4,12 +4,9 @@ import logging import distutils2.util -from StringIO import StringIO -from distutils2.run import main from distutils2.errors import PackagingError from distutils2.install import remove from distutils2.database import disable_cache, enable_cache -from distutils2.command.install_dist import install_dist from distutils2.tests import unittest, support @@ -47,16 +44,12 @@ distutils2.util._path_created.clear() super(UninstallTestCase, self).tearDown() - def run_setup(self, *args): - # run setup with args - args = ['run'] + list(args) - dist = main(args) - return dist - def get_path(self, dist, name): - cmd = install_dist(dist) - cmd.prefix = self.root_dir - cmd.finalize_options() + # the dist argument must contain an install_dist command correctly + # initialized with a prefix option and finalized befored this method + # can be called successfully; practically, this means that you should + # call self.install_dist before self.get_path + cmd = dist.get_command_obj('install_dist') return getattr(cmd, 'install_' + name) def make_dist(self, name='Foo', **kw): @@ -83,8 +76,17 @@ if not dirname: dirname = self.make_dist(name, **kw) os.chdir(dirname) - sys.stderr = StringIO() - dist = 
self.run_setup('install_dist', '--prefix=' + self.root_dir) + + dist = support.TestDistribution() + # for some unfathomable reason, the tests will fail horribly if the + # parse_config_files method is not called, even if it doesn't do + # anything useful; trying to build and use a command object manually + # also fails + dist.parse_config_files() + dist.finalize_options() + dist.run_command('install_dist', + {'prefix': ('command line', self.root_dir)}) + site_packages = self.get_path(dist, 'purelib') return dist, site_packages -- Repository URL: http://hg.python.org/distutils2 From python-checkins at python.org Thu Oct 6 13:20:51 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:20:51 +0200 Subject: [Python-checkins] =?utf8?q?distutils2_=28merge_default_-=3E_pytho?= =?utf8?q?n3=29=3A_Merge_fix_for_=2312222_and_other_changes_from_default?= Message-ID: http://hg.python.org/distutils2/rev/b9f449eb1e36 changeset: 1205:b9f449eb1e36 branch: python3 parent: 1199:46c31c3c8839 parent: 1204:3749fcae0dce user: ?ric Araujo date: Thu Oct 06 06:02:05 2011 +0200 summary: Merge fix for #12222 and other changes from default files: distutils2/run.py | 5 +- distutils2/tests/test_uninstall.py | 76 +++++++++-------- 2 files changed, 44 insertions(+), 37 deletions(-) diff --git a/distutils2/run.py b/distutils2/run.py --- a/distutils2/run.py +++ b/distutils2/run.py @@ -283,10 +283,11 @@ dist.parse_config_files() for cmd in dispatcher.commands: + # FIXME need to catch MetadataMissingError here (from the check command + # e.g.)--or catch any exception, print an error message and exit with 1 dist.run_command(cmd, dispatcher.command_options[cmd]) - # XXX this is crappy - return dist + return 0 @action_help("""\ diff --git a/distutils2/tests/test_uninstall.py b/distutils2/tests/test_uninstall.py --- a/distutils2/tests/test_uninstall.py +++ b/distutils2/tests/test_uninstall.py @@ -1,15 +1,12 @@ -"""Tests for the uninstall command.""" +"""Tests for the 
distutils2.uninstall module.""" import os import sys -from io import StringIO -import stat +import logging import distutils2.util -from distutils2.database import disable_cache, enable_cache -from distutils2.run import main from distutils2.errors import PackagingError from distutils2.install import remove -from distutils2.command.install_dist import install_dist +from distutils2.database import disable_cache, enable_cache from distutils2.tests import unittest, support @@ -47,16 +44,12 @@ distutils2.util._path_created.clear() super(UninstallTestCase, self).tearDown() - def run_setup(self, *args): - # run setup with args - args = ['run'] + list(args) - dist = main(args) - return dist - def get_path(self, dist, name): - cmd = install_dist(dist) - cmd.prefix = self.root_dir - cmd.finalize_options() + # the dist argument must contain an install_dist command correctly + # initialized with a prefix option and finalized befored this method + # can be called successfully; practically, this means that you should + # call self.install_dist before self.get_path + cmd = dist.get_command_obj('install_dist') return getattr(cmd, 'install_' + name) def make_dist(self, name='Foo', **kw): @@ -83,43 +76,56 @@ if not dirname: dirname = self.make_dist(name, **kw) os.chdir(dirname) - old_out = sys.stderr - sys.stderr = StringIO() - dist = self.run_setup('install_dist', '--prefix=' + self.root_dir) - install_lib = self.get_path(dist, 'purelib') - return dist, install_lib - def test_uninstall_unknow_distribution(self): + dist = support.TestDistribution() + # for some unfathomable reason, the tests will fail horribly if the + # parse_config_files method is not called, even if it doesn't do + # anything useful; trying to build and use a command object manually + # also fails + dist.parse_config_files() + dist.finalize_options() + dist.run_command('install_dist', + {'prefix': ('command line', self.root_dir)}) + + site_packages = self.get_path(dist, 'purelib') + return dist, site_packages + + 
def test_uninstall_unknown_distribution(self): + dist, site_packages = self.install_dist('Foospam') self.assertRaises(PackagingError, remove, 'Foo', - paths=[self.root_dir]) + paths=[site_packages]) def test_uninstall(self): - dist, install_lib = self.install_dist() - self.assertIsFile(install_lib, 'foo', '__init__.py') - self.assertIsFile(install_lib, 'foo', 'sub', '__init__.py') - self.assertIsFile(install_lib, 'Foo-0.1.dist-info', 'RECORD') - self.assertTrue(remove('Foo', paths=[install_lib])) - self.assertIsNotFile(install_lib, 'foo', 'sub', '__init__.py') - self.assertIsNotFile(install_lib, 'Foo-0.1.dist-info', 'RECORD') + dist, site_packages = self.install_dist() + self.assertIsFile(site_packages, 'foo', '__init__.py') + self.assertIsFile(site_packages, 'foo', 'sub', '__init__.py') + self.assertIsFile(site_packages, 'Foo-0.1.dist-info', 'RECORD') + self.assertTrue(remove('Foo', paths=[site_packages])) + self.assertIsNotFile(site_packages, 'foo', 'sub', '__init__.py') + self.assertIsNotFile(site_packages, 'Foo-0.1.dist-info', 'RECORD') - def test_remove_issue(self): + def test_uninstall_error_handling(self): # makes sure if there are OSErrors (like permission denied) - # remove() stops and display a clean error - dist, install_lib = self.install_dist('Meh') + # remove() stops and displays a clean error + dist, site_packages = self.install_dist('Meh') # breaking os.rename old = os.rename def _rename(source, target): - raise OSError + raise OSError(42, 'impossible operation') os.rename = _rename try: - self.assertFalse(remove('Meh', paths=[install_lib])) + self.assertFalse(remove('Meh', paths=[site_packages])) finally: os.rename = old - self.assertTrue(remove('Meh', paths=[install_lib])) + logs = [log for log in self.get_logs(logging.INFO) + if log.startswith('Error:')] + self.assertEqual(logs, ['Error: [Errno 42] impossible operation']) + + self.assertTrue(remove('Meh', paths=[site_packages])) def test_suite(): -- Repository URL: http://hg.python.org/distutils2 
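The merged error-handling test above leans on two details that are easy to miss: `str()` of a two-argument `OSError` renders as `[Errno N] message` (exactly the string the test asserts on), and a module-level function such as `os.rename` can be swapped out and then restored in a `try`/`finally` block so that later calls use the real function again. A minimal standalone sketch of the same pattern follows; the `remove_tree` helper is hypothetical, standing in for packaging's `remove()`, not the real implementation:

```python
import os

def remove_tree(path):
    # Hypothetical stand-in for packaging.install.remove(): it renames the
    # target aside before deleting it, so a failing os.rename aborts the
    # operation with a reported error instead of an unhandled traceback.
    try:
        os.rename(path, path + '.deleteme')
        return True
    except OSError as exc:
        print('Error: %s' % exc)   # prints: Error: [Errno 42] impossible operation
        return False

# Break os.rename the way the test does, restoring it in finally.
old_rename = os.rename
def _rename(source, target):
    raise OSError(42, 'impossible operation')

os.rename = _rename
try:
    result = remove_tree('some-package')
finally:
    os.rename = old_rename  # always put the real function back

print(result)  # prints: False (the failure was reported, not raised)
```

Restoring the original function in the `finally` block matters here: the last assertion in the merged test calls `remove()` a second time and expects it to succeed with the real `os.rename`.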
From python-checkins at python.org Thu Oct 6 13:24:03 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:03 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Minor_updates_to_the_whatsn?= =?utf8?q?ew_maintenance_rules?= Message-ID: http://hg.python.org/cpython/rev/7cd4f1be101b changeset: 72726:7cd4f1be101b parent: 72665:56eb9a509460 user: ?ric Araujo date: Wed Oct 05 01:03:34 2011 +0200 summary: Minor updates to the whatsnew maintenance rules files: Doc/whatsnew/3.3.rst | 9 ++++----- 1 files changed, 4 insertions(+), 5 deletions(-) diff --git a/Doc/whatsnew/3.3.rst b/Doc/whatsnew/3.3.rst --- a/Doc/whatsnew/3.3.rst +++ b/Doc/whatsnew/3.3.rst @@ -6,8 +6,7 @@ :Release: |release| :Date: |today| -.. $Id$ - Rules for maintenance: +.. Rules for maintenance: * Anyone can add text to this document. Do not spend very much time on the wording of your changes, because your text will probably @@ -40,12 +39,11 @@ * It's helpful to add the bug/patch number as a comment: - % Patch 12345 XXX Describe the transmogrify() function added to the socket module. - (Contributed by P.Y. Developer.) + (Contributed by P.Y. Developer in :issue:`12345`.) - This saves the maintainer the effort of going through the SVN log + This saves the maintainer the effort of going through the Mercurial log when researching a change. This article explains the new features in Python 3.3, compared to 3.2. 
@@ -109,6 +107,7 @@ XXX mention new and deprecated functions and macros + Other Language Changes ====================== -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:04 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:04 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_More_info_about_PEP_393_in_?= =?utf8?q?whatsnew_and_NEWS?= Message-ID: http://hg.python.org/cpython/rev/a55513a45742 changeset: 72727:a55513a45742 user: ?ric Araujo date: Wed Oct 05 01:04:18 2011 +0200 summary: More info about PEP 393 in whatsnew and NEWS files: Doc/whatsnew/3.3.rst | 11 ++++++----- Misc/NEWS | 2 ++ 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/Doc/whatsnew/3.3.rst b/Doc/whatsnew/3.3.rst --- a/Doc/whatsnew/3.3.rst +++ b/Doc/whatsnew/3.3.rst @@ -49,14 +49,15 @@ This article explains the new features in Python 3.3, compared to 3.2. -PEP XXX: Stub -============= - - PEP 393: Flexible String Representation ======================================= -XXX Give a short introduction about :pep:`393`. +[Abstract copied from the PEP: The Unicode string type is changed to support +multiple internal representations, depending on the character with the largest +Unicode ordinal (1, 2, or 4 bytes). This allows a space-efficient +representation in common cases, but gives access to full UCS-4 on all systems. +For compatibility with existing APIs, several representations may exist in +parallel; over time, this compatibility should be phased out.] PEP 393 is fully backward compatible. The legacy API should remain available at least five years. Applications using the legacy API will not diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -1261,6 +1261,8 @@ Build ----- +- PEP 393: the configure option --with-wide-unicode is removed. + - Issue #12852: Set _XOPEN_SOURCE to 700, instead of 600, to get POSIX 2008 functions on OpenBSD (e.g. fdopendir). 
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:05 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:05 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_minor_wording_issue=2E?= Message-ID: http://hg.python.org/cpython/rev/024e849b0d2c changeset: 72728:024e849b0d2c user: ?ric Araujo date: Wed Oct 05 01:06:31 2011 +0200 summary: Fix minor wording issue. sys.maxunicode is not called and thus does not return anything; it *is* something. (I checked the doc quickly to see if it tells that expression return things but found nothing.) I also removed markup that would just generate a useless link to the enclosing section. files: Doc/library/sys.rst | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst --- a/Doc/library/sys.rst +++ b/Doc/library/sys.rst @@ -629,7 +629,7 @@ i.e. ``1114111`` (``0x10FFFF`` in hexadecimal). .. versionchanged:: 3.3 - Before :pep:`393`, :data:`sys.maxunicode` used to return either ``0xFFFF`` + Before :pep:`393`, ``sys.maxunicode`` used to be either ``0xFFFF`` or ``0x10FFFF``, depending on the configuration option that specified whether Unicode characters were stored as UCS-2 or UCS-4. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:06 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:06 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_typo?= Message-ID: http://hg.python.org/cpython/rev/109c8b9f2828 changeset: 72729:109c8b9f2828 user: ?ric Araujo date: Wed Oct 05 01:11:12 2011 +0200 summary: Fix typo files: Include/unicodeobject.h | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -206,7 +206,7 @@ immediately follow the structure. 
utf8_length and wstr_length can be found in the length field; the utf8 pointer is equal to the data pointer. */ typedef struct { - /* There a 4 forms of Unicode strings: + /* There are 4 forms of Unicode strings: - compact ascii: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:06 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:06 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Remove_inline_comment=2C_no?= =?utf8?q?_longer_supported_by_configparser=2E?= Message-ID: http://hg.python.org/cpython/rev/285950ceee8a changeset: 72730:285950ceee8a user: Éric Araujo date: Wed Oct 05 01:14:02 2011 +0200 summary: Remove inline comment, no longer supported by configparser. (Deleted rather than moved because multilib implementations vary.) files: Lib/sysconfig.cfg | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Lib/sysconfig.cfg b/Lib/sysconfig.cfg --- a/Lib/sysconfig.cfg +++ b/Lib/sysconfig.cfg @@ -31,7 +31,7 @@ # be used directly in [resource_locations]. confdir = /etc datadir = /usr/share -libdir = /usr/lib ; or /usr/lib64 on a multilib system +libdir = /usr/lib statedir = /var # User resource directory local = ~/.local/{distribution.name} -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:07 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:07 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Add_tests_for_comparing_can?= =?utf8?q?didate_and_final_versions_in_packaging_=28=2311841=29=2E?= Message-ID: http://hg.python.org/cpython/rev/2105ab8553b7 changeset: 72731:2105ab8553b7 user: Éric Araujo date: Wed Oct 05 01:41:14 2011 +0200 summary: Add tests for comparing candidate and final versions in packaging (#11841). This used to be buggy; Filip Gruszczyński contributed tests and a code patch but the latter is not needed. 
files: Lib/packaging/tests/test_version.py | 12 ++++++++++++ 1 files changed, 12 insertions(+), 0 deletions(-) diff --git a/Lib/packaging/tests/test_version.py b/Lib/packaging/tests/test_version.py --- a/Lib/packaging/tests/test_version.py +++ b/Lib/packaging/tests/test_version.py @@ -101,8 +101,18 @@ True >>> V('1.2.0') >= V('1.2.3') False + >>> V('1.2.0rc1') >= V('1.2.0') + False >>> (V('1.0') > V('1.0b2')) True + >>> V('1.0') > V('1.0c2') + True + >>> V('1.0') > V('1.0rc2') + True + >>> V('1.0rc2') > V('1.0rc1') + True + >>> V('1.0c4') > V('1.0c1') + True >>> (V('1.0') > V('1.0c2') > V('1.0c1') > V('1.0b2') > V('1.0b1') ... > V('1.0a2') > V('1.0a1')) True @@ -129,6 +139,8 @@ ... < V('1.0.dev18') ... < V('1.0.dev456') ... < V('1.0.dev1234') + ... < V('1.0rc1') + ... < V('1.0rc2') ... < V('1.0') ... < V('1.0.post456.dev623') # development version of a post release ... < V('1.0.post456')) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:08 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:08 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Cosmetic_fixes_for_whitespa?= =?utf8?q?ce_and_a_regex_in_packaging=2E?= Message-ID: http://hg.python.org/cpython/rev/c17b91e08b60 changeset: 72732:c17b91e08b60 user: ?ric Araujo date: Wed Oct 05 01:46:37 2011 +0200 summary: Cosmetic fixes for whitespace and a regex in packaging. The goal of the regex is to catch a (alpha), b (beta), c or rc (release candidate), so the existing pattern puzzled me. Tests were OK before and after the change. 
files: Lib/packaging/tests/test_version.py | 8 ++++---- Lib/packaging/version.py | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/Lib/packaging/tests/test_version.py b/Lib/packaging/tests/test_version.py --- a/Lib/packaging/tests/test_version.py +++ b/Lib/packaging/tests/test_version.py @@ -103,7 +103,7 @@ False >>> V('1.2.0rc1') >= V('1.2.0') False - >>> (V('1.0') > V('1.0b2')) + >>> V('1.0') > V('1.0b2') True >>> V('1.0') > V('1.0c2') True @@ -248,9 +248,9 @@ def test_parse_numdots(self): # For code coverage completeness, as pad_zeros_length can't be set or # influenced from the public interface - self.assertEqual(V('1.0')._parse_numdots('1.0', '1.0', - pad_zeros_length=3), - [1, 0, 0]) + self.assertEqual( + V('1.0')._parse_numdots('1.0', '1.0', pad_zeros_length=3), + [1, 0, 0]) def test_suite(): diff --git a/Lib/packaging/version.py b/Lib/packaging/version.py --- a/Lib/packaging/version.py +++ b/Lib/packaging/version.py @@ -253,7 +253,7 @@ # if we have something like "b-2" or "a.2" at the end of the # version, that is pobably beta, alpha, etc # let's remove the dash or dot - rs = re.sub(r"([abc|rc])[\-\.](\d+)$", r"\1\2", rs) + rs = re.sub(r"([abc]|rc)[\-\.](\d+)$", r"\1\2", rs) # 1.0-dev-r371 -> 1.0.dev371 # 0.1-dev-r79 -> 0.1.dev79 -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:09 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:09 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Update_skip_message_printed?= =?utf8?q?_by_test=2Esupport=2Eget=5Fattribute=2E?= Message-ID: http://hg.python.org/cpython/rev/472ee6ed7949 changeset: 72733:472ee6ed7949 user: ?ric Araujo date: Wed Oct 05 01:50:22 2011 +0200 summary: Update skip message printed by test.support.get_attribute. This helper was changed to work with any object instead of only modules (or technically something with a __name__ attribute, see code in 3.2) but the message stayed as is. 
files: Lib/test/support.py | 3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/Lib/test/support.py b/Lib/test/support.py --- a/Lib/test/support.py +++ b/Lib/test/support.py @@ -187,8 +187,7 @@ try: attribute = getattr(obj, name) except AttributeError: - raise unittest.SkipTest("module %s has no attribute %s" % ( - repr(obj), name)) + raise unittest.SkipTest("object %r has no attribute %r" % (obj, name)) else: return attribute -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:09 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:09 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=283=2E2=29=3A_Move_doc_of_sys?= =?utf8?q?=2Edont=5Fwrite=5Fbytecode_to_make_all_attributes_sorted_again?= Message-ID: http://hg.python.org/cpython/rev/c5342904aeab changeset: 72734:c5342904aeab branch: 3.2 parent: 72658:2484b2b8876e user: ?ric Araujo date: Wed Oct 05 01:17:38 2011 +0200 summary: Move doc of sys.dont_write_bytecode to make all attributes sorted again files: Doc/library/sys.rst | 18 +++++++++--------- 1 files changed, 9 insertions(+), 9 deletions(-) diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst --- a/Doc/library/sys.rst +++ b/Doc/library/sys.rst @@ -121,6 +121,15 @@ Use ``'backslashreplace'`` error handler on :exc:`UnicodeEncodeError`. +.. data:: dont_write_bytecode + + If this is true, Python won't try to write ``.pyc`` or ``.pyo`` files on the + import of source modules. This value is initially set to ``True`` or + ``False`` depending on the :option:`-B` command line option and the + :envvar:`PYTHONDONTWRITEBYTECODE` environment variable, but you can set it + yourself to control bytecode file generation. + + .. function:: excepthook(type, value, traceback) This function prints out a given traceback and exception to ``sys.stderr``. @@ -764,15 +773,6 @@ implement a dynamic prompt. -.. 
data:: dont_write_bytecode - - If this is true, Python won't try to write ``.pyc`` or ``.pyo`` files on the - import of source modules. This value is initially set to ``True`` or ``False`` - depending on the ``-B`` command line option and the ``PYTHONDONTWRITEBYTECODE`` - environment variable, but you can set it yourself to control bytecode file - generation. - - .. function:: setcheckinterval(interval) Set the interpreter's "check interval". This integer value determines how often -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:10 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:10 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=283=2E2=29=3A_Fix_markup_used?= =?utf8?q?_in_the_documentation_of_sys=2Eprefix_and_sys=2Eexec=5Fprefix=2E?= Message-ID: http://hg.python.org/cpython/rev/6ea47522f466 changeset: 72735:6ea47522f466 branch: 3.2 user: ?ric Araujo date: Wed Oct 05 01:28:24 2011 +0200 summary: Fix markup used in the documentation of sys.prefix and sys.exec_prefix. - Using the file role with {placeholders} is IMO clearer than fake Python code. - The fact that sys.version[:3] gives '3.2' is a CPython detail and should not be advertised (see #9442), even if some stdlib modules currently rely on that detail. files: Doc/library/sys.rst | 14 +++++++------- 1 files changed, 7 insertions(+), 7 deletions(-) diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst --- a/Doc/library/sys.rst +++ b/Doc/library/sys.rst @@ -194,10 +194,10 @@ Python files are installed; by default, this is also ``'/usr/local'``. This can be set at build time with the ``--exec-prefix`` argument to the :program:`configure` script. Specifically, all configuration files (e.g. 
the - :file:`pyconfig.h` header file) are installed in the directory ``exec_prefix + - '/lib/pythonversion/config'``, and shared library modules are installed in - ``exec_prefix + '/lib/pythonversion/lib-dynload'``, where *version* is equal to - ``version[:3]``. + :file:`pyconfig.h` header file) are installed in the directory + :file:`{exec_prefix}/lib/python{X.Y}/config', and shared library modules are + installed in :file:`{exec_prefix}/lib/python{X.Y}/lib-dynload`, where *X.Y* + is the version number of Python, for example ``3.2``. .. data:: executable @@ -752,10 +752,10 @@ independent Python files are installed; by default, this is the string ``'/usr/local'``. This can be set at build time with the ``--prefix`` argument to the :program:`configure` script. The main collection of Python - library modules is installed in the directory ``prefix + '/lib/pythonversion'`` + library modules is installed in the directory :file:`{prefix}/lib/python{X.Y}`` while the platform independent header files (all except :file:`pyconfig.h`) are - stored in ``prefix + '/include/pythonversion'``, where *version* is equal to - ``version[:3]``. + stored in :file:`{prefix}/include/python{X.Y}``, where *X.Y* is the version + number of Python, for example ``3.2``. .. 
data:: ps1 -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:11 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:11 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=283=2E2=29=3A_Fix_typo_and_ca?= =?utf8?q?se_in_a_recently_added_test?= Message-ID: http://hg.python.org/cpython/rev/bfb02edcad12 changeset: 72736:bfb02edcad12 branch: 3.2 user: ?ric Araujo date: Wed Oct 05 01:29:22 2011 +0200 summary: Fix typo and case in a recently added test files: Lib/test/test_minidom.py | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_minidom.py b/Lib/test/test_minidom.py --- a/Lib/test/test_minidom.py +++ b/Lib/test/test_minidom.py @@ -446,7 +446,7 @@ dom.unlink() self.confirm(domstr == str.replace("\n", "\r\n")) - def test_toPrettyXML_perserves_content_of_text_node(self): + def test_toprettyxml_preserves_content_of_text_node(self): str = 'B' dom = parseString(str) dom2 = parseString(dom.toprettyxml()) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:12 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:12 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Merge_3=2E2?= Message-ID: http://hg.python.org/cpython/rev/2e48fbc8170c changeset: 72737:2e48fbc8170c parent: 72733:472ee6ed7949 parent: 72736:bfb02edcad12 user: ?ric Araujo date: Wed Oct 05 01:52:45 2011 +0200 summary: Merge 3.2 files: Doc/library/sys.rst | 32 ++++++++++++++-------------- Lib/test/test_minidom.py | 2 +- 2 files changed, 17 insertions(+), 17 deletions(-) diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst --- a/Doc/library/sys.rst +++ b/Doc/library/sys.rst @@ -121,6 +121,15 @@ Use ``'backslashreplace'`` error handler on :exc:`UnicodeEncodeError`. +.. 
data:: dont_write_bytecode + + If this is true, Python won't try to write ``.pyc`` or ``.pyo`` files on the + import of source modules. This value is initially set to ``True`` or + ``False`` depending on the :option:`-B` command line option and the + :envvar:`PYTHONDONTWRITEBYTECODE` environment variable, but you can set it + yourself to control bytecode file generation. + + .. function:: excepthook(type, value, traceback) This function prints out a given traceback and exception to ``sys.stderr``. @@ -185,10 +194,10 @@ Python files are installed; by default, this is also ``'/usr/local'``. This can be set at build time with the ``--exec-prefix`` argument to the :program:`configure` script. Specifically, all configuration files (e.g. the - :file:`pyconfig.h` header file) are installed in the directory ``exec_prefix + - '/lib/pythonversion/config'``, and shared library modules are installed in - ``exec_prefix + '/lib/pythonversion/lib-dynload'``, where *version* is equal to - ``version[:3]``. + :file:`pyconfig.h` header file) are installed in the directory + :file:`{exec_prefix}/lib/python{X.Y}/config', and shared library modules are + installed in :file:`{exec_prefix}/lib/python{X.Y}/lib-dynload`, where *X.Y* + is the version number of Python, for example ``3.2``. .. data:: executable @@ -750,10 +759,10 @@ independent Python files are installed; by default, this is the string ``'/usr/local'``. This can be set at build time with the ``--prefix`` argument to the :program:`configure` script. The main collection of Python - library modules is installed in the directory ``prefix + '/lib/pythonversion'`` + library modules is installed in the directory :file:`{prefix}/lib/python{X.Y}`` while the platform independent header files (all except :file:`pyconfig.h`) are - stored in ``prefix + '/include/pythonversion'``, where *version* is equal to - ``version[:3]``. 
+ stored in :file:`{prefix}/include/python{X.Y}``, where *X.Y* is the version + number of Python, for example ``3.2``. .. data:: ps1 @@ -771,15 +780,6 @@ implement a dynamic prompt. -.. data:: dont_write_bytecode - - If this is true, Python won't try to write ``.pyc`` or ``.pyo`` files on the - import of source modules. This value is initially set to ``True`` or ``False`` - depending on the ``-B`` command line option and the ``PYTHONDONTWRITEBYTECODE`` - environment variable, but you can set it yourself to control bytecode file - generation. - - .. function:: setcheckinterval(interval) Set the interpreter's "check interval". This integer value determines how often diff --git a/Lib/test/test_minidom.py b/Lib/test/test_minidom.py --- a/Lib/test/test_minidom.py +++ b/Lib/test/test_minidom.py @@ -467,7 +467,7 @@ dom.unlink() self.confirm(domstr == str.replace("\n", "\r\n")) - def test_toPrettyXML_perserves_content_of_text_node(self): + def test_toprettyxml_preserves_content_of_text_node(self): str = 'B' dom = parseString(str) dom2 = parseString(dom.toprettyxml()) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:13 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:13 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Add_regrtest_check_for_cach?= =?utf8?q?es_in_packaging=2Edatabase_=28see_=2312167=29?= Message-ID: http://hg.python.org/cpython/rev/e76c6aaff135 changeset: 72738:e76c6aaff135 user: ?ric Araujo date: Thu Oct 06 02:44:19 2011 +0200 summary: Add regrtest check for caches in packaging.database (see #12167) files: Lib/test/regrtest.py | 24 ++++++++++++++++++++++++ 1 files changed, 24 insertions(+), 0 deletions(-) diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -173,6 +173,7 @@ import json import logging import os +import packaging.database import platform import random import re @@ -967,6 +968,7 @@ 
'sys.warnoptions', 'threading._dangling', 'multiprocessing.process._dangling', 'sysconfig._CONFIG_VARS', 'sysconfig._SCHEMES', + 'packaging.database_caches', ) def get_sys_argv(self): @@ -1054,6 +1056,28 @@ # Can't easily revert the logging state pass + def get_packaging_database_caches(self): + # caching system used by the PEP 376 implementation + # we have one boolean and four dictionaries, initially empty + switch = packaging.database._cache_enabled + saved = [] + for name in ('_cache_name', '_cache_name_egg', + '_cache_path', '_cache_path_egg'): + cache = getattr(packaging.database, name) + saved.append((id(cache), cache, cache.copy())) + return switch, saved + def restore_packaging_database_caches(self, saved): + switch, saved_caches = saved + packaging.database._cache_enabled = switch + for offset, name in enumerate(('_cache_name', '_cache_name_egg', + '_cache_path', '_cache_path_egg')): + _, cache, items = saved_caches[offset] + # put back the same object in place + setattr(packaging.database, name, cache) + # now restore its items + cache.clear() + cache.update(items) + def get_sys_warnoptions(self): return id(sys.warnoptions), sys.warnoptions, sys.warnoptions[:] def restore_sys_warnoptions(self, saved_options): -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:13 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:13 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Change_one_name_in_packagin?= =?utf8?q?g=E2=80=99s_test=5Funinstall_to_avoid_confusion=2E?= Message-ID: http://hg.python.org/cpython/rev/fd2d239699a5 changeset: 72739:fd2d239699a5 user: Éric Araujo date: Thu Oct 06 04:59:41 2011 +0200 summary: Change one name in packaging's test_uninstall to avoid confusion. 
install_lib may be the name of a module, a command or an option, so I find it clearer to use site_packages to refer to a string object containing the path of the site-packages directory created in a temporary directory during tests. files: Lib/packaging/tests/test_uninstall.py | 24 +++++++------- 1 files changed, 12 insertions(+), 12 deletions(-) diff --git a/Lib/packaging/tests/test_uninstall.py b/Lib/packaging/tests/test_uninstall.py --- a/Lib/packaging/tests/test_uninstall.py +++ b/Lib/packaging/tests/test_uninstall.py @@ -86,26 +86,26 @@ old_out = sys.stderr sys.stderr = StringIO() dist = self.run_setup('install_dist', '--prefix=' + self.root_dir) - install_lib = self.get_path(dist, 'purelib') - return dist, install_lib + site_packages = self.get_path(dist, 'purelib') + return dist, site_packages def test_uninstall_unknow_distribution(self): self.assertRaises(PackagingError, remove, 'Foo', paths=[self.root_dir]) def test_uninstall(self): - dist, install_lib = self.install_dist() - self.assertIsFile(install_lib, 'foo', '__init__.py') - self.assertIsFile(install_lib, 'foo', 'sub', '__init__.py') - self.assertIsFile(install_lib, 'Foo-0.1.dist-info', 'RECORD') - self.assertTrue(remove('Foo', paths=[install_lib])) - self.assertIsNotFile(install_lib, 'foo', 'sub', '__init__.py') - self.assertIsNotFile(install_lib, 'Foo-0.1.dist-info', 'RECORD') + dist, site_packages = self.install_dist() + self.assertIsFile(site_packages, 'foo', '__init__.py') + self.assertIsFile(site_packages, 'foo', 'sub', '__init__.py') + self.assertIsFile(site_packages, 'Foo-0.1.dist-info', 'RECORD') + self.assertTrue(remove('Foo', paths=[site_packages])) + self.assertIsNotFile(site_packages, 'foo', 'sub', '__init__.py') + self.assertIsNotFile(site_packages, 'Foo-0.1.dist-info', 'RECORD') def test_remove_issue(self): # makes sure if there are OSErrors (like permission denied) # remove() stops and display a clean error - dist, install_lib = self.install_dist('Meh') + dist, site_packages = 
self.install_dist('Meh') # breaking os.rename old = os.rename @@ -115,11 +115,11 @@ os.rename = _rename try: - self.assertFalse(remove('Meh', paths=[install_lib])) + self.assertFalse(remove('Meh', paths=[site_packages])) finally: os.rename = old - self.assertTrue(remove('Meh', paths=[install_lib])) + self.assertTrue(remove('Meh', paths=[site_packages])) def test_suite(): -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:14 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:14 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_incorrect_test=2E?= Message-ID: http://hg.python.org/cpython/rev/45274853d5ef changeset: 72740:45274853d5ef user: Éric Araujo date: Thu Oct 06 05:10:09 2011 +0200 summary: Fix incorrect test. The packaging.install.remove function (a.k.a. the uninstall feature) takes a path argument to allow client code to use custom directories instead of sys.path. The test used to give self.root_dir as path, which corresponds to a prefix option, but prefix is not on sys.path, it's only the base directory used to compute the stdlib and site-packages directory paths. The test now gives a valid site-packages path to the function. 
files: Lib/packaging/tests/test_uninstall.py | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/Lib/packaging/tests/test_uninstall.py b/Lib/packaging/tests/test_uninstall.py --- a/Lib/packaging/tests/test_uninstall.py +++ b/Lib/packaging/tests/test_uninstall.py @@ -89,9 +89,10 @@ site_packages = self.get_path(dist, 'purelib') return dist, site_packages - def test_uninstall_unknow_distribution(self): + def test_uninstall_unknown_distribution(self): + dist, site_packages = self.install_dist('Foospam') self.assertRaises(PackagingError, remove, 'Foo', - paths=[self.root_dir]) + paths=[site_packages]) def test_uninstall(self): dist, site_packages = self.install_dist() -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:14 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:14 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Add_test_that_was_promised_?= =?utf8?q?in_a_comment_but_not_actually_written?= Message-ID: http://hg.python.org/cpython/rev/2d5b7993fae1 changeset: 72741:2d5b7993fae1 user: ?ric Araujo date: Thu Oct 06 05:15:09 2011 +0200 summary: Add test that was promised in a comment but not actually written files: Lib/packaging/tests/test_uninstall.py | 9 +++++++-- 1 files changed, 7 insertions(+), 2 deletions(-) diff --git a/Lib/packaging/tests/test_uninstall.py b/Lib/packaging/tests/test_uninstall.py --- a/Lib/packaging/tests/test_uninstall.py +++ b/Lib/packaging/tests/test_uninstall.py @@ -1,6 +1,7 @@ """Tests for the uninstall command.""" import os import sys +import logging from io import StringIO import stat import packaging.util @@ -105,14 +106,14 @@ def test_remove_issue(self): # makes sure if there are OSErrors (like permission denied) - # remove() stops and display a clean error + # remove() stops and displays a clean error dist, site_packages = self.install_dist('Meh') # breaking os.rename old = os.rename def _rename(source, target): - raise OSError 
+ raise OSError(42, 'impossible operation') os.rename = _rename try: @@ -120,6 +121,10 @@ finally: os.rename = old + logs = [log for log in self.get_logs(logging.INFO) + if log.startswith('Error:')] + self.assertEqual(logs, ['Error: [Errno 42] impossible operation']) + self.assertTrue(remove('Meh', paths=[site_packages])) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:15 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:15 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Minor=3A_improve_one_test_n?= =?utf8?q?ame=2C_address_pyflakes_warnings?= Message-ID: http://hg.python.org/cpython/rev/4846e84c0360 changeset: 72742:4846e84c0360 user: Éric Araujo date: Thu Oct 06 05:18:41 2011 +0200 summary: Minor: improve one test name, address pyflakes warnings files: Lib/packaging/tests/test_uninstall.py | 10 ++++------ 1 files changed, 4 insertions(+), 6 deletions(-) diff --git a/Lib/packaging/tests/test_uninstall.py b/Lib/packaging/tests/test_uninstall.py --- a/Lib/packaging/tests/test_uninstall.py +++ b/Lib/packaging/tests/test_uninstall.py @@ -1,15 +1,14 @@ -"""Tests for the uninstall command.""" +"""Tests for the packaging.uninstall module.""" import os import sys import logging -from io import StringIO -import stat import packaging.util -from packaging.database import disable_cache, enable_cache +from io import StringIO from packaging.run import main from packaging.errors import PackagingError from packaging.install import remove +from packaging.database import disable_cache, enable_cache from packaging.command.install_dist import install_dist from packaging.tests import unittest, support @@ -84,7 +83,6 @@ if not dirname: dirname = self.make_dist(name, **kw) os.chdir(dirname) - old_out = sys.stderr sys.stderr = StringIO() dist = self.run_setup('install_dist', '--prefix=' + self.root_dir) site_packages = self.get_path(dist, 'purelib') @@ -104,7 +102,7 @@ 
self.assertIsNotFile(site_packages, 'foo', 'sub', '__init__.py') self.assertIsNotFile(site_packages, 'Foo-0.1.dist-info', 'RECORD') - def test_remove_issue(self): + def test_uninstall_error_handling(self): # makes sure if there are OSErrors (like permission denied) # remove() stops and displays a clean error dist, site_packages = self.install_dist('Meh') -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:16 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:16 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_return_code_of_?= =?utf8?b?4oCccHlzZXR1cCBydW4gQ09NTUFOROKAnSAoY2xvc2VzICMxMjIyMik=?= Message-ID: http://hg.python.org/cpython/rev/ab125793243f changeset: 72743:ab125793243f user: Éric Araujo date: Thu Oct 06 05:28:56 2011 +0200 summary: Fix return code of “pysetup run COMMAND” (closes #12222) files: Lib/packaging/run.py | 5 +- Lib/packaging/tests/test_uninstall.py | 30 ++++++++------- 2 files changed, 19 insertions(+), 16 deletions(-) diff --git a/Lib/packaging/run.py b/Lib/packaging/run.py --- a/Lib/packaging/run.py +++ b/Lib/packaging/run.py @@ -283,10 +283,11 @@ dist.parse_config_files() for cmd in dispatcher.commands: + # FIXME need to catch MetadataMissingError here (from the check command + # e.g.)--or catch any exception, print an error message and exit with 1 dist.run_command(cmd, dispatcher.command_options[cmd]) - # XXX this is crappy - return dist + return 0 @action_help("""\ diff --git a/Lib/packaging/tests/test_uninstall.py b/Lib/packaging/tests/test_uninstall.py --- a/Lib/packaging/tests/test_uninstall.py +++ b/Lib/packaging/tests/test_uninstall.py @@ -4,12 +4,9 @@ import logging import packaging.util -from io import StringIO -from packaging.run import main from packaging.errors import PackagingError from packaging.install import remove from packaging.database import disable_cache, enable_cache -from packaging.command.install_dist import install_dist from
packaging.tests import unittest, support @@ -47,16 +44,12 @@ packaging.util._path_created.clear() super(UninstallTestCase, self).tearDown() - def run_setup(self, *args): - # run setup with args - args = ['run'] + list(args) - dist = main(args) - return dist - def get_path(self, dist, name): - cmd = install_dist(dist) - cmd.prefix = self.root_dir - cmd.finalize_options() + # the dist argument must contain an install_dist command correctly + # initialized with a prefix option and finalized befored this method + # can be called successfully; practically, this means that you should + # call self.install_dist before self.get_path + cmd = dist.get_command_obj('install_dist') return getattr(cmd, 'install_' + name) def make_dist(self, name='Foo', **kw): @@ -83,8 +76,17 @@ if not dirname: dirname = self.make_dist(name, **kw) os.chdir(dirname) - sys.stderr = StringIO() - dist = self.run_setup('install_dist', '--prefix=' + self.root_dir) + + dist = support.TestDistribution() + # for some unfathomable reason, the tests will fail horribly if the + # parse_config_files method is not called, even if it doesn't do + # anything useful; trying to build and use a command object manually + # also fails + dist.parse_config_files() + dist.finalize_options() + dist.run_command('install_dist', + {'prefix': ('command line', self.root_dir)}) + + site_packages = self.get_path(dist, 'purelib') return dist, site_packages -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:17 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:17 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAobWVyZ2UgMy4yIC0+IDMuMik6?= =?utf8?q?_Branch_merge?= Message-ID: http://hg.python.org/cpython/rev/7b277d5530bc changeset: 72744:7b277d5530bc branch: 3.2 parent: 72708:a0393cbe4872 parent: 72736:bfb02edcad12 user: Éric Araujo date: Thu Oct 06 13:10:34 2011 +0200 summary: Branch merge files: Doc/library/sys.rst | 32 
++++++++++++++-------------- Lib/test/test_minidom.py | 2 +- 2 files changed, 17 insertions(+), 17 deletions(-) diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst --- a/Doc/library/sys.rst +++ b/Doc/library/sys.rst @@ -121,6 +121,15 @@ Use ``'backslashreplace'`` error handler on :exc:`UnicodeEncodeError`. +.. data:: dont_write_bytecode + + If this is true, Python won't try to write ``.pyc`` or ``.pyo`` files on the + import of source modules. This value is initially set to ``True`` or + ``False`` depending on the :option:`-B` command line option and the + :envvar:`PYTHONDONTWRITEBYTECODE` environment variable, but you can set it + yourself to control bytecode file generation. + + .. function:: excepthook(type, value, traceback) This function prints out a given traceback and exception to ``sys.stderr``. @@ -185,10 +194,10 @@ Python files are installed; by default, this is also ``'/usr/local'``. This can be set at build time with the ``--exec-prefix`` argument to the :program:`configure` script. Specifically, all configuration files (e.g. the - :file:`pyconfig.h` header file) are installed in the directory ``exec_prefix + - '/lib/pythonversion/config'``, and shared library modules are installed in - ``exec_prefix + '/lib/pythonversion/lib-dynload'``, where *version* is equal to - ``version[:3]``. + :file:`pyconfig.h` header file) are installed in the directory + :file:`{exec_prefix}/lib/python{X.Y}/config', and shared library modules are + installed in :file:`{exec_prefix}/lib/python{X.Y}/lib-dynload`, where *X.Y* + is the version number of Python, for example ``3.2``. .. data:: executable @@ -743,10 +752,10 @@ independent Python files are installed; by default, this is the string ``'/usr/local'``. This can be set at build time with the ``--prefix`` argument to the :program:`configure` script. 
The main collection of Python - library modules is installed in the directory ``prefix + '/lib/pythonversion'`` + library modules is installed in the directory :file:`{prefix}/lib/python{X.Y}`` while the platform independent header files (all except :file:`pyconfig.h`) are - stored in ``prefix + '/include/pythonversion'``, where *version* is equal to - ``version[:3]``. + stored in :file:`{prefix}/include/python{X.Y}``, where *X.Y* is the version + number of Python, for example ``3.2``. .. data:: ps1 @@ -764,15 +773,6 @@ implement a dynamic prompt. -.. data:: dont_write_bytecode - - If this is true, Python won't try to write ``.pyc`` or ``.pyo`` files on the - import of source modules. This value is initially set to ``True`` or ``False`` - depending on the ``-B`` command line option and the ``PYTHONDONTWRITEBYTECODE`` - environment variable, but you can set it yourself to control bytecode file - generation. - - .. function:: setcheckinterval(interval) Set the interpreter's "check interval". 
This integer value determines how often diff --git a/Lib/test/test_minidom.py b/Lib/test/test_minidom.py --- a/Lib/test/test_minidom.py +++ b/Lib/test/test_minidom.py @@ -446,7 +446,7 @@ dom.unlink() self.confirm(domstr == str.replace("\n", "\r\n")) - def test_toPrettyXML_perserves_content_of_text_node(self): + def test_toprettyxml_preserves_content_of_text_node(self): str = 'B' dom = parseString(str) dom2 = parseString(dom.toprettyxml()) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:17 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:17 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_default_-=3E_default?= =?utf8?q?=29=3A_Branch_merge?= Message-ID: http://hg.python.org/cpython/rev/4001211484e1 changeset: 72745:4001211484e1 parent: 72721:3636a39fa557 parent: 72743:ab125793243f user: Éric Araujo date: Thu Oct 06 13:22:21 2011 +0200 summary: Branch merge files: Doc/library/sys.rst | 34 +++--- Doc/whatsnew/3.3.rst | 20 +- Include/unicodeobject.h | 2 +- Lib/packaging/run.py | 5 +- Lib/packaging/tests/test_uninstall.py | 76 ++++++++------ Lib/packaging/tests/test_version.py | 20 +++- Lib/packaging/version.py | 2 +- Lib/sysconfig.cfg | 2 +- Lib/test/regrtest.py | 24 ++++ Lib/test/support.py | 3 +- Lib/test/test_minidom.py | 2 +- Misc/NEWS | 2 + 12 files changed, 118 insertions(+), 74 deletions(-) diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst --- a/Doc/library/sys.rst +++ b/Doc/library/sys.rst @@ -121,6 +121,15 @@ Use ``'backslashreplace'`` error handler on :exc:`UnicodeEncodeError`. +.. data:: dont_write_bytecode + + If this is true, Python won't try to write ``.pyc`` or ``.pyo`` files on the + import of source modules. This value is initially set to ``True`` or + ``False`` depending on the :option:`-B` command line option and the + :envvar:`PYTHONDONTWRITEBYTECODE` environment variable, but you can set it + yourself to control bytecode file generation. + + .. 
function:: excepthook(type, value, traceback) This function prints out a given traceback and exception to ``sys.stderr``. @@ -185,10 +194,10 @@ Python files are installed; by default, this is also ``'/usr/local'``. This can be set at build time with the ``--exec-prefix`` argument to the :program:`configure` script. Specifically, all configuration files (e.g. the - :file:`pyconfig.h` header file) are installed in the directory ``exec_prefix + - '/lib/pythonversion/config'``, and shared library modules are installed in - ``exec_prefix + '/lib/pythonversion/lib-dynload'``, where *version* is equal to - ``version[:3]``. + :file:`pyconfig.h` header file) are installed in the directory + :file:`{exec_prefix}/lib/python{X.Y}/config', and shared library modules are + installed in :file:`{exec_prefix}/lib/python{X.Y}/lib-dynload`, where *X.Y* + is the version number of Python, for example ``3.2``. .. data:: executable @@ -629,7 +638,7 @@ i.e. ``1114111`` (``0x10FFFF`` in hexadecimal). .. versionchanged:: 3.3 - Before :pep:`393`, :data:`sys.maxunicode` used to return either ``0xFFFF`` + Before :pep:`393`, ``sys.maxunicode`` used to be either ``0xFFFF`` or ``0x10FFFF``, depending on the configuration option that specified whether Unicode characters were stored as UCS-2 or UCS-4. @@ -750,10 +759,10 @@ independent Python files are installed; by default, this is the string ``'/usr/local'``. This can be set at build time with the ``--prefix`` argument to the :program:`configure` script. The main collection of Python - library modules is installed in the directory ``prefix + '/lib/pythonversion'`` + library modules is installed in the directory :file:`{prefix}/lib/python{X.Y}`` while the platform independent header files (all except :file:`pyconfig.h`) are - stored in ``prefix + '/include/pythonversion'``, where *version* is equal to - ``version[:3]``. + stored in :file:`{prefix}/include/python{X.Y}``, where *X.Y* is the version + number of Python, for example ``3.2``. .. 
data:: ps1 @@ -771,15 +780,6 @@ implement a dynamic prompt. -.. data:: dont_write_bytecode - - If this is true, Python won't try to write ``.pyc`` or ``.pyo`` files on the - import of source modules. This value is initially set to ``True`` or ``False`` - depending on the ``-B`` command line option and the ``PYTHONDONTWRITEBYTECODE`` - environment variable, but you can set it yourself to control bytecode file - generation. - - .. function:: setcheckinterval(interval) Set the interpreter's "check interval". This integer value determines how often diff --git a/Doc/whatsnew/3.3.rst b/Doc/whatsnew/3.3.rst --- a/Doc/whatsnew/3.3.rst +++ b/Doc/whatsnew/3.3.rst @@ -6,8 +6,7 @@ :Release: |release| :Date: |today| -.. $Id$ - Rules for maintenance: +.. Rules for maintenance: * Anyone can add text to this document. Do not spend very much time on the wording of your changes, because your text will probably @@ -40,25 +39,25 @@ * It's helpful to add the bug/patch number as a comment: - % Patch 12345 XXX Describe the transmogrify() function added to the socket module. - (Contributed by P.Y. Developer.) + (Contributed by P.Y. Developer in :issue:`12345`.) - This saves the maintainer the effort of going through the SVN log + This saves the maintainer the effort of going through the Mercurial log when researching a change. This article explains the new features in Python 3.3, compared to 3.2. -PEP XXX: Stub -============= - - PEP 393: Flexible String Representation ======================================= -XXX Give a short introduction about :pep:`393`. +[Abstract copied from the PEP: The Unicode string type is changed to support +multiple internal representations, depending on the character with the largest +Unicode ordinal (1, 2, or 4 bytes). This allows a space-efficient +representation in common cases, but gives access to full UCS-4 on all systems. 
+For compatibility with existing APIs, several representations may exist in +parallel; over time, this compatibility should be phased out.] PEP 393 is fully backward compatible. The legacy API should remain available at least five years. Applications using the legacy API will not @@ -109,6 +108,7 @@ XXX mention new and deprecated functions and macros + Other Language Changes ====================== diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -206,7 +206,7 @@ immediately follow the structure. utf8_length and wstr_length can be found in the length field; the utf8 pointer is equal to the data pointer. */ typedef struct { - /* There a 4 forms of Unicode strings: + /* There are 4 forms of Unicode strings: - compact ascii: diff --git a/Lib/packaging/run.py b/Lib/packaging/run.py --- a/Lib/packaging/run.py +++ b/Lib/packaging/run.py @@ -283,10 +283,11 @@ dist.parse_config_files() for cmd in dispatcher.commands: + # FIXME need to catch MetadataMissingError here (from the check command + # e.g.)--or catch any exception, print an error message and exit with 1 dist.run_command(cmd, dispatcher.command_options[cmd]) - # XXX this is crappy - return dist + return 0 @action_help("""\ diff --git a/Lib/packaging/tests/test_uninstall.py b/Lib/packaging/tests/test_uninstall.py --- a/Lib/packaging/tests/test_uninstall.py +++ b/Lib/packaging/tests/test_uninstall.py @@ -1,15 +1,12 @@ -"""Tests for the uninstall command.""" +"""Tests for the packaging.uninstall module.""" import os import sys -from io import StringIO -import stat +import logging import packaging.util -from packaging.database import disable_cache, enable_cache -from packaging.run import main from packaging.errors import PackagingError from packaging.install import remove -from packaging.command.install_dist import install_dist +from packaging.database import disable_cache, enable_cache from packaging.tests import unittest, support @@ -47,16 
+44,12 @@ packaging.util._path_created.clear() super(UninstallTestCase, self).tearDown() - def run_setup(self, *args): - # run setup with args - args = ['run'] + list(args) - dist = main(args) - return dist - def get_path(self, dist, name): - cmd = install_dist(dist) - cmd.prefix = self.root_dir - cmd.finalize_options() + # the dist argument must contain an install_dist command correctly + # initialized with a prefix option and finalized befored this method + # can be called successfully; practically, this means that you should + # call self.install_dist before self.get_path + cmd = dist.get_command_obj('install_dist') return getattr(cmd, 'install_' + name) def make_dist(self, name='Foo', **kw): @@ -83,43 +76,56 @@ if not dirname: dirname = self.make_dist(name, **kw) os.chdir(dirname) - old_out = sys.stderr - sys.stderr = StringIO() - dist = self.run_setup('install_dist', '--prefix=' + self.root_dir) - install_lib = self.get_path(dist, 'purelib') - return dist, install_lib - def test_uninstall_unknow_distribution(self): + dist = support.TestDistribution() + # for some unfathomable reason, the tests will fail horribly if the + # parse_config_files method is not called, even if it doesn't do + # anything useful; trying to build and use a command object manually + # also fails + dist.parse_config_files() + dist.finalize_options() + dist.run_command('install_dist', + {'prefix': ('command line', self.root_dir)}) + + site_packages = self.get_path(dist, 'purelib') + return dist, site_packages + + def test_uninstall_unknown_distribution(self): + dist, site_packages = self.install_dist('Foospam') self.assertRaises(PackagingError, remove, 'Foo', - paths=[self.root_dir]) + paths=[site_packages]) def test_uninstall(self): - dist, install_lib = self.install_dist() - self.assertIsFile(install_lib, 'foo', '__init__.py') - self.assertIsFile(install_lib, 'foo', 'sub', '__init__.py') - self.assertIsFile(install_lib, 'Foo-0.1.dist-info', 'RECORD') - self.assertTrue(remove('Foo', 
paths=[install_lib])) - self.assertIsNotFile(install_lib, 'foo', 'sub', '__init__.py') - self.assertIsNotFile(install_lib, 'Foo-0.1.dist-info', 'RECORD') + dist, site_packages = self.install_dist() + self.assertIsFile(site_packages, 'foo', '__init__.py') + self.assertIsFile(site_packages, 'foo', 'sub', '__init__.py') + self.assertIsFile(site_packages, 'Foo-0.1.dist-info', 'RECORD') + self.assertTrue(remove('Foo', paths=[site_packages])) + self.assertIsNotFile(site_packages, 'foo', 'sub', '__init__.py') + self.assertIsNotFile(site_packages, 'Foo-0.1.dist-info', 'RECORD') - def test_remove_issue(self): + def test_uninstall_error_handling(self): # makes sure if there are OSErrors (like permission denied) - # remove() stops and display a clean error - dist, install_lib = self.install_dist('Meh') + # remove() stops and displays a clean error + dist, site_packages = self.install_dist('Meh') # breaking os.rename old = os.rename def _rename(source, target): - raise OSError + raise OSError(42, 'impossible operation') os.rename = _rename try: - self.assertFalse(remove('Meh', paths=[install_lib])) + self.assertFalse(remove('Meh', paths=[site_packages])) finally: os.rename = old - self.assertTrue(remove('Meh', paths=[install_lib])) + logs = [log for log in self.get_logs(logging.INFO) + if log.startswith('Error:')] + self.assertEqual(logs, ['Error: [Errno 42] impossible operation']) + + self.assertTrue(remove('Meh', paths=[site_packages])) def test_suite(): diff --git a/Lib/packaging/tests/test_version.py b/Lib/packaging/tests/test_version.py --- a/Lib/packaging/tests/test_version.py +++ b/Lib/packaging/tests/test_version.py @@ -101,7 +101,17 @@ True >>> V('1.2.0') >= V('1.2.3') False - >>> (V('1.0') > V('1.0b2')) + >>> V('1.2.0rc1') >= V('1.2.0') + False + >>> V('1.0') > V('1.0b2') + True + >>> V('1.0') > V('1.0c2') + True + >>> V('1.0') > V('1.0rc2') + True + >>> V('1.0rc2') > V('1.0rc1') + True + >>> V('1.0c4') > V('1.0c1') True >>> (V('1.0') > V('1.0c2') > V('1.0c1') > 
V('1.0b2') > V('1.0b1') ... > V('1.0a2') > V('1.0a1')) @@ -129,6 +139,8 @@ ... < V('1.0.dev18') ... < V('1.0.dev456') ... < V('1.0.dev1234') + ... < V('1.0rc1') + ... < V('1.0rc2') ... < V('1.0') ... < V('1.0.post456.dev623') # development version of a post release ... < V('1.0.post456')) @@ -236,9 +248,9 @@ def test_parse_numdots(self): # For code coverage completeness, as pad_zeros_length can't be set or # influenced from the public interface - self.assertEqual(V('1.0')._parse_numdots('1.0', '1.0', - pad_zeros_length=3), - [1, 0, 0]) + self.assertEqual( + V('1.0')._parse_numdots('1.0', '1.0', pad_zeros_length=3), + [1, 0, 0]) def test_suite(): diff --git a/Lib/packaging/version.py b/Lib/packaging/version.py --- a/Lib/packaging/version.py +++ b/Lib/packaging/version.py @@ -253,7 +253,7 @@ # if we have something like "b-2" or "a.2" at the end of the # version, that is pobably beta, alpha, etc # let's remove the dash or dot - rs = re.sub(r"([abc|rc])[\-\.](\d+)$", r"\1\2", rs) + rs = re.sub(r"([abc]|rc)[\-\.](\d+)$", r"\1\2", rs) # 1.0-dev-r371 -> 1.0.dev371 # 0.1-dev-r79 -> 0.1.dev79 diff --git a/Lib/sysconfig.cfg b/Lib/sysconfig.cfg --- a/Lib/sysconfig.cfg +++ b/Lib/sysconfig.cfg @@ -31,7 +31,7 @@ # be used directly in [resource_locations]. 
confdir = /etc datadir = /usr/share -libdir = /usr/lib ; or /usr/lib64 on a multilib system +libdir = /usr/lib statedir = /var # User resource directory local = ~/.local/{distribution.name} diff --git a/Lib/test/regrtest.py b/Lib/test/regrtest.py --- a/Lib/test/regrtest.py +++ b/Lib/test/regrtest.py @@ -173,6 +173,7 @@ import json import logging import os +import packaging.database import platform import random import re @@ -967,6 +968,7 @@ 'sys.warnoptions', 'threading._dangling', 'multiprocessing.process._dangling', 'sysconfig._CONFIG_VARS', 'sysconfig._SCHEMES', + 'packaging.database_caches', ) def get_sys_argv(self): @@ -1054,6 +1056,28 @@ # Can't easily revert the logging state pass + def get_packaging_database_caches(self): + # caching system used by the PEP 376 implementation + # we have one boolean and four dictionaries, initially empty + switch = packaging.database._cache_enabled + saved = [] + for name in ('_cache_name', '_cache_name_egg', + '_cache_path', '_cache_path_egg'): + cache = getattr(packaging.database, name) + saved.append((id(cache), cache, cache.copy())) + return switch, saved + def restore_packaging_database_caches(self, saved): + switch, saved_caches = saved + packaging.database._cache_enabled = switch + for offset, name in enumerate(('_cache_name', '_cache_name_egg', + '_cache_path', '_cache_path_egg')): + _, cache, items = saved_caches[offset] + # put back the same object in place + setattr(packaging.database, name, cache) + # now restore its items + cache.clear() + cache.update(items) + def get_sys_warnoptions(self): return id(sys.warnoptions), sys.warnoptions, sys.warnoptions[:] def restore_sys_warnoptions(self, saved_options): diff --git a/Lib/test/support.py b/Lib/test/support.py --- a/Lib/test/support.py +++ b/Lib/test/support.py @@ -187,8 +187,7 @@ try: attribute = getattr(obj, name) except AttributeError: - raise unittest.SkipTest("module %s has no attribute %s" % ( - repr(obj), name)) + raise unittest.SkipTest("object %r has no 
attribute %r" % (obj, name)) else: return attribute diff --git a/Lib/test/test_minidom.py b/Lib/test/test_minidom.py --- a/Lib/test/test_minidom.py +++ b/Lib/test/test_minidom.py @@ -467,7 +467,7 @@ dom.unlink() self.confirm(domstr == str.replace("\n", "\r\n")) - def test_toPrettyXML_perserves_content_of_text_node(self): + def test_toprettyxml_preserves_content_of_text_node(self): str = 'B' dom = parseString(str) dom2 = parseString(dom.toprettyxml()) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -1261,6 +1261,8 @@ Build ----- +- PEP 393: the configure option --with-wide-unicode is removed. + - Issue #12852: Set _XOPEN_SOURCE to 700, instead of 600, to get POSIX 2008 functions on OpenBSD (e.g. fdopendir). -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:24:18 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:24:18 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Merge_3=2E2?= Message-ID: http://hg.python.org/cpython/rev/8bae9c603d04 changeset: 72746:8bae9c603d04 parent: 72745:4001211484e1 parent: 72744:7b277d5530bc user: ?ric Araujo date: Thu Oct 06 13:23:50 2011 +0200 summary: Merge 3.2 files: -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 13:26:32 2011 From: python-checkins at python.org (eric.araujo) Date: Thu, 06 Oct 2011 13:26:32 +0200 Subject: [Python-checkins] =?utf8?q?peps=3A_Use_Python_3_syntax?= Message-ID: http://hg.python.org/peps/rev/626378e168c5 changeset: 3954:626378e168c5 user: ?ric Araujo date: Thu Oct 06 13:26:09 2011 +0200 summary: Use Python 3 syntax files: pep-0335.txt | 33 +++++++++++++++++---------------- 1 files changed, 17 insertions(+), 16 deletions(-) diff --git a/pep-0335.txt b/pep-0335.txt --- a/pep-0335.txt +++ b/pep-0335.txt @@ -366,29 +366,29 @@ return "barray(%s)" % ndarray.__str__(self) def __and2__(self, other): - return (self & other) + 
return self & other def __or2__(self, other): - return (self & other) + return self & other def __not__(self): - return (self == 0) + return self == 0 def barray(*args, **kwds): - return array(*args, **kwds).view(type = BArray) + return array(*args, **kwds).view(type=BArray) a0 = barray([0, 1, 2, 4]) a1 = barray([1, 2, 3, 4]) a2 = barray([5, 6, 3, 4]) a3 = barray([5, 1, 2, 4]) - print "a0:", a0 - print "a1:", a1 - print "a2:", a2 - print "a3:", a3 - print "not a0:", not a0 - print "a0 == a1 and a2 == a3:", a0 == a1 and a2 == a3 - print "a0 == a1 or a2 == a3:", a0 == a1 or a2 == a3 + print("a0:", a0) + print("a1:", a1) + print("a2:", a2) + print("a3:", a3) + print("not a0:", not a0) + print("a0 == a1 and a2 == a3:", a0 == a1 and a2 == a3) + print("a0 == a1 or a2 == a3:", a0 == a1 or a2 == a3) Example 1 Output ---------------- @@ -417,7 +417,7 @@ # #----------------------------------------------------------------- - class SQLNode(object): + class SQLNode: def __and2__(self, other): return SQLBinop("and", self, other) @@ -473,7 +473,7 @@ return self def __sql__(self): - result = "SELECT %s" % ", ".join([sql(target) for target in self.targets]) + result = "SELECT %s" % ", ".join(sql(target) for target in self.targets) if self.where_clause: result = "%s WHERE %s" % (result, sql(self.where_clause)) return result @@ -491,9 +491,10 @@ def select(*targets): return SQLSelect(targets) - + #-------------------------------------------------------------------------------- +:: dishes = Table("dishes") customers = Table("customers") orders = Table("orders") @@ -502,8 +503,8 @@ customers.cust_id == orders.cust_id and orders.dish_id == dishes.dish_id and dishes.name == "Spam, Eggs, Sausages and Spam") - print repr(query) - print sql(query) + print(repr(query)) + print(sql(query)) Example 2 Output ---------------- -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Thu Oct 6 13:30:11 2011 From: python-checkins at python.org (victor.stinner) Date: Thu, 06 
Oct 2011 13:30:11 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_assertion_in_unicode=5F?= =?utf8?q?adjust=5Fmaxchar=28=29?= Message-ID: http://hg.python.org/cpython/rev/bc2a43943507 changeset: 72747:bc2a43943507 user: Victor Stinner date: Thu Oct 06 13:27:56 2011 +0200 summary: Fix assertion in unicode_adjust_maxchar() files: Objects/unicodeobject.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1791,7 +1791,7 @@ } } } - assert(max_char > PyUnicode_MAX_CHAR_VALUE(unicode)); + assert(max_char < PyUnicode_MAX_CHAR_VALUE(unicode)); copy = PyUnicode_New(len, max_char); copy_characters(copy, 0, unicode, 0, len); Py_DECREF(unicode); -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 15:29:09 2011 From: python-checkins at python.org (antoine.pitrou) Date: Thu, 06 Oct 2011 15:29:09 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_compilation_under_Windo?= =?utf8?q?ws?= Message-ID: http://hg.python.org/cpython/rev/542c8da10de9 changeset: 72748:542c8da10de9 user: Antoine Pitrou date: Thu Oct 06 15:25:32 2011 +0200 summary: Fix compilation under Windows files: Objects/unicodeobject.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -1780,8 +1780,9 @@ } } else { + const Py_UCS4 *u; assert(kind == PyUnicode_4BYTE_KIND); - const Py_UCS4 *u = PyUnicode_4BYTE_DATA(unicode); + u = PyUnicode_4BYTE_DATA(unicode); max_char = 0; for (i = 0; i < len; i++) { if (u[i] > max_char) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 15:31:22 2011 From: python-checkins at python.org (antoine.pitrou) Date: Thu, 06 Oct 2011 15:31:22 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Issue_=233163=3A_The_struct?= 
=?utf8?q?_module_gets_new_format_characters_=27n=27_and_=27N=27?= Message-ID: http://hg.python.org/cpython/rev/db3e15017172 changeset: 72749:db3e15017172 user: Antoine Pitrou date: Thu Oct 06 15:27:40 2011 +0200 summary: Issue #3163: The struct module gets new format characters 'n' and 'N' supporting C integer types `ssize_t` and `size_t`, respectively. files: Doc/library/struct.rst | 21 +++++- Lib/test/test_struct.py | 66 +++++++++++++-------- Misc/NEWS | 3 + Modules/_struct.c | 90 ++++++++++++++++++++++++++++- 4 files changed, 150 insertions(+), 30 deletions(-) diff --git a/Doc/library/struct.rst b/Doc/library/struct.rst --- a/Doc/library/struct.rst +++ b/Doc/library/struct.rst @@ -187,17 +187,24 @@ | ``Q`` | :c:type:`unsigned long | integer | 8 | \(2), \(3) | | | long` | | | | +--------+--------------------------+--------------------+----------------+------------+ -| ``f`` | :c:type:`float` | float | 4 | \(4) | +| ``n`` | :c:type:`ssize_t` | integer | | \(4) | +--------+--------------------------+--------------------+----------------+------------+ -| ``d`` | :c:type:`double` | float | 8 | \(4) | +| ``N`` | :c:type:`size_t` | integer | | \(4) | ++--------+--------------------------+--------------------+----------------+------------+ +| ``f`` | :c:type:`float` | float | 4 | \(5) | ++--------+--------------------------+--------------------+----------------+------------+ +| ``d`` | :c:type:`double` | float | 8 | \(5) | +--------+--------------------------+--------------------+----------------+------------+ | ``s`` | :c:type:`char[]` | bytes | | | +--------+--------------------------+--------------------+----------------+------------+ | ``p`` | :c:type:`char[]` | bytes | | | +--------+--------------------------+--------------------+----------------+------------+ -| ``P`` | :c:type:`void \*` | integer | | \(5) | +| ``P`` | :c:type:`void \*` | integer | | \(6) | +--------+--------------------------+--------------------+----------------+------------+ +.. 
versionchanged:: 3.3 + Added support for the ``'n'`` and ``'N'`` formats. + Notes: (1) @@ -219,11 +226,17 @@ Use of the :meth:`__index__` method for non-integers is new in 3.2. (4) + The ``'n'`` and ``'N'`` conversion codes are only available for the native + size (selected as the default or with the ``'@'`` byte order character). + For the standard size, you can use whichever of the other integer formats + fits your application. + +(5) For the ``'f'`` and ``'d'`` conversion codes, the packed representation uses the IEEE 754 binary32 (for ``'f'``) or binary64 (for ``'d'``) format, regardless of the floating-point format used by the platform. -(5) +(6) The ``'P'`` format character is only available for the native byte ordering (selected as the default or with the ``'@'`` byte order character). The byte order character ``'='`` chooses to use little- or big-endian ordering based diff --git a/Lib/test/test_struct.py b/Lib/test/test_struct.py --- a/Lib/test/test_struct.py +++ b/Lib/test/test_struct.py @@ -8,9 +8,19 @@ ISBIGENDIAN = sys.byteorder == "big" IS32BIT = sys.maxsize == 0x7fffffff -integer_codes = 'b', 'B', 'h', 'H', 'i', 'I', 'l', 'L', 'q', 'Q' +integer_codes = 'b', 'B', 'h', 'H', 'i', 'I', 'l', 'L', 'q', 'Q', 'n', 'N' byteorders = '', '@', '=', '<', '>', '!' +def iter_integer_formats(byteorders=byteorders): + for code in integer_codes: + for byteorder in byteorders: + if (byteorder in ('', '@') and code in ('q', 'Q') and + not HAVE_LONG_LONG): + continue + if (byteorder not in ('', '@') and code in ('n', 'N')): + continue + yield code, byteorder + # Native 'q' packing isn't available on systems that don't have the C # long long type. 
try: @@ -141,14 +151,13 @@ } # standard integer sizes - for code in integer_codes: - for byteorder in '=', '<', '>', '!': - format = byteorder+code - size = struct.calcsize(format) - self.assertEqual(size, expected_size[code]) + for code, byteorder in iter_integer_formats(('=', '<', '>', '!')): + format = byteorder+code + size = struct.calcsize(format) + self.assertEqual(size, expected_size[code]) # native integer sizes - native_pairs = 'bB', 'hH', 'iI', 'lL' + native_pairs = 'bB', 'hH', 'iI', 'lL', 'nN' if HAVE_LONG_LONG: native_pairs += 'qQ', for format_pair in native_pairs: @@ -166,9 +175,11 @@ if HAVE_LONG_LONG: self.assertLessEqual(8, struct.calcsize('q')) self.assertLessEqual(struct.calcsize('l'), struct.calcsize('q')) + self.assertGreaterEqual(struct.calcsize('n'), struct.calcsize('i')) + self.assertGreaterEqual(struct.calcsize('n'), struct.calcsize('P')) def test_integers(self): - # Integer tests (bBhHiIlLqQ). + # Integer tests (bBhHiIlLqQnN). import binascii class IntTester(unittest.TestCase): @@ -182,11 +193,11 @@ self.byteorder) self.bytesize = struct.calcsize(format) self.bitsize = self.bytesize * 8 - if self.code in tuple('bhilq'): + if self.code in tuple('bhilqn'): self.signed = True self.min_value = -(2**(self.bitsize-1)) self.max_value = 2**(self.bitsize-1) - 1 - elif self.code in tuple('BHILQ'): + elif self.code in tuple('BHILQN'): self.signed = False self.min_value = 0 self.max_value = 2**self.bitsize - 1 @@ -316,14 +327,23 @@ struct.pack, self.format, obj) - for code in integer_codes: - for byteorder in byteorders: - if (byteorder in ('', '@') and code in ('q', 'Q') and - not HAVE_LONG_LONG): - continue + for code, byteorder in iter_integer_formats(): + format = byteorder+code + t = IntTester(format) + t.run() + + def test_nN_code(self): + # n and N don't exist in standard sizes + def assertStructError(func, *args, **kwargs): + with self.assertRaises(struct.error) as cm: + func(*args, **kwargs) + self.assertIn("bad char in struct format", 
str(cm.exception)) + for code in 'nN': + for byteorder in ('=', '<', '>', '!'): format = byteorder+code - t = IntTester(format) - t.run() + assertStructError(struct.calcsize, format) + assertStructError(struct.pack, format, 0) + assertStructError(struct.unpack, format, b"") def test_p_code(self): # Test p ("Pascal string") code. @@ -377,14 +397,10 @@ self.assertRaises(OverflowError, struct.pack, ">f", big) def test_1530559(self): - for byteorder in '', '@', '=', '<', '>', '!': - for code in integer_codes: - if (byteorder in ('', '@') and code in ('q', 'Q') and - not HAVE_LONG_LONG): - continue - format = byteorder + code - self.assertRaises(struct.error, struct.pack, format, 1.0) - self.assertRaises(struct.error, struct.pack, format, 1.5) + for code, byteorder in iter_integer_formats(): + format = byteorder + code + self.assertRaises(struct.error, struct.pack, format, 1.0) + self.assertRaises(struct.error, struct.pack, format, 1.5) self.assertRaises(struct.error, struct.pack, 'P', 1.0) self.assertRaises(struct.error, struct.pack, 'P', 1.5) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -294,6 +294,9 @@ Library ------- +- Issue #3163: The struct module gets new format characters 'n' and 'N' + supporting C integer types ``ssize_t`` and ``size_t``, respectively. + - Issue #13099: Fix sqlite3.Cursor.lastrowid under a Turkish locale. Reported and diagnosed by Thomas Kluyver. 
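For readers trying the new codes, a minimal sketch of the behaviour the documentation and tests above describe (requires Python 3.3+; the exact sizes are platform-dependent):

```python
import struct

# 'n' (ssize_t) and 'N' (size_t) are native-size only; calcsize is
# platform-dependent (typically 4 on 32-bit builds, 8 on 64-bit).
assert struct.calcsize('n') == struct.calcsize('N')

# Round-trip an unsigned value through 'N'.
packed = struct.pack('N', 2 ** 16)
(value,) = struct.unpack('N', packed)
assert value == 2 ** 16

# 'n' is signed, so negative values round-trip too.
assert struct.unpack('n', struct.pack('n', -1)) == (-1,)

# The standard-size byte orders reject both codes, as the new
# test_nN_code above checks.
for fmt in ('=n', '<N', '>n', '!N'):
    try:
        struct.calcsize(fmt)
    except struct.error as exc:
        assert 'bad char in struct format' in str(exc)
    else:
        raise AssertionError('expected struct.error for %r' % fmt)
```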
diff --git a/Modules/_struct.c b/Modules/_struct.c --- a/Modules/_struct.c +++ b/Modules/_struct.c @@ -58,6 +58,7 @@ typedef struct { char c; float x; } st_float; typedef struct { char c; double x; } st_double; typedef struct { char c; void *x; } st_void_p; +typedef struct { char c; size_t x; } st_size_t; #define SHORT_ALIGN (sizeof(st_short) - sizeof(short)) #define INT_ALIGN (sizeof(st_int) - sizeof(int)) @@ -65,6 +66,7 @@ #define FLOAT_ALIGN (sizeof(st_float) - sizeof(float)) #define DOUBLE_ALIGN (sizeof(st_double) - sizeof(double)) #define VOID_P_ALIGN (sizeof(st_void_p) - sizeof(void *)) +#define SIZE_T_ALIGN (sizeof(st_size_t) - sizeof(size_t)) /* We can't support q and Q in native mode unless the compiler does; in std mode, they're 8 bytes on all platforms. */ @@ -213,6 +215,52 @@ #endif +/* Same, but handling Py_ssize_t */ + +static int +get_ssize_t(PyObject *v, Py_ssize_t *p) +{ + Py_ssize_t x; + + v = get_pylong(v); + if (v == NULL) + return -1; + assert(PyLong_Check(v)); + x = PyLong_AsSsize_t(v); + Py_DECREF(v); + if (x == (Py_ssize_t)-1 && PyErr_Occurred()) { + if (PyErr_ExceptionMatches(PyExc_OverflowError)) + PyErr_SetString(StructError, + "argument out of range"); + return -1; + } + *p = x; + return 0; +} + +/* Same, but handling size_t */ + +static int +get_size_t(PyObject *v, size_t *p) +{ + size_t x; + + v = get_pylong(v); + if (v == NULL) + return -1; + assert(PyLong_Check(v)); + x = PyLong_AsSize_t(v); + Py_DECREF(v); + if (x == (size_t)-1 && PyErr_Occurred()) { + if (PyErr_ExceptionMatches(PyExc_OverflowError)) + PyErr_SetString(StructError, + "argument out of range"); + return -1; + } + *p = x; + return 0; +} + #define RANGE_ERROR(x, f, flag, mask) return _range_error(f, flag) @@ -369,6 +417,23 @@ return PyLong_FromUnsignedLong(x); } +static PyObject * +nu_ssize_t(const char *p, const formatdef *f) +{ + Py_ssize_t x; + memcpy((char *)&x, p, sizeof x); + return PyLong_FromSsize_t(x); +} + +static PyObject * +nu_size_t(const char *p, const 
formatdef *f) +{ + size_t x; + memcpy((char *)&x, p, sizeof x); + return PyLong_FromSize_t(x); +} + + /* Native mode doesn't support q or Q unless the platform C supports long long (or, on Windows, __int64). */ @@ -558,6 +623,26 @@ return 0; } +static int +np_ssize_t(char *p, PyObject *v, const formatdef *f) +{ + Py_ssize_t x; + if (get_ssize_t(v, &x) < 0) + return -1; + memcpy(p, (char *)&x, sizeof x); + return 0; +} + +static int +np_size_t(char *p, PyObject *v, const formatdef *f) +{ + size_t x; + if (get_size_t(v, &x) < 0) + return -1; + memcpy(p, (char *)&x, sizeof x); + return 0; +} + #ifdef HAVE_LONG_LONG static int @@ -651,6 +736,8 @@ {'I', sizeof(int), INT_ALIGN, nu_uint, np_uint}, {'l', sizeof(long), LONG_ALIGN, nu_long, np_long}, {'L', sizeof(long), LONG_ALIGN, nu_ulong, np_ulong}, + {'n', sizeof(size_t), SIZE_T_ALIGN, nu_ssize_t, np_ssize_t}, + {'N', sizeof(size_t), SIZE_T_ALIGN, nu_size_t, np_size_t}, #ifdef HAVE_LONG_LONG {'q', sizeof(PY_LONG_LONG), LONG_LONG_ALIGN, nu_longlong, np_longlong}, {'Q', sizeof(PY_LONG_LONG), LONG_LONG_ALIGN, nu_ulonglong,np_ulonglong}, @@ -1951,7 +2038,8 @@ l:long; L:unsigned long; f:float; d:double.\n\ Special cases (preceding decimal count indicates length):\n\ s:string (array of char); p: pascal string (with count byte).\n\ -Special case (only available in native format):\n\ +Special cases (only available in native format):\n\ + n:ssize_t; N:size_t;\n\ P:an integer type that is wide enough to hold a pointer.\n\ Special case (not in native mode unless 'long long' in platform C):\n\ q:long long; Q:unsigned long long\n\ -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 15:38:19 2011 From: python-checkins at python.org (antoine.pitrou) Date: Thu, 06 Oct 2011 15:38:19 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_compilation_warnings_un?= =?utf8?q?der_64-bit_Windows?= Message-ID: http://hg.python.org/cpython/rev/1a0715386d27 changeset: 72750:1a0715386d27 user: Antoine 
Pitrou date: Thu Oct 06 15:34:41 2011 +0200 summary: Fix compilation warnings under 64-bit Windows files: Include/unicodeobject.h | 5 +++-- Objects/stringlib/unicode_format.h | 2 +- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -441,7 +441,7 @@ See also PyUnicode_KIND_SIZE(). */ #define PyUnicode_CHARACTER_SIZE(op) \ - (1 << (PyUnicode_KIND(op) - 1)) + ((Py_ssize_t) (1 << (PyUnicode_KIND(op) - 1))) /* Return pointers to the canonical representation cast to unsigned char, Py_UCS2, or Py_UCS4 for direct character access. @@ -477,7 +477,8 @@ The index is a character index, the result is a size in bytes. See also PyUnicode_CHARACTER_SIZE(). */ -#define PyUnicode_KIND_SIZE(kind, index) ((index) << ((kind) - 1)) +#define PyUnicode_KIND_SIZE(kind, index) \ + ((Py_ssize_t) ((index) << ((kind) - 1))) /* In the access macros below, "kind" may be evaluated more than once. 
All other macro parameters are evaluated exactly once, so it is safe diff --git a/Objects/stringlib/unicode_format.h b/Objects/stringlib/unicode_format.h --- a/Objects/stringlib/unicode_format.h +++ b/Objects/stringlib/unicode_format.h @@ -56,7 +56,7 @@ /* fill in a SubString from a pointer and length */ Py_LOCAL_INLINE(void) -SubString_init(SubString *str, PyObject *s, int start, int end) +SubString_init(SubString *str, PyObject *s, Py_ssize_t start, Py_ssize_t end) { str->str = s; str->start = start; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 15:47:52 2011 From: python-checkins at python.org (antoine.pitrou) Date: Thu, 06 Oct 2011 15:47:52 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_compilation_warnings_un?= =?utf8?q?der_64-bit_Windows?= Message-ID: http://hg.python.org/cpython/rev/d760b345d7bd changeset: 72751:d760b345d7bd user: Antoine Pitrou date: Thu Oct 06 15:44:15 2011 +0200 summary: Fix compilation warnings under 64-bit Windows files: Modules/unicodedata.c | 8 ++++---- 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/Modules/unicodedata.c b/Modules/unicodedata.c --- a/Modules/unicodedata.c +++ b/Modules/unicodedata.c @@ -616,13 +616,13 @@ static int find_nfc_index(PyObject *self, struct reindex* nfc, Py_UCS4 code) { - int index; + unsigned int index; for (index = 0; nfc[index].start; index++) { - int start = nfc[index].start; + unsigned int start = nfc[index].start; if (code < start) return -1; if (code <= start + nfc[index].count) { - int delta = code - start; + unsigned int delta = code - start; return nfc[index].index + delta; } } @@ -1038,7 +1038,7 @@ *len = -1; for (i = 0; i < count; i++) { char *s = hangul_syllables[i][column]; - len1 = strlen(s); + len1 = Py_SAFE_DOWNCAST(strlen(s), size_t, int); if (len1 <= *len) continue; if (strncmp(str, s, len1) == 0) { -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 15:58:59 2011 From: 
python-checkins at python.org (victor.stinner) Date: Thu, 06 Oct 2011 15:58:59 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_PyUnicode=5FCHARACTER?= =?utf8?q?=5FSIZE_and_PyUnicode=5FKIND=5FSIZE?= Message-ID: http://hg.python.org/cpython/rev/c512e9759059 changeset: 72752:c512e9759059 user: Victor Stinner date: Thu Oct 06 15:54:53 2011 +0200 summary: Fix PyUnicode_CHARACTER_SIZE and PyUnicode_KIND_SIZE files: Include/unicodeobject.h | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -441,7 +441,7 @@ See also PyUnicode_KIND_SIZE(). */ #define PyUnicode_CHARACTER_SIZE(op) \ - ((Py_ssize_t) (1 << (PyUnicode_KIND(op) - 1))) + (((Py_ssize_t)1 << (PyUnicode_KIND(op) - 1))) /* Return pointers to the canonical representation cast to unsigned char, Py_UCS2, or Py_UCS4 for direct character access. @@ -478,7 +478,7 @@ See also PyUnicode_CHARACTER_SIZE(). */ #define PyUnicode_KIND_SIZE(kind, index) \ - ((Py_ssize_t) ((index) << ((kind) - 1))) + (((Py_ssize_t)(index)) << ((kind) - 1)) /* In the access macros below, "kind" may be evaluated more than once. 
All other macro parameters are evaluated exactly once, so it is safe -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 15:59:00 2011 From: python-checkins at python.org (victor.stinner) Date: Thu, 06 Oct 2011 15:59:00 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_PyUnicode=5FJoin=28=29_?= =?utf8?q?for_len=3D=3D1_and_non-exact_string?= Message-ID: http://hg.python.org/cpython/rev/9a91ab415109 changeset: 72753:9a91ab415109 user: Victor Stinner date: Thu Oct 06 15:58:54 2011 +0200 summary: Fix PyUnicode_Join() for len==1 and non-exact string files: Objects/unicodeobject.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c --- a/Objects/unicodeobject.c +++ b/Objects/unicodeobject.c @@ -9154,6 +9154,7 @@ return res; } sep = NULL; + maxchar = 0; } else { /* Set up sep and seplen */ @@ -9203,8 +9204,7 @@ goto onError; sz += PyUnicode_GET_LENGTH(item); item_maxchar = PyUnicode_MAX_CHAR_VALUE(item); - if (item_maxchar > maxchar) - maxchar = item_maxchar; + maxchar = Py_MAX(maxchar, item_maxchar); if (i != 0) sz += seplen; if (sz < old_sz || sz > PY_SSIZE_T_MAX) { -- Repository URL: http://hg.python.org/cpython From martin at v.loewis.de Thu Oct 6 16:10:20 2011 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 06 Oct 2011 16:10:20 +0200 Subject: [Python-checkins] cpython: Fix find_module_path(): make the string ready In-Reply-To: References: Message-ID: <4E8DB6CC.7070104@v.loewis.de> > + if (PyUnicode_READY(path_unicode)) > + return -1; > + I think we need to discuss/reconsider the return value of PyUnicode_READY. It's defined to give -1 on error currently. If that sounds good, then the check for error should be a check that it is -1. 
Regards, Martin From python-checkins at python.org Thu Oct 6 16:39:06 2011 From: python-checkins at python.org (senthil.kumaran) Date: Thu, 06 Oct 2011 16:39:06 +0200 Subject: [Python-checkins] =?utf8?q?peps=3A_Fix_broken_links_in_pep-0001?= =?utf8?q?=2Etxt=2E_Update_the_information_with_pointers_to_hg?= Message-ID: http://hg.python.org/peps/rev/398596beeee3 changeset: 3955:398596beeee3 user: Senthil Kumaran date: Thu Oct 06 22:38:55 2011 +0800 summary: Fix broken links in pep-0001.txt. Update the information with pointers to hg and devguide. files: pep-0001.txt | 56 ++++++++++++++++----------------------- 1 files changed, 23 insertions(+), 33 deletions(-) diff --git a/pep-0001.txt b/pep-0001.txt --- a/pep-0001.txt +++ b/pep-0001.txt @@ -116,7 +116,7 @@ approval phase, and is the final arbiter of the draft's PEP-ability. As updates are necessary, the PEP author can check in new versions if -they have SVN commit permissions, or can email new PEP versions to +they have hg push privileges, or can email new PEP versions to the PEP editor for committing. Standards Track PEPs consist of two parts, a design document and a @@ -129,11 +129,10 @@ PEP authors are responsible for collecting community feedback on a PEP before submitting it for review. However, wherever possible, long open-ended discussions on public mailing lists should be avoided. -Strategies to keep the -discussions efficient include: setting up a separate SIG mailing list -for the topic, having the PEP author accept private comments in the -early design phases, setting up a wiki page, etc. PEP authors should -use their discretion here. +Strategies to keep the discussions efficient include: setting up a +separate SIG mailing list for the topic, having the PEP author accept +private comments in the early design phases, setting up a wiki page, etc. +PEP authors should use their discretion here. Once the authors have completed a PEP, they must inform the PEP editor that it is ready for review. 
PEPs are reviewed by the BDFL and his @@ -266,8 +265,8 @@ PEP: Title: - Version: - Last-Modified: + Version: + Last-Modified: Author: * Discussions-To: Status: `_. +* Add the PEP to Mercurial. For mercurial work flow instructions, follow + `The Python Developers Guide `_ - The command to check out a read-only copy of the repository is:: + The mercurial repo for the peps is:: - svn checkout http://svn.python.org/projects/peps/trunk peps + http://hg.python.org/peps/ - The command to check out a read-write copy of the repository is:: - - svn checkout svn+ssh://pythondev at svn.python.org/peps/trunk peps - - In particular, the ``svn:eol-style`` property should be set to ``native`` - and the ``svn:keywords`` property to ``Author Date Id Revision``. * Monitor python.org to make sure the PEP gets added to the site properly. @@ -442,7 +434,7 @@ python-list & -dev). Updates to existing PEPs also come in to peps at python.org. Many PEP -authors are not SVN committers yet, so we do the commits for them. +authors are not Python committers yet, so we do the commits for them. Many PEPs are written and maintained by developers with write access to the Python codebase. The PEP editors monitor the python-checkins @@ -455,25 +447,23 @@ Resources: -* `How Python is Developed `_ +* `Index of Python Enhancement Proposals `_ -* `Python's Development Process `_ +* `Following Python's Development + `_ -* `Why Develop Python? `_ - -* `Development Tools `_ +* `Python Developer's Guide `_ * `Frequently Asked Questions for Developers - `_ - + `_ References and Footnotes ======================== -.. [1] This historical record is available by the normal SVN commands +.. [1] This historical record is available by the normal hg commands for retrieving older revisions. 
For those without direct access to - the SVN tree, you can browse the current and past PEP revisions here: - http://svn.python.org/view/peps/trunk/ + the hg repo, you can browse the current and past PEP revisions here: + http://hg.python.org/peps/ .. [2] PEP 2, Procedure for Adding New Modules, Faassen (http://www.python.org/dev/peps/pep-0002) @@ -485,8 +475,8 @@ (http://www.python.org/dev/peps/pep-0012) .. [5] The script referred to here is pep2pyramid.py, the successor to - pep2html.py, both of which live in the same directory in the SVN - tree as the PEPs themselves. Try ``pep2html.py --help`` for + pep2html.py, both of which live in the same directory in the hg + repo as the PEPs themselves. Try ``pep2html.py --help`` for details. The URL for viewing PEPs on the web is http://www.python.org/dev/peps/. -- Repository URL: http://hg.python.org/peps From python-checkins at python.org Thu Oct 6 19:07:40 2011 From: python-checkins at python.org (charles-francois.natali) Date: Thu, 06 Oct 2011 19:07:40 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMi43KTogSXNzdWUgIzEzMDcw?= =?utf8?q?=3A_Fix_a_crash_when_a_TextIOWrapper_caught_in_a_reference_cycle?= Message-ID: http://hg.python.org/cpython/rev/89b9e4bf6f1f changeset: 72754:89b9e4bf6f1f branch: 2.7 parent: 72725:83c486c1112c user: Charles-Fran?ois Natali date: Thu Oct 06 19:09:45 2011 +0200 summary: Issue #13070: Fix a crash when a TextIOWrapper caught in a reference cycle would be finalized after the reference to its underlying BufferedRWPair's writer got cleared by the GC. 
files: Lib/test/test_io.py | 15 +++++++++++++++ Misc/NEWS | 4 ++++ Modules/_io/bufferedio.c | 5 +++++ 3 files changed, 24 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_io.py b/Lib/test/test_io.py --- a/Lib/test/test_io.py +++ b/Lib/test/test_io.py @@ -2338,6 +2338,21 @@ with self.open(support.TESTFN, "rb") as f: self.assertEqual(f.read(), b"456def") + def test_rwpair_cleared_before_textio(self): + # Issue 13070: TextIOWrapper's finalization would crash when called + # after the reference to the underlying BufferedRWPair's writer got + # cleared by the GC. + for i in range(1000): + b1 = self.BufferedRWPair(self.MockRawIO(), self.MockRawIO()) + t1 = self.TextIOWrapper(b1, encoding="ascii") + b2 = self.BufferedRWPair(self.MockRawIO(), self.MockRawIO()) + t2 = self.TextIOWrapper(b2, encoding="ascii") + # circular references + t1.buddy = t2 + t2.buddy = t1 + support.gc_collect() + + class PyTextIOWrapperTest(TextIOWrapperTest): pass diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -217,6 +217,10 @@ Extension Modules ----------------- +- Issue #13070: Fix a crash when a TextIOWrapper caught in a reference cycle + would be finalized after the reference to its underlying BufferedRWPair's + writer got cleared by the GC. + - Issue #12881: ctypes: Fix segfault with large structure field names. - Issue #13013: ctypes: Fix a reference leak in PyCArrayType_from_ctype. 
diff --git a/Modules/_io/bufferedio.c b/Modules/_io/bufferedio.c --- a/Modules/_io/bufferedio.c +++ b/Modules/_io/bufferedio.c @@ -2179,6 +2179,11 @@ static PyObject * bufferedrwpair_closed_get(rwpair *self, void *context) { + if (self->writer == NULL) { + PyErr_SetString(PyExc_RuntimeError, + "the BufferedRWPair object is being garbage-collected"); + return NULL; + } return PyObject_GetAttr((PyObject *) self->writer, _PyIO_str_closed); } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 19:13:45 2011 From: python-checkins at python.org (antoine.pitrou) Date: Thu, 06 Oct 2011 19:13:45 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogSXNzdWUgIzEyOTEx?= =?utf8?q?=3A_Fix_memory_consumption_when_calculating_the_repr=28=29_of_hu?= =?utf8?q?ge_tuples?= Message-ID: http://hg.python.org/cpython/rev/f9f782f2369e changeset: 72755:f9f782f2369e branch: 3.2 parent: 72744:7b277d5530bc user: Antoine Pitrou date: Thu Oct 06 18:57:27 2011 +0200 summary: Issue #12911: Fix memory consumption when calculating the repr() of huge tuples or lists. This introduces a small private API for this common pattern. The issue has been discovered thanks to Martin's huge-mem buildbot. 
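The behaviour the new tests below pin down — `repr()` output is unchanged, only its memory profile improves — can be checked directly:

```python
# The accumulator changes how repr() builds its result, not the result
# itself: large containers must render exactly as before.
n = 100000
assert repr([0] * n) == '[' + ', '.join(['0'] * n) + ']'
assert repr((0,) * n) == '(' + ', '.join(['0'] * n) + ')'

# Edge cases the rewritten C code must still get right:
assert repr([]) == '[]'
assert repr(()) == '()'
assert repr((0,)) == '(0,)'      # one-element tuple keeps its trailing comma
assert repr([1, 2]) == '[1, 2]'
```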
files: Include/Python.h | 4 +- Include/accu.h | 35 +++++++ Lib/test/test_list.py | 11 ++ Lib/test/test_tuple.py | 10 ++ Makefile.pre.in | 2 + Misc/NEWS | 3 + Objects/accu.c | 114 +++++++++++++++++++++++++ Objects/listobject.c | 81 +++++++---------- Objects/tupleobject.c | 73 +++++++-------- PC/VC6/pythoncore.dsp | 4 + PC/VS7.1/pythoncore.vcproj | 3 + PC/VS8.0/pythoncore.vcproj | 8 + PCbuild/pythoncore.vcproj | 8 + 13 files changed, 270 insertions(+), 86 deletions(-) diff --git a/Include/Python.h b/Include/Python.h --- a/Include/Python.h +++ b/Include/Python.h @@ -100,7 +100,7 @@ #include "warnings.h" #include "weakrefobject.h" #include "structseq.h" - +#include "accu.h" #include "codecs.h" #include "pyerrors.h" @@ -141,7 +141,7 @@ #endif /* Argument must be a char or an int in [-128, 127] or [0, 255]. */ -#define Py_CHARMASK(c) ((unsigned char)((c) & 0xff)) +#define Py_CHARMASK(c) ((unsigned char)((c) & 0xff)) #include "pyfpe.h" diff --git a/Include/accu.h b/Include/accu.h new file mode 100644 --- /dev/null +++ b/Include/accu.h @@ -0,0 +1,35 @@ +#ifndef Py_LIMITED_API +#ifndef Py_ACCU_H +#define Py_ACCU_H + +/*** This is a private API for use by the interpreter and the stdlib. + *** Its definition may be changed or removed at any moment. + ***/ + +/* + * A two-level accumulator of unicode objects that avoids both the overhead + * of keeping a huge number of small separate objects, and the quadratic + * behaviour of using a naive repeated concatenation scheme. 
+ */ + +#ifdef __cplusplus +extern "C" { +#endif + +typedef struct { + PyObject *large; /* A list of previously accumulated large strings */ + PyObject *small; /* Pending small strings */ +} _PyAccu; + +PyAPI_FUNC(int) _PyAccu_Init(_PyAccu *acc); +PyAPI_FUNC(int) _PyAccu_Accumulate(_PyAccu *acc, PyObject *unicode); +PyAPI_FUNC(PyObject *) _PyAccu_FinishAsList(_PyAccu *acc); +PyAPI_FUNC(PyObject *) _PyAccu_Finish(_PyAccu *acc); +PyAPI_FUNC(void) _PyAccu_Destroy(_PyAccu *acc); + +#ifdef __cplusplus +} +#endif + +#endif /* Py_ACCU_H */ +#endif /* Py_LIMITED_API */ diff --git a/Lib/test/test_list.py b/Lib/test/test_list.py --- a/Lib/test/test_list.py +++ b/Lib/test/test_list.py @@ -59,6 +59,17 @@ self.assertRaises((MemoryError, OverflowError), mul, lst, n) self.assertRaises((MemoryError, OverflowError), imul, lst, n) + def test_repr_large(self): + # Check the repr of large list objects + def check(n): + l = [0] * n + s = repr(l) + self.assertEqual(s, + '[' + ', '.join(['0'] * n) + ']') + check(10) # check our checking code + check(1000000) + + def test_main(verbose=None): support.run_unittest(ListTest) diff --git a/Lib/test/test_tuple.py b/Lib/test/test_tuple.py --- a/Lib/test/test_tuple.py +++ b/Lib/test/test_tuple.py @@ -154,6 +154,16 @@ # Trying to untrack an unfinished tuple could crash Python self._not_tracked(tuple(gc.collect() for i in range(101))) + def test_repr_large(self): + # Check the repr of large list objects + def check(n): + l = (0,) * n + s = repr(l) + self.assertEqual(s, + '(' + ', '.join(['0'] * n) + ')') + check(10) # check our checking code + check(1000000) + def test_main(): support.run_unittest(TupleTest) diff --git a/Makefile.pre.in b/Makefile.pre.in --- a/Makefile.pre.in +++ b/Makefile.pre.in @@ -342,6 +342,7 @@ # Objects OBJECT_OBJS= \ Objects/abstract.o \ + Objects/accu.o \ Objects/boolobject.o \ Objects/bytes_methods.o \ Objects/bytearrayobject.o \ @@ -664,6 +665,7 @@ Include/Python-ast.h \ Include/Python.h \ Include/abstract.h \ + 
Include/accu.h \ Include/asdl.h \ Include/ast.h \ Include/bltinmodule.h \ diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,9 @@ Core and Builtins ----------------- +- Issue #12911: Fix memory consumption when calculating the repr() of huge + tuples or lists. + - Issue #7732: Don't open a directory as a file anymore while importing a module. Ignore the direcotry if its name matchs the module name (e.g. "__init__.py") and raise a ImportError instead. diff --git a/Objects/accu.c b/Objects/accu.c new file mode 100644 --- /dev/null +++ b/Objects/accu.c @@ -0,0 +1,114 @@ +/* Accumulator struct implementation */ + +#include "Python.h" + +static PyObject * +join_list_unicode(PyObject *lst) +{ + /* return ''.join(lst) */ + PyObject *sep, *ret; + sep = PyUnicode_FromStringAndSize("", 0); + ret = PyUnicode_Join(sep, lst); + Py_DECREF(sep); + return ret; +} + +int +_PyAccu_Init(_PyAccu *acc) +{ + /* Lazily allocated */ + acc->large = NULL; + acc->small = PyList_New(0); + if (acc->small == NULL) + return -1; + return 0; +} + +static int +flush_accumulator(_PyAccu *acc) +{ + Py_ssize_t nsmall = PyList_GET_SIZE(acc->small); + if (nsmall) { + int ret; + PyObject *joined; + if (acc->large == NULL) { + acc->large = PyList_New(0); + if (acc->large == NULL) + return -1; + } + joined = join_list_unicode(acc->small); + if (joined == NULL) + return -1; + if (PyList_SetSlice(acc->small, 0, nsmall, NULL)) { + Py_DECREF(joined); + return -1; + } + ret = PyList_Append(acc->large, joined); + Py_DECREF(joined); + return ret; + } + return 0; +} + +int +_PyAccu_Accumulate(_PyAccu *acc, PyObject *unicode) +{ + Py_ssize_t nsmall; + assert(PyUnicode_Check(unicode)); + + if (PyList_Append(acc->small, unicode)) + return -1; + nsmall = PyList_GET_SIZE(acc->small); + /* Each item in a list of unicode objects has an overhead (in 64-bit + * builds) of: + * - 8 bytes for the list slot + * - 56 bytes for the header of the unicode object + * that is, 64 bytes. 
100000 such objects waste more than 6MB + * compared to a single concatenated string. + */ + if (nsmall < 100000) + return 0; + return flush_accumulator(acc); +} + +PyObject * +_PyAccu_FinishAsList(_PyAccu *acc) +{ + int ret; + PyObject *res; + + ret = flush_accumulator(acc); + Py_CLEAR(acc->small); + if (ret) { + Py_CLEAR(acc->large); + return NULL; + } + res = acc->large; + acc->large = NULL; + return res; +} + +PyObject * +_PyAccu_Finish(_PyAccu *acc) +{ + PyObject *list, *res; + if (acc->large == NULL) { + list = acc->small; + acc->small = NULL; + } + else { + list = _PyAccu_FinishAsList(acc); + if (!list) + return NULL; + } + res = join_list_unicode(list); + Py_DECREF(list); + return res; +} + +void +_PyAccu_Destroy(_PyAccu *acc) +{ + Py_CLEAR(acc->small); + Py_CLEAR(acc->large); +} diff --git a/Objects/listobject.c b/Objects/listobject.c --- a/Objects/listobject.c +++ b/Objects/listobject.c @@ -321,70 +321,59 @@ list_repr(PyListObject *v) { Py_ssize_t i; - PyObject *s, *temp; - PyObject *pieces = NULL, *result = NULL; + PyObject *s = NULL; + _PyAccu acc; + static PyObject *sep = NULL; + + if (Py_SIZE(v) == 0) { + return PyUnicode_FromString("[]"); + } + + if (sep == NULL) { + sep = PyUnicode_FromString(", "); + if (sep == NULL) + return NULL; + } i = Py_ReprEnter((PyObject*)v); if (i != 0) { return i > 0 ? PyUnicode_FromString("[...]") : NULL; } - if (Py_SIZE(v) == 0) { - result = PyUnicode_FromString("[]"); - goto Done; - } + if (_PyAccu_Init(&acc)) + goto error; - pieces = PyList_New(0); - if (pieces == NULL) - goto Done; + s = PyUnicode_FromString("["); + if (s == NULL || _PyAccu_Accumulate(&acc, s)) + goto error; + Py_CLEAR(s); /* Do repr() on each element. Note that this may mutate the list, so must refetch the list size on each iteration. 
*/ for (i = 0; i < Py_SIZE(v); ++i) { - int status; if (Py_EnterRecursiveCall(" while getting the repr of a list")) - goto Done; + goto error; s = PyObject_Repr(v->ob_item[i]); Py_LeaveRecursiveCall(); - if (s == NULL) - goto Done; - status = PyList_Append(pieces, s); - Py_DECREF(s); /* append created a new ref */ - if (status < 0) - goto Done; + if (i > 0 && _PyAccu_Accumulate(&acc, sep)) + goto error; + if (s == NULL || _PyAccu_Accumulate(&acc, s)) + goto error; + Py_CLEAR(s); } + s = PyUnicode_FromString("]"); + if (s == NULL || _PyAccu_Accumulate(&acc, s)) + goto error; + Py_CLEAR(s); - /* Add "[]" decorations to the first and last items. */ - assert(PyList_GET_SIZE(pieces) > 0); - s = PyUnicode_FromString("["); - if (s == NULL) - goto Done; - temp = PyList_GET_ITEM(pieces, 0); - PyUnicode_AppendAndDel(&s, temp); - PyList_SET_ITEM(pieces, 0, s); - if (s == NULL) - goto Done; + Py_ReprLeave((PyObject *)v); + return _PyAccu_Finish(&acc); - s = PyUnicode_FromString("]"); - if (s == NULL) - goto Done; - temp = PyList_GET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1); - PyUnicode_AppendAndDel(&temp, s); - PyList_SET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1, temp); - if (temp == NULL) - goto Done; - - /* Paste them all together with ", " between. 
*/ - s = PyUnicode_FromString(", "); - if (s == NULL) - goto Done; - result = PyUnicode_Join(s, pieces); - Py_DECREF(s); - -Done: - Py_XDECREF(pieces); +error: + _PyAccu_Destroy(&acc); + Py_XDECREF(s); Py_ReprLeave((PyObject *)v); - return result; + return NULL; } static Py_ssize_t diff --git a/Objects/tupleobject.c b/Objects/tupleobject.c --- a/Objects/tupleobject.c +++ b/Objects/tupleobject.c @@ -240,13 +240,20 @@ tuplerepr(PyTupleObject *v) { Py_ssize_t i, n; - PyObject *s, *temp; - PyObject *pieces, *result = NULL; + PyObject *s = NULL; + _PyAccu acc; + static PyObject *sep = NULL; n = Py_SIZE(v); if (n == 0) return PyUnicode_FromString("()"); + if (sep == NULL) { + sep = PyUnicode_FromString(", "); + if (sep == NULL) + return NULL; + } + /* While not mutable, it is still possible to end up with a cycle in a tuple through an object that stores itself within a tuple (and thus infinitely asks for the repr of itself). This should only be @@ -256,52 +263,42 @@ return i > 0 ? PyUnicode_FromString("(...)") : NULL; } - pieces = PyTuple_New(n); - if (pieces == NULL) - return NULL; + if (_PyAccu_Init(&acc)) + goto error; + + s = PyUnicode_FromString("("); + if (s == NULL || _PyAccu_Accumulate(&acc, s)) + goto error; + Py_CLEAR(s); /* Do repr() on each element. */ for (i = 0; i < n; ++i) { if (Py_EnterRecursiveCall(" while getting the repr of a tuple")) - goto Done; + goto error; s = PyObject_Repr(v->ob_item[i]); Py_LeaveRecursiveCall(); - if (s == NULL) - goto Done; - PyTuple_SET_ITEM(pieces, i, s); + if (i > 0 && _PyAccu_Accumulate(&acc, sep)) + goto error; + if (s == NULL || _PyAccu_Accumulate(&acc, s)) + goto error; + Py_CLEAR(s); } + if (n > 1) + s = PyUnicode_FromString(")"); + else + s = PyUnicode_FromString(",)"); + if (s == NULL || _PyAccu_Accumulate(&acc, s)) + goto error; + Py_CLEAR(s); - /* Add "()" decorations to the first and last items. 
*/ - assert(n > 0); - s = PyUnicode_FromString("("); - if (s == NULL) - goto Done; - temp = PyTuple_GET_ITEM(pieces, 0); - PyUnicode_AppendAndDel(&s, temp); - PyTuple_SET_ITEM(pieces, 0, s); - if (s == NULL) - goto Done; + Py_ReprLeave((PyObject *)v); + return _PyAccu_Finish(&acc); - s = PyUnicode_FromString(n == 1 ? ",)" : ")"); - if (s == NULL) - goto Done; - temp = PyTuple_GET_ITEM(pieces, n-1); - PyUnicode_AppendAndDel(&temp, s); - PyTuple_SET_ITEM(pieces, n-1, temp); - if (temp == NULL) - goto Done; - - /* Paste them all together with ", " between. */ - s = PyUnicode_FromString(", "); - if (s == NULL) - goto Done; - result = PyUnicode_Join(s, pieces); - Py_DECREF(s); - -Done: - Py_DECREF(pieces); +error: + _PyAccu_Destroy(&acc); + Py_XDECREF(s); Py_ReprLeave((PyObject *)v); - return result; + return NULL; } /* The addend 82520, was selected from the range(0, 1000000) for diff --git a/PC/VC6/pythoncore.dsp b/PC/VC6/pythoncore.dsp --- a/PC/VC6/pythoncore.dsp +++ b/PC/VC6/pythoncore.dsp @@ -205,6 +205,10 @@ # End Source File # Begin Source File +SOURCE=..\..\Objects\accu.c +# End Source File +# Begin Source File + SOURCE=..\..\Parser\acceler.c # End Source File # Begin Source File diff --git a/PC/VS7.1/pythoncore.vcproj b/PC/VS7.1/pythoncore.vcproj --- a/PC/VS7.1/pythoncore.vcproj +++ b/PC/VS7.1/pythoncore.vcproj @@ -445,6 +445,9 @@ RelativePath="..\..\Objects\abstract.c"> + + + + @@ -1447,6 +1451,10 @@ > + + diff --git a/PCbuild/pythoncore.vcproj b/PCbuild/pythoncore.vcproj --- a/PCbuild/pythoncore.vcproj +++ b/PCbuild/pythoncore.vcproj @@ -635,6 +635,10 @@ > + + @@ -1447,6 +1451,10 @@ > + + -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 19:13:46 2011 From: python-checkins at python.org (antoine.pitrou) Date: Thu, 06 Oct 2011 19:13:46 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Issue_=2312911=3A_Fix_memory_consumption_when_calculating_th?= 
=?utf8?q?e_repr=28=29_of_huge_tuples?= Message-ID: http://hg.python.org/cpython/rev/656c13024ede changeset: 72756:656c13024ede parent: 72753:9a91ab415109 parent: 72755:f9f782f2369e user: Antoine Pitrou date: Thu Oct 06 19:04:12 2011 +0200 summary: Issue #12911: Fix memory consumption when calculating the repr() of huge tuples or lists. This introduces a small private API for this common pattern. The issue has been discovered thanks to Martin's huge-mem buildbot. files: Include/Python.h | 2 +- Include/accu.h | 35 +++++++ Lib/test/test_list.py | 11 ++ Lib/test/test_tuple.py | 10 ++ Makefile.pre.in | 2 + Misc/NEWS | 3 + Objects/accu.c | 114 +++++++++++++++++++++++++ Objects/listobject.c | 81 +++++++---------- Objects/tupleobject.c | 73 +++++++-------- PC/VC6/pythoncore.dsp | 4 + PC/VS7.1/pythoncore.vcproj | 3 + PC/VS8.0/pythoncore.vcproj | 8 + PCbuild/pythoncore.vcproj | 8 + 13 files changed, 269 insertions(+), 85 deletions(-) diff --git a/Include/Python.h b/Include/Python.h --- a/Include/Python.h +++ b/Include/Python.h @@ -101,7 +101,7 @@ #include "warnings.h" #include "weakrefobject.h" #include "structseq.h" - +#include "accu.h" #include "codecs.h" #include "pyerrors.h" diff --git a/Include/accu.h b/Include/accu.h new file mode 100644 --- /dev/null +++ b/Include/accu.h @@ -0,0 +1,35 @@ +#ifndef Py_LIMITED_API +#ifndef Py_ACCU_H +#define Py_ACCU_H + +/*** This is a private API for use by the interpreter and the stdlib. + *** Its definition may be changed or removed at any moment. + ***/ + +/* + * A two-level accumulator of unicode objects that avoids both the overhead + * of keeping a huge number of small separate objects, and the quadratic + * behaviour of using a naive repeated concatenation scheme. 
+ */ + +#ifdef __cplusplus +extern "C" { +#endif + +typedef struct { + PyObject *large; /* A list of previously accumulated large strings */ + PyObject *small; /* Pending small strings */ +} _PyAccu; + +PyAPI_FUNC(int) _PyAccu_Init(_PyAccu *acc); +PyAPI_FUNC(int) _PyAccu_Accumulate(_PyAccu *acc, PyObject *unicode); +PyAPI_FUNC(PyObject *) _PyAccu_FinishAsList(_PyAccu *acc); +PyAPI_FUNC(PyObject *) _PyAccu_Finish(_PyAccu *acc); +PyAPI_FUNC(void) _PyAccu_Destroy(_PyAccu *acc); + +#ifdef __cplusplus +} +#endif + +#endif /* Py_ACCU_H */ +#endif /* Py_LIMITED_API */ diff --git a/Lib/test/test_list.py b/Lib/test/test_list.py --- a/Lib/test/test_list.py +++ b/Lib/test/test_list.py @@ -59,6 +59,17 @@ self.assertRaises((MemoryError, OverflowError), mul, lst, n) self.assertRaises((MemoryError, OverflowError), imul, lst, n) + def test_repr_large(self): + # Check the repr of large list objects + def check(n): + l = [0] * n + s = repr(l) + self.assertEqual(s, + '[' + ', '.join(['0'] * n) + ']') + check(10) # check our checking code + check(1000000) + + def test_main(verbose=None): support.run_unittest(ListTest) diff --git a/Lib/test/test_tuple.py b/Lib/test/test_tuple.py --- a/Lib/test/test_tuple.py +++ b/Lib/test/test_tuple.py @@ -154,6 +154,16 @@ # Trying to untrack an unfinished tuple could crash Python self._not_tracked(tuple(gc.collect() for i in range(101))) + def test_repr_large(self): + # Check the repr of large list objects + def check(n): + l = (0,) * n + s = repr(l) + self.assertEqual(s, + '(' + ', '.join(['0'] * n) + ')') + check(10) # check our checking code + check(1000000) + def test_main(): support.run_unittest(TupleTest) diff --git a/Makefile.pre.in b/Makefile.pre.in --- a/Makefile.pre.in +++ b/Makefile.pre.in @@ -342,6 +342,7 @@ # Objects OBJECT_OBJS= \ Objects/abstract.o \ + Objects/accu.o \ Objects/boolobject.o \ Objects/bytes_methods.o \ Objects/bytearrayobject.o \ @@ -661,6 +662,7 @@ PYTHON_HEADERS= \ Include/Python.h \ Include/abstract.h \ + 
Include/accu.h \ Include/asdl.h \ Include/ast.h \ Include/bltinmodule.h \ diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -10,6 +10,9 @@ Core and Builtins ----------------- +- Issue #12911: Fix memory consumption when calculating the repr() of huge + tuples or lists. + - PEP 393: flexible string representation. Thanks to Torsten Becker for the initial implementation, and Victor Stinner for various bug fixes. diff --git a/Objects/accu.c b/Objects/accu.c new file mode 100644 --- /dev/null +++ b/Objects/accu.c @@ -0,0 +1,114 @@ +/* Accumulator struct implementation */ + +#include "Python.h" + +static PyObject * +join_list_unicode(PyObject *lst) +{ + /* return ''.join(lst) */ + PyObject *sep, *ret; + sep = PyUnicode_FromStringAndSize("", 0); + ret = PyUnicode_Join(sep, lst); + Py_DECREF(sep); + return ret; +} + +int +_PyAccu_Init(_PyAccu *acc) +{ + /* Lazily allocated */ + acc->large = NULL; + acc->small = PyList_New(0); + if (acc->small == NULL) + return -1; + return 0; +} + +static int +flush_accumulator(_PyAccu *acc) +{ + Py_ssize_t nsmall = PyList_GET_SIZE(acc->small); + if (nsmall) { + int ret; + PyObject *joined; + if (acc->large == NULL) { + acc->large = PyList_New(0); + if (acc->large == NULL) + return -1; + } + joined = join_list_unicode(acc->small); + if (joined == NULL) + return -1; + if (PyList_SetSlice(acc->small, 0, nsmall, NULL)) { + Py_DECREF(joined); + return -1; + } + ret = PyList_Append(acc->large, joined); + Py_DECREF(joined); + return ret; + } + return 0; +} + +int +_PyAccu_Accumulate(_PyAccu *acc, PyObject *unicode) +{ + Py_ssize_t nsmall; + assert(PyUnicode_Check(unicode)); + + if (PyList_Append(acc->small, unicode)) + return -1; + nsmall = PyList_GET_SIZE(acc->small); + /* Each item in a list of unicode objects has an overhead (in 64-bit + * builds) of: + * - 8 bytes for the list slot + * - 56 bytes for the header of the unicode object + * that is, 64 bytes. 
100000 such objects waste more than 6MB + * compared to a single concatenated string. + */ + if (nsmall < 100000) + return 0; + return flush_accumulator(acc); +} + +PyObject * +_PyAccu_FinishAsList(_PyAccu *acc) +{ + int ret; + PyObject *res; + + ret = flush_accumulator(acc); + Py_CLEAR(acc->small); + if (ret) { + Py_CLEAR(acc->large); + return NULL; + } + res = acc->large; + acc->large = NULL; + return res; +} + +PyObject * +_PyAccu_Finish(_PyAccu *acc) +{ + PyObject *list, *res; + if (acc->large == NULL) { + list = acc->small; + acc->small = NULL; + } + else { + list = _PyAccu_FinishAsList(acc); + if (!list) + return NULL; + } + res = join_list_unicode(list); + Py_DECREF(list); + return res; +} + +void +_PyAccu_Destroy(_PyAccu *acc) +{ + Py_CLEAR(acc->small); + Py_CLEAR(acc->large); +} diff --git a/Objects/listobject.c b/Objects/listobject.c --- a/Objects/listobject.c +++ b/Objects/listobject.c @@ -321,70 +321,59 @@ list_repr(PyListObject *v) { Py_ssize_t i; - PyObject *s, *temp; - PyObject *pieces = NULL, *result = NULL; + PyObject *s = NULL; + _PyAccu acc; + static PyObject *sep = NULL; + + if (Py_SIZE(v) == 0) { + return PyUnicode_FromString("[]"); + } + + if (sep == NULL) { + sep = PyUnicode_FromString(", "); + if (sep == NULL) + return NULL; + } i = Py_ReprEnter((PyObject*)v); if (i != 0) { return i > 0 ? PyUnicode_FromString("[...]") : NULL; } - if (Py_SIZE(v) == 0) { - result = PyUnicode_FromString("[]"); - goto Done; - } + if (_PyAccu_Init(&acc)) + goto error; - pieces = PyList_New(0); - if (pieces == NULL) - goto Done; + s = PyUnicode_FromString("["); + if (s == NULL || _PyAccu_Accumulate(&acc, s)) + goto error; + Py_CLEAR(s); /* Do repr() on each element. Note that this may mutate the list, so must refetch the list size on each iteration. 
*/ for (i = 0; i < Py_SIZE(v); ++i) { - int status; if (Py_EnterRecursiveCall(" while getting the repr of a list")) - goto Done; + goto error; s = PyObject_Repr(v->ob_item[i]); Py_LeaveRecursiveCall(); - if (s == NULL) - goto Done; - status = PyList_Append(pieces, s); - Py_DECREF(s); /* append created a new ref */ - if (status < 0) - goto Done; + if (i > 0 && _PyAccu_Accumulate(&acc, sep)) + goto error; + if (s == NULL || _PyAccu_Accumulate(&acc, s)) + goto error; + Py_CLEAR(s); } + s = PyUnicode_FromString("]"); + if (s == NULL || _PyAccu_Accumulate(&acc, s)) + goto error; + Py_CLEAR(s); - /* Add "[]" decorations to the first and last items. */ - assert(PyList_GET_SIZE(pieces) > 0); - s = PyUnicode_FromString("["); - if (s == NULL) - goto Done; - temp = PyList_GET_ITEM(pieces, 0); - PyUnicode_AppendAndDel(&s, temp); - PyList_SET_ITEM(pieces, 0, s); - if (s == NULL) - goto Done; + Py_ReprLeave((PyObject *)v); + return _PyAccu_Finish(&acc); - s = PyUnicode_FromString("]"); - if (s == NULL) - goto Done; - temp = PyList_GET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1); - PyUnicode_AppendAndDel(&temp, s); - PyList_SET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1, temp); - if (temp == NULL) - goto Done; - - /* Paste them all together with ", " between. 
*/ - s = PyUnicode_FromString(", "); - if (s == NULL) - goto Done; - result = PyUnicode_Join(s, pieces); - Py_DECREF(s); - -Done: - Py_XDECREF(pieces); +error: + _PyAccu_Destroy(&acc); + Py_XDECREF(s); Py_ReprLeave((PyObject *)v); - return result; + return NULL; } static Py_ssize_t diff --git a/Objects/tupleobject.c b/Objects/tupleobject.c --- a/Objects/tupleobject.c +++ b/Objects/tupleobject.c @@ -240,13 +240,20 @@ tuplerepr(PyTupleObject *v) { Py_ssize_t i, n; - PyObject *s, *temp; - PyObject *pieces, *result = NULL; + PyObject *s = NULL; + _PyAccu acc; + static PyObject *sep = NULL; n = Py_SIZE(v); if (n == 0) return PyUnicode_FromString("()"); + if (sep == NULL) { + sep = PyUnicode_FromString(", "); + if (sep == NULL) + return NULL; + } + /* While not mutable, it is still possible to end up with a cycle in a tuple through an object that stores itself within a tuple (and thus infinitely asks for the repr of itself). This should only be @@ -256,52 +263,42 @@ return i > 0 ? PyUnicode_FromString("(...)") : NULL; } - pieces = PyTuple_New(n); - if (pieces == NULL) - return NULL; + if (_PyAccu_Init(&acc)) + goto error; + + s = PyUnicode_FromString("("); + if (s == NULL || _PyAccu_Accumulate(&acc, s)) + goto error; + Py_CLEAR(s); /* Do repr() on each element. */ for (i = 0; i < n; ++i) { if (Py_EnterRecursiveCall(" while getting the repr of a tuple")) - goto Done; + goto error; s = PyObject_Repr(v->ob_item[i]); Py_LeaveRecursiveCall(); - if (s == NULL) - goto Done; - PyTuple_SET_ITEM(pieces, i, s); + if (i > 0 && _PyAccu_Accumulate(&acc, sep)) + goto error; + if (s == NULL || _PyAccu_Accumulate(&acc, s)) + goto error; + Py_CLEAR(s); } + if (n > 1) + s = PyUnicode_FromString(")"); + else + s = PyUnicode_FromString(",)"); + if (s == NULL || _PyAccu_Accumulate(&acc, s)) + goto error; + Py_CLEAR(s); - /* Add "()" decorations to the first and last items. 
*/ - assert(n > 0); - s = PyUnicode_FromString("("); - if (s == NULL) - goto Done; - temp = PyTuple_GET_ITEM(pieces, 0); - PyUnicode_AppendAndDel(&s, temp); - PyTuple_SET_ITEM(pieces, 0, s); - if (s == NULL) - goto Done; + Py_ReprLeave((PyObject *)v); + return _PyAccu_Finish(&acc); - s = PyUnicode_FromString(n == 1 ? ",)" : ")"); - if (s == NULL) - goto Done; - temp = PyTuple_GET_ITEM(pieces, n-1); - PyUnicode_AppendAndDel(&temp, s); - PyTuple_SET_ITEM(pieces, n-1, temp); - if (temp == NULL) - goto Done; - - /* Paste them all together with ", " between. */ - s = PyUnicode_FromString(", "); - if (s == NULL) - goto Done; - result = PyUnicode_Join(s, pieces); - Py_DECREF(s); - -Done: - Py_DECREF(pieces); +error: + _PyAccu_Destroy(&acc); + Py_XDECREF(s); Py_ReprLeave((PyObject *)v); - return result; + return NULL; } /* The addend 82520, was selected from the range(0, 1000000) for diff --git a/PC/VC6/pythoncore.dsp b/PC/VC6/pythoncore.dsp --- a/PC/VC6/pythoncore.dsp +++ b/PC/VC6/pythoncore.dsp @@ -205,6 +205,10 @@ # End Source File # Begin Source File +SOURCE=..\..\Objects\accu.c +# End Source File +# Begin Source File + SOURCE=..\..\Parser\acceler.c # End Source File # Begin Source File diff --git a/PC/VS7.1/pythoncore.vcproj b/PC/VS7.1/pythoncore.vcproj --- a/PC/VS7.1/pythoncore.vcproj +++ b/PC/VS7.1/pythoncore.vcproj @@ -445,6 +445,9 @@ RelativePath="..\..\Objects\abstract.c"> + + + + @@ -1447,6 +1451,10 @@ > + + diff --git a/PCbuild/pythoncore.vcproj b/PCbuild/pythoncore.vcproj --- a/PCbuild/pythoncore.vcproj +++ b/PCbuild/pythoncore.vcproj @@ -635,6 +635,10 @@ > + + @@ -1455,6 +1459,10 @@ > + + -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 19:13:47 2011 From: python-checkins at python.org (antoine.pitrou) Date: Thu, 06 Oct 2011 19:13:47 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Remove_now_duplicate_code_i?= =?utf8?q?n_=5Fjson=2Ec=3B_instead=2C_reuse_the_new_private_lib?= Message-ID: 
http://hg.python.org/cpython/rev/e685b02ddcac changeset: 72757:e685b02ddcac user: Antoine Pitrou date: Thu Oct 06 19:09:51 2011 +0200 summary: Remove now duplicate code in _json.c; instead, reuse the new private lib files: Modules/_json.c | 145 +++++------------------------------ 1 files changed, 22 insertions(+), 123 deletions(-) diff --git a/Modules/_json.c b/Modules/_json.c --- a/Modules/_json.c +++ b/Modules/_json.c @@ -75,17 +75,6 @@ {NULL} }; -/* - * A two-level accumulator of unicode objects that avoids both the overhead - * of keeping a huge number of small separate objects, and the quadratic - * behaviour of using a naive repeated concatenation scheme. - */ - -typedef struct { - PyObject *large; /* A list of previously accumulated large strings */ - PyObject *small; /* Pending small strings */ -} accumulator; - static PyObject * join_list_unicode(PyObject *lst) { @@ -99,96 +88,6 @@ return PyUnicode_Join(sep, lst); } -static int -init_accumulator(accumulator *acc) -{ - acc->large = PyList_New(0); - if (acc->large == NULL) - return -1; - acc->small = PyList_New(0); - if (acc->small == NULL) { - Py_CLEAR(acc->large); - return -1; - } - return 0; -} - -static int -flush_accumulator(accumulator *acc) -{ - Py_ssize_t nsmall = PyList_GET_SIZE(acc->small); - if (nsmall) { - int ret; - PyObject *joined = join_list_unicode(acc->small); - if (joined == NULL) - return -1; - if (PyList_SetSlice(acc->small, 0, nsmall, NULL)) { - Py_DECREF(joined); - return -1; - } - ret = PyList_Append(acc->large, joined); - Py_DECREF(joined); - return ret; - } - return 0; -} - -static int -accumulate_unicode(accumulator *acc, PyObject *obj) -{ - int ret; - Py_ssize_t nsmall; - PyObject *joined; - assert(PyUnicode_Check(obj)); - - if (PyList_Append(acc->small, obj)) - return -1; - nsmall = PyList_GET_SIZE(acc->small); - /* Each item in a list of unicode objects has an overhead (in 64-bit - * builds) of: - * - 8 bytes for the list slot - * - 56 bytes for the header of the unicode object 
- * that is, 64 bytes. 100000 such objects waste more than 6MB - * compared to a single concatenated string. - */ - if (nsmall < 100000) - return 0; - joined = join_list_unicode(acc->small); - if (joined == NULL) - return -1; - if (PyList_SetSlice(acc->small, 0, nsmall, NULL)) { - Py_DECREF(joined); - return -1; - } - ret = PyList_Append(acc->large, joined); - Py_DECREF(joined); - return ret; -} - -static PyObject * -finish_accumulator(accumulator *acc) -{ - int ret; - PyObject *res; - - ret = flush_accumulator(acc); - Py_CLEAR(acc->small); - if (ret) { - Py_CLEAR(acc->large); - return NULL; - } - res = acc->large; - acc->large = NULL; - return res; -} - -static void -destroy_accumulator(accumulator *acc) -{ - Py_CLEAR(acc->small); - Py_CLEAR(acc->large); -} - /* Forward decls */ static PyObject * @@ -217,11 +116,11 @@ static int encoder_clear(PyObject *self); static int -encoder_listencode_list(PyEncoderObject *s, accumulator *acc, PyObject *seq, Py_ssize_t indent_level); +encoder_listencode_list(PyEncoderObject *s, _PyAccu *acc, PyObject *seq, Py_ssize_t indent_level); static int -encoder_listencode_obj(PyEncoderObject *s, accumulator *acc, PyObject *obj, Py_ssize_t indent_level); +encoder_listencode_obj(PyEncoderObject *s, _PyAccu *acc, PyObject *obj, Py_ssize_t indent_level); static int -encoder_listencode_dict(PyEncoderObject *s, accumulator *acc, PyObject *dct, Py_ssize_t indent_level); +encoder_listencode_dict(PyEncoderObject *s, _PyAccu *acc, PyObject *dct, Py_ssize_t indent_level); static PyObject * _encoded_const(PyObject *obj); static void @@ -1383,20 +1282,20 @@ PyObject *obj; Py_ssize_t indent_level; PyEncoderObject *s; - accumulator acc; + _PyAccu acc; assert(PyEncoder_Check(self)); s = (PyEncoderObject *)self; if (!PyArg_ParseTupleAndKeywords(args, kwds, "OO&:_iterencode", kwlist, &obj, _convertPyInt_AsSsize_t, &indent_level)) return NULL; - if (init_accumulator(&acc)) + if (_PyAccu_Init(&acc)) return NULL; if (encoder_listencode_obj(s, &acc, obj, 
indent_level)) { - destroy_accumulator(&acc); + _PyAccu_Destroy(&acc); return NULL; } - return finish_accumulator(&acc); + return _PyAccu_FinishAsList(&acc); } static PyObject * @@ -1468,16 +1367,16 @@ } static int -_steal_accumulate(accumulator *acc, PyObject *stolen) +_steal_accumulate(_PyAccu *acc, PyObject *stolen) { /* Append stolen and then decrement its reference count */ - int rval = accumulate_unicode(acc, stolen); + int rval = _PyAccu_Accumulate(acc, stolen); Py_DECREF(stolen); return rval; } static int -encoder_listencode_obj(PyEncoderObject *s, accumulator *acc, +encoder_listencode_obj(PyEncoderObject *s, _PyAccu *acc, PyObject *obj, Py_ssize_t indent_level) { /* Encode Python object obj to a JSON term */ @@ -1570,7 +1469,7 @@ } static int -encoder_listencode_dict(PyEncoderObject *s, accumulator *acc, +encoder_listencode_dict(PyEncoderObject *s, _PyAccu *acc, PyObject *dct, Py_ssize_t indent_level) { /* Encode Python dict dct a JSON term */ @@ -1593,7 +1492,7 @@ return -1; } if (Py_SIZE(dct) == 0) - return accumulate_unicode(acc, empty_dict); + return _PyAccu_Accumulate(acc, empty_dict); if (s->markers != Py_None) { int has_key; @@ -1611,7 +1510,7 @@ } } - if (accumulate_unicode(acc, open_dict)) + if (_PyAccu_Accumulate(acc, open_dict)) goto bail; if (s->indent != Py_None) { @@ -1698,7 +1597,7 @@ } if (idx) { - if (accumulate_unicode(acc, s->item_separator)) + if (_PyAccu_Accumulate(acc, s->item_separator)) goto bail; } @@ -1706,12 +1605,12 @@ Py_CLEAR(kstr); if (encoded == NULL) goto bail; - if (accumulate_unicode(acc, encoded)) { + if (_PyAccu_Accumulate(acc, encoded)) { Py_DECREF(encoded); goto bail; } Py_DECREF(encoded); - if (accumulate_unicode(acc, s->key_separator)) + if (_PyAccu_Accumulate(acc, s->key_separator)) goto bail; value = PyTuple_GET_ITEM(item, 1); @@ -1735,7 +1634,7 @@ yield '\n' + (' ' * (_indent * _current_indent_level)) }*/ - if (accumulate_unicode(acc, close_dict)) + if (_PyAccu_Accumulate(acc, close_dict)) goto bail; return 0; @@ 
-1749,7 +1648,7 @@ static int -encoder_listencode_list(PyEncoderObject *s, accumulator *acc, +encoder_listencode_list(PyEncoderObject *s, _PyAccu *acc, PyObject *seq, Py_ssize_t indent_level) { /* Encode Python list seq to a JSON term */ @@ -1776,7 +1675,7 @@ num_items = PySequence_Fast_GET_SIZE(s_fast); if (num_items == 0) { Py_DECREF(s_fast); - return accumulate_unicode(acc, empty_array); + return _PyAccu_Accumulate(acc, empty_array); } if (s->markers != Py_None) { @@ -1796,7 +1695,7 @@ } seq_items = PySequence_Fast_ITEMS(s_fast); - if (accumulate_unicode(acc, open_array)) + if (_PyAccu_Accumulate(acc, open_array)) goto bail; if (s->indent != Py_None) { /* TODO: DOES NOT RUN */ @@ -1810,7 +1709,7 @@ for (i = 0; i < num_items; i++) { PyObject *obj = seq_items[i]; if (i) { - if (accumulate_unicode(acc, s->item_separator)) + if (_PyAccu_Accumulate(acc, s->item_separator)) goto bail; } if (encoder_listencode_obj(s, acc, obj, indent_level)) @@ -1828,7 +1727,7 @@ yield '\n' + (' ' * (_indent * _current_indent_level)) }*/ - if (accumulate_unicode(acc, close_array)) + if (_PyAccu_Accumulate(acc, close_array)) goto bail; Py_DECREF(s_fast); return 0; -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 19:45:31 2011 From: python-checkins at python.org (charles-francois.natali) Date: Thu, 06 Oct 2011 19:45:31 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Issue_=2310141=3A_socket=3A?= =?utf8?q?_add_SocketCAN_=28PF=5FCAN=29_support=2E_Initial_patch_by_Matthi?= =?utf8?q?as?= Message-ID: http://hg.python.org/cpython/rev/e767318baccd changeset: 72758:e767318baccd user: Charles-François Natali date: Thu Oct 06 19:47:44 2011 +0200 summary: Issue #10141: socket: add SocketCAN (PF_CAN) support. Initial patch by Matthias Fuchs, updated by Tiago Gonçalves.
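[Editor's aside: the wire format this changeset marshals is the fixed-size classic CAN frame. The documentation added in the diff below encodes it with the `struct` format `=IB3x8s` (native byte order with standard sizes: 32-bit CAN id, 8-bit data length code, 3 pad bytes, 8 data bytes; 16 bytes in all). A minimal round-trip sketch of that packing, runnable without any CAN interface:]

```python
import struct

# Layout of `struct can_frame` from <linux/can.h>, as used in the
# documentation example added by this changeset:
#   =IB3x8s -> u32 can_id, u8 can_dlc, 3 pad bytes, 8 data bytes
can_frame_fmt = "=IB3x8s"

def build_can_frame(can_id, data):
    can_dlc = len(data)
    data = data.ljust(8, b'\x00')        # pad the payload to 8 bytes
    return struct.pack(can_frame_fmt, can_id, can_dlc, data)

def dissect_can_frame(frame):
    can_id, can_dlc, data = struct.unpack(can_frame_fmt, frame)
    return (can_id, can_dlc, data[:can_dlc])   # trim the padding back off

frame = build_can_frame(0x123, b'\x01\x02\x03')
assert len(frame) == 16                  # a classic CAN frame is 16 bytes
assert dissect_can_frame(frame) == (0x123, 3, b'\x01\x02\x03')
```

The data length code travels alongside the payload, so the 8-byte field can always be sent whole and trimmed on receipt.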
files: Doc/library/socket.rst | 71 +++- Doc/whatsnew/3.3.rst | 21 +- Lib/test/test_socket.py | 164 +++++++ Misc/ACKS | 2 + Misc/NEWS | 3 + Modules/socketmodule.c | 105 ++++ Modules/socketmodule.h | 11 + configure | 614 +++++++++++++-------------- configure.in | 7 + pyconfig.h.in | 6 + 10 files changed, 683 insertions(+), 321 deletions(-) diff --git a/Doc/library/socket.rst b/Doc/library/socket.rst --- a/Doc/library/socket.rst +++ b/Doc/library/socket.rst @@ -80,6 +80,11 @@ If *addr_type* is TIPC_ADDR_ID, then *v1* is the node, *v2* is the reference, and *v3* should be set to 0. +- A tuple ``(interface, )`` is used for the :const:`AF_CAN` address family, + where *interface* is a string representing a network interface name like + ``'can0'``. The network interface name ``''`` can be used to receive packets + from all network interfaces of this family. + - Certain other address families (:const:`AF_BLUETOOTH`, :const:`AF_PACKET`) support specific representations. @@ -216,6 +221,19 @@ in the Unix header files are defined; for a few symbols, default values are provided. +.. data:: AF_CAN + PF_CAN + SOL_CAN_* + CAN_* + + Many constants of these forms, documented in the Linux documentation, are + also defined in the socket module. + + Availability: Linux >= 2.6.25. + + .. versionadded:: 3.3 + + .. data:: SIO_* RCVALL_* @@ -387,10 +405,14 @@ Create a new socket using the given address family, socket type and protocol number. The address family should be :const:`AF_INET` (the default), - :const:`AF_INET6` or :const:`AF_UNIX`. The socket type should be - :const:`SOCK_STREAM` (the default), :const:`SOCK_DGRAM` or perhaps one of the - other ``SOCK_`` constants. The protocol number is usually zero and may be - omitted in that case. + :const:`AF_INET6`, :const:`AF_UNIX` or :const:`AF_CAN`. The socket type + should be :const:`SOCK_STREAM` (the default), :const:`SOCK_DGRAM`, + :const:`SOCK_RAW` or perhaps one of the other ``SOCK_`` constants. 
The + protocol number is usually zero and may be omitted in that case or + :const:`CAN_RAW` in case the address family is :const:`AF_CAN`. + + .. versionchanged:: 3.3 + The AF_CAN family was added. .. function:: socketpair([family[, type[, proto]]]) @@ -1213,7 +1235,7 @@ print('Received', repr(data)) -The last example shows how to write a very simple network sniffer with raw +The next example shows how to write a very simple network sniffer with raw sockets on Windows. The example requires administrator privileges to modify the interface:: @@ -1238,6 +1260,45 @@ # disabled promiscuous mode s.ioctl(socket.SIO_RCVALL, socket.RCVALL_OFF) +The last example shows how to use the socket interface to communicate to a CAN +network. This example might require special priviledge:: + + import socket + import struct + + + # CAN frame packing/unpacking (see `struct can_frame` in ) + + can_frame_fmt = "=IB3x8s" + + def build_can_frame(can_id, data): + can_dlc = len(data) + data = data.ljust(8, b'\x00') + return struct.pack(can_frame_fmt, can_id, can_dlc, data) + + def dissect_can_frame(frame): + can_id, can_dlc, data = struct.unpack(can_frame_fmt, frame) + return (can_id, can_dlc, data[:can_dlc]) + + + # create a raw socket and bind it to the `vcan0` interface + s = socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW) + s.bind(('vcan0',)) + + while True: + cf, addr = s.recvfrom(16) + + print('Received: can_id=%x, can_dlc=%x, data=%s' % dissect_can_frame(cf)) + + try: + s.send(cf) + except socket.error: + print('Error sending CAN frame') + + try: + s.send(build_can_frame(0x01, b'\x01\x02\x03')) + except socket.error: + print('Error sending CAN frame') Running an example several times with too small delay between executions, could lead to this error:: diff --git a/Doc/whatsnew/3.3.rst b/Doc/whatsnew/3.3.rst --- a/Doc/whatsnew/3.3.rst +++ b/Doc/whatsnew/3.3.rst @@ -300,15 +300,22 @@ socket ------ -The :class:`~socket.socket` class now exposes addititonal methods to -process 
ancillary data when supported by the underlying platform: +* The :class:`~socket.socket` class now exposes additional methods to process + ancillary data when supported by the underlying platform: -* :func:`~socket.socket.sendmsg` -* :func:`~socket.socket.recvmsg` -* :func:`~socket.socket.recvmsg_into` + * :func:`~socket.socket.sendmsg` + * :func:`~socket.socket.recvmsg` + * :func:`~socket.socket.recvmsg_into` -(Contributed by David Watson in :issue:`6560`, based on an earlier patch -by Heiko Wundram) + (Contributed by David Watson in :issue:`6560`, based on an earlier patch by + Heiko Wundram) + +* The :class:`~socket.socket` class now supports the PF_CAN protocol family + (http://en.wikipedia.org/wiki/Socketcan), on Linux + (http://lwn.net/Articles/253425). + + (Contributed by Matthias Fuchs, updated by Tiago Gon?alves in :issue:`10141`) + ssl --- diff --git a/Lib/test/test_socket.py b/Lib/test/test_socket.py --- a/Lib/test/test_socket.py +++ b/Lib/test/test_socket.py @@ -21,6 +21,7 @@ import signal import math import pickle +import struct try: import fcntl except ImportError: @@ -36,6 +37,18 @@ thread = None threading = None +def _have_socket_can(): + """Check whether CAN sockets are supported on this host.""" + try: + s = socket.socket(socket.PF_CAN, socket.SOCK_RAW, socket.CAN_RAW) + except (AttributeError, socket.error, OSError): + return False + else: + s.close() + return True + +HAVE_SOCKET_CAN = _have_socket_can() + # Size in bytes of the int type SIZEOF_INT = array.array("i").itemsize @@ -80,6 +93,30 @@ with self._cleanup_lock: return super().doCleanups(*args, **kwargs) +class SocketCANTest(unittest.TestCase): + + """To be able to run this test, a `vcan0` CAN interface can be created with + the following commands: + # modprobe vcan + # ip link add dev vcan0 type vcan + # ifconfig vcan0 up + """ + interface = 'vcan0' + bufsize = 128 + + def setUp(self): + self.s = socket.socket(socket.PF_CAN, socket.SOCK_RAW, socket.CAN_RAW) + try: + 
self.s.bind((self.interface,)) + except socket.error: + self.skipTest('network interface `%s` does not exist' % + self.interface) + self.s.close() + + def tearDown(self): + self.s.close() + self.s = None + class ThreadableTest: """Threadable Test class @@ -210,6 +247,26 @@ self.cli = None ThreadableTest.clientTearDown(self) +class ThreadedCANSocketTest(SocketCANTest, ThreadableTest): + + def __init__(self, methodName='runTest'): + SocketCANTest.__init__(self, methodName=methodName) + ThreadableTest.__init__(self) + + def clientSetUp(self): + self.cli = socket.socket(socket.PF_CAN, socket.SOCK_RAW, socket.CAN_RAW) + try: + self.cli.bind((self.interface,)) + except socket.error: + self.skipTest('network interface `%s` does not exist' % + self.interface) + self.cli.close() + + def clientTearDown(self): + self.cli.close() + self.cli = None + ThreadableTest.clientTearDown(self) + class SocketConnectedTest(ThreadedTCPSocketTest): """Socket tests for client-server connection. @@ -1072,6 +1129,112 @@ srv.close() + at unittest.skipUnless(HAVE_SOCKET_CAN, 'SocketCan required for this test.') +class BasicCANTest(unittest.TestCase): + + def testCrucialConstants(self): + socket.AF_CAN + socket.PF_CAN + socket.CAN_RAW + + def testCreateSocket(self): + with socket.socket(socket.PF_CAN, socket.SOCK_RAW, socket.CAN_RAW) as s: + pass + + def testBindAny(self): + with socket.socket(socket.PF_CAN, socket.SOCK_RAW, socket.CAN_RAW) as s: + s.bind(('', )) + + def testTooLongInterfaceName(self): + # most systems limit IFNAMSIZ to 16, take 1024 to be sure + with socket.socket(socket.PF_CAN, socket.SOCK_RAW, socket.CAN_RAW) as s: + self.assertRaisesRegexp(socket.error, 'interface name too long', + s.bind, ('x' * 1024,)) + + @unittest.skipUnless(hasattr(socket, "CAN_RAW_LOOPBACK"), + 'socket.CAN_RAW_LOOPBACK required for this test.') + def testLoopback(self): + with socket.socket(socket.PF_CAN, socket.SOCK_RAW, socket.CAN_RAW) as s: + for loopback in (0, 1): + 
s.setsockopt(socket.SOL_CAN_RAW, socket.CAN_RAW_LOOPBACK, + loopback) + self.assertEqual(loopback, + s.getsockopt(socket.SOL_CAN_RAW, socket.CAN_RAW_LOOPBACK)) + + @unittest.skipUnless(hasattr(socket, "CAN_RAW_FILTER"), + 'socket.CAN_RAW_FILTER required for this test.') + def testFilter(self): + can_id, can_mask = 0x200, 0x700 + can_filter = struct.pack("=II", can_id, can_mask) + with socket.socket(socket.PF_CAN, socket.SOCK_RAW, socket.CAN_RAW) as s: + s.setsockopt(socket.SOL_CAN_RAW, socket.CAN_RAW_FILTER, can_filter) + self.assertEqual(can_filter, + s.getsockopt(socket.SOL_CAN_RAW, socket.CAN_RAW_FILTER, 8)) + + + at unittest.skipUnless(HAVE_SOCKET_CAN, 'SocketCan required for this test.') + at unittest.skipUnless(thread, 'Threading required for this test.') +class CANTest(ThreadedCANSocketTest): + + """The CAN frame structure is defined in : + + struct can_frame { + canid_t can_id; /* 32 bit CAN_ID + EFF/RTR/ERR flags */ + __u8 can_dlc; /* data length code: 0 .. 8 */ + __u8 data[8] __attribute__((aligned(8))); + }; + """ + can_frame_fmt = "=IB3x8s" + + def __init__(self, methodName='runTest'): + ThreadedCANSocketTest.__init__(self, methodName=methodName) + + @classmethod + def build_can_frame(cls, can_id, data): + """Build a CAN frame.""" + can_dlc = len(data) + data = data.ljust(8, b'\x00') + return struct.pack(cls.can_frame_fmt, can_id, can_dlc, data) + + @classmethod + def dissect_can_frame(cls, frame): + """Dissect a CAN frame.""" + can_id, can_dlc, data = struct.unpack(cls.can_frame_fmt, frame) + return (can_id, can_dlc, data[:can_dlc]) + + def testSendFrame(self): + cf, addr = self.s.recvfrom(self.bufsize) + self.assertEqual(self.cf, cf) + self.assertEqual(addr[0], self.interface) + self.assertEqual(addr[1], socket.AF_CAN) + + def _testSendFrame(self): + self.cf = self.build_can_frame(0x00, b'\x01\x02\x03\x04\x05') + self.cli.send(self.cf) + + def testSendMaxFrame(self): + cf, addr = self.s.recvfrom(self.bufsize) + self.assertEqual(self.cf, cf) + + def 
_testSendMaxFrame(self): + self.cf = self.build_can_frame(0x00, b'\x07' * 8) + self.cli.send(self.cf) + + def testSendMultiFrames(self): + cf, addr = self.s.recvfrom(self.bufsize) + self.assertEqual(self.cf1, cf) + + cf, addr = self.s.recvfrom(self.bufsize) + self.assertEqual(self.cf2, cf) + + def _testSendMultiFrames(self): + self.cf1 = self.build_can_frame(0x07, b'\x44\x33\x22\x11') + self.cli.send(self.cf1) + + self.cf2 = self.build_can_frame(0x12, b'\x99\x22\x33') + self.cli.send(self.cf2) + + @unittest.skipUnless(thread, 'Threading required for this test.') class BasicTCPTest(SocketConnectedTest): @@ -4194,6 +4357,7 @@ if isTipcAvailable(): tests.append(TIPCTest) tests.append(TIPCThreadableTest) + tests.extend([BasicCANTest, CANTest]) tests.extend([ CmsgMacroTests, SendmsgUDPTest, diff --git a/Misc/ACKS b/Misc/ACKS --- a/Misc/ACKS +++ b/Misc/ACKS @@ -319,6 +319,7 @@ Martin Franklin Robin Friedrich Ivan Frohne +Matthias Fuchs Jim Fulton Tadayoshi Funaba Gyro Funch @@ -354,6 +355,7 @@ Yannick Gingras Christoph Gohlke Tim Golden +Tiago Gonçalves Chris Gonnerman David Goodger Hans de Graaff diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -1322,6 +1322,9 @@ Extension Modules ----------------- +- Issue #10141: socket: Add SocketCAN (PF_CAN) support. Initial patch by + Matthias Fuchs, updated by Tiago Gonçalves. + - Issue #13070: Fix a crash when a TextIOWrapper caught in a reference cycle would be finalized after the reference to its underlying BufferedRWPair's writer got cleared by the GC.
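[Editor's aside: the getsockaddrarg() hunk in the socketmodule.c diff below validates the interface name in the `(interface,)` address tuple: an empty name binds to every CAN interface (ifindex 0), a short name is resolved to an index via SIOCGIFINDEX, and anything at or above IFNAMSIZ is rejected. A hedged pure-Python sketch of that decision logic (helper name and return values are made up for illustration; the real code does the ioctl lookup instead):]

```python
IFNAMSIZ = 16   # typical Linux limit, as the "too long" test above notes

def check_can_interface(name: bytes) -> str:
    """Mirror of the AF_CAN address-validation branches:
    '' -> any interface, short name -> kernel index lookup,
    overlong name -> error."""
    if len(name) == 0:
        return "any"        # can_ifindex = 0 receives from all interfaces
    if len(name) < IFNAMSIZ:
        return "lookup"     # real code calls ioctl(SIOCGIFINDEX) here
    raise ValueError("AF_CAN interface name too long")

assert check_can_interface(b"") == "any"
assert check_can_interface(b"vcan0") == "lookup"
```

The `'x' * 1024` case exercised by `testTooLongInterfaceName` above falls into the final branch.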
diff --git a/Modules/socketmodule.c b/Modules/socketmodule.c --- a/Modules/socketmodule.c +++ b/Modules/socketmodule.c @@ -1220,6 +1220,25 @@ } #endif +#ifdef HAVE_LINUX_CAN_H + case AF_CAN: + { + struct sockaddr_can *a = (struct sockaddr_can *)addr; + char *ifname = ""; + struct ifreq ifr; + /* need to look up interface name given index */ + if (a->can_ifindex) { + ifr.ifr_ifindex = a->can_ifindex; + if (ioctl(sockfd, SIOCGIFNAME, &ifr) == 0) + ifname = ifr.ifr_name; + } + + return Py_BuildValue("O&h", PyUnicode_DecodeFSDefault, + ifname, + a->can_family); + } +#endif + /* More cases here... */ default: @@ -1587,6 +1606,53 @@ } #endif +#ifdef HAVE_LINUX_CAN_H + case AF_CAN: + switch (s->sock_proto) { + case CAN_RAW: + { + struct sockaddr_can *addr; + PyObject *interfaceName; + struct ifreq ifr; + addr = (struct sockaddr_can *)addr_ret; + Py_ssize_t len; + + if (!PyArg_ParseTuple(args, "O&", PyUnicode_FSConverter, + &interfaceName)) + return 0; + + len = PyBytes_GET_SIZE(interfaceName); + + if (len == 0) { + ifr.ifr_ifindex = 0; + } else if (len < sizeof(ifr.ifr_name)) { + strcpy(ifr.ifr_name, PyBytes_AS_STRING(interfaceName)); + if (ioctl(s->sock_fd, SIOCGIFINDEX, &ifr) < 0) { + s->errorhandler(); + Py_DECREF(interfaceName); + return 0; + } + } else { + PyErr_SetString(socket_error, + "AF_CAN interface name too long"); + Py_DECREF(interfaceName); + return 0; + } + + addr->can_family = AF_CAN; + addr->can_ifindex = ifr.ifr_ifindex; + + *len_ret = sizeof(*addr); + Py_DECREF(interfaceName); + return 1; + } + default: + PyErr_SetString(socket_error, + "getsockaddrarg: unsupported CAN protocol"); + return 0; + } +#endif + /* More cases here... */ default: @@ -1680,6 +1746,14 @@ } #endif +#ifdef HAVE_LINUX_CAN_H + case AF_CAN: + { + *len_ret = sizeof (struct sockaddr_can); + return 1; + } +#endif + /* More cases here... 
 */
 
     default:
@@ -5533,6 +5607,15 @@
     PyModule_AddStringConstant(m, "BDADDR_LOCAL", "00:00:00:FF:FF:FF");
 #endif
 
+#ifdef AF_CAN
+    /* Controller Area Network */
+    PyModule_AddIntConstant(m, "AF_CAN", AF_CAN);
+#endif
+#ifdef PF_CAN
+    /* Controller Area Network */
+    PyModule_AddIntConstant(m, "PF_CAN", PF_CAN);
+#endif
+
 #ifdef AF_PACKET
     PyModule_AddIntMacro(m, AF_PACKET);
 #endif
@@ -5803,6 +5886,28 @@
 #else
     PyModule_AddIntConstant(m, "SOL_UDP", 17);
 #endif
+#ifdef SOL_CAN_BASE
+    PyModule_AddIntConstant(m, "SOL_CAN_BASE", SOL_CAN_BASE);
+#endif
+#ifdef SOL_CAN_RAW
+    PyModule_AddIntConstant(m, "SOL_CAN_RAW", SOL_CAN_RAW);
+    PyModule_AddIntConstant(m, "CAN_RAW", CAN_RAW);
+#endif
+#ifdef HAVE_LINUX_CAN_H
+    PyModule_AddIntConstant(m, "CAN_EFF_FLAG", CAN_EFF_FLAG);
+    PyModule_AddIntConstant(m, "CAN_RTR_FLAG", CAN_RTR_FLAG);
+    PyModule_AddIntConstant(m, "CAN_ERR_FLAG", CAN_ERR_FLAG);
+
+    PyModule_AddIntConstant(m, "CAN_SFF_MASK", CAN_SFF_MASK);
+    PyModule_AddIntConstant(m, "CAN_EFF_MASK", CAN_EFF_MASK);
+    PyModule_AddIntConstant(m, "CAN_ERR_MASK", CAN_ERR_MASK);
+#endif
+#ifdef HAVE_LINUX_CAN_RAW_H
+    PyModule_AddIntConstant(m, "CAN_RAW_FILTER", CAN_RAW_FILTER);
+    PyModule_AddIntConstant(m, "CAN_RAW_ERR_FILTER", CAN_RAW_ERR_FILTER);
+    PyModule_AddIntConstant(m, "CAN_RAW_LOOPBACK", CAN_RAW_LOOPBACK);
+    PyModule_AddIntConstant(m, "CAN_RAW_RECV_OWN_MSGS", CAN_RAW_RECV_OWN_MSGS);
+#endif
 #ifdef IPPROTO_IP
     PyModule_AddIntConstant(m, "IPPROTO_IP", IPPROTO_IP);
 #else
diff --git a/Modules/socketmodule.h b/Modules/socketmodule.h
--- a/Modules/socketmodule.h
+++ b/Modules/socketmodule.h
@@ -72,6 +72,14 @@
 # include
 #endif
 
+#ifdef HAVE_LINUX_CAN_H
+#include <linux/can.h>
+#endif
+
+#ifdef HAVE_LINUX_CAN_RAW_H
+#include <linux/can/raw.h>
+#endif
+
 #ifndef Py__SOCKET_H
 #define Py__SOCKET_H
 #ifdef __cplusplus
@@ -126,6 +134,9 @@
 #ifdef HAVE_NETPACKET_PACKET_H
         struct sockaddr_ll ll;
 #endif
+#ifdef HAVE_LINUX_CAN_H
+        struct sockaddr_can can;
+#endif
 } sock_addr_t;
 
 /* The object holding a socket.
It holds some extra information, diff --git a/configure b/configure --- a/configure +++ b/configure @@ -1,6 +1,6 @@ #! /bin/sh # Guess values for system-dependent variables and create Makefiles. -# Generated by GNU Autoconf 2.68 for python 3.3. +# Generated by GNU Autoconf 2.67 for python 3.3. # # Report bugs to . # @@ -91,7 +91,6 @@ IFS=" "" $as_nl" # Find who we are. Look in the path if we contain no directory separator. -as_myself= case $0 in #(( *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR @@ -217,18 +216,11 @@ # We cannot yet assume a decent shell, so we have to provide a # neutralization value for shells without unset; and this also # works around shells that cannot unset nonexistent variables. - # Preserve -v and -x to the replacement shell. BASH_ENV=/dev/null ENV=/dev/null (unset BASH_ENV) >/dev/null 2>&1 && unset BASH_ENV ENV export CONFIG_SHELL - case $- in # (((( - *v*x* | *x*v* ) as_opts=-vx ;; - *v* ) as_opts=-v ;; - *x* ) as_opts=-x ;; - * ) as_opts= ;; - esac - exec "$CONFIG_SHELL" $as_opts "$as_myself" ${1+"$@"} + exec "$CONFIG_SHELL" "$as_myself" ${1+"$@"} fi if test x$as_have_required = xno; then : @@ -777,8 +769,7 @@ LDFLAGS LIBS CPPFLAGS -CPP -CPPFLAGS' +CPP' # Initialize some variables set by options. @@ -1183,7 +1174,7 @@ $as_echo "$as_me: WARNING: you should use --build, --host, --target" >&2 expr "x$ac_option" : ".*[^-._$as_cr_alnum]" >/dev/null && $as_echo "$as_me: WARNING: invalid host type: $ac_option" >&2 - : "${build_alias=$ac_option} ${host_alias=$ac_option} ${target_alias=$ac_option}" + : ${build_alias=$ac_option} ${host_alias=$ac_option} ${target_alias=$ac_option} ;; esac @@ -1519,7 +1510,7 @@ if $ac_init_version; then cat <<\_ACEOF python configure 3.3 -generated by GNU Autoconf 2.68 +generated by GNU Autoconf 2.67 Copyright (C) 2010 Free Software Foundation, Inc. 
This configure script is free software; the Free Software Foundation @@ -1565,7 +1556,7 @@ ac_retval=1 fi - eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno + eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} as_fn_set_status $ac_retval } # ac_fn_c_try_compile @@ -1611,7 +1602,7 @@ # interfere with the next link command; also delete a directory that is # left behind by Apple's compiler. We do this before executing the actions. rm -rf conftest.dSYM conftest_ipa8_conftest.oo - eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno + eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} as_fn_set_status $ac_retval } # ac_fn_c_try_link @@ -1648,7 +1639,7 @@ ac_retval=1 fi - eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno + eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} as_fn_set_status $ac_retval } # ac_fn_c_try_cpp @@ -1661,10 +1652,10 @@ ac_fn_c_check_header_mongrel () { as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack - if eval \${$3+:} false; then : + if eval "test \"\${$3+set}\"" = set; then : { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5 $as_echo_n "checking for $2... " >&6; } -if eval \${$3+:} false; then : +if eval "test \"\${$3+set}\"" = set; then : $as_echo_n "(cached) " >&6 fi eval ac_res=\$$3 @@ -1731,7 +1722,7 @@ esac { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5 $as_echo_n "checking for $2... 
" >&6; } -if eval \${$3+:} false; then : +if eval "test \"\${$3+set}\"" = set; then : $as_echo_n "(cached) " >&6 else eval "$3=\$ac_header_compiler" @@ -1740,7 +1731,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } fi - eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno + eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} } # ac_fn_c_check_header_mongrel @@ -1781,7 +1772,7 @@ ac_retval=$ac_status fi rm -rf conftest.dSYM conftest_ipa8_conftest.oo - eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno + eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} as_fn_set_status $ac_retval } # ac_fn_c_try_run @@ -1795,7 +1786,7 @@ as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5 $as_echo_n "checking for $2... " >&6; } -if eval \${$3+:} false; then : +if eval "test \"\${$3+set}\"" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -1813,7 +1804,7 @@ eval ac_res=\$$3 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } - eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno + eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} } # ac_fn_c_check_header_compile @@ -1826,7 +1817,7 @@ as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5 $as_echo_n "checking for $2... 
" >&6; } -if eval \${$3+:} false; then : +if eval "test \"\${$3+set}\"" = set; then : $as_echo_n "(cached) " >&6 else eval "$3=no" @@ -1867,7 +1858,7 @@ eval ac_res=\$$3 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } - eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno + eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} } # ac_fn_c_check_type @@ -1880,7 +1871,7 @@ as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack { $as_echo "$as_me:${as_lineno-$LINENO}: checking for uint$2_t" >&5 $as_echo_n "checking for uint$2_t... " >&6; } -if eval \${$3+:} false; then : +if eval "test \"\${$3+set}\"" = set; then : $as_echo_n "(cached) " >&6 else eval "$3=no" @@ -1920,7 +1911,7 @@ eval ac_res=\$$3 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } - eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno + eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} } # ac_fn_c_find_uintX_t @@ -1933,7 +1924,7 @@ as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack { $as_echo "$as_me:${as_lineno-$LINENO}: checking for int$2_t" >&5 $as_echo_n "checking for int$2_t... 
" >&6; } -if eval \${$3+:} false; then : +if eval "test \"\${$3+set}\"" = set; then : $as_echo_n "(cached) " >&6 else eval "$3=no" @@ -1994,7 +1985,7 @@ eval ac_res=\$$3 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } - eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno + eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} } # ac_fn_c_find_intX_t @@ -2171,7 +2162,7 @@ rm -f conftest.val fi - eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno + eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} as_fn_set_status $ac_retval } # ac_fn_c_compute_int @@ -2184,7 +2175,7 @@ as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5 $as_echo_n "checking for $2... " >&6; } -if eval \${$3+:} false; then : +if eval "test \"\${$3+set}\"" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -2239,7 +2230,7 @@ eval ac_res=\$$3 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } - eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno + eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} } # ac_fn_c_check_func @@ -2252,7 +2243,7 @@ as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2.$3" >&5 $as_echo_n "checking for $2.$3... 
" >&6; } -if eval \${$4+:} false; then : +if eval "test \"\${$4+set}\"" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -2296,7 +2287,7 @@ eval ac_res=\$$4 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } - eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno + eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} } # ac_fn_c_check_member @@ -2311,7 +2302,7 @@ as_decl_use=`echo $2|sed -e 's/(/((/' -e 's/)/) 0&/' -e 's/,/) 0& (/g'` { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $as_decl_name is declared" >&5 $as_echo_n "checking whether $as_decl_name is declared... " >&6; } -if eval \${$3+:} false; then : +if eval "test \"\${$3+set}\"" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -2342,7 +2333,7 @@ eval ac_res=\$$3 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5 $as_echo "$ac_res" >&6; } - eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno + eval $as_lineno_stack; test "x$as_lineno_stack" = x && { as_lineno=; unset as_lineno;} } # ac_fn_c_check_decl cat >config.log <<_ACEOF @@ -2350,7 +2341,7 @@ running configure, to aid debugging if configure makes a mistake. It was created by python $as_me 3.3, which was -generated by GNU Autoconf 2.68. Invocation command line was +generated by GNU Autoconf 2.67. Invocation command line was $ $0 $@ @@ -2608,7 +2599,7 @@ || { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "failed to load site script $ac_site_file -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } fi done @@ -2708,7 +2699,7 @@ set dummy hg; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... 
" >&6; } -if ${ac_cv_prog_HAS_HG+:} false; then : +if test "${ac_cv_prog_HAS_HG+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$HAS_HG"; then @@ -3257,7 +3248,7 @@ set dummy ${ac_tool_prefix}gcc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } -if ${ac_cv_prog_CC+:} false; then : +if test "${ac_cv_prog_CC+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$CC"; then @@ -3297,7 +3288,7 @@ set dummy gcc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } -if ${ac_cv_prog_ac_ct_CC+:} false; then : +if test "${ac_cv_prog_ac_ct_CC+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_CC"; then @@ -3350,7 +3341,7 @@ set dummy ${ac_tool_prefix}cc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } -if ${ac_cv_prog_CC+:} false; then : +if test "${ac_cv_prog_CC+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$CC"; then @@ -3390,7 +3381,7 @@ set dummy cc; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } -if ${ac_cv_prog_CC+:} false; then : +if test "${ac_cv_prog_CC+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$CC"; then @@ -3449,7 +3440,7 @@ set dummy $ac_tool_prefix$ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } -if ${ac_cv_prog_CC+:} false; then : +if test "${ac_cv_prog_CC+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$CC"; then @@ -3493,7 +3484,7 @@ set dummy $ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... 
" >&6; } -if ${ac_cv_prog_ac_ct_CC+:} false; then : +if test "${ac_cv_prog_ac_ct_CC+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_CC"; then @@ -3548,7 +3539,7 @@ test -z "$CC" && { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "no acceptable C compiler found in \$PATH -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } # Provide some information about the compiler. $as_echo "$as_me:${as_lineno-$LINENO}: checking for C compiler version" >&5 @@ -3663,7 +3654,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "C compiler cannot create executables -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 $as_echo "yes" >&6; } @@ -3706,7 +3697,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot compute suffix of executables: cannot compile and link -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } fi rm -f conftest conftest$ac_cv_exeext { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_exeext" >&5 @@ -3765,7 +3756,7 @@ $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot run C compiled programs. If you meant to cross compile, use \`--host'. -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } fi fi fi @@ -3776,7 +3767,7 @@ ac_clean_files=$ac_clean_files_save { $as_echo "$as_me:${as_lineno-$LINENO}: checking for suffix of object files" >&5 $as_echo_n "checking for suffix of object files... 
" >&6; } -if ${ac_cv_objext+:} false; then : +if test "${ac_cv_objext+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -3817,7 +3808,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "cannot compute suffix of object files: cannot compile -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } fi rm -f conftest.$ac_cv_objext conftest.$ac_ext fi @@ -3827,7 +3818,7 @@ ac_objext=$OBJEXT { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether we are using the GNU C compiler" >&5 $as_echo_n "checking whether we are using the GNU C compiler... " >&6; } -if ${ac_cv_c_compiler_gnu+:} false; then : +if test "${ac_cv_c_compiler_gnu+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -3864,7 +3855,7 @@ ac_save_CFLAGS=$CFLAGS { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC accepts -g" >&5 $as_echo_n "checking whether $CC accepts -g... " >&6; } -if ${ac_cv_prog_cc_g+:} false; then : +if test "${ac_cv_prog_cc_g+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_save_c_werror_flag=$ac_c_werror_flag @@ -3942,7 +3933,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $CC option to accept ISO C89" >&5 $as_echo_n "checking for $CC option to accept ISO C89... " >&6; } -if ${ac_cv_prog_cc_c89+:} false; then : +if test "${ac_cv_prog_cc_c89+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_cv_prog_cc_c89=no @@ -4077,7 +4068,7 @@ set dummy g++; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... 
" >&6; } -if ${ac_cv_path_CXX+:} false; then : +if test "${ac_cv_path_CXX+set}" = set; then : $as_echo_n "(cached) " >&6 else case $CXX in @@ -4118,7 +4109,7 @@ set dummy c++; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } -if ${ac_cv_path_CXX+:} false; then : +if test "${ac_cv_path_CXX+set}" = set; then : $as_echo_n "(cached) " >&6 else case $CXX in @@ -4169,7 +4160,7 @@ set dummy $ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } -if ${ac_cv_prog_CXX+:} false; then : +if test "${ac_cv_prog_CXX+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$CXX"; then @@ -4270,7 +4261,7 @@ CPP= fi if test -z "$CPP"; then - if ${ac_cv_prog_CPP+:} false; then : + if test "${ac_cv_prog_CPP+set}" = set; then : $as_echo_n "(cached) " >&6 else # Double quotes because CPP needs to be expanded @@ -4386,7 +4377,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error $? "C preprocessor \"$CPP\" fails sanity check -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } fi ac_ext=c @@ -4398,7 +4389,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for grep that handles long lines and -e" >&5 $as_echo_n "checking for grep that handles long lines and -e... " >&6; } -if ${ac_cv_path_GREP+:} false; then : +if test "${ac_cv_path_GREP+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -z "$GREP"; then @@ -4461,7 +4452,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for egrep" >&5 $as_echo_n "checking for egrep... 
" >&6; } -if ${ac_cv_path_EGREP+:} false; then : +if test "${ac_cv_path_EGREP+set}" = set; then : $as_echo_n "(cached) " >&6 else if echo a | $GREP -E '(a|b)' >/dev/null 2>&1 @@ -4528,7 +4519,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ANSI C header files" >&5 $as_echo_n "checking for ANSI C header files... " >&6; } -if ${ac_cv_header_stdc+:} false; then : +if test "${ac_cv_header_stdc+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -4657,7 +4648,7 @@ ac_fn_c_check_header_mongrel "$LINENO" "minix/config.h" "ac_cv_header_minix_config_h" "$ac_includes_default" -if test "x$ac_cv_header_minix_config_h" = xyes; then : +if test "x$ac_cv_header_minix_config_h" = x""yes; then : MINIX=yes else MINIX= @@ -4679,7 +4670,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether it is safe to define __EXTENSIONS__" >&5 $as_echo_n "checking whether it is safe to define __EXTENSIONS__... " >&6; } -if ${ac_cv_safe_to_define___extensions__+:} false; then : +if test "${ac_cv_safe_to_define___extensions__+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -4872,7 +4863,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for inline" >&5 $as_echo_n "checking for inline... " >&6; } -if ${ac_cv_c_inline+:} false; then : +if test "${ac_cv_c_inline+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_cv_c_inline=no @@ -5068,7 +5059,7 @@ set dummy ${ac_tool_prefix}ranlib; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } -if ${ac_cv_prog_RANLIB+:} false; then : +if test "${ac_cv_prog_RANLIB+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$RANLIB"; then @@ -5108,7 +5099,7 @@ set dummy ranlib; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... 
" >&6; } -if ${ac_cv_prog_ac_ct_RANLIB+:} false; then : +if test "${ac_cv_prog_ac_ct_RANLIB+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$ac_ct_RANLIB"; then @@ -5162,7 +5153,7 @@ set dummy $ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } -if ${ac_cv_prog_AR+:} false; then : +if test "${ac_cv_prog_AR+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$AR"; then @@ -5213,7 +5204,7 @@ set dummy python; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } -if ${ac_cv_prog_HAS_PYTHON+:} false; then : +if test "${ac_cv_prog_HAS_PYTHON+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$HAS_PYTHON"; then @@ -5307,7 +5298,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for a BSD-compatible install" >&5 $as_echo_n "checking for a BSD-compatible install... " >&6; } if test -z "$INSTALL"; then -if ${ac_cv_path_install+:} false; then : +if test "${ac_cv_path_install+set}" = set; then : $as_echo_n "(cached) " >&6 else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR @@ -5499,7 +5490,7 @@ ac_save_cc="$CC" CC="$CC -fno-strict-aliasing" save_CFLAGS="$CFLAGS" - if ${ac_cv_no_strict_aliasing+:} false; then : + if test "${ac_cv_no_strict_aliasing+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -5565,7 +5556,7 @@ ac_save_cc="$CC" CC="$CC -Wunused-result -Werror" save_CFLAGS="$CFLAGS" - if ${ac_cv_disable_unused_result_warning+:} false; then : + if test "${ac_cv_disable_unused_result_warning+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -5792,7 +5783,7 @@ # options before we can check whether -Kpthread improves anything. 
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether pthreads are available without options" >&5 $as_echo_n "checking whether pthreads are available without options... " >&6; } -if ${ac_cv_pthread_is_default+:} false; then : +if test "${ac_cv_pthread_is_default+set}" = set; then : $as_echo_n "(cached) " >&6 else if test "$cross_compiling" = yes; then : @@ -5845,7 +5836,7 @@ # function available. { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC accepts -Kpthread" >&5 $as_echo_n "checking whether $CC accepts -Kpthread... " >&6; } -if ${ac_cv_kpthread+:} false; then : +if test "${ac_cv_kpthread+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_save_cc="$CC" @@ -5894,7 +5885,7 @@ # function available. { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC accepts -Kthread" >&5 $as_echo_n "checking whether $CC accepts -Kthread... " >&6; } -if ${ac_cv_kthread+:} false; then : +if test "${ac_cv_kthread+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_save_cc="$CC" @@ -5943,7 +5934,7 @@ # function available. { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC accepts -pthread" >&5 $as_echo_n "checking whether $CC accepts -pthread... " >&6; } -if ${ac_cv_thread+:} false; then : +if test "${ac_cv_thread+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_save_cc="$CC" @@ -6028,7 +6019,7 @@ # checks for header files { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ANSI C header files" >&5 $as_echo_n "checking for ANSI C header files... " >&6; } -if ${ac_cv_header_stdc+:} false; then : +if test "${ac_cv_header_stdc+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -6167,7 +6158,7 @@ as_ac_Header=`$as_echo "ac_cv_header_dirent_$ac_hdr" | $as_tr_sh` { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_hdr that defines DIR" >&5 $as_echo_n "checking for $ac_hdr that defines DIR... 
" >&6; } -if eval \${$as_ac_Header+:} false; then : +if eval "test \"\${$as_ac_Header+set}\"" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -6207,7 +6198,7 @@ if test $ac_header_dirent = dirent.h; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing opendir" >&5 $as_echo_n "checking for library containing opendir... " >&6; } -if ${ac_cv_search_opendir+:} false; then : +if test "${ac_cv_search_opendir+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_func_search_save_LIBS=$LIBS @@ -6241,11 +6232,11 @@ fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext - if ${ac_cv_search_opendir+:} false; then : + if test "${ac_cv_search_opendir+set}" = set; then : break fi done -if ${ac_cv_search_opendir+:} false; then : +if test "${ac_cv_search_opendir+set}" = set; then : else ac_cv_search_opendir=no @@ -6264,7 +6255,7 @@ else { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing opendir" >&5 $as_echo_n "checking for library containing opendir... " >&6; } -if ${ac_cv_search_opendir+:} false; then : +if test "${ac_cv_search_opendir+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_func_search_save_LIBS=$LIBS @@ -6298,11 +6289,11 @@ fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext - if ${ac_cv_search_opendir+:} false; then : + if test "${ac_cv_search_opendir+set}" = set; then : break fi done -if ${ac_cv_search_opendir+:} false; then : +if test "${ac_cv_search_opendir+set}" = set; then : else ac_cv_search_opendir=no @@ -6322,7 +6313,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether sys/types.h defines makedev" >&5 $as_echo_n "checking whether sys/types.h defines makedev... 
" >&6; } -if ${ac_cv_header_sys_types_h_makedev+:} false; then : +if test "${ac_cv_header_sys_types_h_makedev+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -6350,7 +6341,7 @@ if test $ac_cv_header_sys_types_h_makedev = no; then ac_fn_c_check_header_mongrel "$LINENO" "sys/mkdev.h" "ac_cv_header_sys_mkdev_h" "$ac_includes_default" -if test "x$ac_cv_header_sys_mkdev_h" = xyes; then : +if test "x$ac_cv_header_sys_mkdev_h" = x""yes; then : $as_echo "#define MAJOR_IN_MKDEV 1" >>confdefs.h @@ -6360,7 +6351,7 @@ if test $ac_cv_header_sys_mkdev_h = no; then ac_fn_c_check_header_mongrel "$LINENO" "sys/sysmacros.h" "ac_cv_header_sys_sysmacros_h" "$ac_includes_default" -if test "x$ac_cv_header_sys_sysmacros_h" = xyes; then : +if test "x$ac_cv_header_sys_sysmacros_h" = x""yes; then : $as_echo "#define MAJOR_IN_SYSMACROS 1" >>confdefs.h @@ -6388,7 +6379,7 @@ #endif " -if test "x$ac_cv_header_net_if_h" = xyes; then : +if test "x$ac_cv_header_net_if_h" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_NET_IF_H 1 _ACEOF @@ -6408,7 +6399,7 @@ #endif " -if test "x$ac_cv_header_term_h" = xyes; then : +if test "x$ac_cv_header_term_h" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_TERM_H 1 _ACEOF @@ -6430,7 +6421,7 @@ #endif " -if test "x$ac_cv_header_linux_netlink_h" = xyes; then : +if test "x$ac_cv_header_linux_netlink_h" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_LINUX_NETLINK_H 1 _ACEOF @@ -6440,6 +6431,26 @@ done +# On Linux, can.h and can/raw.h require sys/socket.h +for ac_header in linux/can.h linux/can/raw.h +do : + as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh` +ac_fn_c_check_header_compile "$LINENO" "$ac_header" "$as_ac_Header" " +#ifdef HAVE_SYS_SOCKET_H +#include <sys/socket.h> +#endif + +" +if eval test \"x\$"$as_ac_Header"\" = x"yes"; then : + cat >>confdefs.h <<_ACEOF +#define `$as_echo "HAVE_$ac_header" | $as_tr_cpp` 1 +_ACEOF + +fi + +done + + # checks for typedefs was_it_defined=no {
$as_echo "$as_me:${as_lineno-$LINENO}: checking for clock_t in time.h" >&5 @@ -6566,7 +6577,7 @@ # Type availability checks ac_fn_c_check_type "$LINENO" "mode_t" "ac_cv_type_mode_t" "$ac_includes_default" -if test "x$ac_cv_type_mode_t" = xyes; then : +if test "x$ac_cv_type_mode_t" = x""yes; then : else @@ -6577,7 +6588,7 @@ fi ac_fn_c_check_type "$LINENO" "off_t" "ac_cv_type_off_t" "$ac_includes_default" -if test "x$ac_cv_type_off_t" = xyes; then : +if test "x$ac_cv_type_off_t" = x""yes; then : else @@ -6588,7 +6599,7 @@ fi ac_fn_c_check_type "$LINENO" "pid_t" "ac_cv_type_pid_t" "$ac_includes_default" -if test "x$ac_cv_type_pid_t" = xyes; then : +if test "x$ac_cv_type_pid_t" = x""yes; then : else @@ -6604,7 +6615,7 @@ _ACEOF ac_fn_c_check_type "$LINENO" "size_t" "ac_cv_type_size_t" "$ac_includes_default" -if test "x$ac_cv_type_size_t" = xyes; then : +if test "x$ac_cv_type_size_t" = x""yes; then : else @@ -6616,7 +6627,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for uid_t in sys/types.h" >&5 $as_echo_n "checking for uid_t in sys/types.h... " >&6; } -if ${ac_cv_type_uid_t+:} false; then : +if test "${ac_cv_type_uid_t+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -6695,7 +6706,7 @@ esac ac_fn_c_check_type "$LINENO" "ssize_t" "ac_cv_type_ssize_t" "$ac_includes_default" -if test "x$ac_cv_type_ssize_t" = xyes; then : +if test "x$ac_cv_type_ssize_t" = x""yes; then : $as_echo "#define HAVE_SSIZE_T 1" >>confdefs.h @@ -6710,7 +6721,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of int" >&5 $as_echo_n "checking size of int... 
" >&6; } -if ${ac_cv_sizeof_int+:} false; then : +if test "${ac_cv_sizeof_int+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (int))" "ac_cv_sizeof_int" "$ac_includes_default"; then : @@ -6720,7 +6731,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (int) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_int=0 fi @@ -6743,7 +6754,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of long" >&5 $as_echo_n "checking size of long... " >&6; } -if ${ac_cv_sizeof_long+:} false; then : +if test "${ac_cv_sizeof_long+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (long))" "ac_cv_sizeof_long" "$ac_includes_default"; then : @@ -6753,7 +6764,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (long) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_long=0 fi @@ -6776,7 +6787,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of void *" >&5 $as_echo_n "checking size of void *... 
" >&6; } -if ${ac_cv_sizeof_void_p+:} false; then : +if test "${ac_cv_sizeof_void_p+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (void *))" "ac_cv_sizeof_void_p" "$ac_includes_default"; then : @@ -6786,7 +6797,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (void *) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_void_p=0 fi @@ -6809,7 +6820,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of short" >&5 $as_echo_n "checking size of short... " >&6; } -if ${ac_cv_sizeof_short+:} false; then : +if test "${ac_cv_sizeof_short+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (short))" "ac_cv_sizeof_short" "$ac_includes_default"; then : @@ -6819,7 +6830,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (short) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_short=0 fi @@ -6842,7 +6853,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of float" >&5 $as_echo_n "checking size of float... 
" >&6; } -if ${ac_cv_sizeof_float+:} false; then : +if test "${ac_cv_sizeof_float+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (float))" "ac_cv_sizeof_float" "$ac_includes_default"; then : @@ -6852,7 +6863,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (float) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_float=0 fi @@ -6875,7 +6886,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of double" >&5 $as_echo_n "checking size of double... " >&6; } -if ${ac_cv_sizeof_double+:} false; then : +if test "${ac_cv_sizeof_double+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (double))" "ac_cv_sizeof_double" "$ac_includes_default"; then : @@ -6885,7 +6896,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (double) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_double=0 fi @@ -6908,7 +6919,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of fpos_t" >&5 $as_echo_n "checking size of fpos_t... 
" >&6; } -if ${ac_cv_sizeof_fpos_t+:} false; then : +if test "${ac_cv_sizeof_fpos_t+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (fpos_t))" "ac_cv_sizeof_fpos_t" "$ac_includes_default"; then : @@ -6918,7 +6929,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (fpos_t) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_fpos_t=0 fi @@ -6941,7 +6952,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of size_t" >&5 $as_echo_n "checking size of size_t... " >&6; } -if ${ac_cv_sizeof_size_t+:} false; then : +if test "${ac_cv_sizeof_size_t+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (size_t))" "ac_cv_sizeof_size_t" "$ac_includes_default"; then : @@ -6951,7 +6962,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (size_t) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_size_t=0 fi @@ -6974,7 +6985,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of pid_t" >&5 $as_echo_n "checking size of pid_t... 
" >&6; } -if ${ac_cv_sizeof_pid_t+:} false; then : +if test "${ac_cv_sizeof_pid_t+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (pid_t))" "ac_cv_sizeof_pid_t" "$ac_includes_default"; then : @@ -6984,7 +6995,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (pid_t) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_pid_t=0 fi @@ -7034,7 +7045,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of long long" >&5 $as_echo_n "checking size of long long... " >&6; } -if ${ac_cv_sizeof_long_long+:} false; then : +if test "${ac_cv_sizeof_long_long+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (long long))" "ac_cv_sizeof_long_long" "$ac_includes_default"; then : @@ -7044,7 +7055,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (long long) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_long_long=0 fi @@ -7095,7 +7106,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of long double" >&5 $as_echo_n "checking size of long double... 
" >&6; } -if ${ac_cv_sizeof_long_double+:} false; then : +if test "${ac_cv_sizeof_long_double+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (long double))" "ac_cv_sizeof_long_double" "$ac_includes_default"; then : @@ -7105,7 +7116,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (long double) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_long_double=0 fi @@ -7157,7 +7168,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of _Bool" >&5 $as_echo_n "checking size of _Bool... " >&6; } -if ${ac_cv_sizeof__Bool+:} false; then : +if test "${ac_cv_sizeof__Bool+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (_Bool))" "ac_cv_sizeof__Bool" "$ac_includes_default"; then : @@ -7167,7 +7178,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (_Bool) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof__Bool=0 fi @@ -7193,7 +7204,7 @@ #include #endif " -if test "x$ac_cv_type_uintptr_t" = xyes; then : +if test "x$ac_cv_type_uintptr_t" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_UINTPTR_T 1 @@ -7205,7 +7216,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of uintptr_t" >&5 $as_echo_n "checking size of uintptr_t... 
" >&6; } -if ${ac_cv_sizeof_uintptr_t+:} false; then : +if test "${ac_cv_sizeof_uintptr_t+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (uintptr_t))" "ac_cv_sizeof_uintptr_t" "$ac_includes_default"; then : @@ -7215,7 +7226,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (uintptr_t) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_uintptr_t=0 fi @@ -7241,7 +7252,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of off_t" >&5 $as_echo_n "checking size of off_t... " >&6; } -if ${ac_cv_sizeof_off_t+:} false; then : +if test "${ac_cv_sizeof_off_t+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (off_t))" "ac_cv_sizeof_off_t" " @@ -7256,7 +7267,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (off_t) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_off_t=0 fi @@ -7300,7 +7311,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of time_t" >&5 $as_echo_n "checking size of time_t... 
" >&6; } -if ${ac_cv_sizeof_time_t+:} false; then : +if test "${ac_cv_sizeof_time_t+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (time_t))" "ac_cv_sizeof_time_t" " @@ -7318,7 +7329,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (time_t) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_time_t=0 fi @@ -7375,7 +7386,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of pthread_t" >&5 $as_echo_n "checking size of pthread_t... " >&6; } -if ${ac_cv_sizeof_pthread_t+:} false; then : +if test "${ac_cv_sizeof_pthread_t+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (pthread_t))" "ac_cv_sizeof_pthread_t" " @@ -7390,7 +7401,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (pthread_t) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_pthread_t=0 fi @@ -7821,7 +7832,7 @@ # checks for libraries { $as_echo "$as_me:${as_lineno-$LINENO}: checking for sendfile in -lsendfile" >&5 $as_echo_n "checking for sendfile in -lsendfile... 
" >&6; } -if ${ac_cv_lib_sendfile_sendfile+:} false; then : +if test "${ac_cv_lib_sendfile_sendfile+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -7855,7 +7866,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_sendfile_sendfile" >&5 $as_echo "$ac_cv_lib_sendfile_sendfile" >&6; } -if test "x$ac_cv_lib_sendfile_sendfile" = xyes; then : +if test "x$ac_cv_lib_sendfile_sendfile" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_LIBSENDFILE 1 _ACEOF @@ -7866,7 +7877,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for dlopen in -ldl" >&5 $as_echo_n "checking for dlopen in -ldl... " >&6; } -if ${ac_cv_lib_dl_dlopen+:} false; then : +if test "${ac_cv_lib_dl_dlopen+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -7900,7 +7911,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_dl_dlopen" >&5 $as_echo "$ac_cv_lib_dl_dlopen" >&6; } -if test "x$ac_cv_lib_dl_dlopen" = xyes; then : +if test "x$ac_cv_lib_dl_dlopen" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_LIBDL 1 _ACEOF @@ -7911,7 +7922,7 @@ # Dynamic linking for SunOS/Solaris and SYSV { $as_echo "$as_me:${as_lineno-$LINENO}: checking for shl_load in -ldld" >&5 $as_echo_n "checking for shl_load in -ldld... " >&6; } -if ${ac_cv_lib_dld_shl_load+:} false; then : +if test "${ac_cv_lib_dld_shl_load+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -7945,7 +7956,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_dld_shl_load" >&5 $as_echo "$ac_cv_lib_dld_shl_load" >&6; } -if test "x$ac_cv_lib_dld_shl_load" = xyes; then : +if test "x$ac_cv_lib_dld_shl_load" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_LIBDLD 1 _ACEOF @@ -7959,7 +7970,7 @@ if test "$with_threads" = "yes" -o -z "$with_threads"; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing sem_init" >&5 $as_echo_n "checking for library containing sem_init... 
" >&6; } -if ${ac_cv_search_sem_init+:} false; then : +if test "${ac_cv_search_sem_init+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_func_search_save_LIBS=$LIBS @@ -7993,11 +8004,11 @@ fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext - if ${ac_cv_search_sem_init+:} false; then : + if test "${ac_cv_search_sem_init+set}" = set; then : break fi done -if ${ac_cv_search_sem_init+:} false; then : +if test "${ac_cv_search_sem_init+set}" = set; then : else ac_cv_search_sem_init=no @@ -8020,7 +8031,7 @@ # check if we need libintl for locale functions { $as_echo "$as_me:${as_lineno-$LINENO}: checking for textdomain in -lintl" >&5 $as_echo_n "checking for textdomain in -lintl... " >&6; } -if ${ac_cv_lib_intl_textdomain+:} false; then : +if test "${ac_cv_lib_intl_textdomain+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -8054,7 +8065,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_intl_textdomain" >&5 $as_echo "$ac_cv_lib_intl_textdomain" >&6; } -if test "x$ac_cv_lib_intl_textdomain" = xyes; then : +if test "x$ac_cv_lib_intl_textdomain" = x""yes; then : $as_echo "#define WITH_LIBINTL 1" >>confdefs.h @@ -8101,7 +8112,7 @@ # Most SVR4 platforms (e.g. Solaris) need -lsocket and -lnsl. { $as_echo "$as_me:${as_lineno-$LINENO}: checking for t_open in -lnsl" >&5 $as_echo_n "checking for t_open in -lnsl... " >&6; } -if ${ac_cv_lib_nsl_t_open+:} false; then : +if test "${ac_cv_lib_nsl_t_open+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -8135,13 +8146,13 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_nsl_t_open" >&5 $as_echo "$ac_cv_lib_nsl_t_open" >&6; } -if test "x$ac_cv_lib_nsl_t_open" = xyes; then : +if test "x$ac_cv_lib_nsl_t_open" = x""yes; then : LIBS="-lnsl $LIBS" fi # SVR4 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for socket in -lsocket" >&5 $as_echo_n "checking for socket in -lsocket... 
" >&6; } -if ${ac_cv_lib_socket_socket+:} false; then : +if test "${ac_cv_lib_socket_socket+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -8175,7 +8186,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_socket_socket" >&5 $as_echo "$ac_cv_lib_socket_socket" >&6; } -if test "x$ac_cv_lib_socket_socket" = xyes; then : +if test "x$ac_cv_lib_socket_socket" = x""yes; then : LIBS="-lsocket $LIBS" fi # SVR4 sockets @@ -8201,7 +8212,7 @@ set dummy ${ac_tool_prefix}pkg-config; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } -if ${ac_cv_path_PKG_CONFIG+:} false; then : +if test "${ac_cv_path_PKG_CONFIG+set}" = set; then : $as_echo_n "(cached) " >&6 else case $PKG_CONFIG in @@ -8244,7 +8255,7 @@ set dummy pkg-config; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } -if ${ac_cv_path_ac_pt_PKG_CONFIG+:} false; then : +if test "${ac_cv_path_ac_pt_PKG_CONFIG+set}" = set; then : $as_echo_n "(cached) " >&6 else case $ac_pt_PKG_CONFIG in @@ -8540,7 +8551,7 @@ LIBS=$_libs ac_fn_c_check_func "$LINENO" "pthread_detach" "ac_cv_func_pthread_detach" -if test "x$ac_cv_func_pthread_detach" = xyes; then : +if test "x$ac_cv_func_pthread_detach" = x""yes; then : $as_echo "#define WITH_THREAD 1" >>confdefs.h posix_threads=yes @@ -8549,7 +8560,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pthread_create in -lpthreads" >&5 $as_echo_n "checking for pthread_create in -lpthreads... 
" >&6; } -if ${ac_cv_lib_pthreads_pthread_create+:} false; then : +if test "${ac_cv_lib_pthreads_pthread_create+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -8583,7 +8594,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_pthreads_pthread_create" >&5 $as_echo "$ac_cv_lib_pthreads_pthread_create" >&6; } -if test "x$ac_cv_lib_pthreads_pthread_create" = xyes; then : +if test "x$ac_cv_lib_pthreads_pthread_create" = x""yes; then : $as_echo "#define WITH_THREAD 1" >>confdefs.h posix_threads=yes @@ -8593,7 +8604,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pthread_create in -lc_r" >&5 $as_echo_n "checking for pthread_create in -lc_r... " >&6; } -if ${ac_cv_lib_c_r_pthread_create+:} false; then : +if test "${ac_cv_lib_c_r_pthread_create+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -8627,7 +8638,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_c_r_pthread_create" >&5 $as_echo "$ac_cv_lib_c_r_pthread_create" >&6; } -if test "x$ac_cv_lib_c_r_pthread_create" = xyes; then : +if test "x$ac_cv_lib_c_r_pthread_create" = x""yes; then : $as_echo "#define WITH_THREAD 1" >>confdefs.h posix_threads=yes @@ -8637,7 +8648,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __pthread_create_system in -lpthread" >&5 $as_echo_n "checking for __pthread_create_system in -lpthread... 
" >&6; } -if ${ac_cv_lib_pthread___pthread_create_system+:} false; then : +if test "${ac_cv_lib_pthread___pthread_create_system+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -8671,7 +8682,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_pthread___pthread_create_system" >&5 $as_echo "$ac_cv_lib_pthread___pthread_create_system" >&6; } -if test "x$ac_cv_lib_pthread___pthread_create_system" = xyes; then : +if test "x$ac_cv_lib_pthread___pthread_create_system" = x""yes; then : $as_echo "#define WITH_THREAD 1" >>confdefs.h posix_threads=yes @@ -8681,7 +8692,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pthread_create in -lcma" >&5 $as_echo_n "checking for pthread_create in -lcma... " >&6; } -if ${ac_cv_lib_cma_pthread_create+:} false; then : +if test "${ac_cv_lib_cma_pthread_create+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -8715,7 +8726,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_cma_pthread_create" >&5 $as_echo "$ac_cv_lib_cma_pthread_create" >&6; } -if test "x$ac_cv_lib_cma_pthread_create" = xyes; then : +if test "x$ac_cv_lib_cma_pthread_create" = x""yes; then : $as_echo "#define WITH_THREAD 1" >>confdefs.h posix_threads=yes @@ -8741,7 +8752,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for usconfig in -lmpc" >&5 $as_echo_n "checking for usconfig in -lmpc... 
" >&6; } -if ${ac_cv_lib_mpc_usconfig+:} false; then : +if test "${ac_cv_lib_mpc_usconfig+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -8775,7 +8786,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_mpc_usconfig" >&5 $as_echo "$ac_cv_lib_mpc_usconfig" >&6; } -if test "x$ac_cv_lib_mpc_usconfig" = xyes; then : +if test "x$ac_cv_lib_mpc_usconfig" = x""yes; then : $as_echo "#define WITH_THREAD 1" >>confdefs.h LIBS="$LIBS -lmpc" @@ -8787,7 +8798,7 @@ if test "$posix_threads" != "yes"; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for thr_create in -lthread" >&5 $as_echo_n "checking for thr_create in -lthread... " >&6; } -if ${ac_cv_lib_thread_thr_create+:} false; then : +if test "${ac_cv_lib_thread_thr_create+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -8821,7 +8832,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_thread_thr_create" >&5 $as_echo "$ac_cv_lib_thread_thr_create" >&6; } -if test "x$ac_cv_lib_thread_thr_create" = xyes; then : +if test "x$ac_cv_lib_thread_thr_create" = x""yes; then : $as_echo "#define WITH_THREAD 1" >>confdefs.h LIBS="$LIBS -lthread" @@ -8857,7 +8868,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking if PTHREAD_SCOPE_SYSTEM is supported" >&5 $as_echo_n "checking if PTHREAD_SCOPE_SYSTEM is supported... 
" >&6; } - if ${ac_cv_pthread_system_supported+:} false; then : + if test "${ac_cv_pthread_system_supported+set}" = set; then : $as_echo_n "(cached) " >&6 else if test "$cross_compiling" = yes; then : @@ -8900,7 +8911,7 @@ for ac_func in pthread_sigmask do : ac_fn_c_check_func "$LINENO" "pthread_sigmask" "ac_cv_func_pthread_sigmask" -if test "x$ac_cv_func_pthread_sigmask" = xyes; then : +if test "x$ac_cv_func_pthread_sigmask" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_PTHREAD_SIGMASK 1 _ACEOF @@ -9292,7 +9303,7 @@ $as_echo "$with_valgrind" >&6; } if test "$with_valgrind" != no; then ac_fn_c_check_header_mongrel "$LINENO" "valgrind/valgrind.h" "ac_cv_header_valgrind_valgrind_h" "$ac_includes_default" -if test "x$ac_cv_header_valgrind_valgrind_h" = xyes; then : +if test "x$ac_cv_header_valgrind_valgrind_h" = x""yes; then : $as_echo "#define WITH_VALGRIND 1" >>confdefs.h @@ -9314,7 +9325,7 @@ for ac_func in dlopen do : ac_fn_c_check_func "$LINENO" "dlopen" "ac_cv_func_dlopen" -if test "x$ac_cv_func_dlopen" = xyes; then : +if test "x$ac_cv_func_dlopen" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_DLOPEN 1 _ACEOF @@ -9648,7 +9659,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for flock declaration" >&5 $as_echo_n "checking for flock declaration... " >&6; } -if ${ac_cv_flock_decl+:} false; then : +if test "${ac_cv_flock_decl+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -9678,7 +9689,7 @@ for ac_func in flock do : ac_fn_c_check_func "$LINENO" "flock" "ac_cv_func_flock" -if test "x$ac_cv_func_flock" = xyes; then : +if test "x$ac_cv_func_flock" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_FLOCK 1 _ACEOF @@ -9686,7 +9697,7 @@ else { $as_echo "$as_me:${as_lineno-$LINENO}: checking for flock in -lbsd" >&5 $as_echo_n "checking for flock in -lbsd... 
" >&6; } -if ${ac_cv_lib_bsd_flock+:} false; then : +if test "${ac_cv_lib_bsd_flock+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -9720,7 +9731,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_bsd_flock" >&5 $as_echo "$ac_cv_lib_bsd_flock" >&6; } -if test "x$ac_cv_lib_bsd_flock" = xyes; then : +if test "x$ac_cv_lib_bsd_flock" = x""yes; then : $as_echo "#define HAVE_FLOCK 1" >>confdefs.h @@ -9797,7 +9808,7 @@ set dummy $ac_prog; ac_word=$2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5 $as_echo_n "checking for $ac_word... " >&6; } -if ${ac_cv_prog_TRUE+:} false; then : +if test "${ac_cv_prog_TRUE+set}" = set; then : $as_echo_n "(cached) " >&6 else if test -n "$TRUE"; then @@ -9837,7 +9848,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for inet_aton in -lc" >&5 $as_echo_n "checking for inet_aton in -lc... " >&6; } -if ${ac_cv_lib_c_inet_aton+:} false; then : +if test "${ac_cv_lib_c_inet_aton+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -9871,12 +9882,12 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_c_inet_aton" >&5 $as_echo "$ac_cv_lib_c_inet_aton" >&6; } -if test "x$ac_cv_lib_c_inet_aton" = xyes; then : +if test "x$ac_cv_lib_c_inet_aton" = x""yes; then : $ac_cv_prog_TRUE else { $as_echo "$as_me:${as_lineno-$LINENO}: checking for inet_aton in -lresolv" >&5 $as_echo_n "checking for inet_aton in -lresolv... 
" >&6; } -if ${ac_cv_lib_resolv_inet_aton+:} false; then : +if test "${ac_cv_lib_resolv_inet_aton+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -9910,7 +9921,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_resolv_inet_aton" >&5 $as_echo "$ac_cv_lib_resolv_inet_aton" >&6; } -if test "x$ac_cv_lib_resolv_inet_aton" = xyes; then : +if test "x$ac_cv_lib_resolv_inet_aton" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_LIBRESOLV 1 _ACEOF @@ -9927,7 +9938,7 @@ # exit Python { $as_echo "$as_me:${as_lineno-$LINENO}: checking for chflags" >&5 $as_echo_n "checking for chflags... " >&6; } -if ${ac_cv_have_chflags+:} false; then : +if test "${ac_cv_have_chflags+set}" = set; then : $as_echo_n "(cached) " >&6 else if test "$cross_compiling" = yes; then : @@ -9961,7 +9972,7 @@ $as_echo "$ac_cv_have_chflags" >&6; } if test "$ac_cv_have_chflags" = cross ; then ac_fn_c_check_func "$LINENO" "chflags" "ac_cv_func_chflags" -if test "x$ac_cv_func_chflags" = xyes; then : +if test "x$ac_cv_func_chflags" = x""yes; then : ac_cv_have_chflags="yes" else ac_cv_have_chflags="no" @@ -9976,7 +9987,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for lchflags" >&5 $as_echo_n "checking for lchflags... " >&6; } -if ${ac_cv_have_lchflags+:} false; then : +if test "${ac_cv_have_lchflags+set}" = set; then : $as_echo_n "(cached) " >&6 else if test "$cross_compiling" = yes; then : @@ -10010,7 +10021,7 @@ $as_echo "$ac_cv_have_lchflags" >&6; } if test "$ac_cv_have_lchflags" = cross ; then ac_fn_c_check_func "$LINENO" "lchflags" "ac_cv_func_lchflags" -if test "x$ac_cv_func_lchflags" = xyes; then : +if test "x$ac_cv_func_lchflags" = x""yes; then : ac_cv_have_lchflags="yes" else ac_cv_have_lchflags="no" @@ -10034,7 +10045,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for inflateCopy in -lz" >&5 $as_echo_n "checking for inflateCopy in -lz... 
" >&6; } -if ${ac_cv_lib_z_inflateCopy+:} false; then : +if test "${ac_cv_lib_z_inflateCopy+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -10068,7 +10079,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_z_inflateCopy" >&5 $as_echo "$ac_cv_lib_z_inflateCopy" >&6; } -if test "x$ac_cv_lib_z_inflateCopy" = xyes; then : +if test "x$ac_cv_lib_z_inflateCopy" = x""yes; then : $as_echo "#define HAVE_ZLIB_COPY 1" >>confdefs.h @@ -10211,7 +10222,7 @@ for ac_func in openpty do : ac_fn_c_check_func "$LINENO" "openpty" "ac_cv_func_openpty" -if test "x$ac_cv_func_openpty" = xyes; then : +if test "x$ac_cv_func_openpty" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_OPENPTY 1 _ACEOF @@ -10219,7 +10230,7 @@ else { $as_echo "$as_me:${as_lineno-$LINENO}: checking for openpty in -lutil" >&5 $as_echo_n "checking for openpty in -lutil... " >&6; } -if ${ac_cv_lib_util_openpty+:} false; then : +if test "${ac_cv_lib_util_openpty+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -10253,13 +10264,13 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_util_openpty" >&5 $as_echo "$ac_cv_lib_util_openpty" >&6; } -if test "x$ac_cv_lib_util_openpty" = xyes; then : +if test "x$ac_cv_lib_util_openpty" = x""yes; then : $as_echo "#define HAVE_OPENPTY 1" >>confdefs.h LIBS="$LIBS -lutil" else { $as_echo "$as_me:${as_lineno-$LINENO}: checking for openpty in -lbsd" >&5 $as_echo_n "checking for openpty in -lbsd... 
" >&6; } -if ${ac_cv_lib_bsd_openpty+:} false; then : +if test "${ac_cv_lib_bsd_openpty+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -10293,7 +10304,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_bsd_openpty" >&5 $as_echo "$ac_cv_lib_bsd_openpty" >&6; } -if test "x$ac_cv_lib_bsd_openpty" = xyes; then : +if test "x$ac_cv_lib_bsd_openpty" = x""yes; then : $as_echo "#define HAVE_OPENPTY 1" >>confdefs.h LIBS="$LIBS -lbsd" fi @@ -10308,7 +10319,7 @@ for ac_func in forkpty do : ac_fn_c_check_func "$LINENO" "forkpty" "ac_cv_func_forkpty" -if test "x$ac_cv_func_forkpty" = xyes; then : +if test "x$ac_cv_func_forkpty" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_FORKPTY 1 _ACEOF @@ -10316,7 +10327,7 @@ else { $as_echo "$as_me:${as_lineno-$LINENO}: checking for forkpty in -lutil" >&5 $as_echo_n "checking for forkpty in -lutil... " >&6; } -if ${ac_cv_lib_util_forkpty+:} false; then : +if test "${ac_cv_lib_util_forkpty+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -10350,13 +10361,13 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_util_forkpty" >&5 $as_echo "$ac_cv_lib_util_forkpty" >&6; } -if test "x$ac_cv_lib_util_forkpty" = xyes; then : +if test "x$ac_cv_lib_util_forkpty" = x""yes; then : $as_echo "#define HAVE_FORKPTY 1" >>confdefs.h LIBS="$LIBS -lutil" else { $as_echo "$as_me:${as_lineno-$LINENO}: checking for forkpty in -lbsd" >&5 $as_echo_n "checking for forkpty in -lbsd... 
" >&6; } -if ${ac_cv_lib_bsd_forkpty+:} false; then : +if test "${ac_cv_lib_bsd_forkpty+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -10390,7 +10401,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_bsd_forkpty" >&5 $as_echo "$ac_cv_lib_bsd_forkpty" >&6; } -if test "x$ac_cv_lib_bsd_forkpty" = xyes; then : +if test "x$ac_cv_lib_bsd_forkpty" = x""yes; then : $as_echo "#define HAVE_FORKPTY 1" >>confdefs.h LIBS="$LIBS -lbsd" fi @@ -10407,7 +10418,7 @@ for ac_func in memmove do : ac_fn_c_check_func "$LINENO" "memmove" "ac_cv_func_memmove" -if test "x$ac_cv_func_memmove" = xyes; then : +if test "x$ac_cv_func_memmove" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_MEMMOVE 1 _ACEOF @@ -10431,7 +10442,7 @@ ac_fn_c_check_func "$LINENO" "dup2" "ac_cv_func_dup2" -if test "x$ac_cv_func_dup2" = xyes; then : +if test "x$ac_cv_func_dup2" = x""yes; then : $as_echo "#define HAVE_DUP2 1" >>confdefs.h else @@ -10444,7 +10455,7 @@ fi ac_fn_c_check_func "$LINENO" "getcwd" "ac_cv_func_getcwd" -if test "x$ac_cv_func_getcwd" = xyes; then : +if test "x$ac_cv_func_getcwd" = x""yes; then : $as_echo "#define HAVE_GETCWD 1" >>confdefs.h else @@ -10457,7 +10468,7 @@ fi ac_fn_c_check_func "$LINENO" "strdup" "ac_cv_func_strdup" -if test "x$ac_cv_func_strdup" = xyes; then : +if test "x$ac_cv_func_strdup" = x""yes; then : $as_echo "#define HAVE_STRDUP 1" >>confdefs.h else @@ -10473,7 +10484,7 @@ for ac_func in getpgrp do : ac_fn_c_check_func "$LINENO" "getpgrp" "ac_cv_func_getpgrp" -if test "x$ac_cv_func_getpgrp" = xyes; then : +if test "x$ac_cv_func_getpgrp" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_GETPGRP 1 _ACEOF @@ -10501,7 +10512,7 @@ for ac_func in setpgrp do : ac_fn_c_check_func "$LINENO" "setpgrp" "ac_cv_func_setpgrp" -if test "x$ac_cv_func_setpgrp" = xyes; then : +if test "x$ac_cv_func_setpgrp" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_SETPGRP 1 _ACEOF @@ -10529,7 +10540,7 @@ for ac_func in 
gettimeofday do : ac_fn_c_check_func "$LINENO" "gettimeofday" "ac_cv_func_gettimeofday" -if test "x$ac_cv_func_gettimeofday" = xyes; then : +if test "x$ac_cv_func_gettimeofday" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_GETTIMEOFDAY 1 _ACEOF @@ -10631,7 +10642,7 @@ then { $as_echo "$as_me:${as_lineno-$LINENO}: checking getaddrinfo bug" >&5 $as_echo_n "checking getaddrinfo bug... " >&6; } - if ${ac_cv_buggy_getaddrinfo+:} false; then : + if test "${ac_cv_buggy_getaddrinfo+set}" = set; then : $as_echo_n "(cached) " >&6 else if test "$cross_compiling" = yes; then : @@ -10760,7 +10771,7 @@ for ac_func in getnameinfo do : ac_fn_c_check_func "$LINENO" "getnameinfo" "ac_cv_func_getnameinfo" -if test "x$ac_cv_func_getnameinfo" = xyes; then : +if test "x$ac_cv_func_getnameinfo" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_GETNAMEINFO 1 _ACEOF @@ -10772,7 +10783,7 @@ # checks for structures { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether time.h and sys/time.h may both be included" >&5 $as_echo_n "checking whether time.h and sys/time.h may both be included... " >&6; } -if ${ac_cv_header_time+:} false; then : +if test "${ac_cv_header_time+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -10807,7 +10818,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether struct tm is in sys/time.h or time.h" >&5 $as_echo_n "checking whether struct tm is in sys/time.h or time.h... 
" >&6; } -if ${ac_cv_struct_tm+:} false; then : +if test "${ac_cv_struct_tm+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -10844,7 +10855,7 @@ #include <$ac_cv_struct_tm> " -if test "x$ac_cv_member_struct_tm_tm_zone" = xyes; then : +if test "x$ac_cv_member_struct_tm_tm_zone" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_STRUCT_TM_TM_ZONE 1 @@ -10860,7 +10871,7 @@ else ac_fn_c_check_decl "$LINENO" "tzname" "ac_cv_have_decl_tzname" "#include " -if test "x$ac_cv_have_decl_tzname" = xyes; then : +if test "x$ac_cv_have_decl_tzname" = x""yes; then : ac_have_decl=1 else ac_have_decl=0 @@ -10872,7 +10883,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for tzname" >&5 $as_echo_n "checking for tzname... " >&6; } -if ${ac_cv_var_tzname+:} false; then : +if test "${ac_cv_var_tzname+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -10908,7 +10919,7 @@ fi ac_fn_c_check_member "$LINENO" "struct stat" "st_rdev" "ac_cv_member_struct_stat_st_rdev" "$ac_includes_default" -if test "x$ac_cv_member_struct_stat_st_rdev" = xyes; then : +if test "x$ac_cv_member_struct_stat_st_rdev" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_STRUCT_STAT_ST_RDEV 1 @@ -10918,7 +10929,7 @@ fi ac_fn_c_check_member "$LINENO" "struct stat" "st_blksize" "ac_cv_member_struct_stat_st_blksize" "$ac_includes_default" -if test "x$ac_cv_member_struct_stat_st_blksize" = xyes; then : +if test "x$ac_cv_member_struct_stat_st_blksize" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_STRUCT_STAT_ST_BLKSIZE 1 @@ -10928,7 +10939,7 @@ fi ac_fn_c_check_member "$LINENO" "struct stat" "st_flags" "ac_cv_member_struct_stat_st_flags" "$ac_includes_default" -if test "x$ac_cv_member_struct_stat_st_flags" = xyes; then : +if test "x$ac_cv_member_struct_stat_st_flags" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_STRUCT_STAT_ST_FLAGS 1 @@ -10938,7 +10949,7 @@ fi ac_fn_c_check_member "$LINENO" 
"struct stat" "st_gen" "ac_cv_member_struct_stat_st_gen" "$ac_includes_default" -if test "x$ac_cv_member_struct_stat_st_gen" = xyes; then : +if test "x$ac_cv_member_struct_stat_st_gen" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_STRUCT_STAT_ST_GEN 1 @@ -10948,7 +10959,7 @@ fi ac_fn_c_check_member "$LINENO" "struct stat" "st_birthtime" "ac_cv_member_struct_stat_st_birthtime" "$ac_includes_default" -if test "x$ac_cv_member_struct_stat_st_birthtime" = xyes; then : +if test "x$ac_cv_member_struct_stat_st_birthtime" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_STRUCT_STAT_ST_BIRTHTIME 1 @@ -10958,7 +10969,7 @@ fi ac_fn_c_check_member "$LINENO" "struct stat" "st_blocks" "ac_cv_member_struct_stat_st_blocks" "$ac_includes_default" -if test "x$ac_cv_member_struct_stat_st_blocks" = xyes; then : +if test "x$ac_cv_member_struct_stat_st_blocks" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_STRUCT_STAT_ST_BLOCKS 1 @@ -10980,7 +10991,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for time.h that defines altzone" >&5 $as_echo_n "checking for time.h that defines altzone... " >&6; } -if ${ac_cv_header_time_altzone+:} false; then : +if test "${ac_cv_header_time_altzone+set}" = set; then : $as_echo_n "(cached) " >&6 else @@ -11044,7 +11055,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for addrinfo" >&5 $as_echo_n "checking for addrinfo... " >&6; } -if ${ac_cv_struct_addrinfo+:} false; then : +if test "${ac_cv_struct_addrinfo+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -11076,7 +11087,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for sockaddr_storage" >&5 $as_echo_n "checking for sockaddr_storage... 
" >&6; } -if ${ac_cv_struct_sockaddr_storage+:} false; then : +if test "${ac_cv_struct_sockaddr_storage+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -11112,7 +11123,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether char is unsigned" >&5 $as_echo_n "checking whether char is unsigned... " >&6; } -if ${ac_cv_c_char_unsigned+:} false; then : +if test "${ac_cv_c_char_unsigned+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -11144,7 +11155,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for an ANSI C-conforming const" >&5 $as_echo_n "checking for an ANSI C-conforming const... " >&6; } -if ${ac_cv_c_const+:} false; then : +if test "${ac_cv_c_const+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -11432,7 +11443,7 @@ ac_fn_c_check_func "$LINENO" "gethostbyname_r" "ac_cv_func_gethostbyname_r" -if test "x$ac_cv_func_gethostbyname_r" = xyes; then : +if test "x$ac_cv_func_gethostbyname_r" = x""yes; then : $as_echo "#define HAVE_GETHOSTBYNAME_R 1" >>confdefs.h @@ -11563,7 +11574,7 @@ for ac_func in gethostbyname do : ac_fn_c_check_func "$LINENO" "gethostbyname" "ac_cv_func_gethostbyname" -if test "x$ac_cv_func_gethostbyname" = xyes; then : +if test "x$ac_cv_func_gethostbyname" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_GETHOSTBYNAME 1 _ACEOF @@ -11585,12 +11596,12 @@ # Linux requires this for correct f.p. operations ac_fn_c_check_func "$LINENO" "__fpu_control" "ac_cv_func___fpu_control" -if test "x$ac_cv_func___fpu_control" = xyes; then : +if test "x$ac_cv_func___fpu_control" = x""yes; then : else { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __fpu_control in -lieee" >&5 $as_echo_n "checking for __fpu_control in -lieee... 
" >&6; } -if ${ac_cv_lib_ieee___fpu_control+:} false; then : +if test "${ac_cv_lib_ieee___fpu_control+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -11624,7 +11635,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_ieee___fpu_control" >&5 $as_echo "$ac_cv_lib_ieee___fpu_control" >&6; } -if test "x$ac_cv_lib_ieee___fpu_control" = xyes; then : +if test "x$ac_cv_lib_ieee___fpu_control" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_LIBIEEE 1 _ACEOF @@ -11718,7 +11729,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether C doubles are little-endian IEEE 754 binary64" >&5 $as_echo_n "checking whether C doubles are little-endian IEEE 754 binary64... " >&6; } -if ${ac_cv_little_endian_double+:} false; then : +if test "${ac_cv_little_endian_double+set}" = set; then : $as_echo_n "(cached) " >&6 else @@ -11760,7 +11771,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether C doubles are big-endian IEEE 754 binary64" >&5 $as_echo_n "checking whether C doubles are big-endian IEEE 754 binary64... " >&6; } -if ${ac_cv_big_endian_double+:} false; then : +if test "${ac_cv_big_endian_double+set}" = set; then : $as_echo_n "(cached) " >&6 else @@ -11806,7 +11817,7 @@ # conversions work. { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether C doubles are ARM mixed-endian IEEE 754 binary64" >&5 $as_echo_n "checking whether C doubles are ARM mixed-endian IEEE 754 binary64... 
" >&6; } -if ${ac_cv_mixed_endian_double+:} false; then : +if test "${ac_cv_mixed_endian_double+set}" = set; then : $as_echo_n "(cached) " >&6 else @@ -11976,7 +11987,7 @@ ac_fn_c_check_decl "$LINENO" "isinf" "ac_cv_have_decl_isinf" "#include " -if test "x$ac_cv_have_decl_isinf" = xyes; then : +if test "x$ac_cv_have_decl_isinf" = x""yes; then : ac_have_decl=1 else ac_have_decl=0 @@ -11987,7 +11998,7 @@ _ACEOF ac_fn_c_check_decl "$LINENO" "isnan" "ac_cv_have_decl_isnan" "#include " -if test "x$ac_cv_have_decl_isnan" = xyes; then : +if test "x$ac_cv_have_decl_isnan" = x""yes; then : ac_have_decl=1 else ac_have_decl=0 @@ -11998,7 +12009,7 @@ _ACEOF ac_fn_c_check_decl "$LINENO" "isfinite" "ac_cv_have_decl_isfinite" "#include " -if test "x$ac_cv_have_decl_isfinite" = xyes; then : +if test "x$ac_cv_have_decl_isfinite" = x""yes; then : ac_have_decl=1 else ac_have_decl=0 @@ -12013,7 +12024,7 @@ # -0. on some architectures. { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether tanh preserves the sign of zero" >&5 $as_echo_n "checking whether tanh preserves the sign of zero... " >&6; } -if ${ac_cv_tanh_preserves_zero_sign+:} false; then : +if test "${ac_cv_tanh_preserves_zero_sign+set}" = set; then : $as_echo_n "(cached) " >&6 else @@ -12061,7 +12072,7 @@ # -0. See issue #9920. { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether log1p drops the sign of negative zero" >&5 $as_echo_n "checking whether log1p drops the sign of negative zero... " >&6; } - if ${ac_cv_log1p_drops_zero_sign+:} false; then : + if test "${ac_cv_log1p_drops_zero_sign+set}" = set; then : $as_echo_n "(cached) " >&6 else @@ -12113,7 +12124,7 @@ # sem_open results in a 'Signal 12' error. { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether POSIX semaphores are enabled" >&5 $as_echo_n "checking whether POSIX semaphores are enabled... 
" >&6; } -if ${ac_cv_posix_semaphores_enabled+:} false; then : +if test "${ac_cv_posix_semaphores_enabled+set}" = set; then : $as_echo_n "(cached) " >&6 else if test "$cross_compiling" = yes; then : @@ -12164,7 +12175,7 @@ # Multiprocessing check for broken sem_getvalue { $as_echo "$as_me:${as_lineno-$LINENO}: checking for broken sem_getvalue" >&5 $as_echo_n "checking for broken sem_getvalue... " >&6; } -if ${ac_cv_broken_sem_getvalue+:} false; then : +if test "${ac_cv_broken_sem_getvalue+set}" = set; then : $as_echo_n "(cached) " >&6 else if test "$cross_compiling" = yes; then : @@ -12229,7 +12240,7 @@ 15|30) ;; *) - as_fn_error $? "bad value $enable_big_digits for --enable-big-digits; value should be 15 or 30" "$LINENO" 5 ;; + as_fn_error $? "bad value $enable_big_digits for --enable-big-digits; value should be 15 or 30" "$LINENO" 5 ;; esac { $as_echo "$as_me:${as_lineno-$LINENO}: result: $enable_big_digits" >&5 $as_echo "$enable_big_digits" >&6; } @@ -12247,7 +12258,7 @@ # check for wchar.h ac_fn_c_check_header_mongrel "$LINENO" "wchar.h" "ac_cv_header_wchar_h" "$ac_includes_default" -if test "x$ac_cv_header_wchar_h" = xyes; then : +if test "x$ac_cv_header_wchar_h" = x""yes; then : $as_echo "#define HAVE_WCHAR_H 1" >>confdefs.h @@ -12270,7 +12281,7 @@ # This bug is HP SR number 8606223364. { $as_echo "$as_me:${as_lineno-$LINENO}: checking size of wchar_t" >&5 $as_echo_n "checking size of wchar_t... 
" >&6; } -if ${ac_cv_sizeof_wchar_t+:} false; then : +if test "${ac_cv_sizeof_wchar_t+set}" = set; then : $as_echo_n "(cached) " >&6 else if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (wchar_t))" "ac_cv_sizeof_wchar_t" "#include @@ -12281,7 +12292,7 @@ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 $as_echo "$as_me: error: in \`$ac_pwd':" >&2;} as_fn_error 77 "cannot compute sizeof (wchar_t) -See \`config.log' for more details" "$LINENO" 5; } +See \`config.log' for more details" "$LINENO" 5 ; } else ac_cv_sizeof_wchar_t=0 fi @@ -12336,7 +12347,7 @@ # check whether wchar_t is signed or not { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether wchar_t is signed" >&5 $as_echo_n "checking whether wchar_t is signed... " >&6; } - if ${ac_cv_wchar_t_signed+:} false; then : + if test "${ac_cv_wchar_t_signed+set}" = set; then : $as_echo_n "(cached) " >&6 else @@ -12386,7 +12397,7 @@ # check for endianness { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether byte ordering is bigendian" >&5 $as_echo_n "checking whether byte ordering is bigendian... " >&6; } -if ${ac_cv_c_bigendian+:} false; then : +if test "${ac_cv_c_bigendian+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_cv_c_bigendian=unknown @@ -12605,7 +12616,7 @@ ;; #( *) as_fn_error $? "unknown endianness - presetting ac_cv_c_bigendian=no (or yes) will help" "$LINENO" 5 ;; + presetting ac_cv_c_bigendian=no (or yes) will help" "$LINENO" 5 ;; esac @@ -12677,7 +12688,7 @@ # or fills with zeros (like the Cray J90, according to Tim Peters). { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether right shift extends the sign bit" >&5 $as_echo_n "checking whether right shift extends the sign bit... 
" >&6; } -if ${ac_cv_rshift_extends_sign+:} false; then : +if test "${ac_cv_rshift_extends_sign+set}" = set; then : $as_echo_n "(cached) " >&6 else @@ -12716,7 +12727,7 @@ # check for getc_unlocked and related locking functions { $as_echo "$as_me:${as_lineno-$LINENO}: checking for getc_unlocked() and friends" >&5 $as_echo_n "checking for getc_unlocked() and friends... " >&6; } -if ${ac_cv_have_getc_unlocked+:} false; then : +if test "${ac_cv_have_getc_unlocked+set}" = set; then : $as_echo_n "(cached) " >&6 else @@ -12814,7 +12825,7 @@ # check for readline 2.1 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for rl_callback_handler_install in -lreadline" >&5 $as_echo_n "checking for rl_callback_handler_install in -lreadline... " >&6; } -if ${ac_cv_lib_readline_rl_callback_handler_install+:} false; then : +if test "${ac_cv_lib_readline_rl_callback_handler_install+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -12848,7 +12859,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_readline_rl_callback_handler_install" >&5 $as_echo "$ac_cv_lib_readline_rl_callback_handler_install" >&6; } -if test "x$ac_cv_lib_readline_rl_callback_handler_install" = xyes; then : +if test "x$ac_cv_lib_readline_rl_callback_handler_install" = x""yes; then : $as_echo "#define HAVE_RL_CALLBACK 1" >>confdefs.h @@ -12900,7 +12911,7 @@ # check for readline 4.0 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for rl_pre_input_hook in -lreadline" >&5 $as_echo_n "checking for rl_pre_input_hook in -lreadline... 
" >&6; } -if ${ac_cv_lib_readline_rl_pre_input_hook+:} false; then : +if test "${ac_cv_lib_readline_rl_pre_input_hook+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -12934,7 +12945,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_readline_rl_pre_input_hook" >&5 $as_echo "$ac_cv_lib_readline_rl_pre_input_hook" >&6; } -if test "x$ac_cv_lib_readline_rl_pre_input_hook" = xyes; then : +if test "x$ac_cv_lib_readline_rl_pre_input_hook" = x""yes; then : $as_echo "#define HAVE_RL_PRE_INPUT_HOOK 1" >>confdefs.h @@ -12944,7 +12955,7 @@ # also in 4.0 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for rl_completion_display_matches_hook in -lreadline" >&5 $as_echo_n "checking for rl_completion_display_matches_hook in -lreadline... " >&6; } -if ${ac_cv_lib_readline_rl_completion_display_matches_hook+:} false; then : +if test "${ac_cv_lib_readline_rl_completion_display_matches_hook+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -12978,7 +12989,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_readline_rl_completion_display_matches_hook" >&5 $as_echo "$ac_cv_lib_readline_rl_completion_display_matches_hook" >&6; } -if test "x$ac_cv_lib_readline_rl_completion_display_matches_hook" = xyes; then : +if test "x$ac_cv_lib_readline_rl_completion_display_matches_hook" = x""yes; then : $as_echo "#define HAVE_RL_COMPLETION_DISPLAY_MATCHES_HOOK 1" >>confdefs.h @@ -12988,7 +12999,7 @@ # check for readline 4.2 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for rl_completion_matches in -lreadline" >&5 $as_echo_n "checking for rl_completion_matches in -lreadline... 
" >&6; } -if ${ac_cv_lib_readline_rl_completion_matches+:} false; then : +if test "${ac_cv_lib_readline_rl_completion_matches+set}" = set; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -13022,7 +13033,7 @@ fi { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_readline_rl_completion_matches" >&5 $as_echo "$ac_cv_lib_readline_rl_completion_matches" >&6; } -if test "x$ac_cv_lib_readline_rl_completion_matches" = xyes; then : +if test "x$ac_cv_lib_readline_rl_completion_matches" = x""yes; then : $as_echo "#define HAVE_RL_COMPLETION_MATCHES 1" >>confdefs.h @@ -13063,7 +13074,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for broken nice()" >&5 $as_echo_n "checking for broken nice()... " >&6; } -if ${ac_cv_broken_nice+:} false; then : +if test "${ac_cv_broken_nice+set}" = set; then : $as_echo_n "(cached) " >&6 else @@ -13104,7 +13115,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for broken poll()" >&5 $as_echo_n "checking for broken poll()... " >&6; } -if ${ac_cv_broken_poll+:} false; then : +if test "${ac_cv_broken_poll+set}" = set; then : $as_echo_n "(cached) " >&6 else if test "$cross_compiling" = yes; then : @@ -13159,7 +13170,7 @@ #include <$ac_cv_struct_tm> " -if test "x$ac_cv_member_struct_tm_tm_zone" = xyes; then : +if test "x$ac_cv_member_struct_tm_tm_zone" = x""yes; then : cat >>confdefs.h <<_ACEOF #define HAVE_STRUCT_TM_TM_ZONE 1 @@ -13175,7 +13186,7 @@ else ac_fn_c_check_decl "$LINENO" "tzname" "ac_cv_have_decl_tzname" "#include " -if test "x$ac_cv_have_decl_tzname" = xyes; then : +if test "x$ac_cv_have_decl_tzname" = x""yes; then : ac_have_decl=1 else ac_have_decl=0 @@ -13187,7 +13198,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for tzname" >&5 $as_echo_n "checking for tzname... 
" >&6; } -if ${ac_cv_var_tzname+:} false; then : +if test "${ac_cv_var_tzname+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -13226,7 +13237,7 @@ # check tzset(3) exists and works like we expect it to { $as_echo "$as_me:${as_lineno-$LINENO}: checking for working tzset()" >&5 $as_echo_n "checking for working tzset()... " >&6; } -if ${ac_cv_working_tzset+:} false; then : +if test "${ac_cv_working_tzset+set}" = set; then : $as_echo_n "(cached) " >&6 else @@ -13323,7 +13334,7 @@ # Look for subsecond timestamps in struct stat { $as_echo "$as_me:${as_lineno-$LINENO}: checking for tv_nsec in struct stat" >&5 $as_echo_n "checking for tv_nsec in struct stat... " >&6; } -if ${ac_cv_stat_tv_nsec+:} false; then : +if test "${ac_cv_stat_tv_nsec+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -13360,7 +13371,7 @@ # Look for BSD style subsecond timestamps in struct stat { $as_echo "$as_me:${as_lineno-$LINENO}: checking for tv_nsec2 in struct stat" >&5 $as_echo_n "checking for tv_nsec2 in struct stat... " >&6; } -if ${ac_cv_stat_tv_nsec2+:} false; then : +if test "${ac_cv_stat_tv_nsec2+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -13397,7 +13408,7 @@ # On HP/UX 11.0, mvwdelch is a block with a return statement { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether mvwdelch is an expression" >&5 $as_echo_n "checking whether mvwdelch is an expression... " >&6; } -if ${ac_cv_mvwdelch_is_expression+:} false; then : +if test "${ac_cv_mvwdelch_is_expression+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -13434,7 +13445,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether WINDOW has _flags" >&5 $as_echo_n "checking whether WINDOW has _flags... 
" >&6; } -if ${ac_cv_window_has_flags+:} false; then : +if test "${ac_cv_window_has_flags+set}" = set; then : $as_echo_n "(cached) " >&6 else cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -13582,7 +13593,7 @@ then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for %lld and %llu printf() format support" >&5 $as_echo_n "checking for %lld and %llu printf() format support... " >&6; } - if ${ac_cv_have_long_long_format+:} false; then : + if test "${ac_cv_have_long_long_format+set}" = set; then : $as_echo_n "(cached) " >&6 else if test "$cross_compiling" = yes; then : @@ -13652,7 +13663,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for %zd printf() format support" >&5 $as_echo_n "checking for %zd printf() format support... " >&6; } -if ${ac_cv_have_size_t_format+:} false; then : +if test "${ac_cv_have_size_t_format+set}" = set; then : $as_echo_n "(cached) " >&6 else if test "$cross_compiling" = yes; then : @@ -13725,7 +13736,7 @@ #endif " -if test "x$ac_cv_type_socklen_t" = xyes; then : +if test "x$ac_cv_type_socklen_t" = x""yes; then : else @@ -13736,7 +13747,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for broken mbstowcs" >&5 $as_echo_n "checking for broken mbstowcs... " >&6; } -if ${ac_cv_broken_mbstowcs+:} false; then : +if test "${ac_cv_broken_mbstowcs+set}" = set; then : $as_echo_n "(cached) " >&6 else if test "$cross_compiling" = yes; then : @@ -13776,7 +13787,7 @@ { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports computed gotos" >&5 $as_echo_n "checking whether $CC supports computed gotos... 
" >&6; } -if ${ac_cv_computed_gotos+:} false; then : +if test "${ac_cv_computed_gotos+set}" = set; then : $as_echo_n "(cached) " >&6 else if test "$cross_compiling" = yes; then : @@ -13943,21 +13954,10 @@ :end' >>confcache if diff "$cache_file" confcache >/dev/null 2>&1; then :; else if test -w "$cache_file"; then - if test "x$cache_file" != "x/dev/null"; then + test "x$cache_file" != "x/dev/null" && { $as_echo "$as_me:${as_lineno-$LINENO}: updating cache $cache_file" >&5 $as_echo "$as_me: updating cache $cache_file" >&6;} - if test ! -f "$cache_file" || test -h "$cache_file"; then - cat confcache >"$cache_file" - else - case $cache_file in #( - */* | ?:*) - mv -f confcache "$cache_file"$$ && - mv -f "$cache_file"$$ "$cache_file" ;; #( - *) - mv -f confcache "$cache_file" ;; - esac - fi - fi + cat confcache >$cache_file else { $as_echo "$as_me:${as_lineno-$LINENO}: not updating unwritable cache $cache_file" >&5 $as_echo "$as_me: not updating unwritable cache $cache_file" >&6;} @@ -13990,7 +13990,7 @@ -: "${CONFIG_STATUS=./config.status}" +: ${CONFIG_STATUS=./config.status} ac_write_fail=0 ac_clean_files_save=$ac_clean_files ac_clean_files="$ac_clean_files $CONFIG_STATUS" @@ -14091,7 +14091,6 @@ IFS=" "" $as_nl" # Find who we are. Look in the path if we contain no directory separator. -as_myself= case $0 in #(( *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR @@ -14399,7 +14398,7 @@ # values after options handling. ac_log=" This file was extended by python $as_me 3.3, which was -generated by GNU Autoconf 2.68. Invocation command line was +generated by GNU Autoconf 2.67. 
Invocation command line was CONFIG_FILES = $CONFIG_FILES CONFIG_HEADERS = $CONFIG_HEADERS @@ -14461,7 +14460,7 @@ ac_cs_config="`$as_echo "$ac_configure_args" | sed 's/^ //; s/[\\""\`\$]/\\\\&/g'`" ac_cs_version="\\ python config.status 3.3 -configured by $0, generated by GNU Autoconf 2.68, +configured by $0, generated by GNU Autoconf 2.67, with options \\"\$ac_cs_config\\" Copyright (C) 2010 Free Software Foundation, Inc. @@ -14592,7 +14591,7 @@ "Misc/python.pc") CONFIG_FILES="$CONFIG_FILES Misc/python.pc" ;; "Modules/ld_so_aix") CONFIG_FILES="$CONFIG_FILES Modules/ld_so_aix" ;; - *) as_fn_error $? "invalid argument: \`$ac_config_target'" "$LINENO" 5;; + *) as_fn_error $? "invalid argument: \`$ac_config_target'" "$LINENO" 5 ;; esac done @@ -14614,10 +14613,9 @@ # after its creation but before its name has been assigned to `$tmp'. $debug || { - tmp= ac_tmp= + tmp= trap 'exit_status=$? - : "${ac_tmp:=$tmp}" - { test ! -d "$ac_tmp" || rm -fr "$ac_tmp"; } && exit $exit_status + { test -z "$tmp" || test ! -d "$tmp" || rm -fr "$tmp"; } && exit $exit_status ' 0 trap 'as_fn_exit 1' 1 2 13 15 } @@ -14625,13 +14623,12 @@ { tmp=`(umask 077 && mktemp -d "./confXXXXXX") 2>/dev/null` && - test -d "$tmp" + test -n "$tmp" && test -d "$tmp" } || { tmp=./conf$$-$RANDOM (umask 077 && mkdir "$tmp") } || as_fn_error $? "cannot create a temporary directory in ." "$LINENO" 5 -ac_tmp=$tmp # Set up the scripts for CONFIG_FILES section. # No need to generate them if there are no CONFIG_FILES. 
@@ -14653,7 +14650,7 @@ ac_cs_awk_cr=$ac_cr fi -echo 'BEGIN {' >"$ac_tmp/subs1.awk" && +echo 'BEGIN {' >"$tmp/subs1.awk" && _ACEOF @@ -14681,7 +14678,7 @@ rm -f conf$$subs.sh cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 -cat >>"\$ac_tmp/subs1.awk" <<\\_ACAWK && +cat >>"\$tmp/subs1.awk" <<\\_ACAWK && _ACEOF sed -n ' h @@ -14729,7 +14726,7 @@ rm -f conf$$subs.awk cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1 _ACAWK -cat >>"\$ac_tmp/subs1.awk" <<_ACAWK && +cat >>"\$tmp/subs1.awk" <<_ACAWK && for (key in S) S_is_set[key] = 1 FS = "" @@ -14761,7 +14758,7 @@ sed "s/$ac_cr\$//; s/$ac_cr/$ac_cs_awk_cr/g" else cat -fi < "$ac_tmp/subs1.awk" > "$ac_tmp/subs.awk" \ +fi < "$tmp/subs1.awk" > "$tmp/subs.awk" \ || as_fn_error $? "could not setup config files machinery" "$LINENO" 5 _ACEOF @@ -14795,7 +14792,7 @@ # No need to generate them if there are no CONFIG_HEADERS. # This happens for instance with `./config.status Makefile'. if test -n "$CONFIG_HEADERS"; then -cat >"$ac_tmp/defines.awk" <<\_ACAWK || +cat >"$tmp/defines.awk" <<\_ACAWK || BEGIN { _ACEOF @@ -14807,8 +14804,8 @@ # handling of long lines. ac_delim='%!_!# ' for ac_last_try in false false :; do - ac_tt=`sed -n "/$ac_delim/p" confdefs.h` - if test -z "$ac_tt"; then + ac_t=`sed -n "/$ac_delim/p" confdefs.h` + if test -z "$ac_t"; then break elif $ac_last_try; then as_fn_error $? "could not make $CONFIG_HEADERS" "$LINENO" 5 @@ -14909,7 +14906,7 @@ esac case $ac_mode$ac_tag in :[FHL]*:*);; - :L* | :C*:*) as_fn_error $? "invalid tag \`$ac_tag'" "$LINENO" 5;; + :L* | :C*:*) as_fn_error $? "invalid tag \`$ac_tag'" "$LINENO" 5 ;; :[FH]-) ac_tag=-:-;; :[FH]*) ac_tag=$ac_tag:$ac_tag.in;; esac @@ -14928,7 +14925,7 @@ for ac_f do case $ac_f in - -) ac_f="$ac_tmp/stdin";; + -) ac_f="$tmp/stdin";; *) # Look for the file first in the build tree, then in the source tree # (if the path is not absolute). The absolute path cannot be DOS-style, # because $ac_f cannot contain `:'. 
@@ -14937,7 +14934,7 @@ [\\/$]*) false;; *) test -f "$srcdir/$ac_f" && ac_f="$srcdir/$ac_f";; esac || - as_fn_error 1 "cannot find input file: \`$ac_f'" "$LINENO" 5;; + as_fn_error 1 "cannot find input file: \`$ac_f'" "$LINENO" 5 ;; esac case $ac_f in *\'*) ac_f=`$as_echo "$ac_f" | sed "s/'/'\\\\\\\\''/g"`;; esac as_fn_append ac_file_inputs " '$ac_f'" @@ -14963,8 +14960,8 @@ esac case $ac_tag in - *:-:* | *:-) cat >"$ac_tmp/stdin" \ - || as_fn_error $? "could not create $ac_file" "$LINENO" 5 ;; + *:-:* | *:-) cat >"$tmp/stdin" \ + || as_fn_error $? "could not create $ac_file" "$LINENO" 5 ;; esac ;; esac @@ -15094,22 +15091,21 @@ s&@INSTALL@&$ac_INSTALL&;t t $ac_datarootdir_hack " -eval sed \"\$ac_sed_extra\" "$ac_file_inputs" | $AWK -f "$ac_tmp/subs.awk" \ - >$ac_tmp/out || as_fn_error $? "could not create $ac_file" "$LINENO" 5 +eval sed \"\$ac_sed_extra\" "$ac_file_inputs" | $AWK -f "$tmp/subs.awk" >$tmp/out \ + || as_fn_error $? "could not create $ac_file" "$LINENO" 5 test -z "$ac_datarootdir_hack$ac_datarootdir_seen" && - { ac_out=`sed -n '/\${datarootdir}/p' "$ac_tmp/out"`; test -n "$ac_out"; } && - { ac_out=`sed -n '/^[ ]*datarootdir[ ]*:*=/p' \ - "$ac_tmp/out"`; test -z "$ac_out"; } && + { ac_out=`sed -n '/\${datarootdir}/p' "$tmp/out"`; test -n "$ac_out"; } && + { ac_out=`sed -n '/^[ ]*datarootdir[ ]*:*=/p' "$tmp/out"`; test -z "$ac_out"; } && { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: $ac_file contains a reference to the variable \`datarootdir' which seems to be undefined. Please make sure it is defined" >&5 $as_echo "$as_me: WARNING: $ac_file contains a reference to the variable \`datarootdir' which seems to be undefined. Please make sure it is defined" >&2;} - rm -f "$ac_tmp/stdin" + rm -f "$tmp/stdin" case $ac_file in - -) cat "$ac_tmp/out" && rm -f "$ac_tmp/out";; - *) rm -f "$ac_file" && mv "$ac_tmp/out" "$ac_file";; + -) cat "$tmp/out" && rm -f "$tmp/out";; + *) rm -f "$ac_file" && mv "$tmp/out" "$ac_file";; esac \ || as_fn_error $? 
"could not create $ac_file" "$LINENO" 5 ;; @@ -15120,20 +15116,20 @@ if test x"$ac_file" != x-; then { $as_echo "/* $configure_input */" \ - && eval '$AWK -f "$ac_tmp/defines.awk"' "$ac_file_inputs" - } >"$ac_tmp/config.h" \ + && eval '$AWK -f "$tmp/defines.awk"' "$ac_file_inputs" + } >"$tmp/config.h" \ || as_fn_error $? "could not create $ac_file" "$LINENO" 5 - if diff "$ac_file" "$ac_tmp/config.h" >/dev/null 2>&1; then + if diff "$ac_file" "$tmp/config.h" >/dev/null 2>&1; then { $as_echo "$as_me:${as_lineno-$LINENO}: $ac_file is unchanged" >&5 $as_echo "$as_me: $ac_file is unchanged" >&6;} else rm -f "$ac_file" - mv "$ac_tmp/config.h" "$ac_file" \ + mv "$tmp/config.h" "$ac_file" \ || as_fn_error $? "could not create $ac_file" "$LINENO" 5 fi else $as_echo "/* $configure_input */" \ - && eval '$AWK -f "$ac_tmp/defines.awk"' "$ac_file_inputs" \ + && eval '$AWK -f "$tmp/defines.awk"' "$ac_file_inputs" \ || as_fn_error $? "could not create -" "$LINENO" 5 fi ;; diff --git a/configure.in b/configure.in --- a/configure.in +++ b/configure.in @@ -1376,6 +1376,13 @@ #endif ]) +# On Linux, can.h and can/raw.h require sys/socket.h +AC_CHECK_HEADERS(linux/can.h linux/can/raw.h,,,[ +#ifdef HAVE_SYS_SOCKET_H +#include +#endif +]) + # checks for typedefs was_it_defined=no AC_MSG_CHECKING(for clock_t in time.h) diff --git a/pyconfig.h.in b/pyconfig.h.in --- a/pyconfig.h.in +++ b/pyconfig.h.in @@ -467,6 +467,12 @@ /* Define to 1 if you have the `linkat' function. */ #undef HAVE_LINKAT +/* Define to 1 if you have the header file. */ +#undef HAVE_LINUX_CAN_H + +/* Define to 1 if you have the header file. */ +#undef HAVE_LINUX_CAN_RAW_H + /* Define to 1 if you have the header file. 
 */
 #undef HAVE_LINUX_NETLINK_H

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 20:28:01 2011
From: python-checkins at python.org (victor.stinner)
Date: Thu, 06 Oct 2011 20:28:01 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_Issue_=2310141=3A_Don=27t_u?=
 =?utf8?q?se_hardcoded_frame_size_in_example=2C_use_struct=2Ecalcsize=28?=
 =?utf8?q?=29?=
Message-ID:

http://hg.python.org/cpython/rev/a4af684bb54e
changeset: 72759:a4af684bb54e
user: Victor Stinner
date: Thu Oct 06 20:27:20 2011 +0200
summary:
  Issue #10141: Don't use hardcoded frame size in example, use struct.calcsize()

files:
  Doc/library/socket.rst | 3 ++-
  1 files changed, 2 insertions(+), 1 deletions(-)


diff --git a/Doc/library/socket.rst b/Doc/library/socket.rst
--- a/Doc/library/socket.rst
+++ b/Doc/library/socket.rst
@@ -1270,6 +1270,7 @@
    # CAN frame packing/unpacking (see `struct can_frame` in )
    can_frame_fmt = "=IB3x8s"
+   can_frame_size = struct.calcsize(can_frame_fmt)

    def build_can_frame(can_id, data):
        can_dlc = len(data)
@@ -1286,7 +1287,7 @@
    s.bind(('vcan0',))
    while True:
-       cf, addr = s.recvfrom(16)
+       cf, addr = s.recvfrom(can_frame_size)
        print('Received: can_id=%x, can_dlc=%x, data=%s' % dissect_can_frame(cf))

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 21:50:02 2011
From: python-checkins at python.org (antoine.pitrou)
Date: Thu, 06 Oct 2011 21:50:02 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_the_expected_memory_con?=
 =?utf8?q?sumption_for_some_tests?=
Message-ID:

http://hg.python.org/cpython/rev/55c313924d63
changeset: 72760:55c313924d63
user: Antoine Pitrou
date: Thu Oct 06 21:46:23 2011 +0200
summary:
  Fix the expected memory consumption for some tests

files:
  Lib/test/test_bigmem.py | 12 +++++++-----
  1 files changed, 7 insertions(+), 5 deletions(-)


diff --git a/Lib/test/test_bigmem.py b/Lib/test/test_bigmem.py
--- a/Lib/test/test_bigmem.py
+++ b/Lib/test/test_bigmem.py
@@ -615,26 +615,28 @@
         for name, memuse in self._adjusted.items():
             getattr(type(self), name).memuse = memuse

-    # the utf8 encoder preallocates big time (4x the number of characters)
-    @bigmemtest(size=_2G + 2, memuse=ascii_char_size + 4)
+    # Many codecs convert to the legacy representation first, explaining
+    # why we add 'ucs4_char_size' to the 'memuse' below.
+
+    @bigmemtest(size=_2G + 2, memuse=ascii_char_size + 1)
     def test_encode(self, size):
         return self.basic_encode_test(size, 'utf-8')

-    @bigmemtest(size=_4G // 6 + 2, memuse=ascii_char_size + 1)
+    @bigmemtest(size=_4G // 6 + 2, memuse=ascii_char_size + ucs4_char_size + 1)
     def test_encode_raw_unicode_escape(self, size):
         try:
             return self.basic_encode_test(size, 'raw_unicode_escape')
         except MemoryError:
             pass # acceptable on 32-bit

-    @bigmemtest(size=_4G // 5 + 70, memuse=ascii_char_size + 1)
+    @bigmemtest(size=_4G // 5 + 70, memuse=ascii_char_size + ucs4_char_size + 1)
     def test_encode_utf7(self, size):
         try:
             return self.basic_encode_test(size, 'utf7')
         except MemoryError:
             pass # acceptable on 32-bit

-    @bigmemtest(size=_4G // 4 + 5, memuse=ascii_char_size + 4)
+    @bigmemtest(size=_4G // 4 + 5, memuse=ascii_char_size + ucs4_char_size + 4)
     def test_encode_utf32(self, size):
         try:
             return self.basic_encode_test(size, 'utf32',
                                           expectedsize=4 * size + 4)

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 21:59:28 2011
From: python-checkins at python.org (antoine.pitrou)
Date: Thu, 06 Oct 2011 21:59:28 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_size_estimation_for_tes?=
 =?utf8?q?t=5Fbigmem=2EStrTest=2Etest=5Fformat?=
Message-ID:

http://hg.python.org/cpython/rev/4e0b570ac8a3
changeset: 72761:4e0b570ac8a3
user: Antoine Pitrou
date: Thu Oct 06 21:55:51 2011 +0200
summary:
  Fix size estimation for test_bigmem.StrTest.test_format

files:
  Lib/test/test_bigmem.py | 4 +++-
  1 files changed, 3 insertions(+), 1 deletions(-)


diff --git a/Lib/test/test_bigmem.py b/Lib/test/test_bigmem.py
--- a/Lib/test/test_bigmem.py
+++ b/Lib/test/test_bigmem.py
@@ -647,7 +647,9 @@
     def test_encode_ascii(self, size):
         return self.basic_encode_test(size, 'ascii', c='A')

-    @bigmemtest(size=_2G + 10, memuse=ascii_char_size * 2)
+    # str % (...) uses a Py_UCS4 intermediate representation
+
+    @bigmemtest(size=_2G + 10, memuse=ascii_char_size * 2 + ucs4_char_size)
     def test_format(self, size):
         s = '-' * size
         sf = '%s' % (s,)

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 22:13:02 2011
From: python-checkins at python.org (antoine.pitrou)
Date: Thu, 06 Oct 2011 22:13:02 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_Ensure_that_1-char_singleto?=
 =?utf8?q?ns_get_used?=
Message-ID:

http://hg.python.org/cpython/rev/d3d7ec004af0
changeset: 72762:d3d7ec004af0
user: Antoine Pitrou
date: Thu Oct 06 22:07:51 2011 +0200
summary:
  Ensure that 1-char singletons get used

files:
  Objects/unicodeobject.c | 8 ++++++++
  1 files changed, 8 insertions(+), 0 deletions(-)


diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
--- a/Objects/unicodeobject.c
+++ b/Objects/unicodeobject.c
@@ -1622,6 +1622,8 @@
         assert(*p < 128);
     }
 #endif
+    if (size == 1)
+        return get_latin1_char(s[0]);
     res = PyUnicode_New(size, 127);
     if (!res)
         return NULL;
@@ -1653,6 +1655,8 @@
     Py_ssize_t i;

     assert(size >= 0);
+    if (size == 1)
+        return get_latin1_char(u[0]);
     for (i = 0; i < size; i++) {
         if (u[i] & 0x80) {
             max_char = 255;
@@ -1675,6 +1679,8 @@
     Py_ssize_t i;

     assert(size >= 0);
+    if (size == 1 && u[0] < 256)
+        return get_latin1_char(u[0]);
     for (i = 0; i < size; i++) {
         if (u[i] > max_char) {
             max_char = u[i];
@@ -1702,6 +1708,8 @@
     Py_ssize_t i;

     assert(size >= 0);
+    if (size == 1 && u[0] < 256)
+        return get_latin1_char(u[0]);
     for (i = 0; i < size; i++) {
         if (u[i] > max_char) {
             max_char = u[i];

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 22:13:03 2011
From: python-checkins at python.org (antoine.pitrou)
Date: Thu, 06 Oct 2011 22:13:03 +0200
Subject: [Python-checkins] =?utf8?q?cpython=3A_Make_the_formula_for_this_e?=
 =?utf8?q?stimate_more_explicit?=
Message-ID:

http://hg.python.org/cpython/rev/e795ab617914
changeset: 72763:e795ab617914
user: Antoine Pitrou
date: Thu Oct 06 22:09:18 2011 +0200
summary:
  Make the formula for this estimate more explicit

files:
  Lib/test/test_bigmem.py | 2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)


diff --git a/Lib/test/test_bigmem.py b/Lib/test/test_bigmem.py
--- a/Lib/test/test_bigmem.py
+++ b/Lib/test/test_bigmem.py
@@ -374,7 +374,7 @@
     # suffer for the list size. (Otherwise, it'd cost another 48 times
     # size in bytes!) Nevertheless, a list of size takes
     # 8*size bytes.
-    @bigmemtest(size=_2G + 5, memuse=10)
+    @bigmemtest(size=_2G + 5, memuse=2 * ascii_char_size + 8)
     def test_split_large(self, size):
         _ = self.from_latin1
         s = _(' a') * size + _(' ')

--
Repository URL: http://hg.python.org/cpython

From python-checkins at python.org Thu Oct 6 22:35:43 2011
From: python-checkins at python.org (ned.deily)
Date: Thu, 06 Oct 2011 22:35:43 +0200
Subject: [Python-checkins] =?utf8?q?devguide=3A_Replace_tabs_with_spaces_i?=
 =?utf8?q?n_compiler=2Erst_to_satisfy_the_checkwhitespace_hook=2E?=
Message-ID:

http://hg.python.org/devguide/rev/ddcf4a0ece4e
changeset: 454:ddcf4a0ece4e
user: Ned Deily
date: Thu Oct 06 13:32:42 2011 -0700
summary:
  Replace tabs with spaces in compiler.rst to satisfy the checkwhitespace hook.

files:
  compiler.rst | 60 ++++++++++++++++++++--------------------
  1 files changed, 30 insertions(+), 30 deletions(-)


diff --git a/compiler.rst b/compiler.rst
--- a/compiler.rst
+++ b/compiler.rst
@@ -48,22 +48,22 @@
 macros (which are all defined in Include/token.h):

 ``CHILD(node *, int)``
-	Returns the nth child of the node using zero-offset indexing
+    Returns the nth child of the node using zero-offset indexing
 ``RCHILD(node *, int)``
-	Returns the nth child of the node from the right side; use
-	negative numbers!
+ Returns the nth child of the node from the right side; use + negative numbers! ``NCH(node *)`` - Number of children the node has + Number of children the node has ``STR(node *)`` - String representation of the node; e.g., will return ``:`` for a - COLON token + String representation of the node; e.g., will return ``:`` for a + COLON token ``TYPE(node *)`` - The type of node as specified in ``Include/graminit.h`` + The type of node as specified in ``Include/graminit.h`` ``REQ(node *, TYPE)`` - Assert that the node is the type that is expected + Assert that the node is the type that is expected ``LINENO(node *)`` - retrieve the line number of the source code that led to the - creation of the parse rule; defined in Python/ast.c + retrieve the line number of the source code that led to the + creation of the parse rule; defined in Python/ast.c To tie all of this example, consider the rule for 'while':: @@ -99,10 +99,10 @@ module Python { - stmt = FunctionDef(identifier name, arguments args, stmt* body, - expr* decorators) - | Return(expr? value) | Yield(expr value) - attributes (int lineno) + stmt = FunctionDef(identifier name, arguments args, stmt* body, + expr* decorators) + | Return(expr? 
value) | Yield(expr value) + attributes (int lineno) } The preceding example describes three different kinds of statements; @@ -221,13 +221,13 @@ in Python/asdl.c and Include/asdl.h: ``asdl_seq_new()`` - Allocate memory for an asdl_seq for the specified length + Allocate memory for an asdl_seq for the specified length ``asdl_seq_GET()`` - Get item held at a specific position in an asdl_seq + Get item held at a specific position in an asdl_seq ``asdl_seq_SET()`` - Set a specific index in an asdl_seq to the specified value + Set a specific index in an asdl_seq to the specified value ``asdl_seq_LEN(asdl_seq *)`` - Return the length of an asdl_seq + Return the length of an asdl_seq If you are working with statements, you must also worry about keeping track of what line number generated the statement. Currently the line @@ -426,7 +426,7 @@ asdl_c.py "Generate C code from an ASDL description." Generates - Python/Python-ast.c and Include/Python-ast.h . + Python/Python-ast.c and Include/Python-ast.h . spark.py SPARK_ parser generator @@ -435,9 +435,9 @@ Python-ast.c Creates C structs corresponding to the ASDL types. Also - contains code for marshaling AST nodes (core ASDL types have - marshaling code in asdl.c). "File automatically generated by - Parser/asdl_c.py". This file must be committed separately + contains code for marshaling AST nodes (core ASDL types have + marshaling code in asdl.c). "File automatically generated by + Parser/asdl_c.py". This file must be committed separately after every grammar change is committed since the __version__ value is set to the latest grammar change revision number. @@ -456,14 +456,14 @@ Emits bytecode based on the AST. symtable.c - Generates a symbol table from AST. + Generates a symbol table from AST. pyarena.c Implementation of the arena memory manager. import.c Home of the magic number (named ``MAGIC``) for bytecode versioning - + + Include/ @@ -479,12 +479,12 @@ Declares PyAST_FromNode() external (from Python/ast.c). 
code.h - Header file for Objects/codeobject.c; contains definition of - PyCodeObject. + Header file for Objects/codeobject.c; contains definition of + PyCodeObject. symtable.h - Header for Python/symtable.c . struct symtable and - PySTEntryObject are defined here. + Header for Python/symtable.c . struct symtable and + PySTEntryObject are defined here. pyarena.h Header file for the corresponding Python/pyarena.c . @@ -496,8 +496,8 @@ + Objects/ codeobject.c - Contains PyCodeObject-related code (originally in - Python/compile.c). + Contains PyCodeObject-related code (originally in + Python/compile.c). + Lib/ -- Repository URL: http://hg.python.org/devguide From python-checkins at python.org Thu Oct 6 22:35:46 2011 From: python-checkins at python.org (ned.deily) Date: Thu, 06 Oct 2011 22:35:46 +0200 Subject: [Python-checkins] =?utf8?q?devguide=3A_Issue_=2313117=3A_Fix_brok?= =?utf8?q?en_links_in_the_compiler_page_of_the_Developer=27s_Guide=2E?= Message-ID: http://hg.python.org/devguide/rev/76159c6d265a changeset: 455:76159c6d265a user: Ned Deily date: Thu Oct 06 13:35:08 2011 -0700 summary: Issue #13117: Fix broken links in the compiler page of the Developer's Guide. (Patch by Francisco Mart?n Brugu?) files: compiler.rst | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/compiler.rst b/compiler.rst --- a/compiler.rst +++ b/compiler.rst @@ -548,12 +548,12 @@ 213--227, 1997. .. _The Zephyr Abstract Syntax Description Language.: - http://www.cs.princeton.edu/~danwang/Papers/dsl97/dsl97.html + http://www.cs.princeton.edu/research/techreps/TR-554-97 .. _SPARK: http://pages.cpsc.ucalgary.ca/~aycock/spark/ .. [#skip-peephole] Skip Montanaro's Peephole Optimizer Paper - (http://www.foretec.com/python/workshops/1998-11/proceedings/papers/montanaro/montanaro.html) + (http://www.python.org/workshops/1998-11/proceedings/papers/montanaro/montanaro.html) .. 
[#Bytecodehacks] Bytecodehacks Project (http://bytecodehacks.sourceforge.net/bch-docs/bch/index.html) -- Repository URL: http://hg.python.org/devguide From python-checkins at python.org Thu Oct 6 22:35:47 2011 From: python-checkins at python.org (antoine.pitrou) Date: Thu, 06 Oct 2011 22:35:47 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_test=5Fsplitlines_to_re?= =?utf8?q?ach_its_size_estimate?= Message-ID: http://hg.python.org/cpython/rev/6131a2fc0a0f changeset: 72764:6131a2fc0a0f user: Antoine Pitrou date: Thu Oct 06 22:19:07 2011 +0200 summary: Fix test_splitlines to reach its size estimate files: Lib/test/test_bigmem.py | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Lib/test/test_bigmem.py b/Lib/test/test_bigmem.py --- a/Lib/test/test_bigmem.py +++ b/Lib/test/test_bigmem.py @@ -393,9 +393,9 @@ # take up an inordinate amount of memory chunksize = int(size ** 0.5 + 2) // 2 SUBSTR = _(' ') * chunksize + _('\n') + _(' ') * chunksize + _('\r\n') - s = SUBSTR * chunksize + s = SUBSTR * (chunksize * 2) l = s.splitlines() - self.assertEqual(len(l), chunksize * 2) + self.assertEqual(len(l), chunksize * 4) expected = _(' ') * chunksize for item in l: self.assertEqual(item, expected) -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 22:35:48 2011 From: python-checkins at python.org (antoine.pitrou) Date: Thu, 06 Oct 2011 22:35:48 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_size_estimate_for_test?= =?utf8?q?=5Funicode=5Frepr?= Message-ID: http://hg.python.org/cpython/rev/6b5a8d7b1336 changeset: 72765:6b5a8d7b1336 user: Antoine Pitrou date: Thu Oct 06 22:32:10 2011 +0200 summary: Fix size estimate for test_unicode_repr files: Lib/test/test_bigmem.py | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/Lib/test/test_bigmem.py b/Lib/test/test_bigmem.py --- a/Lib/test/test_bigmem.py +++ b/Lib/test/test_bigmem.py @@ -701,7 +701,13 @@ self.assertEqual(s.count('\\'), 
size) self.assertEqual(s.count('0'), size * 2) - @bigmemtest(size=_2G // 5 + 1, memuse=ucs2_char_size + ascii_char_size * 6) + # ascii() calls encode('ascii', 'backslashreplace'), which itself + # creates a temporary Py_UNICODE representation in addition to the + # original (Py_UCS2) one + # There's also some overallocation when resizing the ascii() result + # that isn't taken into account here. + @bigmemtest(size=_2G // 5 + 1, memuse=ucs2_char_size + + ucs4_char_size + ascii_char_size * 6) def test_unicode_repr(self, size): # Use an assigned, but not printable code point. # It is in the range of the low surrogates \uDC00-\uDFFF. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 22:44:45 2011 From: python-checkins at python.org (antoine.pitrou) Date: Thu, 06 Oct 2011 22:44:45 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_expected_memory_consump?= =?utf8?q?tion_for_test=5Ftranslate?= Message-ID: http://hg.python.org/cpython/rev/849f35a575a5 changeset: 72766:849f35a575a5 user: Antoine Pitrou date: Thu Oct 06 22:41:08 2011 +0200 summary: Fix expected memory consumption for test_translate files: Lib/test/test_bigmem.py | 33 +++++++++++++++++++++------- 1 files changed, 25 insertions(+), 8 deletions(-) diff --git a/Lib/test/test_bigmem.py b/Lib/test/test_bigmem.py --- a/Lib/test/test_bigmem.py +++ b/Lib/test/test_bigmem.py @@ -446,14 +446,7 @@ def test_translate(self, size): _ = self.from_latin1 SUBSTR = _('aZz.z.Aaz.') - if isinstance(SUBSTR, str): - trans = { - ord(_('.')): _('-'), - ord(_('a')): _('!'), - ord(_('Z')): _('$'), - } - else: - trans = bytes.maketrans(b'.aZ', b'-!$') + trans = bytes.maketrans(b'.aZ', b'-!$') sublen = len(SUBSTR) repeats = size // sublen + 2 s = SUBSTR * repeats @@ -735,6 +728,30 @@ finally: r = s = None + # The original test_translate is overriden here, so as to get the + # correct size estimate: str.translate() uses an intermediate Py_UCS4 + # representation. 
+ + @bigmemtest(size=_2G, memuse=ascii_char_size * 2 + ucs4_char_size) + def test_translate(self, size): + _ = self.from_latin1 + SUBSTR = _('aZz.z.Aaz.') + trans = { + ord(_('.')): _('-'), + ord(_('a')): _('!'), + ord(_('Z')): _('$'), + } + sublen = len(SUBSTR) + repeats = size // sublen + 2 + s = SUBSTR * repeats + s = s.translate(trans) + self.assertEqual(len(s), repeats * sublen) + self.assertEqual(s[:sublen], SUBSTR.translate(trans)) + self.assertEqual(s[-sublen:], SUBSTR.translate(trans)) + self.assertEqual(s.count(_('.')), 0) + self.assertEqual(s.count(_('!')), repeats * 2) + self.assertEqual(s.count(_('z')), repeats * 3) + class BytesTest(unittest.TestCase, BaseStrTest): -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 23:06:29 2011 From: python-checkins at python.org (benjamin.peterson) Date: Thu, 06 Oct 2011 23:06:29 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_fix_compiler_warnings?= Message-ID: http://hg.python.org/cpython/rev/11fed1ed1757 changeset: 72767:11fed1ed1757 user: Benjamin Peterson date: Thu Oct 06 17:06:25 2011 -0400 summary: fix compiler warnings files: Modules/_csv.c | 7 +++---- 1 files changed, 3 insertions(+), 4 deletions(-) diff --git a/Modules/_csv.c b/Modules/_csv.c --- a/Modules/_csv.c +++ b/Modules/_csv.c @@ -529,13 +529,13 @@ self->field = PyMem_New(Py_UNICODE, self->field_size); } else { + Py_UNICODE *field = self->field; if (self->field_size > PY_SSIZE_T_MAX / 2) { PyErr_NoMemory(); return 0; } self->field_size *= 2; - self->field = PyMem_Resize(self->field, Py_UNICODE, - self->field_size); + self->field = PyMem_Resize(field, Py_UNICODE, self->field_size); } if (self->field == NULL) { PyErr_NoMemory(); @@ -1055,8 +1055,7 @@ Py_UNICODE* old_rec = self->rec; self->rec_size = (rec_len / MEM_INCR + 1) * MEM_INCR; - self->rec = PyMem_Resize(self->rec, Py_UNICODE, - self->rec_size); + self->rec = PyMem_Resize(old_rec, Py_UNICODE, self->rec_size); if (self->rec == NULL) 
PyMem_Free(old_rec); } -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 23:43:07 2011 From: python-checkins at python.org (ned.deily) Date: Thu, 06 Oct 2011 23:43:07 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMi43KTogSXNzdWUgIzc0MjU6?= =?utf8?q?_Refactor_test=5Fpydoc_test_case_for_=27-k=27_behavior_and_add?= Message-ID: http://hg.python.org/cpython/rev/45862f4ab1c5 changeset: 72768:45862f4ab1c5 branch: 2.7 parent: 72754:89b9e4bf6f1f user: Ned Deily date: Thu Oct 06 14:17:34 2011 -0700 summary: Issue #7425: Refactor test_pydoc test case for '-k' behavior and add new test cases for importing bad packages and unreadable packages dirs. files: Lib/test/test_pydoc.py | 112 ++++++++++++++++------------ 1 files changed, 63 insertions(+), 49 deletions(-) diff --git a/Lib/test/test_pydoc.py b/Lib/test/test_pydoc.py --- a/Lib/test/test_pydoc.py +++ b/Lib/test/test_pydoc.py @@ -1,7 +1,6 @@ import os import sys import difflib -import subprocess import __builtin__ import re import pydoc @@ -10,10 +9,10 @@ import unittest import xml.etree import test.test_support -from contextlib import contextmanager from collections import namedtuple +from test.script_helper import assert_python_ok from test.test_support import ( - TESTFN, forget, rmtree, EnvironmentVarGuard, reap_children, captured_stdout) + TESTFN, rmtree, reap_children, captured_stdout) from test import pydoc_mod @@ -176,17 +175,15 @@ # output pattern for module with bad imports badimport_pattern = "problem in %s - : No module named %s" -def run_pydoc(module_name, *args): +def run_pydoc(module_name, *args, **env): """ Runs pydoc on the specified module. Returns the stripped output of pydoc. 
""" - cmd = [sys.executable, pydoc.__file__, " ".join(args), module_name] - try: - output = subprocess.Popen(cmd, stdout=subprocess.PIPE).communicate()[0] - return output.strip() - finally: - reap_children() + args = args + (module_name,) + # do not write bytecode files to avoid caching errors + rc, out, err = assert_python_ok('-B', pydoc.__file__, *args, **env) + return out.strip() def get_pydoc_html(module): "Returns pydoc generated output as html" @@ -259,42 +256,6 @@ self.assertEqual(expected, result, "documentation for missing module found") - def test_badimport(self): - # This tests the fix for issue 5230, where if pydoc found the module - # but the module had an internal import error pydoc would report no doc - # found. - modname = 'testmod_xyzzy' - testpairs = ( - ('i_am_not_here', 'i_am_not_here'), - ('test.i_am_not_here_either', 'i_am_not_here_either'), - ('test.i_am_not_here.neither_am_i', 'i_am_not_here.neither_am_i'), - ('i_am_not_here.{}'.format(modname), 'i_am_not_here.{}'.format(modname)), - ('test.{}'.format(modname), modname), - ) - - @contextmanager - def newdirinpath(dir): - os.mkdir(dir) - sys.path.insert(0, dir) - yield - sys.path.pop(0) - rmtree(dir) - - with newdirinpath(TESTFN), EnvironmentVarGuard() as env: - env['PYTHONPATH'] = TESTFN - fullmodname = os.path.join(TESTFN, modname) - sourcefn = fullmodname + os.extsep + "py" - for importstring, expectedinmsg in testpairs: - f = open(sourcefn, 'w') - f.write("import {}\n".format(importstring)) - f.close() - try: - result = run_pydoc(modname) - finally: - forget(modname) - expected = badimport_pattern % (modname, expectedinmsg) - self.assertEqual(expected, result) - def test_input_strip(self): missing_module = " test.i_am_not_here " result = run_pydoc(missing_module) @@ -317,6 +278,55 @@ "") +class PydocImportTest(unittest.TestCase): + + def setUp(self): + self.test_dir = os.mkdir(TESTFN) + self.addCleanup(rmtree, TESTFN) + + def test_badimport(self): + # This tests the fix for issue 5230, 
where if pydoc found the module + # but the module had an internal import error pydoc would report no doc + # found. + modname = 'testmod_xyzzy' + testpairs = ( + ('i_am_not_here', 'i_am_not_here'), + ('test.i_am_not_here_either', 'i_am_not_here_either'), + ('test.i_am_not_here.neither_am_i', 'i_am_not_here.neither_am_i'), + ('i_am_not_here.{}'.format(modname), + 'i_am_not_here.{}'.format(modname)), + ('test.{}'.format(modname), modname), + ) + + sourcefn = os.path.join(TESTFN, modname) + os.extsep + "py" + for importstring, expectedinmsg in testpairs: + with open(sourcefn, 'w') as f: + f.write("import {}\n".format(importstring)) + result = run_pydoc(modname, PYTHONPATH=TESTFN) + expected = badimport_pattern % (modname, expectedinmsg) + self.assertEqual(expected, result) + + def test_apropos_with_bad_package(self): + # Issue 7425 - pydoc -k failed when bad package on path + pkgdir = os.path.join(TESTFN, "syntaxerr") + os.mkdir(pkgdir) + badsyntax = os.path.join(pkgdir, "__init__") + os.extsep + "py" + with open(badsyntax, 'w') as f: + f.write("invalid python syntax = $1\n") + result = run_pydoc('nothing', '-k', PYTHONPATH=TESTFN) + self.assertEqual('', result) + + def test_apropos_with_unreadable_dir(self): + # Issue 7367 - pydoc -k failed when unreadable dir on path + self.unreadable_dir = os.path.join(TESTFN, "unreadable") + os.mkdir(self.unreadable_dir, 0) + self.addCleanup(os.rmdir, self.unreadable_dir) + # Note, on Windows the directory appears to be still + # readable so this is not really testing the issue there + result = run_pydoc('nothing', '-k', PYTHONPATH=TESTFN) + self.assertEqual('', result) + + class TestDescriptions(unittest.TestCase): def test_module(self): @@ -376,9 +386,13 @@ def test_main(): - test.test_support.run_unittest(PyDocDocTest, - TestDescriptions, - TestHelper) + try: + test.test_support.run_unittest(PyDocDocTest, + PydocImportTest, + TestDescriptions, + TestHelper) + finally: + reap_children() if __name__ == "__main__": test_main() -- 
Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 23:43:07 2011 From: python-checkins at python.org (ned.deily) Date: Thu, 06 Oct 2011 23:43:07 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMi43KTogSXNzdWUgIzczNjc6?= =?utf8?q?_Add_test_case_to_test=5Fpkgutil_for_walking_path_with?= Message-ID: http://hg.python.org/cpython/rev/096b010ae90b changeset: 72769:096b010ae90b branch: 2.7 user: Ned Deily date: Thu Oct 06 14:17:41 2011 -0700 summary: Issue #7367: Add test case to test_pkgutil for walking path with an unreadable directory. files: Lib/test/test_pkgutil.py | 11 +++++++++++ 1 files changed, 11 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_pkgutil.py b/Lib/test/test_pkgutil.py --- a/Lib/test/test_pkgutil.py +++ b/Lib/test/test_pkgutil.py @@ -78,6 +78,17 @@ del sys.modules[pkg] + def test_unreadable_dir_on_syspath(self): + # issue7367 - walk_packages failed if unreadable dir on sys.path + package_name = "unreadable_package" + d = os.path.join(self.dirname, package_name) + # this does not appear to create an unreadable dir on Windows + # but the test should not fail anyway + os.mkdir(d, 0) + for t in pkgutil.walk_packages(path=[self.dirname]): + self.fail("unexpected package found") + os.rmdir(d) + class PkgutilPEP302Tests(unittest.TestCase): class MyTestLoader(object): -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 23:43:08 2011 From: python-checkins at python.org (ned.deily) Date: Thu, 06 Oct 2011 23:43:08 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMi43KTogSXNzdWUgIzc0MjU6?= =?utf8?q?_Prevent_pydoc_-k_failures_due_to_module_import_errors=2E?= Message-ID: http://hg.python.org/cpython/rev/3acf90f71178 changeset: 72770:3acf90f71178 branch: 2.7 user: Ned Deily date: Thu Oct 06 14:17:44 2011 -0700 summary: Issue #7425: Prevent pydoc -k failures due to module import errors. 
(Backport to 2.7 of existing 3.x fix) files: Lib/pydoc.py | 11 ++++++----- 1 files changed, 6 insertions(+), 5 deletions(-) diff --git a/Lib/pydoc.py b/Lib/pydoc.py --- a/Lib/pydoc.py +++ b/Lib/pydoc.py @@ -52,7 +52,7 @@ # the current directory is changed with os.chdir(), an incorrect # path will be displayed. -import sys, imp, os, re, types, inspect, __builtin__, pkgutil +import sys, imp, os, re, types, inspect, __builtin__, pkgutil, warnings from repr import Repr from string import expandtabs, find, join, lower, split, strip, rfind, rstrip from traceback import extract_tb @@ -1968,10 +1968,11 @@ if modname[-9:] == '.__init__': modname = modname[:-9] + ' (package)' print modname, desc and '- ' + desc - try: import warnings - except ImportError: pass - else: warnings.filterwarnings('ignore') # ignore problems during import - ModuleScanner().run(callback, key) + def onerror(modname): + pass + with warnings.catch_warnings(): + warnings.filterwarnings('ignore') # ignore problems during import + ModuleScanner().run(callback, key, onerror=onerror) # --------------------------------------------------- web browser interface -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 23:43:09 2011 From: python-checkins at python.org (ned.deily) Date: Thu, 06 Oct 2011 23:43:09 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMi43KTogSXNzdWUgIzczNjc6?= =?utf8?q?_Fix_pkgutil=2Ewalk=5Fpaths_to_skip_directories_whose?= Message-ID: http://hg.python.org/cpython/rev/1449095397ae changeset: 72771:1449095397ae branch: 2.7 user: Ned Deily date: Thu Oct 06 14:17:47 2011 -0700 summary: Issue #7367: Fix pkgutil.walk_paths to skip directories whose contents cannot be read. 
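The pydoc change above replaces a process-wide `warnings.filterwarnings('ignore')` with a `warnings.catch_warnings()` context manager, so the ignore filter is scoped to the module scan and the previous filter state is restored afterwards. A minimal sketch of that pattern (the helper and callback names here are illustrative, not from pydoc):

```python
import warnings

def scan_quietly(fn):
    # Run fn() with all warnings suppressed; the saved filter
    # state is restored automatically when the block exits.
    with warnings.catch_warnings():
        warnings.filterwarnings('ignore')
        return fn()

def noisy():
    # Stands in for a module whose import emits warnings.
    warnings.warn("import-time noise", UserWarning)
    return 42

result = scan_quietly(noisy)  # the warning never reaches the user
```

Because the filter change is confined to the `with` block, callers that rely on their own warning configuration are unaffected once the scan returns.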
files: Lib/pkgutil.py | 14 +++++++++++--- 1 files changed, 11 insertions(+), 3 deletions(-) diff --git a/Lib/pkgutil.py b/Lib/pkgutil.py --- a/Lib/pkgutil.py +++ b/Lib/pkgutil.py @@ -194,8 +194,11 @@ yielded = {} import inspect - - filenames = os.listdir(self.path) + try: + filenames = os.listdir(self.path) + except OSError: + # ignore unreadable directories like import does + filenames = [] filenames.sort() # handle packages before same-named modules for fn in filenames: @@ -208,7 +211,12 @@ if not modname and os.path.isdir(path) and '.' not in fn: modname = fn - for fn in os.listdir(path): + try: + dircontents = os.listdir(path) + except OSError: + # ignore unreadable directories like import does + dircontents = [] + for fn in dircontents: subname = inspect.getmodulename(fn) if subname=='__init__': ispkg = True -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 23:43:10 2011 From: python-checkins at python.org (ned.deily) Date: Thu, 06 Oct 2011 23:43:10 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogSXNzdWUgIzc0MjU6?= =?utf8?q?_Refactor_test=5Fpydoc_test_case_for_=27-k=27_behavior_and_add?= Message-ID: http://hg.python.org/cpython/rev/6a45f917f167 changeset: 72772:6a45f917f167 branch: 3.2 parent: 72755:f9f782f2369e user: Ned Deily date: Thu Oct 06 14:19:03 2011 -0700 summary: Issue #7425: Refactor test_pydoc test case for '-k' behavior and add new test cases for importing bad packages and unreadable packages dirs. 
files: Lib/test/test_pydoc.py | 95 ++++++++++++++++------------- 1 files changed, 53 insertions(+), 42 deletions(-) diff --git a/Lib/test/test_pydoc.py b/Lib/test/test_pydoc.py --- a/Lib/test/test_pydoc.py +++ b/Lib/test/test_pydoc.py @@ -7,7 +7,6 @@ import keyword import re import string -import subprocess import test.support import time import unittest @@ -15,11 +14,9 @@ import textwrap from io import StringIO from collections import namedtuple -from contextlib import contextmanager - from test.script_helper import assert_python_ok from test.support import ( - TESTFN, forget, rmtree, EnvironmentVarGuard, + TESTFN, rmtree, reap_children, reap_threads, captured_output, captured_stdout, unlink ) from test import pydoc_mod @@ -209,7 +206,8 @@ output of pydoc. """ args = args + (module_name,) - rc, out, err = assert_python_ok(pydoc.__file__, *args, **env) + # do not write bytecode files to avoid caching errors + rc, out, err = assert_python_ok('-B', pydoc.__file__, *args, **env) return out.strip() def get_pydoc_html(module): @@ -291,43 +289,6 @@ self.assertEqual(expected, result, "documentation for missing module found") - def test_badimport(self): - # This tests the fix for issue 5230, where if pydoc found the module - # but the module had an internal import error pydoc would report no doc - # found. 
- modname = 'testmod_xyzzy' - testpairs = ( - ('i_am_not_here', 'i_am_not_here'), - ('test.i_am_not_here_either', 'i_am_not_here_either'), - ('test.i_am_not_here.neither_am_i', 'i_am_not_here.neither_am_i'), - ('i_am_not_here.{}'.format(modname), - 'i_am_not_here.{}'.format(modname)), - ('test.{}'.format(modname), modname), - ) - - @contextmanager - def newdirinpath(dir): - os.mkdir(dir) - sys.path.insert(0, dir) - try: - yield - finally: - sys.path.pop(0) - rmtree(dir) - - with newdirinpath(TESTFN): - fullmodname = os.path.join(TESTFN, modname) - sourcefn = fullmodname + os.extsep + "py" - for importstring, expectedinmsg in testpairs: - with open(sourcefn, 'w') as f: - f.write("import {}\n".format(importstring)) - try: - result = run_pydoc(modname, PYTHONPATH=TESTFN).decode("ascii") - finally: - forget(modname) - expected = badimport_pattern % (modname, expectedinmsg) - self.assertEqual(expected, result) - def test_input_strip(self): missing_module = " test.i_am_not_here " result = str(run_pydoc(missing_module), 'ascii') @@ -403,6 +364,55 @@ self.assertEqual(synopsis, 'line 1: h\xe9') +class PydocImportTest(unittest.TestCase): + + def setUp(self): + self.test_dir = os.mkdir(TESTFN) + self.addCleanup(rmtree, TESTFN) + + def test_badimport(self): + # This tests the fix for issue 5230, where if pydoc found the module + # but the module had an internal import error pydoc would report no doc + # found. 
+ modname = 'testmod_xyzzy' + testpairs = ( + ('i_am_not_here', 'i_am_not_here'), + ('test.i_am_not_here_either', 'i_am_not_here_either'), + ('test.i_am_not_here.neither_am_i', 'i_am_not_here.neither_am_i'), + ('i_am_not_here.{}'.format(modname), + 'i_am_not_here.{}'.format(modname)), + ('test.{}'.format(modname), modname), + ) + + sourcefn = os.path.join(TESTFN, modname) + os.extsep + "py" + for importstring, expectedinmsg in testpairs: + with open(sourcefn, 'w') as f: + f.write("import {}\n".format(importstring)) + result = run_pydoc(modname, PYTHONPATH=TESTFN).decode("ascii") + expected = badimport_pattern % (modname, expectedinmsg) + self.assertEqual(expected, result) + + def test_apropos_with_bad_package(self): + # Issue 7425 - pydoc -k failed when bad package on path + pkgdir = os.path.join(TESTFN, "syntaxerr") + os.mkdir(pkgdir) + badsyntax = os.path.join(pkgdir, "__init__") + os.extsep + "py" + with open(badsyntax, 'w') as f: + f.write("invalid python syntax = $1\n") + result = run_pydoc('nothing', '-k', PYTHONPATH=TESTFN) + self.assertEqual(b'', result) + + def test_apropos_with_unreadable_dir(self): + # Issue 7367 - pydoc -k failed when unreadable dir on path + self.unreadable_dir = os.path.join(TESTFN, "unreadable") + os.mkdir(self.unreadable_dir, 0) + self.addCleanup(os.rmdir, self.unreadable_dir) + # Note, on Windows the directory appears to be still + # readable so this is not really testing the issue there + result = run_pydoc('nothing', '-k', PYTHONPATH=TESTFN) + self.assertEqual(b'', result) + + class TestDescriptions(unittest.TestCase): def test_module(self): @@ -511,6 +521,7 @@ def test_main(): try: test.support.run_unittest(PydocDocTest, + PydocImportTest, TestDescriptions, PydocServerTest, PydocUrlHandlerTest, -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 23:43:11 2011 From: python-checkins at python.org (ned.deily) Date: Thu, 06 Oct 2011 23:43:11 +0200 Subject: [Python-checkins] 
=?utf8?b?Y3B5dGhvbiAoMy4yKTogSXNzdWUgIzczNjc6?= =?utf8?q?_Add_test_case_to_test=5Fpkgutil_for_walking_path_with?= Message-ID: http://hg.python.org/cpython/rev/a1e6633ef3f1 changeset: 72773:a1e6633ef3f1 branch: 3.2 user: Ned Deily date: Thu Oct 06 14:19:06 2011 -0700 summary: Issue #7367: Add test case to test_pkgutil for walking path with an unreadable directory. files: Lib/test/test_pkgutil.py | 11 +++++++++++ 1 files changed, 11 insertions(+), 0 deletions(-) diff --git a/Lib/test/test_pkgutil.py b/Lib/test/test_pkgutil.py --- a/Lib/test/test_pkgutil.py +++ b/Lib/test/test_pkgutil.py @@ -84,6 +84,17 @@ del sys.modules[pkg] + def test_unreadable_dir_on_syspath(self): + # issue7367 - walk_packages failed if unreadable dir on sys.path + package_name = "unreadable_package" + d = os.path.join(self.dirname, package_name) + # this does not appear to create an unreadable dir on Windows + # but the test should not fail anyway + os.mkdir(d, 0) + for t in pkgutil.walk_packages(path=[self.dirname]): + self.fail("unexpected package found") + os.rmdir(d) + class PkgutilPEP302Tests(unittest.TestCase): class MyTestLoader(object): -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 23:43:12 2011 From: python-checkins at python.org (ned.deily) Date: Thu, 06 Oct 2011 23:43:12 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogSXNzdWUgIzczNjc6?= =?utf8?q?_Fix_pkgutil=2Ewalk=5Fpaths_to_skip_directories_whose?= Message-ID: http://hg.python.org/cpython/rev/77bac85f610a changeset: 72774:77bac85f610a branch: 3.2 user: Ned Deily date: Thu Oct 06 14:19:08 2011 -0700 summary: Issue #7367: Fix pkgutil.walk_paths to skip directories whose contents cannot be read. 
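The test added above creates a mode-0 directory on the search path and checks that `walk_packages()` skips it rather than raising. A self-contained version of that scenario is sketched below; note the caveat from the test itself that it may not exercise the issue when run as root or on Windows, where mode 0 does not actually make the directory unreadable:

```python
import os
import shutil
import tempfile
import pkgutil

base = tempfile.mkdtemp()
unreadable = os.path.join(base, "unreadable_package")
os.mkdir(unreadable, 0)  # no read/execute permission for anyone

# With the fix, walk_packages() silently skips the directory it
# cannot list instead of raising OSError.
found = [name for _, name, _ in pkgutil.walk_packages(path=[base])]

os.rmdir(unreadable)  # rmdir needs permission on the parent, not the dir
shutil.rmtree(base)
assert found == []
```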
files: Lib/pkgutil.py | 14 +++++++++++--- 1 files changed, 11 insertions(+), 3 deletions(-) diff --git a/Lib/pkgutil.py b/Lib/pkgutil.py --- a/Lib/pkgutil.py +++ b/Lib/pkgutil.py @@ -191,8 +191,11 @@ yielded = {} import inspect - - filenames = os.listdir(self.path) + try: + filenames = os.listdir(self.path) + except OSError: + # ignore unreadable directories like import does + filenames = [] filenames.sort() # handle packages before same-named modules for fn in filenames: @@ -205,7 +208,12 @@ if not modname and os.path.isdir(path) and '.' not in fn: modname = fn - for fn in os.listdir(path): + try: + dircontents = os.listdir(path) + except OSError: + # ignore unreadable directories like import does + dircontents = [] + for fn in dircontents: subname = inspect.getmodulename(fn) if subname=='__init__': ispkg = True -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 23:43:12 2011 From: python-checkins at python.org (ned.deily) Date: Thu, 06 Oct 2011 23:43:12 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_merge_from_3=2E2?= Message-ID: http://hg.python.org/cpython/rev/8bb0000e81e5 changeset: 72775:8bb0000e81e5 parent: 72767:11fed1ed1757 parent: 72774:77bac85f610a user: Ned Deily date: Thu Oct 06 14:24:31 2011 -0700 summary: merge from 3.2 files: Lib/pkgutil.py | 14 +++- Lib/test/test_pkgutil.py | 11 +++ Lib/test/test_pydoc.py | 95 +++++++++++++++------------ 3 files changed, 75 insertions(+), 45 deletions(-) diff --git a/Lib/pkgutil.py b/Lib/pkgutil.py --- a/Lib/pkgutil.py +++ b/Lib/pkgutil.py @@ -191,8 +191,11 @@ yielded = {} import inspect - - filenames = os.listdir(self.path) + try: + filenames = os.listdir(self.path) + except OSError: + # ignore unreadable directories like import does + filenames = [] filenames.sort() # handle packages before same-named modules for fn in filenames: @@ -205,7 +208,12 @@ if not modname and os.path.isdir(path) and '.' 
not in fn: modname = fn - for fn in os.listdir(path): + try: + dircontents = os.listdir(path) + except OSError: + # ignore unreadable directories like import does + dircontents = [] + for fn in dircontents: subname = inspect.getmodulename(fn) if subname=='__init__': ispkg = True diff --git a/Lib/test/test_pkgutil.py b/Lib/test/test_pkgutil.py --- a/Lib/test/test_pkgutil.py +++ b/Lib/test/test_pkgutil.py @@ -84,6 +84,17 @@ del sys.modules[pkg] + def test_unreadable_dir_on_syspath(self): + # issue7367 - walk_packages failed if unreadable dir on sys.path + package_name = "unreadable_package" + d = os.path.join(self.dirname, package_name) + # this does not appear to create an unreadable dir on Windows + # but the test should not fail anyway + os.mkdir(d, 0) + for t in pkgutil.walk_packages(path=[self.dirname]): + self.fail("unexpected package found") + os.rmdir(d) + class PkgutilPEP302Tests(unittest.TestCase): class MyTestLoader(object): diff --git a/Lib/test/test_pydoc.py b/Lib/test/test_pydoc.py --- a/Lib/test/test_pydoc.py +++ b/Lib/test/test_pydoc.py @@ -7,7 +7,6 @@ import keyword import re import string -import subprocess import test.support import time import unittest @@ -15,11 +14,9 @@ import textwrap from io import StringIO from collections import namedtuple -from contextlib import contextmanager - from test.script_helper import assert_python_ok from test.support import ( - TESTFN, forget, rmtree, EnvironmentVarGuard, + TESTFN, rmtree, reap_children, reap_threads, captured_output, captured_stdout, unlink ) from test import pydoc_mod @@ -209,7 +206,8 @@ output of pydoc. 
""" args = args + (module_name,) - rc, out, err = assert_python_ok(pydoc.__file__, *args, **env) + # do not write bytecode files to avoid caching errors + rc, out, err = assert_python_ok('-B', pydoc.__file__, *args, **env) return out.strip() def get_pydoc_html(module): @@ -295,43 +293,6 @@ self.assertEqual(expected, result, "documentation for missing module found") - def test_badimport(self): - # This tests the fix for issue 5230, where if pydoc found the module - # but the module had an internal import error pydoc would report no doc - # found. - modname = 'testmod_xyzzy' - testpairs = ( - ('i_am_not_here', 'i_am_not_here'), - ('test.i_am_not_here_either', 'i_am_not_here_either'), - ('test.i_am_not_here.neither_am_i', 'i_am_not_here.neither_am_i'), - ('i_am_not_here.{}'.format(modname), - 'i_am_not_here.{}'.format(modname)), - ('test.{}'.format(modname), modname), - ) - - @contextmanager - def newdirinpath(dir): - os.mkdir(dir) - sys.path.insert(0, dir) - try: - yield - finally: - sys.path.pop(0) - rmtree(dir) - - with newdirinpath(TESTFN): - fullmodname = os.path.join(TESTFN, modname) - sourcefn = fullmodname + os.extsep + "py" - for importstring, expectedinmsg in testpairs: - with open(sourcefn, 'w') as f: - f.write("import {}\n".format(importstring)) - try: - result = run_pydoc(modname, PYTHONPATH=TESTFN).decode("ascii") - finally: - forget(modname) - expected = badimport_pattern % (modname, expectedinmsg) - self.assertEqual(expected, result) - def test_input_strip(self): missing_module = " test.i_am_not_here " result = str(run_pydoc(missing_module), 'ascii') @@ -409,6 +370,55 @@ self.assertEqual(synopsis, 'line 1: h\xe9') +class PydocImportTest(unittest.TestCase): + + def setUp(self): + self.test_dir = os.mkdir(TESTFN) + self.addCleanup(rmtree, TESTFN) + + def test_badimport(self): + # This tests the fix for issue 5230, where if pydoc found the module + # but the module had an internal import error pydoc would report no doc + # found. 
+ modname = 'testmod_xyzzy' + testpairs = ( + ('i_am_not_here', 'i_am_not_here'), + ('test.i_am_not_here_either', 'i_am_not_here_either'), + ('test.i_am_not_here.neither_am_i', 'i_am_not_here.neither_am_i'), + ('i_am_not_here.{}'.format(modname), + 'i_am_not_here.{}'.format(modname)), + ('test.{}'.format(modname), modname), + ) + + sourcefn = os.path.join(TESTFN, modname) + os.extsep + "py" + for importstring, expectedinmsg in testpairs: + with open(sourcefn, 'w') as f: + f.write("import {}\n".format(importstring)) + result = run_pydoc(modname, PYTHONPATH=TESTFN).decode("ascii") + expected = badimport_pattern % (modname, expectedinmsg) + self.assertEqual(expected, result) + + def test_apropos_with_bad_package(self): + # Issue 7425 - pydoc -k failed when bad package on path + pkgdir = os.path.join(TESTFN, "syntaxerr") + os.mkdir(pkgdir) + badsyntax = os.path.join(pkgdir, "__init__") + os.extsep + "py" + with open(badsyntax, 'w') as f: + f.write("invalid python syntax = $1\n") + result = run_pydoc('nothing', '-k', PYTHONPATH=TESTFN) + self.assertEqual(b'', result) + + def test_apropos_with_unreadable_dir(self): + # Issue 7367 - pydoc -k failed when unreadable dir on path + self.unreadable_dir = os.path.join(TESTFN, "unreadable") + os.mkdir(self.unreadable_dir, 0) + self.addCleanup(os.rmdir, self.unreadable_dir) + # Note, on Windows the directory appears to be still + # readable so this is not really testing the issue there + result = run_pydoc('nothing', '-k', PYTHONPATH=TESTFN) + self.assertEqual(b'', result) + + class TestDescriptions(unittest.TestCase): def test_module(self): @@ -517,6 +527,7 @@ def test_main(): try: test.support.run_unittest(PydocDocTest, + PydocImportTest, TestDescriptions, PydocServerTest, PydocUrlHandlerTest, -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 23:43:13 2011 From: python-checkins at python.org (ned.deily) Date: Thu, 06 Oct 2011 23:43:13 +0200 Subject: [Python-checkins] 
=?utf8?q?cpython_=282=2E7=29=3A_Issue_=237425_a?= =?utf8?q?nd_Issue_=237367=3A_add_NEWS_items=2E?= Message-ID: http://hg.python.org/cpython/rev/add444274c3d changeset: 72776:add444274c3d branch: 2.7 parent: 72771:1449095397ae user: Ned Deily date: Thu Oct 06 14:29:49 2011 -0700 summary: Issue #7425 and Issue #7367: add NEWS items. files: Misc/NEWS | 6 ++++++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -50,6 +50,12 @@ Library ------- +- Issue #7367: Fix pkgutil.walk_paths to skip directories whose + contents cannot be read. + +- Issue #7425: Prevent pydoc -k failures due to module import errors. + (Backport to 2.7 of existing 3.x fix) + - Issue #13099: Fix sqlite3.Cursor.lastrowid under a Turkish locale. Reported and diagnosed by Thomas Kluyver. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 23:43:14 2011 From: python-checkins at python.org (ned.deily) Date: Thu, 06 Oct 2011 23:43:14 +0200 Subject: [Python-checkins] =?utf8?b?Y3B5dGhvbiAoMy4yKTogSXNzdWUgIzczNjc6?= =?utf8?q?_add_NEWS_item=2E?= Message-ID: http://hg.python.org/cpython/rev/5a4018570a59 changeset: 72777:5a4018570a59 branch: 3.2 parent: 72774:77bac85f610a user: Ned Deily date: Thu Oct 06 14:31:14 2011 -0700 summary: Issue #7367: add NEWS item. files: Misc/NEWS | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -39,6 +39,9 @@ Library ------- +- Issue #7367: Fix pkgutil.walk_paths to skip directories whose + contents cannot be read. + - Issue #13099: Fix sqlite3.Cursor.lastrowid under a Turkish locale. Reported and diagnosed by Thomas Kluyver. 
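Both NEWS items protect `pydoc -k`, which must walk every package on the search path without dying on the first broken one. The `onerror` hook of `pkgutil.walk_packages` (a real parameter of that API) captures the surviving-the-walk idea; the `badpkg` name and the throwaway directory below are made up for illustration, loosely following the `syntaxerr` package from `test_apropos_with_bad_package`:

```python
import os
import shutil
import sys
import tempfile
import pkgutil

# Build a throwaway package whose __init__.py is not valid Python.
tmp = tempfile.mkdtemp()
os.mkdir(os.path.join(tmp, "badpkg"))
with open(os.path.join(tmp, "badpkg", "__init__.py"), "w") as f:
    f.write("invalid python syntax = $1\n")

failed = []
sys.path.insert(0, tmp)
try:
    # onerror receives the failing module's name instead of letting the
    # import error escape, so the walk continues past the broken package.
    for info in pkgutil.walk_packages(path=[tmp], onerror=failed.append):
        pass  # an apropos-style search would match info.name here
finally:
    sys.path.remove(tmp)
    shutil.rmtree(tmp)

assert failed == ["badpkg"]
```

Without `onerror`, a non-`ImportError` exception raised while importing a package (such as the `SyntaxError` above) propagates out of the walk — which is essentially the failure mode these fixes address in pydoc's keyword search.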
-- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Thu Oct 6 23:43:15 2011 From: python-checkins at python.org (ned.deily) Date: Thu, 06 Oct 2011 23:43:15 +0200 Subject: [Python-checkins] =?utf8?q?cpython_=28merge_3=2E2_-=3E_default=29?= =?utf8?q?=3A_Issue_=237367=3A_merge_from_3=2E2?= Message-ID: http://hg.python.org/cpython/rev/0408001e4765 changeset: 72778:0408001e4765 parent: 72775:8bb0000e81e5 parent: 72777:5a4018570a59 user: Ned Deily date: Thu Oct 06 14:41:30 2011 -0700 summary: Issue #7367: merge from 3.2 files: Misc/NEWS | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -297,6 +297,9 @@ Library ------- +- Issue #7367: Fix pkgutil.walk_paths to skip directories whose + contents cannot be read. + - Issue #3163: The struct module gets new format characters 'n' and 'N' supporting C integer types ``ssize_t`` and ``size_t``, respectively. -- Repository URL: http://hg.python.org/cpython From python-checkins at python.org Fri Oct 7 01:57:46 2011 From: python-checkins at python.org (antoine.pitrou) Date: Fri, 07 Oct 2011 01:57:46 +0200 Subject: [Python-checkins] =?utf8?q?cpython=3A_Fix_massive_slowdown_in_str?= =?utf8?q?ing_formatting_with_the_=25_operator?= Message-ID: http://hg.python.org/cpython/rev/e0df7db13d55 changeset: 72779:e0d