April 2016
- 103 participants
- 82 discussions
After digging through obmalloc.c to optimize some memory-intensive code, I
put together a paper on the entire private memory heap that may or may not
be a useful addition to the docs.
I was hoping someone could review/proof it for errors in content.
I'm not sure of the policy on links, but I've uploaded it to Google Drive:
Re: [Python-Dev] [Python-checkins] cpython: Python 8: no pep8, no chocolate!
by Brett Cannon April 1, 2016
Are you planning on removing this after today? My worry about leaving it in
is that, if it's a modified copy that follows your Python 8 April Fools joke,
it will quite possibly trip up people who try to run pep8 but don't have
it installed, leaving them to wonder why the heck their imports are now all
flagged as broken.
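One way to see which copy of a module Python would actually pick up is to ask the import system directly. This is only a minimal sketch, assuming Python 3.4 or later; `module_origin` is a hypothetical helper name used for illustration and is not part of pep8 or the commit under discussion.

```python
import importlib.util


def module_origin(name):
    """Return the path a top-level module would be loaded from, or None.

    Hypothetical helper for illustration; not part of pep8 or CPython.
    """
    # find_spec() returns None when no top-level module of that name exists.
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Shows the file "import pep8" would use, or None if nothing is found.
    print(module_origin("pep8"))
```

If the printed path points into the standard library rather than site-packages, the module being run is the bundled copy and not a separately installed pep8.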
On Thu, 31 Mar 2016 at 14:40 victor.stinner <python-checkins(a)>
> + if indent[d] > prev_indent:
> + indent[d] = 0
> + for ind in list(indent_chances):
> + if ind >= prev_indent:
> + del indent_chances[ind]
> + del open_rows[depth + 1:]
> + depth -= 1
> + if depth:
> + indent_chances[indent[depth]] = True
> + for idx in range(row, -1, -1):
> + if parens[idx]:
> + parens[idx] -= 1
> + break
> + assert len(indent) == depth + 1
> + if start[1] not in indent_chances:
> + # allow to line up tokens
> + indent_chances[start[1]] = text
> +
> + last_token_multiline = (start[0] != end[0])
> + if last_token_multiline:
> + rel_indent[end[0] - first_row] = rel_indent[row]
> +
> + if indent_next and expand_indent(line) == indent_level + 4:
> + pos = (start[0], indent[0] + 4)
> + if visual_indent:
> + code = "E129 visually indented line"
> + else:
> + code = "E125 continuation line"
> + yield pos, "%s with same indent as next logical line" % code
> +
> +
> +def whitespace_before_parameters(logical_line, tokens):
> + r"""Avoid extraneous whitespace.
> +
> + Avoid extraneous whitespace in the following situations:
> + - before the open parenthesis that starts the argument list of a
> + function call.
> + - before the open parenthesis that starts an indexing or slicing.
> +
> + Okay: spam(1)
> + E211: spam (1)
> +
> + Okay: dict['key'] = list[index]
> + E211: dict ['key'] = list[index]
> + E211: dict['key'] = list [index]
> + """
> + prev_type, prev_text, __, prev_end, __ = tokens[0]
> + for index in range(1, len(tokens)):
> + token_type, text, start, end, __ = tokens[index]
> + if (token_type == tokenize.OP and
> + text in '([' and
> + start != prev_end and
> + (prev_type == tokenize.NAME or prev_text in '}])') and
> + # Syntax "class A (B):" is allowed, but avoid it
> + (index < 2 or tokens[index - 2][1] != 'class') and
> + # Allow "return ( for a in range(5))"
> + not keyword.iskeyword(prev_text)):
> + yield prev_end, "E211 whitespace before '%s'" % text
> + prev_type = token_type
> + prev_text = text
> + prev_end = end
> +
> +
> +def whitespace_around_operator(logical_line):
> + r"""Avoid extraneous whitespace around an operator.
> +
> + Okay: a = 12 + 3
> + E221: a = 4  + 5
> + E222: a = 4 +  5
> + E223: a = 4\t+ 5
> + E224: a = 4 +\t5
> + """
> + for match in OPERATOR_REGEX.finditer(logical_line):
> + before, after = match.groups()
> +
> + if '\t' in before:
> + yield match.start(1), "E223 tab before operator"
> + elif len(before) > 1:
> + yield match.start(1), "E221 multiple spaces before operator"
> +
> + if '\t' in after:
> + yield match.start(2), "E224 tab after operator"
> + elif len(after) > 1:
> + yield match.start(2), "E222 multiple spaces after operator"
> +
> +
> +def missing_whitespace_around_operator(logical_line, tokens):
> + r"""Surround operators with a single space on either side.
> +
> + - Always surround these binary operators with a single space on
> + either side: assignment (=), augmented assignment (+=, -= etc.),
> + comparisons (==, <, >, !=, <=, >=, in, not in, is, is not),
> + Booleans (and, or, not).
> +
> + - If operators with different priorities are used, consider adding
> + whitespace around the operators with the lowest priorities.
> +
> + Okay: i = i + 1
> + Okay: submitted += 1
> + Okay: x = x * 2 - 1
> + Okay: hypot2 = x * x + y * y
> + Okay: c = (a + b) * (a - b)
> + Okay: foo(bar, key='word', *args, **kwargs)
> + Okay: alpha[:-i]
> +
> + E225: i=i+1
> + E225: submitted +=1
> + E225: x = x /2 - 1
> + E225: z = x **y
> + E226: c = (a+b) * (a-b)
> + E226: hypot2 = x*x + y*y
> + E227: c = a|b
> + E228: msg = fmt%(errno, errmsg)
> + """
> + parens = 0
> + need_space = False
> + prev_type = tokenize.OP
> + prev_text = prev_end = None
> + for token_type, text, start, end, line in tokens:
> + if token_type in SKIP_COMMENTS:
> + continue
> + if text in ('(', 'lambda'):
> + parens += 1
> + elif text == ')':
> + parens -= 1
> + if need_space:
> + if start != prev_end:
> + # Found a (probably) needed space
> + if need_space is not True and not need_space[1]:
> + yield (need_space[0],
> + "E225 missing whitespace around operator")
> + need_space = False
> + elif text == '>' and prev_text in ('<', '-'):
> + # Tolerate the "<>" operator, even if running Python 3
> + # Deal with Python 3's annotated return value "->"
> + pass
> + else:
> + if need_space is True or need_space[1]:
> + # A needed trailing space was not found
> + yield prev_end, "E225 missing whitespace around operator"
> + elif prev_text != '**':
> + code, optype = 'E226', 'arithmetic'
> + if prev_text == '%':
> + code, optype = 'E228', 'modulo'
> + elif prev_text not in ARITHMETIC_OP:
> + code, optype = 'E227', 'bitwise or shift'
> + yield (need_space[0], "%s missing whitespace "
> + "around %s operator" % (code, optype))
> + need_space = False
> + elif token_type == tokenize.OP and prev_end is not None:
> + if text == '=' and parens:
> + # Allow keyword args or defaults: foo(bar=None).
> + pass
> + elif text in WS_NEEDED_OPERATORS:
> + need_space = True
> + elif text in UNARY_OPERATORS:
> + # Check if the operator is being used as a binary operator
> + # Allow unary operators: -123, -x, +1.
> + # Allow argument unpacking: foo(*args, **kwargs).
> + if (prev_text in '}])' if prev_type == tokenize.OP
> + else prev_text not in KEYWORDS):
> + need_space = None
> + elif text in WS_OPTIONAL_OPERATORS:
> + need_space = None
> +
> + if need_space is None:
> + # Surrounding space is optional, but ensure that
> + # trailing space matches opening space
> + need_space = (prev_end, start != prev_end)
> + elif need_space and start == prev_end:
> + # A needed opening space was not found
> + yield prev_end, "E225 missing whitespace around operator"
> + need_space = False
> + prev_type = token_type
> + prev_text = text
> + prev_end = end
> +
> +
> +def whitespace_around_comma(logical_line):
> + r"""Avoid extraneous whitespace after a comma or a colon.
> +
> + Note: these checks are disabled by default
> +
> + Okay: a = (1, 2)
> + E241: a = (1,  2)
> + E242: a = (1,\t2)
> + """
> + line = logical_line
> + for m in WHITESPACE_AFTER_COMMA_REGEX.finditer(line):
> + found = m.start() + 1
> + if '\t' in m.group():
> + yield found, "E242 tab after '%s'" % m.group()[0]
> + else:
> + yield found, "E241 multiple spaces after '%s'" % m.group()[0]
> +
> +
> +def whitespace_around_named_parameter_equals(logical_line, tokens):
> + r"""Don't use spaces around the '=' sign in function arguments.
> +
> + Don't use spaces around the '=' sign when used to indicate a
> + keyword argument or a default parameter value.
> +
> + Okay: def complex(real, imag=0.0):
> + Okay: return magic(r=real, i=imag)
> + Okay: boolean(a == b)
> + Okay: boolean(a != b)
> + Okay: boolean(a <= b)
> + Okay: boolean(a >= b)
> + Okay: def foo(arg: int = 42):
> +
> + E251: def complex(real, imag = 0.0):
> + E251: return magic(r = real, i = imag)
> + """
> + parens = 0
> + no_space = False
> + prev_end = None
> + annotated_func_arg = False
> + in_def = logical_line.startswith('def')
> + message = "E251 unexpected spaces around keyword / parameter equals"
> + for token_type, text, start, end, line in tokens:
> + if token_type == tokenize.NL:
> + continue
> + if no_space:
> + no_space = False
> + if start != prev_end:
> + yield (prev_end, message)
> + if token_type == tokenize.OP:
> + if text == '(':
> + parens += 1
> + elif text == ')':
> + parens -= 1
> + elif in_def and text == ':' and parens == 1:
> + annotated_func_arg = True
> + elif parens and text == ',' and parens == 1:
> + annotated_func_arg = False
> + elif parens and text == '=' and not annotated_func_arg:
> + no_space = True
> + if start != prev_end:
> + yield (prev_end, message)
> + if not parens:
> + annotated_func_arg = False
> +
> + prev_end = end
> +
> +
> +def whitespace_before_comment(logical_line, tokens):
> + r"""Separate inline comments by at least two spaces.
> +
> + An inline comment is a comment on the same line as a statement. Inline
> + comments should be separated by at least two spaces from the statement.
> + They should start with a # and a single space.
> +
> + Each line of a block comment starts with a # and a single space
> + (unless it is indented text inside the comment).
> +
> + Okay: x = x + 1  # Increment x
> + Okay: x = x + 1    # Increment x
> + Okay: # Block comment
> + E261: x = x + 1 # Increment x
> + E262: x = x + 1  #Increment x
> + E262: x = x + 1  #  Increment x
> + E265: #Block comment
> + E266: ### Block comment
> + """
> + prev_end = (0, 0)
> + for token_type, text, start, end, line in tokens:
> + if token_type == tokenize.COMMENT:
> + inline_comment = line[:start[1]].strip()
> + if inline_comment:
> + if prev_end[0] == start[0] and start[1] < prev_end[1] + 2:
> + yield (prev_end,
> + "E261 at least two spaces before inline comment")
> + symbol, sp, comment = text.partition(' ')
> + bad_prefix = symbol not in '#:' and (symbol.lstrip('#')[:1] or '#')
> + if inline_comment:
> + if bad_prefix or comment[:1] in WHITESPACE:
> + yield start, "E262 inline comment should start with '# '"
> + elif bad_prefix and (bad_prefix != '!' or start[0] > 1):
> + if bad_prefix != '#':
> + yield start, "E265 block comment should start with '# '"
> + elif comment:
> + yield start, "E266 too many leading '#' for block comment"
> + elif token_type != tokenize.NL:
> + prev_end = end
> +
> +
> +def imports_on_separate_lines(logical_line):
> + r"""Imports should usually be on separate lines.
> +
> + Okay: import os\nimport sys
> + E401: import sys, os
> +
> + Okay: from subprocess import Popen, PIPE
> + Okay: from myclas import MyClass
> + Okay: from foo.bar.yourclass import YourClass
> + Okay: import myclass
> + Okay: import foo.bar.yourclass
> + """
> + line = logical_line
> + if line.startswith('import '):
> + found = line.find(',')
> + if -1 < found and ';' not in line[:found]:
> + yield found, "E401 multiple imports on one line"
> +
> +
> +def module_imports_on_top_of_file(
> + logical_line, indent_level, checker_state, noqa):
> + r"""Imports are always put at the top of the file, just after any module
> + comments and docstrings, and before module globals and constants.
> +
> + Okay: import os
> + Okay: # this is a comment\nimport os
> + Okay: '''this is a module docstring'''\nimport os
> + Okay: r'''this is a module docstring'''\nimport os
> + Okay: try:\n import x\nexcept:\n pass\nelse:\n pass\nimport y
> + Okay: try:\n import x\nexcept:\n pass\nfinally:\n pass\nimport y
> + E402: a=1\nimport os
> + E402: 'One string'\n"Two string"\nimport os
> + E402: a=1\nfrom sys import x
> +
> + Okay: if x:\n import os
> + """
> + def is_string_literal(line):
> + if line[0] in 'uUbB':
> + line = line[1:]
> + if line and line[0] in 'rR':
> + line = line[1:]
> + return line and (line[0] == '"' or line[0] == "'")
> +
> + allowed_try_keywords = ('try', 'except', 'else', 'finally')
> +
> + if indent_level: # Allow imports in conditional statements or functions
> + return
> + if not logical_line: # Allow empty lines or comments
> + return
> + if noqa:
> + return
> + line = logical_line
> + if line.startswith('import ') or line.startswith('from '):
> + if checker_state.get('seen_non_imports', False):
> + yield 0, "E402 module level import not at top of file"
> + elif any(line.startswith(kw) for kw in allowed_try_keywords):
> + # Allow try, except, else, finally keywords intermixed with imports in
> + # order to support conditional importing
> + return
> + elif is_string_literal(line):
> + # The first literal is a docstring, allow it. Otherwise, report error.
> + if checker_state.get('seen_docstring', False):
> + checker_state['seen_non_imports'] = True
> + else:
> + checker_state['seen_docstring'] = True
> + else:
> + checker_state['seen_non_imports'] = True
> +
> +
> +def compound_statements(logical_line):
> + r"""Compound statements (on the same line) are generally discouraged.
> +
> + While sometimes it's okay to put an if/for/while with a small body
> + on the same line, never do this for multi-clause statements.
> + Also avoid folding such long lines!
> +
> + Always use a def statement instead of an assignment statement that
> + binds a lambda expression directly to a name.
> +
> + Okay: if foo == 'blah':\n do_blah_thing()
> + Okay: do_one()
> + Okay: do_two()
> + Okay: do_three()
> +
> + E701: if foo == 'blah': do_blah_thing()
> + E701: for x in lst: total += x
> + E701: while t < 10: t = delay()
> + E701: if foo == 'blah': do_blah_thing()
> + E701: else: do_non_blah_thing()
> + E701: try: something()
> + E701: finally: cleanup()
> + E701: if foo == 'blah': one(); two(); three()
> + E702: do_one(); do_two(); do_three()
> + E703: do_four(); # useless semicolon
> + E704: def f(x): return 2*x
> + E731: f = lambda x: 2*x
> + """
> + line = logical_line
> + last_char = len(line) - 1
> + found = line.find(':')
> + while -1 < found < last_char:
> + before = line[:found]
> + if ((before.count('{') <= before.count('}') and # {'a': 1} (dict)
> + before.count('[') <= before.count(']') and # [1:2] (slice)
> + before.count('(') <= before.count(')'))): # (annotation)
> + lambda_kw = LAMBDA_REGEX.search(before)
> + if lambda_kw:
> + before = line[:lambda_kw.start()].rstrip()
> + if before[-1:] == '=' and isidentifier(before[:-1].strip()):
> + yield 0, ("E731 do not assign a lambda expression, use a "
> + "def")
> + break
> + if before.startswith('def '):
> + yield 0, "E704 multiple statements on one line (def)"
> + else:
> + yield found, "E701 multiple statements on one line (colon)"
> + found = line.find(':', found + 1)
> + found = line.find(';')
> + while -1 < found:
> + if found < last_char:
> + yield found, "E702 multiple statements on one line (semicolon)"
> + else:
> + yield found, "E703 statement ends with a semicolon"
> + found = line.find(';', found + 1)
> +
> +
> +def explicit_line_join(logical_line, tokens):
> + r"""Avoid explicit line join between brackets.
> +
> + The preferred way of wrapping long lines is by using Python's implied line
> + continuation inside parentheses, brackets and braces. Long lines can be
> + broken over multiple lines by wrapping expressions in parentheses. These
> + should be used in preference to using a backslash for line continuation.
> +
> + E502: aaa = [123, \\n 123]
> + E502: aaa = ("bbb " \\n "ccc")
> +
> + Okay: aaa = [123,\n 123]
> + Okay: aaa = ("bbb "\n "ccc")
> + Okay: aaa = "bbb " \\n "ccc"
> + Okay: aaa = 123 # \\
> + """
> + prev_start = prev_end = parens = 0
> + comment = False
> + backslash = None
> + for token_type, text, start, end, line in tokens:
> + if token_type == tokenize.COMMENT:
> + comment = True
> + if start[0] != prev_start and parens and backslash and not comment:
> + yield backslash, "E502 the backslash is redundant between brackets"
> + if end[0] != prev_end:
> + if line.rstrip('\r\n').endswith('\\'):
> + backslash = (end[0], len(line.splitlines()[-1]) - 1)
> + else:
> + backslash = None
> + prev_start = prev_end = end[0]
> + else:
> + prev_start = start[0]
> + if token_type == tokenize.OP:
> + if text in '([{':
> + parens += 1
> + elif text in ')]}':
> + parens -= 1
> +
> +
> +def break_around_binary_operator(logical_line, tokens):
> + r"""
> + Avoid breaks before binary operators.
> +
> + The preferred place to break around a binary operator is after the
> + operator, not before it.
> +
> + W503: (width == 0\n + height == 0)
> + W503: (width == 0\n and height == 0)
> +
> + Okay: (width == 0 +\n height == 0)
> + Okay: foo(\n -x)
> + Okay: foo(x\n [])
> + Okay: x = '''\n''' + ''
> + Okay: foo(x,\n -y)
> + Okay: foo(x, # comment\n -y)
> + """
> + def is_binary_operator(token_type, text):
> + # The % character is strictly speaking a binary operator, but the
> + # common usage seems to be to put it next to the format parameters,
> + # after a line break.
> + return ((token_type == tokenize.OP or text in ['and', 'or']) and
> + text not in "()[]{},:.;@=%")
> +
> + line_break = False
> + unary_context = True
> + for token_type, text, start, end, line in tokens:
> + if token_type == tokenize.COMMENT:
> + continue
> + if ('\n' in text or '\r' in text) and token_type != tokenize.STRING:
> + line_break = True
> + else:
> + if (is_binary_operator(token_type, text) and line_break and
> + not unary_context):
> + yield start, "W503 line break before binary operator"
> + unary_context = text in '([{,;'
> + line_break = False
> +
> +
> +def comparison_to_singleton(logical_line, noqa):
> + r"""Comparison to singletons should use "is" or "is not".
> +
> + Comparisons to singletons like None should always be done
> + with "is" or "is not", never the equality operators.
> +
> + Okay: if arg is not None:
> + E711: if arg != None:
> + E711: if None == arg:
> + E712: if arg == True:
> + E712: if False == arg:
> +
> + Also, beware of writing if x when you really mean if x is not None --
> + e.g. when testing whether a variable or argument that defaults to None was
> + set to some other value. The other value might have a type (such as a
> + container) that could be false in a boolean context!
> + """
> + match = not noqa and COMPARE_SINGLETON_REGEX.search(logical_line)
> + if match:
> + singleton = match.group(1) or match.group(3)
> + same = (match.group(2) == '==')
> +
> + msg = "'if cond is %s:'" % (('' if same else 'not ') + singleton)
> + if singleton in ('None',):
> + code = 'E711'
> + else:
> + code = 'E712'
> + nonzero = ((singleton == 'True' and same) or
> + (singleton == 'False' and not same))
> + msg += " or 'if %scond:'" % ('' if nonzero else 'not ')
> + yield match.start(2), ("%s comparison to %s should be %s" %
> + (code, singleton, msg))
> +
> +
> +def comparison_negative(logical_line):
> + r"""Negative comparison should be done using "not in" and "is not".
> +
> + Okay: if x not in y:\n pass
> + Okay: assert (X in Y or X is Z)
> + Okay: if not (X in Y):\n pass
> + Okay: zz = x is not y
> + E713: Z = not X in Y
> + E713: if not X.B in Y:\n pass
> + E714: if not X is Y:\n pass
> + E714: Z = not X.B is Y
> + """
> + match = COMPARE_NEGATIVE_REGEX.search(logical_line)
> + if match:
> + pos = match.start(1)
> + if match.group(2) == 'in':
> + yield pos, "E713 test for membership should be 'not in'"
> + else:
> + yield pos, "E714 test for object identity should be 'is not'"
> +
> +
> +def comparison_type(logical_line, noqa):
> + r"""Object type comparisons should always use isinstance().
> +
> + Do not compare types directly.
> +
> + Okay: if isinstance(obj, int):
> + E721: if type(obj) is type(1):
> +
> + When checking if an object is a string, keep in mind that it might be a
> + unicode string too! In Python 2.3, str and unicode have a common base
> + class, basestring, so you can do:
> +
> + Okay: if isinstance(obj, basestring):
> + Okay: if type(a1) is type(b1):
> + """
> + match = COMPARE_TYPE_REGEX.search(logical_line)
> + if match and not noqa:
> + inst = match.group(1)
> + if inst and isidentifier(inst) and inst not in SINGLETONS:
> + return # Allow comparison for types which are not obvious
> + yield match.start(), "E721 do not compare types, use 'isinstance()'"
> +
> +
> + def python_3000_has_key(logical_line, noqa):
> + r"""The {}.has_key() method is removed in Python 3: use the 'in' operator.
> +
> + Okay: if "alph" in d:\n print d["alph"]
> + W601: assert d.has_key('alph')
> + """
> + pos = logical_line.find('.has_key(')
> + if pos > -1 and not noqa:
> + yield pos, "W601 .has_key() is deprecated, use 'in'"
> +
> +
> +def python_3000_raise_comma(logical_line):
> + r"""When raising an exception, use "raise ValueError('message')".
> +
> + The older form is removed in Python 3.
> +
> + Okay: raise DummyError("Message")
> + W602: raise DummyError, "Message"
> + """
> + match = RAISE_COMMA_REGEX.match(logical_line)
> + if match and not RERAISE_COMMA_REGEX.match(logical_line):
> + yield match.end() - 1, "W602 deprecated form of raising exception"
> +
> +
> +def python_3000_not_equal(logical_line):
> + r"""New code should always use != instead of <>.
> +
> + The older syntax is removed in Python 3.
> +
> + Okay: if a != 'no':
> + W603: if a <> 'no':
> + """
> + pos = logical_line.find('<>')
> + if pos > -1:
> + yield pos, "W603 '<>' is deprecated, use '!='"
> +
> +
> +def python_3000_backticks(logical_line):
> + r"""Backticks are removed in Python 3: use repr() instead.
> +
> + Okay: val = repr(1 + 2)
> + W604: val = `1 + 2`
> + """
> + pos = logical_line.find('`')
> + if pos > -1:
> + yield pos, "W604 backticks are deprecated, use 'repr()'"
> +
> +
> +##############################################################################
> +# Helper functions
> +##############################################################################
> +
> +
> +if sys.version_info < (3,):
> + # Python 2: implicit encoding.
> + def readlines(filename):
> + """Read the source code."""
> + with open(filename, 'rU') as f:
> + return f.readlines()
> + isidentifier = re.compile(r'[a-zA-Z_]\w*$').match
> + stdin_get_value = sys.stdin.read
> +else:
> + # Python 3
> + def readlines(filename):
> + """Read the source code."""
> + try:
> + with open(filename, 'rb') as f:
> + (coding, lines) = tokenize.detect_encoding(f.readline)
> + f = TextIOWrapper(f, coding, line_buffering=True)
> + return [l.decode(coding) for l in lines] + f.readlines()
> + except (LookupError, SyntaxError, UnicodeError):
> + # Fall back if file encoding is improperly declared
> + with open(filename, encoding='latin-1') as f:
> + return f.readlines()
> + isidentifier = str.isidentifier
> +
> + def stdin_get_value():
> + return TextIOWrapper(sys.stdin.buffer, errors='ignore').read()
> +noqa = re.compile(r'# no(?:qa|pep8)\b', re.I).search
> +
> +
> +def expand_indent(line):
> + r"""Return the amount of indentation.
> +
> + Tabs are expanded to the next multiple of 8.
> +
> + >>> expand_indent('    ')
> + 4
> + >>> expand_indent('\t')
> + 8
> + >>> expand_indent('       \t')
> + 8
> + >>> expand_indent('        \t')
> + 16
> + """
> + if '\t' not in line:
> + return len(line) - len(line.lstrip())
> + result = 0
> + for char in line:
> + if char == '\t':
> + result = result // 8 * 8 + 8
> + elif char == ' ':
> + result += 1
> + else:
> + break
> + return result
> +
> +
> +def mute_string(text):
> + """Replace contents with 'xxx' to prevent syntax matching.
> +
> + >>> mute_string('"abc"')
> + '"xxx"'
> + >>> mute_string("'''abc'''")
> + "'''xxx'''"
> + >>> mute_string("r'abc'")
> + "r'xxx'"
> + """
> + # String modifiers (e.g. u or r)
> + start = text.index(text[-1]) + 1
> + end = len(text) - 1
> + # Triple quotes
> + if text[-3:] in ('"""', "'''"):
> + start += 2
> + end -= 2
> + return text[:start] + 'x' * (end - start) + text[end:]
> +
> +
> +def parse_udiff(diff, patterns=None, parent='.'):
> + """Return a dictionary of matching lines."""
> + # For each file of the diff, the entry key is the filename,
> + # and the value is a set of row numbers to consider.
> + rv = {}
> + path = nrows = None
> + for line in diff.splitlines():
> + if nrows:
> + if line[:1] != '-':
> + nrows -= 1
> + continue
> + if line[:3] == '@@ ':
> + hunk_match = HUNK_REGEX.match(line)
> + (row, nrows) = [int(g or '1') for g in hunk_match.groups()]
> + rv[path].update(range(row, row + nrows))
> + elif line[:3] == '+++':
> + path = line[4:].split('\t', 1)[0]
> + if path[:2] == 'b/':
> + path = path[2:]
> + rv[path] = set()
> + return dict([(os.path.join(parent, path), rows)
> + for (path, rows) in rv.items()
> + if rows and filename_match(path, patterns)])
> +
> +
> +def normalize_paths(value, parent=os.curdir):
> + """Parse a comma-separated list of paths.
> +
> + Return a list of absolute paths.
> + """
> + if not value:
> + return []
> + if isinstance(value, list):
> + return value
> + paths = []
> + for path in value.split(','):
> + path = path.strip()
> + if '/' in path:
> + path = os.path.abspath(os.path.join(parent, path))
> + paths.append(path.rstrip('/'))
> + return paths
> +
> +
> +def filename_match(filename, patterns, default=True):
> + """Check if patterns contains a pattern that matches filename.
> +
> + If patterns is unspecified, this always returns True.
> + """
> + if not patterns:
> + return default
> + return any(fnmatch(filename, pattern) for pattern in patterns)
> +
> +
> +def _is_eol_token(token):
> + return token[0] in NEWLINE or token[4][token[3][1]:].lstrip() == '\\\n'
> + def _is_eol_token(token, _eol_token=_is_eol_token):
> + return _eol_token(token) or (token[0] == tokenize.COMMENT and
> + token[1] == token[4])
> +
> +##############################################################################
> +# Framework to run all checks
> +##############################################################################
> +
> +
> +_checks = {'physical_line': {}, 'logical_line': {}, 'tree': {}}
> +
> +
> +def _get_parameters(function):
> + if sys.version_info >= (3, 3):
> + return [parameter.name
> + for parameter
> + in inspect.signature(function).parameters.values()
> + if parameter.kind == parameter.POSITIONAL_OR_KEYWORD]
> + else:
> + return inspect.getargspec(function)[0]
> +
> +
> +def register_check(check, codes=None):
> + """Register a new check object."""
> + def _add_check(check, kind, codes, args):
> + if check in _checks[kind]:
> + _checks[kind][check][0].extend(codes or [])
> + else:
> + _checks[kind][check] = (codes or [''], args)
> + if inspect.isfunction(check):
> + args = _get_parameters(check)
> + if args and args[0] in ('physical_line', 'logical_line'):
> + if codes is None:
> + codes = ERRORCODE_REGEX.findall(check.__doc__ or '')
> + _add_check(check, args[0], codes, args)
> + elif inspect.isclass(check):
> + if _get_parameters(check.__init__)[:2] == ['self', 'tree']:
> + _add_check(check, 'tree', codes, None)
> +
> +
> +def init_checks_registry():
> + """Register all globally visible functions.
> +
> + The first argument name is either 'physical_line' or 'logical_line'.
> + """
> + mod = inspect.getmodule(register_check)
> + for (name, function) in inspect.getmembers(mod, inspect.isfunction):
> + register_check(function)
> +init_checks_registry()
> +
> +
> +class Checker(object):
> + """Load a Python source file, tokenize it, check coding style."""
> +
> + def __init__(self, filename=None, lines=None,
> + options=None, report=None, **kwargs):
> + if options is None:
> + options = StyleGuide(kwargs).options
> + else:
> + assert not kwargs
> + self._io_error = None
> + self._physical_checks = options.physical_checks
> + self._logical_checks = options.logical_checks
> + self._ast_checks = options.ast_checks
> + self.max_line_length = options.max_line_length
> + self.multiline = False # in a multiline string?
> + self.hang_closing = options.hang_closing
> + self.verbose = options.verbose
> + self.filename = filename
> + # Dictionary where a checker can store its custom state.
> + self._checker_states = {}
> + if filename is None:
> + self.filename = 'stdin'
> + self.lines = lines or []
> + elif filename == '-':
> + self.filename = 'stdin'
> + self.lines = stdin_get_value().splitlines(True)
> + elif lines is None:
> + try:
> + self.lines = readlines(filename)
> + except IOError:
> + (exc_type, exc) = sys.exc_info()[:2]
> + self._io_error = '%s: %s' % (exc_type.__name__, exc)
> + self.lines = []
> + else:
> + self.lines = lines
> + if self.lines:
> + ord0 = ord(self.lines[0][0])
> + if ord0 in (0xef, 0xfeff): # Strip the UTF-8 BOM
> + if ord0 == 0xfeff:
> + self.lines[0] = self.lines[0][1:]
> + elif self.lines[0][:3] == '\xef\xbb\xbf':
> + self.lines[0] = self.lines[0][3:]
> + self.report = report or options.report
> + self.report_error = self.report.error
> +
> + def report_invalid_syntax(self):
> + """Check if the syntax is valid."""
> + (exc_type, exc) = sys.exc_info()[:2]
> + if len(exc.args) > 1:
> + offset = exc.args[1]
> + if len(offset) > 2:
> + offset = offset[1:3]
> + else:
> + offset = (1, 0)
> + self.report_error(offset[0], offset[1] or 0,
> + 'E901 %s: %s' % (exc_type.__name__, exc.args[0]),
> + self.report_invalid_syntax)
> +
> + def readline(self):
> + """Get the next line from the input buffer."""
> + if self.line_number >= self.total_lines:
> + return ''
> + line = self.lines[self.line_number]
> + self.line_number += 1
> + if self.indent_char is None and line[:1] in WHITESPACE:
> + self.indent_char = line[0]
> + return line
> +
> + def run_check(self, check, argument_names):
> + """Run a check plugin."""
> + arguments = []
> + for name in argument_names:
> + arguments.append(getattr(self, name))
> + return check(*arguments)
> +
> + def init_checker_state(self, name, argument_names):
> + """ Prepares a custom state for the specific checker plugin."""
> + if 'checker_state' in argument_names:
> + self.checker_state = self._checker_states.setdefault(name, {})
> +
> + def check_physical(self, line):
> + """Run all physical checks on a raw input line."""
> + self.physical_line = line
> + for name, check, argument_names in self._physical_checks:
> + self.init_checker_state(name, argument_names)
> + result = self.run_check(check, argument_names)
> + if result is not None:
> + (offset, text) = result
> + self.report_error(self.line_number, offset, text, check)
> + if text[:4] == 'E101':
> + self.indent_char = line[0]
> +
> + def build_tokens_line(self):
> + """Build a logical line from tokens."""
> + logical = []
> + comments = []
> + length = 0
> + prev_row = prev_col = mapping = None
> + for token_type, text, start, end, line in self.tokens:
> + if token_type in SKIP_TOKENS:
> + continue
> + if not mapping:
> + mapping = [(0, start)]
> + if token_type == tokenize.COMMENT:
> + comments.append(text)
> + continue
> + if token_type == tokenize.STRING:
> + text = mute_string(text)
> + if prev_row:
> + (start_row, start_col) = start
> + if prev_row != start_row: # different row
> + prev_text = self.lines[prev_row - 1][prev_col - 1]
> + if prev_text == ',' or (prev_text not in '{[(' and
> + text not in '}])'):
> + text = ' ' + text
> + elif prev_col != start_col: # different column
> + text = line[prev_col:start_col] + text
> + logical.append(text)
> + length += len(text)
> + mapping.append((length, end))
> + (prev_row, prev_col) = end
> + self.logical_line = ''.join(logical)
> + self.noqa = comments and noqa(''.join(comments))
> + return mapping
> +
> + def check_logical(self):
> + """Build a line from tokens and run all logical checks on it."""
> +
> + mapping = self.build_tokens_line()
> +
> + if not mapping:
> + return
> +
> + (start_row, start_col) = mapping[0][1]
> + start_line = self.lines[start_row - 1]
> + self.indent_level = expand_indent(start_line[:start_col])
> + if self.blank_before < self.blank_lines:
> + self.blank_before = self.blank_lines
> + if self.verbose >= 2:
> + print(self.logical_line[:80].rstrip())
> + for name, check, argument_names in self._logical_checks:
> + if self.verbose >= 4:
> + print(' ' + name)
> + self.init_checker_state(name, argument_names)
> + for offset, text in self.run_check(check, argument_names) or ():
> + if not isinstance(offset, tuple):
> + for token_offset, pos in mapping:
> + if offset <= token_offset:
> + break
> + offset = (pos[0], pos[1] + offset - token_offset)
> + self.report_error(offset[0], offset[1], text, check)
> + if self.logical_line:
> + self.previous_indent_level = self.indent_level
> + self.previous_logical = self.logical_line
> + self.blank_lines = 0
> + self.tokens = []
> +
> + def check_ast(self):
> + """Build the file's AST and run all AST checks."""
> + try:
> + tree = compile(''.join(self.lines), '', 'exec', PyCF_ONLY_AST)
> + except (ValueError, SyntaxError, TypeError):
> + return self.report_invalid_syntax()
> + for name, cls, __ in self._a
ACTIVITY SUMMARY (2016-03-25 - 2016-04-01)
Python tracker at
To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.
Issues counts and deltas:
open 5471 (+10)
closed 32971 (+33)
total 38442 (+43)
Open issues with patches: 2379
Issues opened (32)
#26643: regrtest: rework libregrtest.save_env submodule opened by haypo
#26646: Allow built-in module in package opened by Daniel Shaulov
#26647: ceval: use Wordcode, 16-bit bytecode opened by Demur Rumed
#26648: csv.reader Error message indicates to use deprecated opened by Philip Martin
#26650: calendar: OverflowErrors for year == 1 and firstweekday > 0 opened by mjpieters
#26651: Deprecate register_adapter() and register_converter() in sqlit opened by berker.peksag
#26652: Cannot install Python 2.7.11 on Windows Server 2008 R2 opened by Hung-Hsuan Chen
#26654: asyncio is not inspecting keyword arguments of functools.parti opened by iceboy
#26656: Documentation for re.compile is a bit outdated opened by Sworddragon
#26657: Directory traversal with http.server and SimpleHTTPServer on w opened by Thomas
#26658: test_os fails when run on Windows ramdisk opened by jkloth
#26659: slice() leaks memory when part of a cycle opened by Kevin Modzelewski
#26660: tempfile.TemporaryDirectory() cleanup exception on Windows if opened by Laurent.Mazuel
#26661: python fails to locate system libffi opened by rkuska
#26662: configure/Makefile doesn't check if "python" command works, ne opened by haypo
#26663: asyncio _UnixWritePipeTransport._close abandons unflushed writ opened by Robert Smallshire
#26664: find a bug in of venv of cpython3.6 opened by 鄭景文
#26665: pip is not bootstrapped by default on 2.7 opened by Axel
#26666: File object hook to modify select(ors) event mask opened by zwol
#26667: Update importlib to accept pathlib.Path objects opened by brett.cannon
#26668: Remove Lib/test/test_importlib/ opened by haypo
#26669: time.localtime(float("NaN")) does not raise a ValueError on al opened by gregory.p.smith
#26671: Clean up path_converter in posixmodule.c opened by serhiy.storchaka
#26672: regrtest missing in the module name opened by Axel
#26673: Tkinter error when opening IDLE configuration menu opened by wysaard
#26677: pyvenv: breaks $PATH for bash scripts opened by Florian.Dold
#26678: Incorrect linking to elements in datetime package opened by andymaier
#26679: curses: Descripton of KEY_NPAGE and KEY_PPAGE inverted opened by Robert Bachmann
#26680: Incorporating float.is_integer into the numeric tower and Deci opened by Robert Smallshire2
#26682: Ttk Notebook tabs do not show with 1-2 char names opened by terry.reedy
#26683: Questionable terminology for describing what locals() does opened by rhettinger
#26685: Raise errors from socket.close() opened by martin.panter
Most recent 15 issues with no replies (15)
#26677: pyvenv: breaks $PATH for bash scripts
#26672: regrtest missing in the module name
#26669: time.localtime(float("NaN")) does not raise a ValueError on al
#26667: Update importlib to accept pathlib.Path objects
#26665: pip is not bootstrapped by default on 2.7
#26663: asyncio _UnixWritePipeTransport._close abandons unflushed writ
#26661: python fails to locate system libffi
#26660: tempfile.TemporaryDirectory() cleanup exception on Windows if
#26656: Documentation for re.compile is a bit outdated
#26652: Cannot install Python 2.7.11 on Windows Server 2008 R2
#26626: test_dbm_gnu
#26618: _overlapped extension module of asyncio uses deprecated WSAStr
#26615: Missing entry in WRAPPER_ASSIGNMENTS in update_wrapper's doc
#26609: Wrong request target in
#26600: MagickMock __str__ sometimes returns MagickMock instead of str
Most recent 15 issues waiting for review (15)
#26685: Raise errors from socket.close()
#26680: Incorporating float.is_integer into the numeric tower and Deci
#26679: curses: Descripton of KEY_NPAGE and KEY_PPAGE inverted
#26671: Clean up path_converter in posixmodule.c
#26661: python fails to locate system libffi
#26658: test_os fails when run on Windows ramdisk
#26657: Directory traversal with http.server and SimpleHTTPServer on w
#26651: Deprecate register_adapter() and register_converter() in sqlit
#26650: calendar: OverflowErrors for year == 1 and firstweekday > 0
#26648: csv.reader Error message indicates to use deprecated
#26647: ceval: use Wordcode, 16-bit bytecode
#26646: Allow built-in module in package
#26643: regrtest: rework libregrtest.save_env submodule
#26642: Replace stdout and stderr with simple standard printers at Pyt
#26639: Tools/i18n/ replace deprecated imp module with im
Top 10 most discussed issues (10)
#26488: hashlib command line interface 15 msgs
#26647: ceval: use Wordcode, 16-bit bytecode 15 msgs
#26624: Windows hangs in call to CRT setlocale() 10 msgs
#18844: allow weights in random.choice 8 msgs
#26632: __all__ decorator 6 msgs
#26658: test_os fails when run on Windows ramdisk 6 msgs
#26680: Incorporating float.is_integer into the numeric tower and Deci 6 msgs
#23551: IDLE to provide menu link to PIP gui. 5 msgs
#23735: Readline not adjusting width after resize with 6.3 5 msgs
#26606: logging.baseConfig is missing the encoding parameter 5 msgs
Issues closed (30)
#15117: Please document top-level sqlite3 module variables closed by berker.peksag
#18691: sqlite3.Cursor.execute expects sequence as second argument. closed by berker.peksag
#19065: sqlite3 timestamp adapter chokes on timezones closed by berker.peksag
#22218: Fix more compiler warnings "comparison between signed and unsi closed by haypo
#22854: Documentation/implementation out of sync for IO closed by martin.panter
#23758: Improve documenation about num_params in sqlite3 create_functi closed by berker.peksag
#23804: SSLSocket.recv(0) receives up to 1024 bytes closed by martin.panter
#25195: mock.ANY doesn't match mock.MagicMock() object closed by berker.peksag
#25256: Add sys.debug_build public variable to check if Python was com closed by haypo
#25276: Intermittent segfaults on PPC64 AIX 3.x closed by haypo
#25289: test_strptime hangs sometimes on AMD64 Windows7 SP1 3.x buildb closed by haypo
#25940: SSL tests failed due to expired SSL certificate closed by martin.panter
#26130: redundant local copy of a char pointer in classify in Parser\p closed by berker.peksag
#26492: Exhausted array iterator should left exhausted closed by serhiy.storchaka
#26494: Double deallocation on iterator exhausting closed by serhiy.storchaka
#26591: datetime datetime.time to datetime.time comparison does nothin closed by belopolsky
#26616: A bug in datetime.astimezone() method closed by belopolsky
#26640: xmlrpc.server imports xmlrpc.client closed by brett.cannon
#26641: doctest doesn't support packages closed by haypo
#26644: SSLSocket.recv(-1) triggers SystemError closed by martin.panter
#26645: argparse prints help messages to stdout instead of stderr by d closed by serhiy.storchaka
#26649: Fail update installation: 'utf-8' codec can't decode closed by haypo
#26653: bisect raises a TypeError when hi is None closed by rhettinger
#26655: pathlib glob case sensitivity issue on Windows closed by SilentGhost
#26670: Add a developer mode: -X dev command line option closed by haypo
#26674: 【typo】 Japanese Documentation closed by ezio.melotti
#26675: Appending to a large list flushes old entries closed by Swaprava Nath
#26676: Add missing XMLPullParser to ElementTree.__all__ closed by martin.panter
#26681: decorators for attributes closed by ethan.furman
#26684: pathlib.Path.with_name() and .with_suffix do not allow combini closed by ethan.furman

April 1, 2016
Python's exception handling system is currently badly brokeTypeError:
unsupported operand type(s) for +: 'NoneType' and 'NoneType'n. Therefore,
with the recent news of the joyous release of Python 8 (, I
have decided to propose a revolutionary idea: safe mock objects.
A "safe" mock object (qualified name
Java-style naming was adopted for readability purposes; comments are now no
longer necessary) is a magic object that supports everything and returns
itself. Since examples speak more words than are in the Python source code,
here are some (examples, not words in the Python source code):
a = 1
b = None
c = a + b # Returns a
print(c) # Prints the empty string.
d = c+1 # All operations on
return a new one.
e =, 2, 3) # `e` is now a
def f():
assert 0 # Causes the function to return a
raise 123 # Does the same thing.
print(L) # L is undefined, so it becomes a
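The behaviour described above can be roughly approximated in today's Python for objects you create yourself. The following is only a sketch: the class name `SafeMock` is hypothetical (the qualified name in the original message did not survive), it returns the same object rather than a new one, and it cannot make `None + None` or undefined names behave this way, since that would need interpreter changes.

```python
class SafeMock:
    """Hypothetical stand-in: supports (almost) everything, returns itself."""

    def __getattr__(self, name):
        # Only called for attributes that normal lookup does not find.
        return self

    def __call__(self, *args, **kwargs):
        # Calling the object with any arguments yields the object again.
        return self

    def _absorb(self, other):
        # Shared handler for binary operators and indexing.
        return self

    __add__ = __radd__ = __sub__ = __rsub__ = _absorb
    __mul__ = __rmul__ = __getitem__ = _absorb

    def __str__(self):
        # Printing a safe mock shows the empty string.
        return ""

    __repr__ = __str__


c = SafeMock()
d = c + 1            # arithmetic is absorbed
e = c.spam.eggs(1, 2, 3)  # attribute access and calls are absorbed too
print(c)             # prints an empty line
```

Only a handful of operators are covered here; a fuller version would have to define every special method it wants to absorb.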
Safe mock objects are obviously the Next Error Handling Revolution ™.
errors now simply disappear and return more
As for `try` and `catch` (protest the naming of `except`!!) statements,
they will
be completely ignored. The `try`, `except`, and `finally` bodies will all be
executed in sequence, except that printing and returning values with an
statement does nothing:
xyz = None.a # `xyz` becomes a
print(123) # Does nothing.
return None # Does nothing.
return xyz # Returns a
Aggressive error handling (as shown in PanicSort [])
that does destructive actions (such as `rm -rf /`) will always execute the
destructive code, encouraging more honest development.
In addition, due to errors simply being ignored, nothing can ever quite go
All discussions about a safe navigation operator can now be immediately
since any undefined attributes will simply return a
Although I have not yet destroy--I mean, improved CPython to allow for this
amazing idea, I have created a primitive implementation of the
`_frozensafemockobjectimplementation` module:
I hope you will all realize that this new idea is a drastic improvement
over current technologies and therefore support it, because we can Make
Python Great Again™.
[ERROR]: Your autotools build scripts are 200 lines longer than your
program. Something’s wrong.
Summary: There are two prospective Google Summer of Code (GSOC) students
applying to work on writing a gui interface to the basic pip functions
needed by beginners. I expect Google to accept their proposals. Before
I commit to mentoring a student (sometime in April), I would like to be
sure, by addressing any objections now, that I will be able to commit
the code when ready (August or before).
Long version:
In February 2015, Raymond Hettinger opened tracker issue
"IDLE to provide menu options for using PIP"
The menu options would presumably open dialog boxes defined in a new
module such as idlelib.pipgui. Raymond gave a list of 9 features he
thought would be useful to pip beginners.
Donald Stufft (pip maintainer) answered that he already wanted someone
to write a pip gui, to be put somewhere, and that he would give advice
on interfacing (which he has).
I answered that I had also had a vague idea of a pip gui, and thought it
should be a stand-alone window invoked by a single IDLE menu item, just
as turtledemo can be now. Instead of multiple dialogs (for multiple
IDLE menu items), there could be, for instance, multiple tabs in a
ttk.Notebook. Some pages might implement more than 1 of the features on
Raymond's list.
Last September, I did some proof-of-concept experiments and changed the
title to "IDLE to provide menu link to PIP gui". In January, when Terri
Oda requested Core Python GSOC project ideas, I suggested the pip gui
project. I believe Raymond's list can easily be programmed in the time
allotted. I also volunteered to help mentor.
Since then, two students have submitted competent prototypes (on the
tracker issue above) that show that they can write a basic tkinter app
and revise in response to reviews.
My current plan is to add idlelib/ (or perhaps to 3.5
and 3.6. The file will be structured so that it can either be run as a
separate process ('python -m idlelib.pipgui' either at a console or in a
subprocess call) or imported into a running process. IDLE would
currently use a subprocess call, but if IDLE is restructured into a
single-window, multi-tab application, it might switch to using an import.
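As an illustration of the two launch modes just described, here is a minimal sketch. It assumes the module name `idlelib.pipgui` from the proposal (which does not exist yet) and assumes the module would expose a `main()` function; the helper names are hypothetical and not part of IDLE.

```python
import importlib
import subprocess
import sys

# Module name taken from the proposal; hypothetical until the file is added.
PIPGUI_MODULE = "idlelib.pipgui"


def pipgui_command(module=PIPGUI_MODULE):
    """Return the argv list equivalent to 'python -m <module>'."""
    # sys.executable ensures the GUI runs under the same interpreter as IDLE.
    return [sys.executable, "-m", module]


def launch_in_subprocess(module=PIPGUI_MODULE):
    """Start the GUI as a separate process, as IDLE would do currently."""
    return subprocess.Popen(pipgui_command(module))


def launch_in_process(module=PIPGUI_MODULE):
    """Import the module into the running process and call its main().

    Assumes the module defines main(); that entry point is an assumption.
    """
    return importlib.import_module(module).main()
```

For both modes to work, the module itself would typically end with an `if __name__ == "__main__": main()` guard, so that importing it has no side effects while `python -m` still opens the window.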
I would document the new IDLE menu entry in the current IDLE page.
Separately from the pip gui project, I plan, at some point, to add a new
'idlelib' section that documents public entry points to generally useful
idlelib components. If I do that before next August, I would add an
entry for pipgui (which would say that details of the GUI are subject to
Possible objections:
1. One might argue that if pipgui is written so as to not depend on
IDLE, then it, like turtledemo, should be located elsewhere, possibly in
Tools/scripts. I would answer that managing packages, unlike running
turtle demos, *is* an IDE function.
2. One might argue that adding a new module with a public entry point,
in a maintenance release, somehow abuses the license granted by PEP434,
in a way that declaring a public interface in an existing module would
not. If this is sustained, I could not document the new module for 3.5.
Terry Jan Reedy
Python 3 becomes more and more popular and is close to a dangerous point
where it can become more popular than Python 2. The PSF decided that it's
time to elaborate a new secret plan to ensure that Python users suffer
again with a new major release breaking all their legacy code.
The PSF is happy to announce that the new Python release will be
Python 8!
Why the version 8? It's just to be greater than Perl 6 and PHP 7, but
it's also a mnemonic for PEP 8. By the way, each minor release will now
multiply the version by 2. With Python 8 released in 2016 and one
release every two years, we will beat Firefox 44 in 2022 (Python 64) and
Windows 2003 in 2032 (Python 2048).
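The doubling schedule can be checked with a few lines of arithmetic. This is only a sketch of the rule as stated (start at 8 in 2016, double every two years); the function name and parameters are made up for illustration.

```python
def python8_version(year, first_year=2016, first_version=8,
                    years_per_release=2):
    """Version number reached by `year` if each release doubles the last."""
    # Number of releases made after the first one, one every two years.
    releases = (year - first_year) // years_per_release
    return first_version * 2 ** releases


for year in (2016, 2022, 2032):
    print(year, python8_version(year))
```

Three doublings after 2016 give 64 in 2022, and eight doublings give 2048 in 2032, matching the figures quoted.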
A major release requires a major change to justify a version bump: the
new killer feature is that it's no longer possible to import a module
which does not respect the PEP 8. It ensures that all your code is pure.
$ python8 -c 'import keyword'
Lib/ E122 continuation line missing indentation or outdented
Lib/ E265 block comment should start with '# '
Lib/ E122 continuation line missing indentation or outdented
ImportError: no pep8, no glory
Good news: since *no* module of the current standard library of Python 3
respects the PEP 8, the standard library will be simplified to one
unique module, which is new in Python 8: pep8. The standard library will
move to the Python Cheeseshop (PyPI), to reply to an old and popular
DON'T PANIC! You are still able to import your legacy code into
Python 8, you just have to rename all your modules to add a "_noqa" suffix
to the filename. For example, rename to A side
effect is that you have to update all imports. For example, replace
"import django" with "import django_noqa". After a study of the PSF,
it's a best option to split again the Python community and make sure
that all users are angry.
The plan is that in 10 years, at least 50% of the 77,000 packages on the
Python cheeseshop will be updated to get the "_noqa" tag. After 2020,
the PSF will start to sponsor trolls to harass users of the legacy
Python 3 to force them to migrate to Python 8.
Python 8 is a work-in-progress (it's still an alpha version), the
standard library was not removed yet. Hopefully, trying to import any
module of the standard library fails.
Don't hesitate to propose more ideas to make Python 8 more incompatible
with Python 3!
Note: The change is already effective in the default branch of Python:
Have fun,