[Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)
Guido van Rossum
guido at python.org
Fri Mar 18 20:02:42 EDT 2016
I'm happy to accept this PEP as it stands, assuming the authors are
ready for this news. I recommend also implementing the option from
footnote [11] (extend the number-to-string formatting language to
allow ``_`` as a thousands separator).
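
As a rough illustration of the footnote [11] option, the sketch below assumes the ``_`` grouping option in the form that later shipped in Python 3.6; the variable names are only for illustration and do not come from the PEP.

```python
# Assumes Python 3.6 or later, where "_" is accepted as a grouping option
# in the format specification mini-language.
decimal_text = format(10_000_000, "_")       # decimal: groups of three digits
hex_text = format(0xDEADBEEF, "_x")          # b, o, x, X: groups of four digits
float_text = "{:_.2f}".format(1234567.891)   # grouping applies to the integer part

print(decimal_text)  # 10_000_000
print(hex_text)      # dead_beef
print(float_text)    # 1_234_567.89
```

The output can be pasted back into source code as a literal, which is the code-generation use case the PEP mentions.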
On Thu, Mar 17, 2016 at 11:19 AM, Brett Cannon <brett at python.org> wrote:
> Where did this PEP leave off? Anything blocking its acceptance?
>
> On Sat, 13 Feb 2016 at 00:49 Georg Brandl <g.brandl at gmx.net> wrote:
>>
>> Hi all,
>>
>> after talking to Guido and Serhiy we present the next revision
>> of this PEP. It is a compromise that we are all happy with,
>> and a relatively restricted rule that makes additions to PEP 8
>> basically unnecessary.
>>
>> I think the discussion has shown that supporting underscores in
>> the from-string constructors is valuable, therefore this is now
>> added to the specification section.
>>
>> The remaining open question is about the reverse direction: do
>> we want a string formatting modifier that adds underscores as
>> thousands separators?
>>
>> cheers,
>> Georg
>>
>> -----------------------------------------------------------------
>>
>> PEP: 515
>> Title: Underscores in Numeric Literals
>> Version: $Revision$
>> Last-Modified: $Date$
>> Author: Georg Brandl, Serhiy Storchaka
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/x-rst
>> Created: 10-Feb-2016
>> Python-Version: 3.6
>> Post-History: 10-Feb-2016, 11-Feb-2016
>>
>> Abstract and Rationale
>> ======================
>>
>> This PEP proposes to extend Python's syntax and number-from-string
>> constructors so that underscores can be used as visual separators for
>> digit grouping purposes in integral, floating-point and complex number
>> literals.
>>
>> This is a common feature of other modern languages, and can aid
>> readability of long literals, or literals whose value should clearly
>> separate into parts, such as bytes or words in hexadecimal notation.
>>
>> Examples::
>>
>>     # grouping decimal numbers by thousands
>>     amount = 10_000_000.0
>>
>>     # grouping hexadecimal addresses by words
>>     addr = 0xDEAD_BEEF
>>
>>     # grouping bits into nibbles in a binary literal
>>     flags = 0b_0011_1111_0100_1110
>>
>>     # same, for string conversions
>>     flags = int('0b_1111_0000', 2)
>>
>>
>> Specification
>> =============
>>
>> The current proposal is to allow one underscore between digits, and
>> after base specifiers in numeric literals. The underscores have no
>> semantic meaning, and literals are parsed as if the underscores were
>> absent.
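
A minimal check of the "no semantic meaning" rule, assuming an interpreter that implements this PEP (Python 3.6 or later). The variable names are illustrative only.

```python
# Each literal with underscores denotes the same value as the plain spelling.
million = 1_000_000
address = 0x_DEAD_BEEF      # one underscore directly after the base specifier
ratio = 1_0.2_5e1_0         # underscores in integer, fraction and exponent parts
imaginary = 1_0j

assert million == 1000000
assert address == 0xDEADBEEF
assert ratio == 10.25e10
assert imaginary == 10j
```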
>>
>> Literal Grammar
>> ---------------
>>
>> The production list for integer literals would therefore look like
>> this::
>>
>>     integer: decinteger | bininteger | octinteger | hexinteger
>>     decinteger: nonzerodigit (["_"] digit)* | "0" (["_"] "0")*
>>     bininteger: "0" ("b" | "B") (["_"] bindigit)+
>>     octinteger: "0" ("o" | "O") (["_"] octdigit)+
>>     hexinteger: "0" ("x" | "X") (["_"] hexdigit)+
>>     nonzerodigit: "1"..."9"
>>     digit: "0"..."9"
>>     bindigit: "0" | "1"
>>     octdigit: "0"..."7"
>>     hexdigit: digit | "a"..."f" | "A"..."F"
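
The integer productions above can be transcribed into a regular expression to experiment with which spellings they admit. This is only a sketch: ``INTEGER_RE`` and ``is_integer_literal`` are hypothetical names, not part of the PEP or of CPython's tokenizer.

```python
import re

# Hypothetical transcription of the integer grammar; whitespace and
# comments inside the pattern are ignored because of re.VERBOSE.
INTEGER_RE = re.compile(
    r"""
      [1-9](?:_?[0-9])*            # decinteger with a nonzero leading digit
    | 0(?:_?0)*                    # decinteger made only of zeros
    | 0[bB](?:_?[01])+             # bininteger
    | 0[oO](?:_?[0-7])+            # octinteger
    | 0[xX](?:_?[0-9a-fA-F])+      # hexinteger
    """,
    re.VERBOSE,
)


def is_integer_literal(text):
    """Return True if the whole string matches the integer grammar."""
    return INTEGER_RE.fullmatch(text) is not None


accepted = ["10_000_000", "0xDEAD_BEEF", "0b_0011_1111", "0_0"]
rejected = ["1__000", "1000_", "_1000", "0_x1F", "0_7"]

assert all(is_integer_literal(t) for t in accepted)
assert not any(is_integer_literal(t) for t in rejected)
```

The rejected spellings show the single-underscore rule: no doubled, leading or trailing underscores, and none between the ``0`` and the base letter.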
>>
>> For floating-point and complex literals::
>>
>>     floatnumber: pointfloat | exponentfloat
>>     pointfloat: [digitpart] fraction | digitpart "."
>>     exponentfloat: (digitpart | pointfloat) exponent
>>     digitpart: digit (["_"] digit)*
>>     fraction: "." digitpart
>>     exponent: ("e" | "E") ["+" | "-"] digitpart
>>     imagnumber: (floatnumber | digitpart) ("j" | "J")
>>
>> Constructors
>> ------------
>>
>> Following the same rules for placement, underscores will be allowed in
>> the following constructors:
>>
>> - ``int()`` (with any base)
>> - ``float()``
>> - ``complex()``
>> - ``Decimal()``
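
A short sketch of the constructor behaviour listed above, assuming Python 3.6 or later where these constructors accept underscores. The variable names and the ``rejects`` helper are illustrative only.

```python
from decimal import Decimal

as_int = int("1_000_000")
as_flags = int("0b_1111_0000", 2)   # prefix plus one underscore after it
as_float = float("1_000.5")
as_complex = complex("1_0+2_0j")
as_decimal = Decimal("1_000.25")

print(as_int, as_flags, as_float, as_complex, as_decimal)


def rejects(constructor, text):
    """Return True if the constructor raises ValueError for this string."""
    try:
        constructor(text)
    except ValueError:
        return True
    return False


# The placement rules are the same as for literals:
# doubled and trailing underscores are refused.
assert rejects(int, "1__000")
assert rejects(float, "1_000_")
```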
>>
>>
>> Prior Art
>> =========
>>
>> Those languages that do allow underscore grouping implement a large
>> variety of rules for allowed placement of underscores. In cases where
>> the language spec contradicts the actual behavior, the actual behavior
>> is listed. ("single" or "multiple" refer to allowing runs of
>> consecutive underscores.)
>>
>> * Ada: single, only between digits [8]_
>> * C# (open proposal for 7.0): multiple, only between digits [6]_
>> * C++14: single, between digits (different separator chosen) [1]_
>> * D: multiple, anywhere, including trailing [2]_
>> * Java: multiple, only between digits [7]_
>> * Julia: single, only between digits (but not in float exponent parts)
>>   [9]_
>> * Perl 5: multiple, basically anywhere, although docs say it's
>>   restricted to one underscore between digits [3]_
>> * Ruby: single, only between digits (although docs say "anywhere")
>>   [10]_
>> * Rust: multiple, anywhere, except for between exponent "e" and digits
>>   [4]_
>> * Swift: multiple, between digits and trailing (although textual
>>   description says only "between digits") [5]_
>>
>>
>> Alternative Syntax
>> ==================
>>
>> Underscore Placement Rules
>> --------------------------
>>
>> Instead of the relatively strict rule specified above, the use of
>> underscores could be less limited. As seen in other languages, common
>> rules include:
>>
>> * Only one consecutive underscore allowed, and only between digits.
>> * Multiple consecutive underscores allowed, but only between digits.
>> * Multiple consecutive underscores allowed, in most positions except
>>   for the start of the literal, or special positions like after a
>>   decimal point.
>>
>> The syntax in this PEP has ultimately been selected because it covers
>> the common use cases, and does not allow for syntax that would have to
>> be discouraged in style guides anyway.
>>
>> A less common rule would be to allow underscores only every N digits
>> (where N could be 3 for decimal literals, or 4 for hexadecimal ones).
>> This is unnecessarily restrictive, especially considering the
>> separator placement is different in different cultures.
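
To illustrate why a fixed group size would be too restrictive, the sketch below writes the same value with three different grouping conventions. It assumes Python 3.6 or later, and the variable names are illustrative only.

```python
# The same number, ten million, grouped three ways.
by_thousands = 10_000_000       # groups of three
indian_style = 1_00_00_000      # lakh/crore grouping: three, then twos
by_ten_thousands = 1000_0000    # groups of four

assert by_thousands == indian_style == by_ten_thousands
```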
>>
>> Different Separators
>> --------------------
>>
>> A proposed alternate syntax was to use whitespace for grouping.
>> Although strings are a precedent for combining adjoining literals, the
>> behavior can lead to unexpected effects which are not possible with
>> underscores. Also, no other language is known to use this rule,
>> except for languages that generally disregard any whitespace.
>>
>> C++14 introduces apostrophes for grouping (because underscores
>> introduce ambiguity with user-defined literals). This was not
>> considered for Python because apostrophes already delimit string
>> literals. [1]_
>>
>>
>> Open Proposals
>> ==============
>>
>> It has been proposed [11]_ to extend the number-to-string formatting
>> language to allow ``_`` as a thousands separator, where currently only
>> ``,`` is supported. This could be used to easily generate code with
>> more readable literals.
>>
>>
>> Implementation
>> ==============
>>
>> A preliminary patch that implements the specification given above has
>> been posted to the issue tracker. [12]_
>>
>>
>> References
>> ==========
>>
>> .. [1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3499.html
>>
>> .. [2] http://dlang.org/spec/lex.html#integerliteral
>>
>> .. [3] http://perldoc.perl.org/perldata.html#Scalar-value-constructors
>>
>> .. [4] http://doc.rust-lang.org/reference.html#number-literals
>>
>> .. [5] https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/LexicalStructure.html
>>
>> .. [6] https://github.com/dotnet/roslyn/issues/216
>>
>> .. [7] https://docs.oracle.com/javase/7/docs/technotes/guides/language/underscores-literals.html
>>
>> .. [8] http://archive.adaic.com/standards/83lrm/html/lrm-02-04.html#2.4
>>
>> .. [9] http://docs.julialang.org/en/release-0.4/manual/integers-and-floating-point-numbers/
>>
>> .. [10] http://ruby-doc.org/core-2.3.0/doc/syntax/literals_rdoc.html#label-Numbers
>>
>> .. [11] https://mail.python.org/pipermail/python-dev/2016-February/143283.html
>>
>> .. [12] http://bugs.python.org/issue26331
>>
>>
>> Copyright
>> =========
>>
>> This document has been placed in the public domain.
>>
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe:
>> https://mail.python.org/mailman/options/python-dev/brett%40python.org
>
--
--Guido van Rossum (python.org/~guido)