[Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

Brett Cannon brett at python.org
Thu Mar 17 14:19:14 EDT 2016


Where did this PEP leave off? Anything blocking its acceptance?

On Sat, 13 Feb 2016 at 00:49 Georg Brandl <g.brandl at gmx.net> wrote:

> Hi all,
>
> after talking to Guido and Serhiy we present the next revision
> of this PEP.  It is a compromise that we are all happy with,
> and a relatively restricted rule that makes additions to PEP 8
> basically unnecessary.
>
> I think the discussion has shown that supporting underscores in
> the from-string constructors is valuable, therefore this is now
> added to the specification section.
>
> The remaining open question is about the reverse direction: do
> we want a string formatting modifier that adds underscores as
> thousands separators?
>
> cheers,
> Georg
>
> -----------------------------------------------------------------
>
> PEP: 515
> Title: Underscores in Numeric Literals
> Version: $Revision$
> Last-Modified: $Date$
> Author: Georg Brandl, Serhiy Storchaka
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 10-Feb-2016
> Python-Version: 3.6
> Post-History: 10-Feb-2016, 11-Feb-2016
>
> Abstract and Rationale
> ======================
>
> This PEP proposes to extend Python's syntax and number-from-string
> constructors so that underscores can be used as visual separators for
> digit grouping purposes in integral, floating-point and complex number
> literals.
>
> This is a common feature of other modern languages, and can aid
> readability of long literals, or literals whose value should clearly
> separate into parts, such as bytes or words in hexadecimal notation.
>
> Examples::
>
>     # grouping decimal numbers by thousands
>     amount = 10_000_000.0
>
>     # grouping hexadecimal addresses by words
>     addr = 0xDEAD_BEEF
>
>     # grouping bits into nibbles in a binary literal
>     flags = 0b_0011_1111_0100_1110
>
>     # same, for string conversions
>     flags = int('0b_1111_0000', 2)
>
>
> Specification
> =============
>
> The current proposal is to allow one underscore between digits, and
> after base specifiers in numeric literals.  The underscores have no
> semantic meaning, and literals are parsed as if the underscores were
> absent.
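
To make the "parsed as if the underscores were absent" rule concrete, here is a small check. It is only a sketch and assumes an interpreter that already implements this proposal (CPython 3.6 or later); the helper name ``parses_as_literal`` is invented for the illustration.

```python
import ast


def parses_as_literal(source):
    """Hypothetical helper: True if `source` evaluates as a Python literal."""
    try:
        ast.literal_eval(source)
    except (SyntaxError, ValueError):
        return False
    return True


# Underscores carry no meaning: both spellings give the same value.
grouped = ast.literal_eval("0xDEAD_BEEF")
plain = ast.literal_eval("0xDEADBEEF")
print(grouped == plain)

# One underscore between digits or after a base specifier is accepted ...
print(parses_as_literal("0b_0011_1111_0100_1110"))
# ... while doubled or trailing underscores are rejected.
print(parses_as_literal("1__0"), parses_as_literal("1_"))
```
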
>
> Literal Grammar
> ---------------
>
> The production list for integer literals would therefore look like
> this::
>
>    integer: decinteger | bininteger | octinteger | hexinteger
>    decinteger: nonzerodigit (["_"] digit)* | "0" (["_"] "0")*
>    bininteger: "0" ("b" | "B") (["_"] bindigit)+
>    octinteger: "0" ("o" | "O") (["_"] octdigit)+
>    hexinteger: "0" ("x" | "X") (["_"] hexdigit)+
>    nonzerodigit: "1"..."9"
>    digit: "0"..."9"
>    bindigit: "0" | "1"
>    octdigit: "0"..."7"
>    hexdigit: digit | "a"..."f" | "A"..."F"
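
As a reading aid, the integer productions above can be transcribed into a regular expression. This is a sketch only: the names ``INTEGER`` and ``is_integer_literal`` are made up here, and the actual check would live in the tokenizer rather than in a regex.

```python
import re

# Each alternative mirrors one production from the list above.
INTEGER = re.compile(
    r"""
    (?: [1-9] (?: _? [0-9] )*          # decinteger, nonzero leading digit
      | 0 (?: _? 0 )*                  # decinteger, zeros only
      | 0 [bB] (?: _? [01] )+          # bininteger
      | 0 [oO] (?: _? [0-7] )+         # octinteger
      | 0 [xX] (?: _? [0-9a-fA-F] )+   # hexinteger
    )
    """,
    re.VERBOSE,
)


def is_integer_literal(text):
    """Hypothetical helper: True if `text` fits the integer productions."""
    return INTEGER.fullmatch(text) is not None


for candidate in ["10_000_000", "0xDEAD_BEEF", "0b_0011_1111", "1__0", "1_", "0x_"]:
    print(candidate, is_integer_literal(candidate))
```

The ``_?`` in front of every digit is what limits the grammar to a single underscore at a time, and it is also what permits the underscore directly after a base specifier.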
>
> For floating-point and complex literals::
>
>    floatnumber: pointfloat | exponentfloat
>    pointfloat: [digitpart] fraction | digitpart "."
>    exponentfloat: (digitpart | pointfloat) exponent
>    digitpart: digit (["_"] digit)*
>    fraction: "." digitpart
>    exponent: ("e" | "E") ["+" | "-"] digitpart
>    imagnumber: (floatnumber | digitpart) ("j" | "J")
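
The floating-point and imaginary productions can be sketched the same way, building each pattern from the one before it. The constant and function names are illustrative assumptions, not part of the proposal.

```python
import re

# Patterns are composed in the same order as the productions above.
DIGITPART = r"[0-9](?:_?[0-9])*"
FRACTION = rf"\.{DIGITPART}"
EXPONENT = rf"[eE][+-]?{DIGITPART}"
POINTFLOAT = rf"(?:(?:{DIGITPART})?{FRACTION}|{DIGITPART}\.)"
EXPONENTFLOAT = rf"(?:{DIGITPART}|{POINTFLOAT}){EXPONENT}"
FLOATNUMBER = rf"(?:{POINTFLOAT}|{EXPONENTFLOAT})"
IMAGNUMBER = rf"(?:{FLOATNUMBER}|{DIGITPART})[jJ]"


def is_float_literal(text):
    """Hypothetical helper: True if `text` fits the floatnumber production."""
    return re.fullmatch(FLOATNUMBER, text) is not None


def is_imag_literal(text):
    """Hypothetical helper: True if `text` fits the imagnumber production."""
    return re.fullmatch(IMAGNUMBER, text) is not None


for candidate in ["10_000_000.0", "1_000.000_1e1_0", "1_.5", "1._5", "1e_5"]:
    print(candidate, is_float_literal(candidate))
```

Because ``digitpart`` must start and end with a digit, an underscore can never sit next to the decimal point or the exponent marker.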
>
> Constructors
> ------------
>
> Following the same rules for placement, underscores will be allowed in
> the following constructors:
>
> - ``int()`` (with any base)
> - ``float()``
> - ``complex()``
> - ``Decimal()``
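
For the constructors, a short sketch of what this would look like in practice, assuming an interpreter that implements the specification (CPython 3.6 or later). The ``rejects`` helper is hypothetical and exists only for this example.

```python
from decimal import Decimal

# Grouped strings convert to the same values as their plain spellings.
as_int = int("1_000_000")
as_hex = int("DEAD_BEEF", 16)
as_float = float("1_000.000_1")
as_complex = complex("1_000+2_0j")
as_decimal = Decimal("1_000.50")


def rejects(constructor, text):
    """Hypothetical helper: True if `constructor(text)` raises ValueError."""
    try:
        constructor(text)
    except ValueError:
        return True
    return False


# The placement rules match the literal grammar: no doubled, leading
# or trailing underscores.
print(rejects(int, "1__0"), rejects(float, "1_"), rejects(int, "_1"))
```
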
>
>
> Prior Art
> =========
>
> Those languages that do allow underscore grouping implement a large
> variety of rules for allowed placement of underscores.  In cases where
> the language spec contradicts the actual behavior, the actual behavior
> is listed.  ("single" or "multiple" refers to whether runs of
> consecutive underscores are allowed.)
>
> * Ada: single, only between digits [8]_
> * C# (open proposal for 7.0): multiple, only between digits [6]_
> * C++14: single, between digits (different separator chosen) [1]_
> * D: multiple, anywhere, including trailing [2]_
> * Java: multiple, only between digits [7]_
> * Julia: single, only between digits (but not in float exponent parts)
>   [9]_
> * Perl 5: multiple, basically anywhere, although docs say it's
>   restricted to one underscore between digits [3]_
> * Ruby: single, only between digits (although docs say "anywhere")
>   [10]_
> * Rust: multiple, anywhere, except for between exponent "e" and digits
>   [4]_
> * Swift: multiple, between digits and trailing (although textual
>   description says only "between digits") [5]_
>
>
> Alternative Syntax
> ==================
>
> Underscore Placement Rules
> --------------------------
>
> Instead of the relatively strict rule specified above, the use of
> underscores could be governed by a different rule.  As we have seen
> in other languages, common rules include:
>
> * Only single underscores allowed (no consecutive runs), and only
>   between digits.
> * Multiple consecutive underscores allowed, but only between digits.
> * Multiple consecutive underscores allowed, in most positions except
>   for the start of the literal, or special positions like after a
>   decimal point.
>
> The syntax in this PEP has ultimately been selected because it covers
> the common use cases, and does not allow for syntax that would have to
> be discouraged in style guides anyway.
>
> A less common rule would be to allow underscores only every N digits
> (where N could be 3 for decimal literals, or 4 for hexadecimal ones).
> This is unnecessarily restrictive, especially considering the
> separator placement is different in different cultures.
>
> Different Separators
> --------------------
>
> A proposed alternate syntax was to use whitespace for grouping.
> Although strings are a precedent for combining adjoining literals, the
> behavior can lead to unexpected effects which are not possible with
> underscores.  Also, no other language is known to use this rule,
> except for languages that generally disregard any whitespace.
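
One reading of the "unexpected effects" mentioned above, shown with string literals, where adjacency already means concatenation. This is an interpretation rather than something the text spells out, and the variable names are arbitrary.

```python
# A forgotten comma between two adjacent string literals silently
# merges them into one element instead of raising an error.
intended = ["80", "443", "8080"]
with_typo = ["80" "443", "8080"]

print(len(intended), len(with_typo))
print(with_typo[0])
```

If whitespace separated digit groups, a missing comma between two numbers would merge them in the same silent way. With underscores the separator has to be typed explicitly, so the mistake cannot arise.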
>
> C++14 introduces apostrophes for grouping (because underscores
> introduce ambiguity with user-defined literals).  This is not
> considered here because apostrophes already delimit Python's string
> literals. [1]_
>
>
> Open Proposals
> ==============
>
> It has been proposed [11]_ to extend the number-to-string formatting
> language to allow ``_`` as a thousands separator, where currently only
> ``,`` is supported.  This could be used to easily generate code with
> more readable literals.
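
A sketch of how such a formatting option behaves, assuming an interpreter in which this proposal has been adopted (CPython 3.6 and later accept ``_`` as a grouping option in the format specification).

```python
amount = 10000000

# Decimal presentation groups digits in threes.
grouped = format(amount, "_")
# Hexadecimal presentation groups digits in fours.
address = format(0xDEADBEEF, "_x")
# The option combines with the usual precision and type fields.
price = "{:_.2f}".format(1234567.891)

# The generated text is itself accepted by int(), so it round-trips.
round_trip = int(grouped)

print(grouped, address, price, round_trip == amount)
```
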
>
>
> Implementation
> ==============
>
> A preliminary patch that implements the specification given above has
> been posted to the issue tracker. [12]_
>
>
> References
> ==========
>
> .. [1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3499.html
>
> .. [2] http://dlang.org/spec/lex.html#integerliteral
>
> .. [3] http://perldoc.perl.org/perldata.html#Scalar-value-constructors
>
> .. [4] http://doc.rust-lang.org/reference.html#number-literals
>
> .. [5] https://developer.apple.com/library/ios/documentation/Swift/Conceptual/Swift_Programming_Language/LexicalStructure.html
>
> .. [6] https://github.com/dotnet/roslyn/issues/216
>
> .. [7] https://docs.oracle.com/javase/7/docs/technotes/guides/language/underscores-literals.html
>
> .. [8] http://archive.adaic.com/standards/83lrm/html/lrm-02-04.html#2.4
>
> .. [9] http://docs.julialang.org/en/release-0.4/manual/integers-and-floating-point-numbers/
>
> .. [10] http://ruby-doc.org/core-2.3.0/doc/syntax/literals_rdoc.html#label-Numbers
>
> .. [11] https://mail.python.org/pipermail/python-dev/2016-February/143283.html
>
> .. [12] http://bugs.python.org/issue26331
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>