
Not sure if this has been proposed before: A syntax change to allow underscores as thousands separators in literal numbers to improve readability, e.g.:
for i in range(1, 1_000_000): pass
I believe D allows this and while it's a small thing it really is much more readable. Worth a PEP?
Thanks, Matt

On 6 May 2011 19:51, Matt Chaput <matt@whoosh.ca> wrote:
Not sure if this has been proposed before: A syntax change to allow underscores as thousands separators in literal numbers to improve readability, e.g.:
for i in range(1, 1_000_000): pass
I believe D allows this and while it's a small thing it really is much more readable.
Ruby too.
You could also use e-notation[1]: 1e6, in your example. In many situations it's even more readable because you don't need to "count the zeros". This is already supported in Python.
[1] http://en.wikipedia.org/wiki/Scientific_notation#E_notation

On Fri, 6 May 2011 23:06:18 +0200 "dag.odenhall@gmail.com" <dag.odenhall@gmail.com> wrote:
You could also use e-notation[1]: 1e6, in your example. In many situations it's even more readable because you don't need to "count the zeros". This is already supported in Python.
Yes, but it gives a float, not an integer:
>>> for i in range(0, 1e6): pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'float' object cannot be interpreted as an integer
Regards Antoine.
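The point about e-notation can be checked without a REPL session. This is a minimal sketch, assuming Python 3 (where range only accepts integers); the variable names are illustrative only.

```python
# 1e6 is a float literal, so range() refuses it in Python 3.
million = 1e6

try:
    range(0, million)
    range_accepts_float = True
except TypeError:
    range_accepts_float = False

# An explicit conversion gives an int, at the cost of extra noise.
million_as_int = int(million)
```

The conversion works, but it arguably reads worse than either the plain literal or the proposed grouped one.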

How about range(10**60) ?
- Kirubakaran.

(fixed typo) How about range(10**6) ?
- Kirubakaran.

On 06/05/2011 5:26 PM, Kirubakaran wrote:
(fixed typo) How about range(10**6) ?
Both 1e6 (if it worked in the example) and 10**6 require a bit of work (at least for my non-mathematician brain) to decode as "1 million", whereas with 1_000_000 you're not so much counting the zeros in your head as counting the *groups* of zeros visually. For me it's much more readable at a glance.
Also, obviously the 10**6 trick doesn't work so well if the example is:
for i in range(47_284_345): pass
Matt

None of these answers address the original suggestion. Matt didn't say that he only wanted this for numbers of the form 10^N; he just gave that as an example. Consider these examples instead:
- 1_234_000
- 9.876_543_210
- 0xFEFF_0042
I'm not advocating this change (nor against it); I just think the discussion should be focused on the actual idea. I do have a question: Is _ just ignored in numbers or are there more complex rules?
- 1_2345_6789 (can I use groups of other sizes instead?)
- 1_2_3_4_5 (ditto)
- 1_234_6789 (do all the groups need to be the same size?)
- 1_ (must the _ only be in between 2 digits?)
- 1__234 (what about multiple _s?)
- 9.876_543_210 (can it be used to the right of the decimal point?)
- 0xFEFF_0042 (can it be used in hex, octal or binary numbers?)
- int('123_456') (do other functions accept this syntax too?)
--- Bruce

Bruce Leban wrote:
Is _ just ignored in numbers or are there more complex rules?
I would say it's ignored. Have the rule be something like number_string.replace('_','').
The only wrinkle is that currently '_1' is a usable name, and that should probably be disallowed if the above change took place.
I'm +1 on the idea.
~Ethan~
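To make the "just ignore it" rule concrete, here is a minimal sketch applied to strings rather than to the tokenizer. parse_ignoring_underscores is a hypothetical helper written for illustration, not an existing or proposed API. It shows how permissive the rule is, and why '_1' is the awkward case: it is already a valid identifier.

```python
def parse_ignoring_underscores(number_string):
    """Hypothetical helper: drop every underscore, then parse as a decimal int."""
    return int(number_string.replace('_', ''))

# Any placement is accepted under this rule, including leading,
# trailing and doubled underscores.
values = [parse_ignoring_underscores(s)
          for s in ('1_000_000', '1_2_3', '_1', '1_', '1__234')]

# '_1' is already a legal identifier, which is the wrinkle in question.
underscore_one_is_name = '_1'.isidentifier()
```

A rule this loose accepts every spelling in the list, so any restriction on placement has to be stated separately.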

Alexander Belopolsky wrote:
On Fri, May 6, 2011 at 6:40 PM, Ethan Furman <ethan@stoneleaf.us> wrote: ..
The only wrinkle is that currently '_1' is usable name, and that should probably be disallowed if the above change took place.
-1_000 if _1 becomes invalid as an identifier.
+0 otherwise.
So you use _8127 style names for your objects* then? ~Ethan~ *Okay, avoiding the word 'variables' can make for some slightly odd sounding sentences! ;)

On Fri, May 6, 2011 at 6:58 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
So you use _8127 style names for your objects* then?
Code generators often use such names, though. Since _1234 is currently a legal identifier, you'd be breaking backward compatibility. I understand the motivation for a thousands separator, at least (though I'll admit, I don't find it compelling; *all* big numbers in code are too magical). -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> "Give me the luxuries of life and I will willingly do without the necessities." --Frank Lloyd Wright

Fred Drake wrote:
I understand the motivation for a thousands separator, at least (though I'll admit, I don't find it compelling; *all* big numbers in code are too magical).
Bigness is a relative concept. Avogadro's number is fairly big in absolute terms, but you can hold that many molecules in your hand quite easily. Although writing it as 602_000_000_000_000_000_000_000 probably wouldn't be very helpful. -- Greg

On Fri, May 6, 2011 at 6:40 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
The only wrinkle is that currently '_1' is usable name, and that should probably be disallowed if the above change took place.
Why? I've never seen a leading thousands separator in practice. For example, ,123,456 isn't generally accepted usage, so why should _123_456 be considered acceptable?
(I'm not taking a position on the proposal here; just commenting on the problem of breaking code by making _1 a number instead of an identifier.)
-Fred

Fred Drake wrote:
On Fri, May 6, 2011 at 6:40 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
The only wrinkle is that currently '_1' is usable name, and that should probably be disallowed if the above change took place.
Why? I've never seen a leading thousands separator in practice. For example,
,123,456
isn't generally accepted usage, so why should
_123_456
be considered acceptable?
(I'm not taking a position on the proposal here; just commenting on the problem of breaking code by making _1 a number instead of an identifier.)
I see it as a readability issue -- if you have 1_024 and _1025 (etc, etc), where one is a number and the other a name, confusion can easily result. ~Ethan~

Ethan Furman wrote:
I see it as a readability issue -- if you have 1_024 and _1025 (etc, etc), where one is a number and the other a name, confusion can easily result.
I don't think there will be *that* much confusion though. _1025 can occur on the LHS of an assignment, 1_024 cannot. And we already distinguish between x1234 and 0x1234 without much confusion. -- Steven

Ethan Furman wrote:
I see it as a readability issue -- if you have 1_024 and _1025 (etc, etc), where one is a number and the other a name, confusion can easily result.
But probably not much worse than the confusion you can get today between 1234e6 and _1234e6, or O000001 and 0000001. There will always be ways of creating confusing-looking code if you put your mind to it. :-) -- Greg

On 06May2011 15:40, Ethan Furman <ethan@stoneleaf.us> wrote:
| Bruce Leban wrote:
| >Is _ just ignored in numbers or are there more complex rules?
| > […]
|
| I would say it's ignored. Have the rule be something like
| number_string.replace('_','').
|
| The only wrinkle is that currently '_1' is usable name, and that
| should probably be disallowed if the above change took place.
|
| I'm +1 on the idea.
Personally I'd be for ignoring the _ also, save that I would forbid it at the start or end, so no _1 or 1_.
And I would permit it in hex code etc.
I'm +0.5, myself.
Cheers,
-- Cameron Simpson <cs@zip.com.au> DoD#743 http://www.cskk.ezoshosting.com/cs/
A strong conviction that something must be done is the parent of many bad measures. - Daniel Webster

On 06/05/2011 23:51, Cameron Simpson wrote:
Personally I'd be for ignoring the _ also, save that I would forbid it at the start or end, so no _1 or 1_.
And I would permit it in hex code etc.
I'm +0.5, myself.
As far as I remember, Ada also permits it, but has the rule that it can occur only between digits. If we follow that, then:
1_2345_6789 => Yes
1_2_3_4_5 => Yes
1_234_6789 => Yes
1_ => No
_1 => No
1__234 => No
9.876_543_210 => Yes
9._876_543_210 => No
9_.876_543_210 => No
0xFEFF_0042 => Yes
int('123_456') => Yes
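The "only between digits" rule is easy to express as a pattern. What follows is a rough sketch, not a tokenizer change: DIGITS, HEX_DIGITS, LITERAL and allowed are names invented here, and the pattern only covers decimal integers, simple decimal fractions and hex integers (no exponents, octal or binary).

```python
import re

# Hypothetical validator for the "only between two digits" rule.
# An underscore is consumed only when a digit follows it, and every
# digit run must start with a digit.
DIGITS = r"[0-9](?:_?[0-9])*"
HEX_DIGITS = r"[0-9A-Fa-f](?:_?[0-9A-Fa-f])*"
LITERAL = re.compile(rf"0[xX]{HEX_DIGITS}|{DIGITS}(?:\.{DIGITS})?")

def allowed(text):
    """Return True if text is a literal permitted by the rule above."""
    return LITERAL.fullmatch(text) is not None

results = {text: allowed(text) for text in (
    "1_2345_6789", "1_2_3_4_5", "1_234_6789", "1_", "_1", "1__234",
    "9.876_543_210", "9._876_543_210", "9_.876_543_210", "0xFEFF_0042",
)}
```

Under these assumptions the pattern gives the same Yes/No answer as the table for every literal in it; the int('123_456') row is a question about a function's behaviour and is outside what a literal pattern can decide.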

I'm opposed to changing int so that int('123_456') ignores the _ as that will change the behavior of existing code and could break apps. Alternatively, if you want to change int how about int('123_456', separator='_') ignores the _. That would also admit int('123,456', separator=',')
--- Bruce

On Fri, May 6, 2011 at 7:44 PM, Bruce Leban <bruce@leapyear.org> wrote:
I'm opposed to changing int so that int('123_456') ignores the _ as that will change the behavior of existing code and could break apps. Alternatively, if you want to change int how about int('123_456', separator='_') ignores the _. That would also admit int('123,456', separator=',') --- Bruce
I am +0 on the whole idea, but +0.5 if it is not an underscore, which I think is ugly. Would it conflict with any other syntax rules if numbers allowed a space separator?
for i in range(1 111 111):
    foo(i)
It looks cleaner and in a fixed-font should be just as obvious about separator placement.
-- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy

On 06May2011 19:55, Calvin Spealman <ironfroggy@gmail.com> wrote:
| I am +0 on the whole idea, but +0.5 if is not an underscore, which I
| think is ugly.
I think the underscore is one of the better choices:
- it is very visible, unlike a dot or comma
- it is "low" or "flat", not intruding into the glyph space of the digits, leaving things easy to read
- it is already widely used (perl (sorry), Ada (where I first encountered it, now that someone else has mentioned it), etc), i.e. it is a pre-existing idiom with successful use
| Would it conflict with any other syntax rules if
| numbers allowed a space separator?
|
| for i in range(1 111 111):
|     foo(i)
|
| It looks cleaner and in a fixed-font should be just as obvious about
| separator placement.
I'm very -1 on this one. Like another recent proposal it takes a common typing error and turns it into legal syntax. Code that once would fail to compile because the author dropped a comma between values now runs, with silent breakage (the new stuff isn't even the wrong type!)
Cheers,
-- Cameron Simpson <cs@zip.com.au> DoD#743 http://www.cskk.ezoshosting.com/cs/
It's there as a sop to former Ada programmers. :-) - Larry Wall regarding 10_000_000 in <11556@jpl-devvax.JPL.NASA.GOV>

Bruce Leban wrote:
I'm opposed to changing int so that int('123_456') ignores the _ as that will change the behavior of existing code and could break apps.
But int('123_456', 0) should perhaps work? (On the grounds that it parses numbers using the same syntax as Python source.) -- Greg

On 07/05/2011 08:46, Greg Ewing wrote:
Bruce Leban wrote:
I'm opposed to changing int so that int('123_456') ignores the _ as that will change the behavior of existing code and could break apps.
But int('123_456', 0) should perhaps work? (On the grounds that it parses numbers using the same syntax as Python source.)
There's also the argument that if you forbid it then the programmer may have to write:
int(string.replace("_", ""))
in order to let the user include underscores, which would make it too permissive. If the user entered "_10", the above code would accept it.

On May 6, 2011, at 4:41 PM, MRAB wrote:
As far as I remember, Ada also permits it, but has the rule that it can occur only between digits.
Java 7 also adds this feature. Its rules:
You can place underscores only between digits; you cannot place underscores in the following places:
• At the beginning or end of a number
• Adjacent to a decimal point in a floating point literal
• Prior to an F or L suffix
• In positions where a string of digits is expected
The following examples demonstrate valid and invalid underscore placements in numeric literals:
float pi1 = 3_.1415F; // Invalid; cannot put underscores adjacent to a decimal point
float pi2 = 3._1415F; // Invalid; cannot put underscores adjacent to a decimal point
long socialSecurityNumber1 = 999_99_9999_L; // Invalid; cannot put underscores prior to an L suffix
int x1 = _52; // This is an identifier, not a numeric literal
int x2 = 5_2; // OK (decimal literal)
int x3 = 52_; // Invalid; cannot put underscores at the end of a literal
int x4 = 5_______2; // OK (decimal literal)
int x5 = 0_x52; // Invalid; cannot put underscores in the 0x radix prefix
int x6 = 0x_52; // Invalid; cannot put underscores at the beginning of a number
int x7 = 0x5_2; // OK (hexadecimal literal)
int x8 = 0x52_; // Invalid; cannot put underscores at the end of a number
int x9 = 0_52; // OK (octal literal)
int x10 = 05_2; // OK (octal literal)
int x11 = 052_; // Invalid; cannot put underscores at the end of a number
(From http://download.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html )
-- Philip Jenvey

Greg Ewing wrote:
Philip Jenvey wrote:
int x4 = 5_______2; // OK (decimal literal)
Hmmm, that one looks really weird -- maybe it should be disallowed as well?
I don't think we need to disallow it merely over an aesthetic judgement (although it does look weird *grins*). There is precedent with separators in collections:
>>> t = (1,,,,2)
  File "<stdin>", line 1
    t = (1,,,,2)
           ^
SyntaxError: invalid syntax
Like consecutive commas, consecutive underscores are likely to indicate a typo rather than a deliberate decision. So I'm +1 on strictly enforcing a single underscore between digits. -- Steven
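The comma precedent can be reproduced without an interactive session. A minimal sketch using the built-in compile; is_syntax_error is a name invented for this illustration.

```python
def is_syntax_error(source):
    """Return True if compiling source raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return True
    return False

# Doubled separators in a tuple display are rejected by the parser,
# while single separators are of course fine.
consecutive_commas = is_syntax_error("t = (1,,,,2)")
single_commas = is_syntax_error("t = (1, 2)")
```

Enforcing a single underscore between digits would give numeric literals the same "doubled separator is an error" behaviour.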

Steven D'Aprano wrote:
Like consecutive commas, consecutive underscores are likely to indicate a typo rather than a deliberate decision.
Well, yes, that's really the rationale I had in mind. Although it would provide an amusingly funky way of introducing dividing line comments into your code:
class A:
    ...
    ...
    ...
0____________________________________0
class B:
    ...
    ...
    ...
You could even decorate it with scissors for a bit more panache:
0_____8<0_____8<0_____8<0_____8<0_____0
-- Greg

On 07.05.2011 10:27, Greg Ewing wrote:
Steven D'Aprano wrote:
Like consecutive commas, consecutive underscores are likely to indicate a typo rather than a deliberate decision.
Well, yes, that's really the rationale I had in mind.
Although it would provide an amusingly funky way of introducing dividing line comments into your code:
class A: ... ... ...
0____________________________________0
+1__________________________________________________________0! Georg

On Sat, May 7, 2011 at 4:27 AM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
You could even decorate it with scissors for a bit more panache:
0_____8<0_____8<0_____8<0_____8<0_____0
Heh. Thanks for the swell tip, Martha Stewart!
-Fred

On 07May2011 00:41, MRAB <python@mrabarnett.plus.com> wrote:
| As far as I remember, Ada also permits it,
That's where I first encountered it myself.
| but has the rule that it can
| occur only between digits. If we follow that, then:
|
| 1_2345_6789 => Yes
| 1_2_3_4_5 => Yes
| 1_234_6789 => Yes
| 1_ => No
| _1 => No
| 1__234 => No
| 9.876_543_210 => Yes
| 9._876_543_210 => No
| 9_.876_543_210 => No
| 0xFEFF_0042 => Yes
| int('123_456') => Yes
+1 to this.
Cheers,
-- Cameron Simpson <cs@zip.com.au> DoD#743 http://www.cskk.ezoshosting.com/cs/
It is impossible to travel faster than light, and certainly not desirable as ones hat keeps blowing off. - Woody Allen

Bruce Leban wrote:
Consider these examples instead:
- 1_234_000 - 9.876_543_210 - 0xFEFF_0042
I'm not advocating this change (nor against it); I just think the discussion should be focused on the actual idea. I do have a question:
Is _ just ignored in numbers or are there more complex rules?
- 1_2345_6789 (can I use groups of other sizes instead?) - 1_2_3_4_5 (ditto) - 1_234_6789 (do all the groups need to be the same size?)
+1 on all of these. I don't particularly like the look of _ as a number separator, but it's hard to think of any alternatives other than space, and some separator is better than long sequences of digits.
I'm -0.5 on spaces even though it looks MUCH better, because it's too easy to leave the commas out in lists etc:
L = [1, 2, 3, 4 5, 6, 7, 8, 9, 10] # oops, wanted 4 & 5 not 45
(Admittedly if the items were strings, the same failure mode applies.)
- 1_ (must the _ only be in between 2 digits?) - 1__234 (what about multiple _s?)
-1 on allowing either _1 or 1_ as numbers. -0 on allowing doubled underscores.
- 9.876_543_210 (can it be used to the right of the decimal point?) - 0xFEFF_0042 (can it be used in hex, octal or binary numbers?)
+1 on these two.
- int('123_456') (do other functions accept this syntax too?)
That's a tricky one... I'd say No, but I'm not entirely sure. It's easy enough to say:
int('123_456'.replace('_', ''))
albeit a tad verbose. Also easy to say:
int('123' '456')
which is less verbose. And it will change the behaviour of the int function. So I don't think we need to support separators inside strings. We can always change our mind later and add it in, but it's much harder to take it out later.
-- Steven
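Both spellings mentioned in this message can be checked as written. A small sketch; adjacent string literals are joined at compile time, which is also the "same failure mode" noted for lists of strings. The variable names are illustrative.

```python
# Stripping separators by hand before calling int().
stripped = int('123_456'.replace('_', ''))

# Adjacent string literals are concatenated into '123456' before int() sees them.
concatenated = int('123' '456')

# The same concatenation silently swallows a forgotten comma in a list.
letters = ['a', 'b', 'c' 'd', 'e']   # five items were intended
```

Note that the concatenation form only helps when the number is a literal in the source; it does nothing for strings that arrive at run time.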

On Fri, May 6, 2011 at 7:00 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Bruce Leban wrote:
Consider these examples instead:
- 1_234_000 - 9.876_543_210 - 0xFEFF_0042
I'm not advocating this change (nor against it); I just think the discussion should be focused on the actual idea. I do have a question:
Is _ just ignored in numbers or are there more complex rules?
- 1_2345_6789 (can I use groups of other sizes instead?) - 1_2_3_4_5 (ditto) - 1_234_6789 (do all the groups need to be the same size?)
+1 on all of these. I don't particularly like the look of _ as a number separator, but it's hard to think of any alternatives other than space, and some separator is better than long sequences of digits.
I'm -0.5 on spaces even though it looks MUCH better, because it's too easy to leave the commas out in lists etc:
L = [1, 2, 3, 4 5, 6, 7, 8, 9, 10] # oops, wanted 4 & 5 not 45
(Admittedly if the items were strings, the same failure mode applies.)
And it does sometimes bite. So let's not do more of that. (In retrospect 'xxx' + 'yyy' would have been good enough.)
- 1_ (must the _ only be in between 2 digits?) - 1__234 (what about multiple _s?)
-1 on allowing either _1 or 1_ as numbers.
-0 on allowing doubled underscores.
- 9.876_543_210 (can it be used to the right of the decimal point?) - 0xFEFF_0042 (can it be used in hex, octal or binary numbers?)
+1 on these two.
Steven channels me well so far. Fine points about _ in floats: IMO the _ should be allowed to appear between any two digits, or between the last digit and the 'e' in the exponent, or between the 'e' and a following digit. But not adjacent to the '.' or to the '+' or '-' in the exponent. So 3.141_593 yes, 3_.14 no. Fine points about _ in bin/oct/hex literals: 0x_dead_beef yes, 0_xdeadbeef no. (The overall rule seems to be that it must be internal to alphanumeric strings, except that leading 0x, 0o or 0b must not be separated -- somehow I find 0_x_dead_beef would be a disservice to human readers.)
- int('123_456') (do other functions accept this syntax too?)
That's a tricky one... I'd say No, but I'm not entirely sure. It's easy enough to say:
int('123_456'.replace('_', ''))
albeit a tad verbose. Also easy to say:
int('123' '456')
which is less verbose.
But that's not how it'll be used. The argument will be provided by the user of the code.
And it will change the behaviour of the int function. So I don't think we need to support separators inside strings.
I think it's fine, the same reason why we want to write 1_234_567 in code sometimes applies to input or command line arguments too, and I see little harm.
We can always change our mind later and add it in, but it's much harder to take it out later.
It seems entirely harmless here. Also for float(). It would also be nice to have an easy way to emit _ in suitable places. Maybe this could be added to the .format() language for numbers? It would be nice if you could tell it to emit an _ every N positions. -- --Guido van Rossum (python.org/~guido)
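Whether a given spelling is accepted depends on the interpreter running the code. The sketch below only asks the running interpreter, via the built-in compile, and makes no claim about the proposal as it stood in this thread. Python 3.6 and later do accept underscores in numeric literals (PEP 515), which postdates this discussion; on older versions every underscore spelling is rejected. literal_accepted is a hypothetical helper.

```python
def literal_accepted(text):
    """Return True if the running interpreter compiles text as an expression."""
    try:
        compile(text, "<literal>", "eval")
    except SyntaxError:
        return False
    return True

# The four spellings discussed in the "fine points" above, in order.
checks = {text: literal_accepted(text)
          for text in ("3.141_593", "3_.14", "0x_dead_beef", "0_xdeadbeef")}
```

On Python 3.6 or later the four results agree with the yes/no verdicts given above for these spellings. Only call the helper on strings that look like numbers, since something like _1 compiles happily as a name.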

On 05/06/2011 11:45 PM, Guido van Rossum wrote:
It would also be nice to have an easy way to emit _ in suitable places. Maybe this could be added to the .format() language for numbers? It would be nice if you could tell it to emit an _ every N positions.
We already support commas (PEP 378). Adding underscores in the same way would be easy. However, you can't specify N, it's always 3. Eric.

On Sat, May 7, 2011 at 12:51 PM, Eric Smith <eric@trueblade.com> wrote:
On 05/06/2011 11:45 PM, Guido van Rossum wrote:
It would also be nice to have an easy way to emit _ in suitable places. Maybe this could be added to the .format() language for numbers? It would be nice if you could tell it to emit an _ every N positions.
We already support commas (PEP 378). Adding underscores in the same way would be easy. However, you can't specify N, it's always 3.
Which would suck for non-decimal formats. :-( Also there seem to be some countries where the conventions for formatting currency use groupings other than 1000. E.g. http://www.ozgrid.com/forum/showthread.php?t=10226 (though specifying N wouldn't be enough there). -- --Guido van Rossum (python.org/~guido)
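The comma grouping referred to here is the ',' option of the format specification (PEP 378). A short sketch follows; the '_' option shown alongside it was only added in Python 3.6, after this thread, so that part assumes a newer interpreter than the one under discussion.

```python
n = 1234567

# PEP 378: groups of three digits, with no way to choose another size.
with_commas = format(n, ',')

# Python 3.6+ only: the same grouping with underscores.
with_underscores = format(n, '_')

# Python 3.6+ only: for hex output the underscore is placed every four digits.
hex_grouped = format(0xDEADBEEF, '_x')
```

Neither option lets the caller pick an arbitrary N, and neither handles locale-specific groupings such as the currency example linked above.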

On 05/06/2011 11:45 PM, Guido van Rossum wrote: Which would suck for non-decimal formats. :-( Also there seem to be some countries where the conventions for formatting currency uses groupings other than 1000. E.g. http://www.ozgrid.com/forum/showthread.php?t=10226 (though specifying N wouldn't be enough there).
Wouldn't something like that be the job of locale.currency()? Devin Jeanpierre

Greg Ewing <greg.ewing@canterbury.ac.nz> writes:
An alternative would be to allow spaces.
I would prefer to allow space between digits in a numeric literal.
1 2345 6789 1 2 3 4 5 6789 1 234 6789 1 234 567 89 9.876 543 210 0xFEFF 0042
This nicely parallels the fact that space can separate chunks of a string literal.
But that still leaves the following inconsistency:
int('1 234 567')
That will currently raise a ValueError. Should it continue to do so under this proposal?
-- “You say “Carmina”, and I say “Burana”, You say “Fortuna”, and I say “cantata”, Carmina, Burana, Fortuna, cantata, Let's Carl the whole thing Orff.” - anonymous
Ben Finney
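The current behaviour of int with embedded spaces is easy to confirm. A minimal sketch with illustrative variable names:

```python
# Whitespace inside the digits is rejected by int().
try:
    int('1 234 567')
    spaces_accepted = True
except ValueError:
    spaces_accepted = False

# Whitespace around the digits is stripped and therefore harmless.
padded = int('  1234567  ')
```

So any rule allowing spaces inside literals would have to decide separately whether int should follow suit.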

Too ambiguous, too hard to parse. I like the _ proposal.

On 07/05/2011 01:44, Ben Finney wrote:
I would prefer to allow space between digits in a numeric literal. […]
This nicely parallels the fact that space can separate chunks of a string literal.
I prefer there not to be whitespace inside tokens. String literals are an exception, they are explicitly delimited.

MRAB <python@mrabarnett.plus.com> writes:
On 07/05/2011 01:44, Ben Finney wrote:
I would prefer to allow space between digits in a numeric literal. […]
This nicely parallels the fact that space can separate chunks of a string literal.
I prefer there not to be whitespace inside tokens. String literals are an exception, they are explicitly delimited.
That's a good justification for the special case. Okay, I withdraw my proposal.
-- “Facts are stubborn things; and whatever may be our wishes, our inclinations, or the dictates of our passion, they cannot alter the state of facts and evidence.” - John Adams, 1770-12-04
Ben Finney
participants (20)
- Alexander Belopolsky
- Antoine Pitrou
- Ben Finney
- Bruce Leban
- Calvin Spealman
- Cameron Simpson
- dag.odenhall@gmail.com
- Devin Jeanpierre
- Eric Smith
- Ethan Furman
- Fred Drake
- Georg Brandl
- Greg Ewing
- Guido van Rossum
- Kirubakaran
- Matt Chaput
- MRAB
- Nadeem Vawda
- Philip Jenvey
- Steven D'Aprano