
I've been doing some work with large ints, of well over 100 digits. For example, this number has 131 digits:
    P = 29674495668685510550154174642905332730771991799853043350995075531276838753171770199594238596428121188033664754218345562493168782883
Sometimes I'm tempted to write numbers like that as follows:
    P = int('29674495668685510550154174642905332730771991'
            '79985304335099507553127683875317177019959423'
            '8596428121188033664754218345562493168782883')
which is nicer to read, except for the minor annoyance of the call to int and the string delimiters.
That got me thinking that it might be Nice To Have if we could split long int literals across multiple lines:
    P = 29674495668685510550154174642905332730771991\
        79985304335099507553127683875317177019959423\
        8596428121188033664754218345562493168782883
Or if you don't like backslashes, trailing underscores are currently illegal, so we could use them:
    P = 29674495668685510550154174642905332730771991_
        79985304335099507553127683875317177019959423_
        8596428121188033664754218345562493168782883
Thoughts?
-- Steven
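A minimal sketch of the workaround described above, which already runs on current Python: adjacent string literals are joined into one string before int() is called. The name digit_count is only illustrative.

```python
# Adjacent string literals are concatenated into a single string,
# so int() receives all 131 characters in one argument.
P = int('29674495668685510550154174642905332730771991'
        '79985304335099507553127683875317177019959423'
        '8596428121188033664754218345562493168782883')

# Confirm the digit count quoted in the post.
digit_count = len(str(P))
print(digit_count)
```

The cost is one function call and three pairs of quotes; no new syntax is involved.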

On Tue, Feb 4, 2020 at 9:41 PM Steven D'Aprano <steve@pearwood.info> wrote:
Or if you don't like backslashes, trailing underscores are currently illegal, so we could use them:
    P = 29674495668685510550154174642905332730771991_
        79985304335099507553127683875317177019959423_
        8596428121188033664754218345562493168782883
Another one to throw into the mix: Trailing underscores, but only if the expression is incomplete. So in simple cases like this, that means parenthesizing the number:
    P = (29674495668685510550154174642905332730771991_
         79985304335099507553127683875317177019959423_
         8596428121188033664754218345562493168782883)
ChrisA

On Tue, Feb 4, 2020 at 3:29 AM Chris Angelico <rosuav@gmail.com> wrote:
Another one to throw into the mix: Trailing underscores, but only if the expression is incomplete. So in simple cases like this, that means parenthesizing the number:
    P = (29674495668685510550154174642905332730771991_
         79985304335099507553127683875317177019959423_
         8596428121188033664754218345562493168782883)
FWIW, if a multi-line int literal syntax is deemed worthy of having, this syntax really makes me smile as the most obvious about its intent. I do not think anybody unaware of specific Python syntaxes would misread it. The requirement of the ()s fits with the general recommendation made to avoid \ by enclosing in ()s.
The question that remains is if the () around every such integer are required, or if this occurring within any existing ()s is sufficient. ex:
    method.call(123_
                456, 786_
                9)
*could* be semiconfusing. Though , and _ are visually distinct enough that I think it would stand out. And disallowing a final trailing _ prevents "_," accidents.
Requiring additional ()s in this case would be fine, but probably isn't worth it. I expect anyone entering a multi-line super long literal to not be inlining them in common practice and always be assigning them to a useful name for readability's sake.
-gps
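For reference, a small sketch of how the candidate spellings behave today, assuming CPython 3.6 or later (where underscores inside numeric literals are accepted). The helper name parses_as_expression is hypothetical; it only asks the compiler whether a string is a valid expression.

```python
def parses_as_expression(source: str) -> bool:
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# An internal underscore is legal; a trailing one is not, with or
# without parentheses, which is why the proposal would be new syntax.
for candidate in ("1_000", "123_", "(123_\n456)", "(123\n456)"):
    print(repr(candidate), parses_as_expression(candidate))
```

Only the first candidate is accepted at present, so none of the proposed spellings would change the meaning of existing valid code.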
ChrisA _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/TTEDIQ... Code of Conduct: http://python.org/psf/codeofconduct/

Hi
How about something like:
    >>> def t1(*argv):
    ...     value = 0
    ...     for n in argv:
    ...         value *= 1_000
    ...         value += n
    ...     return value
    >>> t1(123, 456, 789)
    123456789
Similarly, define t2 to use 1_000_000, t3 to use 1_000_000_000 and so on, instead of 1_000. For really big numbers, you might want to use t10. I think something like
    >>> t10(
    ...     111_222_333_444_555_666_777_888_999_000,
    ...     111_222_333_444_555_666_777_888_999_000,
    ...     111_222_333_444_555_666_777_888_999_000,
    ... )
gives a nice representation of a 90 digit number.
If you're dealing with really big integers (say 1000 digits or more) then you might want to use https://pypi.org/project/gmpy2/, in which case you'll appreciate the extra flexibility provided by t10. (This would allow t10 to return a gmpy integer, if import gmpy2 succeeds.)
Finally, perhaps really big numbers should be stored as data, rather than placed in source code files. (For example, this would allow these large pieces of data to be verified, via a secure hash.)
-- Jonathan
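The "store it as data and verify it with a secure hash" idea could look roughly like the sketch below. Everything here is an assumption made for illustration: SHA-256 as the hash, the helper names digest_of and load_verified_int, and the convention that whitespace and line breaks in the stored text are ignored.

```python
import hashlib


def digest_of(digits: str) -> str:
    """SHA-256 hex digest of a digit string (hash choice is an assumption)."""
    return hashlib.sha256(digits.encode("ascii")).hexdigest()


def load_verified_int(text: str, expected_sha256: str) -> int:
    """Join the whitespace-separated pieces, check the digest, then convert."""
    digits = "".join(text.split())
    if digest_of(digits) != expected_sha256:
        raise ValueError("stored number does not match its recorded digest")
    return int(digits)


# In real use the digest would be recorded separately from the data file;
# here it is computed on the spot just to keep the example self-contained.
stored_text = "123456789\n012345\n"
expected = digest_of("123456789012345")
print(load_verified_int(stored_text, expected))
```

Because the pieces are joined as strings before conversion, a chunk such as 012345 keeps its leading zero.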

How about something like:
    >>> def t1(*argv):
    ...     value = 0
    ...     for n in argv:
    ...         value *= 1_000
    ...         value += n
    ...     return value
    >>> t1(123, 456, 789)
    123456789
Similarly, define t2 to use 1_000_000, t3 to use 1_000_000_000 and so on, instead of 1_000. For really big numbers, you might want to use t10.
I think something like
    >>> t10(
    ...     111_222_333_444_555_666_777_888_999_000,
    ...     111_222_333_444_555_666_777_888_999_000,
    ...     111_222_333_444_555_666_777_888_999_000,
    ... )
gives a nice representation of a 90 digit number.
I think everyone on this thread has thought of this at some point, but it is a bother in practice. If you want to type a very long number, with Jonathan's proposal we would write something like this:
    >>> t1(3, 826, 393, 629, 472, 794, 629, 263, 026, 302...
    3826393629472794629263026302...
But if you notice that you made a typo in the very last argument (the low digits), you need to remove and re-add the commas. This is really a burden. In this respect, I'd agree with adding new syntax.
Finally, perhaps really big numbers should be stored as data, rather than placed in source code files. (For example, this would allow these large pieces of data to be verified, via a secure hash.)
As you suggest, I think numbers that run to megabytes should be stored as data, but what about big constants and the like? I believe those kinds of numbers should be stored in a Python file such as constants.py.
Yamato Nagata

On Wed, 5 Feb 2020 11:09:16 +0000 Jonathan Fine <jfine2358@gmail.com> wrote:
How about something like:
    >>> def t1(*argv):
    ...     value = 0
    ...     for n in argv:
    ...         value *= 1_000
    ...         value += n
    ...     return value
    >>> t1(123, 456, 789)
    123456789
Similarly, define t2 to use 1_000_000, t3 to use 1_000_000_000 and so on, instead of 1_000. For really big numbers, you might want to use t10.
Someone previously asked about a "base"; your idea could be extended to accommodate same:
    def tbuilder(base):
        def t(*argv):
            value = 0
            for n in argv:
                value *= base
                value += n
            return value
        return t
    >>> tbuilder(1000)(123, 456, 789)
    123456789
If you're dealing with really big integers (say 1000 digits or more) then you might want to use https://pypi.org/project/gmpy2/, in which case you'll appreciate the extra flexibility provided by t10. (This would allow t10 to return a gmpy integer, if import gmpy2 succeeds.)
+1
Finally, perhaps really big numbers should be stored as data, rather than placed in source code files. (For example, this would allow these large pieces of data to be verified, via a secure hash.)
+1
--
“Atoms are not things.” – Werner Heisenberg
Dan Sommers, http://www.tombstonezero.net/dan

Dan Sommers wrote:
On Wed, 5 Feb 2020 11:09:16 +0000 Jonathan Fine jfine2358@gmail.com wrote:
How about something like:
    >>> def t1(*argv):
    ...     value = 0
    ...     for n in argv:
    ...         value *= 1_000
    ...         value += n
    ...     return value
    >>> t1(123, 456, 789)
    123456789
Similarly, define t2 to use 1_000_000, t3 to use 1_000_000_000 and so on, instead of 1_000. For really big numbers, you might want to use t10.
Someone previously asked about a "base"; your idea could be extended to accommodate same:
    def tbuilder(base):
        def t(*argv):
            value = 0
            for n in argv:
                value *= base
                value += n
            return value
        return t
    >>> tbuilder(1000)(123, 456, 789)
    123456789
This won't work when leading zeros are involved, e.g. consider `123_006_789`: `t(123, 006, 789)` gives a SyntaxError. Also it's weird that when we want to write an int literal we end up with two function calls and individual numbers as arguments (plus you'd need to have that function handy in the first place). It would be clearer to use a kw-only argument `base=1000`, but even then it doesn't really convey the idea of a literal.
If you're dealing with really big integers (say 1000 digits or more) then you might want to use https://pypi.org/project/gmpy2/, in which case you'll appreciate the extra flexibility provided by t10. (This would allow t10 to return a gmpy integer, if import gmpy2 succeeds.)
+1
Finally, perhaps really big numbers should be stored as data, rather than placed in source code files. (For example, this would allow these large pieces of data to be verified, via a secure hash.)
+1
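A sketch of the keyword-only `base=1000` variant mentioned above, together with a check of the leading-zero problem. The names from_groups and is_valid_source are hypothetical, and the range check on each group is an added assumption, not part of the original suggestion.

```python
def from_groups(*groups: int, base: int = 1000) -> int:
    """Combine integer groups, most significant first, in the given base."""
    value = 0
    for group in groups:
        if not 0 <= group < base:
            raise ValueError(f"group {group} does not fit in base {base}")
        value = value * base + group
    return value


def is_valid_source(source: str) -> bool:
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# The middle group of 123_006_789 has to be spelled 6, not 006.
print(from_groups(123, 6, 789))                       # 123006789
print(is_valid_source("from_groups(123, 006, 789)"))  # False
```

The result is right, but the call no longer looks like the number it produces, which is the readability cost being pointed out.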

I thank Dominik Vilsmeier for noticing
    >>> 001
    SyntaxError: invalid token
This is a problem for both the original poster's suggestion:
    >>> P = 100\
    ... 000\
    ... 000
and mine:
    >>> t1(100, 000, 000)
That '001' is a syntax error interests me. I'll start a new thread for that.
-- Jonathan

On Thu, Feb 6, 2020 at 10:35 PM Jonathan Fine <jfine2358@gmail.com> wrote:
I thank Dominik Vilsmeier for noticing
    >>> 001
    SyntaxError: invalid token
This is a problem for both the original poster's suggestion:
    >>> P = 100\
    ... 000\
    ... 000
and mine:
    >>> t1(100, 000, 000)
That '001' is a syntax error interests me. I'll start a new thread for that.
That's a simple matter of history.
    Python 2.7.13 (default, Sep 26 2018, 18:42:22)
    [GCC 6.3.0 20170516] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 0100
    64
In C and its friends and family, a leading zero means octal. Python 3 removed this (you can use "0o100" for octal, paralleling "0x100" for hex), but in order to ensure that code would cleanly break rather than inexplicably giving the wrong result, "001" is an error.
ChrisA
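The Python 3 side of that history can be checked with a short sketch like the one below. The helper literal_is_valid is a hypothetical name; the rest uses only built-ins.

```python
def literal_is_valid(source: str) -> bool:
    """Return True if `source` compiles as a Python 3 expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


print(0o100)                     # 64: explicit octal literal
print(int("0100"))               # 100: int() on a string treats it as decimal
print(int("0100", 8))            # 64: octal only when asked for
print(literal_is_valid("0100"))  # False: the old octal spelling is rejected
print(literal_is_valid("00"))    # True: zero may still be written with extra zeros
```

So the restriction applies to source literals with a nonzero value, not to strings passed to int().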

On Thu, Feb 6, 2020 at 11:58 AM Chris Angelico <rosuav@gmail.com> wrote:
That's a simple matter of history.
    Python 2.7.13 (default, Sep 26 2018, 18:42:22)
    [GCC 6.3.0 20170516] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 0100
    64
In C and its friends and family, a leading zero means octal. Python 3 removed this (you can use "0o100" for octal, paralleling "0x100" for hex), but in order to ensure that code would cleanly break rather than inexplicably giving the wrong result, "001" is an error.
Thank you for this clear and concise explanation. As I explain in the post I promised (in the message you responded to), it was a good idea then, and it might not be a good idea now.
I ask that all follow-ups on that specific topic go to the new thread:
https://mail.python.org/archives/list/python-ideas@python.org/thread/7IKEXSM...
-- Jonathan

On Tue, 4 Feb 2020 at 10:39, Steven D'Aprano <steve@pearwood.info> wrote:
I've been doing some work with large ints, of well over 100 digits. For example, this number has 131 digits:
P = 29674495668685510550154174642905332730771991799853043350995075531276838753171770199594238596428121188033664754218345562493168782883
Sometimes I'm tempted to write numbers like that as follows:
    P = int('29674495668685510550154174642905332730771991'
            '79985304335099507553127683875317177019959423'
            '8596428121188033664754218345562493168782883')
which is nicer to read, except for the minor annoyance of the call to int and the string delimiters.
That got me thinking that it might be Nice To Have if we could split long int literals across multiple lines:
    P = 29674495668685510550154174642905332730771991\
        79985304335099507553127683875317177019959423\
        8596428121188033664754218345562493168782883
Or if you don't like backslashes, trailing underscores are currently illegal, so we could use them:
    P = 29674495668685510550154174642905332730771991_
        79985304335099507553127683875317177019959423_
        8596428121188033664754218345562493168782883
Thoughts?
My immediate instinct was underscores so I favour that over backslashes. And I think that if you need to write huge numbers like that then having a better way to break them up is important (you don't use internal underscores at all in your example, which surprises me, as that would be the first thing I'd do).
But I would ask, do you *really* type numbers like that in manually??? I can imagine them being output from another program, or from information in a webpage, that you copy and paste in here, but I'd be more likely to address that with a comment above the definition, saying how to regenerate the number, and that it was copy-and-pasted from that output. If you're copy/pasting, having to reformat is more awkward, rather than less.
Can you share a bit more about why you need to do this? In the abstract, having the ability to split numbers over lines seems harmless and occasionally useful, but conversely it's not at all obvious why anyone would be doing this in real life.
Paul

This seems like a pretty uncommon use case. But are there applications to other contexts where we might want easy line continuation?
-CHB
-- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython

[Paul Moore <p.f.moore@gmail.com>]
... Can you share a bit more about why you need to do this? In the abstract, having the ability to split numbers over lines seems harmless and occasionally useful, but conversely it's not at all obvious why anyone would be doing this in real life.
It's not all that uncommon among people who work in algorithmic number theory.
The P in Steven's example is taken from a paper, where it's the _input_ to a short calculation that computes Q, a number with about 3 times as many digits. It's Q that's interesting, not really P. By careful construction, Q is a composite number that fooled many "prime testing" functions in many popular packages. They claimed Q is prime(*).
Where did P come from? It's complicated. Far easier to copy/paste than to compute (for Q, the opposite).
Which, as computational resources grow more capable, becomes more common: interesting results can be "big" indeed, but computing them _can_ require CPU-centuries of effort.
That said, while I enjoy playing in that area, I don't have a real problem with pasting such things in. They're too big to get any intuitive concept of their size by eyeball, so it's fine by me to leave them on a single line. Steven's "has 131 digits" comment is far more informative to me than any way of breaking the literal across multiple lines.
So I'm -0 on complicating syntax to cater to this. The int(implicitly pasted string literals) trick has been good enough for me when I've really wanted it.
(*) For those who care,
    Q = P * (313*(P-1) + 1) * (353*(P-1) + 1)
Q is a "Carmichael number", a Fermat pseudoprime to all bases relatively prime to Q. It's also a strong pseudoprime to all bases < 307, meaning that all Miller-Rabin primality testers that stick to bases < 307 falsely claim it's prime. And P is Q's smallest prime factor, so "small to large" trial division is also hopeless for discovering that Q is composite. Ditto "large to small" trial division, since Q has no factors anywhere near its square root.
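The footnote's construction is short enough to try directly. The sketch below builds Q from P exactly as given and runs a plain Fermat test for a few small bases; the helper name fermat_passes and the choice of bases are illustrative assumptions, and this is not a Miller-Rabin implementation.

```python
P = int('29674495668685510550154174642905332730771991'
        '79985304335099507553127683875317177019959423'
        '8596428121188033664754218345562493168782883')

# The construction from the footnote above.
Q = P * (313 * (P - 1) + 1) * (353 * (P - 1) + 1)


def fermat_passes(n: int, base: int) -> bool:
    """Plain Fermat test: does base**(n-1) leave remainder 1 modulo n?"""
    return pow(base, n - 1, n) == 1


print(len(str(P)), len(str(Q)))  # digit counts of P and Q
print(Q % P == 0)                # Q is composite by construction
for base in (2, 3, 5, 7):
    print(base, fermat_passes(Q, base))
```

Three-argument pow keeps every intermediate value below Q, so the test finishes quickly even at this size.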

Thanks Tim, this is the sort of thing that I do like to dabble in, so the background information is really interesting. But as I say, I'd view it the same way as you. I'd copy/paste it in, leave the long line as it is and add a comment saying where I got it from. So I'm pretty neutral on the proposed syntax.
Paul

On 2020-02-04 02:37, Steven D'Aprano wrote:
Sometimes I'm tempted to write numbers like that as follows:
    P = int('29674495668685510550154174642905332730771991'
            '79985304335099507553127683875317177019959423'
            '8596428121188033664754218345562493168782883')
which is nicer to read, except for the minor annoyance of the call to int and the string delimiters.
This variant seems to work pretty well, and I find it easy to understand. If I had this problem often enough, and had end-users, I'd probably put the numbers in an .ini file (values broken with newlines), with a short function to load and assemble them properly.
-Mike
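A rough sketch of the .ini approach, using the standard configparser module. The section name constants and the helper load_big_int are hypothetical; the only configparser behaviour relied on is that indented continuation lines are joined into one multi-line value.

```python
import configparser

# Continuation lines must be indented deeper than the key.
INI_TEXT = """\
[constants]
P = 29674495668685510550154174642905332730771991
    79985304335099507553127683875317177019959423
    8596428121188033664754218345562493168782883
"""


def load_big_int(parser: configparser.ConfigParser, section: str, option: str) -> int:
    """Read a multi-line value, drop the line breaks, and convert to int."""
    raw = parser.get(section, option)
    return int("".join(raw.split()))


parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
P = load_big_int(parser, "constants", "P")
print(len(str(P)))
```

Here the text is read from a string to keep the example self-contained; a real setup would call parser.read() on a file path instead.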

Thanks everyone for your comments. Replying to everyone (so far) at once.
Chris Angelico:
I would be okay with your suggestion that line-breaking the int requires parentheses:
    a = 123_        # remains a SyntaxError
    b = (123_
         456)       # legal
but I don't know how easy that is for the parser.
Paul Moore:
But I would ask, do you *really* type numbers like that in manually???
Alas, sometimes the only source I have for the number is a PDF of a scanned paper (pixels, not copyable text) so yes, it has happened a few times that I have had to manually retype the number.
But the motivation for this is not to reduce the amount of typing, but to make the number nicer to read. It's about presentation.
For example, here's the source of the number P I gave earlier:
https://www.sciencedirect.com/science/article/pii/S0747717185710425?via%3Dih...
and P is given on page 7 (as p1), split over two lines. Even if I could copy and paste it, I would still have to edit it back to a single line.
Using underscores to group digits doesn't reduce the length of the number, it makes it longer :-)
For smaller numbers, in the trillions say, breaking the number up into triplets is helpful:
    a = 478_190_347_801
but with 100+ digit numbers I don't find that helpful. Although now you mention it I might separate the number into groups of ten or twenty. But I would still like to split it over multiple lines.
Christopher Barker:
This seems like a pretty uncommon use case. But are there applications to other contexts where we might want easy line continuation?
Sure, using huge ints is rather uncommon. Are there other use-cases for breaking huge ints over multiple lines that don't involve huge ints? Probably not :-)
I'm not proposing trailing underscore as a general line continuation mechanism. We already have two of those:
- explicit line continuation with a trailing backslash;
- implicit line continuation inside open brackets;
but neither can be used to split a 100+ digit integer in the middle of the constant. So this proposal is specific to huge ints.
Mike Miller:
I'm not sure what benefit you gain from moving the constant into an ini file. Moving it into an ini file just adds distance between the code and the value it needs to run, and adds a dependency that must be met or the code can't run. You need extra function calls to open the file, read the value as a string, and convert to an int. By default, ints in ini files can't be split over multiple lines, so you haven't actually solved the problem.
And worst of all, the value isn't a config setting that the end user can configure. Putting it into an ini file is an invitation to "tweak" the constant, but that's just going to break things.
-- Steven
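The "groups of ten or twenty" idea mentioned in the reply to Paul Moore can be tried with existing syntax, since internal underscores are already legal. A minimal sketch, with the hypothetical helper group_digits; it produces one long line and does nothing about line breaks.

```python
def group_digits(n: int, size: int = 10) -> str:
    """Render a non-negative int with an underscore every `size` digits,
    counting from the right, so the result is still a valid literal."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    digits = str(n)
    head = len(digits) % size
    parts = [digits[:head]] if head else []
    parts += [digits[i:i + size] for i in range(head, len(digits), size)]
    return "_".join(parts)


text = group_digits(12345678901234567890123, 10)
print(text)       # 123_4567890123_4567890123
print(int(text))  # int() accepts the underscores and returns the original value
```

As noted above, grouping makes the literal longer rather than shorter, so it addresses readability within a line but not the line length itself.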

Why not just build the number using some kind of fancy "continuing integer" constructor function? Then you could just use commas, Blacken the code, and it will look great, right?
    P = my_int_maker(29674495668685510550154174642905332730771991,
                     79985304335099507553127683875317177019959423,
                     8596428121188033664754218345562493168782883)
--- Ricky.
"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

That's not such a good idea if you can't specify base.

On 2/4/2020 6:17 PM, Ricky Teachey wrote:
Why not just build the number using some kind of fancy "continuing integer" constructor function? Then you could just use commas, Blacken the code, and it will look great, right?
    P = my_int_maker(29674495668685510550154174642905332730771991,
                     79985304335099507553127683875317177019959423,
                     8596428121188033664754218345562493168782883)
Unless you switch to strings, you're going to have a problem with leading zeros.
Eric
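A sketch of the "switch to strings" variant: if the pieces are passed as strings, leading zeros survive and a base can be specified as well. The name int_from_chunks and its keyword-only base parameter are hypothetical, chosen only for this illustration.

```python
def int_from_chunks(*chunks: str, base: int = 10) -> int:
    """Concatenate digit strings, most significant first, and convert once."""
    return int("".join(chunks), base)


print(int_from_chunks("1", "00", "23"))        # 10023: the zeros are kept
print(int_from_chunks("ff", "00", base=16))    # 65280, i.e. 0xff00
```

This is the same int() plus string pieces approach from the original post, wrapped in a function, so it brings back the quotes that the constructor idea was trying to avoid.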

On Tue, Feb 4, 2020 at 3:21 PM Ricky Teachey <ricky@teachey.org> wrote:
Why not just build the number using some kind of fancy "continuing integer" constructor function? Then you could just use commas, Blacken the code, and it will look great, right?
    P = my_int_maker(29674495668685510550154174642905332730771991,
                     79985304335099507553127683875317177019959423,
                     8596428121188033664754218345562493168782883)
(Rhetorical question:) How could my_int_maker(1, 00, 23) return 10023?
--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Wed, Feb 5, 2020 at 10:30 AM Ricky Teachey <ricky@teachey.org> wrote:
(Rhetorical question:) How could my_int_maker(1, 00, 23) return 10023?
Sorry I haven't been on the list long: does "rhetorical question" mean the same thing to the Dutch as to Americans? ;)
Not sure about Dutch and American usage, but where I grew up, it means "I'm asking this question, not because I want an answer, but because I want you to think about the question".
ChrisA

Maybe I meant "Socratic question." I was hoping you would realize there's no way (short of parsing the source code) to know how many zeros were entered as the second argument, hence that the proposal has a flaw.
On Tue, Feb 4, 2020 at 3:28 PM Ricky Teachey <ricky@teachey.org> wrote:
(Rhetorical question:) How could my_int_maker(1, 00, 23) return 10023?
Sorry I haven't been on the list long: does "rhetorical question" mean the same thing to the Dutch as to Americans? ;)
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On Tue, Feb 4, 2020, 6:35 PM Guido van Rossum <guido@python.org> wrote:
Maybe I meant "Socratic question." I was hoping you would realize there's no way (short of parsing the source code) to know how many zeros were entered as the second argument, hence that the proposal has a flaw.
I don't deserve your kindness! Sorry for the lame suggestion, kids.
participants (15)
- Chris Angelico
- Christopher Barker
- Dan Sommers
- Dominik Vilsmeier
- Eric V. Smith
- Gregory P. Smith
- Guido van Rossum
- Jonathan Fine
- Mike Miller
- Paul Moore
- Ricky Teachey
- Soni L.
- Steven D'Aprano
- Tim Peters
- 永田大和