Perhaps allow leading zeroes in integer literals
SUMMARY
=========

It was once a good idea, for Python 3 to forbid leading zeroes in integer literals. Since then circumstances have changed. Perhaps it's now, or soon will be, a good time to permit this.

A SURPRISE
==========

I was surprised by:
    >>> 0
    0
    >>> 00
    0
    >>> 000
    0
    >>> 001
    SyntaxError: invalid token
Compare this to:
    >>> int('0'), int('00'), int('000'), int('001')
    (0, 0, 0, 1)
And also:
    >>> 001.0
    1.0
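For anyone who wants to reproduce the session above from a script rather than at the interactive prompt, here is a minimal sketch. It relies only on the built-in compile() function; the helper name compiles_as_expression is made up for illustration and is not part of any library.

```python
def compiles_as_expression(source: str) -> bool:
    """Return True if `source` parses as a Python 3 expression."""
    try:
        compile(source, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


# The same literals as in the session above, kept as strings so that
# this script itself stays valid Python 3.
candidates = ["0", "00", "000", "001", "001.0"]
results = {text: compiles_as_expression(text) for text in candidates}

for text, ok in results.items():
    print(f"{text!r:>8}: {'accepted' if ok else 'SyntaxError'}")

# int() on a string is more lenient than the literal syntax.
print(int("001"))
```

Only the integer literal with a nonzero digit after the leading zeros is rejected; the all-zero forms, the float form and the int() call are accepted.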
A NEWBY GOTCHA
================

A standard way of writing today's date is 2020-02-06. Let's try this in Python, first with Christmas Day, and then for today:
    >>> datetime.date(2020, 12, 25)
    datetime.date(2020, 12, 25)
    >>> datetime.date(2020, 02, 06)
    SyntaxError
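As a side note, here is a sketch of how the same date can be built in current Python 3 without leading-zero literals. It assumes Python 3.7 or later for datetime.date.fromisoformat; the variable names are only illustrative.

```python
import datetime

# Write the month and day without leading zeros.
today = datetime.date(2020, 2, 6)

# Or keep the familiar 2020-02-06 spelling as a string and parse it.
parsed = datetime.date.fromisoformat("2020-02-06")

# int() accepts leading zeros in strings, so splitting also works.
year, month, day = (int(part) for part in "2020-02-06".split("-"))
built = datetime.date(year, month, day)

print(today, parsed, built)
```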
EXPLANATION
============

So what's happening? Briefly, in Python 2.7 we have two forms with the same meaning:
    >>> 01, 010
    (1, 8)
    >>> 0o1, 0o10
    (1, 8)
or in other words 010 is an OCTAL integer literal. In Python3 this was removed, and instead we have only:
    >>> 0o1, 0o10
    (1, 8)
    >>> 0O1, 0O10
    (1, 8)
In some fonts, capital O and digit 0 are very similar. So using capital O for OCTAL is probably a bad thing to do.

CONCLUSIONS
=============

Here's my view of the situation.

1. It was good to remove from Python3 the octal trap for 010. (In the beginning, most Python users already knew C. I doubt that's true now. And explicit is better than implicit.)

2. Once we've done that it made sense not to reuse 010 as a form for decimal numbers. (We don't want code that is valid for both Python 2 and Python 3, but with different meaning. Particularly for something so basic as an integer literal. Errors should never pass silently.)

3. Python2 is now obsolete. Perhaps now, or sometime in the next year or two, it makes sense to enable leading zeros in integer literals.

4. This would allow us to remove the newby trap, by providing:
    >>> datetime.date(2020, 02, 06)
    datetime.date(2020, 02, 06)
Notice I've changed the repr of today's date!

REFERENCES
============

Here's a couple of stackoverflow posts:

https://stackoverflow.com/questions/36386346/syntaxerror-invalid-token
https://stackoverflow.com/questions/50290476/why-python-shows-invalid-token-...

Here's a thread that brought this syntax error to my attention.

https://mail.python.org/archives/list/python-ideas@python.org/message/2NCIAU...

--
Jonathan
On Thu, Feb 6, 2020 at 11:07 PM Jonathan Fine <jfine2358@gmail.com> wrote:
SUMMARY
=========

It was once a good idea, for Python 3 to forbid leading zeroes in integer literals. Since then circumstances have changed. Perhaps it's now, or soon will be, a good time to permit this.

So what's happening? Briefly, in Python 2.7 we have two forms with the same meaning

    >>> 01, 010
    (1, 8)
    >>> 0o1, 0o10
    (1, 8)

or in other words 010 is an OCTAL integer literal. In Python3 this was removed, and instead we have only:

    >>> 0o1, 0o10
    (1, 8)
    >>> 0O1, 0O10
    (1, 8)
IMO this isn't worth enabling. The chances of a subtle bug are too high. The usage of "leading zero means octal" exists in far too many languages, and if someone comes from one of those (or from Python 2) and tries something in Python 3, it's important to not create bizarrely incorrect behaviour; an immediate SyntaxError is safe. The proposed advantage, by comparison, is fairly minor.

ChrisA
06.02.20 14:03, Jonathan Fine wrote:
    >>> 001
    SyntaxError: invalid token

The error message was improved in the latest Python:

    >>> 001
      File "<stdin>", line 1
    SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
1. It was good to remove from Python3 the octal trap for 010. (In the beginning, most Python users already knew C. I doubt that's true now. And explicit is better than implicit.)
C is still in use. As well as other programming languages which use the C format for octals.
3. Python2 is now obsolete. Perhaps now, or sometime in the next year or two, it makes sense to enable leading zeros in integer literals.
It is obsolete, but it is still in use, and will be for many years (maybe tens of years). There is no official release of CPython 2.7 planned after April of 2020, but other implementations (PyPy, Jython) can continue to release new versions. Some companies will provide distributions of CPython 2.7 for years (maybe with their own patches).
4. This would allow us to remove the newby trap, by providing:
    >>> datetime.date(2020, 02, 06)
    datetime.date(2020, 02, 06)
It is not a trap. It is just an error, and the compiler immediately gives you a clue how to fix it (remove the leading 0s if you meant decimal, or use the 0o prefix if you meant octal). The trap is writing 0776 when you mean octal 0o776 and getting decimal 776 instead.
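To make the size of that trap concrete, here is a small sketch using only the built-in int(). The string "0776" stands in for the literal, since the literal itself does not compile in Python 3; the variable names are illustrative.

```python
text = "0776"

octal_value = 0o776            # what a C or Python 2 programmer means
as_octal = int(text, 8)        # explicit base 8 gives the same value
as_decimal = int(text)         # base 10 silently ignores the leading zero

print(octal_value, as_octal, as_decimal)

# Base 0 means "parse like a source literal", so it rejects the
# ambiguous spelling just as the compiler does.
try:
    int(text, 0)
    rejected_by_base_zero = False
except ValueError:
    rejected_by_base_zero = True

print(rejected_by_base_zero)
```

The two readings differ by 266, and nothing at runtime would flag the difference if the decimal reading were accepted silently.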
On Feb 6, 2020, at 04:06, Jonathan Fine <jfine2358@gmail.com> wrote:
A NEWBY GOTCHA
================

A standard way of writing today's date is 2020-02-06. Let's try this in Python, first with Christmas Day, and then for today:

    >>> datetime.date(2020, 12, 25)
    datetime.date(2020, 12, 25)
    >>> datetime.date(2020, 02, 06)
    SyntaxError
On the other hand, even if you’re not familiar with any of the languages that use the C octal format, you are likely to run into it in documentation.

For example, let’s say you want to change a file to not be world-readable or -executable. You look up os.chmod and, wow, look at all those things you’re supposed to OR together. Is there an easier way? Sure, it’s just calling the system’s chmod function, and you can man chmod or search online to find a helpful tutorial blog or look on StackOverflow or ServerFault or SuperUser, and you’ll quickly find not just a more in-depth explanation, but an example that does exactly what you want:

    chmod(path, 0770)

So, you figure, in Python, it should be just:

    os.chmod(path, 0770)

So you put that in a script and you get an error that explains that if this is meant to be octal you wanted 0o770, if it was meant to be decimal you wanted 770. So you do what it says and add the o and you’re set.

With your change, this instead compiles and runs and sets the mode to 0o1402, which if you’re lucky you don’t have permissions to do, otherwise it’s now sticky, read-only for owner, inaccessible for group, and write-only for world.

There aren’t many cases not related to mode values and the os module where this problem would come up, but I think mode values are common enough, and unfamiliar enough to newbies that they’re going to search, to be a problem. And they’re ubiquitously documented in C or shell terms that will give you C-style octal values, so most people who do search will be misled.
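The arithmetic in this example can be checked without touching any file. The sketch below uses only the standard stat module; the describe helper is made up for illustration, and it ORs in stat.S_IFREG purely so that stat.filemode renders the mode as a regular file.

```python
import stat

intended = 0o770    # rwx for owner and group, nothing for others
accidental = 770    # what 0770 would become if read as decimal


def describe(mode: int) -> str:
    """Render permission bits in ls -l style, assuming a regular file."""
    return stat.filemode(stat.S_IFREG | mode)


print(oct(intended), describe(intended))
print(oct(accidental), describe(accidental))

# Pick the accidental mode apart bit by bit.
print("sticky bit set:", bool(accidental & stat.S_ISVTX))
print("owner bits:", oct((accidental & stat.S_IRWXU) >> 6))
print("group bits:", oct((accidental & stat.S_IRWXG) >> 3))
print("other bits:", oct(accidental & stat.S_IRWXO))
```

Nothing here calls os.chmod, so the script is safe to run anywhere; it only shows what mode the file would have ended up with.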
On 2/6/20 7:03 AM, Jonathan Fine wrote:
SUMMARY
=========

It was once a good idea, for Python 3 to forbid leading zeroes in integer literals. Since then circumstances have changed. Perhaps it's now, or soon will be, a good time to permit this.

A SURPRISE
==========

I was surprised by:

    >>> 0
    0
    >>> 00
    0
    >>> 000
    0
    >>> 001
    SyntaxError: invalid token
Just thinking a bit, it should be safe to relax the prohibition for numbers with only 1 significant digit. 02 is ok, 020 is not, since numbers with 1 significant digit can't be misinterpreted.

As to when it would be safe to totally remove that protection, Python 2.x isn't dead yet, and really won't be for a while. It may be that the Python developers have declared that support for it from them is over (though there is still that last release in April to come), but it will still be alive, maybe on life support, while the various Long Term Support distributions maintain it. That won't end till some time next year by my understanding. And then you have people who will still use the unsupported 2.7 while they work on switching over and trying to figure out how to make their ancient code work on these new systems.

This doesn't mean that Python 3.x needs to bend over backwards to cater to 2.7 users, but thought should be given to the fact that there are, and will be for a while, people used to Python 2.x.

In this particular case, since many other languages and some operating systems use a simple leading zero to indicate octal, it may make sense to just keep the ambiguous numbers invalid forever. If we admit that the reasoning is ambiguity, then allowing the single digit numbers could make sense even if it does mean that writing the rule as a grammar is complicated.

--
Richard Damon
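As a rough illustration of the relaxed rule suggested above, here is one possible reading of it as a check on a string of decimal digits. This is only a sketch: the function name allowed_under_relaxed_rule is hypothetical, underscores in literals are ignored, and nothing like this exists in CPython's tokenizer.

```python
import re


def allowed_under_relaxed_rule(digits: str) -> bool:
    """Hypothetical check: would this run of decimal digits be accepted?

    Accepts ordinary decimal literals, all-zero literals (already legal
    today), and zero-padded literals with a single significant digit.
    """
    if re.fullmatch(r"[0-9]+", digits) is None:
        raise ValueError(f"not a run of decimal digits: {digits!r}")
    if not digits.startswith("0"):
        return True                      # ordinary decimal, e.g. 2020
    significant = digits.lstrip("0")     # trailing zeros are kept
    return len(significant) <= 1         # "00" -> "", "02" -> "2"


for text in ["2020", "0", "00", "02", "06", "020", "0770"]:
    print(text, allowed_under_relaxed_rule(text))
```

Under this reading 02 and 06 pass, so the date example would work, while 020 and 0770 stay rejected because an octal and a decimal reading would give different values.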
participants (5)

- Andrew Barnert
- Chris Angelico
- Jonathan Fine
- Richard Damon
- Serhiy Storchaka