Proposal: Using % sign for percentage

Good day! As Python focuses on readability, why not use the % sign for actual percentages? For example:

```
rate = 0.1058  # float
rate = 10.58%  # percent, similar to above
```

It does not interfere with the modulo operator, as modulo follows a different format:

```
a = x % y
```

This looks like a small feature but it will surely set Python a level higher in terms of readability. Thanks a lot!

Hi Ronie

There's a problem with using % as you suggest. What's the meaning, and thus the value, of

    >>> 300%+57

At present it's 15. You would have it equal to 60.

Your motivation is good, but the solution you've proposed might not work. By the way, Python does care about readability of numeric constants. It was recently (?) changed to allow

    >>> billion = 1_000_000_000

I do agree with you, readability counts. Thank you for your contribution. Perhaps there's a better solution.

-- Jonathan

I don't have any specific knowledge about this, but it probably has not been done because it could easily cause subtle bugs that aren't immediately obvious, and the benefits aren't that great compared to just using the format specification language to produce a % representation.

Consider, if % was a postfix operator: the modulo value is: 111.00000000000001% yuck!

Another problem: if you want a "%" sign displayed as part of the repr, this would require a new "percent" numerical subtype with its own __repr__. Which opens up a big can of worms:

- should percent instances inter-operate with floats, ints, Decimals...?
- if so, what should the return value be when you add/subtract/etc. them with other types?

Also consider that if this was done, it would be the only postfix operator in the entire language.

If you really want to do this, you could subclass float and overload the __rmod__ and just have it take an ellipsis as the RHS argument, which doesn't look too awful to me:
But IMO there's already a much better way to do all of this:
    >>> f"{x:.0%}"
    '111%'
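As a rough illustration of the float-subclass idea and the questions it raises, here is a minimal sketch. It is not the code from the original message: the class name `Percent` is made up, and it overloads `__mod__` rather than `__rmod__`, on the assumption that the percent value is the left operand of `value % ...`.

```python
class Percent(float):
    """Hypothetical float subclass whose repr shows a percent sign."""

    def __repr__(self):
        # Reuse the format mini-language; float(self) avoids recursing into __repr__.
        return format(float(self), ".2%")

    def __mod__(self, other):
        # `value % ...` is read as "value percent"; anything else is ordinary modulo.
        if other is Ellipsis:
            return Percent(float(self) / 100)
        return float(self) % other


rate = Percent(10.58) % ...
print(repr(rate))        # 10.58%
print(type(rate + 1))    # arithmetic falls back to plain float
print(f"{1.11:.0%}")     # the existing format spec needs no new type
```

Note that `rate + 1` silently drops the subclass, which is exactly the interoperability question listed above.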

What if we had reverse format strings, i.e. reading formatted input?

    x = readf("{:.0%}", "50%")  # => x == 0.50
    x = readf("{:.0}", "50")  # => x == 0.50

Readf takes the repr() of its second argument and “un-formats” the string.
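For the percent case alone, reading the value back needs no new machinery. The sketch below uses a made-up helper name, `parse_percent`, and assumes the input is a plain decimal number followed by a percent sign; it is not the proposed `readf`.

```python
def parse_percent(text):
    """Hypothetical helper: turn a string such as '50%' into the fraction 0.5."""
    stripped = text.strip()
    if not stripped.endswith("%"):
        raise ValueError(f"expected a trailing percent sign: {text!r}")
    return float(stripped[:-1]) / 100


x = parse_percent("50%")
print(x)           # 0.5
print(f"{x:.0%}")  # 50%
```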

On Jun 25, 2019, at 08:44, James Lu <jamtlu@gmail.com> wrote:
What's the algorithm for this?

There's at least possibly an answer for things like C printf, because the format specifier "%.1f" tells you the type: if you pass anything but a double to printf with that specifier, it's undefined behavior, so you can write a scanf function that knows that "%.1f" has to "unprintf" into a double.

But the Python format specifier ".1" tells you nothing about the type. It can be passed a float, a str, an instance of some arbitrary user-defined type... so what type can tell you how to "unformat" it?

(By the way, format(50.0, ".1") gives you "5e+01", not "50". And the repr of the string "50" is the string "'50'". Also, surely unformatting returns not a single value, but as many values as there are format specifiers, right? But these issues are trivial.)

It might be possible to come up with a solution to this. Maybe you could explicitly specify the types in the "unformat" specifier, before the colon, and it calls __unformat__ on that type, which returns the parsed value and the remainder of the string? That could work for simple cases:

    x, = unformat("{float:.0f}", "50")

... or even:

    x, op, y = unformat("{float:.0f} {str:.1} {float:.0f}", "2 * 3")

But what happens in this case:

    x, op, y = unformat("{float:.0f} {str} {float:.0f}", "2 * 3")

There's no way str.__unformat__("", "* 3") can know whether to parse one character or the whole string or anything in between. Unless you want some horribly inefficient backtracking scheme, this has to be illegal.

(And this is exactly why people come up with more restricted languages to parse. If you know the language is regular, you can write a regex to parse it, which actually can disambiguate cases like this without exponential backtracking.)
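To make the last point concrete, here is a small sketch of the "2 * 3" case handled with the standard re module. The pattern and the group names are illustrative assumptions, not part of any proposal in this thread.

```python
import re

# A number, a space, an operator of unknown length, a space, another number.
NUMBER = r"[-+]?\d+(?:\.\d+)?"
PATTERN = re.compile(rf"(?P<x>{NUMBER}) (?P<op>.+?) (?P<y>{NUMBER})")

match = PATTERN.fullmatch("2 * 3")
x, op, y = float(match["x"]), match["op"], float(match["y"])
print(x, op, y)

# The formatting detail mentioned above: ".1" on a float uses general format.
print(format(50.0, ".1"))  # 5e+01
```

The regex decides where the operator ends because it sees the whole pattern at once, which a per-type __unformat__ call cannot do.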

Please change the subject line when you change the topic of conversation. Or even better, since this isn't strongly related to what you're replying to, and you quoted no text whatsoever - just make it a brand new post. Thanks! On Thu, Jun 27, 2019 at 5:06 AM James Lu <jamtlu@gmail.com> wrote:
What you're talking about sounds like a scanf sort of thing. In C, printf and scanf are approximate counterparts, as are sprintf and sscanf, which work directly with strings (compare json.load and json.loads, or json.dump and json.dumps). The way sscanf works is that every marker clearly defines the type of value it can parse:

- %d - one integer, decimal
- %f - floating-point value in decimal
- %s - string
- etc, etc

Trying to form a parallel to the way Python's .format() method works is a little tricky, because format() starts with the object it's formatting, and then says "hey, format yourself, kay?". So it may be best to abandon that and instead stick with the well-known sscanf notation.

The main advantage of sscanf over a regular expression is that it performs a single left-to-right pass over the format string and the target string simultaneously, with no backtracking. (This is also its main DISadvantage compared to a regular expression.) A tiny amount of look-ahead in the format string is the sole exception (for instance, format string "%s$%d" would collect a string up until it finds a dollar sign, which would otherwise have to be written "%[^$]$%d").

There is significant value in having an extremely simple parsing tool available; the question is, is it worth complicating matters with yet another way to parse strings? (We still have fewer ways to parse than ways to format strings. I think.)

ChrisA
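A minimal sketch of such a single-pass parser might look like the following. The function name `sscanf`, the argument order, and the supported subset (%d, %f, %s only, and no look-ahead, so "%s$%d" would not work here) are all assumptions made for illustration; this is not an existing Python API.

```python
import re

# Each conversion: an anchored pattern for one field, plus the type to build.
_CONVERSIONS = {
    "d": (re.compile(r"\s*([-+]?\d+)"), int),
    "f": (re.compile(r"\s*([-+]?(?:\d+\.?\d*|\.\d+))"), float),
    "s": (re.compile(r"\s*(\S+)"), str),
}


def sscanf(text, fmt):
    """Walk fmt and text together, left to right, never revisiting a field."""
    values = []
    pos = 0  # current position in text
    i = 0    # current position in fmt
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt) and fmt[i + 1] in _CONVERSIONS:
            pattern, convert = _CONVERSIONS[fmt[i + 1]]
            match = pattern.match(text, pos)
            if match is None:
                break  # stop at the first failed conversion
            values.append(convert(match.group(1)))
            pos = match.end()
            i += 2
        elif fmt[i].isspace():
            # Whitespace in the format matches any run of whitespace.
            while pos < len(text) and text[pos].isspace():
                pos += 1
            i += 1
        else:
            # Any other character must match literally.
            if pos >= len(text) or text[pos] != fmt[i]:
                break
            pos += 1
            i += 1
    return tuple(values)


print(sscanf("2 * 3", "%d %s %d"))
print(sscanf("rate=10.58", "rate=%f"))
```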

On Sun, May 05, 2019 at 02:34:27AM +0800, Ronie Martinez wrote:
Alas, that's not correct, because + and - are both unary operators as well as binary operators, so this:

    x % + y

is ambiguous:

    x modulo (+y)
    (x%) plus y
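How current Python reads that expression can be checked with the standard ast module. This is only a small sketch; the variable names are illustrative.

```python
import ast

# Parse the expression and look at the tree Python builds today.
node = ast.parse("x % + y", mode="eval").body
print(ast.dump(node))

# The top-level node is a binary modulo whose right operand is unary plus.
print(type(node.op).__name__, type(node.right).__name__)

# The same reading applied to numbers:
print(300 % +57)  # 15
```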
This looks like a small feature but it will surely set Python a level higher in terms of readability.
It definitely does look like a very small feature, but can you demonstrate that it will be better for readability by showing some actual real code that would be improved by this? How often do you hard-code a rate into your program like this?

    rate = 12.75%

I don't see the readability improvement from converting the decimal to a percent in my head once, when I write the code:

    # think: I want 12.75% so divide by 100
    rate = 0.1275

versus having to convert the percent to a decimal in my head every single time I read the code:

    # read:
    rate = 12.75%
    # think: percent operator, not modulo,
    # so divide by 100, the value must be 0.1275

Percent notation is already ambiguous in real life, which is why mathematicians don't use it. This is fine:

    discount = sell_price * 20%

but what do you expect these to do?

    price = cost + 20%
    marked_down_price = price - 15%

Aside from the ambiguity with modulo, and the dubious readability improvement, I think this will trip people up when they try to calculate with it. There are two obvious meanings for the above examples:

    # percent unary operator has low precedence
    price = (cost + 20)/100

    # percent unary operator has high precedence
    price = cost + (20/100)

and neither matches what people mean when they say "the price is the cost plus 20 percent".

-- Steven
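The three readings can be compared side by side in ordinary Python. This is a sketch with an arbitrary cost of 100; the variable names are made up for the example.

```python
cost = 100

# Reading 1: the percent operator binds loosely, dividing the whole sum.
low_precedence = (cost + 20) / 100

# Reading 2: the percent operator binds tightly, so 20% is just 0.2.
high_precedence = cost + (20 / 100)

# What "cost plus 20 percent" means in everyday speech: 20 percent of the cost.
everyday_meaning = cost * (1 + 20 / 100)

print(low_precedence, high_precedence, everyday_meaning)
```

Only the third line gives roughly 120, and it is the one neither precedence rule would produce.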

participants (7): Andrew Barnert, Chris Angelico, James Lu, Jonathan Fine, Ricky Teachey, Ronie Martinez, Steven D'Aprano