SI scale factors alone, without units or dimensional analysis
Ken has made what I consider a very reasonable suggestion, to introduce SI prefixes to Python syntax for numbers. For example, typing 1K will be equivalent to 1000. However, there are some complexities that have been glossed over.

(1) Are the results floats, ints, or something else?

I would expect that 1K would be int 1000, not float 1000. But what about fractional prefixes, like 1m? Should that be a float or a decimal? If I write 7981m I would expect 7.981, not 7.9809999999999999, so maybe I want a decimal float, not a binary float?

Actually, what I would really want is for the scale factor to be tracked separately. If I write 7981m * 1M, I should end up with 7981000 as an int, not a float. Am I being unreasonable?

Obviously if I write 1.1K then I'm expecting a float. So I'm not *entirely* unreasonable :-)

(2) Decimal or binary scale factors?

The SI units are all decimal, and I think if we support these, we should insist that K == 1000, not 1024. For binary scale factors, there is the IEC standard:

http://physics.nist.gov/cuu/Units/binary.html

which defines Ki = 2**10, Mi = 2**20, etc. (Fortunately this doesn't have to deal with fractional prefixes.) So it would be easy enough to support them as well.

(3) µ or u, k or K?

I'm going to go to the barricades to fight for the real SI prefixes µ and k to be supported. If people want to support the common fakes u and K as well, that's fine, I have no objection, but I think that it's important to support the actual prefixes too. (Python 3 assumes UTF-8 as the default encoding, so it shouldn't cause any technical difficulties to support µ as syntax. The political difficulties though...)

(4) What about E?

E is tricky if we want 1E to be read as the integer 10**18, because it matches the floating point syntax 1E (which is currently a syntax error). So there's a nasty bit of ambiguity where it may be unclear whether or not 1E is intended as an int or an incomplete float, and then there's 1E1E which might be read as 1E1*10**18 or as just an error.

Replacing E with (say) X is risky. The two largest current SI prefixes are Z and Y, so it seems very likely that the next one added (if that ever happens) will be X. Actually, using any other letter risks clashing with a future expansion of the SI prefixes.

(5) What about other numeric types?

Just because there's no syntactic support for Fraction and Decimal shouldn't mean we can't use these scale factors with them.

(6) What happens to int(), float() etc?

I wouldn't want int("23K") to suddenly change from being an error to returning 23000. Presumably we would want int to take an optional argument to allow the interpretation of scale factors.

This gives us an advantage: int("23E", scale=True) is unambiguously an int, and we can ignore the fact that it looks like a float.

(7) What about repr() and str()?

I don't think that the repr() or str() of numeric types should change. But perhaps format() could grow some new codes to display numbers using either the most obvious scale factor, or some specific scale factor.

* * *

This leads to my first proposal: require an explicit numeric prefix on numbers before scale factors are allowed, similar to how we treat non-decimal bases.

    8M        # remains a syntax error
    0s8M      # unambiguously an int with a scale factor of M = 10**6
    0s1E1E    # a float 1E1 with a scale factor of E = 10**18
    0s1.E     # a float 1. with a scale factor of E, not an exponent

    int('8M')            # remains a ValueError
    int('0s8M', base=0)  # returns 8*10**6

Or if that's too heavy (two whole characters, plus the suffix!) perhaps we could have a rule that the suffix must follow the final underscore of the number:

    8_M            # int 8*10**6
    123_456_789_M  # int 123456789*10**6
    123_M_456      # still an error
    8._M           # float 8.0*10**6

int() and float() take a keyword-only argument to allow a scale factor when converting from strings:

    int("8_M")              # remains an error
    int("8_M", scale=True)  # allowed

This solves the problem with E and floats. It's only a scale factor if it immediately follows the final underscore in the float, otherwise it is the regular exponent sign.

Proposal number two: don't make any changes to the syntax, but treat these as *literally* numeric scale factors. Add a simple module to the std lib defining the various factors:

    k = kilo = 10**3
    M = mega = 10**6
    G = giga = 10**9

etc. and then allow the user to literally treat them as scale factors by multiplying:

    from scaling import *
    int_value = 8*M
    float_value = 8.0*M
    fraction_value = Fraction(1, 8)*M
    decimal_value = Decimal("1.2345")*M

and so forth. The biggest advantage of this is that there are no syntactic changes needed, it is completely backwards compatible, and it works with any numeric type and even non-numbers:

    py> x = [None]*M
    py> len(x)
    1000000

You can even scale by multiple factors:

    x = 8*M*K

Disadvantages: none I can think of.

(Some cleverness may be needed to have fractional scale values work with both floats and Decimals, but that shouldn't be hard.)

-- Steve
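As a rough illustration of proposal number two, here is a minimal sketch of what such a module could contain. The module itself is hypothetical (nothing called "scaling" exists in the std lib), and only the positive, integer-valued factors are shown, since the fractional ones need the extra cleverness mentioned above.

```python
from decimal import Decimal
from fractions import Fraction

# Hypothetical contents of a "scaling" module: plain integer constants.
k = kilo = 10**3
M = mega = 10**6
G = giga = 10**9

# Ordinary multiplication keeps whatever numeric type the user started with.
int_value = 8 * M                      # int
float_value = 8.0 * M                  # float
fraction_value = Fraction(1, 8) * M    # Fraction
decimal_value = Decimal("1.2345") * M  # Decimal

# It also works with non-numbers, because it is just the * operator.
x = [None] * M
```

Because the factors are ordinary ints, nothing here depends on new syntax or on any particular Python version.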
On Fri, Aug 26, 2016 at 10:47 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Introduce "d" as a prefix meaning 1, and this could be the way of creating something that people have periodically asked for: Decimal literals. (Though IIRC there were some complexities involving Decimal literals and decimal.getcontext(), which would have to be resolved before 1m could represent a Decimal.)
Easy. Make them Fraction literals instead. You'll end up with 7981000/1 as a rational, rather than a pure int, but if you want accurate handling of SI prefixes, rationals will serve you fairly well.
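To make the rational idea concrete, here is a small sketch using the existing fractions module. The names milli and mega are stand-ins for the proposed m and M suffixes, which do not exist as syntax.

```python
from fractions import Fraction

milli = Fraction(1, 1000)  # stand-in for the proposed "m" suffix
mega = Fraction(10**6)     # stand-in for the proposed "M" suffix

# 7981m * 1M, computed exactly with rationals.
product = (7981 * milli) * (1 * mega)
```

The result is a Fraction with denominator 1, so int(product) recovers the integer without any rounding.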
Obviously if I write 1.1K then I'm expecting a float. So I'm not *entirely* unreasonable :-)
Obviously :)
from __future__ import binary_scale_factors as scale_factors
from __future__ import decimal_scale_factors as scale_factors
# tongue only partly in cheek
I would strongly support the use of µ and weakly u. With k vs K, no opinion. If both can be supported without being confusing, grab 'em both. With output formats, it's less clear, but I would still be inclined toward µ for output.
It's worse than that. Currently, 1E+2 is a perfectly legal 100.0 (float), but under this proposal, it would be a constant expression yielding 1_000_000_000_000_000_002, so it wouldn't just be giving meaning to things that are currently errors.
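The current behaviour can be checked directly. This sketch uses ast.literal_eval to show how today's parser treats the two spellings, and then computes the value the expression would take under the hypothetical reading where E means 10**18.

```python
import ast

# Today: "1E+2" is a float literal with an exponent.
current_value = ast.literal_eval("1E+2")

# Today: a bare "1E" is rejected by the parser.
try:
    ast.literal_eval("1E")
    bare_1e_is_error = False
except SyntaxError:
    bare_1e_is_error = True

# Hypothetical reading if E were a scale factor: (1 * 10**18) + 2.
hypothetical_value = 1 * 10**18 + 2
```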
Anything's risky. Probably the least risky option is to simply stop before Exa and implement the feature without.
Agreed. And I'd have them simply pick the one most obvious - if you want a specific factor, you can simply invert and display.
Hmm, interesting. Feels clunky but could work.
This sounds better IMO. It's not legal syntax in any version of Python older than 3.6, so there's minimal backward compatibility trouble. ChrisA
On 2016-08-26 14:34, Chris Angelico wrote:
According to Wikipedia, it's recommended that there be a space between the number and the units, thus not "1kg" but "1 kg". As we don't put spaces inside numbers in Python, insisting on an underscore instead would seem to be a reasonable compromise.
Okay, so I talked to Guido about this, and all he was trying to convey is that there is an extremely high bar that must be reached before he will consider changing the base language, which of course is both prudent and expected.

I'd like to continue the discussion because I believe there is some chance that we could reach that bar even though Guido is clearly skeptical. At this point I'd like to suggest that there be two more constraints we consider:

1. whatever form we choose to be used when outputting numbers should be the same form we use when inputting numbers (so %r should produce valid python input, as do all the other format codes),
2. and whatever form we choose to be used when outputting numbers should look natural to end users.

So ideas like using 2.4*G, 2.4*GHz, or 0s2.4G wouldn't really work because we would not want to output numbers to end users in this form. Even 2.4_GHz, while better, would still look somewhat unnatural to end users.

One more thing to consider is that we already have a precedent here. Python already accepts a suffix on real numbers: j signifies an imaginary number. In this case the above constraints are satisfied. For example, 2j is a natural form to show any end user that understands imaginary numbers, and 2j is acceptable input to the language. To be consistent with that, it seems like 2G or 2GHz should be preferred over 2_G or 2_GHz.

Of course, this brings up another issue: how do we handle imaginary numbers with scale factors? The possibilities include:

1. you don't get them, you can either specify j or a scale factor, but not both
2. you do get them, but if we allow units, then j should be first
3. you do get them, but we don't allow units and j could be first or second

I like choice 2 myself.

Also, to be consistent with j, and because I think it is simpler overall, I think 2G should be a real number, not an integer. Similarly, I think 2Gi, if we accept it, should also be a real number, simply for consistency.

One last thing, we can accept 273K as input for 273,000, but when we output it we must use k to avoid confusion with Kelvin (and because that is the standard). Also, we can use μ for inputting or outputting 1e-6, but we must always accept u as valid input.

-Ken

On Fri, Aug 26, 2016 at 07:42:00AM -0700, Guido van Rossum wrote:
On 2016-08-26 22:25, Ken Kundert wrote:
"2.4" would not look natural to a lot of people; they would expect "2,4". Either you force the end user to accept what Python uses for a decimal point and digit grouping, or you convert on input and output. And if you're doing that for the decimal point and digit grouping, you might as well also do that for scale factors and units.
Interestingly, it's not possible to use J as a unit (Joule):
>>> 1J
1j
Ken Kundert writes:
OK.
input to the language. To be consistent with that, it seems like 2G or 2GHz should be preferred over 2_G or 2_GHz.
Sure, *in the user interface*. The Python interpreter REPL is a user interface, but it's a "bare minimum" intended to expose the language and no more. Interfaces like Jupyter can provide the heuristic that an identifier following a numeric literal is a "unit" (whatever we decide that should be, a type as I suggest, semantically null as you seem to prefer, or something else) without changing the language. This separation of concerns also allows Python (the language) to experiment with different implementations of "unit" while Jupyter maintains its user interface without change.

If you want to change the *language* you need to provide answers to the following. I have no answers to them that I like, but maybe you can do better.

How about 2.4Gaunitwithaveryveryveryveryveryverylongname?

Consider the chemical unit "mol". How do you distinguish "1 mol" from "1/1000 ol"?

Similarly, how do you distinguish "1 joule" from "1 imaginary oule"?

If you allow both naked prefixes and prefixed units, how do you distinguish "1/10 a" from "10" when both are represented "1da"?
I think that's unacceptable. If "273K" has the valid interpretation "0 degrees Celsius" and we're going to accept units at all, we must not ask users to type 273000mK or even 2730dK. So if we ever accept "1K" to mean 1000, we're kinda hosed for accepting units. I think units syntax is broken anyway per the examples above, and Guido already pronounced on "naked scale prefixes":
On Fri, Aug 26, 2016 at 07:42:00AM -0700, Guido van Rossum wrote:
Please curb your enthusiasm. This is not going to happen.
+1 Guido may have retracted this pronouncement in private mail, but by that same token, he can reinstate it. I've learned to trust his first reactions; backing off this way is a symptom of openness to new ideas, not inaccuracy of the first reaction. Steve
On Sat, Aug 27, 2016 at 03:24:49PM +0900, Stephen J. Turnbull wrote:
Why would we care if the user wants to use a long name for their units? We don't care if they use a long name for their variables.
Consider the chemical unit "mol". How do you distinguish "1 mol" from "1/1000 ol"?
The rule is you cannot give a unit without a scale factor, and the unity scale factor is _, so if you wanted to say 1 mol you would use 1_mol. 1mol means one milli ol.
Similarly, how do you distinguish "1 joule" from "1 imaginary oule"?
Again, you cannot give units without a scale factor. So 1 joule is 1_J. For one imaginary joule, it would be 1j_J. These look a little strange, but that is because they use the unity scale factor, which is the one that is currently not in heavy use. Other scale factors look much more natural. For example, 1 milli mol is 1mmol. 1 kilo joule is 1kJ.
If you allow both naked prefixes and prefixed units, how do you distinguish "1/10 a" from "10" when both are represented "1da"?
I suggest that we do not support the h (=100), da (=10), d (=0.1), or c (=0.01) scale factors. The primary supported scale factors should be TGMk_munpfa. The extended set would include YZEP and zy.
The valid interpretation of 273K is 273,000. If you want 273 Kelvin, you would use 273_K.
Steve
On 8/26/2016 8:47 AM, Steven D'Aprano wrote:
-1 for the syntax, +1 for keeping it an error
0s8M # unambiguously an int with a scale factor of M = 10**6
-.5, better. 0a for a in b,o,x is used for numeral base, which is related to scaling of each numeral individually. 0s would be good for ancient sexagesimal (base 60) notation, which we still use for time and circle degrees.
-.1, better. I do not remember seeing this use of SI scale factors divorced from units. I can see how it works well for the relatively small community of EEs, but I expect it would only make Python more confusing for many others, especially school kids.
+1 for PyPI, currently +-0 for stdlib. '*' is easier to type than '_'.
A main use for me would be large ints: us_debt = 19*G. But I would also want to be able to write 19.3*G and get an int, and that would not work. The new _ syntax will alleviate the problem in a different way. 19_300_000_000 will work. Rounded 0s for counts do not always come in groups of 3.
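One way to get an exact int out of 19.3*G with the existing numeric tower is to go through Decimal or Fraction and convert explicitly. This is only a sketch: G is assumed to be a plain int constant, as in the module proposal, and the explicit int() call is exactly the extra step the wish above is trying to avoid.

```python
from decimal import Decimal
from fractions import Fraction

G = 10**9  # assumed plain-int scale factor

us_debt_float = 19.3 * G                      # binary float, not an int
us_debt_decimal = int(Decimal("19.3") * G)    # exact, then converted
us_debt_fraction = int(Fraction("19.3") * G)  # exact, then converted
```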
and for PyPI, it does not need pydev + Guido approval.
Disadvantages: none I can think of.
Someone mentioned cluttering namespace with 20-30 single char names. For readers, remembering what each letter means
(Some cleverness may be needed to have fractional scale values work with both floats and Decimals, but that shouldn't be hard.)
Make M, G, etc, instances of a class with def __mul__(self, other) that conditions on type of other. -- Terry Jan Reedy
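A minimal sketch of that idea follows. The class name ScaleFactor and the choice to return a Fraction for an int times a fractional factor are assumptions made for illustration, not something settled in this thread. The work is done in __rmul__ because int, float, Fraction and Decimal all return NotImplemented for an unknown right operand, so Python falls back to the reflected method.

```python
from decimal import Decimal
from fractions import Fraction


class ScaleFactor:
    """A power-of-ten multiplier that adapts to the other operand's type."""

    def __init__(self, exponent):
        self.exponent = exponent

    def __rmul__(self, other):
        if isinstance(other, Decimal):
            return other * Decimal(10) ** self.exponent
        if isinstance(other, float):
            return other * 10.0 ** self.exponent
        if isinstance(other, int):
            if self.exponent >= 0:
                return other * 10 ** self.exponent      # stays an int
            return Fraction(other, 10 ** -self.exponent)  # exact rational
        if isinstance(other, Fraction):
            return other * Fraction(10) ** self.exponent
        return NotImplemented

    # Allow M * 8 as well as 8 * M.
    __mul__ = __rmul__


m = ScaleFactor(-3)
k = ScaleFactor(3)
M = ScaleFactor(6)
```

With this sketch, 8*M is an int, Decimal("7981")*m is a Decimal, and 7981*m*M comes back as a Fraction equal to 7981000. Multiplying two factors together (M*k) is deliberately left unsupported here.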
Steven D'Aprano wrote:
Obviously if I write 1.1K then I'm expecting a float.
Why is it obvious that you're expecting a float and not a decimal in that case?
Or perhaps allow the multiplier to be followed by 'b' or 'B' (bits/bytes/binary) to signal a binary scale factor.
and then there's 1E1E which might be read as 1E1*10**18 or as just an error.
I don't think it's necessary or desirable to support having both a scale factor *and* an exponent, so I'd go for making it an error. You can always write 1E * 1E18 etc. if you need to.
8M # remains a syntax error 0s8M # unambiguously an int with a scale factor of M = 10**6
That looks ugly and hard to read to me. If we're to have that, I'm not sure 's' is the best character, since it suggests something to do with strings.
I like this!
You can even scale by multiple factors:
x = 8*M*K
Which also offers a neat solution to the "floppy megabytes" problem:

    k = 1000
    kB = 1024
    floppy_size = 1.44*k*kB

-- Greg
On Sat, Aug 27, 2016 at 01:16:42PM +1200, Greg Ewing wrote:
Because if you search the list archives you'll see that, in the short term at least, I'm not in favour of changing the default floating point type from binary floats to decimal floats *wink* Also, I didn't say "binary float", I might have meant "decimal float" :-) But all joking aside, you are making a good point. Since the SI units are all powers of ten, maybe this should be linked to decimal and integer numbers rather than binary floats. Let's look ahead to the day (Python 4? Python 5?) where there is a built-in decimal floating point type with fixed precision. The existing Decimal type will remain variable precision. Whatever suffixes we allow now will limit our choices for this hypothetical "decimal128" (say) type. So if we have: 123d to mean 123 deci-units, or 123*10**-1, then we've just eliminated the ability to make 123d a decimal. Now that's not necessarily a reason to reject this proposal, but it does add a complication. (And frankly, I'd rather get built-in fixed precision decimals than syntax for scale factors. But that's another story.)
Why sure! Let's ignore the perfectly good, well-known existing official standard to invent our own standard that clashes with the de facto standard use of "b" for bits and "B" for bytes! *wink*
To be honest, I agree. Even though I suggested 0s as a prefix, I don't actually like it. I much prefer the rule "any scale factor must follow the last underscore of the number".
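For anyone who wants to play with that rule before any syntax exists, here is a sketch of a string parser that applies it. The function name parse_scaled and the suffix table are assumptions; only the positive factors are included.

```python
# Assumed table of suffixes; only the positive powers are shown here.
SCALE_FACTORS = {"k": 10**3, "M": 10**6, "G": 10**9,
                 "T": 10**12, "P": 10**15, "E": 10**18}


def parse_scaled(text):
    """Parse '<number>_<suffix>', where the suffix follows the final underscore."""
    number, sep, suffix = text.rpartition("_")
    if not sep or suffix not in SCALE_FACTORS:
        raise ValueError(f"no scale factor after final underscore: {text!r}")
    try:
        value = int(number)    # int() accepts digit-group underscores (3.6+)
    except ValueError:
        value = float(number)  # e.g. "8." or "1E1"
    return value * SCALE_FACTORS[suffix]
```

Under this sketch "8_M" and "123_456_789_M" give ints, "8._M" and "1E1_E" give floats, and both "8M" and "123_M_456" raise ValueError, which matches the examples given for the final-underscore proposal.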
I've started experimenting with this, and when I get time, I'll put it on PyPI so that people can experiment with it too.
Indeed, although I would write that as k*Ki. -- Steve
This is the only variant I've seen that I would consider "not awful." Of course, this involves no change in the language, but just a module on PyPI. Of the awful options, a suffix underscore and multiplier (1.1_G) is the least awful. It's a little bit reminiscent of the optional internal underscores being added to literals.
On 2016-08-26 13:47, Steven D'Aprano wrote:
Just for the record, this is what you can now do in C++: User-Defined Literals http://arne-mertz.de/2016/10/modern-c-features-user-defined-literals/
From that page:
Obviously the arbitrary-function-part of that will never happen in Python (yes?) Also, for discussion, remember to make the distinction between 'units' (amps, meters, seconds) and 'prefixes' (micro, milli, kilo, mega). Right away from comments, it seems 1_m could look like 1 meter to some, or 0.001 to others. Typically when I need to enter very small/large literals, I'll use "engineering" SI notation (powers divisible by 3 that correspond to the prefixes): 0.1e-9 = 0.1 micro____. On Sat, Oct 29, 2016 at 12:20 AM, Ryan Birmingham <rainventions@gmail.com> wrote:
On Sat, Oct 29, 2016 at 12:43 PM, Nick Timkovich <prometheus235@gmail.com> wrote:
Why not? It seems like that would solve a lot of use-cases. People keep bringing up various new uses for prefix or suffix syntax that they want built directly into the language. Providing a generic way to implement third-party prefixes or suffixes would save having to put all of these directly into the language. And it opens up a lot of other potential use-cases as well.
Ah, always mess up micro = 6/9 until I think about it for half a second. Maybe a "n" suffix could have saved me there ;)

For "long" numbers there's the new _ so you can say 0.000_000_1 if you so preferred for 0.1 micro (I generally see _ as more useful for high-precision numbers with more non-zero digits, e.g. 1_234_456_789). Would that be 0.1µ, 0.1u in a new system?

Veering a bit away from the 'suffixing SI prefixes for literals': Literal unary suffix operators might be slightly nicer than multiplication if they were #1 in operator precedence, then you could omit some parentheses. Right now if I want to use a unit:

    $ pip install quantities

    import quantities as pq
    F = 1 * pq.N
    d = 1 * pq.m
    F * d  # => array(1.0) * m*N

but with literal operators & functions could be something like

    F = 1 pq.N
    d = 1 pq.m

On Sat, Oct 29, 2016 at 1:18 PM, Todd <toddrjen@gmail.com> wrote:
data:image/s3,"s3://crabby-images/0f8ec/0f8eca326d99e0699073a022a66a77b162e23683" alt=""
On Fri, Aug 26, 2016 at 10:47 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Introduce "d" as a prefix meaning 1, and this could be the way of creating something that people have periodically asked for: Decimal literals. (Though IIRC there were some complexities involving Decimal literals and decimal.getcontext(), which would have to be resolved before 1m could represent a Decimal.)
Easy. Make them Fraction literals instead. You'll end up with 7981000/1 as a rational, rather than a pure int, but if you want accurate handling of SI prefixes, rationals will serve you fairly well.
Obviously if I write 1.1K then I'm expecting a float. So I'm not *entirely* unreasonable :-)
Obviously :)
from __future__ import binary_scale_factors as scale_factors from __future__ import decimal_scale_factors as scale_factors # tongue only partly in cheek
I would strongly support the use of µ and weakly u. With k vs K, no opinion. If both can be supported without being confusing, grab 'em both. With output formats, it's less clear, but I would still be inclined toward µ for output.
It's worse than that. Currently, 1E+2 is a perfectly legal 100.0 (float), but under this proposal, it would be a constant expression yielding 1_000_000_000_000_000_002, so it wouldn't just be giving meaning to things that are currently errors.
Anything's risky. Probably the least risky option is to simply stop before Exa and implement the feature without.
Agreed. And I'd have them simply pick the one most obvious - if you want a specific factor, you can simply invert and display.
Hmm, interesting. Feels clunky but could work.
This sounds better IMO. It's not legal syntax in any version of Python older than 3.6, so there's minimal backward compatibility trouble. ChrisA
data:image/s3,"s3://crabby-images/2eb67/2eb67cbdf286f4b7cb5a376d9175b1c368b87f28" alt=""
On 2016-08-26 14:34, Chris Angelico wrote:
According to Wikipedia, it's recommended that there be a space between the number and the units, thus not "1kg" but "1 kg". As we don't put spaces inside numbers in Python, insisting on an underscore instead would seem to be a reasonable compromise.
data:image/s3,"s3://crabby-images/6f9e2/6f9e2a838d337853327a41861359cf06e67039fb" alt=""
Okay, so I talked to Guido about this, and all he was trying to convey is that there is an extremely high bar that must be reached before he will consider changing the base language, which of course is both prudent and expected. I'd like to continue the discussion because I believe there is some chance that we could reach that bar even though Guido is clearly skeptical. At this point I'd like to suggest that there be two more constraints we consider: 1. whatever form we choose to be used when outputting numbers should the same form we use when inputting numbers (so %r should produce valid python input, as does all the other format codes), 2. and whatever form we choose to be used when outputting numbers should look natural to end users. So ideas like using 2.4*G, 2.4*GHz, or 0s2.4G wouldn't really work because we would not want to output numbers to end users in this form. Even 2.4_GHz, while better, would still look somewhat unnatural to end users. One more thing to consider is that we already have a precedent here. Python already accepts a suffix on real numbers: j signifies an imaginary number. In this case the above constraints are satisfied. For example, 2j is a natural form to show any end user that understands imaginary numbers, and 2j is acceptable input to the language. To be consistent with that, it seems like 2G or 2GHz should be preferred over 2_G or 2_GHz. Of course, this brings up another issue, how to we handle imaginary numbers with scale factors. The possibilities include: 1. you don't get them, you can either specify j or a scale factor, but not both 2. you do get them, but if we allow units, then j should be first 3. you do get them, but we don't allow units and j could be first or second I like choice 2 myself. Also, to be consistent with j, and because I think it is simpler overall, I think 2G should be a real number, not an integer. Similarly, I think 2Gi, if we accept it, should also be a real number, simply for consistency. 
One last thing, we can accept 273K as input for 273,000, but when we output it we must use k to avoid confusion with Kelvin (and because that is the standard). Also, we can use μ for for inputting or outputting 1e-6, but we must always accept u as valid input. -Ken On Fri, Aug 26, 2016 at 07:42:00AM -0700, Guido van Rossum wrote:
data:image/s3,"s3://crabby-images/2eb67/2eb67cbdf286f4b7cb5a376d9175b1c368b87f28" alt=""
On 2016-08-26 22:25, Ken Kundert wrote:
"2.4" would not look natural to a lot of people; they would expect "2,4". Either you force the end user to accept what Python uses for a decimal point and digit grouping, or you convert on input and output. And if you're doing that for the decimal point and digit grouping, you might as well also do that for scale factors and units.
Interestingly, it's not possible to use J as a unit (Joule):
1J 1j
data:image/s3,"s3://crabby-images/d1d84/d1d8423b45941c63ba15e105c19af0a5e4c41fda" alt=""
Ken Kundert writes:
OK.
input to the language. To be consistent with that, it seems like 2G or 2GHz should be preferred over 2_G or 2_GHz.
Sure, *in the user interface*. The Python interpreter REPL is a user interface, but it's a "bare minimum" intended to expose the language and no more. Interfaces like Jupyter can provide the heuristic that an identifier following a numeric literal is a "unit" (whatever we decide that should be, a type as I suggest, semantically null as you seem to prefer, or something else) without changing the language. This separation of concerns also allows Python (the language) to experiment with different implementations of "unit" while Jupyter maintains its user interface without change. If you want to change the *language* you need to provide answers to the following. I have no answers to them that I like, but maybe you can do better. How about 2.4Gaunitwithaveryveryveryveryveryverylongname? Consider the chemical unit "mol". How do you distinguish "1 mol" from "1/1000 ol"? Similarly, how do you distinguish "1 joule" from "1 imaginary oule"? If you allow both naked prefixes and prefixed units, how do you distinguish "1/10 a" from "10" when both are represented "1da"?
I think that's unacceptable. If "273K" has the valid interpretation "0 degrees Celcius" and we're going to accept units at all, we must not ask users to type 273000mK or even 2730dK. So if we ever accept "1K" to mean 1000, we're kinda hosed for accepting units. I think units syntax is broken anyway per the examples above, and Guido already pronounced on "naked scale prefixes":
On Fri, Aug 26, 2016 at 07:42:00AM -0700, Guido van Rossum wrote:
Please curb your enthusiasm. This is not going to happen.
+1 Guido may have retracted this pronouncement in private mail, but by that same token, he can reinstate it. I've learned to trust his first reactions; backing off this way is a symptom of openness to new ideas, not inaccuracy of the first reaction. Steve
data:image/s3,"s3://crabby-images/6f9e2/6f9e2a838d337853327a41861359cf06e67039fb" alt=""
On Sat, Aug 27, 2016 at 03:24:49PM +0900, Stephen J. Turnbull wrote:
Why would we care if the user wants to use a long name for their units? We don't care if they use a long name for their variables.
Consider the chemical unit "mol". How do you distinguish "1 mol" from "1/1000 ol"?
The rule is you cannot give unit without a scale factor, and the unity scale factor is _, so if you wanted to say 1 mol you would use 1_mol. 1mol means one milli ol.
Similarly, how do you distinguish "1 joule" from "1 imaginary oule"?
Again, you cannot give units without a scale factor. so 1 joule is 1_J. For one imaginary joule, it would be 1j_J. These look a little strange, but that is because the use they unit scale factor, which is the one that is currently not in heavy use. Other scale factors look much more natural. For example, 1 milli mol is 1mmol. 1 kilo joule is 1kJ.
If you allow both naked prefixes and prefixed units, how do you distinguish "1/10 a" from "10" when both are represented "1da"?
I suggest that we do not support the h (=100), da (=10), d (=0.1), or c (=0.01) scale factors. The primary supported scale factors should be TGMk_munpfa. The extended set would include YZEP and zy.
The valid interpretation of 273K is 273,000. If you want 273 Kelvin, you would use 273_K.
Steve
data:image/s3,"s3://crabby-images/e2594/e259423d3f20857071589262f2cb6e7688fbc5bf" alt=""
On 8/26/2016 8:47 AM, Steven D'Aprano wrote:
-1 for the syntax, +1 for keeping it an error
0s8M # unambiguously an int with a scale factor of M = 10**6
-.5, better 0a for a in b,o,x is used for numeral base, which is related to scaling of each numeral individually. 0s would be good for ancient sexigesimal (base 60) notation, which we still use for time and circle degrees.
-.1, better I do not remember seeing this use of SI scale factors divorced from units. I can see how it works well for the relatively small community of EEs, but I expect it would only make Python more confusing for many others, especially school kids.
+1 for PyPI, currently +-0 for stdlib '*' is easier to type than '_'.
A main use for me would be large ints: us_debt = 19*G. But I would also want to be able to write 19.3*G and get an int, and that would not work. The new _ syntax will alleviate the problem in a different way. 19_300_000_000 will work. Rounded 0s for counts do not always come in groups of 3.
and for PyPI, it does not need pydev + Guido approval.
Disadvantages: none I can think of.
Someone mentioned cluttering namespace with 20-30 single char names. For readers, remembering what each letter means
(Some cleverness may be needed to have fractional scale values work with both floats and Decimals, but that shouldn't be hard.)
Make M, G, etc, instances of a class with def __mul__(self, other) that conditions on type of other. -- Terry Jan Reedy
data:image/s3,"s3://crabby-images/2658f/2658f17e607cac9bc627d74487bef4b14b9bfee8" alt=""
Steven D'Aprano wrote:
Obviously if I write 1.1K then I'm expecting a float.
Why is it obvious that you're expecting a float and not a decimal in that case?
Or perhaps allow the multiplier to be followed by 'b' or 'B' (bits/bytes/binary) to signal a binary scale factor.
and then there's 1E1E which might be read as 1E1*10**18 or as just an error.
I don't think it's necessary or desirable to support having both a scale factor *and* an exponent, so I'd go for making it an error. You can always write 1E * 1E18 etc. if you need to.
8M # remains a syntax error 0s8M # unambiguously an int with a scale factor of M = 10**6
That looks ugly and hard to read to me. If we're to have that, I'm not sure 's' is the best character, since it suggest something to do with strings.
I like this!
You can even scale by multiple factors:
x = 8*M*K
Which also offers a neat solution to the "floppy megabytes" problem: k = 1000 kB = 1024 floppy_size = 1.44*k*kB -- Greg
On Sat, Aug 27, 2016 at 01:16:42PM +1200, Greg Ewing wrote:
Because if you search the list archives you'll see that, in the short term at least, I'm not in favour of changing the default floating point type from binary floats to decimal floats *wink* Also, I didn't say "binary float", I might have meant "decimal float" :-)

But all joking aside, you are making a good point. Since the SI units are all powers of ten, maybe this should be linked to decimal and integer numbers rather than binary floats.

Let's look ahead to the day (Python 4? Python 5?) where there is a built-in decimal floating point type with fixed precision. The existing Decimal type will remain variable precision. Whatever suffixes we allow now will limit our choices for this hypothetical "decimal128" (say) type. So if we have: 123d to mean 123 deci-units, or 123*10**-1, then we've just eliminated the ability to make 123d a decimal.

Now that's not necessarily a reason to reject this proposal, but it does add a complication. (And frankly, I'd rather get built-in fixed precision decimals than syntax for scale factors. But that's another story.)
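To illustrate the point about powers of ten, here is a small sketch using the existing `decimal` module. The variable names are made up for the example. A milli scale factor applied to a Decimal is exact, whereas the binary float result carries whatever rounding the hardware format imposes.

```python
from decimal import Decimal

milli = Decimal("0.001")

scaled = Decimal(7981) * milli   # exact decimal arithmetic
as_float = 7981 * 0.001          # binary float arithmetic

print(scaled)                    # 7.981
print(Decimal(as_float))         # the exact value actually stored in the float
```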
Why sure! Let's ignore the perfectly good, well-known existing official standard to invent our own standard that clashes with the de facto standard use of "b" for bits and "B" for bytes! *wink*
To be honest, I agree. Even though I suggested 0s as a prefix, I don't actually like it. I much prefer the rule "any scale factor must follow the last underscore of the number".
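A rough sketch of how that rule could be prototyped as a string parser, without touching the language. The function name `parse_scaled`, the prefix table, and the decision to return a Decimal whenever the result cannot be an exact int are all assumptions for illustration.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical prefix table; only a handful of prefixes are listed.
SI_EXPONENTS = {"k": 3, "K": 3, "M": 6, "G": 9, "T": 12,
                "m": -3, "u": -6, "µ": -6, "n": -9}


def parse_scaled(text):
    """Parse strings such as '8_M' or '1.1_G': the scale factor follows the last underscore."""
    number, sep, suffix = text.rpartition("_")
    if not sep or suffix not in SI_EXPONENTS:
        raise ValueError(f"no scale factor after the last underscore: {text!r}")
    exponent = SI_EXPONENTS[suffix]
    try:
        value = int(number)          # int() accepts digit-group underscores
    except ValueError:
        try:
            value = Decimal(number)
        except InvalidOperation:
            raise ValueError(f"not a number: {number!r}") from None
    if isinstance(value, int) and exponent >= 0:
        return value * 10**exponent  # stays an exact int
    return Decimal(value).scaleb(exponent)


print(parse_scaled("8_M"))       # 8000000
print(parse_scaled("7981_m"))    # 7.981
```

Under this rule a string like "8M" is still rejected, which keeps the existing behaviour of int('8M') as an error.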
I've started experimenting with this, and when I get time, I'll put it on PyPI so that people can experiment with it too.
Indeed, although I would write that as k*Ki. -- Steve
This is the only variant I've seen that I would consider "not awful." Of course, this involves no change in the language, but just a module on PyPI. Of the awful options, a suffix underscore and multiplier (1.1_G) is the least awful. It's a little bit reminiscent of the optional internal underscores being added to literals.
On 2016-08-26 13:47, Steven D'Aprano wrote:
Just for the record, this is what you can now do in C++: User-Defined Literals http://arne-mertz.de/2016/10/modern-c-features-user-defined-literals/
From that page:
Obviously the arbitrary-function-part of that will never happen in Python (yes?)

Also, for discussion, remember to make the distinction between 'units' (amps, meters, seconds) and 'prefixes' (micro, milli, kilo, mega). Right away from comments, it seems 1_m could look like 1 meter to some, or 0.001 to others. Typically when I need to enter very small/large literals, I'll use "engineering" SI notation (powers divisible by 3 that correspond to the prefixes): 0.1e-9 = 0.1 micro____.

On Sat, Oct 29, 2016 at 12:20 AM, Ryan Birmingham <rainventions@gmail.com> wrote:
On Sat, Oct 29, 2016 at 12:43 PM, Nick Timkovich <prometheus235@gmail.com> wrote:
Why not? It seems like that would solve a lot of use-cases. People keep bringing up various new uses for prefix or suffix syntax that they want built directly into the language. Providing a generic way to implement third-party prefixes or suffixes would save having to put all of these directly into the language. And it opens up a lot of other potential use-cases as well.
Ah, always mess up micro = 6/9 until I think about it for half a second. Maybe a "n" suffix could have saved me there ;)

For "long" numbers there's the new _ so you can say 0.000_000_1 if you so preferred for 0.1 micro (I generally see _ as more useful for high-precision numbers with more non-zero digits, e.g. 1_234_456_789). Would that be 0.1µ, 0.1u in a new system?

Veering a bit away from the 'suffixing SI prefixes for literals': literal unary suffix operators might be slightly nicer than multiplication if they were #1 in operator precedence, since you could then omit some parentheses. Right now if I want to use a unit:

    $ pip install quantities

    import quantities as pq
    F = 1 * pq.N
    d = 1 * pq.m
    F * d  # => array(1.0) * m*N

but with literal operators & functions it could be something like

    F = 1 pq.N
    d = 1 pq.m

On Sat, Oct 29, 2016 at 1:18 PM, Todd <toddrjen@gmail.com> wrote:
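For the engineering-notation point (exponents divisible by 3 that map onto the prefixes), a small formatter makes the mapping mechanical. This is only a sketch: the name `si_format` and the short prefix table are assumptions, and values are passed as strings so that Decimal sees the digits as typed.

```python
from decimal import Decimal

# Hypothetical table: engineering exponent -> SI prefix.
SI_PREFIXES = {-12: "p", -9: "n", -6: "µ", -3: "m", 0: "",
               3: "k", 6: "M", 9: "G", 12: "T"}


def si_format(value):
    """Render a number with the SI prefix whose exponent is the nearest multiple of 3 below."""
    value = Decimal(value)
    if value == 0:
        return "0"
    # adjusted() is the exponent of the leading digit; round it down to a multiple of 3
    exponent = 3 * (value.adjusted() // 3)
    if exponent not in SI_PREFIXES:
        return format(value, "e")  # outside the table: plain scientific notation
    mantissa = value.scaleb(-exponent).normalize()
    return format(mantissa, "f") + SI_PREFIXES[exponent]


print(si_format("0.1e-9"))        # 100p
print(si_format("1e-7"))          # 100n
print(si_format(19_300_000_000))  # 19.3G
```

With this, 0.1e-9 comes out as 100p (0.1 nano), and 0.000_000_1 as 100n (0.1 micro).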
participants (14): Chris Angelico, David Mertz, Greg Ewing, Guido van Rossum, Ken Kundert, MRAB, Nick Timkovich, Pavol Lisy, Ryan Birmingham, Stephen J. Turnbull, Steven D'Aprano, Sven R. Kunze, Terry Reedy, Todd