SI scale factors in Python

All,

I propose that support for SI scale factors be added to Python. This would be very helpful for any program that heavily uses real numbers, such as those involved with scientific and engineering computation. There would be two primary changes.

First, the lexer would be enhanced to take real literals with the following forms:

    c1 = 1nF            (same as: c1 = 1e-9        # F )
    c = 299.79M         (same as: c = 299.79e6         )
    f_hy = 1.4204GHz    (same as: f_hy = 1.4204e9  # Hz)

Basically, a scale factor and units may follow a number. Both are optional, but if the units are given the scale factor must also be given. Any units given could be kept with the number and would be accessible through an attribute or method call, or, if it is felt that the cost of storing the units is too high, they may simply be discarded, in which case they serve only as documentation.

The second change would be to the various real-to-string conversions available in Python. New formatting options would be provided to support SI scale factors. For example,

    print('Hydrogen line frequency: %q' % f_hy)
    Hydrogen line frequency: 1.4204GHz

If the units are retained with the numbers, then %q (quantity) could be used to print the number with SI scale factors and units, and %r (real) could be used to print the number with SI scale factors but without the units.

A small package that fleshes out these ideas is available from https://github.com/KenKundert/engfmt

It used to be that SI scale factors were only used by scientists and engineers, but over the last 20 years their popularity has increased and so now they are used everywhere. It is time for our programming languages to catch up. I find it a little shocking that no programming languages offer this feature yet, but engineering applications such as SPICE and Verilog have supported SI scale factors for a very long time.

-Ken
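The two changes Ken describes, reading "1.4204GHz" and printing it back, can be approximated today in library code. The following is only a rough sketch under stated assumptions: the names to_si and from_si are hypothetical (they are not the engfmt API), the prefix table covers only pico through tera, and the parser treats a lone "m" as milli, which is exactly the ambiguity raised later in the thread.

```python
import re

# Hypothetical helpers for illustration; only a subset of SI prefixes is listed.
_PREFIXES = [
    ("T", 1e12), ("G", 1e9), ("M", 1e6), ("k", 1e3), ("", 1e0),
    ("m", 1e-3), ("u", 1e-6), ("n", 1e-9), ("p", 1e-12),
]
_SCALE = dict(_PREFIXES)  # prefix symbol -> multiplier
_NUMBER = re.compile(r"([-+]?\d+(?:\.\d*)?)([TGMkmunp]?)([A-Za-z]*)")


def to_si(value, units="", digits=5):
    """Render a number with an SI scale factor, roughly what %q might do."""
    if value == 0:
        return "0" + units
    # Pick the largest prefix that does not exceed the magnitude; values
    # smaller than 1e-12 fall through to the last entry ("p").
    for symbol, factor in _PREFIXES:
        if abs(value) >= factor:
            break
    # Rounding to `digits` significant figures can occasionally give "1000k".
    return f"{value / factor:.{digits}g}{symbol}{units}"


def from_si(text):
    """Split text such as '1.4204GHz' into (float value, units string)."""
    match = _NUMBER.fullmatch(text)
    if match is None:
        raise ValueError(f"not a scaled number: {text!r}")
    number, prefix, units = match.groups()
    return float(number) * _SCALE[prefix], units


print(to_si(1.4204e9, "Hz"))   # 1.4204GHz
print(to_si(1e-9, "F"))        # 1nF
print(from_si("1nF"))          # (1e-09, 'F')
```

Because the units come back as a plain string, this sketch treats them purely as documentation, the cheaper of the two options in the proposal.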

Ian Kelly wrote:
Should 1m be interpreted as 1 meter or 0.001 (unitless)?
I've never seen anyone use a scale factor prefix on its own with a dimensionless number. Sometimes informally the unit is omitted when it can be inferred from context (e.g. "1k" written next to a resistor symbol obviously means "1 kilohm"). But without that context it's ambiguous, so I don't think it should be allowed in program code. -- Greg

On Thu, Aug 25, 2016 at 2:28 PM, Ken Kundert <python-ideas@shalmirane.com> wrote:
If units are retained, what you have is no longer a simple number, but a value with a unit, and is a quite different beast. (For instance, addition would have to cope with unit mismatches (probably by throwing an error), and multiplication would have to combine the units (length * length = area).) That would be a huge new feature.

I'd be inclined to require, for simplicity, that the scale factor and the unit be separated with a hash:

    c1 = 1n#F
    c = 299.79M
    f_hy = 1.4204G#Hz

It reads almost as well as "GHz" does, but is clearly non-semantic. The resulting values would simply be floats, and the actual tag would be discarded - there'd be no difference between 1.4204G and 1420.4M, and the %q formatting code would render them the same way.

Question, though: What happens with exa-? Currently, if the parser sees "1E", it'll expect to see another number, eg 1E+1 == 10.0. Will this double meaning cause confusion?

ChrisA
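The exa- question can be checked against current behaviour. The snippet below is a small illustration, with hypothetical helper names, of how Python treats "E" in a numeric literal today: it always begins an exponent, so a bare "1E" is rejected and "1E+1" is ten.

```python
def float_accepts(text):
    """Return True if float() can convert the given string."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def compiles(source):
    """Return True if the source is syntactically valid Python."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError:
        return False
    return True


print(1E+1 == 10.0)            # True: "E" introduces an exponent
print(float_accepts("1E+1"))   # True
print(float_accepts("1E"))     # False: exponent digits are required
print(compiles("x = 1E\n"))    # False: not a valid literal today
```

Under the proposed syntax, "1E+1" would have a second plausible reading, one exa plus one, which is the double meaning being asked about.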

On Thu, Aug 25, 2016 at 3:57 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Yeah. And a full-on unit-aware numeric system doesn't belong in the core language IMO. It belongs on PyPI, with an API like:

    length = N("100m")
    width = N("50m")
    area = length * width
    depth = N('2"') # inches
    volume = area * depth
    time = N("5 hours")
    flow_rate = volume/time
    print("Rain flowed through the pipe at", flow_rate)

No core language changes needed for that. And since, in most cases, the values will come from user input anyway, a literal syntax won't be as important.

ChrisA
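To give a feel for what such a package has to do, here is a minimal sketch of the unit bookkeeping alone. Everything in it is an assumption made for illustration: the Quantity class and _combine helper are hypothetical names, the string parsing done by N("100m") is left out, and no unit conversion is attempted. It only shows addition rejecting mismatched units and multiplication and division combining them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A value plus its units, stored as a sorted tuple of (unit, exponent)."""
    value: float
    dims: tuple = ()

    def __add__(self, other):
        # Adding metres to seconds must fail rather than silently succeed.
        if self.dims != other.dims:
            raise TypeError(f"cannot add {self.dims} to {other.dims}")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        return Quantity(self.value * other.value,
                        _combine(self.dims, other.dims, +1))

    def __truediv__(self, other):
        return Quantity(self.value / other.value,
                        _combine(self.dims, other.dims, -1))


def _combine(a, b, sign):
    """Add (sign=+1) or subtract (sign=-1) the exponents of b to those of a."""
    exponents = dict(a)
    for unit, exponent in b:
        exponents[unit] = exponents.get(unit, 0) + sign * exponent
    # Drop units that cancel out so that m/m compares equal to dimensionless.
    return tuple(sorted((u, e) for u, e in exponents.items() if e != 0))


length = Quantity(100.0, (("m", 1),))
width = Quantity(50.0, (("m", 1),))
time = Quantity(5 * 3600.0, (("s", 1),))

area = length * width
print(area)              # Quantity(value=5000.0, dims=(('m', 2),))
print((area / time).dims)  # (('m', 2), ('s', -1))
```

A real library would also need conversions between units of the same dimension, which is where most of the complexity lives.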

On Wed, Aug 24, 2016 at 11:57 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I'd say that it more accurately depends on whether the distance represents a displacement or a position of application. If one pushes a shopping cart off-center, that produces both work and torque, with different "distance" vectors for each. Analytically, one is a cross-product and the other is a dot-product. The unit matching engine would have to understand the difference and know which one is being applied in the calculation.
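A small numeric illustration of the point, with made-up figures for the shopping cart: work and torque both come out in newton-metres, yet one is a dot product and the other a cross product, so the units alone cannot tell them apart.

```python
def dot(a, b):
    """Scalar (dot) product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector (cross) product of two 3-vectors."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)


# Illustrative values only, not taken from the thread.
force = (10.0, 0.0, 0.0)         # newtons, pushing along x
displacement = (2.0, 0.0, 0.0)   # metres the cart travels
lever_arm = (0.0, 0.5, 0.0)      # metres from the pivot to where the push lands

work = dot(force, displacement)    # a scalar, in N*m (joules)
torque = cross(lever_arm, force)   # a vector, also in N*m

print(work)    # 20.0
print(torque)  # (0.0, 0.0, -5.0)
```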

On 25/08/2016 06:28, Ken Kundert wrote:
There is little difference (except that it asks for a syntax modification, which should be weighed heavily) between this proposal and

    c1 = 1*nF            (same as: c1 = 1e-9        # F )
    c = 299.79*M         (same as: c = 299.79e6         )
    f_hy = 1.4204*GHz    (same as: f_hy = 1.4204e9  # Hz)

with the constants correctly defined in a library. So a library would be welcome.

On Wed, Aug 24, 2016 at 09:28:03PM -0700, Ken Kundert wrote:
All, I propose that support for SI scale factors be added to Python.
I think there's something to be said for these ideas, but you are combining multiple ideas into one suggestion.

First off, units with quantities: I think that is an excellent idea, but one best supported by a proper unit library that supports more than just SI units. There are already a few of those. See for example this Stackoverflow question:

http://stackoverflow.com/questions/2125076/unit-conversion-in-python

Sympy also does dimensional analysis:

http://docs.sympy.org/latest/modules/physics/unitsystems/examples.html

Google for more.

If I try to add 30 feet to 6 metres and get either 36 feet or 36 metres, then your unit system is *worse* than useless, it is actively harmful. I don't mind if I get 15.144 metres or 49.685039 feet or even 5.0514947e-08 lightseconds, but I better not get 36 of anything. And likewise for adding 30 kilograms to 6 metres. That has to be an error, or this system will just be an attractive nuisance, luring people into a false sense of security while actually not protecting them from dimensional and unit conversion bugs at all.

So I am an extremely strong -1 to the suggestion that we allow unit suffixes on numeric quantities but treat them as a no-op.

Should Python support a unit conversion library in the standard library? I think perhaps not -- there's plenty of competition in the unit conversion ecosystem, both in Python and out of it, and I don't think that there's any one library that is both sufficiently "best of breed" and stable enough to put into the std lib. Remember that the std lib is where good libraries go to die: once they hit the std lib, stability becomes much, much more important than new features. But if you wish to argue differently, I'll be willing to hear your suggestions.

Now, on to the second part of the suggestion: support for SI prefixes. I think this is simple enough, and useful enough, that we could make it part of the std lib -- and possibly even in 3.6 (possibly on a provisional basis).
The library could be dead simple:

    # prefixes.py
    # Usage:
    #     from prefixes import *
    #     x = 123*M    # like 123000000
    #     y = 45*Ki    # like 45*1024

    # SI unit prefixes
    # http://physics.nist.gov/cuu/Units/prefixes.html
    Y = yotta = 10**24
    Z = zetta = 10**21
    [...]
    k = kilo = 10**3
    K = k  # not strictly an SI prefix, but common enough to allow it
    m = milli = 1e-3
    µ = micro = 1e-6  # A minor PEP-8 violation, but (I hope) forgivable.
    u = µ  # not really an SI prefix, but very common
    # etc

    # International Electrotechnical Commission (IEC) binary prefixes
    # http://physics.nist.gov/cuu/Units/binary.html
    Ki = kibi = 1024
    Mi = mebi = 1024**2
    Gi = 1024**3
    # etc

That's practically it. Of course, this simple implementation would allow usage that was a technical violation of the SI system:

    x = 45*M*µ*Ki

as well as usage that is semantically meaningless:

    x = 45 + M

but those abuses are best covered by "consenting adults". (In other words, if you don't like it, don't do it.)

And it wouldn't support the obsolete binary prefixes that use SI symbols with binary values (K=1024, M=1024**2, etc), but that's a good thing. They're an abomination that needs to die as soon as possible.

-- Steve
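The feet-and-metres figures quoted in this message can be reproduced with a few lines of arithmetic. This is just a check of the numbers, not a unit library; the constant names are chosen here for illustration, and the conversion factors are the standard definitions of the international foot and the speed of light.

```python
METRES_PER_FOOT = 0.3048                  # exact by definition
METRES_PER_LIGHT_SECOND = 299_792_458.0   # exact by definition

# 30 feet + 6 metres, converted to a common unit before adding.
total_m = 30 * METRES_PER_FOOT + 6
total_ft = total_m / METRES_PER_FOOT
total_ls = total_m / METRES_PER_LIGHT_SECOND

print(f"{total_m:.3f} metres")         # 15.144 metres
print(f"{total_ft:.6f} feet")          # 49.685039 feet
print(f"{total_ls:.7e} lightseconds")  # 5.0514947e-08 lightseconds
```

Adding the raw numbers without converting gives 36, the answer the message warns against.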

participants (7): Chris Angelico, Greg Ewing, Ian Kelly, Ken Kundert, Random832, Steven D'Aprano, Xavier Combelle