[Python-ideas] SI scale factors in Python
Steven D'Aprano
steve at pearwood.info
Fri Aug 26 02:54:03 EDT 2016
On Thu, Aug 25, 2016 at 08:46:54PM -0700, Ken Kundert wrote:
> This idea is new to general purpose languages,
For the record, at least some HP calculators include a "units" data type
as part of the programming language "RPL", e.g. the HP-28 and HP-48
series. I've been using those for 20+ years so I'm quite familiar with
how useful this feature can be.
> but it has been used for over 40
> years in the circuit design community. Specifically, SPICE, an extremely heavily
> used circuit simulation package, introduced this concept in 1974. In SPICE the
> scale factor is honored but any thing after the scale factor is ignored. Being
> both a heavy user and developer of SPICE, I can tell you that in all that time
> this issue has never come up. In fact, the users never expected there to be any
> support for dimensional analysis, nor did they request it.
I can't comment about the circuit design community, but you're trying to
extrapolate from a single specialist application to a general purpose
programming language used by people of many, many varied levels of
expertise, of competence, with many different needs.
It makes a lot of sense for applications to allow SI prefixes as
suffixes within a restricted range. For example, the dd application
allows the user to specify the amount of data to copy using either bytes
or blocks, with optional suffixes:
BLOCKS and BYTES may be followed by the following multiplicative
suffixes: xM M, c 1, w 2, b 512, kB 1000, K 1024, MB 1000*1000,
M 1024*1024, GB 1000*1000*1000, G 1024*1024*1024, and so on for
T, P, E, Z, Y.
(Quoting from the man page.)
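As a rough illustration of what "suffixes within a restricted range" means for an application, here is a minimal sketch of a dd-style size parser. It is not how dd itself is implemented; the function name parse_size and the table name SUFFIXES are made up for this example, and only a subset of the suffixes from the man page excerpt is included.

```python
import re

# Multipliers copied from the man page excerpt quoted above.
# Only a subset: "xM" and the larger T, P, E, Z, Y suffixes are left out.
SUFFIXES = {
    "": 1, "c": 1, "w": 2, "b": 512,
    "kB": 1000, "K": 1024,
    "MB": 1000 * 1000, "M": 1024 * 1024,
    "GB": 1000 * 1000 * 1000, "G": 1024 * 1024 * 1024,
}


def parse_size(text):
    """Convert a string such as '4kB' or '2M' to a plain integer."""
    match = re.fullmatch(r"(\d+)([A-Za-z]*)", text)
    if match is None:
        raise ValueError(f"not a size: {text!r}")
    number, suffix = match.groups()
    if suffix not in SUFFIXES:
        # Unknown suffixes are rejected rather than silently ignored.
        raise ValueError(f"unknown suffix: {suffix!r}")
    return int(number) * SUFFIXES[suffix]
```

Note that in this sketch an unrecognised suffix such as "mA" is an error, not something to be quietly dropped.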
That makes excellent sense for a specialist application where numeric
quantities always mean the same thing, or in this case, one of two
things. As purely multiplicative suffixes, that even makes sense for
Python: earlier I said that it was a good idea to add a simple module
defining SI and IEC multiplicative constants in the std lib so that we
could do x = 42*M or similar.
But that's a far cry from allowing and ignoring units.
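To make the contrast concrete, the module of multiplicative constants mentioned above could be as small as the following sketch. No such module exists in the std lib; the constant names here are assumptions chosen for illustration.

```python
# Hypothetical module of multiplicative constants; the names are illustrative.

# SI (decimal) prefixes
k = 10**3
M = 10**6
G = 10**9
T = 10**12

# IEC (binary) prefixes
Ki = 2**10
Mi = 2**20
Gi = 2**30
Ti = 2**40

x = 42 * M              # an ordinary int: 42000000
buffer_size = 4 * Ki    # an ordinary int: 4096
```

These are plain numbers being multiplied. Nothing is parsed, and nothing after the scale factor is accepted and then thrown away.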
> > Don't think of people writing code like this:
> >
> > result = 23mA + 75MHz
> >
> > which is obviously wrong. Think about them writing code like this:
> >
> > total = sum_resistors_in_parallel(input, extra)
>
> You say that '23mA + 75MHz' is obviously wrong, but it is only obviously wrong
> because the units are included, which is my point. If I had written '0.023
> + 76e6', it would not be obviously wrong.
I understand your point and the benefit of dimensional analysis. But the
problem is, as users of specialised applications we may be used to
doing direct arithmetic on numeric literal values, with or without
attached units:
23mA + 75MHz # error is visible
23 + 75 # error is hidden
but as *programmers* we rarely do that. Generally speaking, it is rare
to be doing arithmetic on literals where we might have the opportunity
to attach a unit. We are doing arithmetic on *variables* that have come from
elsewhere. Reading the source code doesn't show us something that might
be improved by adding a unit:
# we hardly ever see this
23 + 75
# we almost always see something like this
input + argument
At best, we can choose descriptive variable names that hint what the
correct dimensions should be:
weight_of_contents + weight_of_container
The argument that units would make it easier for the programmer to spot
errors is, I think, invalid, because the programmer will hardly ever get
to see the units.
[...]
> Indeed that is the point of dimensional analysis. However, despite the
> availability of all these packages, they are rarely if ever used because you
> have to invest a tremendous effort before they can be effective. For example,
> consider the simple case of Ohms Law:
>
> V = R*I
>
> To perform dimensional analysis we need to know the units of V, R, and I. These
> are variables not literals, so some mechanism needs to be provided for
> specifying the units of variables, even those that don't exist yet, like V.
This is not difficult, and you exaggerate the amount of effort required.
To my sorrow, I'm not actually familiar with any of the Python libraries
for this, so I'll give an example using the HP-48GX RPL language.
Suppose I have a value which I am expecting to be current in amperes.
(On the HP calculator, it will be passed to me on the stack, but the
equivalent in Python will be an argument passed to a function.) For
simplicity, let's assume that if it is a raw number, I will trust that
the user knows what they are doing and just convert it to a unit object
with dimension "ampere", otherwise I expect some sort of unit object
which is dimensionally compatible:
1_A CONV
is the RPL program to perform this conversion on the top of the stack,
and raise an error if the dimensions are incompatible. Converting to a
more familiar Python-like API, I would expect something like:
current = Unit("A").convert(current)
or possibly:
current = Unit("A", current)
take your pick. That's not a "tremendous" amount of effort, it is
comparable to ensuring that I'm using (say) floats in the first place:
if not isinstance(current, float):
    raise TypeError
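For what it's worth, here is a minimal sketch of how the first spelling, Unit("A").convert(current), might behave. The Unit class below is hypothetical and is not the API of any existing units library; its table of units is a tiny hard-coded assumption, just enough to show the dimension check.

```python
class Unit:
    """A value tagged with a unit. Hypothetical; not any real library's API."""

    # unit name -> (dimension, scale factor to the base unit); assumed table
    UNITS = {
        "A": ("current", 1.0),
        "mA": ("current", 1e-3),
        "V": ("voltage", 1.0),
    }

    def __init__(self, name, value=1.0):
        if name not in self.UNITS:
            raise ValueError(f"unknown unit: {name!r}")
        self.name = name
        self.value = float(value)

    def convert(self, other):
        """Return `other` expressed in this unit, or raise if incompatible."""
        if isinstance(other, (int, float)):
            # Raw number: trust the caller and just attach this unit.
            return Unit(self.name, other)
        if not isinstance(other, Unit):
            raise TypeError(f"expected a number or Unit, got {type(other).__name__}")
        my_dim, my_scale = self.UNITS[self.name]
        other_dim, other_scale = self.UNITS[other.name]
        if my_dim != other_dim:
            raise TypeError(f"cannot convert {other.name} to {self.name}")
        return Unit(self.name, other.value * other_scale / my_scale)

    def __repr__(self):
        return f"Unit({self.name!r}, {self.value})"


current = Unit("A").convert(0.023)            # raw number, taken as amperes
current = Unit("A").convert(Unit("mA", 23))   # compatible, rescaled to amperes
# Unit("A").convert(Unit("V", 5)) would raise TypeError
```

The caller's side of this is still a single line per argument, which is the point being made about effort.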
> And what if the following is encountered:
>
> V = I
>
> Dimensional analysis says this is wrong,
That's because it is wrong.
> but the it may be that the resistance
> is simply being suppressed because it is unity.
Your specialist experience in the area of circuit design is
misleading you. There's effectively only one unit of resistance, the
ohm, although my "units" program also lists:
R_K          25812.807 ohm
abohm        abvolt / abamp
intohm       1.000495 ohm
kilohm       kiloohm
megohm       megaohm
microhm      microohm
ohm          V/A
siemensunit  0.9534 ohm
statohm      statvolt / statamp
So even here with resistance, "unity" is ambiguous. Do you mean one
intohm, one microhm, one statohm or something else? I'll grant you that
in the world of circuit design perhaps it could only mean the SI ohm.
But you're not in the world of circuit design any more, you are dealing
with a programming language that will be used by people for many,
many different purposes, for whom "unity" might mean (for example):
1 foot per second
1 foot per minute
1 metre per second
1 kilometre per hour
1 mile per hour
1 lightspeed
1 knot
1 mach
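To put numbers on how far apart those meanings of "unity" are, here is a small sketch that expresses one of each unit in metres per second. The dictionary name is made up for this example. The factors follow from the exact definitions of the foot (0.3048 m), the mile (1609.344 m), the nautical mile (1852 m) and the speed of light; "mach" is left out because it depends on the medium and temperature.

```python
# One of each speed unit, expressed in metres per second.
# "mach" is omitted: it is not a fixed quantity.
METRES_PER_SECOND = {
    "foot per second": 0.3048,
    "foot per minute": 0.3048 / 60,
    "metre per second": 1.0,
    "kilometre per hour": 1000 / 3600,
    "mile per hour": 1609.344 / 3600,
    "lightspeed": 299_792_458.0,
    "knot": 1852 / 3600,
}

for name, factor in METRES_PER_SECOND.items():
    print(f"1 {name} = {factor:g} m/s")
```

A bare 1 with the unit suppressed could be any of these values.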
Specialist applications might be able to take shortcuts in dimensional
analysis when "everybody knows" what the suppressed units must be.
General purpose programming languages *cannot*. It is better NOT to
offer the illusion of dimensional analysis than to mislead the user into
thinking they are covered when they are not.
Better to let them use a dedicated units package, not build a half-baked
bug magnet into the language syntax.
--
Steve