[Python-ideas] SI scale factors in Python

Steven D'Aprano steve at pearwood.info
Sat Aug 27 14:25:15 EDT 2016


On Fri, Aug 26, 2016 at 03:23:24PM -0700, Ken Kundert wrote:
> Steven,
>     This keeps coming up, so let me address it again.
> 
> First, I concede that you are correct that my proposal does not provide 
> dimensional analysis, so any dimensional errors that exist in this new code will 
> not be caught by Python itself, as is currently the case.
> 
> However, you should concede that by bringing the units front and center in the 
> language, they are more likely to be caught by the user themselves.

I do not concede any such thing at all.

At best this might apply under some circumstances with extremely simple 
formulae when working directly with literals, in other words when using 
Python like a souped-up calculator. (Which is a perfectly reasonable way 
to use Python -- I do that myself. But it's not the only way to use 
Python.) But counting against that is that there will be other cases 
where what should be an error will sneak past because it happens to look 
like a valid scale factor or unit:

    x += 1Y  # oops, I fat-fingered the Y when I meant 16
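
A minimal sketch of why that typo would go unnoticed. The helper `parse_si` and its table of scale factors are hypothetical, written only to mimic how such a literal might be read. They are not taken from the proposal or from any library.

```python
# Hypothetical helper, not part of any proposal or library: it mimics how
# a literal such as 1Y would be read if SI scale factors were accepted.
SI_SCALE = {
    "Y": 1e24, "Z": 1e21, "E": 1e18, "P": 1e15, "T": 1e12, "G": 1e9,
    "M": 1e6, "k": 1e3, "m": 1e-3, "u": 1e-6, "n": 1e-9, "p": 1e-12,
    "f": 1e-15, "a": 1e-18, "z": 1e-21, "y": 1e-24,
}

def parse_si(text):
    """Convert a string like '1Y' or '16' to a float."""
    if text and text[-1] in SI_SCALE:
        return float(text[:-1]) * SI_SCALE[text[-1]]
    return float(text)

intended = parse_si("16")   # what the programmer meant to type
typo = parse_si("1Y")       # accepted without complaint as 1 * 10**24
```

Both strings parse cleanly, so nothing distinguishes the slip from a deliberate use of the yotta prefix.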


> It is 
> my position that dimensional analysis is so difficult and burdensome that there 
> is no way it should be in the base Python language. If available, it should be 
> as an add on.

This is a strange and contradictory position to take. If dimensional 
analysis is so "difficult and burdensome", how do you expect the user to 
do it in their head by just looking at the source code?

It is your argument above that users will be able to catch dimensional 
errors just by looking at the units in the source code, but here, just 
one sentence later, you claim that dimensional analysis is so difficult 
and burdensome that users cannot deal with it even with the assistance 
of the interpreter. I cannot reconcile those two beliefs. If you think 
that dimensional analysis is both important and "difficult and 
burdensome", then surely we should want to automate as much of it as 
possible?

Of course the easy cases are easy:

    torque = 45_N * 18_m

is obviously correct, but the hard cases are not. As far as I can tell, 
your suggested syntax doesn't easily support compound units, let alone 
more realistic cases of formulae from sciences other than electrical 
engineering:

    # Van der Waals equation
    pressure = (5_mol * 6.022140857e23/1_mol * 1.38064852e-23_J/1_K
                * 340_K / (2.5_m**3 - 5_mol * 0.1281_m**3/1_mol)
                - (5_mol)**2*(19.7483_L*1_L*1_bar/(1_mol)**2)
                /(2.5_m**3)**2
                )


I'm not even sure if I've got that right after checking it three times. 
I believe it is completely unrealistic to expect the reader to spot 
dimensional errors by eye in anything but the most trivial cases.

Here is how I would do the same calculation in sympy. For starters, 
rather than using a bunch of "magic constants" directly in the formula, 
I would set them up as named variables. That's just good programming 
practice whether there are units involved or not.


# Grab the units we need.
from sympy.physics.units import mol, J, K, m, Pa, bar, liter as L
# And some constants.
from sympy.physics.units import avogadro_constant as N_a, boltzmann as k
R = N_a*k
# Define our variables.
n = 5*mol
T = 340*K
V = 2.5*m**3
# Van der Waals constants for carbon tetrachloride.
a = 19.7483*L**2*bar/mol**2
b = 0.1281*m**3/mol
# Apply Van der Waals equation to calculate the pressure
p = n*R*T/(V - n*b) - n**2*a/V**2
# Print the result in Pascal.
print(p/Pa)


Sympy (apparently) doesn't warn you if your units are incompatible; it 
just treats them as separate terms in an expression:

py> 2*m + 3*K
3*K + 2*m

which probably makes sense from the point of view of a computer algebra 
system (adding two metres and three kelvin is no weirder than 
adding x and y). But from a unit conversion point of view, I think 
sympy is the wrong solution. Nevertheless, it still manages to give the 
right result, and in a form that is easy to understand, easy to read, 
and easy to confirm is correct.

(If p/Pa is not a pure number, then I know the units are wrong. That's 
not ideal, but it's better than having to track the units myself. There 
are better solutions than sympy, I just picked this because I happened 
to have it already installed.)
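
For contrast, here is a rough sketch of the stricter behaviour one might want from a units library: adding incompatible quantities raises an error instead of producing a mixed expression. The `Quantity` class below is hypothetical and deliberately tiny. It is not sympy's API or that of any existing package, and it supports only addition, multiplication and division.

```python
# Hypothetical sketch, not any real library's API: a quantity carries a
# magnitude plus a dict mapping base units to their exponents.
class Quantity:
    def __init__(self, value, dims=None):
        self.value = value
        # Drop zero exponents so that equal dimensions compare equal.
        self.dims = {k: v for k, v in (dims or {}).items() if v != 0}

    def _combine(self, other, sign):
        # Add (sign=+1) or subtract (sign=-1) the other's exponents.
        dims = dict(self.dims)
        for unit, power in other.dims.items():
            dims[unit] = dims.get(unit, 0) + sign * power
        return dims

    def __add__(self, other):
        if self.dims != other.dims:
            raise TypeError("incompatible units: %r and %r"
                            % (self.dims, other.dims))
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        return Quantity(self.value * other.value, self._combine(other, +1))

    def __truediv__(self, other):
        return Quantity(self.value / other.value, self._combine(other, -1))

metre = Quantity(1.0, {"m": 1})
kelvin = Quantity(1.0, {"K": 1})
two = Quantity(2.0)
three = Quantity(3.0)

length = two * metre + three * metre    # fine: both operands are lengths
try:
    two * metre + three * kelvin        # a length plus a temperature
except TypeError as exc:
    error = str(exc)
```

With something like this, the check "is p/Pa a pure number?" reduces to testing whether the resulting `dims` dict is empty.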


> This proposal is more about adding capabilities to be base 
> language that happen to make dimensional analysis easier and more attractive 
> than about providing dimensional analysis itself.

I think it is an admirable aim to want to make unit tracking easier in 
Python. That doesn't imply that this is the right way to go about it.

Perhaps you should separate your suggested syntax from your ultimate 
aim. Instead of insisting that your syntax is the One Right Way to get 
units into Python, how about thinking about what other possible syntax 
might work? Here's a possibility, thrown out just to be shot down:

# Van der Waals constants for carbon tetrachloride.
a = 19.7483 as L**2*bar/mol**2
b = 0.1281 as m**3/mol


I think that's better than:

a = 19.7483_L * (1_L) * (1_bar) / (1_mol)**2
b = 0.1281_m * (1_m)**2 / 1_mol

and *certainly* better than trying to have the interpreter guess 
whether:

19.7483_L**2*bar/mol**2

means 

19.7483 with units L**2*bar/mol**2

or 

19.7483_L squared, times bar, divided by mol**2
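
The two readings are not just cosmetically different. The sketch below uses plain floats as stand-ins for the units (the conversion factors to SI are the only facts assumed) to show that the two readings give different numbers. If the hypothetical literal were tokenized like any other numeric literal, Python's existing precedence rules would presumably give the second reading, since `**` applies to the whole preceding token.

```python
# Plain floats stand in for units here; the names are for illustration only.
L = 1e-3        # one litre expressed in cubic metres
bar = 1e5       # one bar expressed in pascals
mol = 1.0       # kept as 1 so the arithmetic stays visible

# Reading 1: the number 19.7483 carries the compound unit L**2*bar/mol**2.
reading_1 = 19.7483 * (L**2 * bar / mol**2)

# Reading 2: 19.7483_L is a single quantity, and ** squares all of it.
reading_2 = (19.7483 * L)**2 * bar / mol**2

# The second reading squares the magnitude as well as the unit.
ratio = reading_2 / reading_1
```

The results differ by a factor equal to the magnitude itself, which is exactly the kind of error that is hard to spot by eye.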



-- 
Steve

