[Python-ideas] SI scale factors in Python

Paul Moore p.f.moore at gmail.com
Thu Aug 25 16:03:32 EDT 2016


On 25 August 2016 at 19:02, Ken Kundert <python-ideas at shalmirane.com> wrote:
>     This proposal basically has two parts. One part is that Python should
> naturally support printing real numbers with SI scale factors.  Currently there
> are three formats for printing real numbers: %f, %e, %g. They all become
> difficult to read for even moderately large or small numbers.  Exponential
> notation is hard for humans to read. That is why SI scale factors have largely
> replaced exponential notation everywhere except in programming.  Adding another
> format for printing real numbers in human readable form seems like a modest
> extension that is long past due in all programming languages. I am asking that
> Python be the leader here. I am sure other languages will pick it up once it is
> implemented in Python.

This part would be easy to implement as a small PyPI module, with a
formatting function siformat(num) -> str. If that proved popular (and
it well might) then there's more of a case for adding it as a language
feature, maybe as a custom string formatting code, or maybe just as a
standard library function. But without that sort of real-world
experience, it's likely that any proposal will get bogged down in
"what if" theoretical debate.
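
As a rough sketch of what such a function might look like (the prefix
table, the `digits` parameter and the fallback behaviour are my own
assumptions here, not an existing module's API):

```python
import math

# Hypothetical prefix table for this sketch (power of ten -> SI prefix).
_PREFIXES = {
    -24: "y", -21: "z", -18: "a", -15: "f", -12: "p", -9: "n", -6: "u",
    -3: "m", 0: "", 3: "k", 6: "M", 9: "G", 12: "T", 15: "P", 18: "E",
    21: "Z", 24: "Y",
}


def siformat(num, digits=3):
    """Return num as a string with an SI scale factor, e.g. 2.4e9 -> '2.4G'."""
    if num == 0 or not math.isfinite(num):
        return str(num)
    # Round to `digits` significant figures first, so that 999999 comes
    # out as 1M rather than 1000k.
    mantissa, exponent = f"{num:.{digits - 1}e}".split("e")
    exponent = int(exponent)
    si_exponent = 3 * (exponent // 3)  # nearest multiple of 3 at or below
    if si_exponent not in _PREFIXES:
        # Outside the range of SI prefixes: fall back to exponential form.
        return f"{num:.{digits - 1}e}"
    value = float(mantissa) * 10 ** (exponent - si_exponent)
    return f"{value:.{digits}g}{_PREFIXES[si_exponent]}"
```

Even something this small raises questions a trial module would have to
answer: how many significant digits by default, "u" versus a real micro
sign, and what to do beyond the range of the prefixes.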

> The second part is the logical dual to the first: input. People should be able
> to enter numbers in Python using SI scale factors. This means as real literals,
> such as 2.4G, but it should also work with casting, float('2.4G').  Once you
> allow SI scale factors on numbers, the natural tendency is for people to want to
> add units, which is a good thing because it gives important information about
> the number. We should allow it because it improves the code by making it more
> self documenting. Even if the language completely ignores the units, we have
> still gained by allowing the units to be there, just like we gain when we allow
> users to add comments to their code even though the compiler ignores them.

This is much more problematic. Currently, even the long-established
decimal and rational types don't enjoy syntactic support, so I'd be
surprised if SI scale factors got syntax support first. But following
their lead, by (again) starting with a conversion function along the
lines of SI("3k") -> 3000 would be a good test of applicability. It
could easily go in the same module as you're using to trial the string
formatting function above.

I'd expect that a function to do this conversion would be a good way
of thrashing out the more controversial aspects of the proposal -
whether E means "exponent" or "exa", whether M and G get
misinterpreted as computing-style 2**20 and 2**30, etc. Having real
world experience of how to solve these questions would be invaluable
in moving forward with a proposal to add language support.
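
To make those questions concrete, here is a minimal sketch of such a
conversion function. Everything beyond the SI("3k") -> 3000 idea is an
assumption of mine: the `binary` flag, the decision to reject exponent
notation so that a trailing "E" can only mean exa, and the choice to
ignore units entirely.

```python
import re

# Hypothetical tables for this sketch.
_DECIMAL = {"y": -24, "z": -21, "a": -18, "f": -15, "p": -12, "n": -9,
            "u": -6, "m": -3, "k": 3, "M": 6, "G": 9, "T": 12, "P": 15,
            "E": 18, "Z": 21, "Y": 24}
_BINARY = {"k": 10, "M": 20, "G": 30, "T": 40}

# A plain decimal number followed by at most one prefix letter.
# Exponent notation ("1E3") is deliberately not accepted here.
_NUMBER = re.compile(r"([-+]?(?:\d+\.?\d*|\.\d+))([A-Za-z]?)")


def SI(text, binary=False):
    """Convert a string such as '3k' or '2.4G' to a float."""
    match = _NUMBER.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not an SI-scaled number: {text!r}")
    digits, prefix = match.groups()
    value = float(digits)
    if not prefix:
        return value
    table, base = (_BINARY, 2.0) if binary else (_DECIMAL, 10.0)
    if prefix not in table:
        raise ValueError(f"unknown scale factor: {prefix!r}")
    return value * base ** table[prefix]
```

With these choices SI("1E") is 1e18 and SI("1E3") is an error, whereas
float("1E3") is 1000.0, and SI("1M", binary=True) is 2**20. A different
module could reasonably make the opposite choices, which is exactly the
sort of thing real-world use would help settle.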


> Some people have suggested that we take the next step and use the units for
> dimensional analysis, but that is highly problematic because you cannot do
> dimensional analysis unless everything is specified with the correct units, and
> that can be a huge burden for the user. So instead, I am suggesting that we
> provide simple hooks that simply allow access to the units.  That way people can
> build dimensional analysis packages using the units if they feel the need.

Dimensional analysis packages already exist (I believe) and they don't
rely on syntactic support. Any proposal to add something to the
language *really* needs to demonstrate that

1. Those packages are currently suffering from the lack of language support.
2. The proposed change would allow them to resolve existing problems
that they haven't been able to address any other way.
3. The proposed change isn't some sort of "attractive nuisance" for
naive users, leading them to think they can write dimensionally
correct programs *without* using one of the existing packages.
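
For illustration, the core of dimension checking needs nothing beyond
ordinary operator overloading. This is a toy sketch only; the Quantity
class and its (metre, kilogram, second) exponent tuple are hypothetical
and not taken from any existing package:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A value plus exponents for (metre, kilogram, second)."""
    value: float
    dims: tuple

    def __add__(self, other):
        # Addition only makes sense for matching dimensions.
        if self.dims != other.dims:
            raise TypeError("cannot add quantities with different dimensions")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        # Multiplying quantities adds their dimension exponents.
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)

    def __truediv__(self, other):
        # Dividing quantities subtracts their dimension exponents.
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)


distance = Quantity(100.0, (1, 0, 0))  # 100 m
duration = Quantity(9.58, (0, 0, 1))   # 9.58 s
speed = distance / duration            # dims == (1, 0, -1), i.e. m/s
```

Here distance + duration raises TypeError, with no new syntax involved.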

Python has a track record of being open to adding syntactic support if
it demonstrably helps 3rd party tools (for example, the matrix
multiplication operator was added specifically to help the numeric
Python folks address a long-standing issue they had), so this is a
genuine possibility - but such proposals need support from the groups
they are intended to help. At the moment, I'm not even aware of a
particular "dimensional analysis with Python" community, or any
particular "best of breed" package in this area that might lead such a
proposal - and a language change of this nature probably does need
that sort of backing.

That's not to say there's no room for debate here - the proposal is
interesting, and not without precedent (for example Windows PowerShell
supports constants of the form 1MB, 1GB - which ironically are
computing-style 2**20 and 2**30 rather than SI-style 10**6 and 10**9). But
there's a pretty high bar for a language change like this, and it's
worth doing the groundwork to avoid wasting a lot of time on something
that's not going to be accepted in its current form.

Hope this helps,
Paul

