Native support for units [was: custom literals]
![](https://secure.gravatar.com/avatar/de311342220232e618cb27c9936ab9bf.jpg?s=120&d=mm&r=g)
On 4/3/22 11:52, Brian McCall wrote:
If you had asked me twenty years ago if I thought units should be a native part of any programming language, I would have said absolutely - because in my youthful ignorance I had no idea what it would take to make such a thing work. Five years later, I would have said "not worth it". Now I'm back where I started. The lack of native language support for SI units is a problem for an entire segment of programmers. Programming languages took a big step forward in deciding that EVERYTHING is a pointer/reference, and EVERYTHING is an object. They need to take another step forward to say that EVERY number has a unit, including "unitless". Not having this language feature is becoming (or already is) a problem. The question is, is it Python's problem?
On 4/3/22 14:20, Ricky Teachey wrote:
The libraries out there (pint is probably the biggest one) have filled those gaps as much as they can, but there are so many shortfalls...
The old engineering disciplines- mine (civil engineering), structural, electrical, etc- are the next frontier in the "software eats the world" revolution, and they desperately need a language with native units support. I was just on an interview call yesterday for a senior engineer role at a large multinational earth works engineering firm and we spent 15 minutes talking about software and what we see coming down the road when it comes to the need for our discipline to grow in its software creation capabilities.
Python SHOULD be that language we do this with. It is awesome in every other way. But if it isn't DEAD SIMPLE to use units in python, it won't happen.
Well, if we're spit-balling ideas, what about:
63_lbs
or
77_km/hr
?
Variables cannot start with a number, so there'd be no ambiguity there; we started allowing underbars for separating digits a few versions ago, so there is some precedent. We could use the asterisk, although I find the underbar easier to read.
Mechanically, are `lbs`, `km`, `hr`, etc., something that is imported, or are they tags attached to the numbers? If attached to the numbers, memory size would increase and performance might decrease -- but, how often do we have a number that is truly without a unit?
How old are you? 35 years
How much do you weigh? 300 kg
What temperature do you cook bread at? 350 F
-- ~Ethan~
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Mon, 4 Apr 2022 at 07:45, Ethan Furman <ethan@stoneleaf.us> wrote:
Mechanically, are `lbs`, `km`, `hr`, etc., something that is imported, or are they tags attached to the numbers? If attached to the numbers, memory size would increase and performance might decrease -- but, how often do we have a number that is truly without a unit?
How old are you? 35 years How much do you weigh? 300 kg What temperature do you cook bread at? 350 F
Very frequently - it'd be called an index. (What sort of numbers should enumerate() return, for instance? Clearly that, whatever it is, is an index.) But if every int and float has a tag attached to it, it's not that big a deal to have either a default tag, or leave the field NULL or None, to define it to be a unitless value or index. ChrisA
![](https://secure.gravatar.com/avatar/9ea64fa01ed0d8529e4ae1b8873bb930.jpg?s=120&d=mm&r=g)
On Sun, Apr 3, 2022, 6:27 PM Chris Angelico <rosuav@gmail.com> wrote:
On Mon, 4 Apr 2022 at 07:45, Ethan Furman <ethan@stoneleaf.us> wrote:
Mechanically, are `lbs`, `km`, `hr`, etc., something that is imported, or are they tags attached to the numbers? If attached to the numbers, memory size would increase and performance might decrease -- but, how often do we have a number that is truly without a unit?
How old are you? 35 years How much do you weigh? 300 kg What temperature do you cook bread at? 350 F
Very frequently - it'd be called an index. (What sort of numbers should enumerate() return, for instance? Clearly that, whatever it is, is an index.) But if every int and float has a tag attached to it, it's not that big a deal to have either a default tag, or leave the field NULL or None, to define it to be a unitless value or index.
ChrisA
How much memory would have to be added to every number? Could it be dynamically sized so it doesn't have a large impact on all the code out there with no units? But also grow to be big enough to capture the vast constellation of units and unit combinations out there?
I'm unsure simple tags are enough. What should the behavior of this be?
height = 5ft + 4.5in
Surely we ought to be able to add these values. But what should the resulting tag be? Should we be able to write it like this?
height = 5ft 4.5in
I was cheerleading this effort earlier and I still think it would be a massive contribution to the needs of the engineering world to solve this problem at the language level. But boy howdy is it a tough nut of a problem to crack.
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Mon, 4 Apr 2022 at 12:42, Ricky Teachey <ricky@teachey.org> wrote:
On Sun, Apr 3, 2022, 6:27 PM Chris Angelico <rosuav@gmail.com> wrote:
On Mon, 4 Apr 2022 at 07:45, Ethan Furman <ethan@stoneleaf.us> wrote:
Mechanically, are `lbs`, `km`, `hr`, etc., something that is imported, or are they tags attached to the numbers? If attached to the numbers, memory size would increase and performance might decrease -- but, how often do we have a number that is truly without a unit?
How old are you? 35 years How much do you weigh? 300 kg What temperature do you cook bread at? 350 F
Very frequently - it'd be called an index. (What sort of numbers should enumerate() return, for instance? Clearly that, whatever it is, is an index.) But if every int and float has a tag attached to it, it's not that big a deal to have either a default tag, or leave the field NULL or None, to define it to be a unitless value or index.
ChrisA
How much memory would have to be added to every number? Could it be dynamically sized so it doesn't have a large impact on all the code out there with no units? But also grow to be big enough to capture the vast constellation of units and unit combinations out there?
I'm not sure that the semantics should be defined by the language to that extent. My thinking here is that most of the actual work should be done by a unit-specific library, and the language simply defines some low-level semantics (and, of course, the corresponding syntax). So the effect would be that int and float don't actually change, but numbers-with-units are all instances of custom classes.
I'm unsure simple tags are enough. What should the behavior of this be?
height = 5ft + 4.5in
My view on this is that it should basically be defined as: height = ft(5) + in(4.5) where you register your constructor functions (possibly types, possibly factory functions) using a hook in the sys module.
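A minimal sketch of that desugaring, to make the idea concrete. Everything here is hypothetical: `register_suffix`, `desugar`, and the toy `Inches` type are invented for illustration, no such hook exists in the sys module, and a real implementation would live in the compiler rather than in a regular expression plus `eval`.

```python
import re

# Hypothetical registry: literal suffix -> constructor. Nothing like this
# exists in the sys module today; it only illustrates the proposal.
_suffix_constructors = {}

def register_suffix(suffix, constructor):
    _suffix_constructors[suffix] = constructor

# A number immediately followed by letters, not preceded by a word character.
_LITERAL = re.compile(r"(?<![\w.])(?P<number>\d+(?:\.\d+)?)(?P<suffix>[A-Za-z]+)")

def desugar(source):
    """Rewrite '5ft' as a call to the constructor registered for 'ft'."""
    def repl(match):
        return f'_suffix_constructors[{match["suffix"]!r}]({match["number"]})'
    return _LITERAL.sub(repl, source)

class Inches(float):
    """Toy length type that stores everything in inches."""
    def __add__(self, other):
        return Inches(float(self) + float(other))

register_suffix("ft", lambda n: Inches(n * 12))
register_suffix("in", Inches)

expression = desugar("5ft + 4.5in")
# eval stands in for the compiler here; do not do this with untrusted input.
height = eval(expression, {"_suffix_constructors": _suffix_constructors})
print(expression)
print(height)
```

Note that `in` is a Python keyword, so the spelling `in(4.5)` could not be an ordinary function call; looking the constructor up by suffix string sidesteps that.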
Surely we ought to be able to add these values. But what should the resulting tag be? Should we be able to write it like this?
height = 5ft 4.5in
That, I would say, no. There are a lot of human-readable notations that aren't supported by programming languages, such as algebraic "abuttal means multiplication" and such. It's not that hard to say "5ft + 4.5in", just like you'd say "3 + 4j" for a complex number. (Also, I don't really see a lot of point in making feet-and-inches minorly less complicated, given that people should be trying to use metric anyway.)
I was cheerleading this effort earlier and I still think it would be a massive contribution to the needs of the engineering world to solve this problem at the language level. But boy howdy is it a tough nut of a problem to crack.
This is something where (again, my view only, others may disagree) the language itself need not actually make a decision. Ideally, the standard library should have something usable, but it should be such that you could choose to do something else. The standard library could even have more than one unit system in it - for instance, you could have a metric-focused system that converts "5ft" into "1.524m", or you could have a unit-retaining system that keeps everything in the unit specified until such time as conversions are needed, etc.
One very important feature for the standard library would be a generic "Quantity" type, which could save a lot of hassle. You could create a "length" quantity and register a number of units, and define that any length can be added to any other length, a length can be multiplied by a scalar, etc, etc, etc. Then a bit of generic handling could also recognize that when you multiply or divide two quantities, you get something in a combined unit (whether that's "length*length" meaning area, or "length/time" meaning velocity), and libraries could slot in whatever units make the most sense.
Maybe you're working with astronomical distances, and the meter just causes too many floating-point roundoff errors, so you define your base unit to be the parsec, and everything else derives from that. Or maybe you're working with video game analysis and the most important quantity is "items per minute" (eg 780 iron ore per minute), so you define each type of item as its own quantity, and redefine time with a basis of minutes instead of seconds. Having the tools to do that is probably more important than defining it all at the language level.
ChrisA
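The generic "Quantity" idea could be sketched roughly as follows. This is an illustration under assumptions, not a proposed standard-library API: the names `Quantity`, `unit`, `metre`, `foot`, `second` and `minute` are all hypothetical, and values are simply stored in whatever base unit each dimension was registered with.

```python
import math
from dataclasses import dataclass

def _combine(a, b, sign):
    """Add (sign=+1) or subtract (sign=-1) dimension exponents."""
    total = dict(a)
    for name, exponent in b:
        total[name] = total.get(name, 0) + sign * exponent
    return tuple(sorted((n, e) for n, e in total.items() if e != 0))

@dataclass(frozen=True)
class Quantity:
    value: float  # magnitude expressed in base units
    dims: tuple   # sorted tuple of (dimension name, exponent) pairs

    def __add__(self, other):
        if self.dims != other.dims:
            raise TypeError(f"cannot add {self.dims} to {other.dims}")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        if isinstance(other, Quantity):
            return Quantity(self.value * other.value,
                            _combine(self.dims, other.dims, +1))
        return Quantity(self.value * other, self.dims)  # scalar multiple

    def __truediv__(self, other):
        if isinstance(other, Quantity):
            return Quantity(self.value / other.value,
                            _combine(self.dims, other.dims, -1))
        return Quantity(self.value / other, self.dims)

def unit(dimension, factor):
    """Return a constructor for a unit worth `factor` base units."""
    return lambda value: Quantity(value * factor, ((dimension, 1),))

# The base unit is whatever has factor 1.0; a library could pick parsecs instead.
metre = unit("length", 1.0)
foot = unit("length", 0.3048)
second = unit("time", 1.0)
minute = unit("time", 60.0)

area = metre(2) * metre(3)        # dims become length**2
speed = metre(120) / minute(1)    # dims become length / time
height = foot(5) + metre(0)       # same dimension, so addition is allowed
print(area, speed, height, sep="\n")
```

Adding a length to a time raises `TypeError`, while multiplying or dividing just combines the exponents, which is the "generic handling" described above.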
![](https://secure.gravatar.com/avatar/9ea64fa01ed0d8529e4ae1b8873bb930.jpg?s=120&d=mm&r=g)
On Sun, Apr 3, 2022, 11:03 PM Chris Angelico <rosuav@gmail.com> wrote:
I'm unsure simple tags are enough. What should the behavior of this be?
height = 5ft + 4.5in
My view on this is that it should basically be defined as:
height = ft(5) + in(4.5)
where you register your constructor functions (possibly types, possibly factory functions) using a hook in the sys module.
This is really similar to what pint does already (except it uses the multiplication syntax of course).
What does that idea bring other than being able to say:
5.0m
(m registered in a previously run module)
.... instead of:
5.0*m
(m an object imported in a previously run module)
?
Surely we ought to be able to add these values. But what should the resulting tag be? Should we be able to write it like this?
height = 5ft 4.5in
That, I would say, no. There are a lot of human-readable notations that aren't supported by programming languages, such as algebraic "abuttal means multiplication" and such. It's not that hard to say "5ft + 4.5in", just like you'd say "3 + 4j" for a complex number.
Makes sense to me.
(Also, I don't really see a lot of point in making feet-and-inches minorly less complicated, given that people should be trying to use metric anyway.)
They're almost all I use professionally. I graduated college in 2010 and have almost never used centimeters or millimeters since then. "Shoulds" aside, feet and inches are not going away and any unit system needs to make them first class and easy to use or it will be extremely painful for large chunks of professional engineers.
But 5ft+4in is fine.
I was cheerleading this effort earlier and I still think it would be a massive contribution to the needs of the engineering world to solve this problem at the language level. But boy howdy is it a tough nut of a problem to crack.
This is something where (again, my view only, others may disagree) the language itself need not actually make a decision. Ideally, the standard library should have something usable, but it should be such that you could choose to do something else. The standard library could even have more than one unit system in it - for instance, you could have a metric-focused system that converts "5ft" into "1.524m", or you could have a unit-retaining system that keeps everything in the unit specified until such time as conversions are needed, etc.
Yup, being able to choose to retain units or convert them to preferred units is important.
One very important feature for the standard library would be a generic "Quantity" type, which could save a lot of hassle. You could create a "length" quantity and register a number of units, and define that any length can be added to any other length, a length can be multiplied by a scalar, etc, etc, etc. Then a bit of generic handling could also recognize that when you multiply or divide two quantities, you get something in a combined unit (whether that's "length*length" meaning area, or "length/time" meaning velocity), and libraries could slot in whatever units make the most sense. Maybe you're working with astronomical distances, and the meter just causes too many floating-point roundoff errors, so you define your base unit to be the parsec, and everything else derives from that. Or maybe you're working with video game analysis and the most important quantity is "items per minute" (eg 780 iron ore per minute), so you define each type of item as its own quantity, and redefine time with a basis of minutes instead of seconds. Having the tools to do that is probably more important than defining it all at the language level.
I like and think I understand most of what you're saying here. I think much of this has been thought through and solved by the pint library, and some others. Having a "slot" to natively associate a unit tag with numerical values, and syntax support with defined semantics, could be a big win. Leaving the "now let's make decisions about behavior" part to 3rd parties makes sense to me. But what superpower does bringing the notion of tagging things into the native language bring for us if all the tag does is call a function that returns a thing? It almost just sounds like a postfix function call syntax to me.
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Mon, 4 Apr 2022 at 14:13, Ricky Teachey <ricky@teachey.org> wrote:
This is really similar to what pint does already (except it uses the multiplication syntax of course).
What does that idea bring other than being able to say:
5.0m
(m registered in a previously run module)
.... instead of:
5.0*m
(m an object imported in a previously run module)
?
A large amount of clarity, readability, and namespacing (you don't have to pollute your global namespace with a large number of single-letter names, since these tokens will ONLY have meaning when immediately following an int or float literal).
Makes sense to me.
(Also, I don't really see a lot of point in making feet-and-inches minorly less complicated, given that people should be trying to use metric anyway.)
They're almost all I use professionally. I graduated college in 2010 and have almost never used centimeters or millimeters since then. "Shoulds" aside, feet and inches are not going away and any unit system needs to make them first class and easy to use or it will be extremely painful for large chunks of professional engineers.
But 5ft+4in is fine.
Yeah - if it were clunkier than that, I would be more sympathetic to the "feet and inches are important" crowd, but the cost is a single addition operator. (Though there's still the question of what unit "5ft+4in" is - is it fractional feet or a large number of inches? Or is it a number of meters? But that's something a library can decide.)
I like and think I understand most of what you're saying here. I think much of this has been thought through and solved by the pint library, and some others. Having a "slot" to natively associate a unit tag with numerical values, and syntax support with defined semantics, could be a big win. Leaving the "now let's make decisions about behavior" part to 3rd parties makes sense to me.
But what superpower does bringing the notion of tagging things into the native language bring for us if all the tag does is call a function that returns a thing?
It almost just sounds like a postfix function call syntax to me.
Ultimately, EVERYTHING can be seen as just a function call in disguise. Why do we have subscripting when we could just have a method to return the Nth item from a list? Etcetera. What this brings is not a superpower, but simple clarity. We have imaginary literals because they make code more readable; in theory, we could just have a variable called "j" (or "i" if you prefer) which you multiply by something and add something, and that's your complex number. But it's cleaner to write "3+4j". If this were accepted, I would fully expect that libraries like pint would adopt it, so this example:
>>> 3 * ureg.meter + 4 * ureg.cm
<Quantity(3.04, 'meter')>
could look like this:
>>> 3m + 4cm
<Quantity(3.04, 'meter')>
with everything behaving the exact same after that point. Which would YOU prefer to write in your source code, assuming they have the same run-time behaviour? ChrisA
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
If typing "3m + 4cm" into a terminal produced the above output, even if it meant I needed to import the pint module, that would be great. No idea how that would work out, but all for it. argparse still seems like it would be a loose end, though. Although, to be fair, it would be a loose end no matter what.
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Mon, 4 Apr 2022 at 15:20, Brian McCall <brian.patrick.mccall@gmail.com> wrote:
If typing "3m + 4cm" into a terminal produced the above output, even if it meant I needed to import the pint module, that would be great. No idea how that would work out, but all for it. argparse still seems like it would be a loose end, though. Although, to be fair, it would be a loose end no matter what.
Not sure what the implications of argparse would be? If the meaning is divided into three as per my previous description, it absolutely would be possible to "import pint; 3m + 4cm" to get that result, assuming that pint is enhanced to register itself in this way. It might work out better to write it as "import pint; pint.register_si(); print(3m + 4cm)" to explicitly choose the SI unit set, but that'd be in the hands of the pint maintainers. ChrisA
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
I think I posted this somewhere else in this thread, or the previous thread. argparse can handle negative numbers, but only for one of the built-in primitive types. See example below:
```
import re, argparse

class meters(float):
    # Accept "12" or "12m": strip the unit suffix before converting.
    def __new__(cls, x):
        return super().__new__(cls, float(re.sub(r'm','',x)))

parser = argparse.ArgumentParser()
parser.add_argument("-l", "--length", dest='length', type=meters)
# _StoreAction(option_strings=['-l', '--length'], dest='length', nargs=None, const=None, default=None, type=<class '__main__.meters'>, choices=None, help=None, metavar=None)
meters("12")
#12.0
meters("12m")
#12.0
parser.parse_args("-l -12".split())
#Namespace(length=-12.0)
# With a unit suffix, "-12m" no longer looks like a negative number:
parser.parse_args("-l -12m".split())
#usage: [-h] [-l LENGTH]
#: error: argument -l/--length: expected one argument
```
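One workaround that should sidestep the failure above, offered as a sketch rather than a fix: argparse appears to reject "-12m" because it does not look like a negative number and so is taken for an option string, but a value attached to the option with "=" is never examined that way. The `meters` converter below is a simplified stand-in for the class in the example.

```python
import argparse

def meters(text):
    """Parse '12', '12m' or '-12m' into a float number of meters."""
    return float(text.removesuffix("m"))  # str.removesuffix needs Python 3.9+

parser = argparse.ArgumentParser()
parser.add_argument("-l", "--length", type=meters)

# Attaching the value with "=" keeps "-12m" from being read as its own option.
args = parser.parse_args(["--length=-12m"])
print(args.length)  # -12.0
```

This does nothing for positional arguments, where the usual escape hatch is a bare `--` before the values.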
![](https://secure.gravatar.com/avatar/b01753e0c78849bd34045ed730d59db6.jpg?s=120&d=mm&r=g)
On Sun, Apr 3, 2022 at 10:10 PM Chris Angelico <rosuav@gmail.com> wrote:
On Mon, 4 Apr 2022 at 14:13, Ricky Teachey <ricky@teachey.org> wrote:
What does that idea bring other than being able to say 5.0m [...] instead of 5.0*m [...]?
A large amount of clarity, readability, and namespacing (you don't have to pollute your global namespace with a large number of single-letter names, since these tokens will ONLY have meaning when immediately following an int or float literal).
I feel like one of the biggest sticking points in this thread is that people are arguing for a new kind of global scope just for units, and the sole reason seems to be that they want short names for them.
The register_numeric_suffix idea would create a true global namespace, independent of the module system. That seems like a bad idea: libraries should be able to use their own units internally without potentially breaking other libraries. Units should be local to each module. You need a way to import them into your module's unit namespace. You might want to import only some of the units exported by a unit provider...
There is already a solution to all of these problems. All you have to do to be able to use that solution as-is, instead of introducing a large amount of new complexity to the language, is give your units names like "ampere" instead of "A".
I don't think that would be much of a problem. How often will explicit unit names show up in a program? Maybe you'll multiply by 1 meter in one place when reading a CSV file, and divide by 1 meter when writing. You probably won't write 1.380649e-23 J/K inline, even if you only use it once; you'll assign it to k_B or something. Or just import it from a physics-constants module.
If you're doing a-la-carte computations at the interactive prompt, you can "from units import *" for convenience; the official docs already advise "from math import *" in that situation.
in theory, we could just have a variable called "j" (or "i" if you prefer) which you multiply by something and add something, and that's your complex number. But it's cleaner to write "3+4j".
I would like to see examples of how complex literals are used in the wild. I feel like most will look like (4086.184383622179764082821 - 3003.003538923749396546871j) (an actual example from mpmath.functions.elliptic), or are just ±1j. There's practically no situation in which you'd want a literal like 3+4j. Even crazy constants like those from mpmath are unlikely to appear in your code because most people aren't numeric analysts who write math libraries. I feel like the biggest benefit of the suffix j syntax is that it's recognizable as a foldable compile-time constant, so you can put foo+barj in an inner loop cheaply. mpmath has "3.1415926535897932j" in a couple places, which I suppose is for speed; it certainly isn't for readability. Python should have some generic solution for this problem, like C++ constexpr, but I don't know how to do it, and it's a different discussion.
3 * ureg.meter + 4 * ureg.cm
Same problem here: I don't believe that anyone would write that in a real program. How are libraries like pint actually used?
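The "multiply by 1 meter when reading, divide when writing" pattern mentioned earlier in this message can be sketched with ordinary module-level names. This is an assumption-laden illustration: the unit names are plain floats acting as scale factors relative to the meter, the CSV data is made up, and nothing here checks dimensions.

```python
import csv
import io

# Units as plain scale factors relative to a chosen base unit (the meter).
# In a real project these names would be imported from a units module.
meter = 1.0
foot = 0.3048
inch = foot / 12

# Reading: multiply by the unit once, at the input boundary.
raw = io.StringIO("name,height_ft\nbeam,5\npost,8\n")
rows = [(row["name"], float(row["height_ft"]) * foot)
        for row in csv.DictReader(raw)]

# Internal calculations use bare floats, implicitly in meters.
total_height = sum(height for _, height in rows)

# Writing: divide by the unit once, at the output boundary.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["name", "height_in"])
for name, height in rows:
    writer.writerow([name, round(height / inch, 3)])
print(out.getvalue())
```

The long names cost nothing beyond a normal import, which is the point being argued; the trade-off is that a mistake such as adding a length to a time goes unnoticed.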
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
The C++ way is custom literals (it's where this thread originated) https://mail.python.org/archives/list/python-ideas@python.org/thread/MFZ52D3...
One problem I am seeing is a misunderstanding between standard, well-defined units, and technical lingo - for which I am not advocating native language support. Also, standard units here does not mean universal constants. There is no reason or need to provide any kind of new language support for those.
The problem as described in my original post (rant?) is that without units, any computational code is less WYSIWYG. C++ has a way to deal with units in literal expressions, why not Python? And no, language support alone is not enough, since NumPy and SciPy would have to add code to implement as well, and then there is data entry (command line arguments, config files, web forms, etc.). Native language support is not a magic bullet, but it is a bullet.
The register_numeric_suffix idea would create a true global namespace, independent of the module system. That seems like a bad idea: libraries should be able to use their own units internally without potentially breaking other libraries. Units should be local to each module. You need a way to import them into your module's unit namespace. You might want to import only some of the units exported by a unit provider...
The reason I disagree with this view is that units of measurement are fixed. I'm sure that when I say this, some will be reminded of trauma that they have experienced with "datetime" objects and UTC jujitsu, but those are not the same. Units of measure are immutable, and have only one proper meaning. PEP 20 says "There should be one-- and preferably only one --obvious way to do it. [sic]" Seems like that should apply to units as well. In fact, for mechanical engineering software packages like SolidWorks there already is a standard for unit conversions: https://en.wikipedia.org/wiki/Unified_Code_for_Units_of_Measure
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Mon, 4 Apr 2022 at 16:56, Ben Rudiak-Gould <benrudiak@gmail.com> wrote:
I feel like one of the biggest sticking points in this thread is that people are arguing for a new kind of global scope just for units, and the sole reason seems to be that they want short names for them.
The register_numeric_suffix idea would create a true global namespace, independent of the module system. That seems like a bad idea: libraries should be able to use their own units internally without potentially breaking other libraries. Units should be local to each module. You need a way to import them into your module's unit namespace. You might want to import only some of the units exported by a unit provider...
How often do you ACTUALLY need them to be local to a module? When is this ever a concern?
There is already a solution to all of these problems. All you have to do to be able to use that solution as-is, instead of introducing a large amount of new complexity to the language, is give your units names like "ampere" instead of "A".
You can already do that. Just import the appropriate terms from pint. I notice that people aren't doing it very much though - maybe because "1*ampere" is too clunky.
I don't think that would be much of a problem. How often will explicit unit names show up in a program? Maybe you'll multiply by 1 meter in one place when reading a CSV file, and divide by 1 meter when writing. You probably won't write 1.380649e-23 J/K inline, even if you only use it once; you'll assign it to k_B or something. Or just import it from a physics-constants module.
That's a massive massive assumption, not supported by the way people work with complex numbers. Why is the square root of negative one special, if all other constants can be imported from modules?
I would like to see examples of how complex literals are used in the wild. I feel like most will look like (4086.184383622179764082821 - 3003.003538923749396546871j) (an actual example from mpmath.functions.elliptic), or are just ±1j. There's practically no situation in which you'd want a literal like 3+4j. Even crazy constants like those from mpmath are unlikely to appear in your code because most people aren't numeric analysts who write math libraries.
I've used them frequently for working with curves on planes. I don't have any Python code to hand, though, since my latest such project needed to run in a web browser. It would have benefited significantly from Python's complex number support, though.
I feel like the biggest benefit of the suffix j syntax is that it's recognizable as a foldable compile-time constant, so you can put foo+barj in an inner loop cheaply. mpmath has "3.1415926535897932j" in a couple places, which I suppose is for speed; it certainly isn't for readability. Python should have some generic solution for this problem, like C++ constexpr, but I don't know how to do it, and it's a different discussion.
Yeah, I thought about that possibility and the implications of requiring that unit-handlers be pure functions, but it might have other consequences not worth accepting. That can be an orthogonal discussion. ChrisA
![](https://secure.gravatar.com/avatar/72ee673975357d43d79069ac1cd6abda.jpg?s=120&d=mm&r=g)
On 4/04/22 7:59 pm, Chris Angelico wrote:
How often do you ACTUALLY need them to be local to a module? When is this ever a concern?
As long as there are competing *implementations* of units there will be potential for conflicts, even if the actual units being represented are the same. -- Greg
![](https://secure.gravatar.com/avatar/c6313f579e12a3332028d33fe6c0814f.jpg?s=120&d=mm&r=g)
Much of this discussion is based on a misconception. Units and SI scale factors are very useful in software that describes or interacts with the real world, but primarily on input and output. They are not normally used for internal calculations. The idea that one carries units on variables interior to a program, and that those units are checked for all interior calculations, is naive. Doing such a thing adds unnecessary and often undesired complexity. Rather, it is generally only desirable to allow users to include scale factors and units on values they specify and values they read. This implies that it is only necessary to provide a package for reading and writing physical quantities, and indeed such a package exists: QuantiPhy. QuantiPhy came out of the ideas that were raised the last time this topic was discussed on this mailing list a few years ago.
However, there are two reasons to consider adding both SI scale factors and units in Python itself. First, SI scale factors have been an international standard way of specifying real values for over 50 years, and use of SI scale factors results in numbers that are more compact and easier to read than using exponential notation. Second, providing units with numbers provides important information; specifying that information in a program reduces ambiguity and generally decreases the chance of errors.
For example, consider the following three versions of the same line of code:
virt /= 1048576
virt /= 1.048576e6
virt /= 1MiB
The last is the easiest to read and the least ambiguous. Using the units and scale factor on the scaling constant results in an easy to read line that makes it clear what is intended. Notice that in this case the program does not use the specified units; rather, the act of specifying the units clarifies the programmer's intent and reduces the chance of misunderstandings or error when the code is modified by later programmers.
But this suggests that it is not necessary for Python to interpret the units. The most it needs do is to save the units as an attribute so that it is available if needed later. -Ken
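A rough sketch of that "save the units as an attribute, never interpret them" behaviour, written as an ordinary function since the literal syntax does not exist. The names `Tagged` and `parse` and the small prefix tables are hypothetical, this is not QuantiPhy's API, and the prefix handling is deliberately naive (a unit such as "mol" would be misread as milli-"ol").

```python
import re

SI_PREFIXES = {"T": 1e12, "G": 1e9, "M": 1e6, "k": 1e3,
               "m": 1e-3, "u": 1e-6, "n": 1e-9, "p": 1e-12}
BINARY_PREFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}

class Tagged(float):
    """A float that remembers, but never interprets, its units."""
    def __new__(cls, value, units=""):
        self = super().__new__(cls, value)
        self.units = units
        return self

_PATTERN = re.compile(r"\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*([A-Za-z]*)\s*")

def parse(text):
    """Turn text such as '1MiB' or '10 km' into a scaled, tagged float."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a quantity: {text!r}")
    number, suffix = float(match[1]), match[2]
    for table in (BINARY_PREFIXES, SI_PREFIXES):
        for prefix, scale in table.items():
            # Require a unit after the prefix, so "m" alone means meters.
            if suffix.startswith(prefix) and len(suffix) > len(prefix):
                return Tagged(number * scale, suffix[len(prefix):])
    return Tagged(number, suffix)

virt = 3 * 2**20            # a size in bytes
virt /= parse("1MiB")       # the units document intent; the result is a plain float
print(virt, parse("1MiB").units)
```

Arithmetic on a `Tagged` value returns an ordinary float, so the units never propagate, which matches the point that the interpreter need not understand them.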
![](https://secure.gravatar.com/avatar/8dd7d6a49cdc89df68f43a1610085b52.jpg?s=120&d=mm&r=g)
On 2022-04-04 01:19:13, python@shalmirane.com wrote:
This implies that it is only necessary to provide a package for reading and writing physical quantities, and indeed such a package exists: QuantiPhy. QuantiPhy came out of the ideas that were raised the last time this topic was discussed on this mailing list a few years ago.
Link for the curious, IMHO seems to solve most of the objections raised here without language-level changes: https://pypi.org/project/quantiphy/ https://pypi.org/project/quantiphy-eval/ I wish I knew this existed when I re-implemented part of this (with bugs) a while back.
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
Much of this discussion is based on a misconception. Units and SI scale factors are very useful in software that describes or interacts with the real world, but primarily on input and output. They are not normally used for internal calculations. The idea that one carries units on variables interior to a program, and that those units are checked for all interior calculations, is naive. Doing such a thing adds unnecessary and often undesired complexity. Rather, it is generally only desirable to allow users to include scale factors and units on values they specify and values they read. This implies that it is only necessary to provide a package for reading and writing physical quantities, and indeed such a package exists: QuantiPhy. QuantiPhy came out of the ideas that were raised the last time this topic was discussed on this mailing list a few years ago.
Why is it naive to carry the units through calculation? Seems to me that a one-byte lookup and a 64-bit add/subtract would be enough to enable any plausible combination of standard units during computation. The conversion from raw powers of the 7 SI base units to units of choice could be done in higher-level code at the input/output stage. QuantiPhy is definitely not what I am thinking of. You don't happen to have a subject line for the previous discussion that I can look up, do you?
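One way the "64-bit add/subtract" could work, sketched under assumptions: each of the seven SI base-unit exponents gets an 8-bit signed lane inside a single integer, so multiplying two quantities adds their packed dimensions and dividing subtracts them. The function names and the lane layout are invented for illustration, and nothing here guards against a lane overflowing into its neighbour after repeated arithmetic.

```python
BASE_UNITS = ("m", "kg", "s", "A", "K", "mol", "cd")
LANE_SIZE = 256           # 8 bits per exponent, 7 lanes = 56 bits in total
HALF = LANE_SIZE // 2     # each exponent must lie in [-128, 127]

def pack(**exponents):
    """Encode base-unit exponents as one signed integer."""
    packed = 0
    for lane, name in enumerate(BASE_UNITS):
        exponent = exponents.get(name, 0)
        if not -HALF <= exponent < HALF:
            raise ValueError(f"exponent out of range for {name}")
        packed += exponent * LANE_SIZE ** lane
    return packed

def unpack(packed):
    """Recover the non-zero exponents from a packed integer."""
    result = {}
    for name in BASE_UNITS:
        exponent = (packed + HALF) % LANE_SIZE - HALF
        if exponent:
            result[name] = exponent
        packed = (packed - exponent) // LANE_SIZE
    return result

velocity = pack(m=1, s=-1)
duration = pack(s=1)
force = pack(kg=1, m=1, s=-2)
area = pack(m=2)

distance = velocity + duration   # multiplying quantities adds exponents
pressure = force - area          # dividing quantities subtracts them
print(unpack(distance), unpack(pressure))
```

The magnitudes themselves would still be carried separately in base units; this only shows that the dimension bookkeeping can be a single integer operation.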
![](https://secure.gravatar.com/avatar/c6313f579e12a3332028d33fe6c0814f.jpg?s=120&d=mm&r=g)
As to why it is naive, see my previous post where I talk about the limitations of dimensional analysis. As a point of reference, I have been developing software for electrical engineers for over 40 years. That software uses physical quantities (voltage, current, resistance, capacitance, etc.) heavily. Over those 40 years I have written and rewritten units packages maybe a half-dozen times. In that time I have never seriously considered writing a dimensional-analysis-based units package. In general, dimensional analysis is something you do once, not every time the program runs.
As for links to the previous discussion, search for “SI scale factors”. The discussion occurred 5 years ago.
-Ken
On Mon, Apr 04, 2022 at 03:06:26PM -0000, Brian McCall wrote:
Much of this discussion is based on a misconception. Units and SI scale factors are very useful in software that describes or interacts with the real world, but primarily on input and output. They are not normally used for internal calculations. The idea that one carries units on variables interior to a program, and that those units are checked for all interior calculations, is naive. Doing such a thing adds unnecessary and often undesired complexity. Rather, it is generally only desirable to allow users to include scale factors and units on values they specify and values they read. This implies that it is only necessary to provide a package for reading and writing physical quantities, and indeed such a package exists: QuantiPhy. QuantiPhy came out of the ideas that were raised the last time this topic was discussed on this mailing list a few years ago.
Why is it naive to carry the units through calculation? Seems to me that a one-byte lookup and a 64-bit add/subtract would be enough to enable any plausible combination of standard units during computation. The conversion from raw powers of 7 base SI units to units of choice could be done in higher-level code at the input/output stage.
QuantiPhy is definitely not what I am thinking of. You don't happen to have a subject line for the previous discussion that I can look up, do you? _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/PL3UHV... Code of Conduct: http://python.org/psf/codeofconduct/
![](https://secure.gravatar.com/avatar/b4f6d4f8b501cb05fd054944a166a121.jpg?s=120&d=mm&r=g)
On Mon, 2022-04-04 at 09:14 -0700, Ken Kundert wrote:
As to why it is naive, see my previous post where I talk about the limitations of dimensional analysis.
As a point of reference, I have been developing software for electrical engineers for over 40 years. That software uses physical quantities (voltage, current, resistance, capacitance, etc.) heavily. Over those 40 years I have written and rewritten units packages maybe a half-dozen times. In that time I have never seriously considered
Are there any Python or e.g. C/C++ libraries that may be useful to wrap for Python that are open source? There is interest in making units available to NumPy through a DType directly on the NumPy array. I.e. not quite the same way as `astropy.units` or `pint`, but rather ingraining deeper into NumPy (but written outside). And it would be great to have a full featured units library as a basis for this.
writing a dimensional analysis based units package. In general, dimensional analysis is something you do once, not every time the program runs.
I would say it actually is often something you are happy to do every time the program runs. If you think about NumPy (like) usage, there are at least two use-cases:

1. You have a mid- to large-sized amount of data. Figuring out units probably just doesn't matter performance-wise.
2. You don't care about performance anyway, but rather about flexibility in your analysis workflow.

Now, libraries may often need to strip units for more performance. That is, for example, while running a machine-learning algorithm or an integration, etc. But I think it is fair to say that often reversing the approach may well be good: use units by default, but strip them for performance.

I can understand the idea of a (soft) "literal operator", i.e. so that it is possible to write:

    from units import cm, s

    length = 1.8_cm
    speed = length / 3_s

And it would use `cm.__from_literal__("1.8", kind="floating")`. Of course that could be incompatible with other unit-providing libraries (unless they agree on some ABC). I assume this is what C++ does?

I am unsure that "tagging" in itself is helpful. What will:

    1.3_m / 2_s

give you? A `TaggedFloat(1.3, "m")` object may work. But can such naive units even propagate clearly enough that they can be reasonably implemented in a generic way? (Unless there is a blessed standard library units implementation at least.)

Cheers, Sebastian
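To make the "literal operator" idea concrete, here is a rough sketch that runs on today's Python by calling the hook explicitly. Everything in it is hypothetical: `__from_literal__` is only the name floated above, not an existing protocol, and `Unit`, `TaggedNumber` and the `kind="integer"` value are invented for illustration.

```python
class TaggedNumber:
    """A value paired with a unit string (illustrative only)."""

    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def __truediv__(self, other):
        # Naive propagation: glue the unit strings together, no simplification.
        return TaggedNumber(self.value / other.value, f"{self.unit}/{other.unit}")

    def __repr__(self):
        return f"{self.value} {self.unit}"


class Unit:
    """A unit object that knows how to build a number from literal text."""

    def __init__(self, symbol):
        self.symbol = symbol

    def __from_literal__(self, text, kind):
        # `kind` tells the unit how the literal was spelled in source code.
        value = float(text) if kind == "floating" else int(text)
        return TaggedNumber(value, self.symbol)


cm = Unit("cm")
s = Unit("s")

# What `1.8_cm` and `3_s` might desugar to under such a proposal.
length = cm.__from_literal__("1.8", kind="floating")
duration = s.__from_literal__("3", kind="integer")
speed = length / duration
```

The division shows the open question raised above: a generic tag can only concatenate unit strings, so anything smarter (cancelling, converting) would have to come from the library that owns the unit objects.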
As for links to the previous discussion, search for “SI scale factors”. The discussion occurred 5 years ago.
-Ken
![](https://secure.gravatar.com/avatar/c6313f579e12a3332028d33fe6c0814f.jpg?s=120&d=mm&r=g)
I think there is one more point worth making here. There is a suggestion that dimensional analysis can underpin a units system. Actually, the idea that all units can be broken down into a small set of fundamental units is very limiting and results in many vexing issues.

For example, consider currencies. There are currently hundreds of national currencies and thousands of cryptocurrencies. They all have the same basic fundamental unit of “value”, but value is only loosely defined. Furthermore, there is no fixed ratio between the currency and its value. It varies over time, over location, and from person to person.

Consider units of a particular commodity. The example of a ream of paper was recently mentioned. A ream is 500 sheets of paper. However, two reams may not be comparable. They may refer to a different size of paper or a different quality of paper. So all prices for reams of paper would have the same fundamental units of “value per each”, but both “value” and “each” are not necessarily comparable. In effect, the fundamental unit system is not complete. You also need to include information about what you are measuring. For example, “each” could represent a single item of anything. The unit is not complete until you include a description of what that anything is, and in effect, there is an unlimited number of things it could be.

Now consider the issue of “unitless units”. In electrical circuits we often talk about gain, which is the ratio of the signal level at the output of a circuit to the signal level at the input. But you need to be specific about how you measure signal level. Typically, it would be a voltage, current, or power level. Consider a circuit where both the input and output are measured in volts. Then the gain would have units of "V/V", which is unitless in dimensional analysis. But units of "V/V" (voltage gain) are much different from units of "A/A" (current gain) or "W/W" (power gain), even though they have the same dimensions. Mixing them up results in errors and confusion. An additional complication is that sometimes logarithmic units are used. For example, decibels in voltage, or dBV, is 20*log(Vout/Vin). Again, a dimensionless quantity, but nonetheless "dBV" is much different from "V/V".

The same issue occurs with the arguments to trigonometric functions like sin(), cos() and tan(). Generally, we assume the arguments are given in radians, which is a dimensionless number. But it could also be given in degrees, another dimensionless number. Radians and degrees are indistinguishable from the perspective of dimensional analysis, but mixing them up results in errors.

This is not to knock the idea of dimensional analysis. It is just not something that would be done in most programs that process physical quantities. Rather, it is something that is largely done as a one-time check on your analysis. It is a “second opinion” on whether your hand calculations are correct, or at least plausible. So dimensional analysis packages such as pint have their place, but dimensional analysis is not something that belongs in the base language or even the standard library.

However, I do believe that a case can be made to allow numbers to be easily tagged with units in the base language, and then allowing those units to be accessed as an attribute of the number. Packages such as pint and QuantiPhy could then use that attribute to provide processing of units that is appropriate for the particular application.

-Ken
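The "dimensionless but not interchangeable" cases above can be illustrated with a small sketch. `Tagged` is a hypothetical float subclass that only stores a free-form `units` string, roughly in the spirit of the tagging idea; it does no checking. The decibel line takes the log in the quoted formula to be base 10.

```python
import math


class Tagged(float):
    """A float carrying a free-form `units` string (hypothetical helper)."""

    def __new__(cls, value, units=""):
        obj = super().__new__(cls, value)
        obj.units = units
        return obj


v_in = Tagged(0.5, "V")
v_out = Tagged(5.0, "V")

# A plain ratio: dimensionless, but the label records that it is a voltage gain.
gain = Tagged(v_out / v_in, "V/V")

# The same information on a logarithmic scale, via 20*log10(Vout/Vin).
gain_db = Tagged(20 * math.log10(gain), "dB")

# Degrees and radians are both dimensionless, yet not interchangeable.
angle = Tagged(90.0, "deg")
right = math.sin(math.radians(angle))  # converted to radians before use
wrong = math.sin(angle)                # 90 silently treated as radians
```

Dimensional analysis alone would treat `gain`, `gain_db` and `angle` as the same kind of number; only the attached label tells them apart, and it is up to application code to act on it.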
![](https://secure.gravatar.com/avatar/cdc87637918eccd37ca88e9079e73705.jpg?s=120&d=mm&r=g)
Just to elaborate on units I use, here's a hypothetical (not stuff that actually happened today, but of a very commonplace nature):

I drove 20 minutes up the road to buy a bushel (US, not British) of U.S. No. 1 apples, to make apple cider. On my return trip, I stopped at the hardware store to buy a 2 lb box of 1-3/4" ring shank 12 penny nails. I used my 7/8 hole kitchen planer blade to grate the apples, then squeezed them for an hour and 15 minutes at 30 psi to extract the juice. For good measure I added 2 tablespoons of vanilla and a pinch of salt. Then I drove the nails into grade C 2x4 joists (whose sizes are 1.5 x 3.5 inches, with a 1/16th inch permissible tolerance in sizing).

Please express that description in SI units! ;-)
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
I drove 20 minutes up the road to buy a bushel (US, not British) of U.S. No. 1 apples, to make apple cider. On my return trip, I stopped at the hardware store to buy a 2 lb box of 1-3/4" ring shank 12 penny nails. I used my 7/8 hole kitchen planer blade to grate the apples, then squeezed them for an hour and 15 minutes at 30 psi to extract the juice. For good measure I added 2 tablespoons of vanilla and a pinch of salt. Then I drove the nails into grade C 2x4 joists (whose sizes are 1.5 x 3.5 inches, with a 1/16th inch permissible tolerance in sizing).
I'm going to start by ignoring any quantities above that are not involved in any sort of calculation. Those are outside the scope of the problem and proposal. And I don't care what anyone thinks of that.

The only calculation I see here is 1.5 in x 3.5 in = 0.00339 m2. The fact that joists are called 2x4 even though their actual dimensions are 1.5 in x 3.5 in is also outside the scope of the problem and proposal. Come on man, you think you're the only one who knows examples like this? How about 1/3" image sensors? How about display diagonals? I know about these things, and I know that they do not need to be accounted for in a standard, language-supported representation of units.

Units have infinite precision, so grades and tolerances are also irrelevant.

The units you mentioned are:

minutes
bushel (imperial)
bushel (US)
psi
tablespoon
pinch (it's a stretch, but okay)
foot
inch

7/8 hole - this is a specification, not a unit
2x4 joist - specification, not a unit
grade C - specification, not a unit
12 penny - Wikipedia calls it a unit, but calculations in measurements taken in units of pennies are neither associative nor distributive, and transformations on measurements taken in units of pennies are neither additive nor multiplicative.

Anyway, you mentioned you knew of at least 1000 units. I count 7. You have another 993?
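For reference, the joist cross-section arithmetic can be checked in a few lines. The only outside fact used is that one inch is exactly 0.0254 m; the variable names are just for this sketch. The product comes to about 0.00339 square metres, which is roughly 3387 square millimetres.

```python
INCH_IN_METRES = 0.0254  # exact by definition of the international inch

width_m = 1.5 * INCH_IN_METRES
height_m = 3.5 * INCH_IN_METRES

area_m2 = width_m * height_m   # cross-section in square metres
area_mm2 = area_m2 * 1e6       # same area in square millimetres

print(f"{area_m2:.5f} m2 = {area_mm2:.0f} mm2")
```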
![](https://secure.gravatar.com/avatar/cdc87637918eccd37ca88e9079e73705.jpg?s=120&d=mm&r=g)
Units have infinite precision, so grades and tolerances are also irrelevant.
Not if you believe in Planck lengths (or quantum states) :-).
![](https://secure.gravatar.com/avatar/cdc87637918eccd37ca88e9079e73705.jpg?s=120&d=mm&r=g)
You should probably change the thread subject to "All-and-only 7 SI units" if that's what you mean.
![](https://secure.gravatar.com/avatar/de311342220232e618cb27c9936ab9bf.jpg?s=120&d=mm&r=g)
On 4/4/22 13:31, David Mertz, Ph.D. wrote:
You should probably change the thread subject to "All-and-only 7 SI units" if that's what you mean.
While I'm sure SI would be very useful, I suspect that any system will have to be useful for a much broader audience to be accepted; given the vast number of units, I suspect individual libraries will be needed for each area/discipline.

So what hooks does Python need to provide to make such a thing feasible, easy to use, and, eventually if not immediately, performant?

One idea: have a `unit` attribute, and that attribute keeps track of the units so far in the operations:

[quick PoC]

    class TaggedInt(int):

        def __new__(cls, *args, unit=None, **kwds):
            ti = int.__new__(cls, *args, **kwds)
            # Normalize the unit to a tuple of tokens.
            if unit is None:
                unit = ()
            elif not isinstance(unit, tuple):
                unit = (unit, )
            ti.unit = unit
            return ti

        def __mul__(self, other):
            other_unit = getattr(other, 'unit', None)
            if other_unit is None:
                unit = self.unit
            else:
                unit = self.condense(self.unit, '*', other_unit)
            ti = TaggedInt(int.__mul__(self, other), unit=unit)
            return ti

        def __rmul__(self, other):
            other_unit = getattr(other, 'unit', None)
            if other_unit is None:
                unit = self.unit
            else:
                unit = self.condense(other_unit, '*', self.unit)
            ti = TaggedInt(int.__mul__(self, other), unit=unit)
            return ti

        def __repr__(self):
            return '%s %s' % (int.__repr__(self), ''.join(str(u) for u in self.unit))

        @staticmethod
        def condense(*args):
            # Flatten one-element tuples; static so `self` is not mixed into the units.
            result = []
            for arg in args:
                if isinstance(arg, tuple) and len(arg) == 1:
                    result.append(arg[0])
                else:
                    result.append(arg)
            return tuple(result)

in use:

    >>> to_home = TaggedInt(5, unit='km')
    >>> to_home
    5 km

    >>> # home and back
    >>> 2 * to_home
    10 km

    >>> # big square
    >>> to_home * to_home
    25 km*km

    >>> _.unit
    ('km', '*', 'km')

Beyond that, do we make Python smart enough to, for example, convert `km*km` to `km^2`, or do we let libraries provide their own functions? I'm inclined to let libraries provide their own, as they could also implement unit conversions as desired.

-- ~Ethan~
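As a sketch of the library-side option, a function along these lines could turn a unit tuple such as `('km', '*', 'km')` into `km^2`. The name `simplify` is made up for illustration, and it only understands the `'*'` separator produced by the PoC, not division or conversions.

```python
from collections import Counter


def simplify(unit):
    """Collapse a unit tuple like ('km', '*', 'km') into a string like 'km^2'."""
    # Count each unit name, skipping the '*' separators.
    counts = Counter(token for token in unit if token != '*')
    parts = [name if n == 1 else f"{name}^{n}" for name, n in counts.items()]
    return '*'.join(parts)


squared = simplify(('km', '*', 'km'))
mixed = simplify(('km', '*', 's', '*', 'km'))
single = simplify(('km',))
```

Keeping this outside the core type means different libraries are free to choose their own canonical forms.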
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Mon, Apr 04, 2022 at 04:44:58PM -0700, Ethan Furman wrote:
Beyond that, do we make Python smart enough to, for example, convert `km*km` to `km^2`, or do we let libraries provide their own functions? I'm inclined to let libraries provide their own, as they could also implement unit conversions as desired.
There already are libraries for units in Python:

https://pypi.org/project/units/
https://pypi.org/project/quantities/
https://pypi.org/project/Unum/
https://pypi.org/project/Pint/

It would be kind of nice if there was a little less wild conjecture about units and more consideration of the state of the art :-)

(This is not aimed specifically at you Ethan, just a general observation of the discussion in this thread.)

-- Steve
![](https://secure.gravatar.com/avatar/cdc87637918eccd37ca88e9079e73705.jpg?s=120&d=mm&r=g)
On Mon, Apr 4, 2022 at 7:45 PM Ethan Furman <ethan@stoneleaf.us> wrote:
On 4/4/22 13:31, David Mertz, Ph.D. wrote:
You should probably change the thread subject to "All-and-only 7 SI units" if that's what you mean.
While I'm sure SI would be very useful, I suspect that any system will have to be useful for a much broader audience to be accepted; given the vast amount of units, I suspect individual libraries will be needed for each area/discipline.
That's pretty much exactly my point. Every time anyone points out complications or difficulties, Brian falls back to "we only care about the 7 SI units, all other uses are illegitimate." Since that is his position, he should state that in the thread. I think a LIBRARY (not syntax) that handles the real complication of units is an excellent thing. A couple do exist, but I'm sure they could be further improved.
![](https://secure.gravatar.com/avatar/9ea64fa01ed0d8529e4ae1b8873bb930.jpg?s=120&d=mm&r=g)
On Mon, Apr 4, 2022 at 11:55 AM Ken Kundert <python@shalmirane.com> wrote:
I think there is one more point worth making here. There is a suggestion that dimensional analysis can underpin a units system. Actually, the idea that all units can be broken down into a small set of fundamental units is very limiting and results in many vexing issues.
For example, consider currencies. There are currently hundreds of national currencies and thousands of cryptocurrencies. They all have the same basic fundamental unit of “value”, but value is only loosely defined. Furthermore, there is no fixed ratio between the currency and its value. It varies over time, over location, and from person to person.
Consider units of a particular commodity. The example of a ream of paper was recently mentioned. A ream is 500 sheets of paper. However two reams may not be comparable. They may refer to a different size of paper or a different quality of paper. So all prices for reams of paper would have the same fundamental units of “value per each”, but both “value” and “each” are not necessarily comparable. In effect, the fundamental unit system is not complete. You also need to include information about what you are measuring. For example, “each” could represent a single item of anything. The unit is not complete until you include a description of what that anything is, and in effect, there is an unlimited number of things it could be.
Now consider the issue of “unitless units”. In electrical circuits we often talk about gain, which is the ratio of the signal level at the output of a circuit to the signal level at the input. But you need to be specific about how you measure signal level. Typically, it would be a voltage, current, or power level. Consider a circuit where both the input and output are measured in volts. Then the gain would have units of "V/V", which is unitless in dimensional analysis. But units of "V/V" (voltage gain) are much different from units of "A/A" (current gain) or "W/W" (power gain), even though they have the same dimensions. Mixing them up results in errors and confusion. An additional complication is that sometimes logarithmic units are used. For example, decibels in voltage, or dBV, is 20*log(Vout/Vin). Again, a dimensionless quantity, but nonetheless "dBV" is much different from "V/V".
The same issue occurs with the arguments to trigonometric functions like sin(), cos() and tan(). Generally, we assume the arguments are given in radians, which is a dimensionless number. But it could also be given in degrees, another dimensionless number. Radians and degrees are indistinguishable from the perspective of dimensional analysis, but mixing them up results in errors.
Just lending support to all of these comments. These are universal problems and far far from specific to electrical engineering. In civil we talk about strains and moments; strains are unitless, but they still carry a unit (e.g., in/in) and those units are important. We also talk about moments, which are the same dimensions as for energy (FORCE X LENGTH) but a wholly different thing with different unit expression (e.g., kip X ft). And there is also torque, which is the same units as a moment, but a different concept and you probably shouldn't be adding them together even though they are exactly the same units (someone might want to argue with me about that, I'm unsure). There is also the 2nd moment of area per length, which is the same dimensions as volume but we express it in different units than volume (in4/in), and there is section modulus per length, which is the same dimensions as area but different units (in3/in), and area per length, which is the same as length, but different units (in2/in or in2/ft). There is also the concept of loads: pressure loads (lbs/sq ft or psi), linear loads (lbs/ft, kips/in, etc), and these probably come out to have the same dimensions are other concepts in other disciplines that should not to be added to them. Since the concept of what a unit means is very very complex, what Chris A is saying about punting these responsibilities out to others is very powerful idea. I'm uncertain whether the idea of just calling functions as the consequence of applying unit tag to a number is the right solution (but I'm not against it). I wonder if having a totally separated namespace for the "unit slot" makes some kind of sense. 
Maybe you say something like:

    from pint.SI import as units meter as m, kilogram as kg
    from pint.US_Customary import as units feet as ft, pounds_force as lbf

<<NOTE THE "*as units*" part above!!!>>

...and with the above "as units" imports you can say:

    m = "mosquito"
    lbf = "low battery freakout"
    kg = "kill goliath"
    ft = "foolish tolkien"

    x = 2m
    y = 5kg
    a = 4.0ft
    b = 4.1lbf

...etc. etc., and you have a unit namespace separate from the regular namespace. The m and lbf and ft and kg in these spaces never conflict.

--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
![](https://secure.gravatar.com/avatar/de311342220232e618cb27c9936ab9bf.jpg?s=120&d=mm&r=g)
On 4/4/22 08:54, Ken Kundert wrote:
Now consider the issue of “unitless units”. In electrical circuits we often talk about gain, which is the ratio of the signal level at the output of a circuit to the signal level at the input. But you need to be specific about how you measure signal level. Typically, it would be a voltage, current, or power level. Consider a circuit where both the input and output are measured in volts. Then the gain would have units of "V/V", which is unitless in dimensional analysis. But units of "V/V" (voltage gain) are much different from units of "A/A" (current gain) or "W/W" (power gain), even though they have the same dimensions. Mixing them up results in errors and confusion. An additional complication is that sometimes logarithmic units are used. For example, decibels in voltage, or dBV, is 20*log(Vout/Vin). Again, a dimensionless quantity, but nonetheless "dBV" is much different from "V/V".
[several other examples elided]

It seems to me that these "unitless" units actually have units, even if they *appear* to cancel each other out.

-- ~Ethan~
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Mon, Apr 04, 2022 at 11:05:54AM -0700, Ethan Furman wrote:
On 4/4/22 08:54, Ken Kundert wrote:
It seems to me that these "unitless" units actually have units, even if they *appear* to cancel each other out.
The term is *dimensionless* units. 1 dozen and 1 mole of objects both are dimensionless numbers, but if I asked for a dozen eggs and you gave me 1 mol (6.02 x 10^23) instead, I wouldn't have room to store them in the fridge. -- Steve
![](https://secure.gravatar.com/avatar/72ee673975357d43d79069ac1cd6abda.jpg?s=120&d=mm&r=g)
On 5/04/22 6:05 am, Ethan Furman wrote:
It seems to me that these "unitless" units actually have units, even if they *appear* to cancel each other out.
I think it's more that units alone don't capture everything that's important about the physical situation. Another example: the capacitance of a capacitor is the area of the plates divided by the distance between them and multiplied by a constant. Unit-wise, an area divided by a length is just a length, so the constant has units of farads per metre. But the metres don't correspond to a length you can measure anywhere. -- Greg
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Wed, Apr 06, 2022 at 12:28:49AM +1200, Greg Ewing wrote:
On 5/04/22 6:05 am, Ethan Furman wrote:
It seems to me that these "unitless" units actually have units, even if they *appear* to cancel each other out.
I think it's more that units alone don't capture everything that's important about the physical situation.
+1 This, a thousand times this!

Neither dimensional analysis nor units capture all the semantics of calculations of real quantities. Both are powerful tools, but they do not capture everything interesting in the universe.

For example, I'm going to try to draw some ASCII art of a rectangle with a diagonal line:

    +---+
    |  /|
    | / |
    |/  |
    +---+

We can capture the shape of the rectangle by giving the aspect ratio, let's say 1:2 using the convention width:height, which is another way of writing the fraction 1/2 = 0.5. Or we can give the gradient of the diagonal (rise over run), which is height/width, or 2. Or the angle made by the diagonal to the base, which would be arctan(2) in radians. Or the angle made by the diagonal to the vertical. In degrees.

All these things describe the same rectangle, but they are numerically distinct, and tracking units "cm/cm" isn't going to tell you whether the number you have is an aspect ratio, gradient, angle or something else.

-- Steve
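To make the point concrete, here is a small sketch that computes the four descriptions of the 1:2 rectangle mentioned above. The variable names are only illustrative; every result is a bare float with nothing to say which kind of dimensionless number it is.

```python
import math

# The 1:2 rectangle from the message (width:height).
width, height = 1.0, 2.0

aspect_ratio = width / height                # width over height
gradient = height / width                    # rise over run of the diagonal
angle_to_base = math.atan2(height, width)    # radians, equal to arctan(2)
angle_to_vertical = 90.0 - math.degrees(angle_to_base)  # degrees

# Four different numbers, all "cm/cm" as far as unit tracking is concerned.
for name, value in [
    ("aspect ratio", aspect_ratio),
    ("gradient", gradient),
    ("angle to base (rad)", angle_to_base),
    ("angle to vertical (deg)", angle_to_vertical),
]:
    print(f"{name:>24}: {value:.4f}")
```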
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
For example, consider currencies. There are currently hundreds of national currencies and thousands of cryptocurrencies. They all have the same basic fundamental unit of “value”, but value is only loosely defined. Furthermore, there is no fixed ratio between the currency and its value. It varies over time, over location, and from person to person.
Considered. Within a single currency system, units are well defined in a relative sense (i.e. - cents per dollar). But in an absolute sense, units of currency do not have infinite precision, which puts them outside the scope of this problem and proposal. Give financial folks support for dollars and cents, or other units in other currency systems if need be, but they're SOL as far as native language conversions between currencies go. They'll otherwise have to stick with unitless, implied numbers rather than explicit units. In other words, their world does not change.
Consider units of a particular commodity. The example of a ream of paper was recently mentioned. A ream is 500 sheets of paper. However two reams may not be comparable. They may refer to a different size of paper or a different quality of paper. So all prices for reams of paper would have the same fundamental units of “value per each”, but both “value” and “each” are not necessarily comparable. In effect, the fundamental unit system is not complete. You also need to include information about what you are measuring. For example, “each” could represent a single item of anything. The unit is not complete until you include a description of what that anything is, and in effect, there is an unlimited number of things it could be.
Considered and in my opinion should be rejected from inclusion in any sort of PEP that leads to native support of standard units, because these are not standard units. And supporting standard units in native language is not going to break anything that currently works.
Now consider the issue of “unitless units”. In electrical circuits we often talk about gain, which is the ratio of the signal level at the output of a circuit to the signal level at the input. But you need to be specific about how you measure signal level. Typically, it would be a voltage, current, or power level. Consider a circuit where both the input and output are measured in volts. Then the gain would have units of "V/V", which is unitless in dimensional analysis. But units of "V/V" (voltage gain) are much different from units of "A/A" (current gain) or "W/W" (power gain), even though they have the same dimensions. Mixing them up results in errors and confusion. An additional complication is that sometimes logarithmic units are used. For example, decibels in voltage, or dBV, is 20*log(Vout/Vin). Again, a dimensionless quantity, but nonetheless "dBV" is much different from "V/V".
I cannot think of a low level implementation of units that would support dimensional analysis and would also catch problems with adding a quantity expressed in "V/V" to a quantity expressed in "A/A". But like ChrisA pointed out, there is "float" and there is "Decimal". If there is a need for units that go beyond a low level implementation of SI units, then why should this be a blocker for native language support for units? At a low level, there are 7 dimensions that need to be accounted for in order to support anything that a higher level of code wants to do. There is even a low level way to tell Python how to interpret unitless dimensions using explicit casting, which covers a big chunk of what modules like QuantiPhy say they do, and more. What a low level, native representation of units cannot do is figure out what the preferred type of a unitless value like A/A, or even a unit value like N*m, should be after the result of a calculation. The best it can do is provide defaults, and leave it up to the programmer to explicitly cast where needed. But that is true even today with binary operations combining floats and ints. There is a default way that these operations are done, and if that operation does not meet your need, then you must explicitly cast.
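A rough sketch of what "7 dimensions at a low level" could look like, and why it cannot tell V/V from A/A. Everything here (the `Quantity` class, the `VOLT` and `AMPERE` tuples, the ordering of exponents) is a hypothetical illustration, not a proposed design: each value carries the exponents of the seven SI base units, and division subtracts them.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical layout: exponents of the SI base units (m, kg, s, A, K, mol, cd).
VOLT = (2, 1, -3, -1, 0, 0, 0)    # V = kg * m^2 * s^-3 * A^-1
AMPERE = (0, 0, 0, 1, 0, 0, 0)


@dataclass(frozen=True)
class Quantity:
    value: float
    dim: Tuple[int, ...]

    def __add__(self, other: "Quantity") -> "Quantity":
        # Addition only requires the dimension exponents to match.
        if self.dim != other.dim:
            raise TypeError(f"cannot add dimensions {self.dim} and {other.dim}")
        return Quantity(self.value + other.value, self.dim)

    def __truediv__(self, other: "Quantity") -> "Quantity":
        # Division subtracts exponents, so V/V and A/A both become all zeros.
        dim = tuple(a - b for a, b in zip(self.dim, other.dim))
        return Quantity(self.value / other.value, dim)


voltage_gain = Quantity(10.0, VOLT) / Quantity(2.0, VOLT)      # "V/V"
current_gain = Quantity(3.0, AMPERE) / Quantity(1.0, AMPERE)   # "A/A"

# Both gains are dimensionless, so this addition is accepted without complaint.
print(voltage_gain + current_gain)
```

Adding a voltage to a current is rejected with a `TypeError` in this sketch, but the two gains are indistinguishable once the exponents cancel, which is the limitation described above.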
The same issue occurs with the arguments to trigonometric functions like sin(), cos() and tan(). Generally, we assume the arguments are given in radians, which is a dimensionless number. But it could also be given in degrees, another dimensionless number. Radians and degrees are indistinguishable from the perspective of dimensional analysis, but mixing them up results in errors.
Considered, and this is tricky. I do not have an answer at the moment.
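For reference, this is how the radians/degrees mix-up shows up with the standard `math` module today. It is only an illustration of the problem, not a suggested fix.

```python
import math

angle = 90  # meant as degrees, but nothing attached to the number says so

wrong = math.sin(angle)                # interpreted as 90 radians
right = math.sin(math.radians(angle))  # explicitly converted to radians first

print(wrong)  # roughly 0.894
print(right)  # 1.0
```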
This is not to knock the idea of dimensional analysis. It is just not something that would be done in most programs that process physical quantities. Rather it is something that is largely done as a one-time check on your analysis. It is a “second opinion” on whether your hand calculations are correct, or at least plausible. So dimensional analysis packages such as pint have their place, but dimensional analysis is not something that belongs in the base language or even the standard library.
So you're saying that because you can't implement all possible aspects of dimensional analysis at a native level, it is not worth implementing any?
However I do believe that a case can be made to allow numbers to be easily tagged with units in the base language, and then allowing those units to be accessed as an attribute of the number. Packages such as pint and QuantiPhy could then use that attribute to provide processing of units that is appropriate for the particular application.
This is QuantiPhy in action:

```
import quantiphy

A = quantiphy.Quantity('2m')
B = quantiphy.Quantity('5m')
A + B        # 0.007
type(A)      # <class 'quantiphy.Quantity'>
type(A + B)  # <class 'float'>
```

How is that useful? It does not even come close to solving problems that I have encountered in the real world. pint has struggled with NumPy, and as far as getting colleagues that I have worked with to use it goes, they stop dead in their tracks as soon as they see dot-notation. And I can understand the reaction of "well, if they don't like dot-notation, and therefore 90% of what makes Python Python, then why help them?" Because I'm a people pleaser, because helping them helps me, and because why not?
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Sun, Apr 03, 2022 at 10:42:16PM -0400, Ricky Teachey wrote:
I was cheerleading this effort earlier and I still think it would be a massive contribution to the needs of the engineering world to solve this problem at the language level. But boy howdy is it a tough nut of a problem to crack.
More than 35 years of prior art says hello.

My HP-28C calculator supported unit conversion in the mid 1980s, and a few years later HP were offering calculators that supported arithmetic on units.

If you want to see some prior art, check out the chapters on Units here:

https://web.archive.org/web/20150608024051/http://www.hp41.net/forum/fileshp...
http://h10032.www1.hp.com/ctg/Manual/c00442266.pdf

If you are on a Linux or Unix system, you can check out the "units" program:

    [steve@ando ~]$ units "3 ounces * 200 furlongs per fortnight" "kg m/s"
            * 0.0028288774
            / 353.49711

Then there is also Frink:

https://frinklang.org/

-- Steve
![](https://secure.gravatar.com/avatar/9ea64fa01ed0d8529e4ae1b8873bb930.jpg?s=120&d=mm&r=g)
On Mon, Apr 4, 2022 at 2:24 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Apr 03, 2022 at 10:42:16PM -0400, Ricky Teachey wrote:
I was cheerleading this effort earlier and I still think it would be a massive contribution to the needs of the engineering world to solve this problem at the language level. But boy howdy is it a tough nut of a problem to crack.
More than 35 years of prior art says hello.
My HP-28C calculator supported unit conversion in the mid 1980s, and a few years later HP were offering calculators that supported arithmetic on units.
If you want to see some prior art, check out the chapters on Units here:
https://web.archive.org/web/20150608024051/http://www.hp41.net/forum/fileshp...
http://h10032.www1.hp.com/ctg/Manual/c00442266.pdf
If you are on a Linux or Unix system, you can check out the "units" program:
    [steve@ando ~]$ units "3 ounces * 200 furlongs per fortnight" "kg m/s"
            * 0.0028288774
            / 353.49711
Then there is also Frink: https://frinklang.org/
-- Steve
These are cool finds. There is also Mathcad, which beautifully solved the units problem well over 2 (3?) decades ago. Of course a Mathcad seat ain't cheap. And there's Maple, and matlab. All of these probably have things to teach us. Here is a screenshot of the Mathcad units system definition interface; it doesn't solve all the problems, but it solves many. [image: image.png]
![](https://secure.gravatar.com/avatar/72ee673975357d43d79069ac1cd6abda.jpg?s=120&d=mm&r=g)
On 4/04/22 2:42 pm, Ricky Teachey wrote:
height = 5ft + 4.5in
Surely we ought to be able to add these values. But what should the resulting tag be?
One answer might be that the tag only tracks what kind of quantity it is -- length, mass, time, etc. Internally the number would be represented in a standard unit for each quantity. Whether to display it as feet or inches or a combination of both would then be a matter of formatting. A default could be chosen based on the magnitude of the number, and if you wanted something else you would have to specify it, just as with any other formatting decision.
Should we be able to write it like this?
height = 5ft 4.5in
With my "no new syntax" suggestion there would be no question here -- the only way to write it would be

    height = 5 * ft + 4.5 * in

-- Greg
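A minimal sketch of the scheme described in this message, under some assumptions of my own: the `Length` class, the choice of metres as the internal standard unit, and the `"ft-in"` format code are all hypothetical. Note also that `in` is a reserved word in Python, so the sketch has to call the unit `inch`.

```python
class Length:
    """A length stored internally in metres, whatever unit it was written in."""

    def __init__(self, metres: float) -> None:
        self.metres = metres

    def __add__(self, other: "Length") -> "Length":
        return Length(self.metres + other.metres)

    def __rmul__(self, factor: float) -> "Length":
        # Supports `5 * ft`: the plain number on the left defers to this method.
        return Length(factor * self.metres)

    def __format__(self, spec: str) -> str:
        # Display is a formatting decision; "ft-in" is a made-up format code.
        if spec == "ft-in":
            feet, inches = divmod(self.metres / 0.0254, 12)
            return f"{int(feet)} ft {inches:.1f} in"
        return f"{self.metres:.4f} m"   # default: the internal unit


ft = Length(0.3048)
inch = Length(0.0254)

height = 5 * ft + 4.5 * inch
print(f"{height}")         # 1.6383 m
print(f"{height:ft-in}")   # 5 ft 4.5 in
```

The sum never needs to decide between feet and inches; that choice only appears when the value is formatted.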
![](https://secure.gravatar.com/avatar/8da339f04438d3fcc438e898cfe73c47.jpg?s=120&d=mm&r=g)
Greg Ewing writes:
With my "no new syntax" suggestion there would be no question here -- the only way to write it would be
height = 5 * ft + 4.5 * in
I'm very sympathetic to the "no new syntax" suggestion, but suppose I wanted to know how many cm there are in an in:

    cm_per_in = 1 * in / 1 * cm

Of course that's a silly mistake, but the (sole, IMO) advantage of the original proposal is that you can't make that silly mistake.
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Wed, Apr 06, 2022 at 12:30:47AM +0900, Stephen J. Turnbull wrote:
Greg Ewing writes:
With my "no new syntax" suggestion there would be no question here -- the only way to write it would be
height = 5 * ft + 4.5 * in
I'm very sympathetic to the "no new syntax" suggestion, but suppose I wanted to know how many cm there are in an in:
cm_per_in = 1 * in / 1 * cm
inch.definition() inch.convert(cm)
Of course that's a silly mistake, but the (sole, IMO) advantage of the original proposal is that you can't make that silly mistake.
Don't worry, I'm sure it will allow its own distinct silly mistakes :-) -- Steve
![](https://secure.gravatar.com/avatar/72ee673975357d43d79069ac1cd6abda.jpg?s=120&d=mm&r=g)
On 6/04/22 3:30 am, Stephen J. Turnbull wrote:
suppose I wanted to know how many cm there are in an in:
cm_per_in = 1 * in / 1 * cm
Of course that's a silly mistake, but the (sole, IMO) advantage of the original proposal is that you can't make that silly mistake.
Well, you can make the same silly mistake any time you mix multiplication and division in an expression, so we should really be lobbying to fix the borked precedence of * and / in general. :-) -- Greg
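The mistake under discussion can be reproduced with a toy class that tracks only the exponent of length. The class `Q` and its fields are hypothetical, and `inch` stands in for `in`, which is a Python keyword. Because `*` and `/` have equal precedence and group left to right, `1 * inch / 1 * cm` is evaluated as `((1 * inch) / 1) * cm`.

```python
class Q:
    """A number plus the power of length it carries (1 = length, 2 = area)."""

    def __init__(self, value: float, length_power: int = 0) -> None:
        self.value = value
        self.length_power = length_power

    @staticmethod
    def _wrap(x):
        # Treat plain numbers as dimensionless quantities.
        return x if isinstance(x, Q) else Q(x)

    def __mul__(self, other):
        other = Q._wrap(other)
        return Q(self.value * other.value, self.length_power + other.length_power)

    __rmul__ = __mul__  # multiplication is commutative here

    def __truediv__(self, other):
        other = Q._wrap(other)
        return Q(self.value / other.value, self.length_power - other.length_power)

    def __repr__(self) -> str:
        return f"Q({self.value!r}, length**{self.length_power})"


inch = Q(0.0254, 1)  # stored in metres
cm = Q(0.01, 1)

mistake = 1 * inch / 1 * cm        # ((1 * inch) / 1) * cm: an area
intended = (1 * inch) / (1 * cm)   # a dimensionless ratio, about 2.54

print(mistake)
print(intended)
```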
![](https://secure.gravatar.com/avatar/cdc87637918eccd37ca88e9079e73705.jpg?s=120&d=mm&r=g)
While units are certainly useful, I think this is FAR too large to add to Python syntax. A library like "units" is great, but it really needs to be a library, and not a small one.

There are thousands of units in use in sciences, commerce, engineering, ordinary life, etc. In all of them, it is not uncommon for the same few letters to represent multiple things, depending on context.

But as well as the sheer number, what is convertible to what, and in what context, is very nuanced.

An electron volt is a unit of energy. Or of mass. Or of momentum. It's usual to convert among these meanings for these units. But it borders on nonsense to convert among kilowatt-hours and grams. It depends on the practical context of units.

Even in Ethan's example, he uses kilogram as a unit of weight. It answers a question with the word weight in it (albeit 300 kg is extreme for a human weight, maybe for brick pallets). But kg IS NOT a unit of weight. It's a unit of mass. In the context of things on the surface of the earth near the same elevation, we might convert between kg and lbs. But in other contexts it makes no sense.

I'm older than 35 (Earth) years, but my age is also counted differently according to different calendars and cultural conventions. Of his examples, °F is the only one that is completely convertible to other temperature scales, but even there, it's rare to see a bread recipe in degrees Kelvin.

Units like time intervals and currencies are EXCEEDINGLY complicated. The conversion rate between Dirham and USD fluctuates by microsecond, but can also have two simultaneous values on different exchanges. At the least, a unit of currency has built into it an absolute time, but also often a relative time to a specific reference time.

... So then we get to the point of saying "don't do any of that, just annotate the number with an arbitrary string/object." Which sure, is possible. Not even difficult.
But we can also do it already without syntax by subclassing int, or float, or complex, or Decimal, or Fraction. Or we can do it just with name conventions too. Perhaps 'person_kg' is just a numeric variable reminding us of its unit. On Sun, Apr 3, 2022, 6:25 PM Chris Angelico <rosuav@gmail.com> wrote:
On Mon, 4 Apr 2022 at 07:45, Ethan Furman <ethan@stoneleaf.us> wrote:
Mechanically, are `lbs`, `km`, `hr`, etc., something that is imported, or are they tags attached to the numbers? If attached to the numbers, memory size would increase and performance might decrease -- but, how often do we have a number that is truly without a unit?
How old are you? 35 years How much do you weigh? 300 kg What temperature do you cook bread at? 350 F
Very frequently - it'd be called an index. (What sort of numbers should enumerate() return, for instance? Clearly that, whatever it is, is an index.) But if every int and float has a tag attached to it, it's not that big a deal to have either a default tag, or leave the field NULL or None, to define it to be a unitless value or index.
ChrisA
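Two ideas from this message, annotating a number by subclassing `float` and treating a missing tag as "unitless value or index", can be sketched in a few lines. The `Tagged` class and its `unit` attribute are hypothetical names chosen for the illustration.

```python
class Tagged(float):
    """A float carrying an optional unit tag; None means unitless or an index."""

    def __new__(cls, value, unit=None):
        obj = super().__new__(cls, value)
        obj.unit = unit
        return obj

    def __repr__(self) -> str:
        number = float.__repr__(self)
        return number if self.unit is None else f"{number} {self.unit}"


weight = Tagged(300, "kg")
index = Tagged(3)          # no tag: a plain count or index

print(repr(weight), weight.unit)
print(repr(index), index.unit)

# Arithmetic falls back to float's own methods, so the tag is silently dropped.
total = weight + 1
print(type(total))
```

The last line shows the limit of pure annotation: unless the arithmetic methods are overridden as well, the result of any operation is an ordinary `float` with no unit attached.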
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Mon, 4 Apr 2022 at 13:45, David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
While units are certainly useful, I think this is FAR too large to add to Python syntax. A library like "units" is great, but it really needs to be a library, and not a small one.
There are thousands of units in use in sciences, commerce, engineering, ordinary life, etc. In all of them, it is not uncommon for the same few letters to represent multiple things, depending on context.
But as well as the sheer number, what is convertible to what, and in what context, is very nuanced.
And that's exactly why I think that the *concept* of units could be added to the language, with some syntax and low-level semantics, but the actual units themselves belong in libraries. ChrisA
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
Native level support for units would be much more powerful and be much more worth the effort than libraries. Libraries already exist, and they are not sufficient.
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Mon, Apr 04, 2022 at 01:53:49PM +1000, Chris Angelico wrote:
And that's exactly why I think that the *concept* of units could be added to the language, with some syntax and low-level semantics, but the actual units themselves belong in libraries.
+1 on that.

I'm not even sure about the need for syntax beyond what we already have. Yes, it would be nice to write:

    speed = 15 ft 3 in / 3 sec

but is it really so painful to use existing syntax?

    speed = (15*ft + 3*in) / (3*sec)

I don't think so.

-- Steve
![](https://secure.gravatar.com/avatar/9ea64fa01ed0d8529e4ae1b8873bb930.jpg?s=120&d=mm&r=g)
On Tue, Apr 5, 2022 at 7:43 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Apr 04, 2022 at 01:53:49PM +1000, Chris Angelico wrote:
And that's exactly why I think that the *concept* of units could be added to the language, with some syntax and low-level semantics, but the actual units themselves belong in libraries.
+1 on that.
I'm not even sure about the need for syntax beyond what we already have. Yes, it would be nice to write:
speed = 15 ft 3 in / 3 sec
but is it really so painful to use existing syntax?
speed = (15*ft + 3*in) / (3*sec)
I don't think so.
-- Steve
The motivation is much more than just being able to not have the * symbol (though all things being equal it would be nice). I think I've mostly disagreed with Brian McCall on just about every DETAIL he has expressed regarding what "language level unit system support" ought to look like, but he had it right in his first post that spurred this discussion. I'll quote bits of it: On Sun, Apr 3, 2022 at 2:56 PM Brian McCall <brian.patrick.mccall@gmail.com> wrote:
...I have spent a fair amount of my own time, and I have seen so many others' time wasted because command line or input fields do not include units, or the inputs field units are accidentally handled with different units, or units are not used at all.
I get the sentiment that Python, or programming languages in general, are not meant to deal with units. From the perspective of a computer scientist, I can understand why this would be seen as a level of abstraction too high for programming languages and core libraries to aspire to. But from the perspective of a scientist or engineer, units are a CORE part of language. Anyone who has taken science or engineering classes in college knows what happens when you turn in homework with missing units in your answers - zero credit. Anyone who has worked out complicated calculations by hand, or with the help of packages like "units" knows the sinking feeling and the red flags raised when your answer comes out in the wrong units.
There has also been a shift in the expectations of scientists and engineers regarding their programming capabilities. A generation ago, a good many of them would not be expected to use their computers for anything more than writing documents, crunching numbers in a spreadsheet, or using a fully integrated task-specific application for which their employer paid dearly. These assumptions were codified in workflows and job descriptions. Today, if your workflow, especially in R&D, has a gap that Microsoft Office or task-specific software doesn't solve for you, then you are pretty much expected to write your own code. Job postings for engineering roles (other than software engineering) regularly include programming in their required skills. Software design, on the other hand, is rarely a required or hired skill. And even though these scientists and engineers are required to know how to program, they are almost never *paid* to write code. Spending any more time than needed writing code, even if it is to fill a critical gap in a workflow, is seen as a negative. So software design best practices are non-existent. All of this leads to very poor practices around and improper handling of an absolutely essential part of scientific and engineering language - units.
...The lack of native language support for SI units is a problem for an entire segment of programmers. Programming languages took a big step forward in deciding that EVERYTHING is a pointer/reference, and EVERYTHING is an object. They need to take another step forward to say that EVERY number has a unit, including "unitless". Not having this language feature is becoming (or already is) a problem. The question is, is it Python's problem?
And I said:
The old engineering disciplines- mine (civil engineering), structural, electrical, etc- are the next frontier in the "software eats the world" revolution, and they desperately need a language with native units support....
Python SHOULD be that language we do this with. It is awesome in every other way. But if it isn't DEAD SIMPLE to use units in python, it won't happen.
Steven: I am telling you that there is a HUGE NEED and DESIRE for what we are talking about above. the need to automate design processes in the "i need you to get a set of calculations and drawings for this complicated project to me, TODAY, mr engineer" electrical, structural, chemical, industrial/manufacturing, mechanical, and general civil (environmental, water/wastewater, geotechnical, chemical, transportation) fields is monstrous. i said above it is "the next frontier". but it won't be unless these men and women get the tools they need. someone is going to fill the need. units are at THE CORE of that need. i think python should be the language we reach for. i have made python work for me as a civil engineer and been extremely successful with it for all the usual reasons: ease of learning and community backing that learning, the open source resources (libraries and applications), the momentum of the language, its ability to be a swiss army knife (need to transition to web? automate the boring thing? sure, easy). but i do not think the people in the disciplines listed above are flocking to python like they should. almost nobody in the PSF surveys answers "civil engineer" when they take the survey (i do, every year!). it should be hundreds, maybe thousands. and a big big part is that using units is too hard. we need to make it easier. how? i don't know. i'm not a software engineer. but i am telling y'all, it's too hard.
![](https://secure.gravatar.com/avatar/d995b462a98fea412efa79d17ba3787a.jpg?s=120&d=mm&r=g)
On Tue, 5 Apr 2022 at 14:25, Ricky Teachey <ricky@teachey.org> wrote:
units are at THE CORE of that need.
i think python should be the language we reach for. i have made python work for me as a civil engineer and been extremely successful with it for all the usual reasons: ease of learning and community backing that learning, the open source resources (libraries and applications), the momentum of the language, its ability to be a swiss army knife (need to transition to web? automate the boring thing? sure, easy).
I've been reading this discussion with gradually increasing levels of bemusement. I genuinely had no idea that handling units was so complex. But one thing I did note was that there are various libraries (at least one, I think I saw more than one mentioned) that do units handling. Why are those libraries insufficient? You said that "The motivation is much more than just being able to not have the * symbol", but no-one seems to have explained why a library isn't enough. After all, scientists manage with numpy being a library and not a core feature. Data scientists manage with tensorflow being a library. What's not sufficient for unit support to be a library? (And remember, the numeric users successfully got the @ operator added to the language by arguing from the basis of it being a sufficient enhancement to improve the experience of using numpy, after years of having requests for general "matrix operations" rejected - language changes are *more likely* based on a thriving community of library users, so starting with a library is a positive way of arguing for core changes). Paul
![](https://secure.gravatar.com/avatar/8da339f04438d3fcc438e898cfe73c47.jpg?s=120&d=mm&r=g)
I'll start with some opinionated stuff, but possibly helpful attempts to understand the requirements follow from about the middle of the post. Ricky Teachey writes:
[Brian McCall] had it right in his first post that spurred this discussion. I'll quote bits of it:
...I have spent a fair amount of my own time, and I have seen so many others' time wasted because command line or input fields do not include units, or the inputs field units are accidentally handled with different units, or units are not used at all.
This has zip-o to do with having quantity (number-with-unit) objects as part of the language. It's all about parsing natural language, or providing the user with sufficient hints about the expected input. Ie, it's UI/UX.
I get the sentiment that Python, or programming languages in general, are not meant to deal with units.
I have no idea where this came from. The plethora of units libraries both on PyPI and in various folks' personal toolkits gives it the lie.
Anyone who has taken science or engineering classes in college knows what happens when you turn in homework with missing units in your answers - zero credit. Anyone who has worked out complicated calculations by hand, or with the help of packages like "units"
Oops, we just let the cat out of the bag. How hard is it to use the units library? What is hard about it? Sure, it's an annoying amount of setup, but do it once for your field and copy-paste it (or import from your toolkit) and the only differences between using units and having user-defined literals are typing * instead of _, and the possible embarrassment of typing "cm_per_in = 1 * in / 1 * cm" and getting units meters**2 (but you immediately know you did something wrong).
...The lack of native language support for SI units is a problem for an entire segment of programmers.
This is the XY problem. We'll all concede that units support is nice to have, and if you want to consider the lack of units support to be existential, fine by us. But you're going to have to do a ton of work to show that it needs to be *native* support (ie, in the language), that it requires new syntax.
They need to take another step forward to say that EVERY number has a unit, including "unitless".
I disagree, but it's not open and shut -- I at least am open to evidence of this claim. Suppose we do need that. Even so, it's not clear to me that this shouldn't be handled the way we handle typing, that is, by treating a quantity of 1 kg as being a different type from 1 m, and inferring types. (Example below.)
The old engineering disciplines- mine (civil engineering), structural, electrical, etc- are the next frontier in the "software eats the world" revolution, and they desperately need a language with native units support....
Again, nobody has provided any foundation for the claim that they *need* *native* support, except that C++ jumped off that bridge so Python should too. What follows are questions that need to be answered to implement the feature well. They shouldn't be taken as "YAGNI".
Python SHOULD be that language we do this with. It is awesome in every other way. But if it isn't DEAD SIMPLE to use units in python, it won't happen.
What does "dead simple" mean? Do you expect people to be using Python as a calculator so that the majority of their typing involves literal quantities (ie, numbers with explicit units)? Or is the more common use case writing

    def foo(v: Volume, d: Density) -> Mass:
        return v * d

and having an I/O functionality that can convert strings or csv files to Volume and Density objects, with Mass objects having a nice __str__? Then in the rare case where you need to specify a quantity (eg, demoing at the interpreter prompt) you have to write

    foo(2.34 * m * m * m, 0.06 * kg / (m * m * m))

Is that too "difficult"? If it is, how about

    m3 = m * m * m
    foo(2.34 * m3, 0.06 * kg / m3)

By the way, this is possible right now, mypy will check it for you. (Of course somebody must define the classes Volume, Density, and Mass, and write the I/O and __str__ functions. But only the I/O is at all hard.)
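One way the "units as types" idea might look, as a deliberately small sketch. The field names (`m3`, `kg_per_m3`, `kg`) and the use of frozen dataclasses are my assumptions, and the quantities are built with constructors rather than with `m` and `kg` unit objects, to keep the example short. The point is that `Volume * Density` is annotated to return `Mass`, so a static checker such as mypy can flag a call that passes the wrong kind of quantity.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Mass:
    kg: float

    def __str__(self) -> str:
        return f"{self.kg:g} kg"


@dataclass(frozen=True)
class Density:
    kg_per_m3: float


@dataclass(frozen=True)
class Volume:
    m3: float

    def __mul__(self, other: Density) -> Mass:
        # Only Volume * Density is defined; anything else is rejected.
        if not isinstance(other, Density):
            return NotImplemented
        return Mass(self.m3 * other.kg_per_m3)


def foo(v: Volume, d: Density) -> Mass:
    return v * d


result = foo(Volume(2.34), Density(0.06))
print(result)

# A type checker should reject a call such as foo(Density(0.06), Volume(2.34)),
# since the argument types are swapped. At runtime that call also fails,
# because Density defines no multiplication in this sketch.
```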
a big big part is that using units is too hard.
I have no idea what you mean by "using units". Abstractly, yes, but what programs are you going to write? Are they going to be littered with literal quantities, as Brian's proposal for custom literals suggests? Or is it just important that when you operate on some quantities, their units are checked for compatibility, and results can be easily checked for appropriate units if the programmer wants to? Do people often confuse moments, energy, and torque in their computations so that we need to distinguish them in our unit libraries, or does that kind of thing happen so rarely that cancelling SI units and prefixes wherever possible would give good enough results? (Note that the "units as types" approach can prevent you from substituting energy for torque.) In large data sets, are variables polymorphic so that every value must be a quantity, or can NumPy tag individual series with their type (= unit), and the values can be just numbers?
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
There are thousands of units in use in sciences, commerce, engineering, ordinary life, etc. In all of them, it is not uncommon for the same few letters to represent multiple things, depending on context.
Units in science and engineering are NOT AT ALL ambiguous. If they were, planes would be crashing every day. If we consider distinct sets of units, such as SI, there are guaranteed to not be any collisions in notation. Even English/Imperial and SI units do not conflict as far as I know. Commerce (finance?) is another story, so yes, there need to be groupings (i.e. - namespaces) of such things.
An electron volt is a unit of energy. Or of mass. Or of momentum. An electron volt is a unit of energy and only a unit of energy. Knowing a particle's energy (in certain situations) means that you also know other physical quantities about that object, and so in casual conversation (and the occasional poorly reviewed journal article) you find them used interchangeably. But that does not mean that an electron volt is a unit of mass. It just isn't. These units are set by standards. Standards do not leave any room for ambiguity. The only context needed is what standard applies. And there are LOTS of people who are familiar with these standards who could be called upon to lend a hand.
I'm older than 35 (Earth) years, but my age is also counted differently according to different calendars and cultural conventions. Of his examples, °F is the only one that is completely convertible to other temperature scales, but even there, it's rare to see a bread recipe in degrees Kelvin.
Dates and years are not standard units of time. The only SI unit of time is the "s" (second). The `datetime` object today is composed of unitless primitives and would continue to be so. The value returned by time.time() could remain unitless, or it could have a unit of seconds. Either one would make sense. So this is also not a problem.
Units like time intervals and currencies are EXCEEDINGLY complicated. The conversion rate between Dirham and USD fluctuates by the microsecond, but can also have two simultaneous values on different exchanges. At the least, a unit of currency has built into it an absolute time, but also often a relative time to a specific reference time.
Would it help if we stopped saying "units" and instead referred to "standard units"?
![](https://secure.gravatar.com/avatar/cdc87637918eccd37ca88e9079e73705.jpg?s=120&d=mm&r=g)
On Mon, Apr 4, 2022, 12:53 AM Brian McCall
An electron volt is a unit of energy. Or of mass. Or of momentum. An electron volt is a unit of energy and only a unit of energy. Knowing a particle's energy (in certain situations) means that you also know other physical quantities about that object, and so in casual conversation (and the occasional poorly reviewed journal article) you find them used interchangeably.
This is just flatly wrong of usage in particle physics. Electron volts are precisely the default units used to describe the mass of subatomic particles.
Would it help if we stopped saying "units" and instead referred to "standard units"?
Yes, limiting the idea to "SI units" would cover far less. And thereby have far less motivation to change Python syntax rather than use a library.
![](https://secure.gravatar.com/avatar/1b5e7cdeb75cd31194b82101a245f8a6.jpg?s=120&d=mm&r=g)
On 4/4/22 07:25, David Mertz, Ph.D. wrote:
On Mon, Apr 4, 2022, 12:53 AM Brian McCall
> An electron volt is a unit of energy. Or of mass. Or of momentum. An electron volt is a unit of energy and only a unit of energy. Knowing a particle's energy (in certain situations) means that you also know other physical quantities about that object, and so in casual conversation (and the occasional poorly reviewed journal article) you find them used interchangeably.
This is just flatly wrong of usage in particle physics. Electron volts are precisely the default units used to describe the mass of subatomic particles.
I beg to disagree here: mass is measured in eV / c^2, and momentum in eV / c. (Although, as Brian says, we're all guilty of taking shortcuts in conversations.)
Would it help if we stopped saying "units" and instead referred to "standard units"?
Yes, limiting the idea to "SI units" would cover far less. And thereby have far less motivation to change Python syntax rather than use a library.
-- =============================================================================== Luca Baldini Universita' di Pisa and Istituto Nazionale di Fisica Nucleare - Sezione di Pisa Largo Bruno Pontecorvo 3, I-56127, Pisa, ITALY. phone : +39 050 2214438 fax : +39 050 2214317 e-mail :luca.baldini@pi.infn.it icq : 396247302 (Garrone) web :http://www.df.unipi.it/~baldini mirror :http://www.pi.infn.it/~lbaldini ===============================================================================
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Mon, Apr 04, 2022 at 07:42:37AM +0200, Luca Baldini wrote:
I beg to disagree here: mass is measured in eV / c^2, and momentum in eV / c. (Although, as Brian says, we're all guilty of taking shortcuts in conversations.)
I beg to differ. *Real* physicists measure everything in terms of powers of length: https://en.wikipedia.org/wiki/Geometrized_unit_system *half a wink* -- Steve
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Mon, 4 Apr 2022 at 15:26, David Mertz, Ph.D. <david.mertz@gmail.com> wrote:
On Mon, Apr 4, 2022, 12:53 AM Brian McCall
An electron volt is a unit of energy. Or of mass. Or of momentum. An electron volt is a unit of energy and only a unit of energy. Knowing a particle's energy (in certain situations) means that you also know other physical quantities about that object, and so in casual conversation (and the occasional poorly reviewed journal article) you find them used interchangeably.
This is just flatly wrong of usage in particle physics. Electron volts are precisely the default units used to describe the mass of subatomic particles.
Not a particle physicist, so I don't know what the usage actually is, but wouldn't mass actually be eV/c²? If that's frequently written as simply "eV", then that's another example of common non-SI usage that really should be supportable, but only within its context. ChrisA
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
This is just flatly wrong of usage in particle physics. Electron volts are precisely the default units used to describe the mass of subatomic particles. I don't know what to tell you man. Here's Wikipedia. If you follow the link to the actual SI publication, it says the same thing. How something is used is not the same as how it is defined. I might use my car key to open my mail, but if I ask someone if they've seen my letter opener, they're probably not going to be able to help me find my car keys.
https://en.wikipedia.org/wiki/Non-SI_units_mentioned_in_the_SI
Yes, limiting the idea to "SI units" would cover far less. And thereby have far less motivation to change Python syntax rather than use a library.
Not sure what you mean by "cover far less". Even implementing just the 29 base and derived SI units would have a profound impact on the work done by engineers and scientists on a day to day basis. And the impact of such a change will only grow more in importance over time. As I mentioned in another reply, adding in the remaining SI units AND the Imperial / US Customary still only brings up a grand total of 160 units. There are other systems too, but they all use units that are defined elsewhere. FWIW, I do not subscribe to the mindset of "Well, I had to do it this way, so why can't you kids learn to do it this way too?"
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Mon, 4 Apr 2022 at 16:41, Brian McCall <brian.patrick.mccall@gmail.com> wrote:
How something is used is not the same as how it is defined. I might use my car key to open my mail, but if I ask someone if they've seen my letter opener, they're probably not going to be able to help me find my car keys.
+1 QOTW!
Not sure what you mean by "cover far less". Even implementing just the 29 base and derived SI units would have a profound impact on the work done by engineers and scientists on a day to day basis. And the impact of such a change will only grow more in importance over time. As I mentioned in another reply, adding in the remaining SI units AND the Imperial / US Customary still only brings up a grand total of 160 units. There are other systems too, but they all use units that are defined elsewhere.
FWIW, I do not subscribe to the mindset of "Well, I had to do it this way, so why can't you kids learn to do it this way too?"
The 160 units would be more likely to have collisions though. Also, the base and derived SI units will be used with magnitude prefixes, which increases the effective number of collision chances. So I do still think they need to be kept separate - also because it allows units that Python never thought of. ChrisA
![](https://secure.gravatar.com/avatar/cdc87637918eccd37ca88e9079e73705.jpg?s=120&d=mm&r=g)
On Mon, Apr 4, 2022 at 3:53 AM Chris Angelico <rosuav@gmail.com> wrote:
The 160 units would be more likely to have collisions though. Also, the base and derived SI units will be used with magnitude prefixes,
The supposed 160 are far fewer than I use on a daily (or at least weekly) basis. Yes, all the rest are convertible, but that's not how they're actually used. A hectare or acre represents the square of a unit of distance (and a constant). But no one (realtors or land surveyors) goes through the large inconvenience of converting units, constants, and multipliers, when they work with the units. And no one baking a cake cares that you can cube a distance to get a (dry) volume of flour. Reams are a "unitless" constant of 500, but no one ever uses them for things other than paper... plus the fact that they used to be 480, so the unitless number itself has an implicit date attached to it. Yes... if you ignore the real world, and real users of Python, units can be made so simple as to potentially be syntax. In the real world, and as I stated, 160 is at least an order of magnitude too low. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
Back in my original post, I pointed out that engineers and scientists in their modern day workflows are expected to have basic programming language skills, and are expected to use those skills when pre-packaged software solutions leave gaps in their workflows, but they are explicitly told that they are not "paid to write code", leading to bad programming practices and wasted hours and (unpaid) overtime resolving issues that could have been avoided if there was an implementation of scientific and engineering units that was too easy to use to be ignored. This is a problem that I have run into dozens of times, and at more than one place. So yes, I am thinking about the real world. This is not an ivory tower fantasy. If a baker is writing python code, it's either a hobby or something that they plan to sell or are getting paid to do. Inventory systems have their own concept of units. But they also have dedicated teams of software engineers to handle whatever conversions and representations they need, and will continue to do so with unitless quantities. As I said originally, there is a real-world need for native or like-native support for standard scientific and engineering units, and it is growing. The only question is, should Python provide a solution? I can see that your answer is no, because it doesn't meet the needs of the bakers. I don't expect everyone to see updating or creating a new programming language to be the answer. But, no offense, the reasoning you have given for your opinion is not something that I can take seriously.
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
There are thousands of units in use in sciences, commerce, engineering

In the SI system, there are:

- 7 base SI standard units
- 22 named, derived standard units
- 14 alternative standard units of measure that are commensurate with one of the 29 base and derived units, and
- ~35 units (not symbols) that are defined, but not officially sanctioned and having potential lexicographical conflicts with other standard units

... for a total of 78 in the SI system.
The Imperial / US Customary system carried over many very old-fashioned units of measure, including the "hand", but in many cases refrained from adopting "new" units (such as the Ampere, Coulomb, or Weber) that were being introduced into the SI system. So the total number of units in the Imperial / US Customary system is about 80, most of which are completely out of use, and certainly not in use by anyone who regularly performs dimensional analysis. For instance, I don't think there would be much uproar if "teaspoons" were left out of any kind of implementation. So a comprehensive implementation of units would not require more than 160 units, and in reality, a "sufficient" implementation would need only 7. An exceptionally good implementation could probably be done with right around 100.
![](https://secure.gravatar.com/avatar/72ee673975357d43d79069ac1cd6abda.jpg?s=120&d=mm&r=g)
On 5/04/22 12:27 am, David Mertz, Ph.D. wrote:
On Mon, Apr 4, 2022, 2:17 AM Brian McCall
For instance, I don't think there would be much uproar if "teaspoons" were left out of any kind of implementation.
Apparently someone other than you does the cooking in your household!
Wouldn't recommend this, the National Union of Chefs, Cooks, Cuisiniers and Other Professional Culinary Persons would be out on strike until teaspoons were included. -- Greg
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Mon, Apr 04, 2022 at 06:16:49AM -0000, Brian McCall wrote:
So a comprehensive implementation of units would not require more than 160 units, and in reality, a "sufficient" implementation would need only 7.
The idea that a system which only supported the seven SI base quantities, and so couldn't even convert from imperial to metric, would be "sufficient" boggles my mind. Sufficient for what?
An exceptionally good implementation could probably be done with right around 100.
You must be easily pleased if you consider 100 units "exceptionally good". In 1986 the HP-28C calculator supported 120 units, plus user-defined units. If you don't at least reach the standard available in 1986, I'm not even sure if you reach "good", let alone exceptional. (And yes, the HP-28C included teaspoon and tablespoon, but not hogshead, Roman mile or furlong.) I would consider the Unix program "units", and Frink (which inherits its unit database from units), to be comprehensive. 160 units is about 7% of comprehensive. -- Steve
![](https://secure.gravatar.com/avatar/72ee673975357d43d79069ac1cd6abda.jpg?s=120&d=mm&r=g)
On 4/04/22 3:45 pm, David Mertz, Ph.D. wrote:
An electron volt is a unit of energy. Or of mass. Or of momentum.
Well, in relativity they're all really the same thing, or at least interconvertible. But there are more glaring examples of this. What do you get when you multiply a number of newtons by a number of metres? It could be either joules of energy or newton-metres of torque, depending on what you're doing. You definitely don't want to add those together! I'm not sure how you teach a unit system to deal with things like that. Somehow when you do the multiplication you have to specify whether you're calculating energy or torque. -- Greg
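The energy/torque collision can be made concrete with a small sketch. Everything here is hypothetical: dimensions are modelled as exponent tuples over (kg, m, s), and a `kind` label is one possible way to carry the "what are you calculating" information that the dimensions alone lose.

```python
from dataclasses import dataclass

# Exponents of (kg, m, s): a deliberately tiny, hypothetical dimension model.
NEWTON = (1, 1, -2)
METRE = (0, 1, 0)


def mul_dims(a, b):
    """Multiplying quantities adds their dimension exponents."""
    return tuple(x + y for x, y in zip(a, b))


@dataclass(frozen=True)
class Quantity:
    value: float
    dim: tuple
    kind: str  # hypothetical label, e.g. "energy" or "torque"

    def __add__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        # Matching dimensions are not enough: the kinds must match too.
        if (self.dim, self.kind) != (other.dim, other.kind):
            raise TypeError(f"cannot add {self.kind} to {other.kind}")
        return Quantity(self.value + other.value, self.dim, self.kind)


def work(force_n, distance_m):
    return Quantity(force_n * distance_m, mul_dims(NEWTON, METRE), "energy")


def torque(force_n, lever_arm_m):
    return Quantity(force_n * lever_arm_m, mul_dims(NEWTON, METRE), "torque")


energy = work(10.0, 2.0)
moment = torque(10.0, 2.0)
print(energy.dim == moment.dim)  # the dimensions alone cannot tell them apart
```

In this sketch the caller chooses `work` or `torque` explicitly, which is one answer to "you have to specify whether you're calculating energy or torque"; a bare `newtons * metres` product has no way to know.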
![](https://secure.gravatar.com/avatar/1b108f4b1167bbf9d35404290f53e1eb.jpg?s=120&d=mm&r=g)
On 04/04/2022 09.45, Ethan Furman wrote:
On 4/3/22 11:52, Brian McCall wrote:
...
The old engineering disciplines- mine (civil engineering), structural, electrical, etc- are the next frontier in the "software eats the world" revolution, and they desperately need a language with native units support. I was just on an interview call yesterday for a senior engineer role at a large multinational earth works engineering firm and we spent 15 minutes talking about software and what we see coming down the road when it comes to the need for our discipline to grow in its software creation capabilities.
...
It's not only 'engineering'. Although, the archetypical story ("learning opportunity") is that of the space-craft software partly developed in Europe and partly 'in-Imperial', resulting in a catastrophic navigation error...

Many (many) years ago, I was part of a team developing a replacement Inventory/Stock Control system. When 'problems' arose in the 'old system', rather than fixing, the 'solution' was sometimes to rip-out the problematic functionality - the new system will be ready 'soon'*. This resulted in 'consequences'!

One day, as the only prog who could work on the mainframe's overnight-batch processes and the mini-computers running the warehouses, I was asked to 'put back' the code dealing with pack-sizes. Once done, a stock-take was necessary. I went out to a warehouse to do some 'live testing' and was shown some of the "problems we're facing". Sure-enough, a quick call-around and a back-of-an-envelope calculation revealed that we had more than sufficient stock - every man, woman, and child in the country could have enjoyed a fresh toothbrush, every day, for several years. Yes, they'd been ordering quantities in one pack-size, whereas the supplier was delivering something (much) larger!

* see also 'the sound of deadlines': https://www.brainyquote.com/quotes/douglas_adams_134151

You wouldn't believe it - have interrupted typing here to receive a package. However, the clothing delivered is NOT the size ordered...

Recently, I've been struggling with graphics and thus using both Cartesian- and Polar-coordinates. Worse are vectors (in both) which cannot be easily distinguished. Just coping with the math is (more than) enough for this limited brain, without also keeping-track of the unit/convention!

How about 4/5/2022? Is that tomorrow's date or almost one-month away?
How old are you? 35 years How much do you weigh? 300 kg What temperature do you cook bread at? 350 F
These look like input prompts, and responses. Do you want to be able to accept different temporal formats, different measures of weight, and different temperature units - from the same input interaction? How about writing sample Python code to accomplish this, eg

    some_meaningful_identifier_here = input("How old are you? ")

or are you thinking:

    class Weight():
        ...

    weight = Weight(input(etc))

or some combination thereof, or another approach, or ...? -- Regards, =dn
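One possible shape for such input handling, as a sketch only: the answers from the prompts above ("35 years", "300 kg", "350 F") are split into a number and a unit tag, and each prompt states which tags it accepts. `Reading` and `parse_reading` are hypothetical names, not part of any library.

```python
import re
from typing import Iterable, NamedTuple


class Reading(NamedTuple):
    value: float
    unit: str


# A number, optional whitespace, then an alphabetic unit tag.
_PATTERN = re.compile(r"\s*([-+]?\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*")


def parse_reading(text: str, allowed: Iterable[str]) -> Reading:
    """Split text such as '300 kg' into (300.0, 'kg'), checking the unit tag."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"expected '<number> <unit>', got {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    if unit not in set(allowed):
        raise ValueError(f"unit {unit!r} is not accepted here")
    return Reading(value, unit)


age = parse_reading("35 years", ["years"])
weight = parse_reading("300 kg", ["kg", "lb"])
oven = parse_reading("350 F", ["F", "C", "K"])
print(age, weight, oven)
```

In an interactive program the first argument would come from `input("How old are you? ")`; converting between the accepted units is left out of this sketch.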
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
On the previous thread, Ricky wrote:
Python SHOULD be that language we do this with. It is awesome in every other way. But if it isn't DEAD SIMPLE to use units in python, it won't happen.
I agree, being dead simple to use is of critical importance. Not only that, but there are a number of modules and packages that already exist that would need an update to support units natively. NumPy and SciPy of course, but also Matplotlib, Pandas. Even argparse would need to be updated (which is one I have to use a lot, fortunately not with negative values):

```
>>> import re, argparse
>>> meters = lambda x : float(re.sub(r'm','',x))
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument("-l", "--length", dest='length', type=meters)
_StoreAction(option_strings=['-l', '--length'], dest='length', nargs=None, const=None, default=None, type=<function <lambda> at 0x7fafb8f95c10>, choices=None, help=None, metavar=None)
>>> parser.parse_args("-l -12".split())
Namespace(length=-12.0)
>>> parser.parse_args("-l -12m".split())
usage: [-h] [-l LENGTH]
: error: argument -l/--length: expected one argument
```
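For what it's worth, the failure above comes from argparse treating `-12m` as something that looks like an option, since it starts with `-` but is not a plain negative number. A sketch of a workaround with today's argparse is to bind the value with `=`. The `meters` converter below is a stricter rewrite of the lambda in the session and is only an illustration.

```python
import argparse
import re


def meters(text: str) -> float:
    """Parse strings such as '12', '-12m' or '3.5 m' into a float of metres."""
    match = re.fullmatch(r"\s*([-+]?\d+(?:\.\d+)?)\s*m?\s*", text)
    if match is None:
        raise argparse.ArgumentTypeError(f"not a length in metres: {text!r}")
    return float(match.group(1))


parser = argparse.ArgumentParser()
parser.add_argument("-l", "--length", dest="length", type=meters)

# '-l -12m' is rejected because '-12m' looks like an option string, but the
# '--length=VALUE' form hands the value to the option explicitly.
args = parser.parse_args(["--length=-12m"])
print(args.length)
```

This does not change the larger point: the value that reaches the program is still a bare float, and the unit has been thrown away by the converter.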
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
In the previous thread (Custom C++ literals), ChrisA raised some good questions, some of which I can actually answer :D
Part of the problem here is that Python has to be many many things. Which set of units is appropriate? For instance, in a lot of contexts, it's fine to simply attach K to the end of something to mean "a thousand", while still keeping it unitless; but in other contexts, 273K clearly is a unit of temperature. (Although I think the solution there is to hard-disallow prefixes without units, as otherwise there'd be all manner of collisions.) Is it valid to refer to fifteen Angstroms as 15A, or do you have to say 15Å, or 15e-10m and accept that it's now a float not an int? Similarly, what if you want to write a Python script that works in natural units - the Planck length, mass, time, and temperature?
I think if you look into CGPM standards (they're the grand pooh bahs who decide what SI units are) then you'd find that a lot of these potential collisions have already been encountered and resolved. Under SI, there is no ambiguity regarding K. K means Kelvin and only Kelvin, whereas k means 1000. Some units, like Å, do pose challenges. We often substitute u instead of μ, which works fine since there don't seem to be any SI units that start with u. But we can't do likewise for Å, since A is already reserved for Amperes. The easy way out is to say Å is not SI, so it's out. But I would rather not see this feature limited to SI units only (although SI should be preferred). A somewhat gentler approach would be to let Å be Å. Unicode letters are allowed in Python these days. I use theta, mu, lambda - the whole bunch of them, in my code all the time. If someone wants to use Å badly enough, let them use the unicode for it, otherwise use nm. Units like Planck's length are valid, and I don't see any reason to exclude them. The problem is that neither CGPM nor anyone else, as far as I can tell, has created an SI unit for the Planck length and other similar units that is lexicographically distinct from other units. And creating one would only be worth the trouble if all of the physicists who might use it could immediately recognize it. Not that I speak for them, but I'm guessing the folks who run SciPy or astropy could be of help in answering these sorts of questions, rather than trying to get a Python steering committee to work with a possibly more bureaucratic organization like CGPM. Regarding precision, this is not something that so many scientists and engineers understand as well as computer scientists and engineers. I'd rather see units available for integers as well as floats. I think that as long as a unit is defined, it makes sense to allow integer quantities of them. 
If they are to be built-in types, as I would prefer, then I suppose unfortunately one would not be able to define fractions of these units as new units. But again, most of this work is done with floats anyway, so if units were only available for floats, I would still see this as a big step forward. Related to these questions, there is the question of what to do about mixed systems? Should 2.54 cm / 1 in evaluate to 2.54 cm/in or should it evaluate to 1? I'd much rather it evaluate to 1, but if anyone else has a stronger opinion, I would not let a dispute over such a thing stand in the way of getting units. Regarding 1m / 1mm, though, I have a much stronger opinion. It should be 1000, without any units. There is yet another question related to the interpretation of K as 1000 vs Kelvin. As I said, SI is clear that K means Kelvin, but what about Python users that are not familiar with SI? What about those in the financial industry? To them, K means 1000, and they might not even know what Kelvin is. Now, unless adding a suffix K to a number is supported later on, a financial person would have to go pretty far out of their way, or be looking at the wrong code to be confused by something referring to Kelvin. But it would indeed be a mistake to assume that everyone who uses Python wants and can live with SI units, or even that they would be using the same set of units! Which brings me to the next part of ChrisA's reply...
Purity and practicality are at odds here. Practicality says that you should be able to have "miles" as a unit, purity says that the only valid units are pure SI fundamentals and everything else is transformed into those. Leaving it to libraries would allow different Python programs to make different choices.
But I would very much like to see a measure of language support for "number with alphabetic tag", without giving it any semantic meaning whatsoever. Python currently has precisely one such tag, and one conflicting piece of syntax: "10j" means "complex(imag=10)", and "10e1" means "100.0". (They can of course be combined, 10e1j does indeed mean 100*sqrt(-1).) This is what could be expanded.
As I mentioned above, I am not a purist. I keep a set of Thorlabs thread adapters handy in my lab so that I can screw imperial cage plates onto metric posts. I think I diverge from (or perhaps just don't understand) the statement on "semantic meaning". To me, semantic meaning of the units seems pretty essential. Wherever possible, units should be simplified in a prescribed manner. 1W * 1s = 1J, 10km/1cm = 1000000. The meaning of these suffixes should be explicit, not implicit. Also, see above about precision of unit-aware data types. Floating point only would be fine, but I don't see why integers cannot be supported as well.
C++ does things differently, since it can actually compile things in, and declarations earlier in the file can redefine how later parts of the file get parsed. In Python, I think it'd make sense to syntactically accept *any* suffix, and then have a run-time translation table that can have anything registered; if you use a suffix that isn't registered, it's a run-time error. Something like this:
    import sys
    # sys.register_numeric_suffix("j", lambda n: complex(imag=n))
    sys.register_numeric_suffix("m", lambda n: unit(n, "meter"))
    sys.register_numeric_suffix("mol", lambda n: unit(n, "mole"))
(For backward compatibility, the "j" suffix probably still has to be handled at compilation time, which would mean you can't actually do that first one.)
Using it would look something like this:
    def spread():
        """Calculate the thickness of avocado when spread on a single slice of bread"""
        qty = 1.5mol
        area = 200mm * 200mm
        return qty / area
Unfortunately, these would no longer be "literals" in the same way that imaginary numbers are, but let's call them "unit displays". To evaluate a unit display, you take the literal (1.5) and the unit (stored as a string, "mol"), and do a lookup into the core table (CPython would probably have an opcode for this, rather than doing it with a method that could be overridden, but it would basically be "sys.lookup_unit(1.5, 'mol')" or something). Whatever it gives back is the object you use.
Does this seem like a plausible way to go about it?
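Since `sys.register_numeric_suffix` and `sys.lookup_unit` exist only in the proposal, the run-time half of the idea can be imitated today with an ordinary dictionary. This is a sketch under that assumption: the module-level functions below are stand-ins, a `(number, name)` tuple stands in for the undefined `unit(...)` object, and the choice of `NameError` for an unregistered suffix is a guess.

```python
from typing import Callable, Dict

# Stand-in for the proposed core table; not an existing Python API.
_SUFFIXES: Dict[str, Callable[[float], object]] = {}


def register_numeric_suffix(suffix: str, factory: Callable[[float], object]) -> None:
    """Associate a suffix such as 'mol' with a factory for its values."""
    _SUFFIXES[suffix] = factory


def lookup_unit(number: float, suffix: str) -> object:
    """What evaluating a unit display like 1.5mol might do at run time."""
    try:
        factory = _SUFFIXES[suffix]
    except KeyError:
        raise NameError(f"numeric suffix {suffix!r} is not registered") from None
    return factory(number)


register_numeric_suffix("m", lambda n: (n, "meter"))
register_numeric_suffix("mol", lambda n: (n, "mole"))

qty = lookup_unit(1.5, "mol")  # stands in for the proposed display 1.5mol
print(qty)
```

The sketch shows why such displays would not be literals: the value depends on whatever is registered when the expression is evaluated, so the same source text can mean different things in different programs.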
As far as registering units, I think registering individual units is a bit much. Of course, several of these statements could be put inside a module or package to make things easier. But I also don't like that it means the syntax of the "literals" needs to be allowed during parsing, and left to the interpreter to figure out if the unit was registered. I do think it is reasonable to require programmers to "opt in" to using SI or other units, and possibly even specify which set or sets of units they intend to use. But if their constants are ill-formed, then that should still be caught during parsing and throw a SyntaxError. How that would be implemented behind the scenes, I don't know, but from a syntax point of view, I am envisioning something like a namespace statement with a new keyword (I propose `measure`). Here, I am referring to namespaces like `local` and `global`, not something like `argparse.Namespace`. Consider the following example as of today:

```
A = 1
global A
A = 2
```

This will generate a syntax error during parsing:

SyntaxError: name 'A' is assigned to before global declaration

Similarly, what I envision is something like this:

```
length = 12m
```

SyntaxError: invalid syntax

```
measure SI
length = 12m
width = 10mm
area = length * width
print(area)
```

... with no SyntaxErrors and a result of "0.12 m2".

After the "measure SI" statement, all literals that are formed with SI units are considered valid syntax and are evaluated accordingly. Prior to "measure SI", only the unitless primitives are allowed. Clearly this works differently than the `global` or `nonlocal` statements do, which modify a namespace. Also, the choice of keyword matters, because making "measure" a keyword would probably break a lot of existing code (3to4.py!!!). But it is dead simple, and it does behave in a way that is actually quite similar to modifying the existing namespace. This ended up being a much longer reply than I anticipated, but I hope it helps.
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Mon, 4 Apr 2022 at 14:22, Brian McCall <brian.patrick.mccall@gmail.com> wrote:
I think if you look into CGPM standards (they're the grand pooh bahs who decide what SI units are) then you'd find that a lot of these potential collisions have already been encountered and resolved. Under SI, there is no ambiguity regarding K. K means Kelvin and only Kelvin, whereas k means 1000.
The trouble is that SI isn't the only set of units out there. And particularly if you support SI derived units, there will be innumerable collisions of abbreviations with other systems. Unless you're going to mandate *in the language* that SI units are the only ones permitted (and thus anger a fairly large slab of people), the precise meanings of the abbreviations will have to be handled by libraries.
Regarding precision, this is not something that so many scientists and engineers understand as well as computer scientists and engineers. I'd rather see units available for integers as well as floats. I think that as long as a unit is defined, it makes sense to allow integer quantities of them. If they are to be built-in types, as I would prefer, then I suppose unfortunately one would not be able to define fractions of these units as new units. But again, most of this work is done with floats anyway, so if units were only available for floats, I would still see this as a big step forward.
Oh absolutely they should be available for integers. Unfortunately, that doesn't always solve the problem. Exactly how far is it from the sun to Saturn at aphelion? Do you write it as 1357554000000m, which implies false precision? What about 1.357554e12m ? Well, that's a float, not an int. (My apologies if the figure is straight-up wrong, I just pulled that from a quick Google search.) It's pretty much inevitable that floats are going to show up, and that means that forcing everything into the same scale system is likely to cause problems (just like when you're building a 3D graphical world and you place a light source 1AU away to represent sunlight - you just create unnecessary FP inaccuracies).
Related to these questions, there is the question of what to do about mixed systems? Should 2.54 in / 1 cm evaluate to 2.54 in/cm or should it evaluate to 1? I'd much rather it evaluate to 1, but if anyone else has a stronger opinion, I would not let a dispute over such a thing stand in the way of getting units. Regarding 1m / 1mm, though, I have a much stronger opinion. It should be 1000, without any units.
I would say that 2.54in/1cm should be equal to 1. Units should completely cancel out. It's less clear when there's choice of equivalent units, like "5ft+6in" potentially being "5.5ft" or "66in"; but I would also say that those two quantities should compare equal. (Which may make their hashes tricky to calculate.)
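The remark about hashes can be made concrete with a small sketch. It assumes a toy `Length` class and a hypothetical `INCHES_PER` table (neither comes from the thread or from any library) and uses exact `Fraction` arithmetic, so that "5ft+6in" and "66in" compare equal and hash equal by reducing both to one canonical unit.

```python
from fractions import Fraction

# Hypothetical conversion table: exact number of inches per unit.
INCHES_PER = {"in": Fraction(1), "ft": Fraction(12)}


class Length:
    """Toy length that remembers the unit it was written in."""

    def __init__(self, value, unit):
        self.value = Fraction(value)
        self.unit = unit

    def _canonical(self):
        # Exact magnitude in inches, independent of the display unit.
        return self.value * INCHES_PER[self.unit]

    def __add__(self, other):
        if not isinstance(other, Length):
            return NotImplemented
        # Keep the left operand's unit for display.
        total = self._canonical() + other._canonical()
        return Length(total / INCHES_PER[self.unit], self.unit)

    def __eq__(self, other):
        if not isinstance(other, Length):
            return NotImplemented
        return self._canonical() == other._canonical()

    def __hash__(self):
        # Equal quantities must hash equal, so hash the canonical form.
        return hash(self._canonical())

    def __repr__(self):
        return f"{float(self.value)}{self.unit}"


a = Length(5, "ft") + Length(6, "in")
b = Length(66, "in")
print(a, b, a == b, hash(a) == hash(b))
```

With floating-point magnitudes the canonical form can differ in the last bit depending on the conversion path, which is presumably where the hashing gets tricky.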
There is yet another question related to the interpretation of K as 1000 vs Kelvin. As I said, SI is clear that K means Kelvin, but what about Python users that are not familiar with SI? What about those in the financial industry? To them, K means 1000, and might not even know what Kelvin is. Now, unless adding a suffix K to a number is supported later on, a financial person would have to go pretty far out of their way, or be looking at the wrong code to be confused by something referring to Kelvin. But it would indeed be a mistake to assume that everyone who uses Python wants and can live with SI units, or even that they would be using the same set of units! Which brings me to the next part of ChrisA's reply...
Purity and practicality are at odds here. Practicality says that you should be able to have "miles" as a unit, purity says that the only valid units are pure SI fundamentals and everything else is transformed into those. Leaving it to libraries would allow different Python programs to make different choices.
But I would very much like to see a measure of language support for "number with alphabetic tag", without giving it any semantic meaning whatsoever. Python currently has precisely one such tag, and one conflicting piece of syntax: "10j" means "complex(imag=10)", and "10e1" means "100.0". (They can of course be combined, 10e1j does indeed mean 100*sqrt(-1).) This is what could be expanded.
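For reference, the existing suffix behaviour described above can be checked directly in current Python. This is only a demonstration of today's literals, not of any proposed syntax.

```python
# Both suffix forms are resolved when the source is compiled, before any code runs.
imaginary = 10j      # the one existing "tag": builds a complex number
scientific = 10e1    # "e" followed by digits is an exponent, giving a float
combined = 10e1j     # exponent first, then the imaginary tag

print(type(imaginary).__name__, imaginary)
print(type(scientific).__name__, scientific)
print(combined == complex(0, 100))
```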
As I mentioned above, I am not a purist. I keep a set of Thorlabs thread adapters handy in my lab so that I can screw imperial cage plates onto metric posts.
I think I diverge from (or perhaps just don't understand) the statement on "semantic meaning". To me, semantic meaning of the units seems pretty essential. Wherever possible, units should be simplified in a prescribed manner. 1W * 1s = 1J, 10km/1cm = 1000000. The meaning of these suffixes should be explicit, not implicit.
There are three levels of meaning happening here.

1) Syntax. This is set by the language, must be locked in at compilation time, and if you get this wrong, you get SyntaxError.

2) Low-level semantics. In my proposal, this is a simple concept of "look up a registration table, call the appropriate function, and use what it gives back".

3) High-level semantics. This is where the true meaning of "meter" is assigned, and in my proposal, that's the job of libraries, not the language.

To illustrate the difference here, I'll use a different piece of code:

    with slider.handler_block(sig):
        slider.set_value(value)

What is the meaning of the "with" statement?

1) "with" thing ":" suite

This is how you parse the line of code. You expect the word "with", then something that is the context manager, you might capture it in a name (not happening here), and then there's an indented block of code. Doesn't tell you anything about what it means, but it tells you how to spell it out. This is the syntax.

2) Call the __enter__ method of the object, then run the suite, then call the __exit__ method, even if an exception is raised.

This is low-level semantics. The language doesn't define anything more than this, but on its own, it doesn't tell you why this is useful.

3) Temporarily block this signal from being sent, long enough to set the slider's value.

This is high-level semantics - the true meaning of the code, what makes it actually useful. In this particular case, the meaning comes from the GTK library, not the language itself.

It's entirely possible for some high-level semantics to come from core language or standard library features (notably files, and synchronization primitives), but it's also possible to have a feature with absolutely no useful meaning in the core language (like matrix multiplication). In my opinion, the language should be defining the first two levels, but not the third. Rules like "1W * 1s == 1J" come from high level semantics, and can be provided by libraries.
The language just defines the hooks by which those libraries can set themselves up.
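As a rough illustration of the "registration table" idea, here is a minimal sketch in today's Python. The names `register_suffix`, `make_tagged` and `_SUFFIX_HANDLERS` are hypothetical stand-ins for hooks the language might expose; nothing like them exists in CPython, and the tuple result is just a placeholder for whatever object a units library would return.

```python
# Hypothetical registry: suffix -> callable that builds the tagged value.
_SUFFIX_HANDLERS = {}


def register_suffix(suffix, handler):
    """What a units library would call at import time (assumed hook)."""
    _SUFFIX_HANDLERS[suffix] = handler


def make_tagged(number, suffix):
    """Stand-in for what the interpreter would do on evaluating `12mm`."""
    try:
        handler = _SUFFIX_HANDLERS[suffix]
    except KeyError:
        raise LookupError(f"no handler registered for suffix {suffix!r}") from None
    return handler(number)


# A "library" assigns the high-level meaning: here, everything becomes metres.
register_suffix("m", lambda n: ("metre", float(n)))
register_suffix("mm", lambda n: ("metre", n / 1000))

print(make_tagged(12, "m"))
print(make_tagged(10, "mm"))
```

The language side (levels 1 and 2) stops at `make_tagged`; what a metre means, and how metres combine with millimetres, stays inside the registered handlers.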
Does this seem like a plausible way to go about it?
As far as registering units, I think registering individual units is a bit much. Of course, several of these statements could be put inside a module or package to make things easier. But I also don't like that it means the syntax of the "literals" needs to be allowed during parsing, and left to the interpreter to figure out if the unit was registered. I do think it is reasonable to require programmers to "opt in" to using SI or other units, and possibly even specify which set or sets of units they intend to use. But if their constants are ill-formed, then that should still be caught during parsing and throw a SyntaxError.
This is where I currently disagree, but would be open to seeing an actual implementation. The trouble is, very very little in a .py file can change the way that that file is parsed; if something's a SyntaxError, you generally can't make it legal with code in that file, only with code somewhere else. (The one exception is __future__ directives, and they should be rare.) That has the unfortunate side effect that, if you want "25mm" to be legal but "25zm" to be a SyntaxError, there needs to be something executed before this file gets imported, which registers those suffixes. On the other hand, it is entirely reasonable to have "25zm" raise a runtime exception when that point in the code is reached.
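The difference between a parse-time and a run-time failure can be seen with the built-in `compile()`, which parses source without executing it. The helper name `compiles` is made up for this sketch.

```python
def compiles(source):
    """Return True if `source` gets past the parser, without running it."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles("x = 25j"))             # existing suffix, accepted
print(compiles("x = 25mm"))            # rejected by the parser today
print(compiles("x = undefined_name"))  # parses fine; would fail only when run
```

Nothing executed inside the same file can influence the second case, because the whole file is parsed before its first statement runs.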
```
measure SI
length = 12m
width = 10mm
area = length * width
print(area)
```

... with no SyntaxErrors and a result of "0.12 m2"
After the "measure SI" statement, all literals that are formed with SI units are considered valid syntax and are evaluated accordingly. Prior to "measure SI", only the unitless primitives are allowed. Clearly this works differently than does the `global` or `local` statement, which are modifying a namespace. Also, the choice of keyword matters, because making "measure" a keyword would probably break a lot of existing code (3to4.py!!!). But it is dead simple, and it does behave in a way that is actually quite similar to modifying the existing namespace.
This ended up being a much longer reply than I anticipated, but I hope it helps.
Where would "measure SI" be implemented? Does it have to be a core feature of the language? The global statement (there's no "local" statement - do you mean "nonlocal"?) is fully defined by the language, so its impact on the surrounding code is also well defined. But if "measure" statements can be provided by importable modules, there are only two options: either it's a run-time action (like most imports), or it has to be done before you import something else, which requires a bootstrap script:

    # SI_runner.py
    import measurements.SI
    measurements.SI.register()
    import real_code

    # real_code.py
    length = 12m
    width = 10mm
    print(length * width)

That's annoying. Very annoying. The strictly run-time nature of Python's import statement does have some consequences, but it's also extremely easy to work with (trust me, JavaScript module imports are far more annoying simply because they get executed before the rest of the code does).

But I would be open to seeing an actual implementation before locking in an opinion on this. Well, actually, I'm open to just never locking in an opinion, but you know what I mean :)

ChrisA
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
The trouble is that SI isn't the only set of units out there. And particularly if you support SI derived units, there will be innumerable collisions of abbreviations with other systems. Unless you're going to mandate *in the language* that SI units are the only ones permitted (and thus anger a fairly large slab of people), the precise meanings of the abbreviations will have to be handled by libraries.
Mentioned in another part of the thread, but if you wanted to be completely comprehensive, including measures used for cooking, you would *only* need 160 units. But if you want to focus on where you make an impact - that is, units that are used in computations and dimensional analysis (i.e. - not teaspoons), 7-14 units is a good start. That's the 7 base units and their Imperial / US Customary counterparts. Literally every other unit of measurement can be derived from these. After that, ~50 units is pretty fantastic, and by the time you get to 100 units, there might be 1 or 2 people who feel left out, but they're probably not going to be angry about it.
Oh absolutely they should be available for integers. Unfortunately, that doesn't always solve the problem. Exactly how far is it from the sun to Saturn at aphelion? Do you write it as 1357554000000m, which implies false precision? What about 1.357554e12m ? Well, that's a float, not an int. (My apologies if the figure is straight-up wrong, I just pulled that from a quick Google search.) It's pretty much inevitable that floats are going to show up, and that means that forcing everything into the same scale system is likely to cause problems (just like when you're building a 3D graphical world and you place a light source 1AU away to represent sunlight - you just create unnecessary FP inaccuracies).
A standard unit of measure has infinite precision. If the distance from the sun to Saturn at aphelion does not have infinite precision, then it cannot be a unit. It's definitely not a standard unit. If you want to define it as a unit, then I guess you have to pinch your nose and claim that it has infinite precision? I totally agree that you need to be able to use units at the correct scale. But a unit *must* have infinite precision. It might make more sense to define an arbitrary unit called the AsirhC as 1.357554e12m (with infinite precision), and you would be doing so because that is what you believe the distance from the sun to Saturn at aphelion to be, but once defined in that way, you shouldn't worry about how far off it is.
I would say that 2.54in/1cm should be equal to 1. Units should completely cancel out. It's less clear when there's choice of equivalent units, like "5ft+6in" potentially being "5.5ft" or "66in"; but I would also say that those two quantities should compare equal. (Which may make their hashes tricky to calculate.)
Easy answer: whichever unit results in the floating point portion of the data type being closest to 0. It only matters at the point in time when the answer is presented to the end user anyway, at which point best practice would say to use an explicit cast.
Where would "measure SI" be implemented? Does it have to be a core feature of the language? The global statement (there's no "local" statement - do you mean "nonlocal"?) is fully defined by the language, so its impact on the surrounding code is also well defined. But if "measure" statements can be provided by importable modules, there are only two options: either it's a run-time action (like most imports), or it has to be done before you import something else, which requires a bootstrap script:
Yes, I mean "nonlocal". In my example the "measure" statement would be a core part of the language. Which, as I said in my original soapbox post, is what I would ideally like to see change. "Every number has a unit" sounds crazy until you get used to it, and it's a level of crazy that requires a core language change. Current library implementations have so far failed to gain much traction. Better implementations *might* work without changing the language? Tbh, I really don't know.

To elaborate further on what I mean by my proposed "measure" statement, it would need to be supported by core language. The very first block of parser code that would act on this statement would in fact sit right next to the code that currently handles "global" and "nonlocal" and would look very similar. Once the "measure" statement is encountered, the state of the parser itself is altered. I'm not sure if this happens at all in the parser today, but after executing the "measure" statement, literals like "12m" would suddenly become legal, by virtue of the parser loading or enabling a set of additional parsing rules. Sort of like a dynamic parser. Your approach would be great, btw, I just don't quite see how the string literal gets into the library. But I likewise will keep an open mind on it.

Another possibility is to make floats and ints "subscriptable". The reason is, it is not unusual to see units enclosed in brackets. Like this:

    12[m]
    12[m/s]
    100[cd/m2]  # Or 100[cd/m**2]?

In that case, the parser does not need to be dynamic. m in this case could be interpreted as a variable name, which would be registered, as you say. But I would strongly argue in favor of a separate namespace for objects within an integer or float subscript. "12[SI.m]" is ugly, especially when you start getting into complex units. Not to mention, lots of scientists and engineers are allergic to dot-notation (I know, why even bother, but I'm a people pleaser).
Similarly, ruling out "m" as a variable name is a non-starter.
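One way to check the claim that the bracket form needs no parser change is to feed it to today's parser. The sketch below uses only the standard `ast` and `warnings` modules; binding `m` to the string "metre" is an arbitrary placeholder.

```python
import ast
import warnings

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # CPython warns about subscripting an int literal
    # The bracket form already fits the existing grammar: it is a Subscript node.
    tree = ast.parse("12[m]", mode="eval")
    code = compile(tree, "<demo>", "eval")

print(type(tree.body).__name__)

# What is missing is run-time support: int has no __getitem__.
try:
    eval(code, {"m": "metre"})
    outcome = "ok"
except TypeError as exc:
    outcome = f"TypeError: {exc}"
print(outcome)
```

So the syntax is already legal; the proposal would amount to giving numbers a subscript operation and deciding which namespace `m` is looked up in.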
![](https://secure.gravatar.com/avatar/de311342220232e618cb27c9936ab9bf.jpg?s=120&d=mm&r=g)
On 4/3/22 22:39, Chris Angelico wrote:
On Mon, 4 Apr 2022 at 14:22, Brian McCall wrote:
Related to these questions, there is the question of what to do about mixed systems? Should 2.54 in / 1 cm evaluate to 2.54 in/cm or should it evaluate to 1?
I would say that 2.54in/1cm should be equal to 1. Units should completely cancel out.
It seems like the point of this exercise is to *not* have units cancel out -- at least, not unnecessarily. 2.54in / 1 cm should be 2.54in/cm, otherwise multiplying by 5 kelvin will be one of those hard-to-find bugs. Of course, it's actually 2.54cm/in. -- ~Ethan~
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Tue, 5 Apr 2022 at 04:19, Ethan Furman <ethan@stoneleaf.us> wrote:
On 4/3/22 22:39, Chris Angelico wrote:
On Mon, 4 Apr 2022 at 14:22, Brian McCall wrote:
Related to these questions, there is the question of what to do about mixed systems? Should 2.54 in / 1 cm evaluate to 2.54 in/cm or should it evaluate to 1?
I would say that 2.54in/1cm should be equal to 1. Units should completely cancel out.
It seems like the point of this exercise is to *not* have units cancel out -- at least, not unnecessarily.
2.54in / 1 cm should be 2.54in/cm, otherwise multiplying by 5 kelvin will be one of those hard-to-find bugs.
Hmm, fair point, I guess. It gets tricky, though. For an example, let's look at fuel efficiency. Outside of the US, vehicle fuel economy is measured in liters per hundred kilometers. What unit category should this be considered to be?

5 L/100km in SI units is 5e-3 m³ / 1e5m. That's 5e-8 m². Or if you prefer, 0.05 mm².

Fuel economy is a unit of area.

https://what-if.xkcd.com/11/

This DOES, as Munroe points out, have a geometric interpretation. I think it's reasonable to say that fuel economy cancels down to a unit of area, and it's *also* reasonable to say that it doesn't, and that it simply remains as volume-per-distance.

Some unit cancellations really do result in pure scalars. The ratio of a circle's circumference to its diameter isn't a unit of m/m any more than the ratio of a circle's area to that of a circumscribed square is a unit of m²/m². They're both just numbers. On the other hand, a radian is a very real unit of distance/distance (based on its definition of arc length), and it's a unit of angle.

I suspect that the rules of cancellation would be best handed off to libraries, and there will be different choices for different applications. Maybe at some time in the future, there'll be a proposal to lock it down and define it more by the language. It wouldn't be the first time - type hints are now the only officially supported form of annotations, but the precise meanings of those hints are still partly up to the library.
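The fuel-economy conversion can be worked through by hand in a few lines. The constant names are chosen for this sketch only; the conversion factors themselves are the standard ones (1 L = 1e-3 m³, 1 m² = 1e6 mm²), and the result is 5e-8 m², i.e. 0.05 mm².

```python
# Convert 5 L/100 km to base SI units by hand.
LITRE_IN_M3 = 1e-3        # 1 L = 1 dm^3 = 1e-3 m^3
KM_IN_M = 1e3

volume_m3 = 5 * LITRE_IN_M3          # 5 L
distance_m = 100 * KM_IN_M           # 100 km
area_m2 = volume_m3 / distance_m     # m^3 / m = m^2
area_mm2 = area_m2 * 1e6             # 1 m^2 = 1e6 mm^2

print(f"{area_m2:.1e} m^2 = {area_mm2:.2f} mm^2")
```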
Of course, it's actually 2.54cm/in.
(I actually didn't even spot that part) ChrisA
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Tue, Apr 05, 2022 at 04:36:24AM +1000, Chris Angelico wrote:
Some unit cancellations really do result in pure scalars. The ratio of a circle's circumference to its diameter isn't a unit of m/m any more than the ratio of a circle's area to that of a circumscribed square is a unit of m²/m². They're both just numbers.
Of course it is a ratio. You said it yourself: it is a ratio of circumference to diameter. That ratio is only numerically equal to π if the units in which you measure the circumference and diameter are the same. Otherwise it has units "inches/cm" (or whatever units you used) and a completely different numerical value.
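A quick numeric check of this point, using an arbitrary 10 cm diameter chosen for the sketch:

```python
import math

CM_PER_INCH = 2.54

diameter_cm = 10.0
circumference_cm = math.pi * diameter_cm
circumference_in = circumference_cm / CM_PER_INCH

same_units = circumference_cm / diameter_cm    # cm/cm
mixed_units = circumference_in / diameter_cm   # in/cm

print(same_units)                  # numerically pi
print(mixed_units)                 # pi / 2.54, carrying "in/cm"
print(mixed_units * CM_PER_INCH)   # converting the units recovers pi
```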
On the other hand, a radian is a very real unit of distance/distance (based on its definition of arc length), and it's a unit of angle.
The SI system defines both radians and steradians as dimensionless derived units, previously known as "supplementary units". -- Steve
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Tue, 5 Apr 2022 at 11:44, Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Apr 05, 2022 at 04:36:24AM +1000, Chris Angelico wrote:
Some unit cancellations really do result in pure scalars. The ratio of a circle's circumference to its diameter isn't a unit of m/m any more than the ratio of a circle's area to that of a circumscribed square is a unit of m²/m². They're both just numbers.
Of course it is a ratio. You said it yourself: it is a ratio of circumference to diameter.
That ratio is only numerically equal to π if the units in which you measure the circumference and diameter are the same. Otherwise it has units "inches/cm" (or whatever units you used) and a completely different numerical value.
On the other hand, a radian is a very real unit of distance/distance (based on its definition of arc length), and it's a unit of angle.
The SI system defines both radians and steradians as dimensionless derived units, previously known as "supplementary units".
You're missing the point: these are ALL dimensionless values, yet they are incompatible. Regardless of the units used to measure the circumference and diameter, they will *by definition* cancel out and leave you with pi (case in point: using a pie as the unit https://www.youtube.com/watch?v=ZNiRzZ66YN0 ). But a radian and a steradian are not the same type of thing. Nor is an index of refraction. They're all dimensionless. They're NOT all fungible. The unit type "inches/cm" is meaningless, but the unit type "length/length" is very meaningful. And that's where the problem comes in.

Please, stop being all caught up on one small error that I propagated, and look at the actual point? You're really good at this sort of thing.

ChrisA
![](https://secure.gravatar.com/avatar/72ee673975357d43d79069ac1cd6abda.jpg?s=120&d=mm&r=g)
On 5/04/22 6:36 am, Chris Angelico wrote:
5 L/100km in SI units is 5e-3 m³ / 1e5m. That's 5e-8 m². Or if you prefer, 50mm².
Fuel economy is a unit of area.
This misses the rather important point that it's not just litres of empty space, it's litres *of fuel*. You really need to consider "litres of fuel" to be an indivisible unit that doesn't cancel with a distance. Alternatively, measure the fuel by mass rather than volume. Then your fuel consumption is in kg/100km, which is in far less danger of being cancelled down. Or you could measure the fuel in moles... but then since moles are dimensionless, you get reciprocal distance... Who would have thought units could be such fun? -- Greg
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Tue, 5 Apr 2022 at 05:47, Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Apr 04, 2022 at 03:39:26PM +1000, Chris Angelico wrote:
I would say that 2.54in/1cm should be equal to 1.
2.54 inches is not 1 cm. This is how you get a billion dollar spacecraft crashing into Mars instead of landing softly :-)
This is why you don't get billion dollar spacecraft built in hybrid unit systems to the specifications of quickly-written emails :) ChrisA
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Mon, Apr 04, 2022 at 04:22:25AM -0000, Brian McCall wrote: [Chris]
Part of the problem here is that Python has to be many many things. Which set of units is appropriate? For instance, in a lot of contexts, it's fine to simply attach K to the end of something to mean "a thousand", while still keeping it unitless; but in other contexts, 273K clearly is a unit of temperature.
Python is a programming language. I think we can require a certain minimum level of strictness, and reject ambiguity. Please read the Frink FAQ, especially the part about DWIM: https://frinklang.org/faq.html K means Kelvin, end of story. We don't need to support informal use or slang out of the box. But if units are scoped, like variables are, then people who really, really want K to mean 8192 bits can shadow the unit database with their own.
Should 2.54 in / 1 cm evaluate to 2.54 in/cm or should it evaluate to 1? I'd much rather it evaluate to 1
There are only two reasonable ways to parse that, depending on precedence of units and operators:

* (2.54 inches) / (1 cm) = 6.4516 (dimensionless)
* (2.54 inches / 1) * cm = 6.4516 cm**2 (or 1 square inch)

If there is a third way, I can't think of it. In any case, I don't see how you can get 1.

Maybe you mean 1 inch per 2.54 cm? This is why we need unit management :-)

-- Steve
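Both readings can be reproduced with plain floats, converting inches to centimetres by hand. The variable names are illustrative only.

```python
CM_PER_INCH = 2.54

# Reading 1: (2.54 in) / (1 cm). Convert the numerator to cm, then divide.
ratio = (2.54 * CM_PER_INCH) / 1.0           # cm / cm, dimensionless

# Reading 2: (2.54 in / 1) * (1 cm). Convert to cm, then multiply.
area_cm2 = (2.54 * CM_PER_INCH) * 1.0        # cm * cm
area_in2 = area_cm2 / CM_PER_INCH**2         # back to square inches

print(round(ratio, 4), round(area_cm2, 4), round(area_in2, 4))
```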
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Tue, 5 Apr 2022 at 05:44, Steven D'Aprano <steve@pearwood.info> wrote:
Should 2.54 in / 1 cm evaluate to 2.54 in/cm or should it evaluate to 1? I'd much rather it evaluate to 1
There are only two reasonable ways to parse that, depending on precedence of units and operators:
* (2.54 inches) / (1 cm) = 6.4516 (dimensionless)
* (2.54 inches / 1) * cm = 6.4516 cm**2 (or 1 square inch)
This second format is nonsense, and demonstrates why these need to be tagged numbers, NOT simple multiplications.
If there is a third way, I can't think of it. In any case, I don't see how you can get 1.
Maybe you mean 1 inch per 2.54 cm? This is why we need unit management :-)
The question is really whether it's dimensionless or retains some record of the fact that it's length/length. The aspect ratio of a rectangle is really a length/length measure, even though it's technically dimensionless. An index of refraction is also dimensionless, but you can't talk about them as being equivalent. ChrisA
![](https://secure.gravatar.com/avatar/72ee673975357d43d79069ac1cd6abda.jpg?s=120&d=mm&r=g)
On 4/04/22 9:45 am, Ethan Furman wrote:
Well, if we're spit-balling ideas, what about:
63_lbs
77_km/hr
I'm not convinced there's a need for new syntax here.

    63*lbs
    77*km/hr

With appropriate definitions of lbs, km and hr these can be made to construct numbers with attached units.

-- Greg
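A minimal sketch of what "appropriate definitions" could look like, assuming a toy `Quantity` class written for this illustration (it is not the API of pint, astropy, or any other library). Units are tracked as a mapping from unit name to exponent, and no conversion between units is attempted.

```python
class Quantity:
    """Toy number-with-units; units are a dict of name -> exponent."""

    def __init__(self, value, units):
        self.value = value
        self.units = {u: p for u, p in units.items() if p != 0}

    def _combine(self, other, sign):
        # Add (or subtract) exponents unit by unit.
        units = dict(self.units)
        for name, power in other.units.items():
            units[name] = units.get(name, 0) + sign * power
        return units

    def __mul__(self, other):
        if isinstance(other, Quantity):
            return Quantity(self.value * other.value, self._combine(other, +1))
        return Quantity(self.value * other, self.units)

    __rmul__ = __mul__  # lets a plain number appear on the left: 63 * lbs

    def __truediv__(self, other):
        if isinstance(other, Quantity):
            return Quantity(self.value / other.value, self._combine(other, -1))
        return Quantity(self.value / other, self.units)

    def __repr__(self):
        tags = " ".join(f"{u}^{p}" for u, p in sorted(self.units.items()))
        return f"{self.value} [{tags}]"


lbs = Quantity(1, {"lb": 1})
km = Quantity(1, {"km": 1})
hr = Quantity(1, {"hr": 1})

weight = 63 * lbs
speed = 77 * km / hr
print(weight)
print(speed)
```

The cost of this approach is that `lbs`, `km` and `hr` are ordinary names in the module namespace, which is the collision concern raised elsewhere in the thread.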
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Mon, Apr 04, 2022 at 07:46:12AM -0000, Brian McCall wrote:
Now do it for NumPy arrays
In response to Greg:

[quoting Greg]
I'm not convinced there's a need for new syntax here.

    63*lbs
    77*km/hr

With appropriate definitions of lbs, km and hr these can be made to construct numbers with attached units.
[end quote]

Numpy arrays support array*scalar, which multiplies each element of the array by the scalar.

    >>> import numpy as np
    >>> arr = np.array([2, 3, 4, 5])
    >>> arr*1.5
    array([3. , 4.5, 6. , 7.5])
So we're part way there. However, I suspect that having an array of unit objects rather than low-level machine ints or floats will reduce the performance of numpy a lot. This is probably unavoidable: there is no way you can do numeric computations and track units as cheaply as doing numeric computations *without* tracking units. But performance should be the least of our concerns at this point. -- Steve
![](https://secure.gravatar.com/avatar/72ee673975357d43d79069ac1cd6abda.jpg?s=120&d=mm&r=g)
On 5/04/22 5:17 am, Steven D'Aprano wrote:
However, I suspect that having an array of unit objects rather than low-level machine ints or floats will reduce the performance of numpy a lot.
If numpy were to incorporate units, I would expect there to be just one unit tag for the whole array. -- Greg
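The "one unit tag for the whole array" idea can be sketched without numpy, using the standard-library `array` module for packed machine floats. `UnitArray` is a hypothetical name invented for this example and is not how numpy or astropy implement anything.

```python
from array import array


class UnitArray:
    """Toy container: packed C doubles plus ONE unit tag for all elements."""

    def __init__(self, values, unit):
        self.data = array("d", values)   # raw machine floats, no per-item objects
        self.unit = unit

    def __mul__(self, scalar):
        # Element-wise scaling; the single unit tag is carried along unchanged.
        return UnitArray((x * scalar for x in self.data), self.unit)

    def __repr__(self):
        return f"UnitArray({list(self.data)}, unit={self.unit!r})"


arr = UnitArray([2, 3, 4, 5], "m")
scaled = arr * 1.5
print(scaled)
```

The unit bookkeeping happens once per operation rather than once per element, so the per-element work stays the same as for an untagged array.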
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
Wow, this thread has grown quite a bit in the last two days. And there are some really good points raised alongside the light trolling here and there. While the discussion around implementation is important and very interesting, I think the question around motivation is critical. Since I see some new discussion around that topic, I'll start there.

Stephen J Turnbull and Paul Moore have asked why the "need" for something other than a library (or perhaps a better library). There are a number of examples that show simple unit calculations and it's easy to argue based on those that nothing further is needed. I think more complicated counter-examples could help push back against this, but complicated counter-examples might be protected IP, or they just might take a lot of work to talk all the way through. Still, I'd love to see a whole thread dedicated to this, because I really am curious to see examples that aren't my own. But I'll start with one example here. It might not be the most compelling, but I'll throw it out there because it's not protected and it has bitten me once.

Before I get to this example, though, there is more to arguments for the "need" than just counter-examples. I keep using quotes because nothing is really a need for anything. There isn't a need for built-in support for lists, sets, or even Python, but I sure am glad we have it. Let me say desirability instead. Desirability doesn't just come from things that programmers and CS majors care about, like readability, concise syntax, and self-explanatory code. Desirability for this feature also comes from an operations standpoint. I have made personnel decisions based on whether a candidate has ever used dot-notation in their code before, or if they have only ever used Fortran/Matlab style coding.
In regulated environments, risk analysis and mitigation is very much affected by whether a feature has native support or if it comes from a third party library (see https://en.wikipedia.org/wiki/Software_of_unknown_pedigree). That same risk analysis is also heavily impacted by looking at what the worst case scenario might be. And if you want an example of that, look no further than the pre-amble of https://pypi.org/project/units/ (which I believe led to astropy.units).
The Mars Climate Orbiter was intended to enter orbit at an altitude of 140-150 km (460,000-500,000 ft.) above Mars. However, a navigation error caused the spacecraft to reach as low as 57 km (190,000 ft.). The spacecraft was destroyed by atmospheric stresses and friction at this low altitude. The navigation error arose because a NASA subcontractor (Lockheed Martin) used Imperial units (pound-seconds) instead of the metric system.
I think that a thread of examples along these lines would also go a long way informing both high level and nuanced decisions about what the role of units should be in programming languages and libraries.

One last pontification before I get to my example relating to units. We already have examples of features that have both a native implementation and library extensions. int and float are primitives in Python. They are more than enough for most users, but limiting for quite a few other users. So modules like fractions and decimal provide extended support, and libraries like numpy provide even more data types for task-specific needs.

Alright, now let's look at an example. Again, it's not my best, but let's go with it. This is just a calculation of the expected number of photons to reach a pixel through a camera of a given f-number (F). I mentioned this has bitten me before. All that means is that based on a set of simulations, we thought something was possible, spent a few weeks building a prototype, got results that made no sense, and then realized that there was a unit error in the original feasibility analysis. That one was on me, and since I am a capable programmer, I ought to have been using a units package.

Symbol definitions:
h - Planck's constant
c - speed of light
Ee - irradiance
R - reflectance
Q - quantum efficiency
F - f-number
λ - wavelength
a - width of a pixel
t - exposure time
ȳ - output of luminosity function integral

From here, if the triple tick marks do not render this example in monospace for you, then I recommend copy/pasting into something that does.
```
echo no units
python -c "
h = 6.62607015e-34
c = 299792458
Ee = 200
R = 0.25
Q = 0.63
F = 2.4
λ = 550e-9
a = 3.45e-6
t = 30e-3
ȳ = 683
n = (Ee * R * Q * λ * t * a**2) / (2 * h * c * ȳ * F**2)
# n = (200 * 0.25 * 0.63 * 550e-9 * 30e-3 * (3.45e-6)**2) / (2 * h*c * 683 * 2.4**2)
print(n)
"
# Pros - compact, the code representing the equation is easily verifiable, and the magnitudes are also easily verifiable
# Cons - no units

echo What literals would look like
python -c "
# h = 6.62607015e-34m2kg / 1s
# c = 299792458m / 1s
# Ee = 200lx
# R = 0.25
# Q = 0.63
# F = 2.4
# λ = 550e-9nm
# a = 0.00345mm
# t = 30ms
# ȳ = 683lm / 1W
# n = (Ee * R * Q * λ * t * a**2) / (2 * h * c * ȳ * F**2)
# n = (200lx * 0.25 * 0.63 * 500nm * 30ms * (3450nm)**2) / (2 * h*c * 683lm/1W * 2.4**2)
"
# Pros - Still compact. Dead simple. Planck's constant looks a little weird, but this is usually imported from a library anyway
# Cons - requires a syntax change; inline computations like the last line are not IMO quite as readable as the next example

echo 'What bracket syntax might look like'
python -c "
# h = 6.62607015e-34 [m**2*kg/s]
# c = 299792458 [m/s]
# Ee = 200 [lx]
# R = 0.25
# Q = 0.63
# F = 2.4
# λ = 550e-9 [nm]
# a = 0.00345 [mm]
# t = 30 [ms]
# ȳ = 683 [lm/W]
# n = (Ee * R * Q * λ * t * a**2) / (2 * h * c * ȳ * F**2)
# n = (200[lx] * 0.25 * 0.63 * 500[nm] * 30[ms] * (0.00345[mm])**2) / (2 * h*c * 683[lm/W] * 2.4**2)
"
# Pros - Still compact, dead simple, and IMO the best way to look at this code
# Cons - requires a syntax change and a new kind of namespace in addition to global, nonlocal, and enclosure

echo units
python -c "
from units import unit
h = unit('m**2*kg/s') (6.62607015e-34)
c = unit('m/s') (299792458)
Ee = unit('lx') (200)
R = 0.25
Q = 0.63
F = 2.4
λ = unit('nm') (550)
a = unit('mm') (0.00345)
t = unit('ms') (30)
ȳ = unit('lm/W') (683)
n = (Ee * R * Q * λ * t * a**2) / (2 * h * c * ȳ * F**2)
print(n)
n = unit('lx')(200) * 0.25 * 0.63 * unit('nm')(550) * unit('mm')(0.00345)**2 * unit('ms')(30) / (2 * 2.4**2 * h*c) / unit('lm/W')(683)
print(n)
"
# Pros - Has units, and no syntax change required
# Cons - less compact
# Cons - How do you get the final answer? You need to know that units became astropy.units and see below
# Cons - Not dead simple. Multiple adjacent ()() is going to be unpopular with the crowd that uses Python as if it were Fortran/Matlab

echo astropy.units
python -c "
import astropy.units as units
h = 6.62607015e-34 * units.m**2*units.kg/units.s
c = 299792458 * units.m / units.s
Ee = 200 * units.lx
R = 0.25
Q = 0.63
F = 2.4
λ = 550 * units.nm
a = 0.00345 * units.mm
t = 30 * units.ms
ȳ = 683 * units.lm / units.W
n = (Ee * R * Q * λ * t * a**2) / (2 * h * c * ȳ * F**2)
print(n)
n = (200*units.lx * 0.25 * 0.63 * 550*units.nm * 30*units.ms * (0.00345*units.mm)**2) / (2 * h*c * 683*units.lm/units.W * 2.4**2)
print(n)
print(n.decompose())
"
# Pros - Has units, and no syntax change required
# Cons - less compact, everything is an instance of numpy.ndarray which will break some existing code
# Cons - .decompose() is required to convert final result
# Cons - dot-notation is better than ()(), but again will encounter resistance from those who use Python as if it were Fortran/Matlab

echo astropy.units again, but importing everything into globals
python -c "
from astropy.units import *
h = 6.62607015e-34 * m**2*kg/s
c = 299792458 * m/s
Ee = 200 * lx
R = 0.25
Q = 0.63
F = 2.4
λ = 550 * nm
a = 0.00345 * mm
t = 30 * ms
ȳ = 683 * lm/W
n = (Ee * R * Q * λ * t * a**2) / (2 * h * c * ȳ * F**2)
print(n)
n = (200*lx * 0.25 * 0.63 * 550*nm * 30*ms * (3450*mm)**2) / (2 * h*c * 683*lm/W * 2.4**2)
print(n)
print(n.decompose())
"
# Pros - Compact, has units, and no syntax change required
# Cons - Everything is an instance of numpy.ndarray which will break some existing code
# Cons - Can't use "m" as a counter (and other namespace collisions)
# Cons - .decompose() is required to convert final result
# Cons - dot-notation is better than ()(), but again will encounter resistance from those who use Python as if it were Fortran/Matlab

echo Quantiphy
python -c "
from quantiphy import Quantity
h = Quantity('6.62607015e-34 m2kg/s')
c = Quantity('299792458 m/s')
Ee = Quantity('200 lux')
R = 0.25
Q = 0.63
F = 2.4
λ = Quantity('550 nm')
ȳ = Quantity('683 lm/W')
a = Quantity('0.00345 mm')
t = Quantity('30 ms')
n = (Ee * R * Q * λ * t * a**2) / (2 * h * c * ȳ * F**2)
#n = I'm not going to bother.
print(n)
"
# Ugh
```
![](https://secure.gravatar.com/avatar/de311342220232e618cb27c9936ab9bf.jpg?s=120&d=mm&r=g)
On 4/6/22 08:50, Brian McCall wrote:
Alright, now let's look at an example. Again, it's not my best, let's go with it. This is just a calculation of the expected number of photons to reach a pixel through a camera of a given f-number (F). I mentioned this has bitten me before. All that means is that based on a set of simulations, we thought something was possible, spent a few weeks building a prototype, got results that made no sense, and then realized that there was a unit error in the original feasibility analysis. That one was on me, and since I am a capable programmer, I ought to have been using a units package.
Thank you for all the examples, that should help a lot. Out of curiosity, those are the corrected versions of the code? Where was the mistake in the original? -- ~Ethan~
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
Thank you for all the examples, that should help a lot. Out of curiosity, those are the corrected versions of the code? Where was the mistake in the original?
Happy to help! The original was unitless altogether, so that was the first problem. The particular error was λ. I typically work in Zemax, where the base unit of length is mm (or inch, if you change default settings). So my λ was 550e-6 instead of 550e-9. Funny thing is, my ȳ was a function that expects λ to have units of nm, and I did correct that. An exception told me that 550e-6 was out of bounds. But I did not think to check other dimensional mismatches. We built the prototype, saw the signal on the sensor that we expected to, but it was ~1000 times (~10 stops - another unit!) dimmer than expected. So basically useless.
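To put a number on that slip, here is a rough sketch that reruns the unitless formula from the example with both wavelengths. The function name `photons_per_pixel` and the ASCII variable names are made up for illustration; the constants are the ones posted in the thread, and everything is assumed to be a plain float in SI base units.

```python
import math

def photons_per_pixel(wavelength_m):
    """Unitless version of the posted formula; all inputs assumed SI."""
    h = 6.62607015e-34   # Planck's constant
    c = 299792458        # speed of light
    Ee = 200             # irradiance value from the example
    R = 0.25             # reflectance
    Q = 0.63             # quantum efficiency
    F = 2.4              # f-number
    a = 3.45e-6          # pixel width
    t = 30e-3            # exposure time
    y_bar = 683          # luminosity function output
    return (Ee * R * Q * wavelength_m * t * a**2) / (2 * h * c * y_bar * F**2)

n_intended = photons_per_pixel(550e-9)   # 550 nm expressed in metres
n_mistyped = photons_per_pixel(550e-6)   # same digits, but on a millimetre scale

ratio = n_mistyped / n_intended          # how far the estimate was off
stops = math.log2(ratio)                 # one stop is a factor of two

print(f"estimate off by a factor of about {ratio:.0f} (about {stops:.1f} stops)")
```

Because the formula is linear in λ, the three-orders-of-magnitude slip carries straight through to the photon count, which is consistent with the "~1000 times (~10 stops)" figure reported above.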
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Wed, Apr 06, 2022 at 04:54:15PM -0000, Brian McCall wrote:
Oops, my examples have some other problems:
# λ = 550e-9nm
should be
# λ = 550nm
This is an excellent example of why unit tracking, or any other programming system, is not a panacea. No programming system in the world is going to save you if you type x = 135 when it should be x = 315. -- Steve
![](https://secure.gravatar.com/avatar/d995b462a98fea412efa79d17ba3787a.jpg?s=120&d=mm&r=g)
On Wed, 6 Apr 2022 at 16:51, Brian McCall <brian.patrick.mccall@gmail.com> wrote:
Before I get to this example, though, there is more to arguments for the "need" than just counter-examples. I keep using quotes because nothing is really a need for anything.
Yes, it's not about "need" in an absolute sense, but more about whether the gains justify the costs. In this case, the costs are non-trivial, because there's very little prior art (at least that I know of) in other programming languages in this area. So there's a learning curve, making Python a bit less approachable for the average programmer, which offsets the benefits to people who gain from this sort of functionality. And that's on top of the usual costs of any syntax change/new language feature.
Alright, now let's look at an example. Again, it's not my best, let's go with it.
Thanks for this, it helps a lot to have something concrete.
Symbol definitions:
h - Planck's constant
c - speed of light
Ee - irradiance
R - reflectance
Q - quantum efficiency
F - f-number
λ - wavelength
a - width of a pixel
t - exposure time
ȳ - output of luminosity function integral
From here, if the triple tick marks do not render this example in monospace for you, then I recommend copy/pasting into something that does.
```
echo no units
python -c "
h = 6.62607015e-34
c = 299792458
Ee = 200
R = 0.25
Q = 0.63
F = 2.4
λ = 550e-9
a = 3.45e-6
t = 30e-3
ȳ = 683
n = (Ee * R * Q * λ * t * a**2) / (2 * h * c * ȳ * F**2)
```
Someone's already asked, so I know that the issue was with the value given to one of the constants, rather than with the formula. And I know all of this is intended to be read by specialists, not by the likes of me, but I wonder whether some comments might have helped here, as well? t = 30e-3 # exposure time (seconds) There's a subtle difference between "scripting" and "programming", and as a programmer, I'd almost certainly add comments like this. But if I was writing a script, I wouldn't. However, I'd generally not trust the output of a script as much as I would that of a program. (Jupyter notebooks fall somewhere in between, for what it's worth...)
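For illustration, the suggestion about comments might look something like the sketch below. The values are the ones from the "no units" example; the unit annotations are inferred from the symbol definitions and the later units-based versions of the script, so treat them as assumptions rather than the author's own annotations. ASCII names stand in for λ and ȳ.

```python
# Hypothetical annotated version of the "no units" script.
h = 6.62607015e-34   # Planck's constant (J*s)
c = 299792458        # speed of light (m/s)
Ee = 200             # irradiance, given in lx in the later examples
R = 0.25             # reflectance (dimensionless)
Q = 0.63             # quantum efficiency (dimensionless)
F = 2.4              # f-number (dimensionless)
lam = 550e-9         # wavelength (m), i.e. 550 nm
a = 3.45e-6          # pixel width (m), i.e. 0.00345 mm
t = 30e-3            # exposure time (s), i.e. 30 ms
y_bar = 683          # luminosity function output (lm/W)

# Expected number of photons reaching one pixel.
n = (Ee * R * Q * lam * t * a**2) / (2 * h * c * y_bar * F**2)
print(n)
```

Comments like these document intent but are not checked by anything, which is the gap the units libraries discussed in this thread try to close.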
# n = (200 * 0.25 * 0.63 * 550e-9 * 30e-3 * (3.45e-6)**2) / (2 * h*c * 683 * 2.4**2) print(n) "
# Pros - compact, the code representing the equation is easily verifiable, and the magnitudes are also easily verifiable
Given that the error was in the magnitude of one of the values, "magnitudes are easily verifiable" doesn't really seem that correct...
# Cons - no units
Agreed, this is just using Python as a glorified calculator. I understand that this is just an example, but I *am* curious, is the bulk of what you do simply calculations like this, or do your more complicated examples tend to be more like actual programs?
echo What literals would look like python -c " # h = 6.62607015e-34m2kg / 1s # c = 299792458m / 1s # Ee = 200lx # R = 0.25 # Q = 0.63 # F = 2.4 # λ = 550e-9nm # a = 0.00345mm # t = 30ms # ȳ = 683lm / 1W # n = (Ee * R * Q * λ * t * a**2) / (2 * h * c * ȳ * F**2) # n = (200lx * 0.25 * 0.63 * 500nm * 30ms * (3450nm)**2) / (2 * h*c * 683lm/1W * 2.4**2) " # Pros - Still compact. Dead simple. Planck's constant looks a little weird, but this is usually imported from a library anyway # Cons - requires a syntax change; inline computations like the last line are not IMO quite as readable as the next example
What would the output of "print(n)" be here? Presumably you'd be expecting some sort of calculation on the units, so you didn't just get something like 3958.0636423739215lxnm3msWs2/m2kglm Or is that sufficient for you (I note that it's the same as units provides, below)? Excuse me if I got the scale of the constant wrong, you changed the units between the no-units example and this one (t, for example was seconds and is now ms).
echo 'What bracket syntax might look like' python -c " # h = 6.62607015e-34 [m**2*kg/s] # c = 299792458 [m/s] # Ee = 200 [lx] # R = 0.25 # Q = 0.63 # F = 2.4 # λ = 550e-9 [nm] # a = 0.00345 [mm] # t = 30 [ms] # ȳ = 683 [lm/W] # n = (Ee * R * Q * λ * t * a**2) / (2 * h * c * ȳ * F**2) # n = (200[lx] * 0.25 * 0.63 * 500[nm] * 30[ms] * (0.00345[mm])**2) / (2 * h*c * 683[lm/W] * 2.4**2) "
# Pros - Still compact, dead simple, and IMO the best way to look at this code # Cons - requires a syntax change and a new kind of namespace in addition to global, nonlocal, and enclosure
This is basically just a different bikeshed colour for the previous example, I think. Is that right?
echo units python -c " from units import unit h = unit('m**2*kg/s') (6.62607015e-34) c = unit('m/s') (299792458) Ee = unit('lx') (200) R = 0.25 Q = 0.63 F = 2.4 λ = unit('nm') (550) a = unit('mm') (0.00345) t = unit('ms') (30) ȳ = unit('lm/W') (683) n = (Ee * R * Q * λ * t * a**2) / (2 * h * c * ȳ * F**2) print(n) n = unit('lx')(200) * 0.25 * 0.63 * unit('nm')(550) * unit('mm')(0.00345)**2 * unit('ms')(30) / (2 * 2.4**2 * h*c) / unit('lm/W')(683) print(n) "
This seems pretty readable to me, but I concede I'm a non-expert in this field. However, it's very hard to judge this without seeing the output. So I installed units and ran it, and I get 3958063642373921964032.00 lx * mm * mm * nm * ms / m**2*kg/s * m/s * lm/W Hmm. I'll take your word for it that this (a) is readable and useful to you, and (b) would have helped you spot the error you made.
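As a sanity check on that output, the large magnitude can be reconciled with the unitless result by hand. The sketch below is plain arithmetic, not the `units` library: it uses the magnitudes exactly as written in the example (550 nm, 30 ms, 0.00345 mm) and then applies the SI prefix factors that the unsimplified unit string leaves unapplied. The variable names are illustrative only.

```python
# SI prefix factors relative to the base units (m and s).
nm = 1e-9
mm = 1e-3
ms = 1e-3

h = 6.62607015e-34
c = 299792458

# Magnitudes as typed in the units example, with no prefix scaling applied.
raw = (200 * 0.25 * 0.63 * 550 * 30 * 0.00345**2) / (2 * h * c * 683 * 2.4**2)

# The printed unit string contains nm * ms * mm * mm in the numerator.
scale = nm * ms * mm**2

n = raw * scale
print(raw)   # on the order of 4e21, like the printed value
print(n)     # on the order of 4e3, like the "no units" script
```

So the two results agree once the prefixes are folded in; the library output simply leaves that reduction to the reader.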
# Pros - Has units, and no syntax change required
A significant pro, in my view. Not least because it means you can use it *right now*.
# Cons - less compact
I guess so, but compactness isn't typically regarded as a significant selling point for a proposal.
# Cons - How do you get the final answer? You need to know that units became astropy.units and see below
Please can you explain this to me? I don't know what you mean by "get the final answer", nor do I know how astropy.units is relevant. Units seems to be a perfectly acceptable library without astropy, is that not the case?
# Cons - Not dead simple. Multiple adjacent ()() is going to be unpopular with the crowd that uses Python as if it were Fortran/Matlab
It seems dead simple to me, and as for the adjacent parentheses, presumably you can name the units you need, like: ms = unit('ms') t = ms(30) It's not as close to natural language as "30 ms", but again, that's a fairly minor disadvantage in these types of discussion (and opens up a lot of debate about subjective issues like what "looks natural").
echo astropy.units python -c " import astropy.units as units h = 6.62607015e-34 * units.m**2*units.kg/units.s c = 299792458 * units.m / units.s Ee = 200 * units.lx R = 0.25 Q = 0.63 F = 2.4 λ = 550 * units.nm a = 0.00345 * units.mm t = 30 * units.ms ȳ = 683 * units.lm / units.W n = (Ee * R * Q * λ * t * a**2) / (2 * h * c * ȳ * F**2) print(n) n = (200*units.lx * 0.25 * 0.63 * 550*units.nm * 30*units.ms * (0.00345*units.mm)**2) / (2 * h*c * 683*units.lm/units.W * 2.4**2) print(n) print(n.decompose()) "
# Pros - Has units, and no syntax change required # Cons - less compact, everything is an instance of numpy.ndarray which will break some existing code # Cons - .decompose() is required to convert final result # Cons - dot-notation is better than ()(), but again will encounter resistance from those who use Python as if it were Fortran/Matlab
As has been mentioned, if you don't like "units." then "from astropy import units as U" and use "U.ms" or "from astropy.units import ms" and use ms directly. The fact that things are ndarrays is presumably because astropy is intended for use with numpy. I would not expect a library for general use to have such a constraint. I don't know why .decompose() is needed, but again that's an implementation detail of astropy.
echo astropy.units again, but importing everything into globals python -c " from astropy.units import * h = 6.62607015e-34 * m**2*kg/s c = 299792458 * m/s Ee = 200 * lx R = 0.25 Q = 0.63 F = 2.4 λ = 550 * nm a = 0.00345 * mm t = 30 * ms ȳ = 683 * lm/W n = (Ee * R * Q * λ * t * a**2) / (2 * h * c * ȳ * F**2) print(n) n = (200*lx * 0.25 * 0.63 * 550*nm * 30*ms * (3450*mm)**2) / (2 * h*c * 683*lm/W * 2.4**2) print(n) print(n.decompose()) "
# Pros - Compact, has units, and no syntax change required # Cons - Everything is an instance of numpy.ndarray which will break some existing code
Already noted above
# Cons - Can't use "m" as a counter (and other namespace collisions)
Don't use "import *" then, just import the names you need.
# Cons - .decompose() is required to convert final result
Already noted above.
# Cons - dot-notation is better than ()(), but again will encounter resistance from those who use Python as if it were Fortran/Matlab
Already noted above.
echo Quantiphy [...] # Ugh
No further comment :-) These examples were very useful. Thanks for providing them. I'm not trying to shoot them down by doing a point-by-point response, but I wanted to explain some of my thoughts. Overall, my feeling is that the existing library solutions might not be ideal, but they do broadly address the requirement, as I interpret it from your explanations. Yes, the notation may be suboptimal for an expert, but there's always going to be *some* level of compromise needed, and having to write "35 * mm" rather than "35 mm" doesn't seem impossible. On the other hand, it seems clear to me that the various libraries all have their problem points at the moment. Maybe that suggests that there's room for a unified library that takes the best ideas from all of the existing ones, and pulls them together into something that subject experts like yourself *would* be happy with (within the constraints of the existing language). And if new syntax is a clear win even with such a library, then designing a language feature that enables better syntax for that library would still be possible (and there would be a clear use case for it, making the arguments easier to make). I should say, though, that I doubt I'll ever use such a new syntax, or even the existing libraries, for anything other than experimentation. So treat my comments with that in mind. Paul
![](https://secure.gravatar.com/avatar/e6e28dcae5e3df0190e0760e96f7d8ab.jpg?s=120&d=mm&r=g)
On 2022-04-06 12:36, Paul Moore wrote:
And if new syntax is a clear win even with such a library, then designing a language feature that enables better syntax for that library would still be possible (and there would be a clear use case for it, making the arguments easier to make).
If folks are really hung up on the syntax limits, there is a shortcut to prototype new syntax using the codecs module to edit code on the fly. It's a cheat, but probably less work than writing a transpiler/dsl. Don't remember who invented the technique, but the last project I tried was to support f-strings under Python of pre-3.6 vintage, called future-fstrings: https://github.com/asottile-archive/future-fstrings Should be enough there to show how it's done. Why not try this technique, combine with one of the libraries, and start writing some code? If it works well enough and enough folks are interested, the proposal would have all the evidence it needs, imho. -Mike
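As a very rough illustration of the source-rewriting idea, the sketch below turns the bracket notation from the earlier examples into ordinary multiplication before the code is executed. It shows only a text-rewriting step with `re` and `exec`; it does not register a codec and is not a description of how future-fstrings works. The function name `rewrite_bracket_units` and the scale-factor namespace are assumptions made for the example.

```python
import re

# Matches a numeric literal followed by a bracketed unit expression, e.g. 30[ms].
_BRACKET_UNIT = re.compile(
    r"(?<![\w.])"                         # not in the middle of a name or number
    r"(\d+(?:\.\d*)?(?:[eE][+-]?\d+)?)"   # the numeric literal
    r"\s*\[([^\]]+)\]"                    # the bracketed unit expression
)

def rewrite_bracket_units(source):
    """Rewrite `30[ms]` as `(30 * (ms))`; ordinary indexing is left alone."""
    return _BRACKET_UNIT.sub(r"(\1 * (\2))", source)

# Stand-in "units": plain scale factors to SI base units. A real prototype
# would bind these names to objects from one of the units libraries instead.
namespace = {"nm": 1e-9, "mm": 1e-3, "ms": 1e-3}

source = "t = 30[ms]\na = 0.00345 [mm]\nx = [1, 2][0]"
rewritten = rewrite_bracket_units(source)
print(rewritten)

exec(rewritten, namespace)
print(namespace["t"], namespace["a"], namespace["x"])
```

A regular expression like this will also rewrite matches inside string literals and comments, so a serious prototype would more likely work on tokens; the point here is only that the syntax can be tried out without touching the interpreter.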
![](https://secure.gravatar.com/avatar/ca465da45735c9efed28478928fa9fbe.jpg?s=120&d=mm&r=g)
On Wed, Apr 6, 2022 at 7:05 PM Mike Miller <python-ideas@mgmiller.net> wrote:
On 2022-04-06 12:36, Paul Moore wrote:
And if new syntax is a clear win even with such a library, then designing a language feature that enables better syntax for that library would still be possible (and there would be a clear use case for it, making the arguments easier to make).
If folks are really hung up on the syntax limits, there is a shortcut to prototype new syntax using the codecs module to edit code on the fly. It's a cheat, but probably less work than writing a transpiler/dsl.
Don't remember who invented the technique, but the last project I tried was to support f-strings under Python of pre-3.6 vintage, called future-fstrings:
There is also my ideas module https://aroberge.github.io/ideas/docs/html/ which I created to facilitate this type of exploration. In fact, I made a note just two days ago about doing this myself https://github.com/aroberge/ideas/issues/16 André
Should be enough there to show how it's done.
Why not try this technique, combine with one of the libraries, and start writing some code? If it works well enough and enough folks are interested, the proposal would have all the evidence it needs, imho.
-Mike
_______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/MKZW5U... Code of Conduct: http://python.org/psf/codeofconduct/
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
Mike and Andre, thanks for the links! I was indeed planning to take a crack at implementing some ideas. First, for my part, I need to get approval from the people who own my intellectual property ;-) I also figured it would take me a while to sift through the Python code, which I have never done before. But if there is a shortcut I'll gladly take it.
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
Agreed, this is just using Python as a glorified calculator. I understand that this is just an example, but I *am* curious, is the bulk of what you do simply calculations like this, or do your more complicated examples tend to be more like actual programs?
I have never shipped code that would depend on these features. Code like this, including this exact formula, might sit in a script and be used to calculate normalization factors and feed into a noise model. That script might look at one complete concept that I want to test out in a theoretical sense. In other cases, I'll make one or more of those quantities out to be variables that I pass in via command line or config file. This allows me to explore a system design space. I used to ship code when I worked in telecom ~15 years ago, and that was mostly C++. Units were not a concern. I shipped C++ and Python code 2-5 years ago at another company (startup) which did not require units, but I also did my feasibility assessments with stuff like this that did depend on units.
Given that the error was in the magnitude of one of the values, "magnitudes are easily verifiable" doesn't really seem that correct...
... ah, I mean the mantissa, not the magnitude
Please can you explain this to me? I don't know what you mean by "get the final answer", nor do I know how astropy.units is relevant. Units seems to be a perfectly acceptable library without astropy, is that not the case?
Am I mistaken, or is the units module no longer maintained? I could not find any documentation for it. Which is also a con. I honestly thought that "units" took on a new life as "astropy.units".
As has been mentioned, if you don't like "units." then "from astropy import units as U" and use "U.ms" or "from astropy.units import ms" and use ms directly.
As an fyi, lots of single-letter variables are commonly used to match variables in physics and engineering equations. U and u are of particular importance in my field.
Don't use "import *" then, just import the names you need.
Still rules out a lot of commonly used variables. ¯\_(ツ)_/¯
Maybe that suggests that there's room for a unified library that takes the best ideas from all of the existing ones, and pulls them together into something that subject experts like yourself *would* be happy with (within the constraints of the existing language)
Just to point out, I don't make this suggestion just for my benefit. I can handle all of the syntax limitations that I pointed out. I am in a somewhat unique position of having a BS in computer engineering, but I have worked for 10+ years now in the optics industry, along with physicists, electrical engineers, and mechanical engineers. I was taught in school all about OOP, data structures, all that stuff. But most of the people I know today who write code do not have any formal training in this area. There are varying ranges of comfort with things like logical indexing, dot-notation (which you and I take for granted) and functions as objects that can be returned by other functions. I have seen code that looks like spaghetti served in a bowl that was shattered and put back together with duct tape. And some of it is related to units. If I make suggestions like what you have made, the response I tend to get is "my code works". Which would be fine with me if I never had to use their code. Others who feel strongly about units, or dual threat CS/ECE types who have to work with self-taught programmers with physics or chemistry backgrounds may have other stories.
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Thu, 7 Apr 2022 at 10:19, Brian McCall <brian.patrick.mccall@gmail.com> wrote:
Agreed, this is just using Python as a glorified calculator. I understand that this is just an example, but I *am* curious, is the bulk of what you do simply calculations like this, or do your more complicated examples tend to be more like actual programs?
I have never shipped code that would depend on these features. Code like this, including this exact formula, might sit in a script and be used to calculate normalization factors and feed into a noise model. That script might look at one complete concept that I want to test out in a theoretical sense. In other cases, I'll make one or more of those quantities out to be variables that I pass in via command line or config file. This allows me to explore a system design space.
You have a VERY restricted definition of "shipped", then. If that code did some calculation and that calculation was used, then that's production code. ChrisA
![](https://secure.gravatar.com/avatar/d995b462a98fea412efa79d17ba3787a.jpg?s=120&d=mm&r=g)
On Thu, 7 Apr 2022 at 01:18, Brian McCall <brian.patrick.mccall@gmail.com> wrote:
Please can you explain this to me? I don't know what you mean by "get the final answer", nor do I know how astropy.units is relevant. Units seems to be a perfectly acceptable library without astropy, is that not the case?
Am I mistaken, or is the units module no longer maintained? I could not find any documentation for it. Which is also a con. I honestly thought that "units" took on a new life as "astropy.units".
I honestly have no idea. I simply did `pip install units` and it worked. If the units module is no longer maintained, then who would write (and maintain) a module that worked with any core python syntax to allow units to be added to quantities? (Note that any core feature would simply allow something like 3_ft to be written to mean the number 3, with some object associated with the name "ft" attached to it - it would be up to a 3rd party module to actually implement the units calculations). It doesn't look like the astropy people want to maintain a generalised units library if they are solely working with numpy arrays. Also, you didn't explain what you meant by "get the final answer" - I'm still not clear with what you want beyond doing the calculation as you showed in your code.
As has been mentioned, if you don't like "units." then "from astropy import units as U" and use "U.ms" or "from astropy.units import ms" and use ms directly.
As an fyi, lots of single-letter variables are commonly used to match variables in physics and engineering equations. U and u are of particular importance in my field.
Don't use "import *" then, just import the names you need.
Still rules out a lot of commonly used variables. ¯\_(ツ)_/¯
If your community typically uses short and frequently clashing names, relying on context and intuition to distinguish them, there's nothing a library or language feature can do to help with that... As I said, there have to be some compromises made. Paul
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Wed, Apr 06, 2022 at 08:36:35PM +0100, Paul Moore wrote:
compactness isn't typically regarded as a significant selling point for a proposal.
I would word that differently. Compactness *alone* isn't typically regarded as a *compelling* selling point, and *excessive* compactness is regarded as a negative. ("Excessive" may be subjective. But we're not trying to emulate Perl or APL.)
It's not as close to natural language as "30 ms", but again, that's a fairly minor disadvantage in these types of discussion (and opens up a lot of debate about subjective issues like what "looks natural").
And then you have the problem that units in natural language are ambiguous. 5ft 6in is not an area. The Unix program units tries to deal with the ambiguity by introducing a second division symbol, |, and Frink deals with it by just telling the user to use parentheses where needed.
I don't know why .decompose() is needed, but again that's an implementation detail of astropy.
Correction: it is part of the astropy interface. Other libraries have different interfaces. -- Steve
![](https://secure.gravatar.com/avatar/c6313f579e12a3332028d33fe6c0814f.jpg?s=120&d=mm&r=g)
Previously you were adamant that it is important that the units be propagated into the final result. In my experience this rarely works out as one expects because the units come out in an overly complicated, confusing, or unexpected form. For example, you gave the units for Planck's constant as m²kg/s, which to me was confusing and unexpected. It took me a while to confirm that was equivalent to J-s, which is the more traditional way of specifying its units. To show the value of what you are proposing, you should give the units of the final result. The fact that they are missing significantly weakens your example. Given your previous comments about the importance of propagating the units, it is a little weird that in your one example you don't bother to show it. It is unclear what these examples are meant to show. Is the point to compute the final units of n? If so, why did you not show them? If not, why were the units specified at all? Are they primarily there for documentation purposes? If so, can't you just put them in comments? If the units need to be in the code because they are employed somehow, you should show that. I am still not understanding your justification for why you want to add units to Python. This application seems like a one-time hand calculation rather than a programming problem. Perhaps a better approach would simply be to build a units aware scientific calculator application. That way you could design the best way to include units without any constraints from the Python calculator. -Ken
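Both questions here (is m²kg/s the same as J·s, and what are the units of n) can be checked with simple exponent bookkeeping. The sketch below is not any particular library: dimensions are plain dicts of base-unit exponents, the helper names are invented, and lumen is treated as if it were a base unit purely to keep the example short (in SI it is defined via the candela).

```python
def mul(*dims):
    """Multiply dimensions by adding exponents; drop anything that cancels."""
    out = {}
    for d in dims:
        for base, exp in d.items():
            out[base] = out.get(base, 0) + exp
    return {base: exp for base, exp in out.items() if exp != 0}

def power(d, n):
    """Raise a dimension to an integer power."""
    return {base: exp * n for base, exp in d.items()}

def inv(d):
    """Reciprocal of a dimension."""
    return power(d, -1)

M, KG, S, LM = {"m": 1}, {"kg": 1}, {"s": 1}, {"lm": 1}
JOULE = mul(KG, power(M, 2), power(S, -2))   # J = kg m^2 / s^2
WATT = mul(JOULE, inv(S))                    # W = J / s
LUX = mul(LM, power(M, -2))                  # lx = lm / m^2

# Planck's constant as posted (m^2 kg / s) versus the traditional J*s.
planck_as_posted = mul(power(M, 2), KG, inv(S))
planck_joule_second = mul(JOULE, S)
print(planck_as_posted == planck_joule_second)

# n = (Ee * R * Q * lambda * t * a**2) / (2 * h * c * y_bar * F**2);
# R, Q, F and the factor 2 are dimensionless and are omitted.
numerator = mul(LUX, M, S, power(M, 2))                               # Ee, lambda, t, a**2
denominator = mul(planck_as_posted, mul(M, inv(S)), mul(LM, inv(WATT)))  # h, c, y_bar
n_dimension = mul(numerator, inv(denominator))
print(n_dimension)
```

Under those assumptions every exponent cancels, so n comes out as a pure count, which is what one would hope for a number of photons.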
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Wed, Apr 06, 2022 at 11:48:44PM -0700, Ken Kundert wrote:
Perhaps a better approach would simply be to build a units aware scientific calculator application.
The Python REPL makes an awesome interactive calculator. Sagemath has done what you suggest: it is a powerful symbolic maths calculator written in Python and Cython. And it has units: https://doc.sagemath.org/html/en/reference/calculus/sage/symbolic/units.html -- Steve
![](https://secure.gravatar.com/avatar/8da339f04438d3fcc438e898cfe73c47.jpg?s=120&d=mm&r=g)
Brian McCall writes:
Stephen J Turnbull and Paul Moore have asked why the "need" for something other than a library (or perhaps a better library). There are a number of examples that show simple unit calculations and it's easy to argue based on those that nothing further is needed.
Nobody is arguing that, though. We ask *because* we believe you perceive need (and we don't put it in quotes, for exactly the reason you give -- we do not define "need" as "if I don't get it I will literally die"). You don't need to prove your need to Python core. You need to show two things. (1) The particular solution proposed solves problems that already available approaches don't (including "too inconvenient to consistently use in practice" which might apply to the library approach vs the user-defined literal approach). (2) There are enough people with this need to outweigh the slight costs to all Python users of maintaining the facility, documenting it, and learning it, and possibly preventing a more valuable use of the same syntax.
In regulated environments, risk analysis and mitigation is very much affected by whether a feature has native support
That you say this after citing Paul of all people gave me a chuckle, as he's probably the most prominent advocate of a comprehensive standard library on these lists. If anybody will be sympathetic to your needs on these grounds, he will. (I don't expect you to know that, but it might be useful to know you have a potential ally.)

It's important to distinguish between several levels of "native support" here.

0. Adding units to all numeric objects.

I don't think level 0 is a very good idea. I think it was Mac Lane who pointed out that the solution to a problem expressed in mathematics starts by giving a formal statement of the problem, manipulates the form according to syntactic rules, and finally interprets the final form as a semantic solution to the problem. Similarly, as long as the expected units of the result, and the units output by a Python program are the same, the numbers don't need to have units. I suspect that from the point of view of astropy, attaching units to ndarray rather than atomic numeric types is optimal for this reason, not a workaround.

1. Syntax, aka language support (the original suggestion of user-defined literals fits here because either the way numbers are parsed or the way '_' is parsed would have to change).

It's very difficult to get *any* syntax change in. In particular, changing '_' from an identifier component to an operator for combining numeric literals would invalidate *tons* of code (including internationalization code that is the 0.454kg nearest my heart). I can't imagine that being possible. There was a fair amount of pushback to using it as the non-capturing marker in the match statement. Changing its interpretation only in a numeric literal is probably possible, since as far as I know it's currently an error to follow a sequence of digits with _ without interposing an operator or whitespace.

NOTE: The big problem I see with this is that I don't see any practical way to use syntax for non-numeric types like ndarray. The problem is that in, say, economics we use a lot of m x 2 arrays where the two columns have different units (ie, a quantity unit and a $/quantity unit). The sensible way to express this isn't some kind of appended syntax as with numbers, but rather a sequence of units corresponding to the sequence of columns. If I'm right about that, the only time the literal syntax would be used is when using Python interactively as a calculator. Any time you read from a string or file, you have to interpose a parsing step anyway, so it makes sense to handle the construction of quantities there.

2. Built-in, that is pre-loaded in the interpreter. These are implemented as library functionality in the builtins module.

I would definitely be against level 2. It would pollute the builtin namespace to the great detriment of all non-numeric applications.

3. Modules in the standard library. Always available, but must be explicitly imported to be used.

I think adding a module that provides units for the stdlib numeric types is a very interesting idea. I doubt I'd ever use it, and I'm not sure that the requirements can be well-enough specified at the moment to go to +1 on it. (As the Zen says, "Although never is often better than *right* now.")

For me, level 4 (available on PyPI) is good enough. I can think of cases where it would be useful for class demos to be able to interrogate a computation for its use of units, but that's about it. So I depend on descriptions of proponents' use cases to sort out the question of level 1 vs. level 3.
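To illustrate the point about the parsing step, here is a minimal sketch of turning text such as "30 ms" into a value plus a unit label at the place where input is read. The function `parse_quantity`, the column-unit tuple, and the sample figures are all hypothetical; a real program would presumably hand the pieces to whichever units library it uses rather than keep bare tuples.

```python
import re

# A number, optional whitespace, then an optional unit label.
_QUANTITY = re.compile(
    r"^\s*([+-]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?)\s*(\S.*?)?\s*$"
)

def parse_quantity(text):
    """Split text like '30 ms' into (30.0, 'ms'); unit is '' if absent."""
    match = _QUANTITY.match(text)
    if match is None:
        raise ValueError(f"not a quantity: {text!r}")
    value, unit = match.groups()
    return float(value), unit or ""

print(parse_quantity("30 ms"))
print(parse_quantity("683 lm/W"))
print(parse_quantity("0.25"))

# For tabular data, units are stated once per column rather than per number.
column_units = ("kg", "$/kg")            # made-up quantity and price columns
rows = [(2.0, 3.50), (4.0, 3.25)]        # made-up sample data
total_cost = sum(quantity * price for quantity, price in rows)   # kg * $/kg
print(total_cost)
```

In both cases the units enter at the data boundary, so literal syntax in the source code would not be involved.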
The Mars Climate Orbiter [...] navigation error arose because a NASA subcontractor (Lockheed Martin) used Imperial units (pound-seconds) instead of the metric system.
These examples don't need to be repeated over and over again, though. What is needed is to show that a particular proposal is especially effective in preventing them compared to others. This is what I haven't seen.
One last pontification before I get to my example relating to units. We already have examples of features that have both a native implementation and library extensions. int and float are primitives in Python. They are more than enough for most users, but limiting for quite a few other users. So modules like fractions and decimal provide extended support, and libraries like numpy provide even more data types for task-specific needs.
?? That analogy looks like an argument for users who need units to get facilities suited to their needs in libraries like numpy and units.
That one was on me, and since I am a capable programmer, I ought to have been using a units package.
This is the big question, to me: will people be so much more likely to use units with syntactic support than they are with something like the units package plus a simple facility for getting the great majority of units they use with one import or function call?
echo What literals would look like
python -c "
# h = 6.62607015e-34m2kg / 1s
That's pretty unreadable IMO. And how do you distinguish k(m2) from (km)2? Is the latter always going to be the natural reading?
echo 'What bracket syntax might look like'
python -c "
# n = (200[lx] * 0.25 * 0.63 * 500[nm] * 30[ms] * (0.00345[mm])**2) / (2 * h*c * 683[lm/W] * 2.4**2)
"
This is quite readable for me. The unit "sticks to" the number nicely. I guess the parentheses in "(0.00345[mm])**2" are unnecessary? This also has the k(m2) vs (km)2 issue, if it's real.
echo units
python -c "
from units import unit
h = unit('m**2*kg/s') (6.62607015e-34)
[...]
n = unit('lx')(200) * 0.25 * 0.63 * unit('nm')(550) * unit('mm')(0.00345)**2 * unit('ms')(30) / (2 * 2.4**2 * h*c) / unit('lm/W')(683)
print(n)
"
# Pros - Has units, and no syntax change required
# Cons - How do you get the final answer?
I don't understand the question:
>>> unit('m')(1) * unit('kg')(1)
Quantity(1, ComposedUnit([LeafUnit('m', False), LeafUnit('kg', False)], [], 1))
>>> str(unit('m')(1) * unit('kg')(1))
'1.00 m * kg'
You need to know that units became astropy.units and see below
This isn't a con, because you're proposing to change the Python distribution. Some form of some library would be added to the stdlib. If you can't do that, there's no way you're going to get syntax to support not doing it. :-)
# Cons - less compact
# Cons - Not dead simple. Multiple adjacent ()() is going to be
#        unpopular with the crowd that uses Python as if it were
#        Fortran/Matlab
Both of these can be addressed with a prologue

    mps = unit('m/s')
    lx = unit('lx')

and so on. I would expect that various fields would develop their own abbreviations that would be commonly used.

You missed one "con", here. This won't handle SI prefixes in a natural way. I guess the unit() constructor can do it, but it seems pretty nasty from the point of view of a Pythonista to have to define all of m, km, mm, g, kg, and mg when you could get away with m and g, and k and m, and obviously it gets a lot worse as we go to the full suites of prefixes and units.
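To make the prologue idea and the SI-prefix concern concrete, here is a small sketch that generates prefixed units from a table of bases and prefixes instead of defining each one by hand. It is not the API of the `units` package quoted above: `Unit`, `Quantity`, `PREFIXES`, `BASES` and `UNITS` are hypothetical names assumed for the example, and only three prefixes and two bases are covered.

```python
from dataclasses import dataclass


# Hypothetical minimal stand-in for a units library.
@dataclass(frozen=True)
class Unit:
    symbol: str
    scale: float  # multiple of the unprefixed unit
    base: str     # symbol of the unprefixed unit, e.g. "m" or "g"

    def __call__(self, value: float) -> "Quantity":
        return Quantity(value, self)


@dataclass(frozen=True)
class Quantity:
    value: float
    unit: Unit

    def to(self, other: Unit) -> "Quantity":
        """Convert between units that share the same unprefixed base."""
        if other.base != self.unit.base:
            raise ValueError(
                f"cannot convert {self.unit.symbol} to {other.symbol}"
            )
        return Quantity(self.value * self.unit.scale / other.scale, other)


PREFIXES = {"k": 1e3, "": 1.0, "m": 1e-3}
BASES = ("m", "g")

# Build km, m, mm, kg, g and mg from two bases and three prefixes.
UNITS = {
    prefix + base: Unit(prefix + base, factor, base)
    for base in BASES
    for prefix, factor in PREFIXES.items()
}

# The "prologue": bind only the names a given script actually needs.
km, m, mm = UNITS["km"], UNITS["m"], UNITS["mm"]
kg = UNITS["kg"]

distance = km(3).to(m)
```

The table-driven approach keeps the number of definitions at (bases + prefixes) rather than (bases x prefixes), though a script still has to choose which generated names to bind, which is where collisions such as `m` come in.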
echo astropy.units again, but importing everything into globals
python -c "
from astropy.units import *
[...]
# Cons - Can't use "m" as a counter (and other namespace collisions)
You could limit the name collisions by having field-specific submodules. Losing 'm' would be painful, though. I guess you could use "meter" instead, and ask those who want the abbreviated unit to assign m = meter. But most of the comments above aren't particularly important. The key question remains: How often would the syntax prevent an error that adding an appropriate units library to the stdlib would not? Among other things, in a later post you wrote
Oops, my examples have some other problems:
# λ = 550e-9nm
should be
# λ = 550nm
I can easily see myself making this mistake: hearing "lambda equals 550 nanometers", writing "λ = 550e-9", thinking "I really should add the explicit units", and appending "nm" because that's the unit I heard. Explicit units won't fix that, nor will syntax for explicit units. So you need to make sure that when people say "I make units mistakes a lot" they mean "I think 1000kg as I enter '1000' into a text box that expects lb", not "my SI prefixes frequently duplicate my engineering notation exponents".
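The size of that double-scaling mistake is easy to work out by hand. The snippet below is plain arithmetic with no units library involved; the variable names are just labels chosen for the example.

```python
NANO = 1e-9  # SI prefix "n"

intended_m = 550 * NANO      # 550 nm expressed in metres
mistaken_m = 550e-9 * NANO   # "550e-9 nm": exponent and prefix both applied

# Both values are valid lengths, so a dimensional check accepts either one;
# they simply differ by the factor that was applied twice.
ratio = intended_m / mistaken_m
```

The ratio comes out at roughly a billion, and because both quantities have the dimension of length, neither a library nor a literal syntax would flag the second one.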
![](https://secure.gravatar.com/avatar/176220408ba450411279916772c36066.jpg?s=120&d=mm&r=g)
After reading this long thread, a few notes:

[preface: I write this as someone whose entire career is based on doing computation with Python, and with systems where physical units are critical and complex, and who maintains a fair bit of my own custom unit-oriented code -- this is not hypothesis]

1) It is absolutely absurd to even talk about this without consideration of the scipy community -- there has been some mention, but it seems to have been largely skipped over.

I think this whole thread started (or at least it was mentioned early on) with a note that Python could be THE engineering tool (which now that I say it, was a quote from a Professor of mine from 25 years ago -- applying to MATLAB :-) ) if it included good unit handling. But it's not actually Python that has become so widely used in the engineering/science/data analysis world -- it's the "Scipy Stack" / "numpy ecosystem", whatever you want to call it.

Along those lines, there was some talk about astropy.units about how they only work with numpy arrays, and that's a limitation -- but it's really not a critical one. The fact is that anything that doesn't work with numpy arrays is dead in the water for widespread use in the engineering/science community. Whereas not working with the built in scalar types is a minor limitation.

2) As for the SciPy community, I've lost track a bit of the current status, but about ten years ago there was a nice talk at the SciPy conference about handling physical units -- the conclusions at that point were:

* There are a dozen or so libraries to do it
* a few of those are pretty darn good
* there is no obvious "best" one.

Given the continued existence of astropy.units, and pint, and SciMath units, and wrappers around C libs, like udunits, and ..... I think we're still in that state.

What does this tell us? Well, the scipy community is full of a lot of really smart people that are trying to get real work done, and tend to be very cooperative: if there was one clear winner, it would have risen to the top by now. Which tells me that unit handling is very, very, hard, and that there may be no one solution that is suitable for all (or even most) applications.

3) I am having a really hard time figuring out why folks think this needs to be built into the language -- having a nice compact and clear literal for numbers-with-units is great, but really only matters for UIs, interactive use, things like Jupyter Notebooks, and maybe quick one-off scripts. And all those can be handled with some sort of pre-processor -- either in the application, or using the existing import hooks or codec system.

4) I can see the appeal of adding some sort of syntax or something to the language, and then letting third-party libraries actually implement it. But I'm wary of this -- it reminds me of adding annotations to Python without specifying how they should be used -- and now we've discovered that it really is important to have a standard type system, and in fact, are moving towards breaking other uses of annotations. Let's not go down that path again.

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
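The pre-processor idea in point 3 could look something like the sketch below, which covers only the source-rewriting step and not the import-hook or codec plumbing. It assumes the bracket notation from earlier in the thread (`500[nm]`); `preprocess` and the constructor name `Q` are hypothetical, and the regular expression is deliberately naive (it would also rewrite matches inside string literals and comments).

```python
import re

# A numeric literal immediately followed by [unit], e.g. 500[nm].
# The lookbehind stops matches inside names or longer numbers such as x1[i].
_LITERAL_RE = re.compile(
    r"(?<![\w.])(\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)\[([^\]\[]+)\]"
)


def preprocess(source: str) -> str:
    """Rewrite 500[nm] into Q(500, 'nm'); other text is left untouched."""
    return _LITERAL_RE.sub(r"Q(\1, '\2')", source)


def Q(value, unit):
    """Placeholder constructor; a real tool would call a units library."""
    return (value, unit)


# An application or notebook plug-in would run the rewritten source
# with Q available in the namespace.
namespace = {"Q": Q}
exec(preprocess("wavelength = 500[nm]\nexposure = 30[ms]"), namespace)
```

A production version would more likely work on the token stream (for example with the `tokenize` module) so that strings and comments are skipped, but the principle is the same: the notation lives in the tool, and the interpreter only ever sees ordinary calls.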
![](https://secure.gravatar.com/avatar/d67ab5d94c2fed8ab6b727b62dc1b213.jpg?s=120&d=mm&r=g)
On Fri, 8 Apr 2022 at 00:48, Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
1. Syntax, aka language support (the original suggestion of user-defined literals fits here because either the way numbers are parsed or the way '_' is parsed would have to change).
It's very difficult to get *any* syntax change in. In particular, changing '_' from an identifier component to an operator for combining numeric literals would invalidate *tons* of code (including internationalization code that is the 0.454kg nearest my heart). I can't imagine that being possible.
>>> 123_456
123456
>>> 123_456_
  File "<stdin>", line 1
    123_456_
           ^
SyntaxError: invalid decimal literal
Trailing underscore is currently invalid, so that would be fine. ChrisA
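The same check can be scripted rather than typed at the prompt. The helper below is a small sketch (the name `is_valid_expression` is made up for the example) that asks the compiler whether a piece of text is currently a valid expression.

```python
def is_valid_expression(source: str) -> bool:
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


digit_separator = is_valid_expression("123_456")    # accepted today
trailing_underscore = is_valid_expression("123_456_")  # rejected today
suffixed_literal = is_valid_expression("63_lbs")    # rejected today
```

Forms that are rejected today are the ones that could be given a new meaning without changing the behaviour of existing valid code.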
![](https://secure.gravatar.com/avatar/8da339f04438d3fcc438e898cfe73c47.jpg?s=120&d=mm&r=g)
Chris Angelico writes:
It's very difficult to get *any* syntax change in. In particular, changing '_' from an identifier component to an operator for combining numeric literals would invalidate *tons* of code (including internationalization code that is the 0.454kg nearest my heart). I can't imagine that being possible.
>>> 123_456
123456
>>> 123_456_
  File "<stdin>", line 1
    123_456_
           ^
SyntaxError: invalid decimal literal
Trailing underscore is currently invalid, so that would be fine.
I really meant "operator", not just syntax. It was an extreme example.
![](https://secure.gravatar.com/avatar/cdc87637918eccd37ca88e9079e73705.jpg?s=120&d=mm&r=g)
On Thu, Apr 7, 2022 at 10:51 AM Stephen J. Turnbull < stephenjturnbull@gmail.com> wrote:
NOTE: The big problem I see with this is that I don't see any practical way to use syntax for non-numeric types like ndarray. The problem is that in, say, economics we use a lot of m x 2 arrays where the two columns have different units (ie, a quantity unit and a $/quantity unit). The sensible way to express this isn't some kind of appended syntax as with numbers, but rather a sequence of units corresponding to the sequence of columns.
Here's a few minutes of work by me:
>>> df = pd.DataFrame([['Apples', 11, 2], ['Pears', 12, 3]],
...                   columns=[Units("Item", "Kind-of-Fruit"),
...                            Units("Per-Bushel", "USD/bushel"),
...                            Units("Number")])
...
>>> df
     Item  Per-Bushel  Number
0  Apples          11       2
1   Pears          12       3
>>> df.columns[1].units
'USD/bushel'
I did this by defining:
>>> class Units(str):
...     def __new__(cls, s, units="Dimensionless"):
...         return str.__new__(cls, s)
...     def __init__(self, s, units="Dimensionless"):
...         self.units = units
If this were actually needed, I'm sure better interfaces could be created. But this very simple thing is a possible way to retain units per-column. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Wed, Apr 06, 2022 at 03:50:35PM -0000, Brian McCall wrote:
Stephen J Turnbull and Paul Moore have asked why the "need" for something other than a library (or perhaps a better library). There are a number of examples that show simple unit calculations and it's easy to argue based on those that nothing further is needed. I think more complicated counter-examples could help push back against this,
Thanks for the extensive example, but it is still a *simple* unit calculation. It is basically just a longer version of `n = b*c/d`, only with ten variables instead of three.

All of the library's formats seem reasonable to me in the example you give.

Syntax is important, but in this case all you are doing is binding a literal quantity with its unit to a name, and for that purpose, the syntax is not that important. What does it *really* matter which of these you write?

    h = 6.62607015e-34[m**2*kg/s]
    h = unit('m**2*kg/s')(6.62607015e-34)
    h = 6.62607015e-34 * units.m**2*units.kg/units.s
    h = 6.62607015e-34 * m**2*kg/s
    h = Quantity('6.62607015e-34 m2kg/s')
    h = unit(6.62607015e-34, "m2kg/s")

Yes, you have to learn a slightly different notation for each, but every branch of science and engineering has its own notation and conventions which need to be learned. For constants, that's just arguing over the colour of the bikeshed.

The aim of your post here was to try to justify why units have to be built into the language, rather than supported via a library. You have done a reasonable job of showing that calculations with units are useful, but nobody denied that. Regardless of the individual strengths and weaknesses of the existing libraries, you have shown nothing to justify why unit support must be built into the language itself.

-- Steve
![](https://secure.gravatar.com/avatar/6d6150353bc4f27822f669a36559ec13.jpg?s=120&d=mm&r=g)
What does it *really* matter which of these you write?
that's just arguing over the colour of the bikeshed.
you have shown nothing to justify why unit support must be built into the language itself.
I did what I could, but I'm not going to try and justify any more. At the end of the day, units are a core part of science and engineering. Scientists and engineers are freaking passionate about units. More and more of them are also being expected to write code to do their jobs. To me that says it's going to happen at some point. It might be 10-20 years, and it might not be Python. But there will be a programming language, likely built on what Python is today, with native support for SI base units and derivatives. The unit torque vs unit energy problem will also be resolved. With support at the native level, the language syntax, parsers, standard library and third party libraries will have an enhanced ability to support any locale's version of units, as well as the varieties of dimensionless quantities that have been mentioned - like A/A and V/V. Such is my prediction.
![](https://secure.gravatar.com/avatar/8da339f04438d3fcc438e898cfe73c47.jpg?s=120&d=mm&r=g)
Brian McCall writes:
Steven d'Aprano writes:
you have shown nothing to justify why unit support must be built into the language itself.
I did what I could, but I'm not going to try and justify any more.
That makes me sad, because everybody in the thread acknowledges that improving the Python distribution's support for units is a good idea, but nobody is as enthusiastic about getting it done as you. Chris Barker's comments about multiple attractive library implementations are well-taken, I think, but I also think that with more focus on getting a satisfactory module into the stdlib, it would be quite possible to pick one that doesn't rely on non-stdlib types (so I guess astropy.units would be out). That doesn't directly get you the literal syntax for units you focus on, but if units are easier to use, more applications will use them, and perhaps build up momentum for a syntax change. And the syntax change is useless without the library.
![](https://secure.gravatar.com/avatar/9ea64fa01ed0d8529e4ae1b8873bb930.jpg?s=120&d=mm&r=g)
On Fri, Apr 8, 2022, 2:40 AM Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
Brian McCall writes:
Steven d'Aprano writes:
you have shown nothing to justify why unit support must be built into the language itself.
I did what I could, but I'm not going to try and justify any more.
That makes me sad, because everybody in the thread acknowledges that improving the Python distribution's support for units is a good idea, but nobody is as enthusiastic about getting it done as you.
Chris Barker's comments about multiple attractive library implementations are well-taken, I think, but I also think that with more focus on getting a satisfactory module into the stdlib, it would be quite possible to pick one that doesn't rely on non-stdlib types (so I guess astropy.units would be out).
That doesn't directly get you the literal syntax for units you focus on, but if units are easier to use, more applications will use them, and perhaps build up momentum for a syntax change. And the syntax change is useless without the library.
I'll try to provide examples of my struggles with units in python but I'm not an accomplished coder at all and don't have much to look at in the way of examples. I sometimes go weeks without writing any code at all, followed by days of nothing but writing code. Python is so painful to use for units I've actually avoided it, so there won't be many examples I can give anyway. Hence my silence in this thread the past few days. I just get really excited at the idea of it being native to the language and am dreaming of being able to use it more often for my every day calculations. Right now I just don't feel confident I can.
![](https://secure.gravatar.com/avatar/d995b462a98fea412efa79d17ba3787a.jpg?s=120&d=mm&r=g)
On Fri, 8 Apr 2022 at 12:22, Ricky Teachey <ricky@teachey.org> wrote:
I just get really excited at the idea of it being native to the language and am dreaming of being able to use it more often for my every day calculations. Right now I just don't feel confident I can.
If you can describe what the Python of your dreams would look like, that would be really useful. Most of the problem here is with people who *don't* need units for every day calculations struggling to understand what is wrong with a library-based solution, and what "language support" would look like in practice. Paul
![](https://secure.gravatar.com/avatar/e6a87d3f508d8742129e4bdf025b47d3.jpg?s=120&d=mm&r=g)
My personal preference for adding units to python would be to make instances of all numeric classes subscriptable, with the implementation being roughly equivalent to:

    def __getitem__(self, unit_cls: type[T]) -> T:
        return unit_cls(self)

We could then discuss the possibility of adding some implementation of units to the stdlib. For example:

    from units.si import km, m, N, Pa

    3[km] + 4[m] == 3004[m]  # True
    5[N]/1[m**2] == 5[Pa]  # True

'Casual' users could also use a star import (despite its pitfalls) and not have to worry about going back and updating the import statement, so I don't think requiring that import would be much of a barrier to beginners. They'd just learn they need that star import at the top of the file as a sort of 'magic spell'.

Third-party libraries could provide their own unit classes with additional features and characteristics that you could substitute in by simply changing the import statement and nothing else.

To write a custom unit class you would just have to implement an __init__ that accepts a single numeric argument. To enable units like m², the __pow__ magic method would have to be implemented in the unit class' metaclass.

The advantages of this seem to me like:

1) no new syntax, just an extra magic method for numeric types
2) batteries included
3) won't clutter up the builtins, you have to opt in by using imports
4) simple for third-party libraries to support and extend

I can't really see much in the way of disadvantages aside from:

1) aesthetic objections to the use of subscription for this purpose. I personally quite like it because in a way a unit at the end of a number *is* a subscript anyway, so it seems quite fitting to use python's subscription syntax for it.
2) the opposite of advantage 3) above: people actually *wanting* the units to be part of the builtins so that you don't have to use any imports. Depending on your opinion this can be a good or bad thing.

And if no reasonable implementation of the batteries can be agreed upon, that's fine, that part can be delayed or rejected.
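The subscription proposal can be tried out today with a stand-in numeric type, since plain literals such as `3[km]` currently fail because `int` is not subscriptable. The sketch below is only an approximation of the idea: `Num`, `Unit` and `Quantity` are hypothetical names, units are modelled as callable instances rather than classes with a metaclass, and only addition and equality are implemented.

```python
import math


class Unit:
    """Hypothetical unit: a symbol, a scale to SI, and a dimension map."""

    def __init__(self, symbol, scale, dims):
        self.symbol = symbol
        self.scale = scale      # factor converting a value to SI base units
        self.dims = dict(dims)  # e.g. {"m": 1} for length, {"m": 2} for area

    def __call__(self, value):
        return Quantity(float(value), self)

    def __pow__(self, exponent):
        dims = {name: power * exponent for name, power in self.dims.items()}
        return Unit(f"{self.symbol}**{exponent}", self.scale ** exponent, dims)


class Quantity:
    def __init__(self, value, unit):
        self.value = value
        self.unit = unit

    def si_value(self):
        return self.value * self.unit.scale

    def __add__(self, other):
        if self.unit.dims != other.unit.dims:
            raise TypeError("incompatible units")
        # Express the result in the left operand's unit.
        return Quantity(self.value + other.si_value() / self.unit.scale,
                        self.unit)

    def __eq__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        return (self.unit.dims == other.unit.dims
                and math.isclose(self.si_value(), other.si_value()))


class Num(float):
    """Stand-in for a numeric type carrying the proposed __getitem__."""

    def __getitem__(self, unit):
        return unit(self)


km = Unit("km", 1000.0, {"m": 1})
m = Unit("m", 1.0, {"m": 1})

total = Num(3)[km] + Num(4)[m]
area_unit = m ** 2
```

Here `Num(3)[km]` plays the role of the proposed `3[km]`, and `km(3)` gives the same result, which is the `unit_cls(number)` spelling a third-party library can offer without any language change.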
![](https://secure.gravatar.com/avatar/d995b462a98fea412efa79d17ba3787a.jpg?s=120&d=mm&r=g)
On Fri, 8 Apr 2022 at 13:09, Matt del Valle <matthewgdv@gmail.com> wrote:
My personal preference for adding units to python would be to make instances of all numeric classes subscriptable, with the implementation being roughly equivalent to:
    def __getitem__(self, unit_cls: type[T]) -> T:
        return unit_cls(self)
We could then discuss the possibility of adding some implementation of units to the stdlib. For example:
from units.si import km, m, N, Pa
3[km] + 4[m] == 3004[m]  # True
5[N]/1[m**2] == 5[Pa]  # True
Thanks. That's extremely useful, and I can see it as a reasonable language feature request. BUT (and it's a big "but"!) someone would have to write, support and maintain that units library. Obviously in the first instance, it couldn't use the dedicated syntax, but unit_cls(number) doesn't seem like a horribly bad compromise for a 3rd party library right now.

So here's my proposal.

1. Somebody (or a group of people) who wants to see this happen, write (or adopt) a library and publish it on PyPI. It should provide *all* of the functionality that the proposed stdlib support would offer, with the sole exception that units get attached to values using unit_cls(number) rather than special syntax. It's possible that the "units" library that's already on PyPI is (nearly) that library - but from what I've heard in this thread, the community hasn't reached consensus on what "best of breed" looks like yet.

2. Once that library has demonstrated its popularity, someone writes a PEP suggesting that the language adds support for the syntax `number[annotation]` that can be customised by user code. This would be very similar in principle to the PEP for the matrix multiplication @ operator - a popular 3rd party library demonstrates that a well-focused language change, designed to be generally useful, can significantly improve the UI of the library in a way which would be natural for that library's users (while still being general enough to allow others to experiment with the feature as well).

3. Once the new language feature is accepted, and the library authors are willing, propose that the library gets added to the stdlib.

We're currently at step 1 - we need someone to come up with a library that demonstrates how to provide this functionality in a way that matches users' requirements, and which has unified community support. That step doesn't need anything much from the Python core devs or even this list, beyond maybe a general feeling that the overall plan "isn't a totally dumb idea"...

Step 2 is where a PEP and proper core dev support would be needed. But the library would be useful even if this doesn't happen (and conversely, if the library proves *not* to be useful, it demonstrates that the language change wouldn't actually be as valuable as people had hoped).

Step 3 is optional. With language support that can be used by external libraries, "being part of the stdlib" isn't needed. This is true of pretty much everything in the stdlib, though - stdlib modules don't have any special benefits that external libraries don't. As a supporter of a large stdlib, I'd be OK with moving the units library into the stdlib (on the assumption that the library maintainers commit to supporting it in the stdlib, and don't run away and dump the problem on the core devs). Others who prefer a smaller stdlib would argue it's fine on PyPI. But that's an argument about principles which frankly end users and 3rd party library authors can't influence much (and can probably ignore in practice).

So honestly, I'd encourage interested users to get on with implementing the library of their dreams. By all means look ahead to how language syntax improvements might help you, but don't let that stop you getting something useful working right now.

Paul
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Fri, Apr 08, 2022 at 07:20:35AM -0400, Ricky Teachey wrote:
Python is so painful to use for units I've actually avoided it, so there won't be many examples I can give anyway. Hence my silence in this thread the past few days.
Which of the many Python libraries have you tried, and what makes them painful?
I just get really excited at the idea of it being native to the language and am dreaming of being able to use it more often for my every day calculations. Right now I just don't feel confident I can.
You should try Frink. At the very least, that will give you some idea of what is possible with unit calculations, how it might be integrated with a language, and the limitations of units as a concept. https://frinklang.org/#SampleCalculations -- Steve
![](https://secure.gravatar.com/avatar/176220408ba450411279916772c36066.jpg?s=120&d=mm&r=g)
On Sat, Apr 9, 2022 at 5:52 AM Steven D'Aprano <steve@pearwood.info> wrote:
Python is so painful to use for units I've actually avoided it,
What have you tried? And what do you do instead? MathCAD, maybe?

For my part, there is a bit of a barrier to entry: I need to pick a library, I need to get over the learning curve, etc. But I don't think having it as a Python built-in would help much.

Another BIG barrier for me is that in my real work, I need to do a lot of things with units that aren't strictly correct:

* equivalence of weight and mass (kg and lbs)
* equivalence of mass per volume and unitless (ppm and micrograms/liter)
* really strange "units" like API Gravity, and slightly less strange ones like Specific Gravity

These are all awkward to deal with in a proper unit system that is specifically intended to not let you make these kinds of "errors". And a system that worked well for my line of work would likely be a disaster for others'.

So what barriers do you have?

Also -- as someone has mentioned on this list -- nifty easy syntax would help mostly for scripting and "using Python as a calculator" -- so a plug-in for Jupyter or a calculator application of some sort might be almost as good as built-in syntax. And the downsides of carrying units around with the built in numbers (overhead, numpy incompatibility) are substantial, and would be irrelevant.

Finally: give Pint a try -- it really is pretty nifty -- particularly in a notebook: https://colab.research.google.com/github/agile-geoscience/xlines/blob/master...

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Thu, Apr 07, 2022 at 11:18:57PM -0000, Brian McCall wrote:
What does it *really* matter which of these you write?
that's just arguing over the colour of the bikeshed.
you have shown nothing to justify why unit support must be built into the language itself.
I did what I could, but I'm not going to try and justify any more.
But that's my point, you haven't tried to justify why units need to be built-in to the language except to declare that they must and that a library won't cut it. I'm sympathetic to that view, but "Trust me, I'm right!" is not going to sway the core developers or the Steering Council. You're a scientist, think of this as a grant proposal. Convince us.
At the end of the day, units are a core part of science and engineering.
There are many things which are core to science and engineering but aren't part of the core Python language. What makes units of measurements more special than, say, numpy arrays or dataframes? The numpy/scipy/pandas stack is HUGE in scientific Python, and it is not built into the language. nltk is a very big, powerful library used for computational linguistics, and it's not built into the core language. If there is something critical in units that requires core language support, what is it?
Scientists and engineers are freaking passionate about units. More and more of them are also being expected to write code to do their jobs. To me that says it's going to happen at some point.
People have been talking about unifying units into programming languages since the 1970s, in Fortran and Pascal.

Ada has had support for measurement units since at least 2003 and probably earlier. https://link.springer.com/chapter/10.1007/3-540-44947-7_19 (Sorry, not a free link.)

Both F# (2005) and Swift (2014) also support units in the core language.

I've also mentioned Frink, and the HP calculator "RPL" language supports units. HP has supported them since the mid 1980s. Java, C++, C#, Javascript, Ruby and others all have units libraries.

Quote: "These types of libraries do already exist, several hundreds in fact. The problem thus is not the lack of these solutions but the opposite that they are so numerous that it is hard to get an overview." https://onlinelibrary.wiley.com/doi/full/10.1002/spe.2926

So for scientists who want units of measurement support, there are many existing solutions they can use today. Why are those solutions insufficient, and what can be done about it?

-- Steve
![](https://secure.gravatar.com/avatar/176220408ba450411279916772c36066.jpg?s=120&d=mm&r=g)
Steven suggested looking at Frink -- so I've done that (briefly). I can see the appeal, but it reminds me a bit of the old Apple Hypertalk -- kinda cool how it looks like natural language, but it also strikes me as less precise and clear than I like in a programming language -- "explicit is better than implicit".

It also reminds me a bit of MathCAD (which I haven't used in years, no, decades!) -- MathCAD is very nifty in looking like "blackboard" math -- but I did have issues with implied parentheses and the like. Great for experimentation, perhaps a bit problematic for "real work".

Anyway, enclosed is a (very simple) example from Frink, re-implemented in a Jupyter notebook with the Pint library. (I have no easy way to publish an interactive notebook (yes, I know it's not that hard), so I provide three versions: PDF if you just want to look at it, Python if you want to run it, and ipynb if you want to play with it in a notebook.)

Enjoy!

-CHB

On Sat, Apr 9, 2022 at 8:12 AM Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Apr 07, 2022 at 11:18:57PM -0000, Brian McCall wrote:
What does it *really* matter which of these you write?
that's just arguing over the colour of the bikeshed.
you have shown nothing to justify why unit support must be built into the language itself.
I did what I could, but I'm not going to try and justify any more.
But that's my point, you haven't tried to justify why units need to be built-in to the language except to declare that they must and that a library won't cut it.
I'm sympathetic to that view, but "Trust me, I'm right!" is not going to sway the core developers or the Steering Council.
You're a scientist, think of this as a grant proposal. Convince us.
At the end of the day, units are a core part of science and engineering.
There are many things which are core to science and engineering but aren't part of the core Python language. What makes units of measurements more special than, say, numpy arrays or dataframes?
The numpy/scipy/pandas stack is HUGE in scientific Python, and it is not built into the language. nltk is a very big, powerful library used for computational linguistics, and it's not built into the core language.
If there is something critical in units that requires core language support, what is it?
Scientists and engineers are freaking passionate about units. More and more of them are also being expected to write code to do their jobs. To me that says it's going to happen at some point.
People have been talking about unifying units into programming languages since the 1970s, in Fortran and Pascal.
Ada has had support for measurement units since at least 2003 and probably earlier.
https://link.springer.com/chapter/10.1007/3-540-44947-7_19
(Sorry, not a free link.)
Both F# (2005) and Swift (2014) also support units in the core language.
I've also mentioned Frink, and HP calculator "RPL" language supports units. HP has supported them since the mid 1980s. Java, C++, C#, Javascript, Ruby and others all have units libraries.
Quote:
"These types of libraries do already exist, several hundreds in fact. The problem thus is not the lack of these solutions but the opposite that they are so numerous that it is hard to get an overview."
https://onlinelibrary.wiley.com/doi/full/10.1002/spe.2926
So for scientists who want units of measurement support, there are many existing solutions they can use today. Why are those solutions insufficient, and what can be done about it?
-- Steve _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/B2HZEP... Code of Conduct: http://python.org/psf/codeofconduct/
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
![](https://secure.gravatar.com/avatar/ca465da45735c9efed28478928fa9fbe.jpg?s=120&d=mm&r=g)
Greetings, This message is for those that would like to "play" with a more natural looking syntax for units in Python. First, a quick look:
python -m ideas -t easy_units
Ideas Console version 0.0.23. [Python version: 3.10.2]

~>> import pint
~>> U = pint.UnitRegistry()
~>> walk = 3[km] + 100[m]
~>> walk
<Quantity(3.1, 'kilometer')>
~>> p1 = 1.0[N/m^2]
~>> p2 = 1.0[Pa]
~>> p1 == p2
True

==

Or, for those that prefer astropy to pint:

python -m ideas -t easy_units
Ideas Console version 0.0.23. [Python version: 3.10.2]

~>> from astropy.units import m, km, N, Pa
~>> walk = 3[km] + 100[m]
~>> walk
<Quantity 3.1 km>
~>> p1 = 1.0[N/m^2]
~>> p2 = 1.0[Pa]
~>> p1 == p2
True

===

Or, I simply want to run a script, say

# example.py
import pint
units = pint.UnitRegistry()
print(1.0[km] + 2[m])

===

python -m ideas example -t easy_units
1.002 kilometer
Note that it is "example" and not "example.py" that is run (imported).

======

To try these examples, you need to:

python -m pip install ideas

and either

python -m pip install pint

or

python -m pip install astropy

## How does it work?

"ideas" (https://github.com/aroberge/ideas; documentation at https://aroberge.github.io/ideas/docs/html/) is a library I created a few years ago to allow easy experiments with variations on Python's syntax. When a module is imported (or when some code is run in the modified interpreter), it is first transformed prior to execution. Users of ideas can define transformations that operate:

1. on the source (text)
2. on the AST
3. on the bytecode.

If one uses simple source transformations (which is what I did for easy_units), one can see the transformed code prior to its execution in the interactive console, using a "verbosity" flag (-v or --verbosity)
python -m ideas -t easy_units -v
Ideas Console version 0.0.23. [Python version: 3.10.2]

~>> import pint
~>> Units = pint.UnitRegistry()  # "Units" here is an arbitrary name
~>> walk = 3[km] + 100[m]
===========Transformed============
walk = 3 * Units.km + 100 * Units.m
-----------------------------
~>> p1 = 1.0[N/m^2]
===========Transformed============
p1 = 1.0 * Units.N/(Units.m**2)
-----------------------------
~>> p2 = 1.0[N/m**2]  # using ** instead of ^ for Python purists
===========Transformed============
p2 = 1.0 * Units.N/(Units.m**2)  # using ** instead of ^ for Python purists
-----------------------------

Admittedly, it is a quick hack and may very well be buggy. But it can be fun to use! ;-)

André Roberge
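For readers curious what a source-level transformation of this kind might look like, here is a minimal sketch in plain Python. It is not the code used by ideas or easy_units; the function name `transform_source`, the regular expressions, and the stand-in `Units` namespace are all assumptions made for illustration. It only handles numeric literals followed by a bracketed unit expression, and it ignores strings and comments.

```python
import re
from types import SimpleNamespace

# Hypothetical pattern: a numeric literal (not part of an identifier)
# immediately followed by a bracketed unit expression, e.g. 3[km].
_UNIT_LITERAL = re.compile(r"(?<![\w.])(?P<num>\d+(?:\.\d*)?)\[(?P<unit>[^\]]+)\]")
_NAME = re.compile(r"[A-Za-z_]\w*")


def transform_source(source: str, registry_name: str = "Units") -> str:
    """Rewrite ``3[km]`` as ``3 * Units.km`` (toy version of the idea)."""

    def replace(match: re.Match) -> str:
        raw_unit = match.group("unit")
        # Accept ^ as a spelling of ** inside the brackets.
        unit = raw_unit.replace("^", "**")
        # Prefix every unit name with the registry name.
        unit = _NAME.sub(lambda m: f"{registry_name}.{m.group(0)}", unit)
        # Compound unit expressions are parenthesized as a whole.
        if not _NAME.fullmatch(raw_unit):
            unit = f"({unit})"
        return f"{match.group('num')} * {unit}"

    return _UNIT_LITERAL.sub(replace, source)


print(transform_source("walk = 3[km] + 100[m]"))
# walk = 3 * Units.km + 100 * Units.m
print(transform_source("p1 = 1.0[N/m^2]"))
# p1 = 1.0 * (Units.N/Units.m**2)

# Stand-in "registry" holding plain floats (metres), just to run the result.
Units = SimpleNamespace(km=1000.0, m=1.0)
print(eval(transform_source("3[km] + 100[m]"), {"Units": Units}))
# 3100.0
```

Unlike the transcript above, this sketch parenthesizes the whole unit expression rather than each power term; both forms evaluate the same way under Python's precedence rules.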
![](https://secure.gravatar.com/avatar/176220408ba450411279916772c36066.jpg?s=120&d=mm&r=g)
<andre.roberge@gmail.com> wrote:
This message is for those that would like to "play" with a more natural looking syntax for units in Python.
This is very cool -- thanks! -CHB -- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
![](https://secure.gravatar.com/avatar/8da339f04438d3fcc438e898cfe73c47.jpg?s=120&d=mm&r=g)
Steven D'Aprano writes:
There are many things which are core to science and engineering but aren't part of the core Python language. What makes units of measurements more special than, say, numpy arrays or dataframes?
Arrays and dataframes are data structures, hidden behind the syntax. They can be conveniently referenced and manipulated with functions invoked via existing syntax. This existing syntax is familiar to us from many decades of programming language practice, and we have always accepted it because of the constraint of 1-dimensional computer text. Much of it is extensions of centuries of mathematical practice, from operator symbols to function call notation, to these modern data structures. And that's one thing I love about Python, the ability to use familiar syntax in analogous ways.

Applying this line of thought to units, the practice for centuries (at least in English and Japanese) has been to append the unit after the value. But this is a syntax error in Python, and even the 'value[unit]' dodge fails for literal numbers, while it probably isn't needed often for variables (I would probably even consider it bad style). We can construct a Unit class, and use the multiplication operator to combine a "bare" number with the unit, but this doesn't feel right to naive users. To me it feels like multiplying a number by a list. What I ideally want is two parts of a coherent whole, like real and imag. (Note: This does not bother me in practice at all, I'm trying to empathize with the folks who say they can't stomach Python's approach to units at all.)

Furthermore, there's another problem: units "should" have their own namespace, almost like keywords. But it's much more complicated than keywords, because (1) there are many more potential collisions, both between ordinary identifiers and units ('m' as an integer variable vs. 'm' as a unit) and among units ('A' for ampere and 'A' for Angstrom), and (2) the units namespace should be extensible.
Both of these issues (natural language practice and namespace) can be most fully addressed with syntax that allows an optional identifier expression to be placed directly after a literal or identifier, and enforcing the semantics that this optional identifier expression must be composed of registered units and only the allowed operations (* and /, I guess). Invalid operations would be SyntaxErrors, unregistered identifiers would be RuntimeErrors.

I think that one way both issues can be addressed without syntax is to take a package like units, add "repertoire attributes", so we can write things like this

import units
u = units.si.modifiable_copy()  # the SI repertoire is read-only
u.register_unit('frob', 'Fr', composed=u.foo/u.m, doc='Frob is a devo unit invented by A. Panshin.')
if 12*u.mm * 42*u.MFr == 502*u.foo:
    print('Well done!')

That would work fine for me. But I can see why somebody who frequently uses interactive Python as a scientific calculator would prefer to write

if 12 m/s * 42 s == 502 m:
    print('Well done!')

with the base SI repertoire (a dozen or so prefixes and 7 units) in builtins.

As far as I can tell, that's the whole argument.

Steve
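To make the "repertoire" idea a little more concrete, here is a rough sketch of an extensible units namespace that refuses symbol collisions. Everything in it is hypothetical: the `Repertoire` and `Quantity` classes, and the simplified two-argument `register_unit`, are invented for illustration and are not the API of the units package or of any other library. Only multiplication is implemented, and the read-only protection of the base repertoire is left out.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Quantity:
    """A value plus base-unit exponents, e.g. (("m", 1), ("s", -1))."""
    value: Fraction
    dims: tuple

    def __mul__(self, other):
        # Bare numbers are treated as dimensionless quantities.
        if not isinstance(other, Quantity):
            other = Quantity(Fraction(other), ())
        merged = dict(self.dims)
        for symbol, power in other.dims:
            merged[symbol] = merged.get(symbol, 0) + power
        dims = tuple(sorted((s, p) for s, p in merged.items() if p))
        return Quantity(self.value * other.value, dims)

    __rmul__ = __mul__  # multiplication is commutative here


class Repertoire:
    """Hypothetical namespace of registered unit symbols."""

    def __init__(self, units=None):
        self._units = dict(units or {})

    def modifiable_copy(self):
        # The copy has its own dictionary, so the original is untouched.
        return Repertoire(self._units)

    def register_unit(self, symbol, definition):
        if symbol in self._units:
            raise ValueError(f"unit symbol {symbol!r} is already registered")
        self._units[symbol] = definition

    def __getattr__(self, symbol):
        if symbol.startswith("_") or symbol not in self._units:
            raise AttributeError(symbol)
        return self._units[symbol]


si = Repertoire({
    "m": Quantity(Fraction(1), (("m", 1),)),
    "s": Quantity(Fraction(1), (("s", 1),)),
})
u = si.modifiable_copy()
u.register_unit("mm", Fraction(1, 1000) * u.m)

print(12 * u.mm)
```

Registering a second meaning for an existing symbol (the 'A' for ampere versus 'A' for Angstrom problem) raises `ValueError` in this sketch, which is only one of several possible policies.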
![](https://secure.gravatar.com/avatar/de311342220232e618cb27c9936ab9bf.jpg?s=120&d=mm&r=g)
On 4/9/22 21:17, Stephen J. Turnbull wrote:
if 12*u.mm * 42*u.MFr == 502*u.foo: print('Well done!')
That would work fine for me. But I can see why somebody who frequently uses interactive Python as a scientific calculator would prefer to write
if 12 m/s * 42 s == 502 m: print('Well done!')
with the base SI repertoire (a dozen or so prefixes and 7 units) in builtins.
Part of the argument as well, I think, is that the top expression would be parsed as: ((12 * u.m) * 42) * u.MFr which, if actually equal to 502*u.foo, is dumb luck. -- ~Ethan~
![](https://secure.gravatar.com/avatar/8da339f04438d3fcc438e898cfe73c47.jpg?s=120&d=mm&r=g)
Warning: serious linguistic hacking follows. I tried to be careful in writing, please be careful in reading. Corrections welcome. Ethan Furman writes:
On 4/9/22 21:17, Stephen J. Turnbull wrote:
if 12*u.mm * 42*u.MFr == 502*u.foo: print('Well done!')
Part of the argument as well, I think, is that the top expression would be parsed as:
((12 * u.m) * 42) * u.MFr
which, if actually equal to 502*u.foo, is dumb luck.
It's actually equal to 504*u.foo, I shouldn't do multiplications larger than 12 * 12 in my head. But yes, the correct answer is (12 * 42) * u.foo, if I could only do arithmetic on integers!

I guess you could call the associative law of multiplication "dumb luck", but most mathematicians will consider that hate speech. If the unit class is designed correctly (the units package is designed correctly), both the associative and commutative laws of multiplication hold. The only gotcha as far as the value is concerned is that if you put in divisions such as u.m/u.s, then you often will need parentheses. But that's true for ordinary arithmetic as well.

There would be a problem if the LHS were 12*u.mm / 42*u.MFr. That's actually nonsense in a units-aware world. Parentheses are required: 12*u.mm / (42*u.MFr). The "we demand syntax" crowd wants the "add unit" operation to have higher precedence than numerical multiplication and division. Hmm, unfortunately both '@' and '%' have the same precedence as '*' and '/', but we could make the precedence more natural by using '**' at the cost of the intuition that a unit is just a quantity (object with both value and unit attributes) with value 1. I'm not sure that intuition helps the "we demand syntax" crowd, though.

Another way to put the issue is that libraries like units don't provide units, they provide quantities. You can create a quantity foo that acts like a unit by giving it the value 1, but 10*foo is of the same type as foo. It just has value attribute 10. Again, that doesn't bother me, but I suspect that the "we demand syntax" crowd will feel queasy about the idea that 'nm' *is* '1 nm'. For them, 'nm' needs to be notated with a value to be a quantity.

I don't want to put words in anybody's mouth. If Brian or Ricky disclaims that interpretation, they're right. If they think it has expressive value but needs to be amended, they're right. I'm just trying to provide words to express what the need is here.
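The precedence point can be checked with the standard library alone. The sketch below uses the `ast` module (Python 3.9+ for `ast.unparse`) to print how Python groups each expression; the helper name `show_grouping` is made up for this example, and the names `u.mm` and `u.MFr` are never evaluated, only parsed.

```python
import ast

# Operator symbols for the binary operators discussed in the thread.
_SYMBOLS = {
    ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/",
    ast.MatMult: "@", ast.Mod: "%", ast.Pow: "**",
}


def show_grouping(expression: str) -> str:
    """Return the expression with every binary operation parenthesized."""

    def render(node: ast.AST) -> str:
        if isinstance(node, ast.BinOp):
            op = _SYMBOLS[type(node.op)]
            return f"({render(node.left)} {op} {render(node.right)})"
        return ast.unparse(node)

    return render(ast.parse(expression, mode="eval").body)


print(show_grouping("12*u.mm * 42*u.MFr"))
# (((12 * u.mm) * 42) * u.MFr)
print(show_grouping("12*u.mm / 42*u.MFr"))
# (((12 * u.mm) / 42) * u.MFr)
print(show_grouping("12*u.mm / (42*u.MFr)"))
# ((12 * u.mm) / (42 * u.MFr))
print(show_grouping("12@u.mm / 42@u.MFr"))
# (((12 @ u.mm) / 42) @ u.MFr)
print(show_grouping("12**u.mm / 42**u.MFr"))
# ((12 ** u.mm) / (42 ** u.MFr))
```

The second line shows why the unparenthesized division puts u.MFr in the numerator, and the last two lines show that '@' groups like '*' while '**' binds the unit to its number first.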
![](https://secure.gravatar.com/avatar/de311342220232e618cb27c9936ab9bf.jpg?s=120&d=mm&r=g)
On 4/10/22 21:33, Stephen J. Turnbull wrote:
I guess you could call the associative law of multiplication "dumb luck", but most mathematicians will consider that hate speech.
My apologies for not understanding your example. The counter example I had in my head, and should have written down, was something like:

15mpg * 7l == how many miles?

where

mpg = miles per gallon
l = litres

I'm pretty sure the answer there is not 105.

-- ~Ethan~
![](https://secure.gravatar.com/avatar/ca465da45735c9efed28478928fa9fbe.jpg?s=120&d=mm&r=g)
On Mon, Apr 11, 2022 at 11:33 AM Ethan Furman <ethan@stoneleaf.us> wrote:
On 4/10/22 21:33, Stephen J. Turnbull wrote:
I guess you could call the associative law of multiplication "dumb luck", but most mathematicians will consider that hate speech.
My apologies for not understanding your example. The counter example I had in my head, and should have written down, was something like:
15mpg * 7l == how many miles?
where
mpg = miles per gallon
l = litres
I'm pretty sure the answer there is not 105.
Really? ;-) ;-)
import pint
ureg = pint.UnitRegistry()
mpg = ureg.define('mpg = 1 * mile / gallon')
dist = 15 * ureg.mpg * 7 * ureg.l
dist
<Quantity(105, 'liter * mpg')>
It is 105 ... but in some weird distance units.
dist.to(ureg.miles)
<Quantity(27.7380655, 'mile')>
Or, in a more readable way! ;-) ;-)
python -m ideas -t easy_units
Ideas Console version 0.0.30. [Python version: 3.9.10]

import pint
ureg = pint.UnitRegistry()
ureg.define('mpg = 1 * mile / gallon')
dist = 15[mpg] * 7[l]
dist.to(1[mile])
<Quantity(27.7380655, 'mile')>
André Roberge Thought: I should probably make it even easier to convert units within ideas...
-- ~Ethan~
![](https://secure.gravatar.com/avatar/5615a372d9866f203a22b2c437527bbb.jpg?s=120&d=mm&r=g)
On Mon, Apr 11, 2022 at 07:32:12AM -0700, Ethan Furman wrote:
My apologies for not understanding your example. The counter example I had in my head, and should have written down, was something like:
15mpg * 7l == how many miles?
where
mpg = miles per gallon
l = litres
I'm pretty sure the answer there is not 105.
Indeed. The answer is 27.7 miles.

[steve ~]$ units "15mpg * 7l" miles
* 27.738065
/ 0.036051541

Or if you prefer an exact answer, Frink gives 625000000/22532213 miles. Because one can never have too much precision *wink*

(By the way, those are US gallons and miles. If we use British units instead, the answer is 27.738114 miles. A difference of about 7.9cm in either direction. If you care.)

-- Steve
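The exact figure for the US-gallon case can be reproduced with nothing but the standard library. This is only a worked check of the arithmetic, assuming the US liquid gallon of exactly 3.785411784 litres (231 cubic inches); the variable names are chosen for this example.

```python
from fractions import Fraction

# US liquid gallon in litres, exact by definition (231 cubic inches).
LITRES_PER_US_GALLON = Fraction("3.785411784")

miles_per_gallon = 15
litres = 7

# Convert the fuel volume to gallons first, then apply the mileage.
gallons = litres / LITRES_PER_US_GALLON
miles = miles_per_gallon * gallons

print(miles)         # 625000000/22532213
print(float(miles))  # roughly 27.738 miles
```

The naive product 15 * 7 = 105 only becomes a distance once the litres-to-gallons factor is applied, which is the step the units tools above perform automatically.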
![](https://secure.gravatar.com/avatar/8da339f04438d3fcc438e898cfe73c47.jpg?s=120&d=mm&r=g)
Ethan Furman writes:
On 4/10/22 21:33, Stephen J. Turnbull wrote:
I guess you could call the associative law of multiplication "dumb luck", but most mathematicians will consider that hate speech.
My apologies for not understanding your example. The counter example I had in my head, and should have written down, was something like:
15mpg * 7l == how many miles?
Now it's my turn to not understand the point of this example. Are you arguing Chris A's point that in some applications you want those conversions done automagically[1], and in other applications you want to get a YouSureYouMeanThatBoss? Exception[2]. :-)

Footnotes:
[1] You're an American who just bought a car in Detroit MI that still has the EPA sticker "15mpg" on the window, and now you're in Windsor ON (Canada) looking at the cost of the 7l you put in and wondering if you can get back home to Ann Arbor MI on that.
[2] Building a Mars lander for NASA.
![](https://secure.gravatar.com/avatar/176220408ba450411279916772c36066.jpg?s=120&d=mm&r=g)
I guess you could call the associative law of multiplication "dumb luck", but most mathematicians will consider that hate speech.
My apologies for not understanding your example. The counter example I had in my head, and should have written down, was something like:
15mpg * 7l == how many miles?
Using pint:

In [76]: U = pint.UnitRegistry()
In [77]: (15 * U.miles / U.gallons * 7 * U.liter).to('miles')
Out[77]: 27.7380654976056 <Unit('mile')>

A bit verbose, perhaps, but to me clear, and the operator precedence rules seem to "just work".

And if you want it a tad less verbose, you can give some of those units names:

In [78]: mpg = U.miles / U.gallons
In [79]: l = U.liter
In [80]: (15 * mpg * 7 * l).to('miles')
Out[80]: 27.7380654976056 <Unit('mile')>

My question for the folks that want units built in to Python is "what's so hard about that?"

Ricky wrote:

"Python is so painful to use for units I've actually avoided it,"

Really? Have you tried pint? Or anything else? What is so painful about this?

-CHB

-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
![](https://secure.gravatar.com/avatar/9ea64fa01ed0d8529e4ae1b8873bb930.jpg?s=120&d=mm&r=g)
On Tue, Apr 12, 2022 at 11:27 AM Christopher Barker <pythonchb@gmail.com> wrote:
I guess you could call the associative law of multiplication "dumb luck", but most mathematicians will consider that hate speech.
My apologies for not understanding your example. The counter example I had in my head, and should have written down, was something like:
15mpg * 7l == how many miles?
Using pint:
In [76]: U = pint.UnitRegistry()
In [77]: (15 * U.miles / U.gallons * 7 * U.liter).to('miles')
Out[77]: 27.7380654976056 <Unit('mile')>
A bit verbose, perhaps, but to me clear, and the operator precedence rules seem to "just work".
And if you want it a tad less verbose, you can give some of those units names:
In [78]: mpg = U.miles / U.gallons
In [79]: l = U.liter
In [80]: (15 * mpg * 7 * l).to('miles')
Out[80]: 27.7380654976056 <Unit('mile')>
My question for the folks that want units built in to Python is "what's so hard about that?"
Ricky wrote:
"Python is so painful to use for units I've actually avoided it,"
Really? have you tried pint? or anything else? what is so painful about this?
-CHB
-- Christopher Barker, PhD (Chris)
I will try to finish my email about this I started writing a week ago! --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler
![](https://secure.gravatar.com/avatar/de311342220232e618cb27c9936ab9bf.jpg?s=120&d=mm&r=g)
On 4/12/22 00:57, Stephen J. Turnbull wrote:
Ethan Furman writes:
15mpg * 7l == how many miles?
Now it's my turn to not understand the point of this example.
My point is that when an object is instantiated it can normalize its arguments, and that that normalization should happen with the original value (7 above, not 105), so when I put the above into the REPL I get `27 miles` instead of `105 l*mpg`. Now, it could easily be that more advanced uses of units (torque vs force? or whatever was mentioned some time ago) would work better with the intermediate results being more in flux (quantum mechanics, anyone? heh) with the final units being selected later (perhaps somebody wants kilometers instead of miles in the above example). To rephrase my original point: `7_litres` is a different thing than `105_litres_mpg` -- is that a meaningless difference? I don't know. -- ~Ethan~
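One way to read the "normalize at instantiation" idea is sketched below: each quantity is converted to SI base units the moment it is created, so the product is already a length and the intermediate `105 l*mpg` never exists. This is an illustration only; the `Quantity` class, the `mpg` and `litres` helpers, and the choice to track just the metre exponent are assumptions made for this example, and US gallons and international miles are assumed.

```python
from dataclasses import dataclass

METRES_PER_MILE = 1609.344
CUBIC_METRES_PER_LITRE = 0.001
CUBIC_METRES_PER_US_GALLON = 0.003785411784


@dataclass(frozen=True)
class Quantity:
    value: float      # magnitude, already expressed in SI base units
    metre_power: int  # exponent of the metre (the only dimension tracked)

    def __mul__(self, other: "Quantity") -> "Quantity":
        return Quantity(self.value * other.value,
                        self.metre_power + other.metre_power)

    def to_miles(self) -> float:
        if self.metre_power != 1:
            raise ValueError("not a length")
        return self.value / METRES_PER_MILE


def mpg(n: float) -> Quantity:
    # miles per gallon is length / volume, i.e. metre ** -2
    return Quantity(n * METRES_PER_MILE / CUBIC_METRES_PER_US_GALLON, -2)


def litres(n: float) -> Quantity:
    # a volume is metre ** 3; 7 litres is stored as 0.007 cubic metres
    return Quantity(n * CUBIC_METRES_PER_LITRE, 3)


distance = mpg(15) * litres(7)
print(distance.metre_power)           # 1
print(round(distance.to_miles(), 3))  # 27.738
```

The trade-off raised in the message still applies: eager normalization discards the units the user originally typed, so displaying the result in kilometres or miles becomes a separate, explicit choice.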
participants (20)
- André Roberge
- Ben Rudiak-Gould
- Brian McCall
- Chris Angelico
- Christopher Barker
- David Mertz, Ph.D.
- dn
- Ethan Furman
- Greg Ewing
- Ken Kundert
- Luca Baldini
- Matt del Valle
- Mike Miller
- Paul Moore
- python@shalmirane.com
- Ricky Teachey
- Sebastian Berg
- Simão Afonso
- Stephen J. Turnbull
- Steven D'Aprano