Re: [Python-ideas] SI scale factors in Python

All, This proposal basically has two parts. One part is that Python should naturally support printing real numbers with SI scale factors. Currently there are three formats for printing real numbers: %f, %e, %g. They all become difficult to read for even moderately large or small numbers. Exponential notation is hard for humans to read. That is why SI scale factors have largely replaced exponential notation everywhere except in programming. Adding another format for printing real numbers in human readable form seems like a modest extension that is long past due in all programming languages. I am asking that Python be the leader here. I am sure other languages will pick it up once it is implemented in Python. The second part is the logical dual to the first: input. People should be able to enter numbers in Python using SI scale factors. This means as real literals, such as 2.4G, but it should also work with casting, float('2.4G'). Once you allow SI scale factors on numbers, the natural tendency is for people to want to add units, which is a good thing because it gives important information about the number. We should allow it because it improves the code by making it more self documenting. Even if the language completely ignores the units, we have still gained by allowing the units to be there, just like we gain when we allow users to add comments to their code even though the compiler ignores them. Some people have suggested that we take the next step and use the units for dimensional analysis, but that is highly problematic because you cannot do dimensional analysis unless everything is specified with the correct units, and that can be a huge burden for the user. So instead, I am suggesting that we provide simple hooks that simply allow access to the units. That way people can build dimensional analysis packages using the units if they feel the need. -Ken

On 25 August 2016 at 19:02, Ken Kundert <python-ideas@shalmirane.com> wrote:
This part would be easy to implement as a small PyPI module, with a formatting function siformat(num) -> str. If that proved popular (and it well might) then there's more of a case for adding it as a language feature, maybe as a custom string formatting code, or maybe just as a standard library function. But without that sort of real-world experience, it's likely that any proposal will get bogged down in "what if" theoretical debate.
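As a rough illustration of how small such a module could be, here is a minimal sketch of a formatting function. Only the name siformat comes from the message above; the prefix table, the `digits` parameter and the rounding behaviour are assumptions made for this example, not an existing package.

```python
import math

# Hypothetical prefix table: decimal exponent -> SI scale factor.
# "u" stands in for the micro sign to keep the output ASCII.
_PREFIXES = {-24: "y", -21: "z", -18: "a", -15: "f", -12: "p", -9: "n",
             -6: "u", -3: "m", 0: "", 3: "k", 6: "M", 9: "G", 12: "T",
             15: "P", 18: "E", 21: "Z", 24: "Y"}


def siformat(num, digits=4):
    """Format num with an SI scale factor and `digits` significant digits."""
    if num == 0 or not math.isfinite(num):
        return f"{num:g}"
    # Largest multiple of 3 not exceeding the decimal exponent, clamped
    # to the range covered by the prefix table.
    exp = 3 * math.floor(math.log10(abs(num)) / 3)
    exp = max(-24, min(24, exp))
    # Round first, so that a mantissa that rounds up to 1000 moves to
    # the next prefix instead of printing as "1000" or "1e+03".
    mantissa = float(f"{num / 10.0 ** exp:.{digits}g}")
    if abs(mantissa) >= 1000 and exp < 24:
        mantissa /= 1000
        exp += 3
    return f"{mantissa:.{digits}g}{_PREFIXES[exp]}"


print(siformat(2.4e9), siformat(0.023), siformat(-500e-9))
```

Whether "E" should be emitted for exa at all is one of the open questions raised later in the thread; the table above simply picks the SI meaning.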
This is much more problematic. Currently, even the long-established decimal and rational types don't enjoy syntactic support, so I'd be surprised if SI scale factors got syntax support first. But following their lead, by (again) starting with a conversion function along the lines of SI("3k") -> 3000 would be a good test of applicability. It could easily go in the same module as you're using to trial the string formatting function above. I'd expect that a function to do this conversion would be a good way of thrashing out the more controversial aspects of the proposal - whether E means "exponent" or "exa", whether M and G get misinterpreted as computing-style 2**20 and 2**30, etc. Having real world experience of how to solve these questions would be invaluable in moving forward with a proposal to add language support.
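A conversion function of this kind might look something like the sketch below. The name parse_si and every policy decision in it are assumptions chosen for illustration: M and G take their decimal SI meanings, "E" is deliberately left out of the prefix table so that it can only mean "exponent", and trailing units are rejected rather than ignored.

```python
import re

# Hypothetical prefix table; "E" (exa) is omitted on purpose so that
# strings such as "1E3" keep their usual exponent meaning.
_EXPONENTS = {"y": -24, "z": -21, "a": -18, "f": -15, "p": -12, "n": -9,
              "u": -6, "m": -3, "k": 3, "M": 6, "G": 9, "T": 12, "P": 15,
              "Z": 21, "Y": 24}
_PATTERN = re.compile(r"([-+]?(?:\d+\.?\d*|\.\d+))([yzafpnumkMGTPZY])")


def parse_si(text):
    """Convert a string such as '3k' or '2.4G' to a float."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        # No scale factor: defer to float(), which raises ValueError
        # for anything it does not understand (including units).
        return float(text)
    number, prefix = match.groups()
    # Re-spell the scale factor as an exponent and let float() do the
    # decimal-to-binary rounding once.
    return float(f"{number}e{_EXPONENTS[prefix]}")


print(parse_si("3k"), parse_si("2.4G"), parse_si("500n"))
```

Going through the string form (for example "500e-9") rather than multiplying by 1e-9 means the result is the same float that the equivalent exponent literal would produce.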
Dimensional analysis packages already exist (I believe) and they don't rely on syntactic support. Any proposal to add something to the language *really* needs to demonstrate that 1. Those packages are currently suffering from the lack of language support. 2. The proposed change would allow them to resolve existing problems that they haven't been able to address any other way. 3. The proposed change isn't some sort of "attractive nuisance" for naive users, leading them to think they can write dimensionally correct programs *without* using one of the existing packages. Python has a track record of being open to adding syntactic support if it demonstrably helps 3rd party tools (for example, the matrix multiplication operator was added specifically to help the numeric Python folks address a long-standing issue they had), so this is a genuine possibility - but such proposals need support from the groups they are intended to help. At the moment, I'm not even aware of a particular "dimensional analysis with Python" community, or any particular "best of breed" package in this area that might lead such a proposal - and a language change of this nature probably does need that sort of backing. That's not to say there's no room for debate here - the proposal is interesting, and not without precedent (for example Windows Powershell supports constants of the form 1MB, 1GB - which ironically are computing-style 2**20 and 2**30 rather than SI-style 10**6 and 10**9). But there's a pretty high bar for a language change like this, and it's worth doing the groundwork to avoid wasting a lot of time on something that's not going to be accepted in its current form. Hope this helps, Paul

On Thu, Aug 25, 2016 at 09:03:32PM +0100, Paul Moore wrote:
That's great! I know a few command line tools and scripts which do that, and it's really useful.
- which ironically are computing-style 2**20 and 2**30 rather than SI-style 10**6 and 10**9).
Do I remember correctly that Windows file Explorer displays disk sizes in decimal SI units? If so, how very Microsoft, to take a standard and confuse it rather than encourage it :-( Historically, there are *three* different meanings for "MB", only one of which is an official standard: http://physics.nist.gov/cuu/Units/binary.html -- Steve

On Thu, Aug 25, 2016, at 19:50, Steven D'Aprano wrote:
The link doesn't work for me... is the third one the 1,024,000 bytes implicit in describing standard-formatted floppy disks as "1.44 MB" (they are actually 1440 KiB, i.e. 1,474,560 bytes: 80 tracks, 2 sides, 18 512-byte sectors) or "1.2 MB" (15 sectors)?

On Thu, Aug 25, 2016 at 11:34:23PM -0400, Random832 wrote:
Quoting from the above document: Historical context* Once upon a time, computer professionals noticed that 2**10 was very nearly equal to 1000 and started using the SI prefix "kilo" to mean 1024. That worked well enough for a decade or two because everybody who talked kilobytes knew that the term implied 1024 bytes. But, almost overnight a much more numerous "everybody" bought computers, and the trade computer professionals needed to talk to physicists and engineers and even to ordinary people, most of whom know that a kilometer is 1000 meters and a kilogram is 1000 grams. Then data storage for gigabytes, and even terabytes, became practical, and the storage devices were not constructed on binary trees, which meant that, for many practical purposes, binary arithmetic was less convenient than decimal arithmetic. The result is that today "everybody" does not "know" what a megabyte is. When discussing computer memory, most manufacturers use megabyte to mean 2**20 = 1 048 576 bytes, but the manufacturers of computer storage devices usually use the term to mean 1 000 000 bytes. Some designers of local area networks have used megabit per second to mean 1 048 576 bit/s, but all telecommunications engineers use it to mean 10**6 bit/s. And if two definitions of the megabyte are not enough, a third megabyte of 1 024 000 bytes is the megabyte used to format the familiar 90 mm (3 1/2 inch), "1.44 MB" diskette. The confusion is real, as is the potential for incompatibility in standards and in implemented systems. -- Steve
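The arithmetic behind the three megabytes is easy to check. The sketch below uses the floppy geometry given earlier in the thread (80 tracks, 2 sides, 18 sectors of 512 bytes); the variable names are made up for the example.

```python
# The three "megabytes" described in the quoted NIST text.
MEGABYTE_BINARY = 2 ** 20        # 1 048 576 bytes (memory)
MEGABYTE_DECIMAL = 10 ** 6       # 1 000 000 bytes (storage, SI)
MEGABYTE_FLOPPY = 1000 * 1024    # 1 024 000 bytes (diskette labels)

# Capacity of a "1.44 MB" diskette:
# tracks * sides * sectors per track * bytes per sector.
floppy_bytes = 80 * 2 * 18 * 512

sizes = {
    "binary": floppy_bytes / MEGABYTE_BINARY,
    "decimal": floppy_bytes / MEGABYTE_DECIMAL,
    "floppy": floppy_bytes / MEGABYTE_FLOPPY,
}

for name, value in sizes.items():
    print(f"{name:8s} {value:.5f} MB")
```

Only the mixed 1000 * 1024 definition yields the advertised figure of 1.44.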

On Thu, Aug 25, 2016 at 11:02:11AM -0700, Ken Kundert wrote:
This is dangerously wrong, and the analogy with comments is misleading. Everyone knows that comments are ignored by the interpreter, and even then, the ideal is to write self-documenting code, not comments: "At Resolver we've found it useful to short-circuit any doubt and just refer to comments in code as 'lies'. " --Michael Foord paraphrases Christian Muirhead on python-dev, 2009-03-22 This part of your proposal would be *worse*: you would fool the casual or naive user into believing that Python did dimensional analysis, while in fact not doing so. You would give them a false sense of security. Don't think of people writing code like this: result = 23mA + 75MHz which is obviously wrong. Think about them writing code like this: total = sum_resistors_in_parallel(input, extra) where the arguments may themselves have been passed to the current function as parameters from somewhere else. Or they may be data values read from a file. Their definitions may be buried deep in another part of the program. Their units aren't obvious to the reader without serious work. This part of your proposal makes the language *worse*: we lose the simple data validation that "23mA" is not a valid number, but without gaining the protection of dimensional analysis. To give an analogy, you are suggesting that we stick a sticker on the dashboard of our car saying "Airbag" but without actually installing an airbag. And you've removed the seat belt. The driver has to read the manual to learn that the "Airbag" is just a label, not an actual functioning airbag.
What? That's the *whole point* of dimensional analysis: to ensure that the user is not adding a length to a weight and then treating the result as a time. To say that "it is too hard to specify the correct units, so we should just ignore the units" boggles my mind. Any reasonable dimensional program should perform automatic unit conversions: you can add inches to metres, but not inches to pounds. There are already many of these available for Python. -- Steve

On Fri, Aug 26, 2016 at 10:14:53AM +1000, Steven D'Aprano wrote:
This idea is new to general purpose languages, but it has been used for over 40 years in the circuit design community. Specifically, SPICE, an extremely heavily used circuit simulation package, introduced this concept in 1974. In SPICE the scale factor is honored but anything after the scale factor is ignored. Being both a heavy user and developer of SPICE, I can tell you that in all that time this issue has never come up. In fact, the users never expected there to be any support for dimensional analysis, nor did they request it.
You say that '23mA + 75MHz' is obviously wrong, but it is only obviously wrong because the units are included, which is my point. If I had written '0.023 + 75e6', it would not be obviously wrong.
Indeed that is the point of dimensional analysis. However, despite the availability of all these packages, they are rarely if ever used because you have to invest a tremendous effort before they can be effective. For example, consider the simple case of Ohm's Law:

    V = R*I

To perform dimensional analysis we need to know the units of V, R, and I. These are variables, not literals, so some mechanism needs to be provided for specifying the units of variables, even those that don't exist yet, like V. And what if the following is encountered:

    V = I

Dimensional analysis says this is wrong, but it may be that the resistance is simply being suppressed because it is unity. False positives of this sort are a tremendous problem with this form of automated dimensional analysis. Then there are things like this:

    V = 2*sin(f*t)

In this case dimensional analysis (DA) indicates an error, but it is the wrong error. DA will complain about the fact that a dimensionless number is being assigned to a variable intended to carry a voltage. But in this case 2 has units of voltage, but they were not explicitly specified, so this is another false positive. The real error is the argument of the sin function. The sin function expects radians, which is dimensionless, and f*t is dimensionless, so there is no complaint, but it is not in radians, and so there should be an error. You could put a 'unit' of radians on pi, but that is not right either. Really it is 2*pi that gives you radians, and if you put radians on pi and then used pi to compute the area of a unit circle, you would get pi*r^2 where r=1 meter, and the resulting units would be radians*m^2, which is nonsensical. Turns out there are many kinds of dimensionless numbers. For example, the following represents a voltage amplifier:

    Av = 4     # voltage gain (V/V)
    Ai = 0.03  # current gain (A/A)
    Vout = Ai*Vin

In this case Ai is expected to be unitless, which it is, so there is no error. However Ai is the ratio of two currents, not two voltages, so there actually should be an error. Now consider an expression that contains an arbitrary function call:

    V = f(I)

How do we determine the units of the return value of f? Now, finally consider the BSIM4 MOSFET model equations. They are described in http://www-device.eecs.berkeley.edu/bsim/Files/BSIM4/BSIM460/doc/BSIM460_Man... If you look at this document you will find over 200 pages of extremely complicated and tedious model equations. The parameters of these models can have extremely complicated units. Well beyond anything I am proposing for real literals. For example, consider NOIA, the 'flicker noise parameter A'. It has units of (eV)^-1 s^(1-EF) m^-3, where EF is some number between 0 and 1. That will never work with dimensional analysis because it does not even make sense from the perspective of dimensional analysis. Describing all of the units in those equations would be a huge undertaking, and in the end they would end up with errors they cannot get rid of.

Dimensional analysis is a seductive siren that, in the end, demands a great deal of you and generally delivers very little. And it has little to do with my proposal, which is basically this: Numbers with SI scale factor and units have become very popular. Using them is a very common way of expressing either large or small numbers. And that is true in the scientific and engineering communities, in the programming community (even the linux sort command supports sorting on numbers with SI scale factors: --human-numeric-sort), and even in popular culture. Python should support them. And I mean support with a capital S. I can come up with many different hacks to support these ideas in Python today, and I have. But this should not be a hack. This should be built into the language front and center. It should be the preferred way that we specify and output real numbers.

-Ken

On 26 August 2016 at 13:46, Ken Kundert <python-ideas@shalmirane.com> wrote:
[snip]
Thanks for the additional background Ken - that does start to build a much more compelling case. I now think there's another analogy you'll be able to draw on to make it even more compelling at a language design level: just because the *runtime* doesn't do dimensional analysis on static unit annotations doesn't mean that sufficiently clever static analysers couldn't do so at some point in the future. That then puts this proposal squarely in the same category as function annotations and gradual typing: semantic annotations that more clearly express developer intent, and aren't checked at runtime, but can be checked by a human during code review, and (optionally) by static analysers as a quality gate. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 26 August 2016 at 08:34, Nick Coghlan <ncoghlan@gmail.com> wrote:
Unfortunately, I didn't read the whole thread, but it seems to me that this would be just a more sophisticated version of NewType. mypy type checker already supports NewType (not sure about pytype). So that one can write (assuming PEP 526):

    USD = NewType('USD', float)
    EUR = NewType('EUR', float)

    amount = EUR(100)
    # later in code
    new_amount: USD = amount  # flagged as error by type checker

The same idea applies to physical units. Of course type checkers do not know that e.g. 1m / 1s is 1 m/s, but it is something they could be taught (for example by adding @overload for division operator). -- Ivan
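For reference, a self-contained version of the NewType idea that runs as-is might look like this. The to_usd helper and the exchange rate are invented for the example; the point is that NewType costs nothing at runtime and all checking happens in the static type checker.

```python
from typing import NewType

USD = NewType("USD", float)
EUR = NewType("EUR", float)


def to_usd(amount: EUR, rate: float) -> USD:
    # Hypothetical helper: the explicit USD(...) call is what tells a
    # type checker that the conversion was intentional.
    return USD(amount * rate)


amount = EUR(100.0)
converted = to_usd(amount, 1.1)

# A static checker such as mypy is expected to reject the next line,
# because an EUR value is passed where USD is required:
#     wrong: USD = amount
# At runtime nothing is checked: both values are ordinary floats.
print(type(amount), type(converted))
```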

On Thu, Aug 25, 2016 at 08:46:54PM -0700, Ken Kundert wrote:
This idea is new to general purpose languages,
For the record, at least some HP calculators include a "units" data type as part of the programming language "RPL", e.g. the HP-28 and HP-48 series. I've been using those for 20+ years so I'm quite familiar with how useful this feature can be.
I can't comment about the circuit design community, but you're trying to extrapolate from a single specialist application to a general purpose programming language used by people of many, many varied levels of expertise, of competence, with many different needs. It makes a lot of sense for applications to allow SI prefixes as suffixes within a restricted range. For example, the dd application allows the user to specify the amount of data to copy using either bytes or blocks, with optional suffixes:

    BLOCKS and BYTES may be followed by the following multiplicative suffixes: c =1, w =2, b =512, kB =1000, K =1024, MB =1000*1000, M =1024*1024, xM =M, GB =1000*1000*1000, G =1024*1024*1024, and so on for T, P, E, Z, Y.

(Quoting from the man page.) That makes excellent sense for a specialist application where numeric quantities always mean the same thing, or in this case, one of two things. As purely multiplicative suffixes, that even makes sense for Python: earlier I said that it was a good idea to add a simple module defining SI and IEC multiplicative constants in the std lib so that we could do x = 42*M or similar. But that's a far cry from allowing and ignoring units.
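A module of multiplicative constants along those lines needs no language support at all. The sketch below is one possible shape for it; the names are assumptions (no such module exists in the standard library), with the IEC binary multiples given their official Ki/Mi/Gi spellings to keep them apart from the decimal SI ones.

```python
# Hypothetical constants module: SI (decimal) multipliers.
k = 10 ** 3
M = 10 ** 6
G = 10 ** 9
T = 10 ** 12

# IEC (binary) multipliers.
Ki = 2 ** 10
Mi = 2 ** 20
Gi = 2 ** 30
Ti = 2 ** 40

x = 42 * M           # forty-two million, SI style
buffer_size = 4 * Gi  # four gibibytes, IEC style

print(x, buffer_size)
```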
I understand your point and the benefit of dimensional analysis. But the problem is, as users of specialised applications we may be used to doing direct arithmetic on numeric literal values, with or without attached units:

    23mA + 75MHz  # error is visible
    23 + 75       # error is hidden

but as *programmers* we rarely do that. Generally speaking, it is rare to be doing arithmetic on literals where we might have the opportunity to attach a unit. We are doing arithmetic on *variables* that have come from elsewhere. Reading the source code doesn't show us something that might be improved by adding a unit:

    # we hardly ever see this
    23 + 75
    # we almost always see something like this
    input + argument

At best, we can choose descriptive variable names that hint what the correct dimensions should be:

    weight_of_contents + weight_of_container

The argument that units would make it easier for the programmer to spot errors is, I think, invalid, because the programmer will hardly ever get to see the units. [...]
This is not difficult, and you exaggerate the amount of effort required. To my sorrow, I'm not actually familiar with any of the Python libraries for this, so I'll give an example using the HP-48GX RPL language. Suppose I have a value which I am expecting to be current in amperes. (On the HP calculator, it will be passed to me on the stack, but the equivalent in Python will be an argument passed to a function.) For simplicity, let's assume that if it is a raw number, I will trust that the user knows what they are doing and just convert it to a unit object with dimension "ampere", otherwise I expect some sort of unit object which is dimensionally compatible:

    1_A CONV

is the RPL program to perform this conversion on the top of the stack, and raise an error if the dimensions are incompatible. Converting to a more familiar Python-like API, I would expect something like:

    current = Unit("A").convert(current)

or possibly:

    current = Unit("A", current)

take your pick. That's not a "tremendous" amount of effort, it is comparable to ensuring that I'm using (say) floats in the first place:

    if not isinstance(current, float):
        raise TypeError
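To make the Python-like API above concrete, here is a deliberately tiny sketch. Unit and Quantity are hypothetical classes written for this example, not the API of any existing package, and the compatibility test is a plain comparison of unit symbols, so it does no conversion between compatible units (mA to A, say) the way a real units library would.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number tagged with a unit symbol (hypothetical)."""
    value: float
    unit: str


class Unit:
    """Hypothetical unit object offering the convert() call sketched above."""

    def __init__(self, symbol):
        self.symbol = symbol

    def convert(self, value):
        if isinstance(value, Quantity):
            # Simplification: "compatible" means "same symbol" here.
            if value.unit != self.symbol:
                raise TypeError(
                    f"expected {self.symbol}, got {value.unit}")
            return value
        # A raw number is trusted and tagged with this unit.
        return Quantity(float(value), self.symbol)


current = Unit("A").convert(0.023)   # raw number becomes 0.023 A
print(current)

try:
    Unit("A").convert(Quantity(75e6, "Hz"))
except TypeError as exc:
    print("rejected:", exc)
```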
That's because it is wrong.
but it may be that the resistance is simply being suppressed because it is unity.
Your specialist experience in the area of circuit design is misleading you. There's effectively only one unit of resistance, the ohm, although my "units" program also lists:

    R_K          25812.807 ohm
    abohm        abvolt / abamp
    intohm       1.000495 ohm
    kilohm       kiloohm
    megohm       megaohm
    microhm      microohm
    ohm          V/A
    siemensunit  0.9534 ohm
    statohm      statvolt / statamp

So even here with resistance "unity" is ambiguous. Do you mean one intohm, one microhm, one statohm or something else? I'll grant you that in the world of circuit design perhaps it could only mean the SI ohm. But you're not in the world of circuit design any more, you are dealing with a programming language that will be used by people for many, many different purposes, for whom "unity" might mean (for example):

    1 foot per second
    1 foot per minute
    1 metre per second
    1 kilometre per hour
    1 mile per hour
    1 lightspeed
    1 knot
    1 mach

Specialist applications might be able to take shortcuts in dimensional analysis when "everybody knows" what the suppressed units must be. General purpose programming languages *cannot*. It is better NOT to offer the illusion of dimensional analysis than to mislead the user into thinking they are covered when they are not. Better to let them use a dedicated units package, not build a half-baked bug magnet into the language syntax. -- Steve

On 26 August 2016 at 16:54, Steven D'Aprano <steve@pearwood.info> wrote:
This is based on a narrowly construed definition of "programming" though. It makes more sense in the context of interactive data analysis and similar activities, where Python is being used as a scripting language, rather than as a full-fledged applications programming language. So let's consider the following hypothetical:

1. We add SI multiplier support to the numeric literal syntax, with "E" unilaterally replaced with "X" (for both input and output) to avoid the ambiguity with exponential notation
2. One or more domain specific libraries adopt Ivan Levkivskyi's suggestion of using PEP 526 to declare units

Then Ken's example becomes:

    from circuit_units import A, V, Ohm, seconds

    delta: A
    for delta in [-500n, 0, 500n]:
        input: A = 2.75u + delta
        wait(seconds(1u))
        expected: V = Ohm(100k)*input
        tolerance: V = 2.2m
        fails = check_output(expected, tolerance)
        print('%s: I(in)=%rA, measured V(out)=%rV, expected V(out)=%rV, diff=%rV.' % (
            'FAIL' if fails else 'pass',
            input, get_output(), expected, get_output() - expected
        ))

The only new pieces there beyond PEP 526 itself are the SI unit multiplier on literals, and the type annotations declared in the circuit_units module. To actually get a typechecker to be happy with the code, Ohm.__mul__ would need to be overloaded as returning a V result when the RHS is categorised as A. An environment focused on circuit simulation could pre-import some of those symbols so users didn't need to do it explicitly. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Aug 26, 2016 at 06:31:30PM +1000, Nick Coghlan wrote:
I don't think that is right. I'd put it the other way: as important as interactive use is, it is a subset of general programming, not a superset of it. I *love* using Python as a calculator (which is why this thread is inspiring me to investigate the unit conversion/tracking packages already available). But even when using Python as a calculator, oh I'm sorry, "for interactive data analysis" *wink*, there are going to be plenty of opportunities for me to write: x = ... y = ... # much later on z = x + y so that I don't necessarily see the units directly there on the screen by the time I actually go to use them. Likewise if I'm reading my values from a data file. IPython even generalises the magic variable _ into a potentially unlimited series of magic variables _1 _2 _3 etc, and it is normal to be using values taken from a variable rather than as a literal. The point is that Ken's examples of calculations on literals is misleading, because only a fraction of calculations involve literals. And likely a small fraction, if you consider all the people using Python for scripting, server-side programming, application programming, etc rather than just the subset of them using it for interactive use. By the way, here's another programming language designed for interactive use as a super-calculator. Frink does dimensional analysis and unit conversions: http://frinklang.org/#SampleCalculations If we're serious about introducing dimension and unit handling to Python, we should look at: - existing Python libraries; - HP calculators; - Frink; at the very least. -- Steve

On 2016-08-26 07:54, Steven D'Aprano wrote: [snip]
If you're going to have units, you might also include (for want of a name) "colours" (or "flavours"), which behave slightly differently with respect to arithmetic operators. For example: When you add 2 values, they must have the same units and same colours. The result will have the same units and colours. # "amp" is a unit, "current" is a colour. # The result is a current measured in amps. 1 amp current + 2 amp current == 3 amp current When you divide 2 values, they could have the same or different units, but must have the same colours. The result will have a combination of the units (some might also cancel out), but will have the same colours. # "amp" is a unit, "current" is a colour. # The result is a ratio of currents. 6 amp current / 2 amp current == 3 current

MRAB writes:
I don't understand why a ratio would retain color. What's the application? For example, in circuit analysis, if "current" is a color, I would expect "potential" and "resistance" to be colors, too. But from I1*R1 = V = I2*R2, we have I1/I2 = R2/R1 in a parallel circuit, so unitless ratios of color current become unitless ratios of color resistance. Furthermore that ratio might arise from physical phenomena such as temperature-varying resistance, and be propagated to another physical phenomenon such as the deflection of a meter's needle. What might color tell us about these computations? (Note: I'm pretty sure that MRAB didn't choose "current" as a parallel to "resistance". Nevertheless, the possibility of propagation of values across color boundaries seems to be necessary and I don't see how color is going to be used.)

My own take is that units specify possible operations, ie, they are nothing more than a partial specification of a type. Rather than speculate on additional attributes that might be useful in conjunction with units, we should see if there are convenient ways to describe the constraints that units produce on behavior of types. Ie, by creating types VoltType(UnitType), AmpereType(UnitType), and OhmType(UnitType), and specifying the "equation" VoltType = AmpereType * OhmType on those types, the __mul__ and __div__ operators would be modified to implement the expected operations as a function of UnitType. That is, there would be a helper impose_type_expression_equivalence() that would take the string "VoltType = AmpereType * OhmType" and manipulate the derived type methods appropriately. One aspect of this approach is that we can conveniently derive concrete units such as

    V = VoltType(1)         # Some users might prefer the unit
                            # Volt, variable V, thus "VoltType"
    mA = AmpereType(1e-3)   # SI scale prefix!
    kΩ = OhmType(1e3)       # Unicode! <wink/>

They don't serve the OP's request (he *doesn't* want type checking, he *does* want syntax), but I prefer these anyway:

    10*V == (2*mA)*(5*kΩ)

Developers of types for circuit analysis derived from UnitType might prefer different names that reflect the type being measured rather than the unit, eg Current or CurrentType instead of AmpereType. There is no problem with units like Joule (in syntax-based proposals, it collides with the imaginary unit) and Kelvin (in syntax-based proposals, it collides with a non-SI prefix that nevertheless is so commonly used that both the OP and Steven D'Aprano say should be recognized as "kilo").

Another advantage (IMHO) is that "reified" units can be treated as equivalent to "real" (aka standard or base) units. What I mean is that New York is not approximately 4 Mm from Los Angeles (that would give most people a WTF feeling), it's about 4000 km. While I realize programmers will be able to do that conversion, this flexibility allows people to use units that feel natural to them. If you want to use miles and feet, you can define them as

    ft = LengthType(12/39.37)   # base unit is meter per SI
    mi = 5280*ft

very conveniently. Using this approach, Ken's example that Nick rewrote to use type hinting would look like this:

    from circuit_units import uA, V, mV, kOhm, u_second, VoltType

    us = u_second               # Use project convention.
                                # u_second = SecondType(1e-6)
                                # A gratuitous style change.
    expected: VoltType          # With so few declarations,
                                # I prefer "predeclaration".
                                # There is no millivolt type,
                                # so derived units are all
                                # consistent with this variable.
                                # A gratuitous style change.
    for delta in [-0.5*uA, 0*uA, 0.5*uA]:
                                # uA = AmpereType(1e-6)
                                # I dislike [-0.5, 0, 0.5]*uA,
                                # but it could be implemented.
        input = 2.75*uA + delta
        wait(1*us)              # The "1*" is redundant.
        expected = (100*kOhm)*input   # kOhm = OhmType(1e3)
        tolerance = 2.2*mV      # mV = VoltType(1e-3)
        fails = check_output(expected, tolerance)
        print('%s: I(in)=%rA, measured V(out)=%rV, expected V(out)=%rV, diff=%rV.' % (
            'FAIL' if fails else 'pass',
            input, get_output(), expected, get_output() - expected
        ))

Hmm: need for only *one* variable declaration. This is very much to my personal taste, YMMV.

The main question is whether this device could support efficient computation. All of these units are objects with math dunders that have to dispatch on type (or else they need to produce "expression Types" such as Ampere_Ohm, but I don't think type checkers would automatically know that Volt = Ampere_Ohm). This clearly can't compare to the efficiency of NewType(float). But AIUI, one NewType(float) can't mix with another, which is not the behavior we want here. We could do

    VoltType = NewType('Volt', float)
    AmpereType = NewType('Ampere', float)
    WattType = NewType('Watt', float)

    def power(potential, current):
        return WattType(float(potential)*float(current))

but this is not very readable, and error-prone IMO. It's also less efficient than the "zero cost" that NewType promises for types like User_ID (https://www.python.org/dev/peps/pep-0484/#newtype-helper-function). I suppose it would be feasible (though ugly) to provide two implementations of VoltType, one as a "real" class as I propose above, and the other simply VoltType = float. The former would be used for type checking, the latter for production computations. Perhaps such a process could be automated in mypy?

A final advantage is that I suppose that it should be possible to implement "color" as MRAB proposes through the Python type system. We don't have to define it now, but can take advantage of the benefits of "units as types" approach immediately.

I don't know how to implement impose_type_expression_equivalence(), yet, so this is not a proposal for the stdlib. But the necessary definitions by hand are straightforward, though tedious. Individual implementations of units can be done *now*, without change to Python, AFAICS. Computational efficiency is an issue, but one that doesn't matter to educational applications, for example.

Steve
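One way the hand-written definitions might go is sketched below. This is an assumption-laden illustration, not the impose_type_expression_equivalence() helper itself: the declare_product() function, the lookup table and the tolerance-based equality are all invented for the example, and only multiplication is covered.

```python
import math


class UnitType:
    """Base class for unit-carrying values (hypothetical sketch)."""

    # Maps (left type, right type) to the type of their product.
    _products = {}

    def __init__(self, value):
        self.value = float(value)

    def __mul__(self, other):
        if isinstance(other, (int, float)):
            # Scaling by a plain number keeps the unit.
            return type(self)(self.value * other)
        if isinstance(other, UnitType):
            result = UnitType._products.get((type(self), type(other)))
            if result is None:
                raise TypeError(
                    f"no product declared for {type(self).__name__} * "
                    f"{type(other).__name__}")
            return result(self.value * other.value)
        return NotImplemented

    # Reached only when the left operand is a plain number, e.g. 2*mA.
    __rmul__ = __mul__

    def __eq__(self, other):
        return (type(self) is type(other)
                and math.isclose(self.value, other.value))

    def __repr__(self):
        return f"{type(self).__name__}({self.value!r})"


class VoltType(UnitType): pass
class AmpereType(UnitType): pass
class OhmType(UnitType): pass


def declare_product(result, left, right):
    """Record by hand that result = left * right (in either order)."""
    UnitType._products[(left, right)] = result
    UnitType._products[(right, left)] = result


declare_product(VoltType, AmpereType, OhmType)

V = VoltType(1)
mA = AmpereType(1e-3)
kOhm = OhmType(1e3)

print((2*mA)*(5*kOhm))
print(10*V == (2*mA)*(5*kOhm))
```

Every multiplication goes through a dictionary lookup and an object allocation, which is the efficiency cost the message above is worried about.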

I've been following this discussion on and off for a while, but still fail to see how SI units, factors or the like are a use case which is general enough to warrant changing the language. There are packages available on PyPI for dealing with this in a similar way we deal with decimal literals in Python: C extension: https://pypi.python.org/pypi/cfunits/ http://pythonhosted.org/cfunits/cfunits.Units.html (interfaces to the udunits-2 lib: http://www.unidata.ucar.edu/software/udunits/udunits-2.2.20/doc/udunits/udun...) Pure python: https://pypi.python.org/pypi/units/ IMHO, a literal notation like "2 m" is more likely related to a missing operator which should be flagged as SyntaxError than the declaration of an integer with associated unit. By keeping such analysis to string to object conversion tools/functions you make the intent explicit, which allows for better error reporting. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Aug 27 2016)

On 25 August 2016 at 19:02, Ken Kundert <python-ideas@shalmirane.com> wrote:
This part would be easy to implement as a small PyPI module, with a formatting function siformat(num) -> str. If that proved popular (and it well might) then there's more of a case for adding it as a language feature, maybe as a custom string formatting code, or maybe just as a standard library function. But without that sort of real-world experience, it's likely that any proposal will get bogged down in "what if" theoretical debate.
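To make the suggestion concrete, here is a minimal sketch of what such a siformat() might look like. Only the name and the num -> str shape come from the message above; the `digits` parameter, the prefix table, and the use of "u" for micro are assumptions made for illustration.

```python
import math

# SI prefixes for exponents that are multiples of three (assumed table).
_PREFIXES = {
    -24: "y", -21: "z", -18: "a", -15: "f", -12: "p", -9: "n", -6: "u", -3: "m",
    0: "", 3: "k", 6: "M", 9: "G", 12: "T", 15: "P", 18: "E", 21: "Z", 24: "Y",
}


def siformat(num, digits=4):
    """Format num with an SI scale factor, e.g. 2.4e9 -> '2.4G'."""
    if num == 0 or not math.isfinite(num):
        return str(num)
    # Largest multiple of three not exceeding the decimal exponent.
    exp = math.floor(math.log10(abs(num)) / 3) * 3
    # Clamp to the range covered by the prefix table.
    exp = max(-24, min(24, exp))
    mantissa = num / 10.0 ** exp
    return f"{mantissa:.{digits}g}{_PREFIXES[exp]}"


print(siformat(2.4e9))     # 2.4G
print(siformat(0.000023))  # 23u
print(siformat(1500))      # 1.5k
```

A real module would still have to decide what to do at rounding boundaries (999.99 formats as "1000" here, not "1k") and outside the prefix range, which is exactly the kind of detail real-world use would settle.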
This is much more problematic. Currently, even the long-established decimal and rational types don't enjoy syntactic support, so I'd be surprised if SI scale factors got syntax support first. But following their lead, by (again) starting with a conversion function along the lines of SI("3k") -> 3000 would be a good test of applicability. It could easily go in the same module as you're using to trial the string formatting function above. I'd expect that a function to do this conversion would be a good way of thrashing out the more controversial aspects of the proposal - whether E means "exponent" or "exa", whether M and G get misinterpreted as computing-style 2**20 and 2**30, etc. Having real world experience of how to solve these questions would be invaluable in moving forward with a proposal to add language support.
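A sketch of the conversion function described above, assuming one particular answer to the "E" question: a trailing E with no digits after it means exa, while anything that is already valid float syntax (such as 1E3) keeps its usual meaning. The regular expression and prefix table are illustrative assumptions, not an agreed design.

```python
import re

# Decimal exponent for each accepted scale factor (assumed set, "u" for micro).
_EXPONENTS = {
    "y": -24, "z": -21, "a": -18, "f": -15, "p": -12, "n": -9, "u": -6, "m": -3,
    "k": 3, "M": 6, "G": 9, "T": 12, "P": 15, "E": 18, "Z": 21, "Y": 24,
}
_PATTERN = re.compile(r"([-+]?(?:\d+\.?\d*|\.\d+))([yzafpnumkMGTPEZY])")


def SI(text):
    """Convert a string such as '3k' or '2.4G' to a float."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        # No scale factor: fall back to ordinary float syntax, so '1E3' is 1000.0.
        return float(text)
    number, prefix = match.groups()
    # Rewrite '2.4G' as '2.4e9' and let float() do the correctly rounded parse.
    return float(f"{number}e{_EXPONENTS[prefix]}")


print(SI("3k"))    # 3000.0
print(SI("1E3"))   # 1000.0 (exponent)
print(SI("1E"))    # 1e+18  (exa)
```

Here M and G are always 10**6 and 10**9; a trial module could just as easily take the other side of that argument.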
Dimensional analysis packages already exist (I believe) and they don't rely on syntactic support. Any proposal to add something to the language *really* needs to demonstrate that

1. Those packages are currently suffering from the lack of language support.
2. The proposed change would allow them to resolve existing problems that they haven't been able to address any other way.
3. The proposed change isn't some sort of "attractive nuisance" for naive users, leading them to think they can write dimensionally correct programs *without* using one of the existing packages.

Python has a track record of being open to adding syntactic support if it demonstrably helps 3rd party tools (for example, the matrix multiplication operator was added specifically to help the numeric Python folks address a long-standing issue they had), so this is a genuine possibility - but such proposals need support from the groups they are intended to help. At the moment, I'm not even aware of a particular "dimensional analysis with Python" community, or any particular "best of breed" package in this area that might lead such a proposal - and a language change of this nature probably does need that sort of backing.

That's not to say there's no room for debate here - the proposal is interesting, and not without precedent (for example Windows Powershell supports constants of the form 1MB, 1GB - which ironically are computing-style 2**20 and 2**30 rather than SI-style 10**6 and 10**9). But there's a pretty high bar for a language change like this, and it's worth doing the groundwork to avoid wasting a lot of time on something that's not going to be accepted in its current form.

Hope this helps,
Paul

On Thu, Aug 25, 2016 at 09:03:32PM +0100, Paul Moore wrote:
That's great! I know a few command line tools and scripts which do that, and it's really useful.
- which ironically are computing-style 2**20 and 2**30 rather than SI-style 10**6 and 10**9).
Do I remember correctly that Windows file Explorer displays disk sizes in decimal SI units? If so, how very Microsoft, to take a standard and confuse it rather than encourage it :-(

Historically, there are *three* different meanings for "MB", only one of which is an official standard:

http://physics.nist.gov/cuu/Units/binary.html

-- Steve

On Thu, Aug 25, 2016, at 19:50, Steven D'Aprano wrote:
The link doesn't work for me... is the third one the 1,024,000 bytes implicit in describing standard-formatted floppy disks as "1.44 MB" (they are actually 1440 KiB: 80 tracks, 2 sides, 18 512-byte sectors) or "1.2 MB" (15 sectors)?

On Thu, Aug 25, 2016 at 11:34:23PM -0400, Random832 wrote:
Quoting from the above document:

Historical context

Once upon a time, computer professionals noticed that 2**10 was very nearly equal to 1000 and started using the SI prefix "kilo" to mean 1024. That worked well enough for a decade or two because everybody who talked kilobytes knew that the term implied 1024 bytes. But, almost overnight a much more numerous "everybody" bought computers, and the trade computer professionals needed to talk to physicists and engineers and even to ordinary people, most of whom know that a kilometer is 1000 meters and a kilogram is 1000 grams.

Then data storage for gigabytes, and even terabytes, became practical, and the storage devices were not constructed on binary trees, which meant that, for many practical purposes, binary arithmetic was less convenient than decimal arithmetic. The result is that today "everybody" does not "know" what a megabyte is. When discussing computer memory, most manufacturers use megabyte to mean 2**20 = 1 048 576 bytes, but the manufacturers of computer storage devices usually use the term to mean 1 000 000 bytes. Some designers of local area networks have used megabit per second to mean 1 048 576 bit/s, but all telecommunications engineers use it to mean 10**6 bit/s. And if two definitions of the megabyte are not enough, a third megabyte of 1 024 000 bytes is the megabyte used to format the familiar 90 mm (3 1/2 inch), "1.44 MB" diskette. The confusion is real, as is the potential for incompatibility in standards and in implemented systems.

-- Steve
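The three megabytes are easy to check with the floppy geometry quoted earlier in the thread (80 tracks, 2 sides, 18 sectors of 512 bytes). This is only a worked calculation; the variable names are made up for the example.

```python
# Capacity of a "1.44 MB" diskette from its geometry.
capacity = 80 * 2 * 18 * 512          # bytes
print(capacity)                       # 1474560

# The same capacity under the three competing definitions of "megabyte".
print(capacity / 10**6)               # 1.47456  (SI: 1 000 000 bytes)
print(capacity / 2**20)               # 1.40625  (binary: 1 048 576 bytes)
print(capacity / (1000 * 1024))       # 1.44     (mixed: 1 024 000 bytes)
```

Only the mixed 1000 * 1024 definition yields the 1.44 printed on the label.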

On Thu, Aug 25, 2016 at 11:02:11AM -0700, Ken Kundert wrote:
This is dangerously wrong, and the analogy with comments is misleading. Everyone knows that comments are ignored by the interpreter, and even then, the ideal is to write self-documenting code, not comments:

    "At Resolver we've found it useful to short-circuit any doubt and just refer to comments in code as 'lies'."
    --Michael Foord paraphrases Christian Muirhead on python-dev, 2009-03-22

This part of your proposal would be *worse*: you would fool the casual or naive user into believing that Python did dimensional analysis, while in fact not doing so. You would give them a false sense of security.

Don't think of people writing code like this:

    result = 23mA + 75MHz

which is obviously wrong. Think about them writing code like this:

    total = sum_resistors_in_parallel(input, extra)

where the arguments may themselves have been passed to the current function as parameters from somewhere else. Or they may be data values read from a file. Their definitions may be buried deep in another part of the program. Their units aren't obvious to the reader without serious work.

This part of your proposal makes the language *worse*: we lose the simple data validation that "23mA" is not a valid number, but without gaining the protection of dimensional analysis.

To give an analogy, you are suggesting that we stick a sticker on the dashboard of our car saying "Airbag" but without actually installing an airbag. And you've removed the seat belt. The driver has to read the manual to learn that the "Airbag" is just a label, not an actual functioning airbag.
What? That's the *whole point* of dimensional analysis: to ensure that the user is not adding a length to a weight and then treating the result as a time. To say that "it is too hard to specify the correct units, so we should just ignore the units" boggles my mind. Any reasonable dimensional program should perform automatic unit conversions: you can add inches to metres, but not inches to pounds. There are already many of these available for Python. -- Steve

On Fri, Aug 26, 2016 at 10:14:53AM +1000, Steven D'Aprano wrote:
This idea is new to general purpose languages, but it has been used for over 40 years in the circuit design community. Specifically, SPICE, an extremely heavily used circuit simulation package, introduced this concept in 1974. In SPICE the scale factor is honored but anything after the scale factor is ignored. Being both a heavy user and developer of SPICE, I can tell you that in all that time this issue has never come up. In fact, the users never expected there to be any support for dimensional analysis, nor did they request it.
You say that '23mA + 75MHz' is obviously wrong, but it is only obviously wrong because the units are included, which is my point. If I had written '0.023 + 75e6', it would not be obviously wrong.
Indeed that is the point of dimensional analysis. However, despite the availability of all these packages, they are rarely if ever used because you have to invest a tremendous effort before they can be effective. For example, consider the simple case of Ohm's Law:

    V = R*I

To perform dimensional analysis we need to know the units of V, R, and I. These are variables not literals, so some mechanism needs to be provided for specifying the units of variables, even those that don't exist yet, like V.

And what if the following is encountered:

    V = I

Dimensional analysis says this is wrong, but it may be that the resistance is simply being suppressed because it is unity. False positives of this sort are a tremendous problem with this form of automated dimensional analysis.

Then there are things like this:

    V = 2*sin(f*t)

In this case dimensional analysis (DA) indicates an error, but it is the wrong error. DA will complain about the fact that a dimensionless number is being assigned to a variable intended to carry a voltage. But in this case 2 has units of voltage, but they were not explicitly specified, so this is another false positive. The real error is the argument of the sin function. The sin function expects radians, which is dimensionless, and f*t is dimensionless, so there is no complaint, but it is not in radians, and so there should be an error. You could put a 'unit' of radians on pi, but that is not right either. Really it is 2*pi that gives you radians, and if you put radians on pi and then used pi to compute the area of a unit circle, you would get pi*r^2 where r=1 meter, and the resulting units would be radians*m^2, which is nonsensical.

Turns out there are many kinds of dimensionless numbers. For example, the following represents a voltage amplifier:

    Av = 4      # voltage gain (V/V)
    Ai = 0.03   # current gain (A/A)
    Vout = Ai*Vin

In this case Ai is expected to be unitless, which it is, so there is no error. However Ai is the ratio of two currents, not two voltages, so there actually should be an error.

Now consider an expression that contains an arbitrary function call:

    V = f(I)

How do we determine the units of the return value of f?

Now, finally consider the BSIM4 MOSFET model equations. They are described in

http://www-device.eecs.berkeley.edu/bsim/Files/BSIM4/BSIM460/doc/BSIM460_Man...

If you look at this document you will find over 200 pages of extremely complicated and tedious model equations. The parameters of these models can have extremely complicated units, well beyond anything I am proposing for real literals. For example, consider NOIA, the 'flicker noise parameter A'. It has units of (eV)^-1 s^(1-EF) m^-3, where EF is some number between 0 and 1. That will never work with dimensional analysis because it does not even make sense from the perspective of dimensional analysis. Describing all of the units in those equations would be a huge undertaking, and in the end they would end up with errors they cannot get rid of.

Dimensional analysis is a seductive siren that, in the end, demands a great deal of you and generally delivers very little. And it has little to do with my proposal, which is basically this:

Numbers with SI scale factor and units have become very popular. Using them is a very common way of expressing either large or small numbers. And that is true in the scientific and engineering communities, in the programming community (even the linux sort command supports sorting on numbers with SI scale factors: --human-numeric-sort), and even in popular culture.

Python should support them. And I mean support with a capital S. I can come up with many different hacks to support these ideas in Python today, and I have. But this should not be a hack. This should be built into the language front and center. It should be the preferred way that we specify and output real numbers.

-Ken

On 26 August 2016 at 13:46, Ken Kundert <python-ideas@shalmirane.com> wrote:
[snip]
Thanks for the additional background Ken - that does start to build a much more compelling case.

I now think there's another analogy you'll be able to draw on to make it even more compelling at a language design level: just because the *runtime* doesn't do dimensional analysis on static unit annotations doesn't mean that sufficiently clever static analysers couldn't do so at some point in the future.

That then puts this proposal squarely in the same category as function annotations and gradual typing: semantic annotations that more clearly express developer intent, and aren't checked at runtime, but can be checked by a human during code review, and (optionally) by static analysers as a quality gate.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 26 August 2016 at 08:34, Nick Coghlan <ncoghlan@gmail.com> wrote:
Unfortunately, I didn't read the whole thread, but it seems to me that this would be just a more sophisticated version of NewType. mypy type checker already supports NewType (not sure about pytype). So that one can write (assuming PEP 526):

    USD = NewType('USD', float)
    EUR = NewType('EUR', float)

    amount = EUR(100)
    # later in code
    new_amount: USD = amount  # flagged as error by type checker

The same idea applies to physical units. Of course type checkers do not know that e.g. 1m / 1s is 1 m/s, but it is something they could be taught (for example by adding @overload for division operator).

-- Ivan
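For reference, a self-contained version of the NewType example that shows the runtime side of the story: NewType only exists for the static checker, so Python itself runs the mismatched assignment without complaint. The currency names are the ones used above; everything else is illustrative.

```python
from typing import NewType

USD = NewType("USD", float)
EUR = NewType("EUR", float)

amount = EUR(100.0)

# A static checker such as mypy is expected to reject this assignment,
# but at runtime nothing happens: EUR(...) returns its argument unchanged.
new_amount: USD = amount

print(type(amount) is float)   # True
```

This is the "zero cost" property: the checker sees distinct types, the interpreter sees plain floats.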

On Thu, Aug 25, 2016 at 08:46:54PM -0700, Ken Kundert wrote:
This idea is new to general purpose languages,
For the record, at least some HP calculators include a "units" data type as part of the programming language "RPL", e.g. the HP-28 and HP-48 series. I've been using those for 20+ years so I'm quite familiar with how useful this feature can be.
I can't comment about the circuit design community, but you're trying to extrapolate from a single specialist application to a general purpose programming language used by people of many, many varied levels of expertise, of competence, with many different needs.

It makes a lot of sense for applications to allow SI prefixes as suffixes within a restricted range. For example, the dd application allows the user to specify the amount of data to copy using either bytes or blocks, with optional suffixes:

    BLOCKS and BYTES may be followed by the following multiplicative
    suffixes: xM M, c 1, w 2, b 512, kB 1000, K 1024, MB 1000*1000,
    M 1024*1024, GB 1000*1000*1000, G 1024*1024*1024, and so on for
    T, P, E, Z, Y.

(Quoting from the man page.)

That makes excellent sense for a specialist application where numeric quantities always mean the same thing, or in this case, one of two things.

As purely multiplicative suffixes, that even makes sense for Python: earlier I said that it was a good idea to add a simple module defining SI and IEC multiplicative constants in the std lib so that we could do x = 42*M or similar. But that's a far cry from allowing and ignoring units.
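Such a module of multiplicative constants needs no language support at all. The sketch below shows the idea; the constant names (k, M, G, T and Ki, Mi, Gi, Ti) are assumptions chosen to mirror the SI and IEC prefixes, not an existing stdlib API.

```python
# Hypothetical constants module: SI (decimal) multipliers.
k, M, G, T = 10**3, 10**6, 10**9, 10**12

# IEC (binary) multipliers.
Ki, Mi, Gi, Ti = 2**10, 2**20, 2**30, 2**40

x = 42*M
print(x)          # 42000000
print(4*Gi)       # 4294967296
print(Mi - M)     # 48576, the gap between the two "mega" conventions
```

Because the constants are plain integers, the result is an ordinary number and nothing is silently ignored.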
I understand your point and the benefit of dimensional analysis. But the problem is, as users of specialised applications we may be used to doing direct arithmetic on numeric literal values, with or without attached units:

    23mA + 75MHz  # error is visible
    23 + 75  # error is hidden

but as *programmers* we rarely do that. Generally speaking, it is rare to be doing arithmetic on literals where we might have the opportunity to attach a unit. We are doing arithmetic on *variables* that have come from elsewhere. Reading the source code doesn't show us something that might be improved by adding a unit:

    # we hardly ever see this
    23 + 75
    # we almost always see something like this
    input + argument

At best, we can choose descriptive variable names that hint what the correct dimensions should be:

    weight_of_contents + weight_of_container

The argument that units would make it easier for the programmer to spot errors is, I think, invalid, because the programmer will hardly ever get to see the units.

[...]
This is not difficult, and you exaggerate the amount of effort required. To my sorrow, I'm not actually familiar with any of the Python libraries for this, so I'll give an example using the HP-48GX RPL language.

Suppose I have a value which I am expecting to be current in amperes. (On the HP calculator, it will be passed to me on the stack, but the equivalent in Python will be an argument passed to a function.) For simplicity, let's assume that if it is a raw number, I will trust that the user knows what they are doing and just convert it to a unit object with dimension "ampere", otherwise I expect some sort of unit object which is dimensionally compatible:

    1_A CONV

is the RPL program to perform this conversion on the top of the stack, and raise an error if the dimensions are incompatible. Converting to a more familiar Python-like API, I would expect something like:

    current = Unit("A").convert(current)

or possibly:

    current = Unit("A", current)

take your pick. That's not a "tremendous" amount of effort, it is comparable to ensuring that I'm using (say) floats in the first place:

    if not isinstance(current, float):
        raise TypeError
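To show roughly how little machinery the Unit("A").convert(current) idiom needs, here is a toy sketch. The Unit and Quantity classes and the three-entry unit table are entirely hypothetical; they do not correspond to any existing package's API.

```python
# Hypothetical unit table: name -> (dimension, factor relative to the base unit).
_UNITS = {
    "A": ("current", 1.0),
    "mA": ("current", 1e-3),
    "V": ("potential", 1.0),
}


class Quantity:
    """A number tagged with a unit name."""

    def __init__(self, value, unit):
        self.value = float(value)
        self.unit = unit

    def __repr__(self):
        return f"Quantity({self.value!r}, {self.unit!r})"


class Unit:
    def __init__(self, name):
        self.name = name
        self.dimension, self.factor = _UNITS[name]

    def convert(self, value):
        # Raw numbers are trusted and simply tagged with this unit.
        if not isinstance(value, Quantity):
            return Quantity(value, self.name)
        dimension, factor = _UNITS[value.unit]
        # Quantities must be dimensionally compatible, otherwise it is an error.
        if dimension != self.dimension:
            raise TypeError(f"cannot convert {value.unit} to {self.name}")
        return Quantity(value.value * factor / self.factor, self.name)


current = Unit("A").convert(2.5)                   # raw number, trusted
scaled = Unit("A").convert(Quantity(23, "mA"))     # compatible, rescaled to amperes
```

Passing Quantity(5, "V") to Unit("A").convert() raises TypeError, which plays the role of the RPL error for incompatible dimensions.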
That's because it is wrong.
but the it may be that the resistance is simply being suppressed because it is unity.
Your specialist experience in the area of circuit design is misleading you. There's effectively only one unit of resistance, the ohm, although my "units" program also lists:

    R_K          25812.807 ohm
    abohm        abvolt / abamp
    intohm       1.000495 ohm
    kilohm       kiloohm
    megohm       megaohm
    microhm      microohm
    ohm          V/A
    siemensunit  0.9534 ohm
    statohm      statvolt / statamp

So even here with resistance "unity" is ambiguous. Do you mean one intohm, one microhm, one statohm or something else?

I'll grant you that in the world of circuit design perhaps it could only mean the SI ohm. But you're not in the world of circuit design any more, you are dealing with a programming language that will be used by people for many, many different purposes, for whom "unity" might mean (for example):

    1 foot per second
    1 foot per minute
    1 metre per second
    1 kilometre per hour
    1 mile per hour
    1 lightspeed
    1 knot
    1 mach

Specialist applications might be able to take shortcuts in dimensional analysis when "everybody knows" what the suppressed units must be. General purpose programming languages *cannot*. It is better NOT to offer the illusion of dimensional analysis than to mislead the user into thinking they are covered when they are not.

Better to let them use a dedicated units package, not build a half-baked bug magnet into the language syntax.

-- Steve

On 26 August 2016 at 16:54, Steven D'Aprano <steve@pearwood.info> wrote:
This is based on a narrowly construed definition of "programming" though. It makes more sense in the context of interactive data analysis and similar activities, where Python is being used as a scripting language, rather than as a full-fledged applications programming language.

So let's consider the following hypothetical:

1. We add SI multiplier support to the numeric literal syntax, with "E" unilaterally replaced with "X" (for both input and output) to avoid the ambiguity with exponential notation
2. One or more domain specific libraries adopt Ivan Levkivskyi's suggestion of using PEP 526 to declare units

Then Ken's example becomes:

    from circuit_units import A, V, Ohm, seconds

    delta: A

    for delta in [-500n, 0, 500n]:
        input: A = 2.75u + delta
        wait(seconds(1u))
        expected: V = Ohm(100k)*input
        tolerance: V = 2.2m
        fails = check_output(expected, tolerance)
        print('%s: I(in)=%rA, measured V(out)=%rV, expected V(out)=%rV, diff=%rV.' % (
            'FAIL' if fails else 'pass',
            input, get_output(), expected, get_output() - expected
        ))

The only new pieces there beyond PEP 526 itself are the SI unit multiplier on literals, and the type annotations declared in the circuit_units module.

To actually get a typechecker to be happy with the code, Ohm.__mul__ would need to be overloaded as returning a V result when the RHS is categorised as A.

An environment focused on circuit simulation could pre-import some of those symbols so users didn't need to do it explicitly.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
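The Ohm.__mul__ overload mentioned above could be declared along these lines. This is a sketch under assumptions: the circuit_units module does not exist, the classes are modelled here as float subclasses, and whether a particular type checker accepts this override of float.__mul__ has not been verified.

```python
from typing import overload


class A(float):
    """Current in amperes (hypothetical circuit_units type)."""


class V(float):
    """Potential in volts (hypothetical circuit_units type)."""


class Ohm(float):
    """Resistance in ohms (hypothetical circuit_units type)."""

    @overload
    def __mul__(self, other: A) -> V: ...
    @overload
    def __mul__(self, other: float) -> float: ...

    def __mul__(self, other):
        product = float(self) * float(other)
        # Ohm * A is a voltage; anything else stays a plain float.
        return V(product) if isinstance(other, A) else product


expected = Ohm(100e3) * A(2.75e-6)
print(type(expected).__name__)   # V
```

The overload declarations carry the information for the checker; the single real implementation below them is what runs.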

On Fri, Aug 26, 2016 at 06:31:30PM +1000, Nick Coghlan wrote:
I don't think that is right. I'd put it the other way: as important as interactive use is, it is a subset of general programming, not a superset of it.

I *love* using Python as a calculator (which is why this thread is inspiring me to investigate the unit conversion/tracking packages already available). But even when using Python as a calculator, oh I'm sorry, "for interactive data analysis" *wink*, there are going to be plenty of opportunities for me to write:

    x = ...
    y = ...
    # much later on
    z = x + y

so that I don't necessarily see the units directly there on the screen by the time I actually go to use them. Likewise if I'm reading my values from a data file. IPython even generalises the magic variable _ into a potentially unlimited series of magic variables _1 _2 _3 etc, and it is normal to be using values taken from a variable rather than as a literal.

The point is that Ken's examples of calculations on literals are misleading, because only a fraction of calculations involve literals. And likely a small fraction, if you consider all the people using Python for scripting, server-side programming, application programming, etc rather than just the subset of them using it for interactive use.

By the way, here's another programming language designed for interactive use as a super-calculator. Frink does dimensional analysis and unit conversions:

http://frinklang.org/#SampleCalculations

If we're serious about introducing dimension and unit handling to Python, we should look at:

- existing Python libraries;
- HP calculators;
- Frink;

at the very least.

-- Steve

On 2016-08-26 07:54, Steven D'Aprano wrote: [snip]
If you're going to have units, you might also include (for want of a name) "colours" (or "flavours"), which behave slightly differently with respect to arithmetic operators.

For example:

When you add 2 values, they must have the same units and same colours. The result will have the same units and colours.

    # "amp" is a unit, "current" is a colour.
    # The result is a current measured in amps.
    1 amp current + 2 amp current == 3 amp current

When you divide 2 values, they could have the same or different units, but must have the same colours. The result will have a combination of the units (some might also cancel out), but will have the same colours.

    # "amp" is a unit, "current" is a colour.
    # The result is a ratio of currents.
    6 amp current / 2 amp current == 3 current
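The two rules above can be sketched with a small value class. The Coloured class and its string-based unit handling are assumptions made for illustration; a real implementation would need proper unit algebra rather than string concatenation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coloured:
    value: float
    unit: str      # empty string when the units have cancelled out
    colour: str

    def __add__(self, other):
        # Addition: same unit and same colour required, both preserved.
        if (self.unit, self.colour) != (other.unit, other.colour):
            raise TypeError("units and colours must match for addition")
        return Coloured(self.value + other.value, self.unit, self.colour)

    def __truediv__(self, other):
        # Division: colours must match; equal units cancel, others combine.
        if self.colour != other.colour:
            raise TypeError("colours must match for division")
        unit = "" if self.unit == other.unit else f"{self.unit}/{other.unit}"
        return Coloured(self.value / other.value, unit, self.colour)


total = Coloured(1, "amp", "current") + Coloured(2, "amp", "current")
ratio = Coloured(6, "amp", "current") / Coloured(2, "amp", "current")
print(total)   # Coloured(value=3, unit='amp', colour='current')
print(ratio)   # Coloured(value=3.0, unit='', colour='current')
```

The ratio keeps its colour even though its unit is gone, which is the behaviour that distinguishes a colour from a unit in this scheme.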

MRAB writes:
I don't understand why a ratio would retain color. What's the application? For example, in circuit analysis, if "current" is a color, I would expect "potential" and "resistance" to be colors, too. But from I1*R1 = V = I2*R2, we have I1/I2 = R2/R1 in a parallel circuit, so unitless ratios of color current become unitless ratios of color resistance. Furthermore that ratio might arise from physical phenomena such as temperature-varying resistance, and be propagated to another physical phenomenon such as the deflection of a meter's needle. What might color tell us about these computations?

(Note: I'm pretty sure that MRAB didn't choose "current" as a parallel to "resistance". Nevertheless, the possibility of propagation of values across color boundaries seems to be necessary and I don't see how color is going to be used.)

My own take is that units specify possible operations, ie, they are nothing more than a partial specification of a type. Rather than speculate on additional attributes that might be useful in conjunction with units, we should see if there are convenient ways to describe the constraints that units produce on behavior of types. Ie, by creating types VoltType(UnitType), AmpereType(UnitType), and OhmType(UnitType), and specifying the "equation" VoltType = AmpereType * OhmType on those types, the __mul__ and __div__ operators would be modified to implement the expected operations as a function of UnitType. That is, there would be a helper impose_type_expression_equivalence() that would take the string "VoltType = AmpereType * OhmType" and manipulate the derived type methods appropriately.

One aspect of this approach is that we can conveniently derive concrete units such as

    V = VoltType(1)         # Some users might prefer the unit
                            # Volt, variable V, thus "VoltType"
    mA = AmpereType(1e-3)   # SI scale prefix!
    kΩ = OhmType(1e3)       # Unicode! <wink/>

They don't serve the OP's request (he *doesn't* want type checking, he *does* want syntax), but I prefer these anyway:

    10*V == (2*mA)*(5*kΩ)

Developers of types for circuit analysis derived from UnitType might prefer different names that reflect the type being measured rather than the unit, eg Current or CurrentType instead of AmpereType.

There is no problem with units like Joule (in syntax-based proposals, it collides with the imaginary unit) and Kelvin (in syntax-based proposals, it collides with a non-SI prefix that nevertheless is so commonly used that both the OP and Steven d'Aprano say should be recognized as "kilo").

Another advantage (IMHO) is that "reified" units can be treated as equivalent to "real" (aka standard or base) units. What I mean is that New York is not approximately 4 Mm from Los Angeles (that would give most people a WTF feeling), it's about 4000 km. While I realize programmers will be able to do that conversion, this flexibility allows people to use units that feel natural to them. If you want to use miles and feet, you can define them as

    ft = LengthType(12/39.37)   # base unit is meter per SI
    mi = 5280*ft

very conveniently.

Using this approach, Ken's example that Nick rewrote to use type hinting would look like this:

    from circuit_units import uA, V, mV, kOhm, u_second, VoltType

    us = u_second                 # Use project convention.
                                  # u_second = SecondType(1e-6)
                                  # A gratuitous style change.
    expected: VoltType            # With so few declarations,
                                  # I prefer "predeclaration".
                                  # There is no millivolt type,
                                  # so derived units are all
                                  # consistent with this variable.
                                  # A gratuitous style change.
    for delta in [-0.5*uA, 0*uA, 0.5*uA]:   # uA = AmpereType(1e-6)
                                            # I dislike [-0.5, 0, 0.5]*uA,
                                            # but it could be implemented.
        input = 2.75*uA + delta
        wait(1*us)                          # The "1*" is redundant.
        expected = (100*kOhm)*input         # kOhm = OhmType(1e3)
        tolerance = 2.2*mV                  # mV = VoltType(1e-3)
        fails = check_output(expected, tolerance)
        print('%s: I(in)=%rA, measured V(out)=%rV, expected V(out)=%rV, diff=%rV.' % (
            'FAIL' if fails else 'pass',
            input, get_output(), expected, get_output() - expected
        ))

Hmm: need for only *one* variable declaration. This is very much to my personal taste, YMMV.

The main question is whether this device could support efficient computation. All of these units are objects with math dunders that have to dispatch on type (or else they need to produce "expression Types" such as Ampere_Ohm, but I don't think type checkers would automatically know that Volt = Ampere_Ohm). This clearly can't compare to the efficiency of NewType(float). But AIUI, one NewType(float) can't mix with another, which is not the behavior we want here. We could do

    VoltType = NewType('Volt', float)
    AmpereType = NewType('Ampere', float)
    WattType = NewType('Watt', float)

    def power(potential, current):
        return WattType(float(potential)*float(current))

but this is not very readable, and error-prone IMO. It's also less efficient than the "zero cost" that NewType promises for types like User_ID (https://www.python.org/dev/peps/pep-0484/#newtype-helper-function).

I suppose it would be feasible (though ugly) to provide two implementations of VoltType, one as a "real" class as I propose above, and the other simply VoltType = float. The former would be used for type checking, the latter for production computations. Perhaps such a process could be automated in mypy?

A final advantage is that I suppose that it should be possible to implement "color" as MRAB proposes through the Python type system. We don't have to define it now, but can take advantage of the benefits of "units as types" approach immediately.

I don't know how to implement impose_type_expression_equivalence(), yet, so this is not a proposal for the stdlib. But the necessary definitions by hand are straightforward, though tedious. Individual implementations of units can be done *now*, without change to Python, AFAICS. Computational efficiency is an issue, but one that doesn't matter to educational applications, for example.

Steve
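As a rough illustration of the "definitions by hand" route, the sketch below wires the equation VoltType = AmpereType * OhmType into a lookup table instead of a generated helper. The _PRODUCTS table and the choice to subclass float are assumptions; only multiplication is handled, so addition and division fall back to plain float behaviour, and kOhm stands in for kΩ to keep the example ASCII.

```python
class UnitType(float):
    """Base class: a float whose subclass records the physical quantity."""

    def __mul__(self, other):
        value = float(self) * float(other)
        if not isinstance(other, UnitType):
            return type(self)(value)      # scaling by a plain number
        try:
            return _PRODUCTS[type(self), type(other)](value)
        except KeyError:
            raise TypeError(
                f"no rule for {type(self).__name__} * {type(other).__name__}"
            ) from None

    __rmul__ = __mul__                    # so that 2*mA works as well as mA*2


class VoltType(UnitType): pass
class AmpereType(UnitType): pass
class OhmType(UnitType): pass


# The "equation" VoltType = AmpereType * OhmType, written out by hand.
_PRODUCTS = {
    (AmpereType, OhmType): VoltType,
    (OhmType, AmpereType): VoltType,
}

V = VoltType(1)
mA = AmpereType(1e-3)
kOhm = OhmType(1e3)

result = (2*mA)*(5*kOhm)
print(type(result).__name__)   # VoltType
```

Every multiplication pays for a dictionary lookup and an object construction, which is the efficiency cost discussed above; a product with no rule in the table, such as V*V, raises TypeError.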

I've been following this discussion on and off for a while, but still fail to see how SI units, factors or the like are a use case which is general enough to warrant changing the language.

There are packages available on PyPI for dealing with this in a similar way we deal with decimal literals in Python:

C extension:
    https://pypi.python.org/pypi/cfunits/
    http://pythonhosted.org/cfunits/cfunits.Units.html
    (interfaces to the udunits-2 lib: http://www.unidata.ucar.edu/software/udunits/udunits-2.2.20/doc/udunits/udun...)

Pure python:
    https://pypi.python.org/pypi/units/

IMHO, a literal notation like "2 m" is more likely related to a missing operator which should be flagged as SyntaxError than the declaration of an integer with associated unit. By keeping such analysis to string to object conversion tools/functions you make the intent explicit, which allows for better error reporting.

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Aug 27 2016)
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/
participants (10)
- Chris Angelico
- Ivan Levkivskyi
- Ken Kundert
- M.-A. Lemburg
- MRAB
- Nick Coghlan
- Paul Moore
- Random832
- Stephen J. Turnbull
- Steven D'Aprano