On Thu, Nov 2, 2017 at 12:46 PM, Ryan May <rmay31@gmail.com> wrote:
On Thu, Nov 2, 2017 at 6:56 AM, <josef.pktd@gmail.com> wrote:
On Thu, Nov 2, 2017 at 8:46 AM, <josef.pktd@gmail.com> wrote:
On Wed, Nov 1, 2017 at 6:55 PM, Nathan Goldbaum <nathan12343@gmail.com> wrote:
I think the biggest issues could be resolved if __array_concatenate__ were finished. Unfortunately I don't feel like I can take that on right now.

See Ryan May's talk at scipy about using an ndarray subclass for units and the issues he's run into:


Interesting talk, but I don't see how general library code should know what units the output has.
For example, if the units are some flow per unit of time and we average, sum, or integrate over time, then what are the new units (e.g. pandas time aggregation)?
What are the units of covariance or correlation between two variables with the same units, and what are they between variables with different units?
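The covariance/correlation question can be answered by tracking units by hand. This is a hypothetical sketch (plain floats, units only in comments, not pint): covariance averages products of deviations, so it carries the product of the input units, while correlation divides by both standard deviations, so the units cancel.

```python
from statistics import mean, pstdev

# x in meters, y in seconds (values only; units tracked in comments)
x = [1.0, 2.0, 3.0, 4.0]   # meters
y = [2.0, 1.0, 4.0, 3.0]   # seconds

mx, my = mean(x), mean(y)
# cov(x, y) averages products of deviations, so its unit is the
# product of the input units: meter * second
cov = mean((xi - mx) * (yi - my) for xi, yi in zip(x, y))

# correlation divides by the standard deviations (meter and second),
# so the units cancel and the result is dimensionless
corr = cov / (pstdev(x) * pstdev(y))
print(cov, corr)
```

A units-aware array type would have to encode exactly this bookkeeping for every reduction a library might apply.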

How do you concatenate and operate on arrays with different units?

Interpolation or prediction would work using the existing units.

partially related:
statsmodels uses a wrapper for pandas Series and DataFrames and tries to preserve the index when possible, making up a new DataFrame or Series if the existing index doesn't apply.
E.g. predicted values and residuals are in terms of the originally provided index, and could also get the original units assigned. That would also be possible for prediction confidence intervals. But for the rest, see above.

using pint

>>> x
<Quantity([0 1 2 3 4], 'meter')>
>>> x / x
<Quantity([ nan   1.   1.   1.   1.], 'dimensionless')>

>>> x / (1 + x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\...\python-3.4.4.amd64\lib\site-packages\pint\quantity.py", line 669, in __add__
    return self._add_sub(other, operator.add)
  File "C:\...\python-3.4.4.amd64\lib\site-packages\pint\quantity.py", line 580, in _add_sub
    raise DimensionalityError(self._units, 'dimensionless')
pint.errors.DimensionalityError: Cannot convert from 'meter' to 'dimensionless'

I'm not sure why you have a problem with that result. You tried to take a number in meters and add a dimensionless value to it--that's not a defined operation. That's like saying: "I have a distance of 12 meters and added 1 to it." 1 what? 1 meter? Great. 1 centimeter? I need to convert, but I can do that operation. 1 second? That makes no sense.

If you add units to the 1 then it's a defined operation:

>>> ureg = pint.UnitRegistry()
>>> x / (1 * ureg.meters + x)
<Quantity([ 0.          0.5         0.66666667  0.75        0.8       ], 'dimensionless')>

Well, the Taylor series for exp (around a=0) is:

exp(x) = 1 + x + x**2 / 2 + x**3 / 6 + ...

so for that to properly add up, x needs to be dimensionless. It should be noted, though, that I've *never* seen a formula, theoretically derived or empirically fit, require directly taking exp(x) where x is a physical quantity with units. Instead, you have:

f = a * exp(kx)

Properly calculated values for a and k will have appropriate units attached to them that allow the calculation to proceed without error.
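The Taylor-series argument above can be made concrete with a toy unit-tracking class (a hypothetical minimal Quantity, not pint): exp(x) implicitly adds x, x**2, x**3, ..., which only have compatible dimensions when x is dimensionless, whereas k * x with k carrying the inverse unit is fine.

```python
import math

class Quantity:
    """Toy quantity tracking only the exponent of [length]."""
    def __init__(self, value, length_dim=0):
        self.value = value
        self.length_dim = length_dim

    def __mul__(self, other):
        # multiplying quantities adds their dimension exponents
        return Quantity(self.value * other.value,
                        self.length_dim + other.length_dim)

    def __add__(self, other):
        # adding quantities requires identical dimensions
        if self.length_dim != other.length_dim:
            raise ValueError("cannot add [length]^%d to [length]^%d"
                             % (self.length_dim, other.length_dim))
        return Quantity(self.value + other.value, self.length_dim)

x = Quantity(2.0, length_dim=1)    # 2 meters
k = Quantity(0.5, length_dim=-1)   # 0.5 per meter

# k * x is dimensionless, so exp(k * x) is well defined:
kx = k * x
print(math.exp(kx.value))

# x + x*x mixes [length] and [length]^2, so the series for exp(x) fails:
try:
    x + x * x
except ValueError as e:
    print(e)
```

The same cancellation is what makes a * exp(k x) work: k absorbs the inverse units of x, and a carries the units of f.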

I was thinking of a simple logit model to predict whether it rains tomorrow.
The logit transformation for the probability is exp(k x) / (1 + exp(k x)), where k is a parameter to search for in the optimization.

x is a matrix with all predictors or explanatory variables which could all have different units.
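As a hedged sketch of that situation (plain floats, illustrative values, not statsmodels code): the linear predictor k.x must come out dimensionless, so when the columns of x carry units, each fitted coefficient implicitly carries the inverse unit, and that bookkeeping lives entirely outside the computation, which is exactly the problem for generic library code.

```python
import math

def logit_prob(k, x):
    # k and x are plain lists of floats; any unit bookkeeping happens
    # "outside" this function
    z = sum(ki * xi for ki, xi in zip(k, x))
    return math.exp(z) / (1 + math.exp(z))

x = [3.0, 120.0]    # e.g. meters of rainfall, seconds of some duration
k = [0.4, -0.01]    # implicitly carries 1/meter and 1/second
print(logit_prob(k, x))
```

A units-aware array would have to let the optimizer propose coefficients with the right inverse units, or strip units at this boundary.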

So it sounds to me like, if we drop asarray, then we just get exceptions or possibly strange results, or we have to introduce a unit that matches everything (like a joker card) for any constants we use.




Ryan May

NumPy-Discussion mailing list