Re: [Numpy-discussion] pareto docstring

11 May 2010

On Tue, May 11, 2010 at 8:11 AM, Kevin Jacobs wrote:
...
On Tue, May 11, 2010 at 4:14 AM, Pauli Virtanen  wrote:
...
A third option would be just to silently fix the bug. In any case the
change should be mentioned noticeably in the release notes.
I see this as two bugs: the Lomax distribution was named incorrectly and the
Pareto distribution was incorrect or confusingly labeled.  Both should be
fixed and clearly documented.  Unlike cases of changing tastes and
preferences, it seems unduly complicated and confusing to perseverate with
backward compatibility shims.  The next release is NumPy 2.0, which will
have other known and well advertised API and ABI incompatibilities.
Just my 2e-10 cents,
-Kevin
I would also have considered it a bug fix, except that there might be users
who use the correction (+1) as a workaround. In that case, just changing the
behavior without raising an exception for the current usage will introduce
hard-to-find bugs. (It's difficult to see whether the random numbers are correct
or as expected without proper testing.)
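
As a rough illustration of the kind of testing meant here, the sketch below compares samples from np.random.pareto against two candidate CDFs using a hand-rolled Kolmogorov-Smirnov distance. The helper name ks_distance, the seed, and the sample size are assumptions made for this example, not anything from NumPy's test suite.

```python
import numpy as np

def ks_distance(samples, cdf):
    """Largest gap between the empirical CDF of `samples` and `cdf`."""
    x = np.sort(samples)
    n = x.size
    f = cdf(x)
    upper = np.arange(1, n + 1) / n - f   # ECDF just after each point
    lower = f - np.arange(0, n) / n       # ECDF just before each point
    return max(upper.max(), lower.max())

a = 3.0                                   # shape parameter
rs = np.random.RandomState(12345)         # fixed seed, illustration only
s = rs.pareto(a, 10000)

# Lomax (Pareto II) with scale 1, support y >= 0
lomax_cdf = lambda y: 1.0 - (1.0 + y) ** (-a)
# Classical Pareto with scale 1, support x >= 1 (CDF is 0 below 1)
pareto_cdf = lambda x: 1.0 - np.maximum(x, 1.0) ** (-a)

d_raw_vs_lomax = ks_distance(s, lomax_cdf)
d_raw_vs_pareto = ks_distance(s, pareto_cdf)
d_shifted_vs_pareto = ks_distance(s + 1.0, pareto_cdf)
```

Under these assumptions the raw samples should sit close to the Lomax CDF and far from the classical Pareto CDF, while the +1 workaround should bring them close to the classical Pareto CDF. A silent change in behavior would flip that picture for anyone already adding 1.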

For example, we use the work-around in the docstring of
http://docs.scipy.org/numpy/docs/numpy.random.mtrand.RandomState.power/

and actually, reading the numpy.random.pareto docstring again more
carefully, the example there also applies the correction::

Draw samples from the distribution:
...
...
...
    a, m = 3., 1.  # shape and mode
    s = np.random.pareto(a, 1000) + m
But it's very confusing. There is also a relationship between
Pareto/Lomax and the GPD (generalized Pareto distribution), but I'm
not yet sure my algebra is correct (and I have misplaced the graphs
and tables with the relationships between different distributions).
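
For reference, one way to write out the algebra in question is sketched below. The parameterizations (shape a, scale m or lambda, and the GPD written with shape xi, location mu, scale sigma) are the usual textbook ones and are an assumption here, since the thread does not fix a notation.

```latex
% Assumed parameterizations (not stated in the thread):
%   classical Pareto: shape a > 0, scale m > 0
%   Lomax (Pareto II): shape a > 0, scale \lambda > 0
%   GPD: shape \xi > 0, location \mu, scale \sigma > 0
\begin{align*}
F_{\mathrm{Pareto}}(x) &= 1 - \left(\frac{m}{x}\right)^{a}, & x &\ge m \\
F_{\mathrm{Lomax}}(y)  &= 1 - \left(1 + \frac{y}{\lambda}\right)^{-a}, & y &\ge 0 \\
F_{\mathrm{GPD}}(z)    &= 1 - \left(1 + \frac{\xi\,(z - \mu)}{\sigma}\right)^{-1/\xi}, & z &\ge \mu
\end{align*}

% Shifting a classical Pareto variate X by its scale m gives a Lomax variate:
\begin{align*}
P(X - m \le y) &= 1 - \left(\frac{m}{y + m}\right)^{a}
                = 1 - \left(1 + \frac{y}{m}\right)^{-a},
\end{align*}
% i.e. X - m is Lomax with shape a and scale \lambda = m.

% Matching terms against the GPD:
%   Lomax(a, \lambda)  corresponds to  \xi = 1/a, \mu = 0, \sigma = \lambda / a
%   Pareto(a, m)       corresponds to  \xi = 1/a, \mu = m, \sigma = m / a
```

With lambda = 1 the Lomax CDF reduces to 1 - (1 + y)^(-a), which is the distribution the +1 workaround starts from.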

To minimize backwards compatibility problems, we could attach a *big*
warning text to pareto ("use at your own risk")
and create new random variates, as Pauli proposed:

pareto1 - classical pareto
pareto2 or lomax - with random variates the same as current pareto

both could then get clear,  unambiguous descriptions.
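
A minimal sketch of what the two proposed variates might look like, built on the existing generator. The names pareto1 and lomax come from the proposal above, but the signatures, the scale argument m, and the explicit RandomState argument are hypothetical choices for this example, not an existing NumPy API.

```python
import numpy as np

def lomax(a, size=None, rs=np.random):
    """Lomax (Pareto II) variates with shape a and scale 1.

    Hypothetical wrapper: returns exactly what the current pareto returns.
    """
    return rs.pareto(a, size)

def pareto1(a, m=1.0, size=None, rs=np.random):
    """Classical Pareto variates with shape a and scale m (support x >= m).

    Hypothetical wrapper: shift the scale-1 Lomax variate by 1, then
    multiply by the scale m.
    """
    return m * (rs.pareto(a, size) + 1.0)

# Same seed for both calls, so the two outputs are directly comparable.
y = lomax(3.0, 1000, np.random.RandomState(0))
x = pareto1(3.0, 2.0, 1000, np.random.RandomState(0))
```

Note that under the standard definitions the scale enters multiplicatively, so adding m to the current output matches the classical Pareto only when m = 1.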

and a note on how the uniform distribution is used to generate the
random numbers.
Python 2.5's random.py uses the half-open uniform distribution to
avoid division by zero; I don't know how numpy.random handles
boundary values.

    def paretovariate(self, alpha):
        """Pareto distribution.  alpha is the shape parameter."""
        # Jain, pg. 495

        u = 1.0 - self.random()
        return 1.0 / pow(u, 1.0/alpha)
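
To make the boundary issue concrete, here is a small standard-library sketch. The function names and the stub generator that always returns 0.0 are made up for illustration; the point is only that random() lies in [0, 1), so 1 - random() lies in (0, 1] and can never be zero.

```python
import random

class AlwaysZero:
    """Stub generator (illustration only) that returns the boundary value 0.0."""
    def random(self):
        return 0.0

def pareto_half_open(alpha, rng):
    # random() is in [0, 1), so u is in (0, 1] and pow(u, ...) is never 0
    u = 1.0 - rng.random()
    return 1.0 / pow(u, 1.0 / alpha)

def pareto_naive(alpha, rng):
    # Uses random() directly: u can be exactly 0.0, making the divisor 0
    u = rng.random()
    return 1.0 / pow(u, 1.0 / alpha)

# At the boundary the half-open version returns the smallest value, 1.0 ...
boundary_value = pareto_half_open(3.0, AlwaysZero())

# ... while the naive version divides by zero.
try:
    pareto_naive(3.0, AlwaysZero())
    naive_failed = False
except ZeroDivisionError:
    naive_failed = True

# With a real generator every variate is >= 1.
rng = random.Random(0)
draws = [pareto_half_open(3.0, rng) for _ in range(1000)]
```

Whether a vectorized generator needs the same guard depends on which end of the unit interval its uniform source can reach, which is the open question raised above.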

Josef

josef.pktd＠gmail.com