On Tue, May 11, 2010 at 8:11 AM, Kevin Jacobs <jacobs@bioinformed.com> <bioinformed@gmail.com> wrote:

On Tue, May 11, 2010 at 4:14 AM, Pauli Virtanen <pav@iki.fi> wrote:

A third option would be just to silently fix the bug. In any case the change should be mentioned noticeably in the release notes.

I see this as two bugs: the Lomax distribution was named incorrectly and the Pareto distribution was incorrect or confusingly labeled. Both should be fixed and clearly documented. Unlike cases of changing tastes and preferences, it seems unduly complicated and confusing to persevere with backward compatibility shims. The next release is NumPy 2.0, which will have other known and well advertised API and ABI incompatibilities.

Just my 2e-10 cents,
-Kevin

I would also have considered it a bug fix, except that there might be users who use the correction (+1) as a workaround. In that case, just changing the behavior without raising an exception for the current usage will introduce hard-to-find bugs. (It's difficult to see whether the random numbers are correct or as expected without proper testing.)

For example, we use the workaround in the docstring of http://docs.scipy.org/numpy/docs/numpy.random.mtrand.RandomState.power/ and actually, reading the numpy.random.pareto docstring again more carefully, the example there does the correction also:

    Draw samples from the distribution:

    a, m = 3., 1.  # shape and mode
    s = np.random.pareto(a, 1000) + m
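To make the workaround concrete, here is a rough sketch of what the correction does. np.random.pareto returns Lomax (Pareto II) variates with support starting at 0, and adding 1 shifts them to a classical Pareto with scale 1. The seed, the sample size and the scale m = 2 are my own illustrative choices; note that the docstring's "+ m" only matches the classical Pareto when m = 1, and for another scale the shifted sample has to be multiplied by m.

```python
import numpy as np

rng = np.random.RandomState(12345)  # arbitrary seed, for repeatability only
a = 3.0        # shape
n = 200000     # sample size (illustrative)

lomax = rng.pareto(a, n)        # Lomax / Pareto II: support starts at 0
classical = lomax + 1.0         # classical Pareto with scale 1

m = 2.0                         # hypothetical scale other than 1
scaled = (lomax + 1.0) * m      # classical Pareto with scale m

# For a > 1 the Lomax mean is 1/(a-1); the classical Pareto mean is m*a/(a-1).
print(lomax.min(), lomax.mean())
print(classical.min(), classical.mean())
print(scaled.min(), scaled.mean())
```

Anyone who already applies the +1 would get a doubly shifted sample if pareto were silently changed to return the classical distribution, and a check on the minimum or the mean like the one above is about the only way to notice.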

But it's very confusing. There is also a relationship between Pareto/Lomax and the generalized Pareto distribution (GPD), but I'm not sure yet that my algebra is correct. (And I have misplaced the graphs and tables with the relationships between the different distributions.)

To minimize backwards compatibility problems, we could attach a *big* warning text to pareto ("use at your own risk") and create new random variates, as Pauli proposed:

pareto1 - classical Pareto
pareto2 or lomax - with random variates the same as the current pareto

Both could then get clear, unambiguous descriptions, and a note on using the uniform distribution for the generation of random numbers.

Python 2.5 random.py uses the half-open uniform distribution to avoid division by zero; I don't know how numpy.random handles boundary values:

    def paretovariate(self, alpha):
        """Pareto distribution.  alpha is the shape parameter."""
        # Jain, pg. 495
        u = 1.0 - self.random()
        return 1.0 / pow(u, 1.0/alpha)

Josef
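As a minimal sketch of how the two proposed variates could be separated, here is a pure-Python version built on the same inverse transform as paretovariate. The names pareto1 and lomax follow the proposal above, but the signatures, the scale argument and the rng parameter are assumptions for illustration, not an existing numpy.random API.

```python
import random


def pareto1(alpha, scale=1.0, rng=random):
    """Classical Pareto (type I) variate with support [scale, inf)."""
    # rng.random() lies in [0, 1), so u lies in (0, 1] and is never zero.
    u = 1.0 - rng.random()
    return scale / u ** (1.0 / alpha)


def lomax(alpha, scale=1.0, rng=random):
    """Lomax (Pareto type II) variate with support [0, inf)."""
    # The classical Pareto shifted so that its support starts at zero.
    return pareto1(alpha, scale, rng) - scale


gen = random.Random(0)  # arbitrary seed, for repeatability only
alpha = 3.0
p1_draws = [pareto1(alpha, rng=gen) for _ in range(200000)]
lomax_draws = [lomax(alpha, rng=gen) for _ in range(200000)]

print(min(p1_draws), sum(p1_draws) / len(p1_draws))
print(min(lomax_draws), sum(lomax_draws) / len(lomax_draws))
```

Using 1.0 - rng.random() keeps u strictly positive, which is the half-open trick that random.py relies on to avoid dividing by zero.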
