[Numpy-discussion] Pull request review #3770: Trapezoidal distribution
Mark Szepieniec
mszepien at gmail.com
Sun Sep 22 08:17:39 EDT 2013
On Sun, Sep 22, 2013 at 1:24 PM, <josef.pktd at gmail.com> wrote:
> On Sat, Sep 21, 2013 at 1:55 PM, Jeremy Hetzel <jthetzel at gmail.com> wrote:
> > I've added a trapezoidal distribution to numpy.random for consideration,
> > pull request 3770:
> > https://github.com/numpy/numpy/pull/3770
> >
> > Similar to the triangular distribution, the trapezoidal distribution may be
> > used where the underlying distribution is not known, but some knowledge of
> > the limits and mode exists. The trapezoidal distribution generalizes the
> > triangular distribution by allowing the modal values to be expressed as a
> > range instead of a point estimate.
> >
> > The trapezoidal distribution implemented, known as the "generalized
> > trapezoidal distribution," has three additional parameters: growth, decay,
> > and boundary ratio. Adjusting these from the default values creates
> > trapezoidal-like distributions with non-linear behavior. Examples can be
> > seen in an R vignette (
> > http://cran.r-project.org/web/packages/trapezoid/vignettes/trapezoid.pdf),
> > as well as these papers by J.R. van Dorp and colleagues:
> >
> > 1) van Dorp, J. R. and Kotz, S. (2003) Generalized trapezoidal
> > distributions. Metrika. 58(1):85–97. Preprint available:
> >
> http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/Metrika2003VanDorp.pdf
> >
> > 2) van Dorp, J. R., Rambaud, S.C., Perez, J. G., and Pleguezuelo, R. H.
> > (2007) An elicitation procedure for the generalized trapezoidal distribution
> > with a uniform central stage. Decision Analysis Journal. 4:156–166. Preprint
> > available:
> > http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/DA2007.pdf
> >
> > The docstring for the proposed numpy.random.trapezoidal() is as follows:
> >
> > """
> > trapezoidal(left, mode1, mode2, right, size=None, m=2, n=2, alpha=1)
> >
> > Draw samples from the generalized trapezoidal distribution.
> >
> > The trapezoidal distribution is defined by minimum (``left``), lower
> > mode (``mode1``), upper mode (``mode2``), and maximum (``right``)
> > parameters. The generalized trapezoidal distribution adds three more
> > parameters: the growth rate (``m``), decay rate (``n``), and boundary
> > ratio (``alpha``). The generalized trapezoidal distribution
> > simplifies to the trapezoidal distribution when ``m = n = 2`` and
> > ``alpha = 1``. It further simplifies to a triangular distribution when
> > ``mode1 == mode2``.
> >
> > Parameters
> > ----------
> > left : scalar
> > Lower limit.
> > mode1 : scalar
> > The value where the first peak of the distribution occurs.
> > The value should fulfill the condition ``left <= mode1 <=
> > mode2``.
> > mode2 : scalar
> > The value where the first peak of the distribution occurs.
> > The value should fulfill the condition ``mode1 <= mode2 <=
> > right``.
> > right : scalar
> > Upper limit, should be larger than or equal to `mode2`.
> > size : int or tuple of ints, optional
> > Output shape. Default is None, in which case a single value is
> > returned.
> > m : scalar, optional
> > Growth parameter.
> > n : scalar, optional
> > Decay parameter.
> > alpha : scalar, optional
> > Boundary ratio parameter.
> >
> > Returns
> > -------
> > samples : ndarray or scalar
> > The returned samples all lie in the interval [left, right].
> >
> > Notes
> > -----
> > With ``left``, ``mode1``, ``mode2``, ``right``, ``m``, ``n``, and
> > ``alpha`` parametrized as
> > :math:`a, b, c, d, m, n, \\text{ and } \\alpha`, respectively,
> > the probability density function for the generalized trapezoidal
> > distribution is
> >
> > .. math::
> > f_{\\scriptscriptstyle X}(x \\mid \\Theta) =
> > \\mathcal{C}(\\Theta) \\times
> > \\begin{cases}
> > \\alpha \\left(\\frac{x - a}{b - a} \\right)^{m - 1},
> > & \\text{for } a \\leq x < b \\\\
> > (1 - \\alpha) \\left(\\frac{x - b}{c - b} \\right) + \\alpha,
> > & \\text{for } b \\leq x < c \\\\
> > \\left(\\frac{d - x}{d - c} \\right)^{n - 1},
> > & \\text{for } c \\leq x \\leq d
> > \\end{cases}
> >
> > with the normalizing constant :math:`\\mathcal{C}(\\Theta)` defined as
> >
> > .. math::
> > \\mathcal{C}(\\Theta) =
> > \\frac{2mn}
> > {2 \\alpha \\left(b - a\\right) n +
> > \\left(\\alpha + 1 \\right) \\left(c - b \\right)mn +
> > 2 \\left(d - c \\right)m}
> >
> > and where the parameter vector :math:`\\Theta = \\{a, b, c, d, m, n,
> > \\alpha \\}, \\text{ } a \\leq b \\leq c \\leq d, \\text{ and } m, n,
> > \\alpha >0`.
> >
> > Similar to the triangular distribution, the trapezoidal distribution
> > may be used where the underlying distribution is not known, but some
> > knowledge of the limits and mode exists. The trapezoidal distribution
> > generalizes the triangular distribution by allowing the modal values
> > to be expressed as a range instead of a point estimate. The growth,
> > decay, and boundary ratio parameters of the generalized trapezoidal
> > distribution further allow for non-linear behavior to be specified.
> >
> > References
> > ----------
> > .. [1] van Dorp, J. R. and Kotz, S. (2003) Generalized trapezoidal
> > distributions. Metrika. 58(1):85–97.
> > Preprint available:
> > http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/Metrika2003VanDorp.pdf
> > .. [2] van Dorp, J. R., Rambaud, S.C., Perez, J. G., and
> > Pleguezuelo, R. H. (2007)
> > An elicitation procedure for the generalized trapezoidal
> > distribution with a uniform central stage.
> > Decision Analysis Journal. 4:156–166.
> > Preprint available:
> > http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/DA2007.pdf
> >
> > Examples
> > --------
> > Draw values from the distribution and plot the histogram:
> >
> > >>> import matplotlib.pyplot as plt
> > >>> h = plt.hist(np.random.trapezoidal(0, 0.25, 0.75, 1, 100000),
> > ... bins=200, normed=True)
> > >>> plt.show()
> >
> > """
> >
> > I am unsure if NumPy encourages incorporation of new distributions into
> > numpy.random or instead into separate modules, but found the exercise to be
> > helpful regardless.
>
> I don't see a reason that numpy.random shouldn't get new
> distributions. It would also be useful to add the corresponding
> distribution to scipy.stats.
>
> I'm not familiar with the generalized trapezoidal distribution and
> don't know where it's used, neither have I ever used triangular.
>
> naming: n, m would indicate to me that they are integers, but they
> can be floats (>0)
> alpha, beta ?
>
>
> about the parameterization - no problem here
>
> Is there a standard version, e.g. left=0, right=1, mode1=?, ... ?
>
> In scipy.stats.distribution we are required to use a location, scale
> parameterization, where loc shifts the distribution and scale
> stretches it.
> Is there a standard parameterization for that?, for example
> left = loc = 0 (default) or left = loc / scale = 0
> right = scale = 1 (default)
> mode1_relative = mode1 / scale
> mode2_relative = mode2 / scale
> n, m unchanged no defaults
>
> just checked:
> your naming corresponds to triangular, and triang in scipy has the
> corresponding loc-scale parameterization.
>
>
> Josef
>
>
> >
> > Thanks,
> > Jeremy
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
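As a side note on the Notes section quoted above: the density and its
normalizing constant can be checked numerically. Below is a minimal sketch,
assuming the van Dorp and Kotz form in which the growth stage is
alpha * ((x - a) / (b - a))**(m - 1), and assuming strict ordering
a < b < c < d so that no stage has zero width. The function name
trapezoidal_pdf is hypothetical and is not part of the pull request.

```python
import numpy as np


def trapezoidal_pdf(x, a, b, c, d, m=2.0, n=2.0, alpha=1.0):
    """Sketch of the generalized trapezoidal density (hypothetical helper).

    Assumes a < b < c < d and m, n, alpha > 0.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))

    # Normalizing constant C(Theta) as written in the quoted docstring.
    norm = (2.0 * m * n) / (
        2.0 * alpha * (b - a) * n
        + (alpha + 1.0) * (c - b) * m * n
        + 2.0 * (d - c) * m
    )

    out = np.zeros_like(x)  # density is zero outside [a, d]
    growth = (x >= a) & (x < b)
    central = (x >= b) & (x < c)
    decay = (x >= c) & (x <= d)

    out[growth] = alpha * ((x[growth] - a) / (b - a)) ** (m - 1.0)
    out[central] = (1.0 - alpha) * (x[central] - b) / (c - b) + alpha
    out[decay] = ((d - x[decay]) / (d - c)) ** (n - 1.0)
    return norm * out


def integrate(y, x):
    """Plain trapezoid-rule integral of samples y over grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


grid = np.linspace(0.0, 1.0, 200001)
area = integrate(trapezoidal_pdf(grid, 0.0, 0.25, 0.75, 1.0), grid)
```

If the constant and the three stages agree with each other, area should come
out close to 1 for any valid choice of m, n and alpha.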
I think you need to s/first/second in the description of the mode2
parameter?
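
On the loc/scale question, one possible mapping is sketched below. This is
only an assumption on my part, not something from the pull request or from
scipy: it takes loc = left and scale = right - left, and measures the two
relative modes from left, analogous to how triang places its mode at
loc + c * scale. The function names and the names c1, c2 are hypothetical.

```python
def to_loc_scale(left, mode1, mode2, right):
    """Map (left, mode1, mode2, right) to (loc, scale, c1, c2).

    c1 and c2 are the modes expressed as fractions of the support width.
    """
    if not (left <= mode1 <= mode2 <= right) or left == right:
        raise ValueError("need left <= mode1 <= mode2 <= right and left < right")
    loc = left
    scale = right - left
    c1 = (mode1 - left) / scale
    c2 = (mode2 - left) / scale
    return loc, scale, c1, c2


def from_loc_scale(loc, scale, c1, c2):
    """Inverse mapping back to (left, mode1, mode2, right)."""
    return loc, loc + scale * c1, loc + scale * c2, loc + scale


params = to_loc_scale(2.0, 3.0, 5.0, 6.0)
roundtrip = from_loc_scale(*params)
```

With this choice the standard version has left=0 and right=1, and m, n and
alpha pass through unchanged as shape parameters.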