[Numpy-discussion] Pull request review #3770: Trapezoidal distribution

Mark Szepieniec mszepien at gmail.com
Sun Sep 22 08:17:39 EDT 2013


On Sun, Sep 22, 2013 at 1:24 PM, <josef.pktd at gmail.com> wrote:

> On Sat, Sep 21, 2013 at 1:55 PM, Jeremy Hetzel <jthetzel at gmail.com> wrote:
> > I've added a trapezoidal distribution to numpy.random for consideration,
> > pull request 3770:
> > https://github.com/numpy/numpy/pull/3770
> >
> > Similar to the triangular distribution, the trapezoidal distribution may be
> > used where the underlying distribution is not known, but some knowledge of
> > the limits and mode exists. The trapezoidal distribution generalizes the
> > triangular distribution by allowing the modal values to be expressed as a
> > range instead of a point estimate.
> >
> > The trapezoidal distribution implemented, known as the "generalized
> > trapezoidal distribution," has three additional parameters: growth, decay,
> > and boundary ratio. Adjusting these from the default values creates
> > trapezoidal-like distributions with non-linear behavior. Examples can be
> > seen in an R vignette (
> > http://cran.r-project.org/web/packages/trapezoid/vignettes/trapezoid.pdf),
> > as well as these papers by J.R. van Dorp and colleagues:
> >
> > 1) van Dorp, J. R. and Kotz, S. (2003) Generalized trapezoidal
> > distributions. Metrika. 58(1):85–97. Preprint available:
> > http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/Metrika2003VanDorp.pdf
> >
> > 2) van Dorp, J. R., Rambaud, S.C., Perez, J. G., and Pleguezuelo, R. H.
> > (2007) An elicitation procedure for the generalized trapezoidal
> > distribution with a uniform central stage. Decision Analysis Journal.
> > 4:156–166. Preprint available:
> > http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/DA2007.pdf
> >
> > The docstring for the proposed numpy.random.trapezoidal() is as follows:
> >
> > """
> >         trapezoidal(left, mode1, mode2, right, size=None, m=2, n=2, alpha=1)
> >
> >         Draw samples from the generalized trapezoidal distribution.
> >
> >         The trapezoidal distribution is defined by minimum (``left``),
> >         lower mode (``mode1``), upper mode (``mode2``), and maximum
> >         (``right``) parameters. The generalized trapezoidal distribution
> >         adds three more parameters: the growth rate (``m``), decay rate
> >         (``n``), and boundary ratio (``alpha``) parameters. The generalized
> >         trapezoidal distribution simplifies to the trapezoidal distribution
> >         when ``m = n = 2`` and ``alpha = 1``. It further simplifies to a
> >         triangular distribution when ``mode1 == mode2``.
> >
> >         Parameters
> >         ----------
> >         left : scalar
> >             Lower limit.
> >         mode1 : scalar
> >             The value where the first peak of the distribution occurs.
> >             The value should fulfill the condition ``left <= mode1 <= mode2``.
> >         mode2 : scalar
> >             The value where the first peak of the distribution occurs.
> >             The value should fulfill the condition ``mode1 <= mode2 <= right``.
> >         right : scalar
> >             Upper limit, should be larger than or equal to `mode2`.
> >         size : int or tuple of ints, optional
> >             Output shape. Default is None, in which case a single value is
> >             returned.
> >         m : scalar, optional
> >             Growth parameter.
> >         n : scalar, optional
> >             Decay parameter.
> >         alpha : scalar, optional
> >             Boundary ratio parameter.
> >
> >         Returns
> >         -------
> >         samples : ndarray or scalar
> >             The returned samples all lie in the interval [left, right].
> >
> >         Notes
> >         -----
> >         With ``left``, ``mode1``, ``mode2``, ``right``, ``m``, ``n``, and
> >         ``alpha`` parametrized as
> >         :math:`a, b, c, d, m, n, \\text{ and } \\alpha`, respectively,
> >         the probability density function for the generalized trapezoidal
> >         distribution is
> >
> >         .. math::
> >                   f_{\\scriptscriptstyle X}(x \\mid \\Theta) =
> >                   \\mathcal{C}(\\Theta) \\times
> >                       \\begin{cases}
> >                           \\alpha \\left(\\frac{x - a}{b - a} \\right)^{m - 1},
> >                               & \\text{for } a \\leq x < b \\\\
> >                           (1 - \\alpha) \\left(\\frac{x - b}{c - b} \\right) + \\alpha,
> >                               & \\text{for } b \\leq x < c \\\\
> >                           \\left(\\frac{d - x}{d - c} \\right)^{n - 1},
> >                               & \\text{for } c \\leq x \\leq d
> >                       \\end{cases}
> >
> >         with the normalizing constant :math:`\\mathcal{C}(\\Theta)` defined
> >         as
> >
> >         .. math::
> >                 \\mathcal{C}(\\Theta) =
> >                     \\frac{2mn}
> >                     {2 \\alpha \\left(b - a\\right) n +
> >                         \\left(\\alpha + 1 \\right) \\left(c - b \\right)mn +
> >                         2 \\left(d - c \\right)m}
> >
> >         and where the parameter vector :math:`\\Theta = \\{a, b, c, d, m, n,
> >         \\alpha \\}, \\text{ } a \\leq b \\leq c \\leq d, \\text{ and } m, n,
> >         \\alpha > 0`.
> >
> >         Similar to the triangular distribution, the trapezoidal distribution
> >         may be used where the underlying distribution is not known, but some
> >         knowledge of the limits and mode exists. The trapezoidal distribution
> >         generalizes the triangular distribution by allowing the modal values
> >         to be expressed as a range instead of a point estimate. The growth,
> >         decay, and boundary ratio parameters of the generalized trapezoidal
> >         distribution further allow for non-linear behavior to be specified.
> >
> >         References
> >         ----------
> >         .. [1] van Dorp, J. R. and Kotz, S. (2003) Generalized trapezoidal
> >                 distributions. Metrika. 58(1):85–97.
> >                 Preprint available:
> >                 http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/Metrika2003VanDorp.pdf
> >         .. [2] van Dorp, J. R., Rambaud, S.C., Perez, J. G., and
> >                 Pleguezuelo, R. H. (2007)
> >                 An elicitation procedure for the generalized trapezoidal
> >                 distribution with a uniform central stage.
> >                 Decision Analysis Journal. 4:156–166.
> >                 Preprint available:
> >                 http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/DA2007.pdf
> >
> >         Examples
> >         --------
> >         Draw values from the distribution and plot the histogram:
> >
> >         >>> import matplotlib.pyplot as plt
> >         >>> h = plt.hist(np.random.trapezoidal(0, 0.25, 0.75, 1, 100000),
> >         ...              bins=200, normed=True)
> >         >>> plt.show()
> >
> > """
> >
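The density in the Notes section can be checked numerically. Below is a
minimal sketch, not the code from the pull request: gen_trapezoid_pdf is a
hypothetical helper name, and it assumes a < b < c < d strictly and
m, n >= 1 so that no stage has zero width and no power blows up at an
endpoint.

```python
import numpy as np


def gen_trapezoid_pdf(x, a, b, c, d, m=2.0, n=2.0, alpha=1.0):
    """Density of the generalized trapezoidal distribution (illustrative).

    Hypothetical helper, not part of NumPy. Assumes a < b < c < d strictly
    and m, n >= 1.
    """
    x = np.asarray(x, dtype=float)
    # Normalizing constant C(Theta) as given in the docstring.
    norm = 2.0 * m * n / (
        2.0 * alpha * (b - a) * n
        + (alpha + 1.0) * (c - b) * m * n
        + 2.0 * (d - c) * m
    )
    # Clip each power-law base to [0, 1] so fractional exponents never see
    # a negative number outside the stage where the term applies.
    growth = alpha * np.clip((x - a) / (b - a), 0.0, 1.0) ** (m - 1.0)
    middle = (1.0 - alpha) * (x - b) / (c - b) + alpha
    decay = np.clip((d - x) / (d - c), 0.0, 1.0) ** (n - 1.0)
    shape = np.select(
        [(x >= a) & (x < b), (x >= b) & (x < c), (x >= c) & (x <= d)],
        [growth, middle, decay],
        default=0.0,
    )
    return norm * shape
```

Integrating this function over [a, d] on a fine grid should give a value
close to 1 for any valid parameter set, which is a quick way to confirm
that the normalizing constant matches the three stages.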
> > I am unsure if NumPy encourages incorporation of new distributions into
> > numpy.random or instead into separate modules, but found the exercise to be
> > helpful regardless.
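
On the "separate module" option: draws from this distribution can also be
generated outside numpy.random. The following is a rough sketch under
stated assumptions, not the algorithm used in the pull request. It picks
one of the three stages with probability proportional to its mass and then
inverts that stage's CDF. sample_gen_trapezoid is a hypothetical name, and
a <= b <= c <= d with a < d and m, n, alpha > 0 is assumed.

```python
import numpy as np


def sample_gen_trapezoid(a, b, c, d, size, m=2.0, n=2.0, alpha=1.0, rng=None):
    """Draw samples by picking a stage, then inverting that stage's CDF.

    Hypothetical helper, not the code in the pull request.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Unnormalized probability mass of the growth, central and decay stages.
    mass = np.array([
        alpha * (b - a) / m,
        (alpha + 1.0) * (c - b) / 2.0,
        (d - c) / n,
    ])
    stage = rng.choice(3, size=size, p=mass / mass.sum())
    u = rng.random(size)
    # Growth stage: the CDF within the stage is ((x - a) / (b - a)) ** m.
    growth = a + (b - a) * u ** (1.0 / m)
    # Central stage: the density rises linearly from alpha to 1. This form
    # of the quadratic root stays finite when alpha == 1.
    t = (1.0 + alpha) * u / (alpha + np.sqrt(alpha**2 + (1.0 - alpha**2) * u))
    middle = b + (c - b) * t
    # Decay stage: mirror image of the growth stage with exponent n.
    decay = d - (d - c) * u ** (1.0 / n)
    return np.select([stage == 0, stage == 1], [growth, middle], default=decay)
```

With the defaults (m = n = 2, alpha = 1) and parameters (0, 0.25, 0.75, 1),
about two thirds of the draws should fall between the two modes.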
>
> I don't see a reason that numpy.random shouldn't get new
> distributions. It would also be useful to add the corresponding
> distribution to scipy.stats.
>
> I'm not familiar with the generalized trapezoidal distribution and
> don't know where it's used, neither have I ever used triangular.
>
> naming: n, m would indicate to me that they are integers, but they
> can be floats (>0)
> alpha, beta ?
>
>
> about the parameterization - no problem here
>
> Is there a standard version, e.g. left=0, right=1, mode1=?, ... ?
>
> In scipy.stats.distributions we are required to use a location, scale
> parameterization, where loc shifts the distribution and scale
> stretches it.
> Is there a standard parameterization for that?, for example
> left = loc = 0 (default)     or left = loc / scale = 0
> right = scale = 1 (default)
> mode1_relative = mode1 / scale
> mode2_relative = mode2 / scale
> n, m unchanged     no defaults
>
> just checked:
> your naming corresponds to triangular, and triang in scipy has the
> corresponding loc-scale parameterization.
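
One possible reading of the loc-scale mapping sketched above, as plain
Python. The function names are hypothetical, and measuring the relative
modes from loc (so that they lie in [0, 1]) is an assumption; it reduces to
mode / scale when loc = 0. The growth, decay and boundary ratio parameters
are unaffected by shifting and stretching, so they are left out.

```python
def to_loc_scale(left, mode1, mode2, right):
    """Map (left, mode1, mode2, right) to a loc-scale form (illustrative).

    Returns (mode1_relative, mode2_relative, loc, scale), where the relative
    modes are fractions of the support measured from loc. Hypothetical
    helper, not an existing NumPy or SciPy function.
    """
    if not left <= mode1 <= mode2 <= right or left == right:
        raise ValueError("need left <= mode1 <= mode2 <= right and left < right")
    loc = left
    scale = right - left
    return (mode1 - loc) / scale, (mode2 - loc) / scale, loc, scale


def from_loc_scale(mode1_relative, mode2_relative, loc=0.0, scale=1.0):
    """Inverse mapping back to (left, mode1, mode2, right)."""
    return (
        loc,
        loc + mode1_relative * scale,
        loc + mode2_relative * scale,
        loc + scale,
    )
```

For example, to_loc_scale(2.0, 3.0, 5.0, 6.0) gives relative modes 0.25 and
0.75 with loc = 2.0 and scale = 4.0, and from_loc_scale inverts it.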
>
>
> Josef
>
>
> >
> > Thanks,
> > Jeremy
> >
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >


I think you need to s/first/second in the description of the mode2
parameter?