<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Sun, Sep 22, 2013 at 1:24 PM, <span dir="ltr"><<a href="mailto:josef.pktd@gmail.com" target="_blank">josef.pktd@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Sat, Sep 21, 2013 at 1:55 PM, Jeremy Hetzel <<a href="mailto:jthetzel@gmail.com">jthetzel@gmail.com</a>> wrote:<br>
> I've added a trapezoidal distribution to numpy.random for consideration,<br>
> pull request 3770:<br>
> <a href="https://github.com/numpy/numpy/pull/3770" target="_blank">https://github.com/numpy/numpy/pull/3770</a><br>
><br>
> Similar to the triangular distribution, the trapezoidal distribution may be<br>
> used where the underlying distribution is not known, but some knowledge of<br>
> the limits and mode exists. The trapezoidal distribution generalizes the<br>
> triangular distribution by allowing the modal values to be expressed as a<br>
> range instead of a point estimate.<br>
><br>
> The trapezoidal distribution implemented, known as the "generalized<br>
> trapezoidal distribution," has three additional parameters: growth, decay,<br>
> and boundary ratio. Adjusting these from the default values creates<br>
> trapezoidal-like distributions with non-linear behavior. Examples can be<br>
> seen in an R vignette (<br>
> <a href="http://cran.r-project.org/web/packages/trapezoid/vignettes/trapezoid.pdf" target="_blank">http://cran.r-project.org/web/packages/trapezoid/vignettes/trapezoid.pdf</a> ),<br>
> as well as these papers by J.R. van Dorp and colleagues:<br>
><br>
> 1) van Dorp, J. R. and Kotz, S. (2003) Generalized trapezoidal<br>
> distributions. Metrika. 58(1):85–97. Preprint available:<br>
> <a href="http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/Metrika2003VanDorp.pdf" target="_blank">http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/Metrika2003VanDorp.pdf</a><br>
><br>
> 2) van Dorp, J. R., Rambaud, S.C., Perez, J. G., and Pleguezuelo, R. H.<br>
> (2007) An elicitation procedure for the generalized trapezoidal distribution<br>
> with a uniform central stage. Decision Analysis Journal. 4:156–166. Preprint<br>
> available:<br>
> <a href="http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/DA2007.pdf" target="_blank">http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/DA2007.pdf</a><br>
><br>
> The docstring for the proposed numpy.random.trapezoidal() is as follows:<br>
><br>
> """<br>
> trapezoidal(left, mode1, mode2, right, size=None, m=2, n=2, alpha=1)<br>
><br>
> Draw samples from the generalized trapezoidal distribution.<br>
><br>
> The trapezoidal distribution is defined by minimum (``left``), lower<br>
> mode (``mode1``), upper<br>
> mode (``mode2``), and maximum (``right``) parameters. The<br>
> generalized trapezoidal distribution<br>
> adds three more parameters: the growth rate (``m``), decay rate<br>
> (``n``), and boundary<br>
> ratio (``alpha``) parameters. The generalized trapezoidal<br>
> distribution simplifies<br>
> to the trapezoidal distribution when ``m = n = 2`` and ``alpha =<br>
> 1``. It further<br>
> simplifies to a triangular distribution when ``mode1 == mode2``.<br>
><br>
> Parameters<br>
> ----------<br>
> left : scalar<br>
> Lower limit.<br>
> mode1 : scalar<br>
> The value where the first peak of the distribution occurs.<br>
> The value should fulfill the condition ``left <= mode1 <=<br>
> mode2``.<br>
> mode2 : scalar<br>
> The value where the first peak of the distribution occurs.<br>
> The value should fulfill the condition ``mode1 <= mode2 <=<br>
> right``.<br>
> right : scalar<br>
> Upper limit, should be larger than or equal to `mode2`.<br>
> size : int or tuple of ints, optional<br>
> Output shape. Default is None, in which case a single value is<br>
> returned.<br>
> m : scalar, optional<br>
> Growth parameter.<br>
> n : scalar, optional<br>
> Decay parameter.<br>
> alpha : scalar, optional<br>
> Boundary ratio parameter.<br>
><br>
> Returns<br>
> -------<br>
> samples : ndarray or scalar<br>
> The returned samples all lie in the interval [left, right].<br>
><br>
> Notes<br>
> -----<br>
> With ``left``, ``mode1``, ``mode2``, ``right``, ``m``, ``n``, and<br>
> ``alpha`` parametrized as<br>
> :math:`a, b, c, d, m, n, \\text{ and } \\alpha`, respectively,<br>
> the probability density function for the generalized trapezoidal<br>
> distribution is<br>
><br>
> .. math::<br>
> f_{\\scriptscriptstyle X}(x \\mid \\Theta) =<br>
> \\mathcal{C}(\\Theta) \\times<br>
> \\begin{cases}<br>
> \\alpha \\left(\\frac{x - a}{b - a}<br>
> \\right)^{m - 1}, & \\text{for } a \\leq x < b \\\\<br>
> (1 - \\alpha) \\left(\\frac{x - b}{c - b} \\right)<br>
> + \\alpha, & \\text{for } b \\leq x < c \\\\<br>
> \\left(\\frac{d - x}{d - c} \\right)^{n-1}, &<br>
> \\text{for } c \\leq x \\leq d<br>
> \\end{cases}<br>
><br>
> with the normalizing constant :math:`\\mathcal{C}(\\Theta)` defined<br>
> as<br>
><br>
> .. math::<br>
> \\mathcal{C}(\\Theta) =<br>
> \\frac{2mn}<br>
> {2 \\alpha \\left(b - a\\right) n +<br>
> \\left(\\alpha + 1 \\right) \\left(c - b \\right)mn<br>
> +<br>
> 2 \\left(d - c \\right)m}<br>
><br>
> and where the parameter vector :math:`\\Theta = \\{a, b, c, d, m, n,<br>
> \\alpha \\}, \\text{ } a \\leq b \\leq c \\leq d, \\text{ and } m, n,<br>
> \\alpha >0`.<br>
><br>
> Similar to the triangular distribution, the trapezoidal distribution<br>
> may be used where the<br>
> underlying distribution is not known, but some knowledge of the<br>
> limits and<br>
> mode exists. The trapezoidal distribution generalizes the triangular<br>
> distribution by allowing<br>
> the modal values to be expressed as a range instead of a point<br>
> estimate. The growth, decay, and<br>
> boundary ratio parameters of the generalized trapezoidal<br>
> distribution further allow for non-linear<br>
> behavior to be specified.<br>
><br>
> References<br>
> ----------<br>
> .. [1] van Dorp, J. R. and Kotz, S. (2003) Generalized trapezoidal<br>
> distributions.<br>
> Metrika. 58(1):85–97.<br>
> Preprint available:<br>
> <a href="http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/Metrika2003VanDorp.pdf" target="_blank">http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/Metrika2003VanDorp.pdf</a><br>
> .. [2] van Dorp, J. R., Rambaud, S.C., Perez, J. G., and<br>
> Pleguezuelo, R. H. (2007)<br>
> An elicitation procedure for the generalized trapezoidal<br>
> distribution with a uniform central stage.<br>
> Decision Analysis Journal. 4:156–166.<br>
> Preprint available:<br>
> <a href="http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/DA2007.pdf" target="_blank">http://www.seas.gwu.edu/~dorpjr/Publications/JournalPapers/DA2007.pdf</a><br>
><br>
> Examples<br>
> --------<br>
> Draw values from the distribution and plot the histogram:<br>
><br>
> >>> import matplotlib.pyplot as plt<br>
> >>> h = plt.hist(np.random.trapezoidal(0, 0.25, 0.75, 1, 100000),<br>
> ... bins=200, normed=True)<br>
> >>> plt.show()<br>
><br>
> """<br>
><br>
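A minimal pure-Python sketch of the density and normalizing constant from the Notes section of the docstring above, plus one way to draw from it. This is not the code in the pull request.<br>
Assumptions: the first stage is read as alpha * ((x - a)/(b - a))^(m - 1), which is the form that integrates to the stated normalizing constant; the names trapezoidal_pdf and trapezoidal_sample are hypothetical; the sampler picks a stage in proportion to its area and then inverts that stage's CDF; for simplicity both functions require left < mode1 <= mode2 < right.<br>
<pre>
```python
import math
import random


def _check(a, b, c, d, m, n, alpha):
    # Simplifying assumption: strict inequalities at both ends avoid 0/0.
    if not (a < b <= c < d):
        raise ValueError("need left < mode1 <= mode2 < right")
    if min(m, n, alpha) <= 0:
        raise ValueError("m, n and alpha must be positive")


def trapezoidal_pdf(x, left, mode1, mode2, right, m=2.0, n=2.0, alpha=1.0):
    """Generalized trapezoidal density at a scalar x (hypothetical helper)."""
    a, b, c, d = left, mode1, mode2, right
    _check(a, b, c, d, m, n, alpha)
    # Normalizing constant C(Theta) as given in the docstring.
    norm = 2.0 * m * n / (
        2.0 * alpha * (b - a) * n
        + (alpha + 1.0) * (c - b) * m * n
        + 2.0 * (d - c) * m
    )
    if x < a or x > d:
        return 0.0
    if x < b:   # growth stage
        return norm * alpha * ((x - a) / (b - a)) ** (m - 1.0)
    if x < c:   # central stage, linear from alpha (at b) to 1 (at c)
        return norm * ((1.0 - alpha) * (x - b) / (c - b) + alpha)
    return norm * ((d - x) / (d - c)) ** (n - 1.0)   # decay stage


def trapezoidal_sample(rng, left, mode1, mode2, right, m=2.0, n=2.0, alpha=1.0):
    """Draw one value using a random.Random instance (hypothetical helper)."""
    a, b, c, d = left, mode1, mode2, right
    _check(a, b, c, d, m, n, alpha)
    # Unnormalized areas of the three stages.
    w1 = alpha * (b - a) / m
    w2 = (1.0 + alpha) * (c - b) / 2.0
    w3 = (d - c) / n
    u = rng.random() * (w1 + w2 + w3)   # selects the stage
    v = rng.random()                    # position within the stage
    if u < w1:
        return a + (b - a) * v ** (1.0 / m)
    if u < w1 + w2:
        # Inverse CDF of a density proportional to alpha + (1 - alpha) * t on [0, 1],
        # written so that alpha == 1 needs no special case.
        t = (1.0 + alpha) * v / (alpha + math.sqrt(alpha ** 2 + (1.0 - alpha ** 2) * v))
        return b + (c - b) * t
    return d - (d - c) * v ** (1.0 / n)
```
</pre>
With the defaults (m = n = 2, alpha = 1) and mode1 == mode2 this reduces to the triangular density; for example, trapezoidal_pdf(0.5, 0, 0.5, 0.5, 1) evaluates to 2.0.<br>
<br>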
> I am unsure if NumPy encourages incorporation of new distributions into<br>
> numpy.random or instead into separate modules, but found the exercise to be<br>
> helpful regardless.<br>
<br>
</div></div>I don't see a reason that numpy.random shouldn't get new<br>
distributions. It would also be useful to add the corresponding<br>
distribution to scipy.stats.<br>
<br>
I'm not familiar with the generalized trapezoidal distribution and<br>
don't know where it's used, neither have I ever used triangular.<br>
<br>
naming: n, m would indicate to me that they are integers, but they<br>
can be floats (>0)<br>
alpha, beta ?<br>
<br>
<br>
about the parameterization - no problem here<br>
<br>
Is there a standard version, e.g. left=0, right=1, mode1=?, ... ?<br>
<br>
In scipy.stats.distributions we are required to use a location, scale<br>
parameterization, where loc shifts the distribution and scale<br>
stretches it.<br>
Is there a standard parameterization for that?, for example<br>
left = loc = 0 (default) or left = loc / scale = 0<br>
right = scale = 1 (default)<br>
mode1_relative = mode1 / scale<br>
mode2_relative = mode2 / scale<br>
n, m unchanged no defaults<br>
<br>
just checked:<br>
your naming corresponds to triangular, and triang in scipy has the<br>
corresponding loc-scale parameterization.<br>
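<br>
A rough sketch of what such a loc-scale mapping could look like, modeled on how scipy.stats.triang places its mode at loc + c*scale. The helper names and the relative shape parameters c1 and c2 are hypothetical, and the sketch measures the modes from left (loc) before dividing by scale, which is an assumption on my part:<br>
<pre>
```python
def to_loc_scale(left, mode1, mode2, right):
    """Convert (left, mode1, mode2, right) to (c1, c2, loc, scale).

    c1 and c2 are the two modes expressed as fractions of the support,
    so 0 <= c1 <= c2 <= 1. Hypothetical helper, not a scipy API.
    """
    scale = right - left
    if scale <= 0:
        raise ValueError("need left < right")
    if not (left <= mode1 <= mode2 <= right):
        raise ValueError("need left <= mode1 <= mode2 <= right")
    c1 = (mode1 - left) / scale
    c2 = (mode2 - left) / scale
    return c1, c2, left, scale


def from_loc_scale(c1, c2, loc=0.0, scale=1.0):
    """Inverse mapping: recover (left, mode1, mode2, right)."""
    return loc, loc + c1 * scale, loc + c2 * scale, loc + scale
```
</pre>
The standard version would then be loc=0, scale=1, with c1 and c2 as the only location-type shape parameters; m, n and alpha are unaffected by shifting or stretching.<br>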
<br>
<br>
Josef<br>
<br>
<br>
><br>
> Thanks,<br>
> Jeremy<br>
><br>
><br>
> _______________________________________________<br>
> NumPy-Discussion mailing list<br>
> <a href="mailto:NumPy-Discussion@scipy.org">NumPy-Discussion@scipy.org</a><br>
> <a href="http://mail.scipy.org/mailman/listinfo/numpy-discussion" target="_blank">http://mail.scipy.org/mailman/listinfo/numpy-discussion</a><br>
><br>
</blockquote><div><br></div><div>I think you need to s/first/second in the description of the mode2 parameter? </div>
</div><br></div></div>