<div dir="ltr"><div dir="ltr"><div dir="ltr">My main concern about planck is that I am not aware of it being an established distribution name. I found Planck's law (<a href="https://en.wikipedia.org/wiki/Planck%27s_law">https://en.wikipedia.org/wiki/Planck%27s_law</a>), but I don't recognize in it the distribution implemented in SciPy. Does anyone know the distribution under that name?</div><div><br>SciPy also calls it a discrete exponential distribution. Normally, it is the geometric distribution that is called the discrete analogue of the exponential (memoryless property), so this could confuse users.<br>The implementation of geom in SciPy is based on geometric in NumPy; my guess is that it has a better sampling method than planck, whose sampling is based on the ppf.</div><div><br></div><div>We can also keep the different parametrization in stats and explain it in the docstring.</div><div><br></div><div>Christoph</div><div><br></div><div class="gmail_quote"><div dir="ltr">On Thu, Jan 3, 2019 at 10:30 PM <<a href="mailto:scipy-dev-request@python.org">scipy-dev-request@python.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;padding-left:1ex;border-left-color:rgb(204,204,204);border-left-width:1px;border-left-style:solid">Send SciPy-Dev mailing list submissions to<br>
    <a href="mailto:scipy-dev@python.org" target="_blank">scipy-dev@python.org</a><br>
<br>
Today's Topics:<br>
<br>
  1. Re: add johnson SL distribution (<a href="mailto:josef.pktd@gmail.com" target="_blank">josef.pktd@gmail.com</a>)<br>
  2. Re: Deprecate planck distribution? (<a href="mailto:josef.pktd@gmail.com" target="_blank">josef.pktd@gmail.com</a>)<br>
<br>
<br>
----------------------------------------------------------------------<br>
<br>
Message: 1<br>
Date: Thu, 3 Jan 2019 15:57:26 -0500<br>
From: <a href="mailto:josef.pktd@gmail.com" target="_blank">josef.pktd@gmail.com</a><br>
To: SciPy Developers List <<a href="mailto:scipy-dev@python.org" target="_blank">scipy-dev@python.org</a>><br>
Subject: Re: [SciPy-Dev] add johnson SL distribution<br>
Message-ID:<br>
    <CAMMTP+BXHOf33E3CxzM9YSpaHKtV189hqmAP=<a href="mailto:xNSuRn4b6okWQ@mail.gmail.com" target="_blank">xNSuRn4b6okWQ@mail.gmail.com</a>><br>
Content-Type: text/plain; charset="utf-8"<br>
<br>
On Thu, Jan 3, 2019 at 3:54 PM <<a href="mailto:josef.pktd@gmail.com" target="_blank">josef.pktd@gmail.com</a>> wrote:<br>
<br>
><br>
><br>
> On Thu, Jan 3, 2019 at 3:31 PM Matt Haberland <<a href="mailto:haberland@ucla.edu" target="_blank">haberland@ucla.edu</a>> wrote:<br>
><br>
>> I am not personally familiar with the Johnson family of distributions<br>
>> <<a href="https://books.google.com/books?id=_LvgBwAAQBAJ&pg=PA197&lpg=PA197&dq=johns+su+sb+sl+distributions&source=bl&ots=LBowBmYTse&sig=9KPViyvSlLAFp9EYqi-ejTYgQ30&hl=en&sa=X&ved=2ahUKEwjE6cnvt9LfAhWG458KHdrQAmkQ6AEwDXoECAIQAQ#v=onepage&q=johns%20su%20sb%20sl%20distributions&f=false" target="_blank" rel="noreferrer">https://books.google.com/books?id=_LvgBwAAQBAJ&pg=PA197&lpg=PA197&dq=johns+su+sb+sl+distributions&source=bl&ots=LBowBmYTse&sig=9KPViyvSlLAFp9EYqi-ejTYgQ30&hl=en&sa=X&ved=2ahUKEwjE6cnvt9LfAhWG458KHdrQAmkQ6AEwDXoECAIQAQ#v=onepage&q=johns%20su%20sb%20sl%20distributions&f=false</a>>,<br>
>> but the SL does seem to complete the set.<br>
>><br>
>> The license for the Matlab implementation does seem to be BSD 3-clause<br>
>> <<a href="https://en.wikipedia.org/wiki/BSD_licenses#3-clause" target="_blank" rel="noreferrer">https://en.wikipedia.org/wiki/BSD_licenses#3-clause</a>> and thus<br>
>> compatible with SciPy.<br>
>><br>
>> Seems like a reasonable first issue, but certainly finishing stalled PRs<br>
>> would be helpful, too!<br>
>><br>
>> Matt Haberland<br>
>><br>
>> On Thu, Jan 3, 2019 at 10:09 AM Michael Watson <<br>
>> <a href="mailto:mike.watson@sheffield.ac.uk" target="_blank">mike.watson@sheffield.ac.uk</a>> wrote:<br>
>><br>
>>> Hi all, happy new year,<br>
>>> We have the SB and SU Johnson distributions implemented but not the SL<br>
>>> distribution. It doesn't look like much work to add it, if that's<br>
>>> appropriate. I'm doing some work with these distributions and ultimately<br>
>>> would like to implement functions to fit by moments and by quantiles too.<br>
>>> There are existing implementations distributed under the BSD<br>
>>> licence here:<br>
>>><br>
>>><br>
>>> <a href="https://uk.mathworks.com/matlabcentral/fileexchange/46123-johnson-curve-toolbox" target="_blank" rel="noreferrer">https://uk.mathworks.com/matlabcentral/fileexchange/46123-johnson-curve-toolbox</a><br>
>>><br>
>>> So it doesn't seem like a big job from my point of view, and I'll be<br>
>>> doing it anyway.<br>
>>><br>
>>> It would also be my first contribution, so if it would be better to start<br>
>>> with another issue (I saw a list and 2 stalled PRs in another email) and<br>
>>> only then try to add functionality, just say so and I can look at<br>
>>> contributing in other ways first.<br>
>>><br>
>><br>
> In general, on adding new distributions:<br>
><br>
> The speed of getting a new distribution in depends a lot on how well it<br>
> fits into the general distribution pattern and whether all core methods are<br>
> available as closed-form expressions or by using scipy.special functions.<br>
> If that is the case, then adding a new distribution is easy.<br>
> If that is not the case, then it can be difficult to get a good version<br>
> merged. One difficult case is if the pdf is only available as a<br>
> computationally expensive numerical approximation.<br>
><br>
> In general, the distributions only have a fit method that uses maximum<br>
> likelihood estimation of parameters (which might reduce to method of<br>
> moments in special cases).<br>
><br>
> Based on a quick search, it looks like JohnsonSL is just the log-normal<br>
> distribution (as a loc-scale family, which is available in scipy).<br>
><br>
<br>
scipy lognorm is a 3-parameter family; maybe there should also be a 4-<br>
parameter family.<br>
<br>
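To make the remark about JohnsonSL and lognorm concrete, here is a minimal sketch. It assumes the three-parameter Johnson SL form Z = gamma + delta * log(X - xi) with Z standard normal; that form and the names gamma, delta and xi are assumptions based on the usual textbook presentation, not something stated in this thread. Under that assumption the distribution should coincide with scipy's lognorm with s = 1/delta, loc = xi and scale = exp(-gamma/delta):<br>
<pre>
```python
import numpy as np
from scipy import stats

# Assumed Johnson SL parameters (illustrative values only).
gamma, delta, xi = 0.7, 1.5, 2.0


def johnson_sl_cdf(x, gamma, delta, xi):
    """CDF of X when gamma + delta * log(X - xi) is standard normal."""
    return stats.norm.cdf(gamma + delta * np.log(x - xi))


# Matching scipy.stats.lognorm parameters under the assumption above.
s = 1.0 / delta
scale = np.exp(-gamma / delta)

x = xi + np.linspace(0.1, 10.0, 50)  # points inside the support
max_diff = np.max(np.abs(johnson_sl_cdf(x, gamma, delta, xi)
                         - stats.lognorm.cdf(x, s, loc=xi, scale=scale)))
print(max_diff)  # expected to be at floating-point noise level
```
</pre>
If an extra scale lambda_ is placed inside the logarithm, only the lognorm scale changes, to lambda_ * exp(-gamma/delta), so under this assumed form the fourth parameter would not be separately identifiable from gamma.<br>
<br>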
<br>
><br>
> Josef<br>
><br>
><br>
>> Mike<br>
>>> _______________________________________________<br>
>>> SciPy-Dev mailing list<br>
>>> <a href="mailto:SciPy-Dev@python.org" target="_blank">SciPy-Dev@python.org</a><br>
>>> <a href="https://mail.python.org/mailman/listinfo/scipy-dev" target="_blank" rel="noreferrer">https://mail.python.org/mailman/listinfo/scipy-dev</a><br>
>>><br>
>><br>
>><br>
>> --<br>
>> Matt Haberland<br>
>> Assistant Adjunct Professor in the Program in Computing<br>
>> Department of Mathematics<br>
>> 6617A Math Sciences Building, UCLA<br>
>><br>
><br>
<br>
------------------------------<br>
<br>
Message: 2<br>
Date: Thu, 3 Jan 2019 16:29:22 -0500<br>
From: <a href="mailto:josef.pktd@gmail.com" target="_blank">josef.pktd@gmail.com</a><br>
To: SciPy Developers List <<a href="mailto:scipy-dev@python.org" target="_blank">scipy-dev@python.org</a>><br>
Subject: Re: [SciPy-Dev] Deprecate planck distribution?<br>
Message-ID:<br>
    <CAMMTP+A=AtSWy8XH9FsM8mjZ=<a href="mailto:HNQvX9b572p4SWMF765D6sJYw@mail.gmail.com" target="_blank">HNQvX9b572p4SWMF765D6sJYw@mail.gmail.com</a>><br>
Content-Type: text/plain; charset="utf-8"<br>
<br>
On Thu, Jan 3, 2019 at 9:22 AM Ali Cetin <<a href="mailto:ali.cetin@outlook.com" target="_blank">ali.cetin@outlook.com</a>> wrote:<br>
<br>
><br>
><br>
> ------------------------------<br>
> *From:* SciPy-Dev <scipy-dev-bounces+ali.cetin=<a href="mailto:outlook.com@python.org" target="_blank">outlook.com@python.org</a>> on<br>
> behalf of Robert Kern <<a href="mailto:robert.kern@gmail.com" target="_blank">robert.kern@gmail.com</a>><br>
> *Sent:* Wednesday, January 2, 2019 21:07<br>
> *To:* SciPy Developers List<br>
> *Subject:* Re: [SciPy-Dev] Deprecate planck distribution?<br>
><br>
> On Wed, Jan 2, 2019 at 1:36 AM Christoph Baumgarten <<br>
> <a href="mailto:christoph.baumgarten@gmail.com" target="_blank">christoph.baumgarten@gmail.com</a>> wrote:<br>
> ><br>
> > Hi all,<br>
> ><br>
> > happy new year!<br>
> ><br>
> > I noted that the Planck distribution is a geometric distribution with a<br>
> different parametrization, see Issue #9359:<br>
> ><br>
> > import numpy as np<br>
> > from scipy.stats import planck, geom<br>
> ><br>
> > a = 0.5<br>
> > k = np.arange(20)<br>
> > sum(abs(geom.pmf(k, 1-np.exp(-a), loc=-1) - planck.pmf(k, a))) # 1.30e-18<br>
> ><br>
> > I don't know if there is a specific reason to have the Planck<br>
> distribution in addition to the geometric. If not, I would propose to<br>
> deprecate it.<br>
> ><br>
> > Any views? Thanks<br>
><br>
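A related sketch, assuming the equivalence shown in the snippet above: planck variates could also be drawn with NumPy's geometric sampler by shifting its output by one. The seed, sample size and variable names are illustrative choices, and nothing here is claimed about how SciPy samples internally.<br>
<pre>
```python
import numpy as np
from scipy.stats import planck

a = 0.5                      # planck shape parameter, as in the snippet above
p = 1.0 - np.exp(-a)         # matching success probability for geom

rng = np.random.default_rng(12345)
# NumPy's geometric sampler starts at 1; planck starts at 0, hence the shift.
samples = rng.geometric(p, size=100_000) - 1

print(samples.mean(), planck.mean(a))  # the two should be close
```
</pre>
<br>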
> If we were to turn back time, and the question was whether to *add* the<br>
> Planck distribution given that we had the geometric distribution, I would<br>
> probably be convinced by this. However, given that the Planck distribution<br>
> has already been added, I don't think that it's worth removing it. The<br>
> marginal cost to having this alternate parameterization is likely less than<br>
> the cost of anyone changing their code.<br>
><br>
> The collection of probability distributions is also a place where some<br>
> nontrivial duplication actually has some positive value. People typically<br>
> come to `scipy.stats` with a distribution (with a name and specific<br>
> parameterization conventions) already in mind. Having more than one<br>
> parameterization available helps people recognize the distribution that<br>
> they want; having an alternate present doesn't impair the search task while<br>
> not having one they are looking for (or burying it in the Notes of the<br>
> docstring of the canonical version) can make the search task much harder.<br>
> It's a common complaint that `scipy.stats` doesn't expose certain common<br>
> parameterizations of distributions, so we should probably be working to<br>
> expand the collection of parameterizations rather than collapsing them.<br>
><br>
><br>
> Robert Kern<br>
><br>
> I agree with Robert on this one. If you want to go down that rat hole, you<br>
> will quickly find that most distribution functions are mere special cases<br>
> and/or alternative parameterizations of a few general classes of<br>
> distributions. If the concern is code management, then it could be argued<br>
> that an effort should be made on abstracting distribution functions from<br>
> these more general classes. However, personally, I prefer transparency and<br>
> consistency with established literature when it comes to parametrization.<br>
><br>
<br>
I think there is a good reason for implementing special cases instead of<br>
only general cases: computational simplifications can then be used. Also,<br>
using only a general distribution with several extra parameters is<br>
cumbersome and requires a lot more work from the user, e.g. setting all<br>
the extra parameters to their special-case values.<br>
<br>
This is not the case for pure reparameterizations that still have the same<br>
number of parameters.<br>
<br>
The main straitjacket in scipy.stats distributions in terms of<br>
parameterization is that all continuous distributions use the loc-scale<br>
(plus possibly shape) parameterization.<br>
I think there are enough maintainers now (where I don't count myself) that<br>
it would be feasible to add other distribution classes that don't have to<br>
follow the loc-scale parameterization, or that could be intermediate<br>
classes for groups of similar distributions.<br>
<br>
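As a small illustration of the loc-scale convention mentioned above (the gamma distribution and the numeric values are arbitrary choices for this sketch): the density with loc and scale is the standardized density evaluated at (x - loc) / scale, divided by scale.<br>
<pre>
```python
import numpy as np
from scipy import stats

shape, loc, scale = 2.5, 1.0, 3.0      # arbitrary illustrative values
x = np.linspace(1.5, 20.0, 40)         # points inside the shifted support

# pdf with explicit loc and scale ...
direct = stats.gamma.pdf(x, shape, loc=loc, scale=scale)
# ... equals the standardized pdf at (x - loc) / scale, divided by scale.
standardized = stats.gamma.pdf((x - loc) / scale, shape) / scale

print(np.allclose(direct, standardized))
```
</pre>
<br>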
For example, I think something similar to the frozen distribution class<br>
could be added that is just a Reparameterization class, i.e. it internally<br>
delegates to a standard scipy distribution but uses a parameterization and<br>
parameter transformation that is more common and more familiar to users.<br>
Another advantage of reparameterization classes would be that estimation is<br>
often easier or more interpretable in a different parameterization. E.g.<br>
statsmodels uses negativebinomial in the mean-dispersion parameterization<br>
instead of the common negbin parameterization.<br>
A further advantage is that the Hessian, and hence the covariance of the<br>
parameter estimates, often has a nicer shape in a different parameterization.<br>
<br>
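A minimal sketch of what such a reparameterization class might look like. The class name MeanDispersionNegBin is hypothetical and not part of scipy or statsmodels; it assumes the mean-dispersion convention var = mu + alpha * mu**2 and simply delegates to scipy.stats.nbinom:<br>
<pre>
```python
import numpy as np
from scipy import stats


class MeanDispersionNegBin:
    """Hypothetical wrapper: negative binomial in terms of mean and dispersion.

    Assumes var = mu + alpha * mu**2 and delegates to scipy.stats.nbinom.
    """

    def __init__(self, mu, alpha):
        self.mu = mu
        self.alpha = alpha
        # Transform to the (n, p) parameters that nbinom expects.
        n = 1.0 / alpha
        p = n / (n + mu)
        self._dist = stats.nbinom(n, p)

    def pmf(self, k):
        return self._dist.pmf(k)

    def mean(self):
        return self._dist.mean()

    def var(self):
        return self._dist.var()


dist = MeanDispersionNegBin(mu=3.0, alpha=0.5)
print(dist.mean(), dist.var())  # should match mu and mu + alpha * mu**2
```
</pre>
Only pmf, mean and var are delegated here to keep the sketch short; a fuller version would forward the remaining methods in the same way.<br>
<br>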
An example of an intermediate class would be common support for distributions<br>
that are created by a transformation of another, mainly normal, distribution.<br>
This includes the Johnson system of distributions in the other open thread<br>
on the list.<br>
<br>
(Just some thoughts, I'm currently not in this neighborhood of stats.)<br>
<br>
Josef<br>
<br>
<br>
<br>
><br>
> That's my two cents on the issue.<br>
><br>
> Cheers,<br>
> Ali Cetin<br>
><br>
<br>
------------------------------<br>
<br>
End of SciPy-Dev Digest, Vol 183, Issue 6<br>
*****************************************<br>
</blockquote></div></div></div>