<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Aug 15, 2013 at 6:48 PM, Ryan <span dir="ltr"><<a href="mailto:rymg19@gmail.com" target="_blank">rymg19@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div>For the naming, how about changing median(callable) to median.regular? That way, we don't have to deal with a callable namespace.<br></div></div></blockquote><div><br></div><div>Hmm. That sounds like a step backwards to me: whatever the API is, a simple "from statistics import median; m = median(my_data)" should still work in the simple case.</div>
<div><br></div><div>Mark</div><div><br></div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><br><div class="gmail_quote">
<div class="im">Steven D'Aprano <<a href="mailto:steve@pearwood.info" target="_blank">steve@pearwood.info</a>> wrote:</div><div><div class="h5"><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<pre style="white-space:pre-wrap;word-wrap:break-word;font-family:sans-serif;margin-top:0px">On 15/08/13 21:42, Mark Dickinson wrote:<br><blockquote class="gmail_quote" style="margin:0pt 0pt 1ex 0.8ex;border-left:1px solid #729fcf;padding-left:1ex">
The PEP and code look generally good to me.<br><br>I think the API for median and its variants deserves some wider discussion:<br>the reference implementation has a callable 'median', and variant callables<br>'median.low', 'median.high', 'median.grouped'. The pattern of attaching<br>
the variant callables as attributes on the main callable is unusual, and<br>isn't something I've seen elsewhere in the standard library. I'd like to<br>see some explanation in the PEP for why it's done this way. (There was<br>
already some discussion of this on the issue, but that was more centered<br>around the implementation than the API.)<br><br>I'd propose two alternatives for this: either have separate functions<br>'median', 'median_low', 'median_high', etc., or have a single function<br>
'median' with a "method" argument that takes a string specifying<br>computation using a particular method. I don't see a really good reason to<br>deviate from standard patterns here, and fear that users would find the<br>
current API surprising.</blockquote><br>Alexander Belopolsky has convinced me (off-list) that my current implementation would be better changed to a more conservative one: a callable singleton instance with methods implementing the alternative computations. I'll have something like:<br>
<br><br>def _singleton(cls):<br>    return cls()<br><br><br>@_singleton<br>class median:<br>    def __call__(self, data):<br>        ...<br>    def low(self, data):<br>        ...<br>    ...<br><br><br>In my earlier stats module, I had a single median function that took an argument to choose between alternatives. I called it "scheme":<br>
<br>median(data, scheme="low")<br><br>R uses a parameter
called "type" to choose between alternative calculations, not for median as we are discussing, but for quantiles:<br><br>quantile(x, probs ... type = 7, ...).<br><br>SAS uses a similar system, but with different numeric codes. I rejected both "type" and "method" as the parameter name since either would cause confusion with the usual meanings of those words. I eventually decided against this system for two reasons:<br>
<br>- Each scheme ended up needing to be a separate function, for ease of both implementation and testing. So I had four private median functions, which I put inside a class to act as namespace and avoid polluting the main namespace. Then I needed a "master function" to select which of the methods should be called, with all the additional testing and documentation that entailed.<br>
<br>- The API doesn't really feel very Pythonic to me. For example, we write:<br><br>mystring.rjust(width)<br>dict.items()<br><br>rather than mystring.justify(width,
"right") or dict.iterate("items"). So I think individual methods are a better API, and one that is more familiar to most Python users. The only innovation (if that's what it is) is to make median a callable object.<br>
<br><br>As for having four separate functions (median, median_low, etc.), it just doesn't feel right to me. It puts four slight variations of the same function into the main namespace, instead of keeping them together in a namespace. Names like median_low merely simulate a namespace with pseudo-methods separated by underscores instead of dots, only without the advantages of a real namespace.<br>
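
For comparison, the rejected "scheme" design from the first bullet above can be sketched roughly as follows. This is only an illustration under assumptions: the class name _MedianSchemes, the scheme names "regular", "low" and "high", and the error messages are all hypothetical, and the grouped variant is left out for brevity.

```python
class _MedianSchemes:
    """Private namespace with one function per scheme (hypothetical names)."""

    @staticmethod
    def regular(values):
        # Mean of the two middle values when the count is even.
        mid = len(values) // 2
        if len(values) % 2 == 1:
            return values[mid]
        return (values[mid - 1] + values[mid]) / 2

    @staticmethod
    def low(values):
        # Lower of the two middle values when the count is even.
        return values[(len(values) - 1) // 2]

    @staticmethod
    def high(values):
        # Higher of the two middle values when the count is even.
        return values[len(values) // 2]


def median(data, scheme="regular"):
    """The "master function": sort once, then dispatch on the scheme name."""
    values = sorted(data)
    if not values:
        raise ValueError("median requires at least one data point")
    schemes = {
        "regular": _MedianSchemes.regular,
        "low": _MedianSchemes.low,
        "high": _MedianSchemes.high,
    }
    if scheme not in schemes:
        raise ValueError("unknown scheme: %r" % (scheme,))
    return schemes[scheme](values)
```

With this shape, median(data) still works in the simple case, but every variant goes through the string lookup, and the dispatcher needs its own tests and documentation on top of the per-scheme functions.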
<br>(I treat variance and std dev differently, and make the sample and population forms separate top-level functions rather than methods, simply because they are so well-known from scientific calculators that it is unthinkable to me to do differently. Whenever I use numpy, I am surprised all over again that it has only a single variance function.)<br>
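
To make the sample/population split concrete, here is a minimal sketch of the two forms as separate top-level functions. The names variance and pvariance and the helper _sum_sq_dev are assumptions for illustration, and the naive two-pass formula is used only for brevity; it is not meant as a numerically careful implementation.

```python
def _sum_sq_dev(data):
    """Return (n, sum of squared deviations from the mean). Hypothetical helper."""
    values = list(data)
    n = len(values)
    if n == 0:
        raise ValueError("variance requires at least one data point")
    mean = sum(values) / n
    return n, sum((x - mean) ** 2 for x in values)


def pvariance(data):
    """Population variance: divide by n."""
    n, ss = _sum_sq_dev(data)
    return ss / n


def variance(data):
    """Sample variance: divide by n - 1."""
    n, ss = _sum_sq_dev(data)
    if n == 1:
        raise ValueError("sample variance requires at least two data points")
    return ss / (n - 1)
```

The only difference between the two is the divisor, which is why a single function with a switch is a plausible alternative; the argument above is about familiarity, not implementation.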
<br><br></pre></blockquote></div></div></div><span class="HOEnZb"><font color="#888888"><br>
</font></span></div></div><br>
<br></blockquote></div><br></div></div>