Numpy outlier removal

Robert Kern robert.kern at
Mon Jan 7 16:35:05 CET 2013

On 07/01/2013 15:20, Oscar Benjamin wrote:
> On 7 January 2013 05:11, Steven D'Aprano
> <steve+comp.lang.python at> wrote:
>> On Mon, 07 Jan 2013 02:29:27 +0000, Oscar Benjamin wrote:
>>> On 7 January 2013 01:46, Steven D'Aprano
>>> <steve+comp.lang.python at> wrote:
>>>> On Sun, 06 Jan 2013 19:44:08 +0000, Joseph L. Casale wrote:
>>>> I'm not sure that this approach is statistically robust. No, let me be
>>>> even more assertive: I'm sure that this approach is NOT statistically
>>>> robust, and may be scientifically dubious.
>>> Whether or not this is "statistically robust" requires more explanation
>>> about the OP's intention.
>> Not really. Statistical robustness is objectively defined, and the user's
>> intention doesn't come into it. The mean is not a robust measure of
>> central tendency, the median is, regardless of why you pick one or the
>> other.
> Okay, I see what you mean. I wasn't thinking of robustness as a
> technical term but now I see that you are correct.
> Perhaps what I should have said is that whether or not this matters
> depends on the problem at hand (hopefully this isn't an important
> medical trial) and the particular type of data that you have; assuming
> normality is fine in many cases even if the data is not "really"
> normal.

"Having outliers" literally means that assuming normality is not fine. If 
assuming normality were fine, then you wouldn't need to remove outliers.
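
A minimal sketch of the point, assuming NumPy is available: one extreme
value drags the mean a long way while the median barely moves, so a filter
built on the median and the median absolute deviation (MAD) is less
affected by the very points it is trying to find. The sample data, the
helper name mad_filter, the 0.6745 scaling constant, and the 3.5 cutoff
are illustrative assumptions (a common "modified z-score" convention), not
anything proposed in this thread.

```python
import numpy as np

def mad_filter(values, threshold=3.5):
    """Hypothetical helper: keep points whose modified z-score is <= threshold."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    # Median absolute deviation: a robust stand-in for the standard deviation.
    mad = np.median(np.abs(values - median))
    if mad == 0:
        # No spread to judge against; return the data unchanged.
        return values
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # for normally distributed data (assumed convention).
    modified_z = 0.6745 * (values - median) / mad
    return values[np.abs(modified_z) <= threshold]

# Made-up sample: six readings near 10 and one wild value.
data = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 55.0])

print("mean:  ", data.mean())      # pulled far above 10 by the single outlier
print("median:", np.median(data))  # stays at the centre of the bulk of the data

kept = mad_filter(data)
print("kept:  ", kept)
```

Whether dropping the flagged points is justified at all is a separate
question that depends on the data and the problem, as discussed above.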

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco
