[SciPy-User] KDE bandwidth selection question
Zachary Pincus
zachary.pincus at yale.edu
Mon Feb 14 11:19:37 EST 2011
> I read the Kernel Density Estimation documentation online but I was
> unable to find any reference to the bandwidth selection algorithm (in
> scipy.stats, I mean). My question is: which algorithm does
> stats.gaussian_kde use to choose the bandwidth?
The default is "Scott's Factor" (look at the source code), which is
pretty simplistic but seems to work well.
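
One way to follow the "look at the source code" hint without opening the file is to print the method from an interactive session. This is only a sketch: the sample data is made up, and it assumes a scipy version where gaussian_kde is implemented in Python and exposes scotts_factor() and the factor attribute.

```python
import inspect

import numpy as np
from scipy import stats

# Made-up 1-D sample, purely for illustration.
rng = np.random.default_rng(0)
sample = rng.normal(size=200)

kde = stats.gaussian_kde(sample)

# Show how the default bandwidth factor is computed in the installed scipy.
source = inspect.getsource(stats.gaussian_kde.scotts_factor)
print(source)

# The factor actually used by this estimator.
print(kde.factor)
```
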
A while ago I asked where this came from, and Josef did some really
helpful research. Below is his answer (and my original question etc).
Zach
>> I've been wading through the old literature on gaussian KDE for a
>> little while trying to find a reference for the "Scott's factor"
>> rule-of-thumb for gaussian KDE bandwidth selection (n**(-1/(d+4)),
>> where n
>> is the number of data points and d their dimension; this factor is
>> multiplied by the covariance matrix to yield the bandwidths).
>>
>> I can find a lot of Scott's later contributions of fancier methods,
>> but nothing about this basic one...
>
> Scott (1992) is the reference given in Haerdle:
>
> http://books.google.com/books?id=qPCmAOS-CoMC&pg=PA73&lpg=PA73&dq=scott%27s+factor+rule-+of-thumb+hardle&source=bl&ots=kTNHJpyk6w&sig=5wwCOzThGsIzXOyVax2AbKQ11Rw&hl=en&ei=MOwlTdC3F4aBlAeRsZDNAQ&sa=X&oi=book_result&ct=result&resnum=1&sqi=2&ved=0CBYQ6AEwAA#v=onepage&q&f=false
>
> Haerdle's book is also online, but I need to look for the link.
>
> Josef
I think it's equation (3.70) in
http://fedc.wiwi.hu-berlin.de/xplore/ebooks/html/spm/spmhtmlnode18.html
with a page reference to Scott (1992), p. 152.
More of Haerdle's online books are here: http://fedc.wiwi.hu-berlin.de/xplore/ebooks/html/
Josef
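
The rule of thumb discussed in this thread, n**(-1/(d+4)), can be checked numerically against what gaussian_kde reports. The sketch below uses made-up 2-D data and unweighted points. It also assumes, based on the scipy versions I am aware of, that the kernel covariance is the sample covariance of the data scaled by the square of the factor, so the factor itself acts on the standard-deviation scale.

```python
import numpy as np
from scipy import stats

# Made-up data: d dimensions, n points. gaussian_kde expects shape (d, n).
rng = np.random.default_rng(1)
d, n = 2, 500
data = rng.normal(size=(d, n))

kde = stats.gaussian_kde(data)

# Scott's factor computed by hand from the formula in the thread.
scott = n ** (-1.0 / (d + 4))

# Sample covariance of the data (rows are variables, as in gaussian_kde).
data_cov = np.cov(data)

factor_matches = np.isclose(kde.factor, scott)
cov_matches = np.allclose(kde.covariance, data_cov * scott**2)

print("factor matches n**(-1/(d+4)):", factor_matches)
print("covariance matches cov(data) * factor**2:", cov_matches)
```

If either check prints False, the installed scipy probably computes the bandwidth differently from what is assumed here, and the source is the place to look.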