# [scikit-learn] mutual information for continuous variables with scikit-learn

m m mhfh.kvd5 at gmail.com
Wed Feb 1 11:04:05 EST 2023

```
Thanks Sole and Gael, I'll try both ways. Are the two methods fundamentally
different, or will they give me similar results?
Also, most MI analyses I've seen with continuous variables discretize the
data into arbitrary bins. Is this procedure actually valid? I'd think that
by discretizing continuous data we would lose important variation in the
data.
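
# A rough, stdlib-only sketch of one way to probe the binning question
# above. It is an illustration, not anything proposed in this thread. It
# assumes the same setup as the quoted example (300 Gaussian samples with
# correlation 0.6) and compares a plug-in histogram estimate against the
# closed-form MI of a bivariate Gaussian, -0.5 * ln(1 - rho^2), in nats.
# binned_mi is a hypothetical helper name; the seed is arbitrary.
import math
import random


def binned_mi(x, y, bins):
    """Plug-in MI estimate (nats) from an equal-width 2D histogram."""
    n = len(x)
    x_lo, x_hi, y_lo, y_hi = min(x), max(x), min(y), max(y)

    def index(v, lo, hi):
        # Map v in [lo, hi] to a bin index in [0, bins - 1].
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    joint, px, py = {}, [0] * bins, [0] * bins
    for a, b in zip(x, y):
        i, j = index(a, x_lo, x_hi), index(b, y_lo, y_hi)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] += 1
        py[j] += 1
    # sum over occupied cells of p_ij * log(p_ij / (p_i * p_j))
    return sum(c / n * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


rng = random.Random(0)
rho, n = 0.6, 300
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in x]

true_mi = -0.5 * math.log(1 - rho ** 2)  # exact value for Gaussian data
for bins in (5, 10, 50):
    print(bins, "bins:", round(binned_mi(x, y, bins), 3),
          "exact:", round(true_mi, 3))
# With only 300 samples, many bins leave most cells empty or holding a
# single point, so the estimate can drift well above the exact value;
# the printout lets you see how strongly it depends on the bin count.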

On Wed, Feb 1, 2023 at 3:19 PM Gael Varoquaux <gael.varoquaux at normalesup.org>
wrote:

> For estimating mutual information on continuous variables, have a look at
> the corresponding package
> https://pypi.org/project/mutual-info/
>
> G
>
> On Wed, Feb 01, 2023 at 02:32:03PM +0100, m m wrote:
> > Hello,
>
> > I have two continuous variables (heart rate samples over a period of
> > time), and would like to compute mutual information between them as a
> > measure of similarity.
>
> > I've read some posts suggesting to use the mutual_info_score from
> > scikit-learn but will this work for continuous variables? One
> > stackoverflow answer suggested converting the data into probabilities
> > with np.histogram2d() and passing the contingency table to the
> > mutual_info_score.
>
> > import numpy as np
> > from sklearn.metrics import mutual_info_score
>
> > def calc_MI(x, y, bins):
> >     c_xy = np.histogram2d(x, y, bins)[0]
> >     mi = mutual_info_score(None, None, contingency=c_xy)
> >     return mi
>
> > # generate data
> > L = np.linalg.cholesky( [[1.0, 0.60], [0.60, 1.0]])
> > uncorrelated = np.random.standard_normal((2, 300))
> > correlated = np.dot(L, uncorrelated)
> > A = correlated[0]
> > B = correlated[1]
> > x = (A - np.mean(A)) / np.std(A)
> > y = (B - np.mean(B)) / np.std(B)
>
> > # calculate MI
> > mi = calc_MI(x, y, 50)
>
> > Is calc_MI a valid approach? I'm asking because I also read that when
> > variables are continuous, then the sums in the formula for discrete data
> > become integrals, but I'm not sure if this procedure is implemented in
> > scikit-learn?
>
> > Thanks!
>
>
> --
>     Gael Varoquaux
>     Research Director, INRIA