Entropy from empirical high-dimensional data
Hi list, I am looking at estimating entropy and conditional entropy from data for which I only have access to observations, not the underlying probabilistic laws. With low-dimensional data, I would simply use an empirical estimate of the probabilities by converting each observation to its quantile, and then apply the standard formula for entropy (for instance using scipy.stats.entropy). However, I have high-dimensional data (~100 features and 30000 observations). Not only is it harder to convert observations to probabilities under the empirical law, but I am also worried about curse-of-dimensionality effects: density estimation in high dimensions is a difficult problem. Does anybody have advice, or Python code to point to, for this task? Cheers, Gaël
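The low-dimensional approach described above can be sketched as follows; the toy Gaussian data, the choice of 50 bins, and the use of a 2-D histogram for the joint law are illustrative assumptions, not anything prescribed in the thread. Conditional entropy then follows from the identity H(Y|X) = H(X, Y) - H(X):

```python
import numpy as np
from scipy.stats import entropy

# Toy data: y depends on x, so H(Y|X) should be well below H(Y).
rng = np.random.default_rng(0)
x = rng.normal(size=30000)
y = x + rng.normal(size=30000)

# Empirical joint law of (x, y) from a 2-D histogram.
counts_xy, _, _ = np.histogram2d(x, y, bins=50)
p_xy = counts_xy / counts_xy.sum()

h_xy = entropy(p_xy.ravel())       # joint entropy H(X, Y), in nats
h_x = entropy(p_xy.sum(axis=1))    # marginal entropy H(X)
h_y_given_x = h_xy - h_x           # conditional entropy H(Y|X)
```

Note that these are entropies of the discretized variables, so the values depend on the bin width; scipy.stats.entropy normalizes its input and treats zero-count bins as contributing nothing.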
Sorry for the noise; I sent this to the dev list, while it belongs on the user list.
Hi Gael, I recently played with a related problem which you might find of interest: (short paper) http://nilab.cimec.unitn.it/people/olivetti/work/prni2011/olivetti_bayes_err... (slides) http://nilab.cimec.unitn.it/people/olivetti/work/prni2011/olivetti_prni2011_... The proposed model can be used to estimate the posterior probability of information given observations, using classifiers. Note that these are just preliminary results. If this is of some help to you, just let me know :-) I recently talked to Stephen Strother about this topic and he pointed me to this paper: http://www.ncbi.nlm.nih.gov/pubmed/20533565 HTH, Emanuele On 05/25/2011 11:35 PM, Gael Varoquaux wrote:
Hi list,
I am looking at estimating entropy and conditional entropy from data for which I only have access to observations, not the underlying probabilistic laws.
With low-dimensional data, I would simply use an empirical estimate of the probabilities by converting each observation to its quantile, and then apply the standard formula for entropy (for instance using scipy.stats.entropy).
However, I have high-dimensional data (~100 features and 30000 observations). Not only is it harder to convert observations to probabilities under the empirical law, but I am also worried about curse-of-dimensionality effects: density estimation in high dimensions is a difficult problem.
Does anybody have advice, or Python code to point to, for this task?
Cheers,
Gaël
On Thu, May 26, 2011 at 10:17:16AM +0200, Emanuele Olivetti wrote:
I recently played with a related problem which you might find of interest:
(slides) http://nilab.cimec.unitn.it/people/olivetti/work/prni2011/olivetti_prni2011_...
Very interesting. It is quite unrelated to what I am doing right now, but it is very interesting in general.
I recently talked to Stephen Strother about this topic and he pointed me to this paper: http://www.ncbi.nlm.nih.gov/pubmed/20533565
I saw Stephen at NIPS and we did discuss these matters. All this is indeed promising. Thanks for the pointers, Gaël
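One standard technique that sidesteps explicit density estimation in high dimensions is the Kozachenko-Leonenko k-nearest-neighbour entropy estimator, which estimates differential entropy directly from nearest-neighbour distances. A minimal sketch (the function name and the default k=3 are illustrative assumptions, not anything discussed in the thread; it assumes continuous data with no duplicated points, since a zero distance would make log blow up):

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def knn_entropy(X, k=3):
    """Kozachenko-Leonenko k-NN estimate of differential entropy, in nats.

    H_hat = psi(n) - psi(k) + log(V_d) + (d / n) * sum_i log(r_i),
    where r_i is the distance from sample i to its k-th nearest
    neighbour and V_d is the volume of the d-dimensional unit ball.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    tree = cKDTree(X)
    # k + 1 because the closest "neighbour" of each point is itself.
    dist, _ = tree.query(X, k=k + 1)
    r_k = dist[:, -1]
    log_v_d = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)
    return digamma(n) - digamma(k) + log_v_d + d * np.mean(np.log(r_k))
```

Because only neighbour distances enter the formula, the cost of estimating the density on a grid never appears, which is why this family of estimators behaves better than histograms as the dimension grows (though it is not immune to the curse of dimensionality either). Conditional entropy can again be obtained as H(Y|X) = H(X, Y) - H(X) by applying the estimator to the stacked and marginal samples.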