Compression of random binary data
Gregory Ewing
greg.ewing at canterbury.ac.nz
Sat Oct 28 22:00:23 EDT 2017
Ben Bacarisse wrote:
> But that has to be about the process that gives rise to the data, not
> the data themselves.
> If I say: "here is some random data..." you can't tell if it is or is
> not from a random source. I can, as a parlour trick, compress and
> recover this "random data" because I chose it.
Indeed. Another way to say it is that you can't conclude
anything about the source from a sample size of one.
If you have a large enough sample, then you can estimate
a probability distribution over its symbols, and calculate
an entropy from that.
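
As a rough sketch of that estimate (an illustration only, not something
from the thread): the function name empirical_entropy is hypothetical, and
it assumes the symbols are bytes drawn independently from one fixed
distribution, which is a modelling choice rather than a fact about any
particular source.

```python
import math
from collections import Counter


def empirical_entropy(sample: bytes) -> float:
    """Estimate Shannon entropy in bits per byte from byte frequencies.

    Assumption: bytes are treated as independent, identically
    distributed symbols (a zeroth-order model).
    """
    if not sample:
        return 0.0
    total = len(sample)
    # Relative frequency of each distinct byte value in the sample.
    probabilities = [count / total for count in Counter(sample).values()]
    # H = sum over symbols of p * log2(1 / p)
    return sum(p * math.log2(1 / p) for p in probabilities)
```

For example, empirical_entropy(b"abab") gives 1.0 bit per byte, and a
sample consisting of a single repeated byte gives 0.0. The figure is only
as good as the sample and the independence assumption behind it; a short
sample tells you very little about the source.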
> I think the argument that you can't compress arbitrary data is simpler
> ... it's obvious that it includes the results of previous
> compressions.
What? I don't see how "results of previous compressions" comes
into it. The source has an entropy even if you're not doing
compression at all.
--
Greg