Compression of random binary data
Ben Bacarisse
ben.usenet at bsb.me.uk
Sun Oct 29 11:17:53 EDT 2017
Gregory Ewing <greg.ewing at canterbury.ac.nz> writes:
> Ben Bacarisse wrote:
>> But that has to be about the process that gives rise to the data, not
>> the data themselves.
>
>> If I say: "here is some random data..." you can't tell if it is or is
>> not from a random source. I can, as a parlour trick, compress and
>> recover this "random data" because I chose it.
>
> Indeed. Another way to say it is that you can't conclude
> anything about the source from a sample size of one.
>
> If you have a large enough sample, then you can estimate
> a probability distribution, and calculate an entropy.
>
>> I think the argument that you can't compress arbitrary data is simpler
>> ... it's obvious that it includes the results of previous
>> compressions.
>
> What? I don't see how "results of previous compressions" comes
> into it. The source has an entropy even if you're not doing
> compression at all.
Maybe we are talking at cross purposes. A claim to be able to compress
arbitrary data leads immediately to the problem that iterating the
compression would eventually yield zero-size results. That, to me, is a
simpler argument than talking about data from a random source.
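
As a rough illustration of the iteration argument, here is a minimal
Python sketch. The helper names (iterate_compression,
count_shorter_strings) are made up for this example, and zlib merely
stands in for "some lossless compressor"; nothing here is specific to
any particular claimed scheme. It shows two things: repeated
compression of the same data does not keep shrinking it, and there are
fewer bit strings shorter than n bits than there are n-bit strings, so
no lossless scheme can shorten every input.

```python
import os
import zlib


def iterate_compression(data, rounds=5):
    """Return the data size before and after each round of zlib compression."""
    sizes = [len(data)]
    for _ in range(rounds):
        data = zlib.compress(data)  # feed the previous output back in
        sizes.append(len(data))
    return sizes


def count_shorter_strings(n):
    """Count the bit strings strictly shorter than n bits (empty string included)."""
    # 2**0 + 2**1 + ... + 2**(n-1) == 2**n - 1
    return 2 ** n - 1


# Assumption: os.urandom output serves as a stand-in for "arbitrary" data.
sizes = iterate_compression(os.urandom(4096))
print("sizes after each round:", sizes)

n = 8
print("strings of exactly", n, "bits:", 2 ** n)
print("strings shorter than", n, "bits:", count_shorter_strings(n))
```

If every round really did shrink its input, the sizes list would have to
fall to zero, and a zero-length result cannot be decompressed back into
more than one distinct original.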
--
Ben.