xor: how come so slow?
Steven D'Aprano
steve at REMOVE-THIS-cybersource.com.au
Sat Oct 18 23:01:50 EDT 2008
On Sat, 18 Oct 2008 09:16:11 +1300, Lawrence D'Oliveiro wrote:
> In message <Jlj*lmIps at news.chiark.greenend.org.uk>, Sion Arrowsmith
> wrote:
>
>> Maybe it should be "fewer random data".
>
> Except these days we tend to think of "data" being, say, more like
> "flour" than "bees", so it's "less data", like "less flour", rather than
> like "fewer bees". :)
>
>> After all, each byte in the block is discrete.
>
> Data can come in fractional bits. That's how compression works.
Compression works by removing redundant data so the same amount of useful
information requires fewer bits. If you don't believe me, try compressing
a single bit and see if you get a "fractional bit". I leave it to you to
choose whether the bit should be on or off.
That's why compressing data twice doesn't gain much, if anything. Try
compressing a typical JPEG image, chances are it won't get any smaller.
That's because JPEGs already remove the redundancy in the data,
compressing it. With no redundancy, there can be no compression.
Due to the pigeon-hole principle, any reversible compression scheme must
cause some data to be expanded rather than compressed. (If not, then
there must be two different pieces of data that compress to the same
thing, which means the scheme is not reversible.) If there were
fractional bits, that wouldn't apply, because you can always divide a
pigeon-hole (a bit) into two smaller pigeon-holes.
--
Steven
More information about the Python-list
mailing list