random

Alex Martelli aleaxit at yahoo.com
Mon Jun 4 03:16:12 EDT 2001


"Laura Creighton" <lac at cd.chalmers.se> wrote in message
news:mailman.991632672.19827.python-list at python.org...
> Gang, you have forgot what I said about sampling error.  Lets try a
different
> thought experiment.
>
> If you decide to go north or south based on whatever train arrives
> first, randomly arriving at the train station will not help you because
> if the trains run 1 an hour and at the 00 going n and the 10 going s
> you are 5 to 1 more likely to find the next train is north no matter
> how crazy your work schedule is.

I'm not sure what you mean by "won't help you" in this context (I
think I've missed a post of yours since I don't recall the sampling
error reference).  Under your posited conditions, it seems to me that,
if you can arrive at the station at any time with a uniform distrib.,
then p(N)=5/6, p(S)=1/6.  Instead of 1 full bit of information, then,
your going North or South will carry:
    -log2(5/6) = 0.263    when you go N (5/6 of the time),
    -log2(1/6) = 2.585    when you go S (1/6 of the time)
with an expected value of (5/6)*0.263+(1/6)*2.585 = 0.65
bits of information.

I'm not sure if this satisfies any given definition of "perfectly
random", because it seems to depend entirely on the definition.

In a sequence of such N/S bits there will be some weak 'pattern'
(of order 0) making them carry less than 1 bit of information
(unpredictability, "surprise") each, but no other (on the thought
experiment as given).  Starting from a analysis of lot of such bits,
an observer can estimate the 5/6 probability and therefore predict
future bits (probabilistically) better than if the distribution was
uniform.  But it still takes an amount of information O(N) to
compress a sequence of N such bits, we just have a slightly
lower multiplicative constant than if the two outcomes were
equally likely.  We could presumably compress a million such
bits into 650,000+K' bits where K' does not depend on the
number of bits being compressed (it's the size of the program
stricto-sensu doing the decoding of the compressed stream, by
Huffman or whatever).


Alex






More information about the Python-list mailing list