[Numpy-discussion] Log Arrays

Thu May 8 13:05:48 EDT 2008

On Thu, May 8, 2008 at 12:02 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
> On Thu, May 8, 2008 at 10:56 AM, Robert Kern <robert.kern at gmail.com> wrote:
>>
>> On Thu, May 8, 2008 at 11:25 AM, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>> >
>> > On Thu, May 8, 2008 at 10:11 AM, Anne Archibald
>> > <peridot.faceted at gmail.com>
>> > wrote:
>> >>
>> >> 2008/5/8 Charles R Harris <charlesr.harris at gmail.com>:
>> >> >
>> >> > What realistic probability is in the range exp(-1000) ?
>> >>
>> >> Well, I ran into it while doing a maximum-likelihood fit - my early
>> >> guesses had exceedingly low probabilities, but I needed to know which
>> >> way the probabilities were increasing.
>> >
>> > The number of bosons in the universe is only on the order of 1e-42.
>> > Exp(-1000) may be convenient, but as a probability it is a delusion. The
>> > hypothesis "none of the above" would have a much larger prior.
>>
>> When you're running an optimizer over a PDF, you will be stuck in the
>> region of exp(-1000) for a substantial amount of time before you get
>> to the peak. If you don't use the log representation, you will never
>> get to the peak because all of the gradient information is lost to
>> floating point error. You can consult any book on computational
>> statistics for many more examples. This is a long-established best
>> practice in statistics.
>
> But IEEE is already a log representation. You aren't gaining precision, you
> are gaining more bits in the exponent at the expense of fewer bits in the
> mantissa, i.e., less precision.

*YES*. As David pointed out, many of these PDFs are in exponential
form. Most of the meaningful variation is in the exponent, not the
mantissa.
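
A minimal sketch of the point above, not taken from the original thread:
the normal log-density, the helper name log_pdf, and the two trial points
are illustrative assumptions. It shows that in float64 the densities at
two poor early guesses both underflow to zero, so an optimizer cannot
tell which one is better, while the log values stay finite and ordered.

```python
import numpy as np


def log_pdf(x, mu=0.0, sigma=1.0):
    """Log of a normal density, computed without ever forming exp(...)."""
    z = (x - mu) / sigma
    return -0.5 * z * z - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)


# Two hypothetical early guesses, both far from the peak at mu = 0.
x_far, x_near = 50.0, 45.0

# Linear domain: both densities are far below the smallest positive
# float64, so both come out as exactly 0.0 and their ordering is lost.
p_far = np.exp(log_pdf(x_far))
p_near = np.exp(log_pdf(x_near))
print(p_far, p_near, p_near > p_far)

# Log domain: both values are ordinary finite floats, and the guess
# closer to the peak has the larger log-density.
lp_far = log_pdf(x_far)
lp_near = log_pdf(x_near)
print(lp_far, lp_near, lp_near > lp_far)

# Sums of probabilities can also stay in the log domain:
# np.logaddexp(a, b) computes log(exp(a) + exp(b)) without underflow.
log_total = np.logaddexp(lp_far, lp_near)
print(log_total)
```

The trade-off described in the quoted text still applies: a stored log
value spends its bits on range rather than on mantissa digits of the
probability itself, which is acceptable when the meaningful variation is
in the exponent.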

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco