[Python-Dev] Decimal <-> float comparisons in py3k.

Bob Ippolito bob at redivi.com
Tue Mar 23 16:43:11 CET 2010


On Sat, Mar 20, 2010 at 4:38 PM, Mark Dickinson <dickinsm at gmail.com> wrote:
> On Sat, Mar 20, 2010 at 7:56 PM, Guido van Rossum <guido at python.org> wrote:
>> I propose to reduce all hashes to the hash of a normalized fraction,
>> which we can define as a combination of the hashes for the numerator
>> and the denominator. Then all we have to do is figure fairly efficient
>> ways to convert floats and decimals to normalized fractions (not
>> necessarily Fractions). I may be naive but this seems doable: for a
>> float, the denominator is always a power of 2 and removing factors of
>> 2 from the denominator is easy (just right-shift until the last bit is
>> zero). For Decimal, the unnormalized denominator is always a power of
>> 10, and the normalization is a bit messier, but doesn't seem
>> excessively so. The resulting numerator and denominator may be large
>> numbers, but for typical use of Decimal and float they will rarely be
>> excessively large, and I'm not too worried about slowing things down
>> when they are (everything slows down when you're using really large
>> integers anyway).
>
> I *am* worried about slowing things down for large Decimals:  if you
> can't put Decimal('1e1234567') into a dict or set without waiting for
> an hour for the hash computation to complete (because it's busy
> computing 10**1234567), I consider that a problem.
>
> But it's solvable!  I've just put a patch on the bug tracker:
>
> http://bugs.python.org/issue8188
>
> It demonstrates how hashes can be implemented efficiently and
> compatibly for all numeric types, even large Decimals like the above.
> It needs a little tidying up, but it works.
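
For reference, the normalization Guido describes above can be sketched
roughly as follows. This is only an illustration, not code from the
thread or from the patch: the helper names float_to_ratio and
decimal_to_ratio are made up, IEEE 754 doubles are assumed, and only
finite values are handled.

```python
import math
from decimal import Decimal

def float_to_ratio(x):
    """Return (numerator, denominator) in lowest terms for a finite float."""
    m, e = math.frexp(x)        # x == m * 2**e with 0.5 <= |m| < 1 (or m == 0)
    n = int(m * 2**53)          # exact: a double carries 53 significand bits
    e -= 53                     # now x == n * 2**e
    if n == 0:
        return 0, 1
    if e >= 0:
        return n << e, 1
    d = 1 << -e                 # denominator is a power of 2
    while n % 2 == 0 and d > 1:
        n >>= 1                 # strip common factors of 2 by shifting
        d >>= 1
    return n, d

def decimal_to_ratio(dec):
    """Return (numerator, denominator) in lowest terms for a finite Decimal."""
    sign, digits, exp = dec.as_tuple()
    coeff = int("".join(map(str, digits)))
    if exp >= 0:
        n, d = coeff * 10**exp, 1   # builds 10**exp as a real integer
    else:
        d = 10**-exp                # unnormalized denominator: a power of 10
        g = math.gcd(coeff, d)
        n, d = coeff // g, d // g
    return (-n if sign else n), d

print(float_to_ratio(1.25), decimal_to_ratio(Decimal("1.25")))
```

Both calls in the last line yield the same pair, which is what makes a
shared hash possible. The cost Mark points out sits in the 10**exp
step: for Decimal('1e1234567') that integer has over a million digits.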

Yesterday I got interested in how the implementation works,
especially given the lack of explanatory comments in
numeric_hash3.patch. numeric_hash4.patch has much better comments, but
I didn't see it until after I had deciphered the previous patch and
written most of this:
http://bob.pythonmac.org/archives/2010/03/23/py3k-unified-numeric-hash/

I'm not really qualified to review the patch (what little formal math
training I had has atrophied quite a bit over the years), but as far as
I can tell it works. The results also match the Python implementations
that I created.
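
A rough sketch of how a hash can stay cheap for huge exponents, for
anyone who wants to experiment. It is not taken from either patch; it
follows the rule that the current Python 3 documentation gives for
numeric hashes (reduce the value modulo the prime
sys.hash_info.modulus). The names hash_ratio and hash_decimal are made
up for this example, and only finite values are handled.

```python
import sys
from decimal import Decimal
from fractions import Fraction

P = sys.hash_info.modulus   # a prime; 2**61 - 1 on typical 64-bit builds

def hash_ratio(m, n):
    """Hash the rational m/n, with n > 0 and m, n coprime."""
    if n % P == 0:
        h = sys.hash_info.inf
    else:
        # pow(n, P - 2, P) is the inverse of n modulo the prime P.
        h = (abs(m) % P) * pow(n, P - 2, P) % P
    if m < 0:
        h = -h
    return -2 if h == -1 else h     # -1 is reserved as an error code

def hash_decimal(dec):
    """Hash a finite Decimal without ever building 10**exp in full."""
    sign, digits, exp = dec.as_tuple()
    coeff = int("".join(map(str, digits)))
    if exp >= 0:
        scale = pow(10, exp, P)                  # three-argument pow: fast
    else:
        scale = pow(pow(10, P - 2, P), -exp, P)  # power of 10's inverse
    h = coeff % P * scale % P
    if sign:
        h = -h
    return -2 if h == -1 else h

big = Decimal("1e1234567")
print(hash_decimal(big) == hash(big))
print(hash_ratio(1, 2) == hash(0.5) == hash(Fraction(1, 2)))
```

The three-argument pow keeps every intermediate value below P, so the
exponent 1234567 costs only a few dozen modular multiplications.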

-bob

