fastmath library?

Alex Martelli aleaxit at yahoo.com
Fri Nov 17 05:37:09 EST 2000


"Michael Hudson" <mwh21 at cam.ac.uk> wrote in message
news:m3zoizr5nx.fsf at atrus.jesus.cam.ac.uk...
> "Pete Shinners" <pete at visionart.com> writes:
>
> > is there a "fast" math library available?
> >
> > one that uses high-speed lookup tables and estimates for
> > low quality results? i'd mainly like one that can handle
> > things like square-root, cosine, etc, etc
> >
> > i figure there's got to be something like this already
> > available for python?
>
> Wouldn't have though there'd be much point - the faffing around Python
> does in calling a function surely dwarfs the cost of the actual

Indeed, *measuring* the proposed 'memoize' solution shows that:

class memoize:
    def __init__(self, fn):
        self.fn = fn
        self.args = {}

    def __call__(self, *args):
        if not self.args.has_key(args):
            self.args[args] = apply(self.fn, args)

        return self.args[args]

def test():
    import math
    import time

    print "Without memoizing:",
    start = time.clock()
    for i in range(100*1000):
        a = math.sin(0.12)
    stend = time.clock()
    print stend-start

    msin = memoize(math.sin)
    print "With    memoizing:",
    start = time.clock()
    for i in range(100*1000):
        a = msin(0.12)
    stend = time.clock()
    print stend-start

>>> memo.test()
With a dumy funct: 0.399322148674
Without memoizing: 0.909183175743
With    memoizing: 3.99693843856
>>> memo.test()
With a dumy funct: 0.397101196632
Without memoizing: 0.906404052357
With    memoizing: 3.96675695745
>>>

I.e., the lookup-table is slowing things down
by a factor of 4 or more (while the actual operation
of math.sin accounts for just slightly more than
50% of the call-cost -- the comparison with the
dummy-function call tells us that).


We *can* do a little bit better with the usual "it's
easier to ask forgiveness than permission" idiom:

class memoize:
    def __init__(self, fn):
        self.fn = fn
        self.args = {}

    def __call__(self, *args):
        try: return self.args[args]
        except KeyError:
            return self.args.setdefault(args, self.fn(*args))

>>> memo.test()
With a dumy funct: 0.402718110063
Without memoizing: 0.948696007817
With    memoizing: 2.4527448453
>>> memo.test()
With a dumy funct: 0.402881538609
Without memoizing: 0.941703780311
With    memoizing: 2.5360632326
>>>

but still, despite the remarkable improvement, the
memoizing is still slowing us down by over a factor
of two.


Pete's needs are actually for "fuzzy memoizing" of
some sort -- a tall order to implement really fast
(he wants 'low-quality results' as zippingly fast
as possible).  No doubt it WILL have to work on top
of Numeric, to have halfway-decent performance
improvements -- a lot of Python function calls on
scalars in a loop ain't gonna improve by enough.

What IS available in/for C/C++ that does "crudely
approximate but horribly fast elementary functions"?
No doubt, the most productive task would be to wrap
such a beast (assuming it exists... it IS going to
be DIFFICULT indeed to beat hardware-FPU performance
with ANY software implementation, rough at its
approximations may be...!-)


Alex






More information about the Python-list mailing list