[SciPy-user] Calling C code (Newbie Question)

Tue Sep 28 03:36:50 EDT 2004

Just two notes:

1) You can look at SWIG (http://www.swig.org/) as an alternative to 
Weave. It lets you wrap an entire C library in your Python code, not 
only code chuncks as Weave does.

2) As stated by Eric I think the limiting factor are the trig 
functions. Look at Numerical Recipes section 5.5. I've found that the 
algorithm suggested there to calculate sin and cos function can speed 
up loops a lot.

HTH,
   Andrea.

On 28 Sep 2004, at 07:53, eric jones wrote:

> Hey Tom,
>
> Tom Kornack wrote:
>
>> Hi Robert:
>>
>>> Your code:
>>>      sh = sum( cn*sin(outerproduct(om,time) ), 1)
>>>
>>> My code:
>>>      sh = dot(sin(outerproduct(time, om)), cn)
>>
>>
>> Thanks for your suggestions. Using dot() is better than sum(), 
>> however, the outerproduct() alone gives me malloc errors and I have 2 
>> GB memory. I mean, it's a huge matrix that gets created when I have a 
>> million points. That's why I wanted to use C.
>>
>> The better question for this list would be: how do I Weave this in C?
>>
> I inserted the following replacements into your example app:
>
>>>  for i in range(numf):
>>>        s2[i] = sum( sin(2.*om[i]*time) )
>>>        c2[i] = sum( cos(2.*om[i]*time) )
>>
>
>    import weave
>    code = """
>           for (int i=0; i < Nom[0]; i++)
>           {
>               s2[i]=0.0;
>               c2[i]=0.0;
>               float om_2 = 2.0*om[i];
>               for (int j=0; j < Ntime[0]; j++)
>               {
>                   float ang = om_2*time[j];
>                   s2[i] += sin(ang);
>                   c2[i] += cos(ang);
>               }             }
>           """
>    weave.inline(code,['om','s2','c2','time'])
>
> and a little later in your code:
>
>        #for i in range(numf):                  #    sh[i] = sum( 
> cn*sin(om[i]*time) )
>        #    ch[i] = sum( cn*cos(om[i]*time) )
>              code = """
>               for (int i=0; i < Nom[0]; i++)
>               {
>                   sh[i] = 0.0;
>                   ch[i] = 0.0;
>                   for (int j=0; j < Ntime[0]; j++)
>                   {
>                       float ang om[i]*time[j];
>                       sh[i] += cn[j]*sin(ang);
>                       ch[i] += cn[j]*cos(ang);
>                   }                 }
>               """
>        weave.inline(code,['om','sh','ch','time','cn'])
>
> [Note: Check the accuracy of the calculations as this was just a quick 
> example.]
>
> I've attached the edited example file that I used for testing.  It has 
> two calls to your function in the main() to get rid of any caching 
> affects from the first run or a weave.inline() function.
> I got a speedup of about 1.75 when running your function with 
> multiple=0.  How much faster is the pure C version?  I played around a 
> little, and it looks like the sin() and cos() are the limiting factors 
> here.  For N=4000, the function runs in about 3.95 seconds on my 
> machine.  It is an order of magnitude faster if I comment out all the 
> inner loop trig calls with something like:
>
>    sh[i] += cn[j]*ang; //sin(ang);
>    ch[i] += cn[j]*ang  //cos(ang);
>
> This suggest that even the pure C version that uses the same approach 
> isn't going to be much faster since it also will be limited byt the 
> speed of the sin() and cos() calls.
>
>>
>> Sorry for the rather basic question.
>
> Not at all.
>
> eric
>
> #! /usr/bin/env python
> # -*- coding: UTF-8 -*-
>
> from __future__ import division
> from Numeric import *
> from scipy.stats import std, mean, norm
>
> #from  Scientific.Statistics import standardDeviation
> from RandomArray import normal
> from MLab import mean
>
> def lomb(time, signal, freqin=[], fap=0.01, multiple=0, noise=0, 
> verbosity=2):
> #
> # NAME:
> #         lomb
> #
> #
> # PURPOSE:
> #         Compute the lomb-scargle periodogram of an unevenly sampled
> #         lightcurve
> #
> #
> # CATEGORY:
> #         time series analysis
> #
> #
> # CALLING SEQUENCE:
> #   psd, freq = 
> scargle(time,signal,fmin=fmin,fmax=fmax,numf=numf,pmin=pmin,pmax=pmax,
> #            omega=omega,fap=fap,signi=signi
> #
> #
> # INPUTS:
> #         time: The times at which the time series was measured
> #         rate: the corresponding count rates
> #
> #
> # OPTIONAL INPUTS:
> #         fmin,fmax: minimum and maximum frequency (NOT ANGULAR FREQ!)
> #                to be used (has precedence over pmin,pmax)
> #         pmin,pmax: minimum and maximum PERIOD to be used
> #         omega: angular frequencies for which the PSD values are
> #                desired
> #         fap : false alarm probability desired
> #               (see Scargle et al., p. 840, and signi
> #               keyword). Default equal to 0.01 (99% significance)
> #         noise: for the normalization of the periodogram and the
> #            compute of the white noise simulations. If not set, equal 
> to
> #            the variance of the original lc.
> #         multiple: number of white  noise simulations for the FAP
> #            power level. Default equal to 0 (i.e., no simulations).
> #         numf: number of independent frequencies
> #
> #
> # KEYWORD PARAMETERS:
> #         verbosity: print out debugging information if set
> #
> # OUTPUTS:
> #            psd  : the psd-values corresponding to omega
> #            freq   : frequency of PSD
> #
> #
> # OPTIONAL OUTPUTS:
> #            signi : power threshold corresponding to the given
> #                    false alarm probabilities fap and according to the
> #                    desired number of independent frequencies
> #            simsigni : power threshold corresponding to the given
> #                    false alarm probabilities fap according to white
> #                    noise simulations
> #            psdpeaksort : array with the maximum peak pro each 
> simulation
> #
> # PROCEDURE:
> #         The Lomb Scargle PSD is computed according to the
> #         definitions given by Scargle, 1982, ApJ, 263, 835, and Horne
> #         and Baliunas, 1986, MNRAS, 302, 757. Beware of patterns and
> #         clustered data points as the Horne results break down in
> #         this case! Read and understand the papers and this
> #         code before using it! For the fast algorithm read W.H. Press
> #         and G.B. Rybicki 1989, ApJ 338, 277.
> #
> #
> # EXAMPLE:
> #
> #
> #
> # MODIFICATION HISTORY:
> #          Version 1.0, 1997, Joern Wilms IAAT
> #          Version 1.1, 1998.09.23, JW: Do not normalize if variance 
> is 0
> #             (for computation of LSP of window function...)
> #          Version 1.2, 1999.01.07, JW: force numf to be int
> #          Version 1.3, 1999.08.05, JW: added omega keyword
> #          Version 1.4, 1999.08
> #              KP: significance levels
> #              JW: pmin,pmax keywords
> #          Version 1.5, 1999.08.27, JW: compute the significance levels
> #               from the horne number of independent frequencies, and 
> not from
> #               numf
> #          Version 1.6, 2000.07.27, SS and SB: added fast algorithm 
> and FAP
> #               according to white noise lc simulations.
> #          Version 1.7, 2000.07.28 JW: added debug keyword, sped up
> #               simulations by factor of four (use /slow to get old
> #               behavior of the simulations)
> #           Version 2.0 2004.09.01, Thomas Kornack rewritten in Python
>
>     if verbosity>1: print('Starting Lomb (standard)...')
>
>     # defaults
>     if noise == 0: noise = sqrt(std(signal))
>
>     # make times manageable (Scargle periodogram is time-shift 
> invariant)
>     time = time-time[0]
>
>     # number of independent frequencies
>     #  (Horne and Baliunas, eq. 13)
>     n0 = len(time)
>     horne = long(-6.362+1.193*n0+0.00098*n0**2.)
>     if (horne < 0): horne=5
>     numf = horne
>
>     # min.freq is 1/T
>     fmin = 1./max(time)
>
>     # max. freq: approx. to Nyquist frequency
>     fmax = n0 / (2.*max(time))
>
>     # if omega is not given, compute it
>     if (len(freqin) > 0):
>         om = freqin*2*pi
>         numf = len(om)
>     else:
>         om = 2.*pi*(fmin+(fmax-fmin)*arange(numf)/(numf-1.))
>
>     # False Alarm Probability according to Numf
>     signi = -log( 1. - ((1.-fap)**(1./horne)) )
>
>     if verbosity>1: print('Setting up periodogram...')
>
>     # Periodogram
>     # Ref.: W.H. Press and G.B. Rybicki, 1989, ApJ 338, 277
>
>     # Eq. (6); s2, c2
>     # Do we REALLY need these to be doubles?!
>     s2 = zeros(numf, typecode='f') # , typecode = 'd'
>     c2 = zeros(numf, typecode ='f') # , typecode = 'd'
>     #for i in range(numf):
>     #   s2[i] = sum( sin(2.*om[i]*time) )
>     #   c2[i] = sum( cos(2.*om[i]*time) )
>
>     import weave
>     code = """
>            for (int i=0; i < Nom[0]; i++)
>            {
>                s2[i]=0.0;
>                c2[i]=0.0;
>                float om_2 = 2.0*om[i];
>                for (int j=0; j < Ntime[0]; j++)
>                {
>                    float ang = om_2*time[j];
>                    s2[i] += sin(ang);
>                    c2[i] += cos(ang);
>                }
>            }
>            """
>     weave.inline(code,['om','s2','c2','time'])
>
>     # Eq. (2): Definition -> tan(2omtau)
>     # --- tan(2omtau)  =  s2 / c2
>     omtau = arctan(s2/c2)/2
>
>     # cos(tau), sin(tau)
>     cosomtau = cos(omtau)
>     sinomtau = sin(omtau)
>
>     # Eq. (7); sum(cos(t-tau)**2)  and sum(sin(t-tau)**2)
>     tmp = c2*cos(2.*omtau) + s2*sin(2.*omtau)
>     tc2 = 0.5*(n0+tmp) # sum(cos(t-tau)**2)
>     ts2 = 0.5*(n0-tmp) # sum(sin(t-tau)**2)
>
>     # clean up
>     tmp = 0.
>     omtau= 0.
>     s2 = 0.
>     t2 = 0.
>
>     # computing the periodogram for the original lc
>
>     # Subtract mean from data
>     cn = signal - mean(signal)
>
>     # Eq. (5); sh and ch
>     sh = zeros(numf, typecode='f')
>     ch = zeros(numf, typecode='f')
>
>     if verbosity>1: print('Looping...')
>     print 'multiple:', multiple
>     if (multiple > 0):
>         sisi=zeros([n0,numf], typecode='f')
>         coco=zeros([n0,numf], typecode='f')
>         for i in range(numf):
>             sisi[:,i]=sin(om[i]*time)
>             coco[:,i]=cos(om[i]*time)
>
>             sh[i]=sum(cn*sisi[:,i])
>             ch[i]=sum(cn*coco[:,i])
>     else:
>         #for i in range(numf):
>         #    sh[i] = sum( cn*sin(om[i]*time) )
>         #    ch[i] = sum( cn*cos(om[i]*time) )
>
>         code = """
>                for (int i=0; i < Nom[0]; i++)
>                {
>                    sh[i] = 0.0;
>                    ch[i] = 0.0;
>                    for (int j=0; j < Ntime[0]; j++)
>                    {
>                        float ang = om[i]*time[j];
>                        sh[i] += cn[j]*sin(ang);
>                        ch[i] += cn[j]*cos(ang);
>                    }
>                }
>                """
>         weave.inline(code,['om','sh','ch','time','cn'])
>
>     # Eq. (3)
>     px = (ch*cosomtau + sh*sinomtau)**2 / tc2 + (sh*cosomtau - 
> ch*sinomtau)**2 / ts2
>
>     # correct normalization
>     psd = 0.5*px/(noise**2)
>
>     if verbosity>1: print('Running Simulations...')
>     # --- RUN SIMULATIONS for multiple > 0
>     simsigni=[]
>     psdpeaksort=[]
>     if multiple > 0:
>         if (multiple*min(fap) < 10):
>             print('WARNING: Number of iterations (multiple keyword) 
> not large enough for false alarm probability requested (need 
> multiple*FAP > 10 )')
>
>         psdpeak = zeros(multiple, typecode='f')
>         for m in range(multiple):
>             if ((m+1)%100 == 0) and verbosity>0:
>                 print('...working on %ith simulation . (%3i% Done)' % 
> (m,m/multiple))
>
>             # white noise simulation
>             cn = normal(0,noise,n0)
>             cn = cn-mean(cn) # force OBSERVED count rate to zero
>
>             # Eq. (5); sh and ch
>             for i in range(numf):
>                 sh[i]=sum(cn*sisi[:,i])
>                 ch[i]=sum(cn*coco[:,i])
>
>             # Eq. (3) ; computing the periodogram for each simulation
>             psdpeak[m] = max ( (ch*cosomtau + sh*sinomtau)**2 / tc2 + 
> (sh*cosomtau - ch*sinomtau)**2 / ts2 )
>
>         # False Alarm Probability according to simulations
>         if len(psdpeak) != 0:
>             idx = sort(psdpeak)
>             # correct normalization
>             psdpeaksort = 0.5 * psdpeak[idx]/(noise**2)
>             simsigni = psdpeaksort[(1-fap)*(multiple-1)]
>
>     freq = om/(2.*pi)
>
>     if verbosity>1: print('Done...')
>
>     return (psd, freq, signi, simsigni, psdpeaksort)
>
>
> if __name__ == '__main__':
>     print('Testing Lomb-Scargle Periodogram with white noise...')
>     freq = 10. # Hz - Sample frequency
>     time = 400. #seconds
>     noisedata = normal(0,5,int(round(freq*time)))
>     noisetime = arange(0,time,1/freq)
>     var = { 'x': noisetime, 'y': noisedata, 'ylabel': 'Amplitude', 
> 'xlabel':'Time (s)' }
>
>     #from CPTAnalyze import findAverageSampleTime
>     N=4000
>     dt = 1.0 #findAverageSampleTime(var,0)
>     maxlogx = log(1/(2*dt)) # max frequency is the sampling rate
>     minlogx = log(1/(max(var['x'])-min(var['x']))) #min frequency is 
> 1/T
>     frequencies = exp(arange(N, typecode = 
> Float)/(N-1.)*(maxlogx-minlogx)+minlogx)
>     psd, freqs, signi, simsigni, psdpeaks = lomb(var['x'], 
> var['y'],freqin=frequencies)
>     #lomb(var['x'], var['y'],freqin=frequencies)
>     import time
>     t1 = time.clock()
>     psd, freqs, signi, simsigni, psdpeaks = lomb(var['x'], 
> var['y'],freqin=frequencies)
>     #lomb(var['x'], var['y'],freqin=frequencies)
>     t2 = time.clock()
>     print t2 - t1
>     print freqs[:5], freqs[-5:]
>     print psd[:5], psd[-5:]
>
>     """
>     import Gnuplot
>     plotobj = Gnuplot.Gnuplot()
>     plotobj.title('Testing Lomb-Scargle Periodogram')
>     plotobj.xlabel('Frequency (Hz)')
>     plotobj.ylabel('Power Spectral Density (arb/Hz^1/2)')
>     plotobj('set logscale xy')
>     plotobj.plot(Gnuplot.Data(freqs,psd, with = 'l 4 0'))
>     """_______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.net
> http://www.scipy.net/mailman/listinfo/scipy-user