Contractor need for integration prj (long)

Hoon Yoon hyoon at bigfoot.com
Fri Jul 16 10:19:25 EDT 1999


Hi,


  Our firm is looking to see if it is possible to wrap some of more
advanced lib 
in R like modreg.dll and specifically Loess function in there for both 
regression and prediction into MSVC++. We will probably take it to
either VB or 
Python for implementation issues (Guess which I prefer). Given that
locfit has 
many utitlities built in for data transformation, this would be a good
thing.
  If that's not possible, I would like to have dloess from Netlib
compiled under 
MSVC++, either linked by g77 or f2c or other fortran compiler.
  I am trying to make sure the final version is available to public. It
does not 
seem to be a major prb from our stand point, but that may change. If
developer 
wishes not to disclose his sources to public. That's fine with us. We,
however, 
need full sources, projects, obj, and exec files. Please submit bid with 
timeframe and qualification back to this e-mail.
  The objective is quickly implent Loess with VB or Python for
regression and 
prediction. I would prefer to use existing R dll's, because I can just
piggie 
back off of it everytime source changes or for new analytics; however,
quick 
timing is crucial. Any quick workable solution with high degree of
confidence 
would be chosen. I need to do multivariate regression with loess.
  We will also take this to Unix at some point.
  A pure Python version would be nice. I have a pure Gauss version. Only
prb with
it is that it's single variable only.
  We are Fortune 500 company.

  
S.Y.

REPLY to: hoon at bigfoot.com

R is available at: http://www.stat.cmu.edu/R/CRAN/
netLib w/ loess: http://www.wavelet.org/netlib/a/index.html

Gauss code: Pls don't just try translate this. This is single variable
only for
dependent variable. I providing it for reference. SAS also has Loess
built in.
/* LOESS - Local Regression and optional Robust Fitting and Symmetric
Errors
**
** (C) Copyright 1996  Aptech Systems, Inc.
** All Rights Reserved.
**
** This Software Product is PROPRIETARY SOURCE CODE OF APTECH
** SYSTEMS, INC.    This File Header must accompany all files using
** any portion, in whole or in part, of this Source Code.   In
** addition, the right to create such files is strictly limited by
** Section 2.A. of the GAUSS Applications License Agreement
** accompanying this Software Product.
**
** If you wish to distribute any portion of the proprietary Source
** Code, in whole or in part, you must first obtain written
** permission from Aptech Systems.
**
**-------------------**------------------**-------------------**-----------**
**-------------------**------------------**-------------------**-----------**
**
**> loess
**
**  Format:   { yhat,ys,xs } = loess(y,x)
**
**  Input:    y    Nx1 vector, dependent variable.
**
**            x    Nx1 vector, independent variable.
**
**  Output:   yhat    Nx1 vector, predicted depvar given indvars.
**
**            ys      numEvalx1 vector, ordinate values given abscissae
**                    values in xs.
**
**            xs      numEvalx1 vector, equally spaced abscissae values. 
**
**
**   Globals:
**
**   _loess_Span -- scalar, degree of smoothing, must be greater than
**                 2 / N.  Default = .67777.
**
**   _loess_NumEval -- scalar, number of points in ys and xs. 
**                     Default = 50.
**
**   _loess_Degree -- scalar, if 2, quadratic fit, otherwise linear.
**                    Default = 1.
**
**   _loess_WgtType -- scalar, type of weights.  Default weight type
**                 is Gaussian. If set to 1, robust, symmetric weights
**                 are used.
**
**   __ouput -- scalar, if 1 (default), iteration information and
results
**                 are printed, otherwise nothing is printed.  
**
**   Remarks:
**
**   Adapted from a program developed by Simon Jackson 
**     (acbsjack at gosnell.spc.uchicago.edu)
**
**   References:
**   William S. Cleveland. 1979. Robust Locally Weighted Regression and
**      Smoothing Scatterplots.  _JASA_. 74:829-836.
**   William S. Cleveland and Susan J. Devlin. 1988. Locally Weighted
**       Regression: An Approach to Regression Analysis by Local
**       Fitting. _JASA_. 83:596-609. [mainly on multivariate loess]
**   William S. Cleveland, Eric Grosse, and William M. Shyu. 1991.
**      Local Regression Models. in John M. Chambers and Trevor
**      J. Hastie (eds). _Statistical Models in S_. Pacific Grove:
**      Wadsworth.
**   Michael Friendly's routines in _SAS System for Statistical
**      Graphics, First Edition_. Cary, NC: SAS Institute.
**   Trevor J. Hastie and R.J. Tibshirani. 1990. _Generalized
**      Additive Models_. London: Chapman and Hall.
*/

#include loess.ext;


proc(3) = loess(y,x);

    local
data,n,q,iter,i,res,wts,b,yhat,dist,s,near,nx,ny,d,delta,ystar,
        xstar,u,x0,x1,x2,y1,y2,loind,upind,singlist,slope,dx,step,fixx,
        sing,flag2,mask,fmt,unit,flag,oldfmt;


    n = rows(x);
    unit = seqa(1,1,n);
    if floor(_loess_Span*n) < 2;        /* check for valid _loess_Span
*/
        if not trapchk(1);
            errorlog "Invalid value for _loess_Span";
            end;
        else;
            retp(error(0),error(0),error(0),error(0));
        endif;
    endif;

    if __output;
        if _loess_WgtType == 1;
            print "Gaussian weights";
        else;
            print "Symmetric weights";
        endif;
        if _loess_Degree == 1;
            print "Locally linear";
        else;
            print "Locally quadratic";
        endif;
        oldfmt = sysstate(19,0);
        format 4,2;
        print "_loess_Span = " _loess_Span;
        format 3,0;
        print "Number of evaluation points " _loess_NumEval;
        call sysstate(19,oldfmt);
    endif;

    data = sortc(x~y,1);    /* sort data first */
    x = data[.,1];
    y = data[.,2];

    yhat = zeros(n,1);
    delta = ones(n,1);      /* initialize robustness weights */
    res = y;        /* initialize residuals */
    s = ones(n,1);          /* initialize neighbor ranks */
    singlist = zeros(n,1);          /* initialize singularity
indicators  */

    xstar =
seqa(minc(x),(maxc(x)-minc(x))./(_loess_NumEval-1),_loess_NumEval);
    yhat = ones(n,1);

    iter = 1;
    do until iter > _loess_WgtType; /* iterations start 2 iterations for
                                      :: symmetric, 1 for Gaussian
                                      */
        if __output;
            print "Iteration "$+ftos(iter,"*.*lf",1,0);
        endif;
        i = 1;
        do until i > n;    /* loop over x values */
            flag2 = 0;      /* initialize singularity flag */
        loop:

            if _loess_Span <= 1;
                q = floor(_loess_Span.*n);
            else;
                q = n;
            endif;          /* number of nearest neighbors */
            x0 = x[i];      /* reference obs for this iteration */
            dist = abs(x-x0);       /* distance to other points */
            s = sortc(x~y~unit~dist,4);     /* identify sorted
distances  */
            s = s[1:q,.];
            nx = s[.,1];
            ny = s[.,2];    /* q nearest neighbors of x0 */
            d = maxc(s[.,4]);       /* distance to furthest nearest  */
            if _loess_Span > 1;
                d = d.*_loess_Span;
            endif;          /* p314, CGS */
            if d > 0;
                u = abs(nx-x0)./d;
                wts = (u .< 1).*((1-u^3)^3);       /* tri-cube
neighborhood
                                                    :: wts
                                                    */
                wts = delta[s[.,3]].*wts;           /* symmetric
robustness
                                                    :: weights
                                                    */
                if sumc(wts[2:q]) > .0001;

                    if _loess_Degree == 2;
                        b =
(ny.*wts)/((ones(rows(nx),1)~nx~nx.*nx).*wts);
                    else;
                        b = (ny.*wts)/((ones(rows(nx),1)~nx).*wts);
                    endif;
                        
                    if scalmiss(b);
                        if __output;
                            oldfmt = sysstate(19,0);
                            format 1,0;
                            print "Singularity obs " i;;
                            format 8,4;
                            print " x = " x0;;
                            print " y = " y[i];
                            call sysstate(19,oldfmt);
                        endif;
                        q = n;
                        if flag2 /= 1;
                            flag2 = 1;
                            goto loop;
                        endif;
                        singlist[i] = 1;
                    endif;
                    if _loess_Degree == 2;
                        yhat[i] = (1~x0~(x0^2))*b;          /* smoothed
value
                                                            :: - quad
                                                            */
                    else;
                        yhat[i] = (1~x0)*b;         /* smoothed value -
                                                    :: linear
                                                    */
                    endif;
                else;
                    yhat[i] = y[i];
                endif;
            else;
                yhat[i] = meanc(ny);
                if __output;
                    oldfmt = sysstate(19,0);
                    format 1,0;
                    print "x(0) = 0,  obs " i;;
                    format 8,4;
                    print " x = " x0;;
                    print " y = " y[i];
                    call sysstate(19,oldfmt);
                endif;
                singlist[i] = 1;
            endif;
            i = i + 1;
        endo;

        if sumc(singlist) /= 0;    /* singularity fixup - interpolation
*/

            if __output;
            print "Singularity fixup (interpolation) for "
sumc(singlist) ""\
                " obs";
            endif;
            fixx = indexcat(singlist,1);
            i = 1;
            do until i eq rows(fixx);
                if fixx[i] == 1 or fixx[i] == n;
                    yhat[i] = y[i];
                    i = i+1;
                    continue;
                endif;
                dx = x[fixx[i]]-x[fixx[i]-1];
                flag = 0;
                step = 0;
                do until flag == 1;
                    step = step+1;
                    if scalerr(indexcat(fixx,fixx[i]+step)) == 13;
                        flag = 1;
                    endif;
                endo;
                slope = (yhat[fixx[i]+step]-yhat[fixx[i]-1]);
                slope = slope./(x[fixx[i]+step]-x[fixx[i]-1]);
                yhat[fixx[i]] = yhat[fixx[i]-1]+(slope.*dx);
                i = i + 1;
            endo;
        endif;

        res = y - yhat;       /* residuals */
        if _loess_WgtType == 2;   /* get robustness weights */
            u = res./(6*median(abs(res)));
            delta = (abs(u) .< 1) .* ((1 - u.*u)^2);
        endif;
        iter = iter + 1;
    endo;

    if __output;
        if _loess_WgtType == 2;
            print "Obs    X      Y     Fitted Y    Weights  Residuals";
            mask = { 1 1 1 1 1 1 };
            let fmt[6,3] = "*.*lf " 3 0 "*.*lf" 8 3 "*.*lf" 8 3 "*.*lf"
8
                3 "*.*lf" 8 3 "*.*lf" 8 3 ;
            call printfm(seqa(1,1,n)~x~y~yhat~delta~res,mask,fmt);
        else;
            print "Obs    X      Y     Fitted Y    Residuals";
            mask = { 1 1 1 1 1 };
            let fmt[5,3] = "*.*lf " 3 0 "*.*lf" 8 3 "*.*lf" 8 3 "*.*lf"
8
                3 "*.*lf" 8 3 ;
            call printfm(seqa(1,1,n)~x~y~yhat~res,mask,fmt);
        endif;
    endif;

/* I N T E R P O L A T I O N */
    if _loess_NumEval == n or abs(_loess_NumEval-rows(unique(x,1))) < 2;
        xstar = unique(x,1);
        ystar = yhat[uniqindx(x,1)];
        goto done;
    endif;
    ystar = ones(_loess_NumEval,1);
    ystar[1] = yhat[1];
    ystar[_loess_NumEval] = yhat[n];         /* fix end points */
    i = 2;
    do until i == _loess_NumEval;
        d = x-xstar[i];
        upind = minindc(recode(d,d .< 0,maxc(d)+1));
        loind = maxindc(recode(d,d .> 0,minc(d)-1));
        x1 = x[loind];
        x2 = x[upind];
        y1 = yhat[loind];
        y2 = yhat[upind];
        ystar[i] = y1+((y2-y1)./(x2-x1).*(xstar[i]-x1));
        i = i+1;
    endo;
done:

    retp(yhat,ystar,xstar);
endp;
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vcard.vcf
Type: text/x-vcard
Size: 202 bytes
Desc: Card for Hoon Yoon
URL: <http://mail.python.org/pipermail/python-list/attachments/19990716/33ab627f/attachment.vcf>


More information about the Python-list mailing list