Maximizing observations from Sparse matrices

Markus von Ehr markus.vonehr at ipm.fhg.de
Tue Jul 16 02:43:20 EDT 2002


Hi,

I didn't understand exactly what you want to do, but I would
recommend using Numeric.py, too.
I use the Numeric library for image operations: the image is represented
as a matrix, and I cut out submatrices and process them. It is quite fast
and easy to handle with Numeric.py.
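
As a rough illustration of the counting in the example quoted below, here
is a minimal sketch in plain Python (not Numeric). It assumes missing
values are stored as None, and the names DATA and valid_values are made up
for this illustration. It scores one choice of dropped variables and
regions the same way the "Sum of valid obs" lines do.

```python
# Toy panel from the quoted example: region -> rows of (var1, var2, var3).
# None stands in for NA (an assumption; the real data format is not given).
NA = None
DATA = {
    1: [(1, 4, 7), (7, NA, 9), (7, NA, 9), (5, NA, 7), (7, 7, 5)],
    2: [(7, NA, 9), (5, NA, 6), (7, 4, 5), (9, 8, NA), (7, 6, 5)],
    3: [(NA, 58, 4), (NA, 98, 25), (63, 85, NA), (74, NA, 78), (97, 54, NA)],
    4: [(NA, 89, 7), (25, 85, NA), (5, NA, 2), (32, 85, NA), (45, 12, 3)],
}


def valid_values(data, drop_vars=(), drop_regions=()):
    """Sum of values in complete rows after dropping variables/regions.

    drop_vars holds 0-based column indices, drop_regions holds region keys.
    Returns None if a remaining region has no complete row at all, i.e.
    the case the quoted message describes as unusable.
    """
    total = 0
    for region, rows in data.items():
        if region in drop_regions:
            continue
        # Columns that are still part of the model.
        keep = [i for i in range(len(rows[0])) if i not in drop_vars]
        # A row counts only if every kept column is present.
        complete = sum(
            1 for row in rows if all(row[i] is not NA for i in keep)
        )
        if complete == 0:
            return None
        total += complete * len(keep)
    return total


print(valid_values(DATA))                    # None: region 3 has no complete row
print(valid_values(DATA, drop_vars={0}))     # 16 (var1 dropped)
print(valid_values(DATA, drop_regions={3}))  # 15 (region 3 dropped)
```

With hundreds of variables, trying every combination of drops is not
feasible, so a function like this would only be the scoring step inside
some search or heuristic.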

Markus


PoulsenL at capanalysis.com wrote:
> 
> I am running a regression that gathers its data from a panel dataset
> (cross-sectional time series).  Because the dataset is sparse I must
> sometimes either drop cross-sections or variables or some combination of the
> two to avoid a singular matrix.  A quick, extremely simplified example:
> 
> obs     region          var1    var2    var3    Valid ob?
> 1       1               1       4       7       3
> 2       1               7       NA      9       0
> 3       1               7       NA      9       0
> 4       1               5       NA      7       0
> 5       1               7       7       5       3
> 
> 1       2               7       NA      9       0
> 2       2               5       NA      6       0
> 3       2               7       4       5       3
> 4       2               9       8       NA      0
> 5       2               7       6       5       3
> 
> 1       3               NA      58      4       0
> 2       3               NA      98      25      0
> 3       3               63      85      NA      0
> 4       3               74      NA      78      0
> 5       3               97      54      NA      0
> 
> 1       4               NA      89      7       0
> 2       4               25      85      NA      0
> 3       4               5       NA      2       0
> 4       4               32      85      NA      0
> 5       4               45      12      3       3
> Sum of valid obs                                15
> 
> In this example I have 4 regions and 3 variables.  Region 3 has no valid
> observations and therefore cannot be utilized unless corrected.  I have two
> choices: eliminate a variable or eliminate the region.
> 
> If I eliminate a variable (here var1), the regression will now run because
> I have valid obs across all remaining regions:
> 
> obs     region          var1    var2    var3    Valid ob?
> 1       1                       4       7       2
> 2       1                       NA      9       0
> 3       1                       NA      9       0
> 4       1                       NA      7       0
> 5       1                       7       5       2
> 
> 1       2                       NA      9       0
> 2       2                       NA      6       0
> 3       2                       4       5       2
> 4       2                       8       NA      0
> 5       2                       6       5       2
> 
> 1       3                       58      4       2
> 2       3                       98      25      2
> 3       3                       85      NA      0
> 4       3                       NA      78      0
> 5       3                       54      NA      0
> 
> 1       4                       89      7       2
> 2       4                       85      NA      0
> 3       4                       NA      2       0
> 4       4                       85      NA      0
> 5       4                       12      3       2
> Sum of valid obs                                16
> 
> Finally, I can instead eliminate a region (here region 3):
> 
> obs     region          var1    var2    var3    Valid ob?
> 1       1               1       4       7       3
> 2       1               7       NA      9       0
> 3       1               7       NA      9       0
> 4       1               5       NA      7       0
> 5       1               7       7       5       3
> 
> 1       2               7       NA      9       0
> 2       2               5       NA      6       0
> 3       2               7       4       5       3
> 4       2               9       8       NA      0
> 5       2               7       6       5       3
> 
> 1       4               NA      89      7       0
> 2       4               25      85      NA      0
> 3       4               5       NA      2       0
> 4       4               32      85      NA      0
> 5       4               45      12      3       3
> Sum of valid obs                                15
> 
> In this quick example, eliminating the region leaves the sum of valid obs
> at 15 (versus 16 when eliminating var1), but there are configurations where
> eliminating a region may yield more obs than eliminating a variable.
> The problem is I have hundreds of variables and about 50 regions.  Is there
> an efficient way to maximize the number of observations?
> 
> Thanks for any help
> 
> Loren


