Maximizing observations from Sparse matrices

PoulsenL at capanalysis.com
Mon Jul 15 18:14:21 EDT 2002


I am running a regression on a panel dataset (cross-sectional time
series).  Because the dataset is sparse, I must sometimes drop
cross-sections, variables, or some combination of the two to avoid a
singular matrix.  A quick, extremely simplified example:

obs	region		var1	var2	var3	Valid ob?
1	1		1	4	7	3
2	1		7	NA	9	0
3	1		7	NA	9	0
4	1		5	NA	7	0
5	1		7	7	5	3
						
1	2		7	NA	9	0
2	2		5	NA	6	0
3	2		7	4	5	3
4	2		9	8	NA	0
5	2		7	6	5	3
						
1	3		NA	58	4	0
2	3		NA	98	25	0
3	3		63	85	NA	0
4	3		74	NA	78	0
5	3		97	54	NA	0
						
1	4		NA	89	7	0
2	4		25	85	NA	0
3	4		5	NA	2	0
4	4		32	85	NA	0
5	4		45	12	3	3
Sum of valid obs				15

In this example I have 4 regions and 3 variables.  Region 3 has no valid
observations and therefore cannot be utilized unless corrected.  I have two
choices: eliminate a variable or eliminate the region.
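
To make the counting rule concrete, here is a minimal Python sketch of how
the "Valid ob?" column in the table above is tallied.  It assumes a row
counts only when every variable is present, and that it then contributes one
observation per variable.  The names DATA, valid_count and region_totals are
illustrative only, and None stands in for NA.

```python
# Example data from the table above; None stands in for NA.
# Keys are regions, each row is (var1, var2, var3).
NA = None
DATA = {
    1: [(1, 4, 7), (7, NA, 9), (7, NA, 9), (5, NA, 7), (7, 7, 5)],
    2: [(7, NA, 9), (5, NA, 6), (7, 4, 5), (9, 8, NA), (7, 6, 5)],
    3: [(NA, 58, 4), (NA, 98, 25), (63, 85, NA), (74, NA, 78), (97, 54, NA)],
    4: [(NA, 89, 7), (25, 85, NA), (5, NA, 2), (32, 85, NA), (45, 12, 3)],
}


def valid_count(row):
    """Return the number of variables if the row is complete, else 0."""
    return len(row) if all(v is not NA for v in row) else 0


def region_totals(data):
    """Sum the 'Valid ob?' column for each region."""
    return {region: sum(valid_count(r) for r in rows)
            for region, rows in data.items()}


totals = region_totals(DATA)
print(totals)                # {1: 6, 2: 6, 3: 0, 4: 3}
print(sum(totals.values()))  # 15
```

Any region whose total is 0, region 3 here, is one that blocks the
regression.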

If I eliminate var1, the regression will now run because every remaining
region has valid obs:

obs	region		var1	var2	var3	Valid ob?
1	1			4	7	2
2	1			NA	9	0
3	1			NA	9	0
4	1			NA	7	0
5	1			7	5	2
						
1	2			NA	9	0
2	2			NA	6	0
3	2			4	5	2
4	2			8	NA	0
5	2			6	5	2
						
1	3			58	4	2
2	3			98	25	2
3	3			85	NA	0
4	3			NA	78	0
5	3			54	NA	0
						
1	4			89	7	2
2	4			85	NA	0
3	4			NA	2	0
4	4			85	NA	0
5	4			12	3	2
Sum of valid obs				16

Alternatively, I can eliminate region 3 instead:

obs	region		var1	var2	var3	Valid ob?
1	1		1	4	7	3
2	1		7	NA	9	0
3	1		7	NA	9	0
4	1		5	NA	7	0
5	1		7	7	5	3
						
1	2		7	NA	9	0
2	2		5	NA	6	0
3	2		7	4	5	3
4	2		9	8	NA	0
5	2		7	6	5	3
						
1	4		NA	89	7	0
2	4		25	85	NA	0
3	4		5	NA	2	0
4	4		32	85	NA	0
5	4		45	12	3	3
Sum of valid obs				15
						
In this quick example, eliminating region 3 leaves the total at 15, one fewer
than eliminating var1 (16), but there are configurations where eliminating a
region yields more obs than eliminating a variable.
The problem is I have hundreds of variables and about 50 regions.  Is there
an efficient way to maximize the number of observations?
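
For reference, the kind of search I have in mind can be sketched as a greedy
loop: score every single drop (one region or one variable), apply the best
one, and repeat until every remaining region has at least one valid row.
This is only a heuristic under my own assumptions, not a guaranteed optimum.
It assumes the objective is the "Sum of valid obs" used in the tables and
that one valid row per region is enough to avoid the singular matrix.  The
names score and greedy_drop are hypothetical.

```python
NA = None  # stand-in for a missing value
VAR_NAMES = ("var1", "var2", "var3")
DATA = {
    1: [(1, 4, 7), (7, NA, 9), (7, NA, 9), (5, NA, 7), (7, 7, 5)],
    2: [(7, NA, 9), (5, NA, 6), (7, 4, 5), (9, 8, NA), (7, 6, 5)],
    3: [(NA, 58, 4), (NA, 98, 25), (63, 85, NA), (74, NA, 78), (97, 54, NA)],
    4: [(NA, 89, 7), (25, 85, NA), (5, NA, 2), (32, 85, NA), (45, 12, 3)],
}


def score(data, regions, cols):
    """Return (total valid obs, feasible) for the kept regions and columns.

    A row is valid only if every kept column is present; it then counts
    once per kept column, as in the 'Valid ob?' column.  The subset is
    feasible when every kept region has at least one valid row.
    """
    total, feasible = 0, True
    for region in regions:
        complete = sum(all(row[c] is not NA for c in cols)
                       for row in data[region])
        feasible = feasible and complete > 0
        total += complete * len(cols)
    return total, feasible


def greedy_drop(data, n_cols):
    """Drop one region or column at a time until the subset is feasible."""
    regions, cols, dropped = set(data), set(range(n_cols)), []
    while True:
        total, feasible = score(data, regions, cols)
        if feasible:
            return total, sorted(regions), sorted(cols), dropped
        # Score every single drop that still leaves something to regress on.
        options = []
        if len(regions) > 1:
            options += [(score(data, regions - {r}, cols)[0], "region", r)
                        for r in sorted(regions)]
        if len(cols) > 1:
            options += [(score(data, regions, cols - {c})[0], "column", c)
                        for c in sorted(cols)]
        if not options:
            raise ValueError("no feasible subset of regions and variables")
        _, kind, key = max(options, key=lambda o: o[0])
        if kind == "region":
            regions.remove(key)
        else:
            cols.remove(key)
        dropped.append((kind, key))


total, regions, cols, dropped = greedy_drop(DATA, len(VAR_NAMES))
print(total, regions, [VAR_NAMES[c] for c in cols], dropped)
```

Each pass rescans the whole dataset once per candidate drop, so with hundreds
of variables and about 50 regions this may already be slow, and a greedy
choice can miss combinations that only pay off after several drops.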

Thanks for any help

Loren




